
Using the psych package to generate and test structural models

William Revelle

August 22, 2020

Contents

1 The psych package
1.1 Preface
1.2 Creating and modeling structural relations

2 Functions for generating correlational matrices with a particular structure
2.1 sim.congeneric
2.2 sim.hierarchical
2.3 sim.item and sim.circ
2.4 sim.structure
2.4.1 fx is a vector implies a congeneric model
2.4.2 fx is a matrix implies an independent factors model
2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model
2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations
2.4.5 A hierarchical structure among the latent predictors

3 Exploratory functions for analyzing structure
3.1 Exploratory simple structure models
3.2 Exploratory hierarchical models
3.2.1 A bifactor solution
3.2.2 A hierarchical solution

4 Exploratory Structural Equation Modeling (ESEM)

5 Confirmatory models
5.1 Using psych as a front end for the sem package
5.2 Testing a congeneric model versus a tau equivalent model
5.3 Testing the dimensionality of a hierarchical data set by creating the model
5.4 Testing the dimensionality based upon an exploratory analysis
5.5 Specifying a three factor model
5.6 Allowing for an oblique solution
5.7 Extract a bifactor solution using omega and then test that model using sem
5.7.1 sem of Thurstone 9 variable problem
5.8 Examining a hierarchical solution
5.9 Estimating Omega using EFA followed by CFA

6 Summary and conclusion

1 The psych package

1.1 Preface

The psych package (Revelle, 2020) has been developed to include those functions most useful for teaching and learning basic psychometrics and personality theory. Functions have been developed for many parts of the analysis of test data, including basic descriptive statistics (describe and pairs.panels), dimensionality analysis (ICLUST, VSS, principal, factor.pa), reliability analysis (omega, guttman), and eventual scale construction (cluster.cor, score.items). The use of these and other functions is described in more detail in the accompanying vignette (overview.pdf) as well as in the complete user's manual and the relevant help pages. (These vignettes are also available at https://personality-project.org/r/overview.pdf and https://personality-project.org/r/psych_for_sem.pdf.)

This vignette is concerned with the problem of modeling structural data and using the psych package as a front end for the much more powerful sem package of John Fox (Fox, 2006, 2009; Fox et al., 2013). Future releases of this vignette will include examples for using the lavaan package of Yves Rosseel (Rosseel, 2012).

The first section discusses how to simulate particular latent variable structures. The second considers several Exploratory Factor Analysis (EFA) solutions to these problems. The third section considers how to do confirmatory factor analysis and structural equation modeling using the sem package, but with the input prepared using functions in the psych package.

1.2 Creating and modeling structural relations

One common application of psych is the creation of simulated data matrices with particular structures to use as examples for principal components analysis, factor analysis, cluster analysis, and structural equation modeling. This vignette describes some of the functions used for creating, analyzing, and displaying such data sets. The examples use two other packages, Rgraphviz and sem. Although not required to use the psych package, sem is required for these examples. Although Rgraphviz had been used for the graphical displays, it has now been replaced with graphical functions within psych. The analyses themselves require only the sem package to do the structural modeling.

Since writing this vignette, the power of the lavaan package has allowed me to replace many of the sem commands with calls to lavaan. lavaan is loaded along with psych.

2 Functions for generating correlational matrices with a particular structure

The sim family of functions creates data sets with particular structure. Most of these functions have default values that will produce useful examples. Although graphical summaries of these structures will be shown here, some of the options of the graphical displays will be discussed in a later section.

The sim functions include:

sim.structure A function to combine a measurement and structural model into one data matrix. Useful for understanding structural equation models. Combined with structure.diagram to see the proposed structure.

sim.congeneric A function to create congeneric items/tests for demonstrating classical test theory. This is just a special case of sim.structure.

sim.hierarchical A function to create data with a hierarchical (bifactor) structure.

sim.general A function to simulate a general factor and multiple group factors. This is done in a somewhat more obvious, although less general, method than sim.hierarchical.

simCor A generalized function to simulate data from a specified correlation matrix.

sim.item A function to create items that either have a simple structure or a circumplex structure.

sim.circ Create data with a circumplex structure.

sim.dichot Create dichotomous item data with a simple or circumplex structure.

sim.minor Create a factor structure for nvar variables defined by nfact major factors and nvar/2 "minor" factors for n observations.

sim.parallel Create a number of simulated data sets using sim.minor to show how parallel analysis works.

sim.rasch Create IRT data following a Rasch model.

sim.irt Create a two parameter IRT logistic (2PL) model.

sim.anova Simulate a 3 way balanced ANOVA or linear model, with or without repeated measures. Useful for teaching courses in research methods.

To make these examples replicable for readers, all simulations are prefaced by setting the random seed to a fixed (and for some, memorable) number (Adams, 1980). For normal use of the simulations, this is not necessary.
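The effect of fixing the seed can be seen in a minimal base R sketch. It uses rnorm rather than any psych function, purely for illustration, and the object names are hypothetical: resetting the seed to the same value reproduces the same random draws.

```r
# Fixing the seed makes pseudo-random draws repeatable.
set.seed(42)
first_draw <- rnorm(5)

set.seed(42)                 # reset to the same seed
second_draw <- rnorm(5)

identical(first_draw, second_draw)   # the two sets of draws are the same
```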

2.1 sim.congeneric

Classical test theory considers tests to be tau equivalent if they have the same covariance with a vector of latent true scores, but perhaps different error variances. Tests are considered congeneric if they each have the same true score component (perhaps to a different degree) and independent error components. The sim.congeneric function may be used to generate either structure.

The first example considers four tests with equal loadings on a latent factor (that is, a τ equivalent model). If the number of subjects is not specified, a population correlation matrix will be generated. If N is specified, then the sample correlation matrix is returned. If the "short" option is FALSE, then the population matrix, sample matrix, and sample data are all returned as elements of a list.

> library(psych)
> library(psychTools)
> set.seed(42)
> tau <- sim.congeneric(loads=c(.8,.8,.8,.8)) #population values
> tausamp <- sim.congeneric(loads=c(.8,.8,.8,.8),N=100) # sample correlation matrix for 100 cases
> round(tausamp,2)
     V1   V2   V3   V4
V1 1.00 0.68 0.72 0.66
V2 0.68 1.00 0.65 0.67
V3 0.72 0.65 1.00 0.76
V4 0.66 0.67 0.76 1.00
> tausamp <- sim.congeneric(loads=c(.8,.8,.8,.8),N=100, short=FALSE)
> tausamp
Call: NULL
 $model (Population correlation matrix)
     V1   V2   V3   V4
V1 1.00 0.64 0.64 0.64
V2 0.64 1.00 0.64 0.64
V3 0.64 0.64 1.00 0.64
V4 0.64 0.64 0.64 1.00
 $r (Sample correlation matrix for sample size = 100 )
     V1   V2   V3   V4
V1 1.00 0.70 0.62 0.58
V2 0.70 1.00 0.65 0.64
V3 0.62 0.65 1.00 0.59
V4 0.58 0.64 0.59 1.00
> dim(tausamp$observed)
[1] 100   4

In this last case, the generated data are retrieved from tausamp$observed. Congeneric data are created by specifying unequal loading values. The default values are loadings of c(.8,.7,.6,.5). As seen in Figure 1, tau equivalence is the special case where all paths are equal.
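The logic behind these matrices can be sketched in base R. This is an illustration of the one-factor model under the assumption of standardized variables, not the code used inside sim.congeneric; all object names are hypothetical. Each observed score is the loading times a latent true score plus independent error, so the population correlation between two tests is the product of their loadings.

```r
# A base R sketch of a tau equivalent (equal loadings) one-factor model.
set.seed(42)
loads <- c(0.8, 0.8, 0.8, 0.8)
n <- 100

# Population correlations: product of loadings off the diagonal, 1 on it.
population <- tcrossprod(loads)
diag(population) <- 1

# Sample data: loading * true score + sqrt(1 - loading^2) * error.
theta <- rnorm(n)                                   # latent true scores
errors <- matrix(rnorm(n * length(loads)), nrow = n)
observed <- outer(theta, loads) + errors %*% diag(sqrt(1 - loads^2))
colnames(observed) <- paste0("V", seq_along(loads))

round(population, 2)     # every off-diagonal element is .8 * .8
sample_r <- cor(observed)
round(sample_r, 2)       # varies around the population values
```

Replacing loads with unequal values (for example the default c(.8,.7,.6,.5) mentioned above) gives the congeneric case.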

> cong <- sim.congeneric(N=100)
> round(cong,2)
     V1   V2   V3   V4
V1 1.00 0.57 0.53 0.46
V2 0.57 1.00 0.35 0.41
V3 0.53 0.35 1.00 0.43
V4 0.46 0.41 0.43 1.00
> plot.new()
> m1 <- structure.diagram(c("a","b","c","d"))

[Figure: path diagram titled "Structural model"; the latent variable X1 is linked to x1, x2, x3, and x4 by the paths a, b, c, and d.]

Figure 1: Tau equivalent tests are special cases of congeneric tests. Tau equivalence assumes a=b=c=d.

2.2 sim.hierarchical

The previous function, sim.congeneric, is used when one factor accounts for the pattern of correlations. A slightly more complicated model is when one broad factor and several narrower factors are observed. An example of this structure might be the structure of mental abilities, where there is a broad factor of general ability and several narrower factors (e.g., spatial ability, verbal ability, working memory capacity). Another example is in the measure of psychopathology, where a broad general factor of neuroticism is seen along with more specific anxiety, depression, and aggression factors. This kind of structure may be simulated with sim.hierarchical, specifying the loadings of each sub factor on a general factor (the g-loadings) as well as the loadings of individual items on the lower order factors (the f-loadings). An early paper describing a bifactor structure was by Holzinger and Swineford (1937). A helpful description of what makes a good general factor is that of Jensen and Weng (1994).

For those who prefer real data to simulated data, six data sets are included in the bifactor data set. One is the original 14 variable problem of Holzinger and Swineford (1937) (holzinger), a second is a nine variable problem adapted by Bechtoldt (1961) from Thurstone and Thurstone (1941) (the data set is used as an example in the SAS manual and discussed in great detail by McDonald (1999)), and a third is from a recent paper by Reise et al. (2007) with 16 measures of patient reports of interactions with their health care provider.

> set.seed(42)
> gload=matrix(c(.9,.8,.7),nrow=3)
> fload <- matrix(c(.8,.7,.6,rep(0,9),.7,.6,.5,
+ rep(0,9),.7,.6,.4), ncol=3)
> fload #echo it to see the structure
      [,1] [,2] [,3]
 [1,]  0.8  0.0  0.0
 [2,]  0.7  0.0  0.0
 [3,]  0.6  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.7
 [8,]  0.0  0.0  0.6
 [9,]  0.0  0.0  0.4
> bifact <- sim.hierarchical(gload=gload,fload=fload)
> round(bifact,2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 1.00 0.56 0.48 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.56 1.00 0.42 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.48 0.42 1.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 1.00 0.42 0.35 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.42 1.00 0.30 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.35 0.30 1.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 1.00 0.42 0.28
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.42 1.00 0.24
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.28 0.24 1.00
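How the g-loadings and f-loadings combine into this correlation matrix can be sketched in base R. This is a hedged illustration of the hierarchical model algebra, not the internals of sim.hierarchical, and the names factor_cor and implied are hypothetical: the lower-order factors correlate through their loadings on g, and the items correlate through their loadings on the lower-order factors.

```r
# g-loadings of the three lower-order factors and item loadings on those factors.
gload <- matrix(c(0.9, 0.8, 0.7), nrow = 3)
fload <- matrix(c(0.8, 0.7, 0.6, rep(0, 9),
                  0.7, 0.6, 0.5, rep(0, 9),
                  0.7, 0.6, 0.4), ncol = 3)

# Correlations among the lower-order factors implied by the general factor.
factor_cor <- gload %*% t(gload)
diag(factor_cor) <- 1

# Item correlations implied by the item loadings and the factor correlations.
implied <- fload %*% factor_cor %*% t(fload)
diag(implied) <- 1
dimnames(implied) <- list(paste0("V", 1:9), paste0("V", 1:9))
round(implied, 2)
```

For example, V1 and V4 load .8 and .7 on factors that correlate .9 * .8 = .72, so their correlation is .8 * .7 * .72, or about .40, as in the printed matrix.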

These data can be represented as either a bifactor (Figure 2, Panel A) or hierarchical (Figure 2, Panel B) factor solution. The analysis was done with the omega function.

2.3 sim.item and sim.circ

Many personality questionnaires are thought to represent multiple, independent factors. A particularly interesting case is when there are two factors and the items either have simple structure or circumplex structure. Examples of such items with a circumplex structure are measures of emotion (Rafaeli and Revelle, 2006), where many different emotion terms can be arranged in a two dimensional space, but where there is no obvious clustering of items. Typical personality scales are constructed to have simple structure, where items load on one and only one factor.

An additional challenge to measurement with emotion or personality items is that the items can be highly skewed and are assessed with a small number of discrete categories (do not agree, somewhat agree, strongly agree).

The more general sim.item function, and the more specific sim.circ function, simulate items with a two dimensional structure, with or without skew, and varying the number of categories for the items. An example of a circumplex structure is shown in Figure 3.
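What a circumplex means in terms of loadings can be sketched in base R. This is an assumption-laden illustration, not the algorithm or the defaults of sim.circ: the 16 items are placed at equally spaced angles around a circle, every item has the same (arbitrarily chosen) communality of .36, and the object names are hypothetical.

```r
# Items spread evenly around a circle in a two factor space.
n_items <- 16
angles <- seq(0, 2 * pi, length.out = n_items + 1)[-(n_items + 1)]

# Each item loads cos(angle) and sin(angle) on the two factors, scaled by 0.6.
loadings <- 0.6 * cbind(F1 = cos(angles), F2 = sin(angles))

# Population correlations implied by two orthogonal factors.
model <- tcrossprod(loadings)
diag(model) <- 1

round(loadings, 2)
round(model[1, ], 2)   # correlations of item 1 fall, turn negative, then rise again
```

With simple structure, by contrast, each item would have a non-zero loading on only one of the two factors.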

2.4 sim.structure

A more general case is to consider three matrices, fx, φxy, fy, which describe, in turn, a measurement model of x variables, fx, a measurement model of y variables, fy, and a covariance matrix between and within the two sets of factors. If fx is a vector and fy and φxy are NULL, then this is just the congeneric model. If fx is a matrix of loadings with n rows and c columns, then this is a measurement model for n variables across c factors. If φxy is not NULL, but fy is NULL, then the factors in fx are correlated. Finally, if all three matrices are not NULL, then the data show the standard linear structural relations (LISREL) structure.
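In matrix terms, the three inputs combine in the usual common factor way. The following is a sketch under the assumption of standardized variables and unit-variance factors; the symbols F, Φ, and U² are notation introduced here for illustration and are not argument names of the package.

```latex
F = \begin{pmatrix} f_x & 0 \\ 0 & f_y \end{pmatrix}, \qquad
R = F \,\Phi\, F^{\top} + U^{2}, \qquad
U^{2} = I - \operatorname{diag}\!\left(F \,\Phi\, F^{\top}\right)
```

When fx is a single vector and Φ = 1, this reduces to r_ij = f_i f_j for i ≠ j, which is the congeneric case.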

Consider the following examples:

2.4.1 fx is a vector implies a congeneric model

> set.seed(42)
> fx <- c(.9,.8,.7,.6)
> cong1 <- sim.structure(fx)
> cong1
Call: sim.structure(fx = fx)
 $model (Population correlation matrix)

> op <- par(mfrow=c(1,2))
> mbi <- omega(bifact,title="A bifactor model")
> mhi <- omega(bifact,sl=FALSE,title="A hierarchical model")
> op <- par(mfrow = c(1,1))

[Figure: two path diagrams, "A bifactor model" (left) and "A hierarchical model" (right), each relating V1 to V9 to the group factors F1, F2, F3 and the general factor g.]

Figure 2: (Left panel) A bifactor solution represents each test in terms of a general factor and a residualized group factor. (Right Panel) A hierarchical factor solution has g as a second order factor accounting for the correlations between the first order factors.

> circ <- sim.circ(16)
> f2 <- fa(circ,2)
> plot(f2,title="16 simulated variables in a circumplex pattern")

[Figure: scatter plot titled "16 simulated variables in a circumplex pattern"; the 16 items are plotted by their loadings on MR1 (horizontal axis) and MR2 (vertical axis), both running from about -0.6 to 0.6.]

Figure 3: Emotion items or interpersonal items frequently show a circumplex structure. Data generated by sim.circ and factor loadings found by the principal axis algorithm using factor.pa.

     V1   V2   V3   V4
V1 1.00 0.72 0.63 0.54
V2 0.72 1.00 0.56 0.48
V3 0.63 0.56 1.00 0.42
V4 0.54 0.48 0.42 1.00
 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.36
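Both parts of this output follow directly from the loadings. As a minimal base R sketch (not the package code; the names model and reliability are used here only for illustration), the correlations are products of loadings and the population reliabilities are the squared loadings:

```r
# One-factor (congeneric) model with unequal loadings.
fx <- c(0.9, 0.8, 0.7, 0.6)

model <- tcrossprod(fx)   # products of loadings
diag(model) <- 1          # unit variances on the diagonal
reliability <- fx^2       # squared loadings

round(model, 2)
round(reliability, 2)
```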

2.4.2 fx is a matrix implies an independent factors model

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> threefact <- sim.structure(fx)
> threefact
Call: sim.structure(fx = fx)
 $model (Population correlation matrix)
     V1   V2   V3   V4   V5   V6   V7  V8   V9
V1 1.00 0.72 0.63 0.00 0.00 0.00 0.00 0.0 0.00
V2 0.72 1.00 0.56 0.00 0.00 0.00 0.00 0.0 0.00
V3 0.63 0.56 1.00 0.00 0.00 0.00 0.00 0.0 0.00
V4 0.00 0.00 0.00 1.00 0.42 0.35 0.00 0.0 0.00
V5 0.00 0.00 0.00 0.42 1.00 0.30 0.00 0.0 0.00
V6 0.00 0.00 0.00 0.35 0.30 1.00 0.00 0.0 0.00
V7 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.3 0.24
V8 0.00 0.00 0.00 0.00 0.00 0.00 0.30 1.0 0.20
V9 0.00 0.00 0.00 0.00 0.00 0.00 0.24 0.2 1.00
 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model

> Phi = matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol=3)
> corf3 <- sim.structure(fx,Phi)
> fx
      [,1] [,2] [,3]
 [1,]  0.9  0.0  0.0
 [2,]  0.8  0.0  0.0
 [3,]  0.7  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.6
 [8,]  0.0  0.0  0.5
 [9,]  0.0  0.0  0.4
> Phi
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.3
[2,]  0.5  1.0  0.2
[3,]  0.3  0.2  1.0
> corf3

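[Figure: path diagram titled "Structural model"; X1 is linked to x1, x2, x3 with paths 0.9, 0.8, 0.7; X2 to x4, x5, x6 with paths 0.7, 0.6, 0.5; X3 to x7, x8, x9 with paths 0.6, 0.5, 0.4.]

Figure 4: Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram.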

Call: sim.structure(fx = fx, Phi = Phi)
 $model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000
 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16
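The correlated factors output can be reproduced with a few lines of matrix algebra. The following base R sketch is an illustration of the model, not the code of sim.structure, and the names implied and reliability are chosen here for convenience: the implied correlations are fx Phi fx' with the diagonal reset to 1.

```r
# Loadings of nine variables on three factors, and the factor correlations.
fx <- matrix(c(0.9, 0.8, 0.7, rep(0, 9),
               0.7, 0.6, 0.5, rep(0, 9),
               0.6, 0.5, 0.4), ncol = 3)
Phi <- matrix(c(1.0, 0.5, 0.3,
                0.5, 1.0, 0.2,
                0.3, 0.2, 1.0), ncol = 3)

implied <- fx %*% Phi %*% t(fx)
reliability <- diag(implied)   # communalities, taken before the diagonal is reset
diag(implied) <- 1

round(implied, 3)
round(reliability, 2)
```

For instance, V1 and V4 load .9 and .7 on factors that correlate .5, giving .9 * .7 * .5 = .315.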

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs #show the matrix
      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"
> Phis #show this one as well
   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.
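The idea behind this compact form can be mimicked in base R. The helper below, symbolic_loadings, is a hypothetical stand-in written for this illustration; it is not the structure.list function and makes no claim about how that function is implemented. It builds a character matrix in which each factor is given a letter and each item a numbered label.

```r
# Hypothetical helper: build a symbolic loading matrix from a list of item indices.
symbolic_loadings <- function(nvars, items) {
  m <- matrix("0", nrow = nvars, ncol = length(items),
              dimnames = list(NULL, names(items)))
  for (j in seq_along(items)) {
    # label the items of factor j as a1, a2, ... / b4, b5, ... and so on
    m[items[[j]], j] <- paste0(letters[j], items[[j]])
  }
  m
}

fxs_sketch <- symbolic_loadings(9, list(F1 = 1:3, F2 = 4:6, F3 = 7:9))
fxs_sketch
```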

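> plot.new()
> corf3mod <- structure.diagram(fxs,Phis)

[Figure: path diagram titled "Structural model"; F1 is linked to x1, x2, x3 with paths a1, a2, a3; F2 to x4, x5, x6 with paths b4, b5, b6; F3 to x7, x8, x9 with paths c7, c8, c9; the factors are connected by the correlations rab, rac, and rbc.]

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input.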

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3p <- Promax(fa(corf3$model,3))
> plot.new()
> modf3p <- structure.diagram(f3p,cut=.2)

[Figure: path diagram titled "Structural model"; MR1 is linked to V1, V2, V3 with paths 0.9, 0.8, 0.7; MR2 to V4, V5, V6 with paths 0.7, 0.6, 0.5; MR3 to V7, V8, V9 with paths 0.6, 0.5, 0.4.]

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y set.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <- rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)
     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00
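The same matrix algebra extends to the combined X and Y model. In the base R sketch below (an illustration under the assumption of standardized variables, not the package code; the names loadings, implied, and labels are hypothetical), the x and y loadings are stacked into one block-structured loading matrix with four factors, and Phi supplies the correlations among all four.

```r
fx <- matrix(c(0.9, 0.8, 0.7, rep(0, 9),
               0.7, 0.6, 0.5, rep(0, 9),
               0.6, 0.5, 0.4), ncol = 3)
fy <- c(0.6, 0.5, 0.4)
Phi <- matrix(c(1.00, 0.48, 0.32, 0.4,
                0.48, 1.00, 0.32, 0.3,
                0.32, 0.32, 1.00, 0.2,
                0.40, 0.30, 0.20, 1.0), ncol = 4)

# Block-structured loadings: x1-x9 on factors 1-3, y1-y3 on factor 4.
loadings <- matrix(0, nrow = 12, ncol = 4)
loadings[1:9, 1:3] <- fx
loadings[10:12, 4] <- fy

implied <- loadings %*% Phi %*% t(loadings)
diag(implied) <- 1
labels <- c(paste0("x", 1:9), paste0("y", 1:3))
dimnames(implied) <- list(labels, labels)
round(implied, 2)
```

The correlation of x1 with y1, for example, is .9 * .6 * .4 = .216, which appears as 0.22 in the rounded output.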

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

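> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

[Figure: path diagram titled "Structural model"; X1, X2, X3 are linked to x1 through x9 with paths a1 to a3, b4 to b6, and c7 to c9; Y is linked to y1, y2, y3 with paths Ya1, Ya2, Ya3; the X factors are connected to Y by the paths rad, rbd, and rcd.]

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.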

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> himod <- structure.diagram(fxh,Phi, fy)

[Figure: path diagram titled "Structural model"; X1, X2, X3 are linked to x1 through x9 with paths a1 to a3, b4 to b6, and c7 to c9; the general factor g is linked to X1, X2, X3 by the paths a, b, c and to Y by the path r; Y is linked to y1, y2, y3 with paths a1, a2, a3.]

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and, if n > 1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity = 1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is 0.11

Fit based upon off diagonal values = 0.96
> fa(cong1$model)
Factor Analysis using method = minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity = 1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are 6 and the objective function was 1.65
The degrees of freedom for the model are 2 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                MR1
Correlation of (regression) scores with factors 0.94
Multiple R square of scores with factors        0.88
Minimum correlation of possible factor scores   0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the fa function, implementing principal axes or minimum residual (minres) factor analysis, does.
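The first part of this claim can be checked without psych. The sketch below uses base R's eigen on the population matrix built from the loadings .9, .8, .7, .6; it illustrates what an unrotated first principal component is and is not a reimplementation of the principal function. The names model, decomposition, and pc1 are chosen here for illustration.

```r
# Population correlation matrix for the congeneric model.
fx <- c(0.9, 0.8, 0.7, 0.6)
model <- tcrossprod(fx)
diag(model) <- 1

# First principal component: eigenvector scaled by the square root of its eigenvalue.
decomposition <- eigen(model, symmetric = TRUE)
pc1 <- abs(decomposition$vectors[, 1]) * sqrt(decomposition$values[1])

round(pc1, 2)        # component loadings
round(pc1 - fx, 2)   # how far they are from the generating loadings
```

The component loadings differ from the generating loadings because components also account for the unit diagonal (unique variance), whereas the factor model fits only the off-diagonal correlations.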

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ² value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity = 1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is 0.1

Fit based upon off diagonal values = 0.88
> pa3
Factor Analysis using method = pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                PA1  PA3  PA2
Correlation of (regression) scores with factors 0.9 0.85 0.83
Multiple R square of scores with factors        0.8 0.72 0.69
Minimum correlation of possible factor scores   0.6 0.45 0.38
> ml3
Factor Analysis using method = ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 ML1  ML3  ML2
Correlation of (regression) scores with factors 0.90 0.85 0.83
Multiple R square of scores with factors        0.80 0.72 0.69
Minimum correlation of possible factor scores   0.61 0.45 0.38
> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
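The congruence coefficient itself is simple to compute. The base R function below, named congruence for this illustration only, is a sketch of the cosine formula (sum of cross products divided by the square root of the product of the sums of squares); it is not the factor.congruence function, and the two small loading matrices are made up for the example.

```r
# Hypothetical helper: cosines between the columns of two loading matrices.
congruence <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  crossprod(x, y) / sqrt(outer(colSums(x^2), colSums(y^2)))
}

# Two made-up solutions for six variables.
a <- cbind(F1 = c(0.8, 0.7, 0.6, 0, 0, 0),
           F2 = c(0, 0, 0, 0.7, 0.6, 0.5))
b <- cbind(G1 = 2 * a[, "F1"],              # same direction as F1, different length
           G2 = a[, "F1"] + a[, "F2"])      # a blend of F1 and F2

round(congruence(a, a), 2)   # identity: each factor is congruent with itself
round(congruence(a, b), 2)   # F1 with G1 is 1 because rescaling does not change the angle
```

Because the coefficient is a cosine, it ignores the overall size of the loadings and, unlike a correlation, does not remove their means.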

By default, principal uses the varimax rotation criterion, whereas fa applies an oblique (oblimin) transformation, which is why factor correlations are reported for the PA and ML solutions above. Perhaps it is useful to apply a different oblique transformation such as Promax to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00
    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957), with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.
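For the simulated bifact structure, where the generating loadings are known, the bifactor loadings can be written down directly. The base R sketch below applies the Schmid-Leiman idea to the population values (general loading = item loading times the factor's g-loading; residualized group loading = item loading times the square root of one minus the squared g-loading). It is an illustration under those assumptions, with hypothetical object names, and not the estimation procedure of omega, which starts from an exploratory factor analysis.

```r
gload <- c(0.9, 0.8, 0.7)                      # loadings of F1-F3 on g
fload <- matrix(c(0.8, 0.7, 0.6, rep(0, 9),
                  0.7, 0.6, 0.5, rep(0, 9),
                  0.7, 0.6, 0.4), ncol = 3)    # item loadings on F1-F3

g <- as.vector(fload %*% gload)                # general factor loadings
group <- fload %*% diag(sqrt(1 - gload^2))     # residualized group loadings

sl <- cbind(g, group)
dimnames(sl) <- list(paste0("V", 1:9), c("g", "F1", "F2", "F3"))

h2 <- rowSums(sl^2)       # communalities
p2 <- sl[, "g"]^2 / h2    # share of common variance due to g
round(cbind(sl, h2 = h2, p2 = p2), 2)
```

The communalities are unchanged by the transformation: they equal the squared item loadings of the original hierarchical model.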

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> ombi
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> ombi <- omega(bifact)

[Figure: path diagram titled "Omega"; the general factor g and the group factors F1, F2, F3 each have direct paths to the variables V1 to V9.]

Figure 9: An exploratory bifactor solution to the nine variable problem.

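> omhi <- omega(bifact,sl=FALSE)

[Figure: path diagram titled "Omega"; F1, F2, F3 are linked to V1 to V9 with paths 0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4, and g is linked to F1, F2, F3 with paths 0.9, 0.8, 0.7.]

Figure 10: An exploratory hierarchical solution to the nine variable problem.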

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0
The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23
The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
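Several of the summary statistics in this output can be recovered by hand from the Schmid-Leiman table. The following is a minimal sketch in base R, assuming the rounded g loadings and communalities printed above (so the results agree with the output only to about two decimals). It treats p2 as the proportion of each variable's common variance due to g, and the Explained Common Variance as the sum of squared g loadings divided by the sum of the communalities; the variable names are illustrative.

```r
# g loadings and communalities (h2) copied, rounded, from the omega output
g  <- c(V1 = .72, V2 = .63, V3 = .54, V4 = .56, V5 = .48,
        V6 = .40, V7 = .49, V8 = .42, V9 = .28)
h2 <- c(.64, .49, .36, .49, .36, .25, .49, .36, .16)

# p2: share of each variable's common variance that is due to g
p2 <- g^2 / h2

# Explained Common Variance of the general factor
ecv <- sum(g^2) / sum(h2)

round(p2, 2)
round(c(ECV = ecv, mean_p2 = mean(p2), sd_p2 = sd(p2),
        cv_p2 = sd(p2) / mean(p2)), 2)
```

The mean, standard deviation and coefficient of variation of p2 correspond to the "mean percent general" line of the output.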

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST. (Cluster diagram titled "Hierarchical cluster analysis of bifact data": variables V1 to V9 are combined into clusters C1 to C8, each annotated with its α value.)

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

 $model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

 $reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
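To make the structure behind this matrix concrete, the following base R sketch rebuilds it as loadings times factor intercorrelations times transposed loadings. The numeric values of fx, fy and Phi are not shown in this section; the ones used here are inferred from the printed correlations and reliabilities, so they are an assumption rather than the definition used by the package. The name lambda is illustrative.

```r
# Assumed loadings of the five X variables on two X factors
fx <- matrix(c(.9, .8, .6, 0,  0,
               0,  0,  .6, .8, -.7), ncol = 2)
# Assumed loadings of the three Y variables on one Y factor
fy <- c(.6, .5, .4)
# Assumed factor intercorrelations: X1 and X2 independent, each r = .7 with Y1
Phi <- matrix(c(1,  0,  .7,
                0,  1,  .7,
                .7, .7, 1), ncol = 3)

# Combined 8 x 3 loading matrix
lambda <- matrix(0, nrow = 8, ncol = 3)
lambda[1:5, 1:2] <- fx
lambda[6:8, 3]   <- fy
rownames(lambda) <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")

common      <- lambda %*% Phi %*% t(lambda)  # common variance and covariance
reliability <- diag(common)                   # communalities
model       <- common
diag(model) <- 1                              # population correlation matrix

round(model, 2)
round(reliability, 2)
```

Under these assumed values the off-diagonal entries and the reliabilities agree with the printed output to two decimals.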

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model. (Path diagram titled "Exploratory Structural Model": V, Q, A load on X1; A, nach, Anx load on X2; gpa, Pre, MA load on Y1; X1 and X2 each correlate 0.7 with Y1.)

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function, which requires Rgraphviz, or the structure.diagram or structure.sem functions, which do not, may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11
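The fit indices in this summary can be checked with a few lines of base R. The sketch below assumes that the reported AIC is the model chi square plus twice the number of free parameters and that the reported BIC is the chi square minus the degrees of freedom times log(N); both formulas reproduce the printed values here, but they are stated as a working assumption about how the summary is computed. Variable names are illustrative.

```r
chisq <- 6.593496   # model chi square from the summary
df    <- 5          # model degrees of freedom
n_par <- 5          # free parameters: a, x1e, x2e, x3e, x4e
n_obs <- 100        # sample size passed to sem

p_value <- pchisq(chisq, df, lower.tail = FALSE)
aic     <- chisq + 2 * n_par
bic     <- chisq - df * log(n_obs)

round(c(p = p_value, AIC = aic, BIC = bic), 4)
```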

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12
> anova(sem.cong,sem.tau)  #test the difference between the two models
LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

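The anova comparison of the congeneric versus tau equivalent model shows that, for this sample of 100, the change in χ² (3.65) is not significant given the change of 3 degrees of freedom (p = 0.30). That is, even though the population loadings are unequal, the tau equivalent model can not be rejected in favor of the congeneric model with these data.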
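The likelihood ratio comparison reported by anova can be reproduced by hand from the two model chi square values. The helper below is a minimal sketch in base R; lr_test is a hypothetical name introduced here for illustration and is not a function of sem or psych.

```r
# Likelihood ratio (chi square difference) test for two nested models.
# The "restricted" model has more degrees of freedom (fewer free parameters).
lr_test <- function(chisq_restricted, df_restricted, chisq_general, df_general) {
  lr <- chisq_restricted - chisq_general
  df <- df_restricted - df_general
  c(LR = lr, df = df, p = pchisq(lr, df, lower.tail = FALSE))
}

# tau equivalent (restricted) versus congeneric (general), values from the summaries
tau_vs_cong <- lr_test(6.593496, 5, 2.941678, 2)
round(tau_vs_cong, 4)
```

A p value above the conventional .05 level means that freeing the three extra loadings does not significantly improve the fit.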

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly, the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects, this model can not be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11
> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24
> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
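The size of the between-group residuals is not arbitrary. An orthogonal three factor model implies a correlation of zero between variables from different groups, so each such residual is simply the observed correlation, which under a higher order structure is the product of the two variables' general factor loadings. The base R sketch below illustrates this, assuming the rounded g loadings from the Schmid-Leiman output of omega(bifact) shown in section 3.2; the variable names are illustrative.

```r
# General factor loadings (rounded) from the Schmid-Leiman solution
g <- c(V1 = .72, V2 = .63, V3 = .54, V4 = .56, V5 = .48,
       V6 = .40, V7 = .49, V8 = .42, V9 = .28)
group <- rep(1:3, each = 3)   # V1-V3, V4-V6, V7-V9

# Expected residuals: g_i * g_j between groups, zero within groups
# (within-group correlations are fit by the group factor loadings)
expected_resid <- outer(g, g)
expected_resid[outer(group, group, "==")] <- 0

round(expected_resid, 2)
```

The resulting matrix agrees with the printed residuals to within rounding, which is why allowing the factors to correlate (or adding a general factor) is needed to fit these data.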

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)   #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)   #create the sem model
> mod.f3   #show it
      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution. (Path diagram titled "Structural model": V1 to V3 load on MR1, V4 to V6 on MR3, V7 to V9 on MR2, with correlated factors.)

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)   #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07
e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability, the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)
                     Std. Estimate
1          Sentences     0.7678671            Sentences <--- g
2         Vocabulary     0.7909246           Vocabulary <--- g
3     SentCompletion     0.7536211       SentCompletion <--- g
4       FirstLetters     0.6083814         FirstLetters <--- g
5    FourLetterWords     0.5973349      FourLetterWords <--- g
6           Suffixes     0.5717900             Suffixes <--- g
7       LetterSeries     0.5668950         LetterSeries <--- g
8          Pedigrees     0.6623317            Pedigrees <--- g
9        LetterGroup     0.5299523          LetterGroup <--- g
10       F1Sentences     0.4878697           Sentences <--- F1
11      F1Vocabulary     0.4523233          Vocabulary <--- F1
12  F1SentCompletion     0.4044507      SentCompletion <--- F1
13    F2FirstLetters     0.6140531        FirstLetters <--- F2
14 F2FourLetterWords     0.5058063     FourLetterWords <--- F2
15        F2Suffixes     0.3943206            Suffixes <--- F2
16    F3LetterSeries     0.7272957        LetterSeries <--- F3
17       F3Pedigrees     0.2468418           Pedigrees <--- F3
18     F3LetterGroup     0.4091494         LetterGroup <--- F3
19                e1     0.1723633     Sentences <--> Sentences
20                e2     0.1698418   Vocabulary <--> Vocabulary
21                e3     0.2684748 SentCompletion <--> SentCompletion
22                e4     0.2528108 FirstLetters <--> FirstLetters
23                e5     0.3873510 FourLetterWords <--> FourLetterWords
24                e6     0.5175675       Suffixes <--> Suffixes
25                e7     0.1496710 LetterSeries <--> LetterSeries
26                e8     0.5003859     Pedigrees <--> Pedigrees
27                e9     0.5517473 LetterGroup <--> LetterGroup
28                       1.0000000                   F1 <--> F1
29                       1.0000000                   F2 <--> F2
30                       1.0000000                   F3 <--> F3
31                       1.0000000                     g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi,digits=2)
                     Std. Estimate
1                gF1     0.8220754                    F1 <--- g
2                gF2     0.7817998                    F2 <--- g
3                gF3     0.8150140                    F3 <--- g
4        F1Sentences     0.9047111           Sentences <--- F1
5       F1Vocabulary     0.9138214          Vocabulary <--- F1
6   F1SentCompletion     0.8560764      SentCompletion <--- F1
7     F2FirstLetters     0.8357617        FirstLetters <--- F2
8  F2FourLetterWords     0.7971819     FourLetterWords <--- F2
9         F2Suffixes     0.7025560            Suffixes <--- F2
10    F3LetterSeries     0.7808129        LetterSeries <--- F3
11       F3Pedigrees     0.7201599           Pedigrees <--- F3
12     F3LetterGroup     0.7034902         LetterGroup <--- F3
13                e1     0.1814979     Sentences <--> Sentences
14                e2     0.1649304   Vocabulary <--> Vocabulary
15                e3     0.2671331 SentCompletion <--> SentCompletion
16                e4     0.3015024 FirstLetters <--> FirstLetters
17                e5     0.3645010 FourLetterWords <--> FourLetterWords
18                e6     0.5064151       Suffixes <--> Suffixes
19                e7     0.3903313 LetterSeries <--> LetterSeries
20                e8     0.4813697     Pedigrees <--> Pedigrees
21                e9     0.5051016 LetterGroup <--> LetterGroup
22                       0.3241920                   F1 <--> F1
23                       0.3887891                   F2 <--> F2
24                       0.3357521                   F3 <--> F3
25                       1.0000000                     g <--> g
> anova(sem.hi,sem.bi)
LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
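The difference of 6 parameters follows from counting the free parameters of each model against the number of distinct elements in the 9 x 9 matrix. The sketch below does this bookkeeping in base R, taking the parameter lists from the two summaries above (loadings and uniquenesses free, factor variances fixed); the variable names are illustrative.

```r
n_var     <- 9
n_moments <- n_var * (n_var + 1) / 2   # distinct variances and covariances: 45

# Bifactor: 9 general loadings + 9 group loadings + 9 uniquenesses
par_bifactor     <- 9 + 9 + 9
# Hierarchical: 3 paths from g to F1-F3 + 9 loadings + 9 uniquenesses
par_hierarchical <- 3 + 9 + 9

df_bifactor     <- n_moments - par_bifactor       # 18
df_hierarchical <- n_moments - par_hierarchical   # 24

c(df_bifactor = df_bifactor, df_hierarchical = df_hierarchical,
  extra_parameters = par_bifactor - par_hierarchical)
```

These degrees of freedom match the 18 and 24 reported by summary, and their difference is the 6 degrees of freedom used in the anova comparison.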

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)
LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first runs an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
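The confirmatory values of ωh and ωt can be approximated directly from the printed CFA loadings. The sketch below is a base R illustration, assuming an orthogonal bifactor model on standardized variables, so that the variance of the total score is the squared sum of the g loadings, plus the squared sum of the loadings on each group factor, plus the summed uniquenesses. The loadings are the rounded values from the table, so the result matches the output only to about two decimals; the variable names are illustrative.

```r
# Rounded loadings from the confirmatory (sem) part of the omegaSem output
g  <- c(.77, .79, .75, .61, .60, .57, .57, .66, .53)
group_loadings <- list(F1 = c(.49, .45, .40),
                       F2 = c(.61, .50, .39),
                       F3 = c(.73, .25, .41))
u2 <- c(.18, .17, .27, .26, .39, .52, .15, .50, .55)

var_general <- sum(g)^2                                         # due to g
var_group   <- sum(sapply(group_loadings, function(x) sum(x)^2)) # due to groups
var_total   <- var_general + var_group + sum(u2)                # total score variance

omega_h <- var_general / var_total
omega_t <- (var_general + var_group) / var_total

round(c(omega_h = omega_h, omega_t = omega_t), 2)
```

Because ωh depends on the squared sum of the g loadings, any shift of common variance from the group factors to g, as happens when cross loadings are fixed at zero, raises it.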

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structuresis an important advance in psychological research To understand these approaches it ishelpful to try them first on ldquobabyrdquo data sets To the extent that the models we use canbe tested on simple artificial examples it is perhaps easier to practice their application

Figure 16: Confirmatory Omega structure using omegaSem. (Path diagram titled "Omega from SEM": a general factor g and three group factors F1, F2, F3 with paths to the nine Thurstone variables.)

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


5.3 Testing the dimensionality of a hierarchical data set by creating the model
5.4 Testing the dimensionality based upon an exploratory analysis
5.5 Specifying a three factor model
5.6 Allowing for an oblique solution
5.7 Extract a bifactor solution using omega and then test that model using sem
5.7.1 sem of Thurstone 9 variable problem
5.8 Examining a hierarchical solution
5.9 Estimating Omega using EFA followed by CFA
6 Summary and conclusion

1 The psych package

1.1 Preface

The psych package (Revelle, 2020) has been developed to include those functions most useful for teaching and learning basic psychometrics and personality theory. Functions have been developed for many parts of the analysis of test data, including basic descriptive statistics (describe and pairs.panels), dimensionality analysis (ICLUST, VSS, principal, factor.pa), reliability analysis (omega, guttman) and eventual scale construction (cluster.cor, score.items). The use of these and other functions is described in more detail in the accompanying vignette (overview.pdf) as well as in the complete user's manual and the relevant help pages. (These vignettes are also available at https://personality-project.org/r/overview.pdf and https://personality-project.org/r/psych_for_sem.pdf.)

This vignette is concerned with the problem of modeling structural data and using the psych package as a front end for the much more powerful sem package of John Fox (Fox, 2006, 2009; Fox et al., 2013). Future releases of this vignette will include examples for using the lavaan package of Yves Rosseel (Rosseel, 2012).

The first section discusses how to simulate particular latent variable structures. The second considers several Exploratory Factor Analysis (EFA) solutions to these problems. The third section considers how to do confirmatory factor analysis and structural equation modeling using the sem package, but with the input prepared using functions in the psych package.

1.2 Creating and modeling structural relations

One common application of psych is the creation of simulated data matrices with particular structures to use as examples for principal components analysis, factor analysis, cluster analysis, and structural equation modeling. This vignette describes some of the functions used for creating, analyzing, and displaying such data sets. The examples use two other packages: Rgraphviz and sem. Although not required to use the psych package, sem is required for these examples. Although Rgraphviz had been used for the graphical displays, it has now been replaced with graphical functions within psych. The analyses themselves require only the sem package to do the structural modeling.

Since writing this vignette, the power of the lavaan package has allowed me to replace many of the sem commands with calls to lavaan. lavaan is loaded along with psych.

2 Functions for generating correlational matrices with a particular structure

The sim family of functions creates data sets with particular structure. Most of these functions have default values that will produce useful examples. Although graphical summaries of these structures will be shown here, some of the options of the graphical displays will be discussed in a later section.

The sim functions include:

sim.structure  A function to combine a measurement and structural model into one data matrix. Useful for understanding structural equation models. Combined with structure.diagram to see the proposed structure.

sim.congeneric  A function to create congeneric items/tests for demonstrating classical test theory. This is just a special case of sim.structure.

sim.hierarchical  A function to create data with a hierarchical (bifactor) structure.

sim.general  A function to simulate a general factor and multiple group factors. This is done in a somewhat more obvious, although less general, method than sim.hierarchical.

simCor  A generalized function to simulate data from a specified correlation matrix.

sim.item  A function to create items that either have a simple structure or a circumplex structure.

sim.circ  Create data with a circumplex structure.

sim.dichot  Create dichotomous item data with a simple or circumplex structure.

sim.minor  Create a factor structure for nvar variables defined by nfact major factors and nvar/2 "minor" factors for n observations.

sim.parallel  Create a number of simulated data sets using sim.minor to show how parallel analysis works.

sim.rasch  Create IRT data following a Rasch model.

sim.irt  Create a two parameter IRT logistic (2PL) model.

sim.anova  Simulate a 3 way balanced ANOVA or linear model, with or without repeated measures. Useful for teaching courses in research methods.

To make these examples replicable for readers, all simulations are prefaced by setting the random seed to a fixed (and for some, memorable) number (Adams, 1980). For normal use of the simulations, this is not necessary.

2.1 sim.congeneric

Classical test theory considers tests to be tau equivalent if they have the same covariance with a vector of latent true scores, but perhaps different error variances. Tests are considered congeneric if they each have the same true score component (perhaps to a different degree) and independent error components. The sim.congeneric function may be used to generate either structure.

The first example considers four tests with equal loadings on a latent factor (that is, a τ equivalent model). If the number of subjects is not specified, a population correlation matrix will be generated. If N is specified, then the sample correlation matrix is returned. If the "short" option is FALSE, then the population matrix, sample matrix, and sample data are all returned as elements of a list.
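To make the distinction between a population matrix and a sample matrix concrete, the following base R sketch generates data from a one factor model by hand. It is only an illustration of the idea under the stated model, not the code that sim.congeneric uses internally, and all object names are ours. Because the random draws are consumed in a different order, the sample values will not match the sim.congeneric output even with the same seed.

```r
set.seed(42)                     # fixed seed so the draws are reproducible
loads <- c(.8, .8, .8, .8)       # equal loadings: a tau equivalent model
N <- 100

theta <- rnorm(N)                                    # latent true scores
error <- matrix(rnorm(N * length(loads)), nrow = N)  # independent errors

# observed = loading * true score + error scaled so each test has unit variance
observed <- outer(theta, loads) + sweep(error, 2, sqrt(1 - loads^2), "*")

# Population correlations implied by the model: r_ij = loading_i * loading_j
population <- outer(loads, loads)
diag(population) <- 1

sample_r <- cor(observed)        # sample correlations for N = 100 cases
round(sample_r, 2)
```

The sample correlations scatter around the population value of 0.64, just as the sample matrix returned by sim.congeneric does.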

> library(psych)
> library(psychTools)
> set.seed(42)
> tau <- sim.congeneric(loads=c(.8,.8,.8,.8)) #population values
> tausamp <- sim.congeneric(loads=c(.8,.8,.8,.8),N=100) # sample correlation matrix for 100 cases
> round(tausamp,2)
     V1   V2   V3   V4
V1 1.00 0.68 0.72 0.66
V2 0.68 1.00 0.65 0.67
V3 0.72 0.65 1.00 0.76
V4 0.66 0.67 0.76 1.00
> tausamp <- sim.congeneric(loads=c(.8,.8,.8,.8),N=100, short=FALSE)
> tausamp
Call: NULL

$model (Population correlation matrix)
     V1   V2   V3   V4
V1 1.00 0.64 0.64 0.64
V2 0.64 1.00 0.64 0.64
V3 0.64 0.64 1.00 0.64
V4 0.64 0.64 0.64 1.00

$r (Sample correlation matrix for sample size = 100 )
     V1   V2   V3   V4
V1 1.00 0.70 0.62 0.58
V2 0.70 1.00 0.65 0.64
V3 0.62 0.65 1.00 0.59
V4 0.58 0.64 0.59 1.00
> dim(tausamp$observed)
[1] 100   4

In this last case, the generated data are retrieved from tausamp$observed. Congeneric data are created by specifying unequal loading values. The default values are loadings of c(.8,.7,.6,.5). As seen in Figure 1, tau equivalence is the special case where all paths are equal.
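Under a one factor model with standardized loadings, each population correlation is the product of the two loadings involved. A small base R sketch (the helper name congeneric_cor is hypothetical, not a psych function) shows how the tau equivalent and congeneric cases differ, and how a loading can be recovered from three correlations:

```r
# Hypothetical helper: population correlations implied by one-factor loadings.
congeneric_cor <- function(loads) {
  R <- outer(loads, loads)   # off-diagonal r_ij = lambda_i * lambda_j
  diag(R) <- 1
  R
}

tau_R  <- congeneric_cor(c(.8, .8, .8, .8))  # equal paths: all r equal
cong_R <- congeneric_cor(c(.8, .7, .6, .5))  # unequal paths: r varies

# A loading can be recovered from a triad of correlations:
# lambda_1 = sqrt(r12 * r13 / r23)
lambda1 <- sqrt(cong_R[1, 2] * cong_R[1, 3] / cong_R[2, 3])

round(cong_R, 2)
lambda1
```

In the tau equivalent matrix every off-diagonal element is the same, while the congeneric matrix has a different value for each pair of tests.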

> cong <- sim.congeneric(N=100)
> round(cong,2)
     V1   V2   V3   V4
V1 1.00 0.57 0.53 0.46
V2 0.57 1.00 0.35 0.41
V3 0.53 0.35 1.00 0.43
V4 0.46 0.41 0.43 1.00
> plot.new()
> m1 <- structure.diagram(c("a","b","c","d"))

Figure 1: Tau equivalent tests are special cases of congeneric tests. Tau equivalence assumes a=b=c=d. (Path diagram titled "Structural model": one latent variable X1 with paths a, b, c, d to x1, x2, x3, x4.)

2.2 sim.hierarchical

The previous function, sim.congeneric, is used when one factor accounts for the pattern of correlations. A slightly more complicated model is when one broad factor and several narrower factors are observed. An example of this structure might be the structure of mental abilities, where there is a broad factor of general ability and several narrower factors (e.g., spatial ability, verbal ability, working memory capacity). Another example is in the measure of psychopathology, where a broad general factor of neuroticism is seen along with more specific anxiety, depression, and aggression factors. This kind of structure may be simulated with sim.hierarchical, specifying the loadings of each sub factor on a general factor (the g-loadings) as well as the loadings of individual items on the lower order factors (the f-loadings). An early paper describing a bifactor structure was by Holzinger and Swineford (1937). A helpful description of what makes a good general factor is that of Jensen and Weng (1994).

For those who prefer real data to simulated data, six data sets are included in the bifactor data set. One is the original 14 variable problem of Holzinger and Swineford (1937) (holzinger), a second is a nine variable problem adapted by Bechtoldt (1961) from Thurstone and Thurstone (1941) (the data set is used as an example in the SAS manual and discussed in great detail by McDonald (1999)), a third is from a recent paper by Reise et al. (2007) with 16 measures of patient reports of interactions with their health care provider.

> set.seed(42)
> gload=matrix(c(.9,.8,.7),nrow=3)
> fload <- matrix(c(.8,.7,.6,rep(0,9),.7,.6,.5,
+                   rep(0,9),.7,.6,.4), ncol=3)
> fload #echo it to see the structure
      [,1] [,2] [,3]
 [1,]  0.8  0.0  0.0
 [2,]  0.7  0.0  0.0
 [3,]  0.6  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.7
 [8,]  0.0  0.0  0.6
 [9,]  0.0  0.0  0.4
> bifact <- sim.hierarchical(gload=gload,fload=fload)
> round(bifact,2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 1.00 0.56 0.48 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.56 1.00 0.42 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.48 0.42 1.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 1.00 0.42 0.35 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.42 1.00 0.30 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.35 0.30 1.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 1.00 0.42 0.28
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.42 1.00 0.24
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.28 0.24 1.00

These data can be represented as either a bifactor (Figure 2, panel A) or hierarchical (Figure 2, panel B) factor solution. The analysis was done with the omega function.
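The population matrix for a hierarchical structure can also be worked out by hand, which may help in seeing what the g-loadings and f-loadings contribute. The sketch below is a base R illustration under the assumption that the group factors correlate only through g; it is not the sim.hierarchical source code, and the object names Phi and R are ours.

```r
gload <- c(.9, .8, .7)                          # loadings of group factors on g
fload <- matrix(c(.8, .7, .6, rep(0, 9),
                  .7, .6, .5, rep(0, 9),
                  .7, .6, .4), ncol = 3)        # item loadings on group factors

# Group factors correlate only through g: phi_jk = g_j * g_k
Phi <- outer(gload, gload)
diag(Phi) <- 1

# Model-implied item correlations: R = F Phi F' with a unit diagonal
R <- fload %*% Phi %*% t(fload)
diag(R) <- 1
round(R, 2)
```

Within a group the correlations are products of the f-loadings (for example 0.8 × 0.7 = 0.56), while correlations across groups are further attenuated by the two g-loadings (0.8 × 0.7 × 0.9 × 0.8 ≈ 0.40).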

2.3 sim.item and sim.circ

Many personality questionnaires are thought to represent multiple, independent factors. A particularly interesting case is when there are two factors and the items either have simple structure or circumplex structure. Examples of such items with a circumplex structure are measures of emotion (Rafaeli and Revelle, 2006), where many different emotion terms can be arranged in a two dimensional space, but where there is no obvious clustering of items. Typical personality scales are constructed to have simple structure, where items load on one and only one factor.

An additional challenge to measurement with emotion or personality items is that the items can be highly skewed and are assessed with a small number of discrete categories (do not agree, somewhat agree, strongly agree).

The more general sim.item function, and the more specific sim.circ function, simulate items with a two dimensional structure, with or without skew, and varying the number of categories for the items. An example of a circumplex structure is shown in Figure 3.
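What "circumplex" means for a loading matrix can be sketched in a few lines of base R: the items are spaced evenly around a circle in the two factor space, so every item has the same communality and no item loads on only one factor by design. The loading size r = 0.6 and the object names are assumptions chosen for illustration; this is not how sim.circ is implemented.

```r
nvar  <- 16
theta <- 2 * pi * (seq_len(nvar) - 1) / nvar   # equally spaced angles
r     <- 0.6                                   # assumed common vector length

loadings <- cbind(F1 = r * cos(theta), F2 = r * sin(theta))
communality <- rowSums(loadings^2)             # r^2 for every item

# Implied correlations depend only on the angle between two items
R <- loadings %*% t(loadings)
diag(R) <- 1
round(R[1, c(2, 5, 9)], 2)   # neighbour, 90 degrees apart, opposite
```

Items 90 degrees apart are uncorrelated and items on opposite sides of the circle correlate negatively, which is why no clusters appear in a plot such as Figure 3.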

2.4 sim.structure

A more general case is to consider three matrices, fx, fy, and φxy, which describe, in turn, a measurement model of x variables, fx, a measurement model of y variables, fy, and a covariance matrix between and within the two sets of factors. If fx is a vector and fy and phixy are NULL, then this is just the congeneric model. If fx is a matrix of loadings with n rows and c columns, then this is a measurement model for n variables across c factors. If phixy is not null, but fy is NULL, then the factors in fx are correlated. Finally, if all three matrices are not NULL, then the data show the standard linear structural relations (LISREL) structure.

Consider the following examples:

2.4.1 fx is a vector implies a congeneric model

> set.seed(42)
> fx <- c(.9,.8,.7,.6)
> cong1 <- sim.structure(fx)
> cong1
Call: sim.structure(fx = fx)

$model (Population correlation matrix)
     V1   V2   V3   V4
V1 1.00 0.72 0.63 0.54
V2 0.72 1.00 0.56 0.48
V3 0.63 0.56 1.00 0.42
V4 0.54 0.48 0.42 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.36

> op <- par(mfrow=c(1,2))
> mbi <- omega(bifact,title="A bifactor model")
> mhi <- omega(bifact,sl=FALSE,title="A hierarchical model")
> op <- par(mfrow = c(1,1))

Figure 2: (Left panel, "A bifactor model") A bifactor solution represents each test in terms of a general factor and a residualized group factor. (Right panel, "A hierarchical model") A hierarchical factor solution has g as a second order factor accounting for the correlations between the first order factors.

> circ <- sim.circ(16)
> f2 <- fa(circ,2)
> plot(f2,title="16 simulated variables in a circumplex pattern")

Figure 3: Emotion items or interpersonal items frequently show a circumplex structure. Data generated by sim.circ and factor loadings found by the principal axis algorithm using factor.pa. (Plot of the 16 variables on factors MR1 and MR2, with both axes running from about -0.6 to 0.6.)

2.4.2 fx is a matrix implies an independent factors model

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> threefact <- sim.structure(fx)
> threefact
Call: sim.structure(fx = fx)

$model (Population correlation matrix)
     V1   V2   V3   V4   V5   V6   V7  V8   V9
V1 1.00 0.72 0.63 0.00 0.00 0.00 0.00 0.0 0.00
V2 0.72 1.00 0.56 0.00 0.00 0.00 0.00 0.0 0.00
V3 0.63 0.56 1.00 0.00 0.00 0.00 0.00 0.0 0.00
V4 0.00 0.00 0.00 1.00 0.42 0.35 0.00 0.0 0.00
V5 0.00 0.00 0.00 0.42 1.00 0.30 0.00 0.0 0.00
V6 0.00 0.00 0.00 0.35 0.30 1.00 0.00 0.0 0.00
V7 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.3 0.24
V8 0.00 0.00 0.00 0.00 0.00 0.00 0.30 1.0 0.20
V9 0.00 0.00 0.00 0.00 0.00 0.00 0.24 0.2 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model

> Phi = matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol=3)
> corf3 <- sim.structure(fx,Phi)
> fx
      [,1] [,2] [,3]
 [1,]  0.9  0.0  0.0
 [2,]  0.8  0.0  0.0
 [3,]  0.7  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.6
 [8,]  0.0  0.0  0.5
 [9,]  0.0  0.0  0.4
> Phi
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.3
[2,]  0.5  1.0  0.2
[3,]  0.3  0.2  1.0
> corf3
Call: sim.structure(fx = fx, Phi = Phi)

$model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

Figure 4: Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram. (Path diagram titled "Structural model": X1 loads on x1 to x3 with 0.9, 0.8, 0.7; X2 on x4 to x6 with 0.7, 0.6, 0.5; X3 on x7 to x9 with 0.6, 0.5, 0.4.)

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs #show the matrix
      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"
> Phis #show this one as well
   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.
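The idea behind a symbolic loading matrix is simple enough to sketch in base R. The helper below, symbolic_loadings, is a hypothetical stand-in written for illustration; it mimics the layout of the matrix printed above but is not the structure.list implementation.

```r
# Hypothetical helper: build a character matrix of symbolic loadings.
# 'groups' is a named list giving, for each factor, the items that load on it.
symbolic_loadings <- function(nvar, groups) {
  m <- matrix("0", nrow = nvar, ncol = length(groups),
              dimnames = list(NULL, names(groups)))
  for (k in seq_along(groups)) {
    items <- groups[[k]]
    # label each free path with a letter for the factor and the item number
    m[items, k] <- paste0(letters[k], items)
  }
  m
}

fxs_sketch <- symbolic_loadings(9, list(F1 = 1:3, F2 = 4:6, F3 = 7:9))
fxs_sketch
```

Each non-zero cell names a free parameter, which is the form that a confirmatory model specification needs.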

> plot.new()
> corf3mod <- structure.diagram(fxs,Phis)

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input. (Path diagram titled "Structural model": F1, F2, F3 with paths a1 to a3, b4 to b6, c7 to c9 and factor correlations rab, rac, rbc.)

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3p <- Promax(fa(corf3$model,3))
> plot.new()
> modf3p <- structure.diagram(f3p,cut=.2)

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown. (Path diagram titled "Structural model": MR1, MR2, MR3 with loadings 0.9, 0.8, 0.7; 0.7, 0.6, 0.5; and 0.6, 0.5, 0.4 on V1 to V9.)

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y set.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <-rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)
     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).
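The twelve variable matrix can be reproduced by stacking the x and y measurement models into one loading matrix and applying the same R = F Phi F' rule used for a single set of factors. The base R sketch below assumes that this is all the population model involves; the object names lambda and R are ours, and the code is an illustration rather than the sim.structure source.

```r
fx  <- matrix(c(.9, .8, .7, rep(0, 9),
                .7, .6, .5, rep(0, 9),
                .6, .5, .4), ncol = 3)       # x measurement model
fy  <- c(.6, .5, .4)                         # y measurement model (one factor)
Phi <- matrix(c(1, .48, .32, .4,
                .48, 1, .32, .3,
                .32, .32, 1, .2,
                .4, .3, .2, 1), ncol = 4)    # correlations among X1, X2, X3, Y

# Block loading matrix: x items load only on X factors, y items only on Y
lambda <- rbind(cbind(fx, 0),
                cbind(matrix(0, 3, 3), fy))

R <- lambda %*% Phi %*% t(lambda)
diag(R) <- 1
dimnames(R) <- list(c(paste0("x", 1:9), paste0("y", 1:3)),
                    c(paste0("x", 1:9), paste0("y", 1:3)))
round(R[c("x1", "x4", "x7"), c("y1", "y2", "y3")], 2)
```

For example, the x1 with y1 correlation is 0.9 × 0.4 × 0.6 = 0.216, which rounds to the 0.22 shown in the matrix above.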

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y. (Path diagram titled "Structural model": X1, X2, X3 with paths a1 to a3, b4 to b6, c7 to c9; Y with paths Ya1, Ya2, Ya3 to y1, y2, y3; and paths rad, rbd, rcd between the X factors and Y.)

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> himod <-structure.diagram(fxh,Phi, fy)

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable. (Path diagram titled "Structural model": X1, X2, X3 load on x1 to x9; g is linked to the group factors by paths a, b, c and to Y by path r.)

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and, if n > 1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity = 1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is 0.11

Fit based upon off diagonal values = 0.96
> fa(cong1$model)
Factor Analysis using method = minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity = 1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are 6 and the objective function was 1.65
The degrees of freedom for the model are 2 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   MR1
Correlation of (regression) scores with factors   0.94
Multiple R square of scores with factors          0.88
Minimum correlation of possible factor scores     0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ² value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity = 1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is 0.1

Fit based upon off diagonal values = 0.88

> pa3
Factor Analysis using method = pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38

> ml3
Factor Analysis using method = ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
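The congruence coefficient is the cosine of the angle between two columns of loadings: the sum of cross products divided by the square root of the product of the sums of squares. A minimal base R version follows; the function name congruence and the example loadings are ours, chosen for illustration, and factor.congruence itself handles more input types.

```r
# Cosine of the angle between every column of x and every column of y.
congruence <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  crossprod(x, y) / sqrt(outer(colSums(x^2), colSums(y^2)))
}

a <- cbind(c(.8, .7, .6, 0, 0, 0),
           c(0, 0, 0, .7, .6, .5))
b <- 0.5 * a     # same pattern of loadings, half the size

congruence(a, b)
```

Unlike a correlation, the coefficient does not remove the column means, and it ignores the overall size of the loadings: a and b above are perfectly congruent even though b is half the size of a.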

By default, all three of these procedures use the varimax rotation criterion. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00
    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> om.bi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem.

> om.hi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem.

> om.bi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85
Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g  F1*  F2*  F3*   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g  F1*  F2*  F3*
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0
The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23
The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g   F1*   F2*   F3*
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
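The Schmid-Leiman loadings in the table above can be reproduced by hand from a hierarchical structure. The following base R sketch assumes first-order loadings equal to those in the three factor solution shown earlier (0.8, 0.7, 0.6; 0.7, 0.6, 0.5; 0.7, 0.6, 0.4) and second-order loadings of 0.9, 0.8 and 0.7 as drawn in Figure 10. The variable names are illustrative, and this is not the code that omega itself runs.

```r
# First-order loadings of the nine variables on three correlated group factors.
lambda <- matrix(0, nrow = 9, ncol = 3,
                 dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
lambda[1:3, 1] <- c(0.8, 0.7, 0.6)
lambda[4:6, 2] <- c(0.7, 0.6, 0.5)
lambda[7:9, 3] <- c(0.7, 0.6, 0.4)

# Assumed loadings of the three group factors on the general factor g.
gamma <- c(0.9, 0.8, 0.7)

# Schmid-Leiman transformation: general loadings are lambda %*% gamma;
# group loadings are the first-order loadings scaled by the residual
# standard deviation of each group factor.
g_loadings     <- drop(lambda %*% gamma)
group_loadings <- lambda %*% diag(sqrt(1 - gamma^2))
colnames(group_loadings) <- c("F1*", "F2*", "F3*")

h2 <- g_loadings^2 + rowSums(group_loadings^2)  # communality
p2 <- g_loadings^2 / h2                         # share of common variance due to g
sl <- round(cbind(g = g_loadings, group_loadings, h2 = h2, u2 = 1 - h2, p2 = p2), 2)
print(sl)
```

Under these assumptions the g, h2, u2 and p2 columns agree with the printed omega table; the only difference is that omega suppresses loadings below 0.2 where this sketch prints zeros.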

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

 $model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
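The population matrix above is what a factor model implies once the X and Y loadings and the factor correlations are fixed. As a hedged illustration in base R, the loadings and Phi below are assumptions chosen because they reproduce the printed matrix; they are not read from the objects passed to sim.structural, and the code is a sketch rather than that function's implementation.

```r
# Assumed measurement models: two X factors (ability, motivation), one Y factor.
fx <- matrix(c(0.9, 0.8, 0.6, 0,   0,
               0,   0,   0.6, 0.8, -0.7), ncol = 2)
fy <- matrix(c(0.6, 0.5, 0.4), ncol = 1)

# Assumed factor correlations: X1 and X2 uncorrelated, each correlating 0.7 with Y1.
Phi <- matrix(c(1,   0,   0.7,
                0,   1,   0.7,
                0.7, 0.7, 1), ncol = 3)
vars <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")

# Stack the two measurement models into one block-diagonal loading matrix.
f <- rbind(cbind(fx, 0),
           cbind(matrix(0, nrow(fy), ncol(fx)), fy))
rownames(f) <- vars

model <- f %*% Phi %*% t(f)    # factor-implied covariances
reliability <- diag(model)     # communalities, printed above as reliabilities
diag(model) <- 1               # a correlation matrix has a unit diagonal
print(round(model, 2))
print(round(reliability, 2))
```

For example, the implied correlation of A with gpa is (0.6 × 0.7 + 0.6 × 0.7) × 0.6 = 0.504, which rounds to the 0.50 shown in the matrix.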

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the X set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the Y set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are  7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function, which requires Rgraphviz, or the structure.diagram or structure.sem functions, which do not, may be used.
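To make the distinction concrete, the sketch below builds the correlation matrix implied by each model and counts degrees of freedom for four variables. It is plain base R with hypothetical loadings (0.8, 0.7, 0.6, 0.5 and a common 0.65); the helper name `implied` is illustrative only.

```r
# Correlation matrix implied by a one factor model with unit factor variance.
implied <- function(loadings) {
  r <- tcrossprod(loadings)   # loadings %*% t(loadings)
  diag(r) <- 1
  r
}

# Hypothetical loadings, for illustration only.
r_cong <- implied(c(0.8, 0.7, 0.6, 0.5))  # congeneric: loadings free to differ
r_tau  <- implied(rep(0.65, 4))           # tau equivalent: one common loading

# Degrees of freedom = distinct variances and covariances - free parameters.
p <- 4
n_moments <- p * (p + 1) / 2      # 10 distinct elements for four variables
df_cong <- n_moments - (p + p)    # 4 loadings + 4 error variances
df_tau  <- n_moments - (1 + p)    # 1 loading  + 4 error variances
```

Under tau equivalence every off-diagonal correlation is the same, while the congeneric matrix has a different value for each pair. The two models differ by three parameters, so a comparison between them has three degrees of freedom.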

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau   #show it

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11
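The AIC, BIC and probability printed in this summary are consistent with simple functions of the model chi square. The relations below are inferred from the printed values rather than taken from the sem documentation, so treat them as a sketch for checking the arithmetic.

```r
chisq <- 6.593496   # model chi square reported above
df    <- 5          # model degrees of freedom
npar  <- 5          # free parameters: one common loading plus four error variances
n     <- 100        # number of observations passed to sem

aic     <- chisq + 2 * npar                        # matches the printed AIC
bic     <- chisq - df * log(n)                     # matches the printed BIC
p_value <- pchisq(chisq, df, lower.tail = FALSE)   # upper tail probability
print(c(AIC = aic, BIC = bic, p = p_value))
```

A negative BIC on this scale indicates that the chi square is small relative to the degrees of freedom and the sample size.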

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong   #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)   #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

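The anova comparison of the congeneric versus tau equivalent model shows that the change in χ² (3.65) is not significant given the change in degrees of freedom (3, p = 0.30). That is, although the hypothesis of tau equivalence is false, with 100 subjects it cannot be rejected in favor of the congeneric model.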
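The likelihood ratio test printed by anova can be checked by hand from the two model chi squares. This is a minimal base R sketch using the values reported in the two summaries; the variable names are illustrative.

```r
chisq_cong <- 2.941678; df_cong <- 2   # congeneric model
chisq_tau  <- 6.593496; df_tau  <- 5   # tau equivalent model

lr    <- chisq_tau - chisq_cong        # difference in chi square
lr_df <- df_tau - df_cong              # difference in degrees of freedom
lr_p  <- pchisq(lr, lr_df, lower.tail = FALSE)
print(c(LR = lr, df = lr_df, p = lr_p))
```

The same arithmetic applies to any pair of nested models fitted to the same data, provided the more restricted model is the one with more degrees of freedom.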

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one   #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
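The pattern in this residual matrix (positive residuals within each group of three variables, small negative ones between groups) can be approximated without sem. The sketch below is an assumption-laden stand-in: it rebuilds a population matrix from loadings taken to match bifact (first-order loadings 0.8 to 0.4, second-order loadings 0.9, 0.8, 0.7) and fits one maximum likelihood factor with `factanal` from the stats package rather than with sem.

```r
# Rebuild a hierarchical population correlation matrix (assumed to match bifact).
lambda <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), NULL))
lambda[1:3, 1] <- c(0.8, 0.7, 0.6)
lambda[4:6, 2] <- c(0.7, 0.6, 0.5)
lambda[7:9, 3] <- c(0.7, 0.6, 0.4)
gamma <- c(0.9, 0.8, 0.7)                  # loadings of the group factors on g
Phi <- tcrossprod(gamma); diag(Phi) <- 1   # implied factor intercorrelations
R <- lambda %*% Phi %*% t(lambda); diag(R) <- 1

# One factor maximum likelihood solution (stats::factanal, not sem).
fit <- factanal(covmat = R, factors = 1, n.obs = 100)
load1 <- fit$loadings[, 1]

# Residuals: observed correlations minus those implied by the one factor model.
implied <- tcrossprod(load1) + diag(fit$uniquenesses)
res <- round(R - implied, 2)
print(res)
```

Because a single factor has to compromise between the within-group and between-group correlations, it underestimates the former and overestimates the latter, which is the structure visible in the residuals.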

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
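Why the between-group residuals are so large follows directly from the model: with orthogonal factors and no cross loadings, the implied correlation between variables in different groups is zero, so each residual is simply the observed correlation. A base R sketch, again assuming a population matrix built from loadings taken to match bifact (all names are illustrative):

```r
# Assumed hierarchical structure (first-order loadings and loadings on g).
lambda <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), NULL))
lambda[1:3, 1] <- c(0.8, 0.7, 0.6)
lambda[4:6, 2] <- c(0.7, 0.6, 0.5)
lambda[7:9, 3] <- c(0.7, 0.6, 0.4)
gamma <- c(0.9, 0.8, 0.7)
Phi <- tcrossprod(gamma); diag(Phi) <- 1          # factor intercorrelations
R <- lambda %*% Phi %*% t(lambda); diag(R) <- 1   # population correlations
u2 <- 1 - rowSums(lambda^2)                       # uniquenesses

# Orthogonal model: factors uncorrelated, so between-group correlations are missed.
res_orth <- round(R - (tcrossprod(lambda) + diag(u2)), 2)

# Oblique model: allowing the factors to correlate removes the residuals.
res_obl <- round(R - (lambda %*% Phi %*% t(lambda) + diag(u2)), 2)
print(res_orth)
```

For instance, the V1 with V4 residual is 0.8 × 0.72 × 0.7 = 0.40, the value in the table above, and it disappears once the correlation of 0.72 between the first two factors is allowed.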

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)            #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)   #create the sem model
> mod.f3                        #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph (Figure 13).

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)   #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                    Estimate  Std Error  z value    Pr(>|z|)
Sentences           0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary          0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion      0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters        0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords     0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes            0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries        0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees           0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup         0.5299524 0.07332494  7.2274501 4.921470e-13
F1*Sentences        0.4878698 0.08141095  5.9926801 2.064107e-09
F1*Vocabulary       0.4523234 0.08353995  5.4144562 6.147524e-08
F1*SentCompletion   0.4044507 0.08727334  4.6342988 3.581494e-06
F2*FirstLetters     0.6140531 0.08471145  7.2487623 4.205973e-13
F2*FourLetterWords  0.5058063 0.08145488  6.2096500 5.310276e-10
F2*Suffixes         0.3943208 0.07805383  5.0519075 4.374195e-07
F3*LetterSeries     0.7272955 0.15844866  4.5901015 4.430304e-06
F3*Pedigrees        0.2468417 0.08677536  2.8446053 4.446649e-03
F3*LetterGroup      0.4091495 0.11352380  3.6040854 3.132541e-04
e1                  0.1723633 0.03405646  5.0611045 4.168346e-07
e2                  0.1698419 0.03001233  5.6590697 1.521958e-08
e3                  0.2684749 0.03316228  8.0957909 5.689350e-16
e4                  0.2528108 0.07942835  3.1828791 1.458185e-03
e5                  0.3873510 0.06317399  6.1314949 8.705712e-10
e6                  0.5175679 0.05955079  8.6912013 3.586269e-18
e7                  0.1496709 0.21861502  0.6846325 4.935759e-01
e8                  0.5003855 0.05956551  8.4005902 4.442400e-17
e9                  0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences           Sentences <--- g
Vocabulary          Vocabulary <--- g
SentCompletion      SentCompletion <--- g
FirstLetters        FirstLetters <--- g
FourLetterWords     FourLetterWords <--- g
Suffixes            Suffixes <--- g
LetterSeries        LetterSeries <--- g
Pedigrees           Pedigrees <--- g
LetterGroup         LetterGroup <--- g
F1*Sentences        Sentences <--- F1*
F1*Vocabulary       Vocabulary <--- F1*
F1*SentCompletion   SentCompletion <--- F1*
F2*FirstLetters     FirstLetters <--- F2*
F2*FourLetterWords  FourLetterWords <--- F2*
F2*Suffixes         Suffixes <--- F2*
F3*LetterSeries     LetterSeries <--- F3*
F3*Pedigrees        Pedigrees <--- F3*
F3*LetterGroup      LetterGroup <--- F3*
e1                  Sentences <--> Sentences
e2                  Vocabulary <--> Vocabulary
e3                  SentCompletion <--> SentCompletion
e4                  FirstLetters <--> FirstLetters
e5                  FourLetterWords <--> FourLetterWords
e6                  Suffixes <--> Suffixes
e7                  LetterSeries <--> LetterSeries
e8                  Pedigrees <--> Pedigrees
e9                  LetterGroup <--> LetterGroup

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)

                      Std. Estimate
1  Sentences          0.7678671  Sentences <--- g
2  Vocabulary         0.7909246  Vocabulary <--- g
3  SentCompletion     0.7536211  SentCompletion <--- g
4  FirstLetters       0.6083814  FirstLetters <--- g
5  FourLetterWords    0.5973349  FourLetterWords <--- g
6  Suffixes           0.5717900  Suffixes <--- g
7  LetterSeries       0.5668950  LetterSeries <--- g
8  Pedigrees          0.6623317  Pedigrees <--- g
9  LetterGroup        0.5299523  LetterGroup <--- g
10 F1*Sentences       0.4878697  Sentences <--- F1*
11 F1*Vocabulary      0.4523233  Vocabulary <--- F1*
12 F1*SentCompletion  0.4044507  SentCompletion <--- F1*
13 F2*FirstLetters    0.6140531  FirstLetters <--- F2*
14 F2*FourLetterWords 0.5058063  FourLetterWords <--- F2*
15 F2*Suffixes        0.3943206  Suffixes <--- F2*
16 F3*LetterSeries    0.7272957  LetterSeries <--- F3*
17 F3*Pedigrees       0.2468418  Pedigrees <--- F3*
18 F3*LetterGroup     0.4091494  LetterGroup <--- F3*
19 e1                 0.1723633  Sentences <--> Sentences
20 e2                 0.1698418  Vocabulary <--> Vocabulary
21 e3                 0.2684748  SentCompletion <--> SentCompletion
22 e4                 0.2528108  FirstLetters <--> FirstLetters
23 e5                 0.3873510  FourLetterWords <--> FourLetterWords
24 e6                 0.5175675  Suffixes <--> Suffixes
25 e7                 0.1496710  LetterSeries <--> LetterSeries
26 e8                 0.5003859  Pedigrees <--> Pedigrees
27 e9                 0.5517473  LetterGroup <--> LetterGroup
28                    1.0000000  F1* <--> F1*
29                    1.0000000  F2* <--> F2*
30                    1.0000000  F3* <--> F3*
31                    1.0000000  g <--> g

Compare this solution to the one reported below, and to the sem manual.
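In a bifactor model with orthogonal factors, the R-square for each variable is the sum of its squared standardized loadings on g and on its group factor, and the standardized error variance is the remainder. A quick base R check using three rows copied from the stdCoef output above (the vector names are illustrative):

```r
# Standardized loadings and error variances copied from the stdCoef output.
g  <- c(Sentences = 0.7678671, Vocabulary = 0.7909246, SentCompletion = 0.7536211)
f1 <- c(Sentences = 0.4878697, Vocabulary = 0.4523233, SentCompletion = 0.4044507)
e  <- c(Sentences = 0.1723633, Vocabulary = 0.1698418, SentCompletion = 0.2684748)

r_square <- g^2 + f1^2      # variance explained by g plus the group factor
print(round(r_square, 4))   # compare with the R-square values in the summary
print(round(r_square + e, 4))
```

The explained and error variances sum to one for each variable, as they should for a model fitted to a correlation matrix.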

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi,digits=2)

                     Std. Estimate
1  gF1               0.8220754  F1 <--- g
2  gF2               0.7817998  F2 <--- g
3  gF3               0.8150140  F3 <--- g
4  F1Sentences       0.9047111  Sentences <--- F1
5  F1Vocabulary      0.9138214  Vocabulary <--- F1
6  F1SentCompletion  0.8560764  SentCompletion <--- F1
7  F2FirstLetters    0.8357617  FirstLetters <--- F2
8  F2FourLetterWords 0.7971819  FourLetterWords <--- F2
9  F2Suffixes        0.7025560  Suffixes <--- F2
10 F3LetterSeries    0.7808129  LetterSeries <--- F3
11 F3Pedigrees       0.7201599  Pedigrees <--- F3
12 F3LetterGroup     0.7034902  LetterGroup <--- F3
13 e1                0.1814979  Sentences <--> Sentences
14 e2                0.1649304  Vocabulary <--> Vocabulary
15 e3                0.2671331  SentCompletion <--> SentCompletion
16 e4                0.3015024  FirstLetters <--> FirstLetters
17 e5                0.3645010  FourLetterWords <--> FourLetterWords
18 e6                0.5064151  Suffixes <--> Suffixes
19 e7                0.3903313  LetterSeries <--> LetterSeries
20 e8                0.4813697  Pedigrees <--> Pedigrees
21 e9                0.5051016  LetterGroup <--> LetterGroup
22                   0.3241920  F1 <--> F1
23                   0.3887891  F2 <--> F2
24                   0.3357521  F3 <--> F3
25                   1.0000000  g <--> g
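The unstandardized estimates in the summary and the standardized ones from stdCoef are linked through the implied variance of each group factor. The sketch below assumes, as the printed values suggest, that the variance of g and the residual variance of F1 are both fixed at 1; it is an arithmetic check in base R, not a description of how stdCoef is implemented.

```r
g_f1         <- 1.4438115   # unstandardized path g -> F1 from the summary
f1_sentences <- 0.5151232   # unstandardized path F1 -> Sentences

# Implied variance of F1: variance passed on from g plus its residual variance of 1.
var_f1 <- g_f1^2 + 1

std_g_f1         <- g_f1 / sqrt(var_f1)          # standardized g -> F1
std_f1_sentences <- f1_sentences * sqrt(var_f1)  # Sentences has unit variance
r2_f1            <- std_g_f1^2                   # R-square for F1
resid_f1         <- 1 / var_f1                   # standardized residual variance of F1
print(c(std_g_f1, std_f1_sentences, r2_f1, resid_f1))
```

These reproduce the stdCoef rows for gF1, F1Sentences and the F1 residual variance, and the R-square for F1 shown in the summary.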

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

gt omsem lt- omegaSem(Thurstonenobs=213)

44

Call omegaSem(m = Thurstone nobs = 213)

Omega

Call omegah(m = m nfactors = nfactors fm = fm key = key flip = flip

digits = digits title = title sl = sl labels = labels

plot = plot nobs = nobs rotate = rotate Phi = Phi option = option

covar = covar)

Alpha 089

G6 091

Omega Hierarchical 074

Omega H asymptotic 079

Omega Total 093

Schmid Leiman Factor loadings greater than 02

g F1 F2 F3 h2 u2 p2

Sentences 071 056 082 018 061

Vocabulary 073 055 084 016 063

SentCompletion 068 052 074 026 063

FirstLetters 065 056 073 027 057

FourLetterWords 062 049 063 037 061

Suffixes 056 041 050 050 063

LetterSeries 059 062 073 027 048

Pedigrees 058 024 034 051 049 066

LetterGroup 054 046 052 048 056

With eigenvalues of

g F1 F2 F3

358 096 074 072

generalmax 373 maxmin = 134

mean percent general = 06 with sd = 005 and cv of 009

Explained Common Variance of the general factor = 06

The degrees of freedom are 12 and the fit is 001

The number of observations was 213 with Chi Square = 298 with prob lt 1

The root mean square of the residuals is 001

The df corrected root mean square of the residuals is 001

RMSEA index = 0 and the 10 confidence intervals are 0 0

BIC = -6136

Compare this with the adequacy of just a general factor and no group factors

The degrees of freedom for just the general factor are 27 and the fit is 148

The number of observations was 213 with Chi Square = 30671 with prob lt 33e-49

The root mean square of the residuals is 014

The df corrected root mean square of the residuals is 016

RMSEA index = 022 and the 10 confidence intervals are 0199 0244

BIC = 16196

Measures of factor score adequacy

g F1 F2 F3

Correlation of scores with factors 086 073 072 075

Multiple R square of scores with factors 074 054 051 057

Minimum correlation of factor score estimates 049 007 003 013

Total General and Subset omega for each subset

g F1 F2 F3

Omega total for total scores and subscales 093 092 083 079

Omega general for total scores and subscales 074 058 050 047

45

Omega group for total scores and subscales 016 034 032 032

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical in the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to go through the g factor.
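To make the ωh and ωt values above more concrete, the following base R sketch recomputes them from the printed exploratory Schmid Leiman loadings. It is only an approximation under stated assumptions: the loadings are the two decimal values shown in the EFA table (loadings below 0.2 were suppressed in that printout and are treated as zero here), and the total variance is the model implied variance rather than the sum of the observed correlation matrix that omega uses. The object names (g, F1, F2, F3, u2, Vt) are illustrative only.

```r
# General factor loadings (column g of the EFA Schmid Leiman table)
g  <- c(.71, .73, .68, .65, .62, .56, .59, .58, .54)
# Group factor loadings; suppressed loadings (< 0.2) are set to 0 (an assumption)
F1 <- c(.56, .55, .52, 0, 0, 0, 0, .24, 0)
F2 <- c(0, 0, 0, .56, .49, .41, 0, 0, 0)
F3 <- c(0, 0, 0, 0, 0, 0, .62, .34, .46)
# Uniquenesses (column u2)
u2 <- c(.18, .16, .26, .27, .37, .50, .27, .49, .48)

# Model implied variance of the total (unit weighted) score
Vt <- sum(g)^2 + sum(F1)^2 + sum(F2)^2 + sum(F3)^2 + sum(u2)

omega.h <- sum(g)^2 / Vt          # variance due to the general factor
omega.t <- (Vt - sum(u2)) / Vt    # variance due to all common factors
round(c(omega.h = omega.h, omega.t = omega.t), 2)  # roughly 0.75 and 0.93
```

The small difference from the reported ωh of 0.74 is expected given the rounding and the suppressed loadings; the point of the sketch is only that ωh is the squared sum of the general factor loadings divided by the total test variance.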

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem. (Path diagram titled "Omega from SEM": a general factor g and group factors F1, F2 and F3 for the nine Thurstone variables.)

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


1 The psych package

1.1 Preface

The psych package (Revelle, 2020) has been developed to include those functions most useful for teaching and learning basic psychometrics and personality theory. Functions have been developed for many parts of the analysis of test data, including basic descriptive statistics (describe and pairs.panels), dimensionality analysis (ICLUST, VSS, principal, factor.pa), reliability analysis (omega, guttman) and eventual scale construction (cluster.cor, score.items). The use of these and other functions is described in more detail in the accompanying vignette (overview.pdf) as well as in the complete user's manual and the relevant help pages. (These vignettes are also available at https://personality-project.org/r/overview.pdf and https://personality-project.org/r/psych_for_sem.pdf.)

This vignette is concerned with the problem of modeling structural data and using the psych package as a front end for the much more powerful sem package of John Fox (Fox, 2006, 2009; Fox et al., 2013). Future releases of this vignette will include examples for using the lavaan package of Yves Rosseel (Rosseel, 2012).

The first section discusses how to simulate particular latent variable structures. The second considers several Exploratory Factor Analysis (EFA) solutions to these problems. The third section considers how to do confirmatory factor analysis and structural equation modeling using the sem package, but with the input prepared using functions in the psych package.

1.2 Creating and modeling structural relations

One common application of psych is the creation of simulated data matrices with particular structures to use as examples for principal components analysis, factor analysis, cluster analysis, and structural equation modeling. This vignette describes some of the functions used for creating, analyzing, and displaying such data sets. The examples use two other packages: Rgraphviz and sem. Although not required to use the psych package, sem is required for these examples. Although Rgraphviz had been used for the graphical displays, it has now been replaced with graphical functions within psych. The analyses themselves require only the sem package to do the structural modeling.

Since writing this vignette, the power of the lavaan package has allowed me to replace many of the sem commands with calls to lavaan. lavaan is loaded along with psych.

2 Functions for generating correlational matrices with a particular structure

The sim family of functions creates data sets with a particular structure. Most of these functions have default values that will produce useful examples. Although graphical summaries of these structures will be shown here, some of the options of the graphical displays will be discussed in a later section.

The sim functions include:

sim.structure  A function to combine a measurement and structural model into one data matrix. Useful for understanding structural equation models. Combine with structure.diagram to see the proposed structure.

sim.congeneric  A function to create congeneric items/tests for demonstrating classical test theory. This is just a special case of sim.structure.

sim.hierarchical  A function to create data with a hierarchical (bifactor) structure.

sim.general  A function to simulate a general factor and multiple group factors. This is done in a somewhat more obvious, although less general, method than sim.hierarchical.

simCor  A generalized function to simulate data from a specified correlation matrix.

sim.item  A function to create items that either have a simple structure or a circumplex structure.

sim.circ  Create data with a circumplex structure.

sim.dichot  Create dichotomous item data with a simple or circumplex structure.

sim.minor  Create a factor structure for nvar variables defined by nfact major factors and nvar/2 "minor" factors for n observations.

sim.parallel  Create a number of simulated data sets using sim.minor to show how parallel analysis works.

sim.rasch  Create IRT data following a Rasch model.

sim.irt  Create a two parameter IRT logistic (2PL) model.

sim.anova  Simulate a 3 way balanced ANOVA or linear model, with or without repeated measures. Useful for teaching courses in research methods.

To make these examples replicable for readers, all simulations are prefaced by setting the random seed to a fixed (and for some, memorable) number (Adams, 1980). For normal use of the simulations, this is not necessary.

2.1 sim.congeneric

Classical test theory considers tests to be tau equivalent if they have the same covariance with a vector of latent true scores, but perhaps different error variances. Tests are considered congeneric if they each have the same true score component (perhaps to a different degree) and independent error components. The sim.congeneric function may be used to generate either structure.

The first example considers four tests with equal loadings on a latent factor (that is, a τ equivalent model). If the number of subjects is not specified, a population correlation matrix will be generated. If N is specified, then the sample correlation matrix is returned. If the "short" option is FALSE, then the population matrix, sample matrix, and sample data are all returned as elements of a list.

> library(psych)
> library(psychTools)
> set.seed(42)
> tau <- sim.congeneric(loads=c(.8,.8,.8,.8)) #population values
> tau.samp <- sim.congeneric(loads=c(.8,.8,.8,.8),N=100) # sample correlation matrix for 100 cases
> round(tau.samp,2)
     V1   V2   V3   V4
V1 1.00 0.68 0.72 0.66
V2 0.68 1.00 0.65 0.67
V3 0.72 0.65 1.00 0.76
V4 0.66 0.67 0.76 1.00
> tau.samp <- sim.congeneric(loads=c(.8,.8,.8,.8),N=100, short=FALSE)
> tau.samp
Call: NULL

 $model (Population correlation matrix)
     V1   V2   V3   V4
V1 1.00 0.64 0.64 0.64
V2 0.64 1.00 0.64 0.64
V3 0.64 0.64 1.00 0.64
V4 0.64 0.64 0.64 1.00

 $r  (Sample correlation matrix  for sample size =  100 )
     V1   V2   V3   V4
V1 1.00 0.70 0.62 0.58
V2 0.70 1.00 0.65 0.64
V3 0.62 0.65 1.00 0.59
V4 0.58 0.64 0.59 1.00
> dim(tau.samp$observed)
[1] 100   4

In this last case, the generated data are retrieved from tau.samp$observed. Congeneric data are created by specifying unequal loading values. The default values are loadings of c(.8,.7,.6,.5). As seen in Figure 1, tau equivalence is the special case where all paths are equal.
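The population matrices returned by sim.congeneric follow directly from the one factor model: the correlation between two tests is the product of their loadings. The following is a minimal base R sketch of that arithmetic, not the internals of sim.congeneric; the object names (loads, implied, cong.loads, implied.cong) are illustrative only.

```r
# Loadings of four tau equivalent tests on one latent factor
# (the same values used in the sim.congeneric example above)
loads <- c(.8, .8, .8, .8)

# Under a one factor model the implied correlation of two tests is the
# product of their loadings; the diagonal is fixed at 1
implied <- outer(loads, loads)
diag(implied) <- 1
dimnames(implied) <- list(paste0("V", 1:4), paste0("V", 1:4))
round(implied, 2)   # every off diagonal element is 0.64

# Congeneric case: unequal loadings (the sim.congeneric default values)
cong.loads <- c(.8, .7, .6, .5)
implied.cong <- outer(cong.loads, cong.loads)
diag(implied.cong) <- 1
round(implied.cong, 2)
```

The first matrix matches the $model element printed above; sample matrices differ from it only by sampling error.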

> cong <- sim.congeneric(N=100)
> round(cong,2)
     V1   V2   V3   V4
V1 1.00 0.57 0.53 0.46
V2 0.57 1.00 0.35 0.41
V3 0.53 0.35 1.00 0.43
V4 0.46 0.41 0.43 1.00
> plot.new()
> m1 <- structure.diagram(c("a","b","c","d"))

Figure 1: Tau equivalent tests are special cases of congeneric tests. Tau equivalence assumes a=b=c=d. (Path diagram titled "Structural model": latent variable X1 with paths a, b, c and d to x1, x2, x3 and x4.)

2.2 sim.hierarchical

The previous function, sim.congeneric, is used when one factor accounts for the pattern of correlations. A slightly more complicated model is when one broad factor and several narrower factors are observed. An example of this structure might be the structure of mental abilities, where there is a broad factor of general ability and several narrower factors (e.g., spatial ability, verbal ability, working memory capacity). Another example is in the measure of psychopathology, where a broad general factor of neuroticism is seen along with more specific anxiety, depression, and aggression factors. This kind of structure may be simulated with sim.hierarchical, specifying the loadings of each sub factor on a general factor (the g-loadings) as well as the loadings of individual items on the lower order factors (the f-loadings). An early paper describing a bifactor structure was by Holzinger and Swineford (1937). A helpful description of what makes a good general factor is that of Jensen and Weng (1994).

For those who prefer real data to simulated data, six data sets are included in the bifactor data set. One is the original 14 variable problem of Holzinger and Swineford (1937) (holzinger), a second is a nine variable problem adapted by Bechtoldt (1961) from Thurstone and Thurstone (1941) (the data set is used as an example in the SAS manual and discussed in great detail by McDonald (1999)), a third is from a recent paper by Reise et al. (2007) with 16 measures of patient reports of interactions with their health care provider.

> set.seed(42)
> gload=matrix(c(.9,.8,.7),nrow=3)
> fload <- matrix(c(.8,.7,.6,rep(0,9),.7,.6,.5,
+                   rep(0,9),.7,.6,.4), ncol=3)
> fload #echo it to see the structure
      [,1] [,2] [,3]
 [1,]  0.8  0.0  0.0
 [2,]  0.7  0.0  0.0
 [3,]  0.6  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.7
 [8,]  0.0  0.0  0.6
 [9,]  0.0  0.0  0.4
> bifact <- sim.hierarchical(gload=gload,fload=fload)
> round(bifact,2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 1.00 0.56 0.48 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.56 1.00 0.42 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.48 0.42 1.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 1.00 0.42 0.35 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.42 1.00 0.30 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.35 0.30 1.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 1.00 0.42 0.28
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.42 1.00 0.24
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.28 0.24 1.00

These data can be represented as either a bifactor (Figure 2, left panel) or hierarchical (Figure 2, right panel) factor solution. The analysis was done with the omega function.
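The population matrix printed above can be reproduced by hand from the two loading matrices, which may help in seeing how the general factor induces correlations between items that load on different group factors. This is a base R sketch of the algebra under the usual higher order factor model, not the code of sim.hierarchical; Phi.g, implied and g.loadings are illustrative names.

```r
gload <- matrix(c(.9, .8, .7), nrow = 3)            # group factors on g
fload <- matrix(c(.8, .7, .6, rep(0, 9), .7, .6, .5,
                  rep(0, 9), .7, .6, .4), ncol = 3)  # items on group factors

# Correlations among the three group factors implied by the general factor
Phi.g <- gload %*% t(gload)
diag(Phi.g) <- 1

# Item correlations implied by the group loadings and factor correlations
implied <- fload %*% Phi.g %*% t(fload)
diag(implied) <- 1
round(implied, 2)    # compare with round(bifact,2) above

# Loading of each item on g: item loading times the g loading of its factor
g.loadings <- fload %*% gload
round(t(g.loadings), 2)
```

For example, V1 and V4 load on different group factors, so their correlation is .8 * (.9 * .8) * .7, which rounds to the 0.40 shown above.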

2.3 sim.item and sim.circ

Many personality questionnaires are thought to represent multiple, independent factors. A particularly interesting case is when there are two factors and the items either have simple structure or circumplex structure. Examples of such items with a circumplex structure are measures of emotion (Rafaeli and Revelle, 2006), where many different emotion terms can be arranged in a two dimensional space, but where there is no obvious clustering of items. Typical personality scales are constructed to have simple structure, where items load on one and only one factor.

An additional challenge to measurement with emotion or personality items is that the items can be highly skewed and are assessed with a small number of discrete categories (do not agree, somewhat agree, strongly agree).

The more general sim.item function and the more specific sim.circ function simulate items with a two dimensional structure, with or without skew, and varying the number of categories for the items. An example of a circumplex structure is shown in Figure 3.

2.4 sim.structure

A more general case is to consider three matrices, fx, φxy, fy, which describe a measurement model of x variables, fx, a measurement model of y variables, fy, and a covariance matrix between and within the two sets of factors, φxy. If fx is a vector and fy and phixy are NULL, then this is just the congeneric model. If fx is a matrix of loadings with n rows and c columns, then this is a measurement model for n variables across c factors. If phixy is not NULL, but fy is NULL, then the factors in fx are correlated. Finally, if all three matrices are not NULL, then the data show the standard linear structural relations (LISREL) structure.

Consider the following examples:

2.4.1 fx is a vector implies a congeneric model

> set.seed(42)
> fx <- c(.9,.8,.7,.6)
> cong1 <- sim.structure(fx)
> cong1
Call: sim.structure(fx = fx)

 $model (Population correlation matrix)

> op <- par(mfrow=c(1,2))
> m.bi <- omega(bifact,title="A bifactor model")
> m.hi <- omega(bifact,sl=FALSE,title="A hierarchical model")
> op <- par(mfrow = c(1,1))

Figure 2: (Left panel, "A bifactor model") A bifactor solution represents each test in terms of a general factor and a residualized group factor. (Right panel, "A hierarchical model") A hierarchical factor solution has g as a second order factor accounting for the correlations between the first order factors.

> circ <- sim.circ(16)
> f2 <- fa(circ,2)
> plot(f2,title="16 simulated variables in a circumplex pattern")

Figure 3: Emotion items or interpersonal items frequently show a circumplex structure. Data generated by sim.circ and factor loadings found by the principal axis algorithm using factor.pa. (Scatter plot titled "16 simulated variables in a circumplex pattern": loadings of the 16 variables on MR1 and MR2, both axes running from -0.6 to 0.6.)

     V1   V2   V3   V4
V1 1.00 0.72 0.63 0.54
V2 0.72 1.00 0.56 0.48
V3 0.63 0.56 1.00 0.42
V4 0.54 0.48 0.42 1.00

 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.36

2.4.2 fx is a matrix implies an independent factors model:

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> three.fact <- sim.structure(fx)
> three.fact
Call: sim.structure(fx = fx)

 $model (Population correlation matrix)
     V1   V2   V3   V4   V5   V6   V7  V8   V9
V1 1.00 0.72 0.63 0.00 0.00 0.00 0.00 0.0 0.00
V2 0.72 1.00 0.56 0.00 0.00 0.00 0.00 0.0 0.00
V3 0.63 0.56 1.00 0.00 0.00 0.00 0.00 0.0 0.00
V4 0.00 0.00 0.00 1.00 0.42 0.35 0.00 0.0 0.00
V5 0.00 0.00 0.00 0.42 1.00 0.30 0.00 0.0 0.00
V6 0.00 0.00 0.00 0.35 0.30 1.00 0.00 0.0 0.00
V7 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.3 0.24
V8 0.00 0.00 0.00 0.00 0.00 0.00 0.30 1.0 0.20
V9 0.00 0.00 0.00 0.00 0.00 0.00 0.24 0.2 1.00

 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model

> Phi = matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol=3)
> cor.f3 <- sim.structure(fx,Phi)
> fx
      [,1] [,2] [,3]
 [1,]  0.9  0.0  0.0
 [2,]  0.8  0.0  0.0
 [3,]  0.7  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.6
 [8,]  0.0  0.0  0.5
 [9,]  0.0  0.0  0.4
> Phi
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.3
[2,]  0.5  1.0  0.2
[3,]  0.3  0.2  1.0
> cor.f3

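Figure 4: Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram. (Path diagram titled "Structural model": X1 to x1, x2, x3 with loadings 0.9, 0.8, 0.7; X2 to x4, x5, x6 with loadings 0.7, 0.6, 0.5; X3 to x7, x8, x9 with loadings 0.6, 0.5, 0.4.)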

Call: sim.structure(fx = fx, Phi = Phi)

 $model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000

 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16
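The correlated factors matrix above is the matrix product of the loadings, the factor correlations, and the transposed loadings, with the diagonal set to 1. The sketch below shows this in base R as an illustration of the model, not as the implementation of sim.structure; the names model and reliability simply mirror the printed list elements.

```r
fx  <- matrix(c(.9, .8, .7, rep(0, 9), .7, .6, .5,
                rep(0, 9), .6, .5, .4), ncol = 3)
Phi <- matrix(c(1, .5, .3, .5, 1, .2, .3, .2, 1), ncol = 3)

# Common part of the covariance structure: F Phi F'
common <- fx %*% Phi %*% t(fx)

# The diagonal of the common part is each item's communality,
# reported above as the population reliability
reliability <- diag(common)

# The population correlation matrix has unit variances on the diagonal
model <- common
diag(model) <- 1

round(model[1:4, 1:4], 3)
round(reliability, 2)
```

With Phi replaced by an identity matrix the same product gives the independent factors matrix of section 2.4.2, where all correlations between items on different factors are zero.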

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs #show the matrix
      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"
> Phis #show this one as well
   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.

> plot.new()
> corf3.mod <- structure.diagram(fxs,Phis)

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input. (Path diagram titled "Structural model": F1, F2 and F3 with paths a1 to a3, b4 to b6 and c7 to c9 to x1 through x9, and factor correlations rab, rac and rbc.)

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3.p <- Promax(fa(cor.f3$model,3))
> plot.new()
> mod.f3p <- structure.diagram(f3.p,cut=.2)

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown. (Path diagram titled "Structural model": MR1, MR2 and MR3 with loadings 0.9, 0.8, 0.7; 0.7, 0.6, 0.5; and 0.6, 0.5, 0.4 on V1 through V9.)

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y sets.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <-rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)
     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well a particular model recovers the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y. (Path diagram titled "Structural model": X1, X2 and X3 with paths a1 to a3, b4 to b6 and c7 to c9 to x1 through x9; Y with paths Ya1, Ya2 and Ya3 to y1, y2 and y3; paths rad, rbd and rcd link the X factors to Y.)

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> hi.mod <-structure.diagram(fxh,Phi, fy)

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable. (Path diagram titled "Structural model": X1, X2 and X3 with paths a1 to a3, b4 to b6 and c7 to c9 to x1 through x9; g linked to the group factors by paths a, b and c; g linked to Y by path r; Y with paths a1, a2 and a3 to y1, y2 and y3.)

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and, if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96
> fa(cong1$model)
Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   MR1
Correlation of (regression) scores with factors   0.94
Multiple R square of scores with factors          0.88
Minimum correlation of possible factor scores     0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
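The contrast between the two outputs can be checked with a few lines of base R. This is a sketch for illustration only: it uses eigen rather than principal or fa, and the names R, pc1 and offdiag are made up for the example. The first principal component is the leading eigenvector scaled by the square root of its eigenvalue, while the one factor model with the generating loadings leaves no off diagonal residual.

```r
fx <- c(.9, .8, .7, .6)          # the generating congeneric loadings
R  <- outer(fx, fx)
diag(R) <- 1                     # population correlation matrix

# First principal component loadings: eigenvector * sqrt(eigenvalue)
e   <- eigen(R)
pc1 <- e$vectors[, 1] * sqrt(e$values[1])
pc1 <- pc1 * sign(sum(pc1))      # the sign of an eigenvector is arbitrary
round(pc1, 2)                    # compare with the PC1 column above

# Off diagonal residuals for the component and for the factor model
offdiag <- function(m) m[upper.tri(m)]
round(offdiag(R - outer(pc1, pc1)), 2)  # components: residuals remain
round(offdiag(R - outer(fx, fx)), 2)    # factor model: residuals are 0
```

The component loadings are larger than the generating loadings because components also account for the unit diagonal, that is, for unique as well as common variance.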

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ2 value. The factor.congruence function reports the cosine of the angle between the factors.
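The cosine of the angle between two columns of loadings is their cross product divided by the square root of the product of their sums of squares. The helper below is a hypothetical base R illustration of that formula (the name congruence is not a psych function, and factor.congruence accepts more kinds of input than this sketch does).

```r
# Congruence (cosine) between the columns of two loading matrices.
# Illustrative helper only; not the psych implementation.
congruence <- function(x, y = x) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  cross <- t(x) %*% y                                   # cross products
  cross / sqrt(outer(colSums(x^2), colSums(y^2)))       # scale to cosines
}

# Two hypothetical loading matrices for the same six variables
A <- cbind(F1 = c(.8, .7, .6, 0, 0, 0), F2 = c(0, 0, 0, .7, .6, .5))
B <- cbind(C1 = c(.75, .76, .78, .29, .20, .07),
           C2 = c(.27, .21, .11, .69, .71, .76))
round(congruence(A, B), 2)
```

Unlike a correlation, the congruence coefficient does not remove the column means, so two factors with proportional loadings have a congruence of 1 regardless of scale.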

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88
> pa3
Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38
> ml3
Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38
> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00

By default, all three of these procedures use the varimax rotation criterion. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).
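
To make the idea of a higher order factor concrete, the following base R sketch shows how the loadings of three lower order factors on a single general factor imply the correlations among those factors. The g loadings of 0.9, 0.8 and 0.7 are an assumption, chosen because they reproduce the factor correlations (0.72, 0.63, 0.56) printed for the oblique solutions above; the object names are illustrative only.

```r
# Loadings of the three group factors on a general factor g
# (assumed values, chosen to match the factor correlations reported above).
g.load <- c(F1 = 0.9, F2 = 0.8, F3 = 0.7)

# Under a higher order model, the correlation between two lower order
# factors is the product of their loadings on g.
Phi <- outer(g.load, g.load)
diag(Phi) <- 1
round(Phi, 2)
```

If the factor correlations could not be written as such products, a single higher order factor would not account for them exactly.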

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measurement of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.
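
As a rough illustration of how a general factor and residualized group factors arise from a hierarchical structure, the sketch below applies the Schmid-Leiman transformation by hand in base R. The first order loadings and the g loadings are assumptions inferred from the output shown in this section, not values taken from the call that created bifact, and the variable names are illustrative.

```r
# First order loadings of V1..V9 on three group factors (assumed values).
lambda <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
group  <- rep(1:3, each = 3)          # which factor each variable loads on
g.load <- c(0.9, 0.8, 0.7)            # loadings of the factors on g (assumed)

# Schmid-Leiman: the general loading is the product of the two levels;
# the residualized group loading keeps what g does not explain.
general  <- lambda * g.load[group]
residual <- lambda * sqrt(1 - g.load[group]^2)

sl <- round(cbind(g = general, group = residual,
                  h2 = general^2 + residual^2), 2)
rownames(sl) <- paste0("V", 1:9)
sl
```

The communality of each variable is unchanged by the transformation; the variance is simply repartitioned between the general and the group factor.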

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> om.bi
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> om.bi <- omega(bifact)

[Figure: Omega path diagram with a general factor g and group factors F1, F2 and F3 loading on V1 to V9]

Figure 9: An exploratory bifactor solution to the nine variable problem.

> om.hi <- omega(bifact,sl=FALSE)

[Figure: Omega path diagram with group factors F1, F2 and F3 loading on V1 to V9 and a higher order factor g loading on the group factors]

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than 0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max 4.67   max/min = 1.82
mean percent general = 0.65 with sd = 0.14 and cv of 0.21
Explained Common Variance of the general factor = 0.67

The degrees of freedom are 12 and the fit is 0

The root mean square of the residuals is 0
The df corrected root mean square of the residuals is 0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 0.23

The root mean square of the residuals is 0.07
The df corrected root mean square of the residuals is 0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
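
The reliability estimates in this output can be checked by hand. The base R sketch below is only an illustration under stated assumptions: it rebuilds a population correlation matrix from loadings and g loadings inferred from the output above (they are not taken from the code that generated bifact), and then applies the usual definitions of alpha, omega hierarchical and omega total to a unit weighted total score.

```r
lambda <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)  # assumed loadings
group  <- rep(1:3, each = 3)
g.load <- c(0.9, 0.8, 0.7)                                # assumed g loadings

# Model implied correlation matrix R = Fx Phi Fx' with a unit diagonal.
Fx <- matrix(0, 9, 3)
Fx[cbind(1:9, group)] <- lambda
Phi <- outer(g.load, g.load)
diag(Phi) <- 1
R <- Fx %*% Phi %*% t(Fx)
diag(R) <- 1

Vt      <- sum(R)                    # variance of the unit weighted total
general <- lambda * g.load[group]    # Schmid-Leiman general factor loadings
u2      <- 1 - lambda^2              # uniquenesses

omega.h <- sum(general)^2 / Vt       # variance due to g alone
omega.t <- (Vt - sum(u2)) / Vt       # variance due to all common factors
alpha   <- (9 / 8) * (1 - sum(diag(R)) / Vt)
round(c(alpha = alpha, omega.h = omega.h, omega.t = omega.t), 2)
```

Under these assumptions the three values round to 0.78, 0.70 and 0.82, in agreement with the Alpha, Omega Hierarchical and Omega Total lines printed by omega.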

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

[Figure: Hierarchical cluster analysis of bifact data; ICLUST tree joining V1 to V9 into clusters C1 to C8, annotated with cluster alpha values]

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

 $model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
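
The structure of this population matrix can be seen by rebuilding it in base R. In the sketch below, the loadings and factor correlations are assumptions inferred from the printed correlations and reliabilities (the actual fx, Phi and fy arguments are not shown here), and the factor labels are illustrative.

```r
vars <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")

# Loadings inferred from the printed matrix (assumed, not taken from the call).
Fmat <- matrix(0, 8, 3,
               dimnames = list(vars, c("ability", "motivation", "outcome")))
Fmat[c("V", "Q", "A"), "ability"]         <- c(0.9, 0.8, 0.6)
Fmat[c("A", "nach", "Anx"), "motivation"] <- c(0.6, 0.8, -0.7)
Fmat[c("gpa", "Pre", "MA"), "outcome"]    <- c(0.6, 0.5, 0.4)

# Ability and motivation are uncorrelated; each correlates 0.7 with outcome.
Phi <- diag(3)
Phi[1, 3] <- Phi[3, 1] <- 0.7
Phi[2, 3] <- Phi[3, 2] <- 0.7

R <- Fmat %*% Phi %*% t(Fmat)      # common part of the correlations
reliability <- diag(R)             # communalities, reported as reliabilities
diag(R) <- 1
round(R, 2)
round(reliability, 2)
```

The correlation between an X variable and a Y variable is the product of the two loadings and the correlation between their factors, for example 0.9 * 0.7 * 0.6 = 0.378 for V and gpa.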

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method = minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are 56 and the empirical chi square function was 6930.29
The degrees of freedom for the model are 7 and the empirical chi square function was 21.83
 with prob < 0.0027

The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.04
 with the empirical chi square 21.83 with prob < 0.0027
The total number of observations was 1000 with fitted Chi Square = 2175.06 with prob < 0

Empirical BIC = -26.53
ESABIC = -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

[Figure: Exploratory Structural Model; path diagram relating the X factors X1 and X2 (V, Q, A, nach, Anx) to the Y factor Y1 (gpa, Pre, MA)]

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function (which requires Rgraphviz), or the structure.diagram or structure.sem functions (which do not), may be used.
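
The difference between the two models can be sketched in base R without fitting anything. The loadings below are hypothetical values chosen for illustration (they are not the loadings of the cong data set), and implied is a helper defined only for this example.

```r
# Hypothetical loadings for four measures of one factor.
congeneric <- c(0.8, 0.7, 0.6, 0.5)   # unequal loadings
tau.equiv  <- rep(0.65, 4)            # one common loading

# Correlations implied by a one factor model: r_ij = lambda_i * lambda_j.
implied <- function(lambda) {
  R <- outer(lambda, lambda)
  diag(R) <- 1
  R
}
round(implied(congeneric), 2)
round(implied(tau.equiv), 2)          # all off diagonal elements are equal

# Degrees of freedom: unique covariance elements minus free parameters.
p <- 4
moments <- p * (p + 1) / 2
df.cong <- moments - (p + p)          # 4 loadings + 4 error variances
df.tau  <- moments - (1 + p)          # 1 loading  + 4 error variances
c(congeneric = df.cong, tau = df.tau)
```

The tau equivalent model is the congeneric model with three equality constraints on the loadings, which is why the two models differ by three degrees of freedom.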

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau   #show it
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare = 6.593496   Df = 5   Pr(>Chisq) = 0.2526696
 AIC = 16.5935
 BIC = -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations = 11

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong   #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare = 2.941678   Df = 2   Pr(>Chisq) = 0.2297327
 AIC = 18.94168
 BIC = -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations = 12
> anova(sem.cong,sem.tau)   #test the difference between the two models
LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

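The anova comparison of the congeneric versus tau equivalent model shows that the change in χ² is not significant given the change in degrees of freedom (LR χ² = 3.65 with 3 df, p = 0.30). With only 100 subjects, the tau equivalent model cannot be rejected in favor of the congeneric model, even though it is known to be false.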
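
The likelihood ratio test reported by anova can be reproduced by hand from the two model summaries. The sketch below uses only base R; the χ² values and degrees of freedom are copied from the summaries printed above, and the variable names are illustrative.

```r
# Values copied from the two sem summaries above.
chisq.tau  <- 6.593496; df.tau  <- 5   # tau equivalent model
chisq.cong <- 2.941678; df.cong <- 2   # congeneric model

# For nested models, the difference in chi square is itself distributed
# as chi square with df equal to the difference in degrees of freedom.
lr.chisq <- chisq.tau - chisq.cong
lr.df    <- df.tau - df.cong
p.value  <- pchisq(lr.chisq, df = lr.df, lower.tail = FALSE)
round(c(LR = lr.chisq, df = lr.df, p = p.value), 4)
```

The resulting probability of about 0.30 matches the Pr(>Chisq) column of the anova table.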

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one   #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare = 21.16848   Df = 27   Pr(>Chisq) = 0.778334
 AIC = 57.16848
 BIC = -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations = 11
> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare = 21.16848   Df = 27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations = 11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <-fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare = 53.86635   Df = 27   Pr(>Chisq) = 0.001579738
 AIC = 89.86635
 BIC = -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations = 24
> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
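
The pattern of these residuals follows directly from ignoring the factor correlations. The base R sketch below is an illustration under the assumption that bifact has the loadings and factor correlations recovered by the exploratory analyses in section 3; the object names are illustrative.

```r
# Loadings and factor correlations assumed from the earlier fa output.
lambda <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
Fmat <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), paste0("F", 1:3)))
Fmat[cbind(1:9, rep(1:3, each = 3))] <- lambda
Phi <- matrix(c(1.00, 0.72, 0.63,
                0.72, 1.00, 0.56,
                0.63, 0.56, 1.00), 3, 3)

oblique    <- Fmat %*% Phi %*% t(Fmat)   # correlations when factors correlate
orthogonal <- Fmat %*% t(Fmat)           # what an orthogonal model reproduces
resid <- oblique - orthogonal            # what the orthogonal model misses
round(resid, 2)
```

Within a group the two models agree, so the residuals are zero; between groups the orthogonal model predicts a correlation of zero, so the whole correlation (for example 0.8 * 0.7 * 0.72 = 0.40 for V1 and V4) is left as a residual.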

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <-fa(bifact,3)              #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)    #create the sem model
> mod.f3                         #show it
      Path       Parameter Value
 [1,] "MR1->V1"  "F1V1"    NA
 [2,] "MR1->V2"  "F1V2"    NA
 [3,] "MR1->V3"  "F1V3"    NA
 [4,] "MR3->V4"  "F2V4"    NA
 [5,] "MR3->V5"  "F2V5"    NA
 [6,] "MR3->V6"  "F2V6"    NA
 [7,] "MR2->V7"  "F3V7"    NA
 [8,] "MR2->V8"  "F3V8"    NA
 [9,] "MR2->V9"  "F3V9"    NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"  NA
[20,] "MR2<->MR1" "rF3F1"  NA
[21,] "MR2<->MR3" "rF3F2"  NA
[22,] "MR1<->MR1" NA       "1"
[23,] "MR3<->MR3" NA       "1"
[24,] "MR2<->MR2" NA       "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense and, in fact (as hoped), recovers the original structure.

[Figure: Structural model; path diagram with factors MR1, MR3 and MR2 loading on V1 to V9 and correlations among the three factors]

Figure 13: A three factor oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is “lazy loaded”). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)   #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare = 24.2163   Df = 18   Pr(>Chisq) = 0.1480685
 AIC = 78.2163
 BIC = -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

[Figure: Omega path diagram with a general factor g and group factors F1, F2 and F3 loading on the nine Thurstone variables]

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations = 72

> stdCoef(sem.bi,digits=2)
                      Std. Estimate
1          Sentences     0.7678671 Sentences <--- g
2         Vocabulary     0.7909246 Vocabulary <--- g
3     SentCompletion     0.7536211 SentCompletion <--- g
4       FirstLetters     0.6083814 FirstLetters <--- g
5    FourLetterWords     0.5973349 FourLetterWords <--- g
6           Suffixes     0.5717900 Suffixes <--- g
7       LetterSeries     0.5668950 LetterSeries <--- g
8          Pedigrees     0.6623317 Pedigrees <--- g
9        LetterGroup     0.5299523 LetterGroup <--- g
10       F1Sentences     0.4878697 Sentences <--- F1
11      F1Vocabulary     0.4523233 Vocabulary <--- F1
12  F1SentCompletion     0.4044507 SentCompletion <--- F1
13    F2FirstLetters     0.6140531 FirstLetters <--- F2
14 F2FourLetterWords     0.5058063 FourLetterWords <--- F2
15        F2Suffixes     0.3943206 Suffixes <--- F2
16    F3LetterSeries     0.7272957 LetterSeries <--- F3
17       F3Pedigrees     0.2468418 Pedigrees <--- F3
18     F3LetterGroup     0.4091494 LetterGroup <--- F3
19                e1     0.1723633 Sentences <--> Sentences
20                e2     0.1698418 Vocabulary <--> Vocabulary
21                e3     0.2684748 SentCompletion <--> SentCompletion
22                e4     0.2528108 FirstLetters <--> FirstLetters
23                e5     0.3873510 FourLetterWords <--> FourLetterWords
24                e6     0.5175675 Suffixes <--> Suffixes
25                e7     0.1496710 LetterSeries <--> LetterSeries
26                e8     0.5003859 Pedigrees <--> Pedigrees
27                e9     0.5517473 LetterGroup <--> LetterGroup
28                       1.0000000 F1 <--> F1
29                       1.0000000 F2 <--> F2
30                       1.0000000 F3 <--> F3
31                       1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare = 38.1963   Df = 24   Pr(>Chisq) = 0.03310059
 AIC = 80.1963
 BIC = -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

[Figure: Omega path diagram with group factors F1, F2 and F3 loading on the nine Thurstone variables and a higher order factor g loading on the group factors]

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations = 54

> stdCoef(sem.hi,digits=2)
                      Std. Estimate
1                gF1     0.8220754 F1 <--- g
2                gF2     0.7817998 F2 <--- g
3                gF3     0.8150140 F3 <--- g
4        F1Sentences     0.9047111 Sentences <--- F1
5       F1Vocabulary     0.9138214 Vocabulary <--- F1
6   F1SentCompletion     0.8560764 SentCompletion <--- F1
7     F2FirstLetters     0.8357617 FirstLetters <--- F2
8  F2FourLetterWords     0.7971819 FourLetterWords <--- F2
9         F2Suffixes     0.7025560 Suffixes <--- F2
10    F3LetterSeries     0.7808129 LetterSeries <--- F3
11       F3Pedigrees     0.7201599 Pedigrees <--- F3
12     F3LetterGroup     0.7034902 LetterGroup <--- F3
13                e1     0.1814979 Sentences <--> Sentences
14                e2     0.1649304 Vocabulary <--> Vocabulary
15                e3     0.2671331 SentCompletion <--> SentCompletion
16                e4     0.3015024 FirstLetters <--> FirstLetters
17                e5     0.3645010 FourLetterWords <--> FourLetterWords
18                e6     0.5064151 Suffixes <--> Suffixes
19                e7     0.3903313 LetterSeries <--> LetterSeries
20                e8     0.4813697 Pedigrees <--> Pedigrees
21                e9     0.5051016 LetterGroup <--> LetterGroup
22                       0.3241920 F1 <--> F1
23                       0.3887891 F2 <--> F2
24                       0.3357521 F3 <--> F3
25                       1.0000000 g <--> g

> anova(sem.hi,sem.bi)
LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
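
The relation between the unstandardized estimates from summary and the standardized values from stdCoef for the hierarchical model can be verified with a few lines of base R. The sketch assumes, as the printed values suggest, that the residual variance of each group factor was fixed at 1 in the model generated by omega; the estimates are copied from the output above and the variable names are illustrative.

```r
# Unstandardized loadings of the group factors on g (from the summary above).
g.raw <- c(F1 = 1.4438115, F2 = 1.2538296, F3 = 1.4065517)

# With unit residual variances (assumed), each group factor has
# total variance 1 + g^2.
factor.var <- 1 + g.raw^2
g.std      <- g.raw / sqrt(factor.var)   # standardized loadings on g
resid.std  <- 1 / factor.var             # standardized residual variances

# The loading of Sentences on F1 is rescaled by the standard deviation of F1.
sentences.std <- unname(0.5151232 * sqrt(factor.var["F1"]))

round(rbind(g.std, resid.std), 4)
round(sentences.std, 4)
```

These values agree with the stdCoef rows for gF1, gF2 and gF3, the F1 <--> F1, F2 <--> F2 and F3 <--> F3 variances, and F1Sentences.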

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)
LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem combines both steps: it performs an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

gt omsem lt- omegaSem(Thurstonenobs=213)

44

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                     g  F1*  F2*  F3*   h2   u2   p2
Sentences         0.71 0.56           0.82 0.18 0.61
Vocabulary        0.73 0.55           0.84 0.16 0.63
Sent.Completion   0.68 0.52           0.74 0.26 0.63
First.Letters     0.65      0.56      0.73 0.27 0.57
Four.Letter.Words 0.62      0.49      0.63 0.37 0.61
Suffixes          0.56      0.41      0.50 0.50 0.63
Letter.Series     0.59           0.62 0.73 0.27 0.48
Pedigrees         0.58 0.24      0.34 0.51 0.49 0.66
Letter.Group      0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g  F1*  F2*  F3*
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g  F1*  F2*  F3*
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                     g  F1*  F2*  F3*   h2   u2   p2
Sentences         0.77 0.49           0.82 0.18 0.72
Vocabulary        0.79 0.45           0.83 0.17 0.75
Sent.Completion   0.75 0.40           0.73 0.27 0.77
First.Letters     0.61      0.61      0.74 0.26 0.50
Four.Letter.Words 0.60      0.50      0.61 0.39 0.59
Suffixes          0.57      0.39      0.48 0.52 0.68
Letter.Series     0.57           0.73 0.85 0.15 0.38
Pedigrees         0.66           0.25 0.50 0.50 0.87
Letter.Group      0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g  F1*  F2*  F3*
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g   F1*  F2*  F3*
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

[Path diagram titled "Omega from SEM": general factor g and group factors F1, F2, F3 for the nine Thurstone variables]

Figure 16: Confirmatory Omega structure using omegaSem

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


2 Functions for generating correlational matrices with a particular structure

The sim family of functions create data sets with particular structure. Most of these functions have default values that will produce useful examples. Although graphical summaries of these structures will be shown here, some of the options of the graphical displays will be discussed in a later section.

The sim functions include:

sim.structure A function to combine a measurement and structural model into one data matrix. Useful for understanding structural equation models. Combined with structure.diagram to see the proposed structure.

sim.congeneric A function to create congeneric items/tests for demonstrating classical test theory. This is just a special case of sim.structure.

sim.hierarchical A function to create data with a hierarchical (bifactor) structure.

sim.general A function to simulate a general factor and multiple group factors. This is done in a somewhat more obvious, although less general, method than sim.hierarchical.

simCor A generalized function to simulate data from a specified correlation matrix.

sim.item A function to create items that either have a simple structure or a circumplex structure.

sim.circ Create data with a circumplex structure.

sim.dichot Create dichotomous item data with a simple or circumplex structure.

sim.minor Create a factor structure for nvar variables defined by nfact major factors and nvar/2 “minor” factors for n observations.

sim.parallel Create a number of simulated data sets using sim.minor to show how parallel analysis works.

sim.rasch Create IRT data following a Rasch model.

sim.irt Create a two parameter IRT logistic (2PL) model.

sim.anova Simulate a 3 way balanced ANOVA or linear model, with or without repeated measures. Useful for teaching courses in research methods.

To make these examples replicable for readers, all simulations are prefaced by setting the random seed to a fixed (and for some, memorable) number (Adams, 1980). For normal use of the simulations, this is not necessary.
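As a small illustration of why the seed is fixed, the following base R sketch draws the same five random normal values twice. Only set.seed and rnorm from base R are used; the object names are illustrative.

```r
# Fixing the seed makes a simulation reproducible.
set.seed(42)
first <- rnorm(5)
set.seed(42)               # reset to the same seed
second <- rnorm(5)
identical(first, second)   # TRUE: the two draws are the same
```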


2.1 sim.congeneric

Classical test theory considers tests to be tau equivalent if they have the same covariance with a vector of latent true scores, but perhaps different error variances. Tests are considered congeneric if they each have the same true score component (perhaps to a different degree) and independent error components. The sim.congeneric function may be used to generate either structure.

The first example considers four tests with equal loadings on a latent factor (that is, a τ equivalent model). If the number of subjects is not specified, a population correlation matrix will be generated. If N is specified, then the sample correlation matrix is returned. If the “short” option is FALSE, then the population matrix, sample matrix, and sample data are all returned as elements of a list.

> library(psych)
> library(psychTools)
> set.seed(42)
> tau <- sim.congeneric(loads=c(.8,.8,.8,.8)) #population values
> tausamp <- sim.congeneric(loads=c(.8,.8,.8,.8),N=100) # sample correlation matrix for 100 cases
> round(tausamp,2)
     V1   V2   V3   V4
V1 1.00 0.68 0.72 0.66
V2 0.68 1.00 0.65 0.67
V3 0.72 0.65 1.00 0.76
V4 0.66 0.67 0.76 1.00
> tausamp <- sim.congeneric(loads=c(.8,.8,.8,.8),N=100, short=FALSE)
> tausamp
Call: NULL
 $model (Population correlation matrix)
     V1   V2   V3   V4
V1 1.00 0.64 0.64 0.64
V2 0.64 1.00 0.64 0.64
V3 0.64 0.64 1.00 0.64
V4 0.64 0.64 0.64 1.00
 $r  (Sample correlation matrix  for sample size =  100 )
     V1   V2   V3   V4
V1 1.00 0.70 0.62 0.58
V2 0.70 1.00 0.65 0.64
V3 0.62 0.65 1.00 0.59
V4 0.58 0.64 0.59 1.00
> dim(tausamp$observed)
[1] 100   4

In this last case, the generated data are retrieved from tausamp$observed. Congeneric data are created by specifying unequal loading values. The default values are loadings of c(.8,.7,.6,.5). As seen in Figure 1, tau equivalence is the special case where all paths are equal.
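The population matrices for these one factor models follow directly from the loadings: each off-diagonal correlation is the product of the two loadings. The sketch below reproduces that algebra in base R. It is an illustration of the model, not the sim.congeneric source code, and implied_cor is a hypothetical helper name.

```r
# Population correlation matrix implied by a one factor (congeneric) model:
# off-diagonal r_ij = load_i * load_j, with the diagonal fixed at 1.
# implied_cor is a hypothetical helper written for this sketch.
implied_cor <- function(loads) {
  R <- loads %o% loads                      # outer product of the loadings
  diag(R) <- 1
  vnames <- paste0("V", seq_along(loads))
  dimnames(R) <- list(vnames, vnames)
  R
}

tau_pop  <- implied_cor(c(.8, .8, .8, .8))  # tau equivalent: equal loadings
cong_pop <- implied_cor(c(.8, .7, .6, .5))  # congeneric: unequal loadings
round(tau_pop, 2)
round(cong_pop, 2)
```

With equal loadings of .8 every off-diagonal element is .64, which matches the $model matrix printed above.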

> cong <- sim.congeneric(N=100)
> round(cong,2)
     V1   V2   V3   V4
V1 1.00 0.57 0.53 0.46
V2 0.57 1.00 0.35 0.41
V3 0.53 0.35 1.00 0.43
V4 0.46 0.41 0.43 1.00
> plot.new()
> m1 <- structure.diagram(c("a","b","c","d"))

[Path diagram titled "Structural model": latent variable X1 with paths a, b, c, d to x1, x2, x3, x4]

Figure 1: Tau equivalent tests are special cases of congeneric tests. Tau equivalence assumes a=b=c=d

2.2 sim.hierarchical

The previous function, sim.congeneric, is used when one factor accounts for the pattern of correlations. A slightly more complicated model is when one broad factor and several narrower factors are observed. An example of this structure might be the structure of mental abilities, where there is a broad factor of general ability and several narrower factors (e.g., spatial ability, verbal ability, working memory capacity). Another example is in the measure of psychopathology, where a broad general factor of neuroticism is seen along with more specific anxiety, depression, and aggression factors. This kind of structure may be simulated with sim.hierarchical, specifying the loadings of each sub factor on a general factor (the g-loadings) as well as the loadings of individual items on the lower order factors (the f-loadings). An early paper describing a bifactor structure was by Holzinger and Swineford (1937). A helpful description of what makes a good general factor is that of Jensen and Weng (1994).

For those who prefer real data to simulated data, six data sets are included in the bifactor data set. One is the original 14 variable problem of Holzinger and Swineford (1937) (holzinger), a second is a nine variable problem adapted by Bechtoldt (1961) from Thurstone and Thurstone (1941) (the data set is used as an example in the SAS manual and discussed in great detail by McDonald (1999)), a third is from a recent paper by Reise et al. (2007) with 16 measures of patient reports of interactions with their health care provider.

> set.seed(42)
> gload=matrix(c(.9,.8,.7),nrow=3)
> fload <- matrix(c(.8,.7,.6,rep(0,9),.7,.6,.5,
+ rep(0,9),.7,.6,.4), ncol=3)
> fload #echo it to see the structure
      [,1] [,2] [,3]
 [1,]  0.8  0.0  0.0
 [2,]  0.7  0.0  0.0
 [3,]  0.6  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.7
 [8,]  0.0  0.0  0.6
 [9,]  0.0  0.0  0.4
> bifact <- sim.hierarchical(gload=gload,fload=fload)
> round(bifact,2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 1.00 0.56 0.48 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.56 1.00 0.42 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.48 0.42 1.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 1.00 0.42 0.35 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.42 1.00 0.30 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.35 0.30 1.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 1.00 0.42 0.28
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.42 1.00 0.24
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.28 0.24 1.00

These data can be represented as either a bifactor (Figure 2 panel A) or hierarchical (Figure 2 Panel B) factor solution. The analysis was done with the omega function.
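The correlations in the bifact matrix can be reproduced from the two sets of loadings. In the sketch below, the correlations among the lower order factors are taken to be the products of their g-loadings, and the item correlations are then F Φ F'. This is an assumption about the algebra of the hierarchical model, written in base R, and not the sim.hierarchical source code.

```r
# Population correlations for a hierarchical model (a sketch of the algebra).
gload <- c(.9, .8, .7)                             # lower order factors on g
fload <- matrix(c(.8, .7, .6, rep(0, 9), .7, .6, .5,
                  rep(0, 9), .7, .6, .4), ncol = 3) # items on lower order factors

Phi <- gload %o% gload            # factor correlations implied by g
diag(Phi) <- 1
R <- fload %*% Phi %*% t(fload)   # model implied item correlations
diag(R) <- 1
round(R[1:4, 1:4], 2)
```

For example, the correlation of V1 with V4 is .8 × (.9 × .8) × .7 = .40, the value shown in the bifact output.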

2.3 sim.item and sim.circ

Many personality questionnaires are thought to represent multiple, independent factors. A particularly interesting case is when there are two factors and the items either have simple structure or circumplex structure. Examples of such items with a circumplex structure are measures of emotion (Rafaeli and Revelle, 2006), where many different emotion terms can be arranged in a two dimensional space, but where there is no obvious clustering of items. Typical personality scales are constructed to have simple structure, where items load on one and only one factor.

An additional challenge to measurement with emotion or personality items is that the items can be highly skewed and are assessed with a small number of discrete categories (do not agree, somewhat agree, strongly agree).

The more general sim.item function, and the more specific sim.circ function, simulate items with a two dimensional structure, with or without skew, and varying the number of categories for the items. An example of a circumplex structure is shown in Figure 3.
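To make the contrast between simple and circumplex structure concrete, the sketch below builds an idealized two factor loading matrix in which items are equally spaced around a circle. The radius of 0.6 and the equal spacing are assumptions made for illustration; this is not how sim.circ is parameterized.

```r
# Idealized circumplex loadings: items equally spaced around a circle.
# The radius (0.6) and the spacing are assumptions made for this sketch.
nvar   <- 16
theta  <- 2 * pi * (0:(nvar - 1)) / nvar       # item angles in radians
radius <- 0.6
loads  <- cbind(F1 = radius * cos(theta), F2 = radius * sin(theta))

h2 <- rowSums(loads^2)       # communalities are constant around the circle
round(head(loads, 4), 2)

# Under simple structure each item would load on only one factor;
# here most items have sizable loadings on both factors.
sum(abs(loads[, 1]) > .2 & abs(loads[, 2]) > .2)
```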

2.4 sim.structure

A more general case is to consider three matrices, fx, φxy, fy, which describe, in turn, a measurement model of x variables, fx, a measurement model of y variables, fy, and a covariance matrix between and within the two sets of factors. If fx is a vector and fy and phixy are NULL, then this is just the congeneric model. If fx is a matrix of loadings with n rows and c columns, then this is a measurement model for n variables across c factors. If phixy is not null, but fy is NULL, then the factors in fx are correlated. Finally, if all three matrices are not NULL, then the data show the standard linear structural relations (LISREL) structure.

Consider the following examples:

2.4.1 fx is a vector implies a congeneric model

> set.seed(42)
> fx <- c(.9,.8,.7,.6)
> cong1 <- sim.structure(fx)
> cong1
Call: sim.structure(fx = fx)
 $model (Population correlation matrix)

> op <- par(mfrow=c(1,2))
> mbi <- omega(bifact,title="A bifactor model")
> mhi <- omega(bifact,sl=FALSE,title="A hierarchical model")
> op <- par(mfrow = c(1,1))

[Two path diagrams, titled "A bifactor model" and "A hierarchical model", each with factors g, F1, F2, F3 and variables V1 to V9]

Figure 2: (Left panel) A bifactor solution represents each test in terms of a general factor and a residualized group factor. (Right Panel) A hierarchical factor solution has g as a second order factor accounting for the correlations between the first order factors.

> circ <- sim.circ(16)
> f2 <- fa(circ,2)
> plot(f2,title="16 simulated variables in a circumplex pattern")

[Plot titled "16 simulated variables in a circumplex pattern": loadings of items 1 to 16 on axes MR1 and MR2]

Figure 3: Emotion items or interpersonal items frequently show a circumplex structure. Data generated by sim.circ and factor loadings found by the principal axis algorithm using factor.pa.

     V1   V2   V3   V4
V1 1.00 0.72 0.63 0.54
V2 0.72 1.00 0.56 0.48
V3 0.63 0.56 1.00 0.42
V4 0.54 0.48 0.42 1.00
 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.36

2.4.2 fx is a matrix implies an independent factors model:

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> threefact <- sim.structure(fx)
> threefact
Call: sim.structure(fx = fx)
 $model (Population correlation matrix)
     V1   V2   V3   V4   V5   V6   V7  V8   V9
V1 1.00 0.72 0.63 0.00 0.00 0.00 0.00 0.0 0.00
V2 0.72 1.00 0.56 0.00 0.00 0.00 0.00 0.0 0.00
V3 0.63 0.56 1.00 0.00 0.00 0.00 0.00 0.0 0.00
V4 0.00 0.00 0.00 1.00 0.42 0.35 0.00 0.0 0.00
V5 0.00 0.00 0.00 0.42 1.00 0.30 0.00 0.0 0.00
V6 0.00 0.00 0.00 0.35 0.30 1.00 0.00 0.0 0.00
V7 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.3 0.24
V8 0.00 0.00 0.00 0.00 0.00 0.00 0.30 1.0 0.20
V9 0.00 0.00 0.00 0.00 0.00 0.00 0.24 0.2 1.00
 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model

> Phi = matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol=3)
> corf3 <- sim.structure(fx,Phi)
> fx
      [,1] [,2] [,3]
 [1,]  0.9  0.0  0.0
 [2,]  0.8  0.0  0.0
 [3,]  0.7  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.6
 [8,]  0.0  0.0  0.5
 [9,]  0.0  0.0  0.4
> Phi
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.3
[2,]  0.5  1.0  0.2
[3,]  0.3  0.2  1.0
> corf3

[Path diagram titled "Structural model": factors X1, X2, X3 with loadings 0.9, 0.8, 0.7; 0.7, 0.6, 0.5; and 0.6, 0.5, 0.4 on x1 to x9]

Figure 4: Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram.

Call: sim.structure(fx = fx, Phi = Phi)
 $model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000
 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16
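The population matrix for the correlated factors case can be reproduced with the standard factor model identity R = F Φ F' (with the diagonal set to 1). The following base R sketch uses the fx and Phi values of this example; it illustrates the algebra and is not the sim.structure source code.

```r
# Model implied correlations for correlated factors: R = F Phi F'.
fx  <- matrix(c(.9, .8, .7, rep(0, 9), .7, .6, .5,
                rep(0, 9), .6, .5, .4), ncol = 3)
Phi <- matrix(c(1, .5, .3,
                .5, 1, .2,
                .3, .2, 1), ncol = 3)

model <- fx %*% Phi %*% t(fx)
communality <- diag(model)   # squared loadings, before the diagonal is reset
diag(model) <- 1
round(model[1:3, 4:6], 3)
round(communality, 2)
```

In this example the communalities equal the values printed under $reliability, and the V1 with V4 correlation is .9 × .5 × .7 = .315.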

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs #show the matrix
      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"
> Phis #show this one as well
   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.

> plot.new()
> corf3mod <- structure.diagram(fxs,Phis)

[Path diagram titled "Structural model": factors F1, F2, F3 with symbolic loadings a1 to a3, b4 to b6, c7 to c9 on x1 to x9, and factor correlations rab, rac, rbc]

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input.

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary “cut” value of .2.

> f3p <- Promax(fa(corf3$model,3))
> plot.new()
> modf3p <- structure.diagram(f3p,cut=.2)

[Path diagram titled "Structural model": factors MR1, MR2, MR3 with loadings 0.9, 0.8, 0.7; 0.7, 0.6, 0.5; and 0.6, 0.5, 0.4 on V1 to V9]

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y set.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <-rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)
     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

[Path diagram titled "Structural model": latent X1, X2, X3 with symbolic loadings on x1 to x9, latent Y with loadings Ya1, Ya2, Ya3 on y1 to y3, and paths rad, rbd, rcd]

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> himod <-structure.diagram(fxh,Phi, fy)

[Path diagram titled "Structural model": group factors X1, X2, X3 with symbolic loadings on x1 to x9, general factor g with paths a, b, c, and latent Y with loadings on y1 to y3 and path r]

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96
> fa(cong1$model)
Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   MR1
Correlation of (regression) scores with factors   0.94
Multiple R square of scores with factors          0.88
Minimum correlation of possible factor scores     0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
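The difference between the two approaches can be seen with base R alone. The sketch below takes the congeneric population matrix with loadings .9, .8, .7, .6, extracts the first principal component from the full matrix, and then extracts one factor from the reduced matrix whose diagonal holds the communalities. Using the true communalities, rather than estimating them iteratively as a factor analysis would, is an idealization made for this sketch.

```r
# Principal component versus common factor loadings for a congeneric matrix.
f <- c(.9, .8, .7, .6)              # generating loadings
R <- f %o% f
diag(R) <- 1                        # population correlation matrix

# First principal component: analyzes the full matrix (1s on the diagonal).
e_pc <- eigen(R, symmetric = TRUE)
pc1  <- abs(e_pc$vectors[, 1]) * sqrt(e_pc$values[1])

# Common factor: analyzes the reduced matrix (communalities on the diagonal).
# The true communalities f^2 are assumed known here for illustration.
R_reduced <- R
diag(R_reduced) <- f^2
e_fa <- eigen(R_reduced, symmetric = TRUE)
fa1  <- abs(e_fa$vectors[, 1]) * sqrt(e_fa$values[1])

round(rbind(generating = f, component = pc1, factor = fa1), 2)
```

The component loadings are pulled away from the generating values because the components also account for unique variance, while the factor loadings reproduce the model.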

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ² value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3
Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38

> ml3
Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
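The congruence coefficient is the cosine of the angle between two columns of loadings: the sum of cross products divided by the square root of the product of the sums of squares. The following base R sketch implements that definition. The function name congruence and the two small loading matrices are illustrative and are not taken from the psych package.

```r
# Congruence (cosine) between the columns of two loading matrices.
# congruence is a hypothetical helper written for this sketch.
congruence <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  cross <- t(x) %*% y                                    # sums of cross products
  scale <- sqrt(colSums(x^2)) %o% sqrt(colSums(y^2))     # product of column lengths
  cross / scale
}

# Illustrative loadings: a clean two factor pattern and a less clean one.
a <- cbind(A1 = c(.8, .7, .6, 0, 0, 0),
           A2 = c(0, 0, 0, .7, .6, .5))
b <- cbind(B1 = c(.75, .76, .78, .29, .20, .07),
           B2 = c(.27, .21, .11, .69, .71, .76))
round(congruence(a, b), 2)
```

Unlike a correlation, the coefficient does not subtract the column means, so two columns with all positive loadings can be highly congruent even when their patterns differ.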

By default, all three of these procedures use the varimax rotation criterion. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p

Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957), with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measurement of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.
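The relation between a hierarchical solution and a bifactor solution can be sketched with the Schmid-Leiman transformation: each variable's loading on the general factor is its first-order loading times the loading of its factor on g, and its residualized group loading is the first-order loading times the square root of one minus that squared g loading. The base R sketch below assumes illustrative first-order and higher-order loadings (these values are assumptions made for the example, and the object names are hypothetical); it does not call omega.

```r
# Assumed first-order loadings of nine variables on their own factor
first_order <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
factor_of   <- rep(1:3, each = 3)   # V1-V3 -> F1, V4-V6 -> F2, V7-V9 -> F3
# Assumed loadings of the three lower-level factors on g
g_on_factor <- c(0.9, 0.8, 0.7)

# Schmid-Leiman: general loading and residualized group loading
g_loading     <- first_order * g_on_factor[factor_of]
group_loading <- first_order * sqrt(1 - g_on_factor[factor_of]^2)

sl <- cbind(g     = g_loading,
            group = group_loading,
            h2    = g_loading^2 + group_loading^2)
rownames(sl) <- paste0("V", 1:9)
round(sl, 2)
```

The communality h2 is unchanged by the transformation (it equals the squared first-order loading); only its partition into general and group variance changes.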

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> om.bi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> om.bi <- omega(bifact)

[Figure 9 diagram: omega plot with a general factor g, group factors F1, F2, F3, and variables V1 to V9]

Figure 9: An exploratory bifactor solution to the nine variable problem

> om.hi <- omega(bifact,sl=FALSE)

[Figure 10 diagram: omega plot with variables V1 to V9 loading on F1, F2, F3, which in turn load on a higher order factor g]

Figure 10: An exploratory hierarchical solution to the nine variable problem

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0

The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23

The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
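The reliability estimates in this output can be approximated by hand from the Schmid-Leiman loadings. The sketch below, in base R, takes the rounded loadings and uniquenesses printed above and computes omega hierarchical as the squared sum of the general factor loadings divided by the total variance of the composite, omega total as the common variance divided by the total variance, and the explained common variance. The object names are hypothetical, and because the inputs are rounded the results only agree with the printed values to about two decimals.

```r
# Rounded Schmid-Leiman loadings and uniquenesses from the omega output
g   <- c(0.72, 0.63, 0.54, 0.56, 0.48, 0.40, 0.49, 0.42, 0.28)
grp <- c(0.35, 0.31, 0.26, 0.42, 0.36, 0.30, 0.50, 0.43, 0.29)
u2  <- c(0.36, 0.51, 0.64, 0.51, 0.64, 0.75, 0.51, 0.64, 0.84)
factor_of <- rep(1:3, each = 3)   # group factor of each variable

# Variance of the unit-weighted total score, split by source
general_var <- sum(g)^2
group_var   <- sum(tapply(grp, factor_of, sum)^2)
total_var   <- general_var + group_var + sum(u2)

omega_h <- general_var / total_var
omega_t <- (general_var + group_var) / total_var
ecv     <- sum(g^2) / (sum(g^2) + sum(grp^2))

round(c(omega_h = omega_h, omega_total = omega_t, ECV = ecv), 2)
```

With these inputs the three values round to 0.70, 0.82, and 0.67, matching Omega Hierarchical, Omega Total, and the Explained Common Variance reported by omega.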

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

[Figure 11 diagram: "Hierarchical cluster analysis of bifact data", a cluster tree joining variables V1 to V9 into clusters C1 to C8, annotated with α and path values]

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
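A population correlation matrix of this form is the product of the factor loadings, the factor intercorrelations, and the transposed loadings, with the diagonal set to 1. The base R sketch below reproduces the printed matrix without calling sim.structural. The names fx, Phi, and fy follow the Call line, but their numerical values are assumptions inferred from the printed correlations and reliabilities, and the combined loading matrix L is a hypothetical helper.

```r
# Assumed loadings of the five X variables on two latent predictors
fx <- matrix(c(0.9, 0.8, 0.6, 0,   0,
               0,   0,   0.6, 0.8, -0.7), ncol = 2)
# Assumed loadings of the three Y variables on one latent criterion
fy <- matrix(c(0.6, 0.5, 0.4), ncol = 1)
# Assumed correlations among the three latent variables (X1, X2, Y1)
Phi <- matrix(c(1,   0,   0.7,
                0,   1,   0.7,
                0.7, 0.7, 1), ncol = 3)

# Stack the two measurement models into one 8 x 3 loading matrix
L <- rbind(cbind(fx, 0), cbind(matrix(0, 3, 2), fy))
rownames(L) <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")

R <- L %*% Phi %*% t(L)     # model-implied covariances
reliability <- diag(R)      # communalities = population reliabilities
diag(R) <- 1                # correlations have unit diagonal

round(R, 2)
round(reliability, 2)
```

Under these assumed values the off-diagonal elements and the reliabilities agree with the $model and $reliability output to two decimals, for example 0.9 × 0.8 = 0.72 for V with Q.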

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

[Figure 12 diagram: "Exploratory Structural Model", with V, Q, A, nach, and Anx loading on X1 and X2, gpa, Pre, and MA loading on Y1, and paths from X1 and X2 to Y1]

Figure 12: An example of an Exploratory Structural Equation Model

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function (which requires Rgraphviz), or the structure.diagram or structure.sem functions (which do not), may be used.
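The difference between the two models is easiest to see in the correlation matrices they imply. In the base R sketch below, a one factor model with loadings λ implies off-diagonal correlations λλ′; under tau equivalence every loading is the same, so every off-diagonal correlation is the same. The loadings and the helper function implied are hypothetical values chosen for illustration, not estimates from the cong data set.

```r
# Illustrative loadings (assumptions, not estimates from any data set)
congeneric_loadings <- c(0.8, 0.7, 0.6, 0.5)   # free to differ
tau_loadings        <- rep(0.65, 4)            # constrained to be equal

# Correlation matrix implied by a one factor model with unit factor variance
implied <- function(lambda) {
  R <- tcrossprod(lambda)                      # lambda %*% t(lambda)
  diag(R) <- 1
  dimnames(R) <- list(paste0("V", 1:4), paste0("V", 1:4))
  R
}

implied(congeneric_loadings)
implied(tau_loadings)
```

With four variables there are 10 distinct variances and covariances. The congeneric model estimates 8 parameters (4 loadings and 4 error variances), leaving 2 degrees of freedom, while the tau equivalent model estimates 5 (1 loading and 4 error variances), leaving 5.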

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it

      Path       Parameter Value
 [1,] "X1->V1"   "a"       NA
 [2,] "X1->V2"   "a"       NA
 [3,] "X1->V3"   "a"       NA
 [4,] "X1->V4"   "a"       NA
 [5,] "V1<->V1"  "x1e"     NA
 [6,] "V2<->V2"  "x2e"     NA
 [7,] "V3<->V3"  "x3e"     NA
 [8,] "V4<->V4"  "x4e"     NA
 [9,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model

      Path       Parameter Value
 [1,] "X1->V1"   "a"       NA
 [2,] "X1->V2"   "b"       NA
 [3,] "X1->V3"   "c"       NA
 [4,] "X1->V4"   "d"       NA
 [5,] "V1<->V1"  "x1e"     NA
 [6,] "V2<->V2"  "x2e"     NA
 [7,] "V3<->V3"  "x3e"     NA
 [8,] "V4<->V4"  "x4e"     NA
 [9,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)  #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

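The anova comparison of the congeneric versus tau equivalent model shows that, with only 100 subjects, the change in χ² (3.65) is not significant given the change of 3 degrees of freedom (p = 0.30), so the (false) tau equivalent model cannot be rejected in this sample.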
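The likelihood ratio test reported by anova can be checked by hand: the test statistic is the difference between the two model χ² values, and it is referred to a χ² distribution with degrees of freedom equal to the difference in model degrees of freedom. The base R sketch below uses the χ² values and degrees of freedom printed in the two summaries above; the variable names are hypothetical.

```r
# Model chi squares and degrees of freedom copied from the summaries above
chisq_tau  <- 6.593496; df_tau  <- 5   # tau equivalent (more constrained)
chisq_cong <- 2.941678; df_cong <- 2   # congeneric (less constrained)

# Likelihood ratio (chi square difference) test
lr_chisq <- chisq_tau - chisq_cong
lr_df    <- df_tau - df_cong
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

round(c(LR = lr_chisq, df = lr_df, p = p_value), 4)
```

The difference of about 3.65 on 3 degrees of freedom corresponds to a p value of about 0.30, the value shown in the Pr(>Chisq) column.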

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model can not be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model

      Path       Parameter Value
 [1,] "X1->V1"   "a"       NA
 [2,] "X1->V2"   "b"       NA
 [3,] "X1->V3"   "c"       NA
 [4,] "X1->V4"   "d"       NA
 [5,] "X1->V5"   "e"       NA
 [6,] "X1->V6"   "f"       NA
 [7,] "X1->V7"   "g"       NA
 [8,] "X1->V8"   "h"       NA
 [9,] "X1->V9"   "i"       NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
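The block pattern in these residuals follows directly from what a single factor can and cannot reproduce. The base R sketch below rebuilds a nine variable correlation matrix from an assumed three factor oblique structure (loadings and factor correlations taken from the rounded exploratory ml3 solution reported earlier, which is an assumption about how bifact is structured), then subtracts the correlations implied by the one factor estimates printed above. Object names are hypothetical.

```r
# Assumed population structure: three correlated factors, simple structure
lambda <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
Lambda <- matrix(0, 9, 3)
Lambda[cbind(1:9, rep(1:3, each = 3))] <- lambda
Phi <- matrix(c(1,    0.72, 0.63,
                0.72, 1,    0.56,
                0.63, 0.56, 1), 3, 3)
R <- Lambda %*% Phi %*% t(Lambda)
diag(R) <- 1

# One factor loadings copied from the sem.one parameter estimates (a to i)
one_factor <- c(0.7507129, 0.6726412, 0.5811209, 0.5737425, 0.5021915,
                0.4239908, 0.5067957, 0.4450171, 0.3052415)

# Residual = observed correlation minus the product of the two loadings
resid <- R - tcrossprod(one_factor)
diag(resid) <- 0
dimnames(resid) <- list(paste0("V", 1:9), paste0("V", 1:9))
round(resid, 2)
```

Under these assumptions the residuals are positive within each group of three variables (for example about 0.19 for V7 with V8) and slightly negative between groups: the single factor underestimates the within-group correlations and overestimates the rest.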

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
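The size of the between-group residuals can be worked out directly. An orthogonal model with simple structure implies a correlation of zero between variables that load on different factors, so the residual for such a pair is the whole observed correlation. Assuming the data follow the oblique three factor structure found in the exploratory analyses (loadings $\lambda$ and factor correlations $\phi$ as reported for the ml3 solution, an assumption made for this illustration), the residuals are:

```latex
\begin{align*}
\text{residual}_{ij} &= r_{ij} - 0 = \lambda_i \,\phi_{kl}\, \lambda_j
   \qquad (i \text{ on factor } k,\; j \text{ on factor } l,\; k \neq l) \\
\text{residual}_{V1,V4} &= 0.8 \times 0.72 \times 0.7 = 0.4032 \approx 0.40 \\
\text{residual}_{V1,V9} &= 0.8 \times 0.63 \times 0.4 = 0.2016 \approx 0.20 \\
\text{residual}_{V6,V9} &= 0.5 \times 0.56 \times 0.4 = 0.1120 \approx 0.11
\end{align*}
```

These agree with the corresponding entries of the residual matrix, which suggests that the misfit is entirely due to fixing the factor correlations at zero.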

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)         #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)   #create the sem model
> mod.f3                     #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

[Figure 13 diagram: "Structural model", with V1 to V3 loading on MR1, V4 to V6 on MR3, V7 to V9 on MR2, and correlations among the three factors]

Figure 13: A three factor, oblique solution

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is “lazy loaded”). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27 Sentences <--- g
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30 Vocabulary <--- g
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26 SentCompletion <--- g
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18 FirstLetters <--- g
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17 FourLetterWords <--- g
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15 Suffixes <--- g
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15 LetterSeries <--- g
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21 Pedigrees <--- g
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13 LetterGroup <--- g
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09 Sentences <--- F1
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08 Vocabulary <--- F1
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06 SentCompletion <--- F1
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13 FirstLetters <--- F2
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10 FourLetterWords <--- F2
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07 Suffixes <--- F2
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06 LetterSeries <--- F3
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03 Pedigrees <--- F3
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04 LetterGroup <--- F3
e1                0.1723633 0.03405646  5.0611045 4.168346e-07 Sentences <--> Sentences
e2                0.1698419 0.03001233  5.6590697 1.521958e-08 Vocabulary <--> Vocabulary
e3                0.2684749 0.03316228  8.0957909 5.689350e-16 SentCompletion <--> SentCompletion
e4                0.2528108 0.07942835  3.1828791 1.458185e-03 FirstLetters <--> FirstLetters
e5                0.3873510 0.06317399  6.1314949 8.705712e-10 FourLetterWords <--> FourLetterWords
e6                0.5175679 0.05955079  8.6912013 3.586269e-18 Suffixes <--> Suffixes
e7                0.1496709 0.21861502  0.6846325 4.935759e-01 LetterSeries <--> LetterSeries
e8                0.5003855 0.05956551  8.4005902 4.442400e-17 Pedigrees <--> Pedigrees
e9                0.5517474 0.08455914  6.5249884 6.800680e-11 LetterGroup <--> LetterGroup

 Iterations =  72

[Figure 14 diagram: omega plot with a general factor g and group factors F1, F2, F3 for the nine Thurstone variables]

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

 Iterations =  54

[Figure 15 diagram: omega plot with the nine Thurstone variables loading on F1, F2, F3, which in turn load on a higher order factor g]

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi,digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
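The comparison of the two Thurstone models can be reproduced from the summary statistics alone. The base R sketch below uses the χ² values and degrees of freedom printed in the two summaries to recompute the likelihood ratio test and the BIC values. The BIC formula used here (χ² minus the degrees of freedom times the log of the sample size) is an assumption inferred from the printed output, which it reproduces; the list and function names are hypothetical.

```r
n_obs <- 213                           # sample size passed to sem
hi <- list(chisq = 38.1963, df = 24)   # hierarchical model (sem.hi)
bi <- list(chisq = 24.2163, df = 18)   # bifactor model (sem.bi)

# Chi square difference test between the nested models
lr_chisq <- hi$chisq - bi$chisq
lr_df    <- hi$df - bi$df
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

# BIC as printed in the summaries: chisq - df * log(N)
bic <- function(m) m$chisq - m$df * log(n_obs)

round(c(LR = lr_chisq, df = lr_df, p = p_value,
        BIC_hi = bic(hi), BIC_bi = bic(bi)), 4)
```

The χ² difference of 13.98 on 6 degrees of freedom gives p of about 0.03, in favor of the bifactor model. The BIC, which penalizes the 6 additional parameters more heavily, is lower for the hierarchical model (about -90.5 versus -72.3), so the two criteria do not point the same way here.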

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first performs an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

gt omsem lt- omegaSem(Thurstonenobs=213)

44

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                     g  F1*  F2*  F3*   h2   u2   p2
Sentences         0.71 0.56           0.82 0.18 0.61
Vocabulary        0.73 0.55           0.84 0.16 0.63
Sent.Completion   0.68 0.52           0.74 0.26 0.63
First.Letters     0.65      0.56      0.73 0.27 0.57
Four.Letter.Words 0.62      0.49      0.63 0.37 0.61
Suffixes          0.56      0.41      0.50 0.50 0.63
Letter.Series     0.59           0.62 0.73 0.27 0.48
Pedigrees         0.58 0.24      0.34 0.51 0.49 0.66
Letter.Group      0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g  F1*  F2*  F3*
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g  F1*  F2*  F3*
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                     g  F1*  F2*  F3*   h2   u2   p2
Sentences         0.77 0.49           0.82 0.18 0.72
Vocabulary        0.79 0.45           0.83 0.17 0.75
Sent.Completion   0.75 0.40           0.73 0.27 0.77
First.Letters     0.61      0.61      0.74 0.26 0.50
Four.Letter.Words 0.60      0.50      0.61 0.39 0.59
Suffixes          0.57      0.39      0.48 0.52 0.68
Letter.Series     0.57           0.73 0.85 0.15 0.38
Pedigrees         0.66           0.25 0.50 0.50 0.87
Letter.Group      0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g  F1*  F2*  F3*
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g   F1*  F2*  F3*
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
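The difference between the two solutions can also be seen in the share of common variance attributed to g. The following is a minimal sketch in base R, assuming only the eigenvalues printed for the exploratory and confirmatory solutions; the helper ecv is a hypothetical name introduced here, not a psych function.

```r
# Eigenvalues (sums of squared loadings) copied from the two printed solutions
efa <- c(g = 3.58, F1 = 0.96, F2 = 0.74, F3 = 0.72)  # exploratory (omega)
cfa <- c(g = 3.86, F1 = 0.60, F2 = 0.78, F3 = 0.75)  # confirmatory (omegaSem)

# Explained common variance: share of the common variance due to g
ecv <- function(ev) unname(ev["g"] / sum(ev))

round(c(EFA = ecv(efa), CFA = ecv(cfa)), 2)
```

These ratios agree with the "Explained Common Variance of the general factor" lines in the printed output (0.6 and 0.64), and move in the same direction as ωh.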

Figure 16: Confirmatory Omega structure using omegaSem.

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


2.1 sim.congeneric

Classical test theory considers tests to be tau equivalent if they have the same covariance with a vector of latent true scores, but perhaps different error variances. Tests are considered congeneric if they each have the same true score component (perhaps to a different degree) and independent error components. The sim.congeneric function may be used to generate either structure.

The first example considers four tests with equal loadings on a latent factor (that is, a τ equivalent model). If the number of subjects is not specified, a population correlation matrix will be generated. If N is specified, then the sample correlation matrix is returned. If the "short" option is FALSE, then the population matrix, sample matrix, and sample data are all returned as elements of a list.

> library(psych)
> library(psychTools)
> set.seed(42)
> tau <- sim.congeneric(loads=c(.8,.8,.8,.8)) #population values
> tau.samp <- sim.congeneric(loads=c(.8,.8,.8,.8),N=100) # sample correlation matrix for 100 cases
> round(tau.samp,2)
     V1   V2   V3   V4
V1 1.00 0.68 0.72 0.66
V2 0.68 1.00 0.65 0.67
V3 0.72 0.65 1.00 0.76
V4 0.66 0.67 0.76 1.00
> tau.samp <- sim.congeneric(loads=c(.8,.8,.8,.8),N=100, short=FALSE)
> tau.samp
Call: NULL

 $model (Population correlation matrix)
     V1   V2   V3   V4
V1 1.00 0.64 0.64 0.64
V2 0.64 1.00 0.64 0.64
V3 0.64 0.64 1.00 0.64
V4 0.64 0.64 0.64 1.00

 $r  (Sample correlation matrix  for sample size = 100 )
     V1   V2   V3   V4
V1 1.00 0.70 0.62 0.58
V2 0.70 1.00 0.65 0.64
V3 0.62 0.65 1.00 0.59
V4 0.58 0.64 0.59 1.00
> dim(tau.samp$observed)
[1] 100   4

In this last case, the generated data are retrieved from tau.samp$observed. Congeneric data are created by specifying unequal loading values. The default values are loadings of c(.8,.7,.6,.5). As seen in Figure 1, tau equivalence is the special case where all paths are equal.

> cong <- sim.congeneric(N=100)
> round(cong,2)
     V1   V2   V3   V4
V1 1.00 0.57 0.53 0.46
V2 0.57 1.00 0.35 0.41
V3 0.53 0.35 1.00 0.43
V4 0.46 0.41 0.43 1.00
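The population matrices behind these samples follow directly from the loadings: under a one factor model the correlation between two standardized tests is the product of their loadings. The following is a minimal sketch in base R, not the sim.congeneric implementation; implied_cor is a hypothetical helper introduced for illustration.

```r
# Population correlations implied by a one factor (congeneric) model:
# r_ij = lambda_i * lambda_j for i != j, with 1s on the diagonal.
implied_cor <- function(loads) {
  R <- outer(loads, loads)            # lambda_i * lambda_j
  diag(R) <- 1                        # standardized variables
  nm <- paste0("V", seq_along(loads))
  dimnames(R) <- list(nm, nm)
  R
}

tau_pop  <- implied_cor(c(.8, .8, .8, .8))  # tau equivalent: equal loadings
cong_pop <- implied_cor(c(.8, .7, .6, .5))  # congeneric: unequal loadings
round(tau_pop, 2)
round(cong_pop, 2)
```

With equal loadings of .8 every off diagonal element is .64, as in the $model matrix shown earlier; the sample matrices differ from these values only by sampling error.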

> plot.new()
> m1 <- structure.diagram(c("a","b","c","d"))

Figure 1: Tau equivalent tests are special cases of congeneric tests. Tau equivalence assumes a=b=c=d.

2.2 sim.hierarchical

The previous function, sim.congeneric, is used when one factor accounts for the pattern of correlations. A slightly more complicated model is when one broad factor and several narrower factors are observed. An example of this structure might be the structure of mental abilities, where there is a broad factor of general ability and several narrower factors (e.g., spatial ability, verbal ability, working memory capacity). Another example is in the measure of psychopathology, where a broad general factor of neuroticism is seen along with more specific anxiety, depression, and aggression factors. This kind of structure may be simulated with sim.hierarchical, specifying the loadings of each sub factor on a general factor (the g-loadings) as well as the loadings of individual items on the lower order factors (the f-loadings). An early paper describing a bifactor structure was by Holzinger and Swineford (1937). A helpful description of what makes a good general factor is that of Jensen and Weng (1994).

For those who prefer real data to simulated data, six data sets are included in the bifactor data set. One is the original 14 variable problem of Holzinger and Swineford (1937) (holzinger), a second is a nine variable problem adapted by Bechtoldt (1961) from Thurstone and Thurstone (1941) (the data set is used as an example in the SAS manual and discussed in great detail by McDonald (1999)), a third is from a recent paper by Reise et al. (2007) with 16 measures of patient reports of interactions with their health care provider.

> set.seed(42)
> gload=matrix(c(.9,.8,.7),nrow=3)
> fload <- matrix(c(.8,.7,.6,rep(0,9),.7,.6,.5,
+                   rep(0,9),.7,.6,.4), ncol=3)
> fload #echo it to see the structure
      [,1] [,2] [,3]
 [1,]  0.8  0.0  0.0
 [2,]  0.7  0.0  0.0
 [3,]  0.6  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.7
 [8,]  0.0  0.0  0.6
 [9,]  0.0  0.0  0.4
> bifact <- sim.hierarchical(gload=gload,fload=fload)
> round(bifact,2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 1.00 0.56 0.48 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.56 1.00 0.42 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.48 0.42 1.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 1.00 0.42 0.35 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.42 1.00 0.30 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.35 0.30 1.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 1.00 0.42 0.28
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.42 1.00 0.24
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.28 0.24 1.00

These data can be represented as either a bifactor (Figure 2, panel A) or hierarchical (Figure 2, panel B) factor solution. The analysis was done with the omega function.
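The bifact matrix can be reproduced from the two loading matrices alone. The following is a minimal sketch in base R under the assumption that the lower order factors correlate only through g; it illustrates the algebra and is not the sim.hierarchical source code.

```r
gload <- matrix(c(.9, .8, .7), nrow = 3)
fload <- matrix(c(.8, .7, .6, rep(0, 9), .7, .6, .5,
                  rep(0, 9), .7, .6, .4), ncol = 3)

# Correlations among the three lower order factors are produced by g alone
Phi <- gload %*% t(gload)
diag(Phi) <- 1

# Model implied correlations among the nine observed variables
R <- fload %*% Phi %*% t(fload)
diag(R) <- 1
dimnames(R) <- list(paste0("V", 1:9), paste0("V", 1:9))
round(R, 2)
```

For example, V1 and V4 load on different group factors, so their correlation is .8 × (.9 × .8) × .7 ≈ .40, while V1 and V2 share a group factor and correlate .8 × .7 = .56.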

2.3 sim.item and sim.circ

Many personality questionnaires are thought to represent multiple, independent factors. A particularly interesting case is when there are two factors and the items either have simple structure or circumplex structure. Examples of such items with a circumplex structure are measures of emotion (Rafaeli and Revelle, 2006), where many different emotion terms can be arranged in a two dimensional space, but where there is no obvious clustering of items. Typical personality scales are constructed to have simple structure, where items load on one and only one factor.

An additional challenge to measurement with emotion or personality items is that the items can be highly skewed and are assessed with a small number of discrete categories (do not agree, somewhat agree, strongly agree).

The more general sim.item function, and the more specific sim.circ function, simulate items with a two dimensional structure, with or without skew, and varying the number of categories for the items. An example of a circumplex structure is shown in Figure 3.
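What a circumplex means for the loadings can be sketched in a few lines of base R. This is an illustration only: it does not reproduce the sim.circ implementation or its defaults, and the common vector length of 0.6 is an assumption chosen for the example.

```r
nvar   <- 16
theta  <- 2 * pi * (seq_len(nvar) - 1) / nvar   # equally spaced angles
lambda <- 0.6                                   # assumed common vector length
L <- cbind(F1 = lambda * cos(theta), F2 = lambda * sin(theta))

# Implied population correlations for two orthogonal factors
R <- L %*% t(L)
diag(R) <- 1

round(L[1:5, ], 2)     # loadings move smoothly around the circle
range(rowSums(L^2))    # every item has the same communality
```

In contrast, simple structure would place all items near angles of 0 or 90 degrees, so each item loads on one factor only.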

2.4 sim.structure

A more general case is to consider three matrices, fx, φxy, fy, which describe, in turn, a measurement model of x variables, fx, a measurement model of y variables, fy, and a covariance matrix between and within the two sets of factors. If fx is a vector and fy and phixy are NULL, then this is just the congeneric model. If fx is a matrix of loadings with n rows and c columns, then this is a measurement model for n variables across c factors. If phixy is not null, but fy is NULL, then the factors in fx are correlated. Finally, if all three matrices are not NULL, then the data show the standard linear structural relations (LISREL) structure.

Consider the following examples:

2.4.1 fx is a vector implies a congeneric model

> set.seed(42)
> fx <- c(.9,.8,.7,.6)
> cong1 <- sim.structure(fx)
> cong1
Call: sim.structure(fx = fx)

 $model (Population correlation matrix)
     V1   V2   V3   V4
V1 1.00 0.72 0.63 0.54
V2 0.72 1.00 0.56 0.48
V3 0.63 0.56 1.00 0.42
V4 0.54 0.48 0.42 1.00

 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.36

> op <- par(mfrow=c(1,2))
> m.bi <- omega(bifact,title="A bifactor model")
> m.hi <- omega(bifact,sl=FALSE,title="A hierarchical model")
> op <- par(mfrow = c(1,1))

Figure 2: (Left panel) A bifactor solution represents each test in terms of a general factor and a residualized group factor. (Right Panel) A hierarchical factor solution has g as a second order factor accounting for the correlations between the first order factors.

> circ <- sim.circ(16)
> f2 <- fa(circ,2)
> plot(f2,title="16 simulated variables in a circumplex pattern")

Figure 3: Emotion items or interpersonal items frequently show a circumplex structure. Data generated by sim.circ and factor loadings found by the principal axis algorithm using factor.pa.

2.4.2 fx is a matrix implies an independent factors model:

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> three.fact <- sim.structure(fx)
> three.fact
Call: sim.structure(fx = fx)

 $model (Population correlation matrix)
     V1   V2   V3   V4   V5   V6   V7  V8   V9
V1 1.00 0.72 0.63 0.00 0.00 0.00 0.00 0.0 0.00
V2 0.72 1.00 0.56 0.00 0.00 0.00 0.00 0.0 0.00
V3 0.63 0.56 1.00 0.00 0.00 0.00 0.00 0.0 0.00
V4 0.00 0.00 0.00 1.00 0.42 0.35 0.00 0.0 0.00
V5 0.00 0.00 0.00 0.42 1.00 0.30 0.00 0.0 0.00
V6 0.00 0.00 0.00 0.35 0.30 1.00 0.00 0.0 0.00
V7 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.3 0.24
V8 0.00 0.00 0.00 0.00 0.00 0.00 0.30 1.0 0.20
V9 0.00 0.00 0.00 0.00 0.00 0.00 0.24 0.2 1.00

 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

Figure 4: Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram.

2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model

> Phi = matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol=3)
> cor.f3 <- sim.structure(fx,Phi)
> fx
      [,1] [,2] [,3]
 [1,]  0.9  0.0  0.0
 [2,]  0.8  0.0  0.0
 [3,]  0.7  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.6
 [8,]  0.0  0.0  0.5
 [9,]  0.0  0.0  0.4
> Phi
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.3
[2,]  0.5  1.0  0.2
[3,]  0.3  0.2  1.0
> cor.f3
Call: sim.structure(fx = fx, Phi = Phi)

 $model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000

 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16
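The $model and $reliability elements for the correlated factors case follow from the matrix product fx Phi fx'. The following is a minimal sketch in base R of that algebra, assuming standardized variables; it is not the sim.structure source code, and the names common and reliability are introduced here for illustration.

```r
fx  <- matrix(c(.9, .8, .7, rep(0, 9), .7, .6, .5, rep(0, 9), .6, .5, .4), ncol = 3)
Phi <- matrix(c(1, .5, .3, .5, 1, .2, .3, .2, 1), ncol = 3)

common      <- fx %*% Phi %*% t(fx)  # covariance due to the common factors
reliability <- diag(common)          # communality of each variable
R <- common
diag(R) <- 1                         # standardized variables
round(R[1:4, 1:4], 3)
round(reliability, 2)
```

The correlation of V1 with V4 is .9 × .5 × .7 = .315: the product of the two loadings and the correlation between their factors. Setting Phi to the identity matrix gives the independent factors model of the previous section.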

Using symbolic loadings and path coefficients. For some purposes it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs #show the matrix
      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"
> Phis #show this one as well
   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.

> plot.new()
> corf3.mod <- structure.diagram(fxs,Phis)

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input.

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3.p <- Promax(fa(cor.f3$model,3))
> plot.new()
> mod.f3p <- structure.diagram(f3.p,cut=.2)

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y set.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <- rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)
     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> hi.mod <- structure.diagram(fxh,Phi, fy)

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)
Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   MR1
Correlation of (regression) scores with factors   0.94
Multiple R square of scores with factors          0.88
Minimum correlation of possible factor scores     0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
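Why a component differs from a factor can be seen from the eigen decomposition itself. The following is a minimal sketch in base R, assuming the congeneric population matrix generated from loadings of .9, .8, .7 and .6; it illustrates what a first principal component is and is not the code used by principal.

```r
lambda <- c(.9, .8, .7, .6)          # generating factor loadings
R <- outer(lambda, lambda)
diag(R) <- 1                         # components analyze 1s on the diagonal

e   <- eigen(R, symmetric = TRUE)
pc1 <- e$vectors[, 1] * sqrt(e$values[1])   # first component loadings
if (sum(pc1) < 0) pc1 <- -pc1               # eigenvector sign is arbitrary

round(cbind(generating = lambda, PC1 = pc1), 2)
```

Because the component accounts for the total variance of each variable, including its unique variance, its loadings need not match the generating factor loadings, whereas a factor model fits only the off diagonal elements.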

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ² value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3
Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38

> ml3
Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
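The congruence coefficient is simply the cosine of the angle between two columns of loadings. The following is a minimal sketch in base R, using the rounded loadings printed above as input; the helper congruence is a hypothetical name and is not the factor.congruence source code.

```r
# Cosine of the angle between two loading vectors (no centering)
congruence <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Loadings as printed above (rounded to two decimals)
RC1 <- c(.75, .76, .78, .29, .20, .07, .26, .20, .00)  # first rotated component
PA1 <- c(.8, .7, .6, 0, 0, 0, 0, 0, 0)                 # first principal axis factor
PA3 <- c(0, 0, 0, .7, .6, .5, 0, 0, 0)                 # second principal axis factor

round(congruence(RC1, PA1), 2)
round(congruence(PA1, PA3), 2)
```

Unlike a correlation, the coefficient does not subtract the column means, so two factors with non overlapping loadings have a congruence of exactly 0.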

By default, principal uses the varimax rotation criterion, while fa applies an oblique (oblimin) transformation, which is why the pa3 and ml3 solutions above report factor correlations. Perhaps it is useful to apply another oblique transformation such as Promax to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00
    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

gt ombi

Omega

Call omegah(m = m nfactors = nfactors fm = fm key = key flip = flip

digits = digits title = title sl = sl labels = labels

plot = plot nobs = nobs rotate = rotate Phi = Phi option = option

covar = covar)

Alpha 078

G6 078

Omega Hierarchical 07

Omega H asymptotic 085

> om.bi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem.

> om.hi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0
The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23
The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
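The Schmid-Leiman loadings in this output can be checked by hand. The following base R sketch assumes the higher-order structure drawn in Figure 10 (group factors loading 0.9, 0.8 and 0.7 on g, and variables loading 0.8, 0.7, 0.6; 0.7, 0.6, 0.5; 0.7, 0.6, 0.4 on their own factors). These values are read from the figure and the later sem estimates, and the object names gload, fload and sl are illustrative only; this is not the code used inside omega.

```r
# Higher-order structure (assumed, read from Figure 10):
# loadings of the group factors on g, and of V1..V9 on their group factor.
gload <- c(F1 = 0.9, F2 = 0.8, F3 = 0.7)
fload <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), names(gload)))
fload[1:3, "F1"] <- c(0.8, 0.7, 0.6)
fload[4:6, "F2"] <- c(0.7, 0.6, 0.5)
fload[7:9, "F3"] <- c(0.7, 0.6, 0.4)

# Schmid-Leiman transformation:
# general loadings are the product of the two levels of loadings,
# group loadings are residualized by sqrt(1 - gload^2).
g     <- as.vector(fload %*% gload)
group <- sweep(fload, 2, sqrt(1 - gload^2), `*`)
h2    <- g^2 + rowSums(group^2)            # communality
sl    <- cbind(g = g, group, h2 = h2, u2 = 1 - h2, p2 = g^2 / h2)
round(sl, 2)
```

Under these assumptions the rounded g, group, h2, u2 and p2 columns agree with the table above; p2 is the share of each variable's common variance that is due to the general factor.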

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

 $model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

 $reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
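The population matrix above is what a structural model implies: with a combined loading matrix F and a factor correlation matrix Phi, the model is F Phi F' with the diagonal set to 1. The base R sketch below rebuilds it. The loadings fx and fy and the matrix Phi are not shown in this section; the values used here are an assumption, chosen because they reproduce the printed correlations and reliabilities.

```r
# Assumed loadings and factor correlations (inferred from the printed output).
fx  <- matrix(c(0.9, 0.8, 0.6, 0,   0,      # X1: ability
                0,   0,   0.6, 0.8, -0.7),  # X2: motivation
              ncol = 2)
fy  <- matrix(c(0.6, 0.5, 0.4), ncol = 1)   # Y1: outcome
Phi <- matrix(c(1,   0,   0.7,
                0,   1,   0.7,
                0.7, 0.7, 1), ncol = 3)

# Combined 8 x 3 loading matrix: X variables load on X1, X2; Y variables on Y1.
f <- rbind(cbind(fx, 0), cbind(matrix(0, 3, 2), fy))
rownames(f) <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")

model       <- f %*% Phi %*% t(f)   # model implied covariances
reliability <- diag(model)          # communalities before the diagonal is replaced
diag(model) <- 1
round(model, 2)
round(reliability, 2)
```

For example, the V-gpa correlation is 0.9 x 0.7 x 0.6 = 0.378, which prints as 0.38.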

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.
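The difference between the two models is easiest to see in the correlation matrices they imply. The base R sketch below uses a hypothetical helper, implied(), and illustrative loadings (0.8, 0.7, 0.6, 0.5 for the congeneric case, a common 0.65 for the tau equivalent case); none of these values are estimates from the cong data set.

```r
# Correlation matrix implied by a one factor model with loadings lambda.
implied <- function(lambda) {
  R <- tcrossprod(lambda)          # lambda %*% t(lambda)
  diag(R) <- 1
  labels <- paste0("V", seq_along(lambda))
  dimnames(R) <- list(labels, labels)
  R
}

cong_R <- implied(c(0.8, 0.7, 0.6, 0.5))  # congeneric: unequal loadings
tau_R  <- implied(rep(0.65, 4))           # tau equivalent: one common loading

round(cong_R, 2)   # off-diagonal correlations differ by pair
round(tau_R, 2)    # every off-diagonal correlation is the same
```

Tau equivalence is thus the congeneric model with the loadings constrained to be equal, which is why the two can be compared as nested models.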

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
     Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11
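The AIC and BIC values reported for this model are numerically consistent with AIC = χ2 + 2q and BIC = χ2 - df log(N), where q is the number of free parameters and N the number of subjects. The sketch below is a base R check of that arithmetic for the tau equivalent model (χ2 = 6.593496, df = 5, q = 5, N = 100). That sem computes the indices in exactly this way is an assumption, verified here only against the printed values; fit_indices() is a hypothetical helper.

```r
# Hypothetical helper: information indices from a model chi square.
fit_indices <- function(chisq, df, n_par, n_obs) {
  c(AIC = chisq + 2 * n_par,
    BIC = chisq - df * log(n_obs))
}

# Tau equivalent model: one common loading plus four error variances.
tau <- fit_indices(chisq = 6.593496, df = 5, n_par = 5, n_obs = 100)
round(tau, 4)
```

Negative BIC values on this scale indicate that the χ2 is small relative to its degrees of freedom and the sample size.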

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
     Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)  #test the difference between the two models
LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

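The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65 for a change of 3 degrees of freedom, p = 0.30) is not significant. With only 100 subjects, the tau equivalent model, although actually false, cannot be rejected in favor of the congeneric model.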
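The likelihood ratio test reported by anova is just the difference between the two model χ2 values, referred to a χ2 distribution with degrees of freedom equal to the difference in model degrees of freedom. A base R sketch using the values printed above (the variable names are illustrative):

```r
# Model chi squares and degrees of freedom taken from the two summaries.
chisq_tau  <- 6.5935; df_tau  <- 5   # tau equivalent (more constrained)
chisq_cong <- 2.9417; df_cong <- 2   # congeneric (less constrained)

lr_chisq <- chisq_tau - chisq_cong   # 3.6518
lr_df    <- df_tau - df_cong         # 3
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

round(c(LR = lr_chisq, df = lr_df, p = p_value), 4)
```

The p value of about 0.30 matches the Pr(>Chisq) column of the anova table.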

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although the model is not correct, with 100 subjects it cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
     Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
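The block pattern in these residuals (positive within each triplet of variables, slightly negative between triplets) is what remains after a single factor is removed from a hierarchical structure. The base R sketch below recomputes a few residuals as the population correlation minus the product of the estimated one factor loadings. The hierarchical loadings used to build the population matrix are an assumption (read from Figure 10); the one factor loadings are the estimates a to i printed above.

```r
# Assumed hierarchical population structure (see Figure 10).
gload <- c(0.9, 0.8, 0.7)
fload <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), NULL))
fload[1:3, 1] <- c(0.8, 0.7, 0.6)
fload[4:6, 2] <- c(0.7, 0.6, 0.5)
fload[7:9, 3] <- c(0.7, 0.6, 0.4)
Phi <- tcrossprod(gload); diag(Phi) <- 1      # factor intercorrelations via g
R_pop <- fload %*% Phi %*% t(fload); diag(R_pop) <- 1

# One factor loadings a..i as estimated by sem (printed above).
lambda <- c(0.7507129, 0.6726412, 0.5811209, 0.5737425, 0.5021915,
            0.4239908, 0.5067957, 0.4450171, 0.3052415)

res <- R_pop - tcrossprod(lambda)             # observed minus model implied
diag(res) <- 0
round(res, 2)
```

A single factor must compromise between the strong within-group correlations and the weaker between-group ones, so it underfits the former and overfits the latter.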

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
      Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
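Why the between-group residuals are so large follows directly from the model: with orthogonal factors and no cross loadings, the implied correlation between variables on different factors is exactly zero, so each between-group residual is simply the population correlation itself. A base R sketch, again assuming the hierarchical population values from Figure 10 and using the rounded loadings estimated above:

```r
# Simple structure loadings (rounded sem estimates; also the assumed population values).
fload <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), NULL))
fload[1:3, 1] <- c(0.8, 0.7, 0.6)
fload[4:6, 2] <- c(0.7, 0.6, 0.5)
fload[7:9, 3] <- c(0.7, 0.6, 0.4)

# Assumed population: factors correlate through g (loadings 0.9, 0.8, 0.7).
Phi <- tcrossprod(c(0.9, 0.8, 0.7)); diag(Phi) <- 1
R_pop <- fload %*% Phi %*% t(fload); diag(R_pop) <- 1

# Orthogonal model: Phi is the identity, so the implied matrix is block diagonal.
R_orth <- tcrossprod(fload); diag(R_orth) <- 1

round(R_pop - R_orth, 2)   # zero within blocks, population r between blocks
```

Freeing the factor correlations, as in the next section, is what allows the model to account for these between-group correlations.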

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)       #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)   #create the sem model
> mod.f3   #show it
      Path       Parameter Value
 [1,] "MR1->V1"  "F1V1"    NA
 [2,] "MR1->V2"  "F1V2"    NA
 [3,] "MR1->V3"  "F1V3"    NA
 [4,] "MR3->V4"  "F2V4"    NA
 [5,] "MR3->V5"  "F2V5"    NA
 [6,] "MR3->V6"  "F2V6"    NA
 [7,] "MR2->V7"  "F3V7"    NA
 [8,] "MR2->V8"  "F3V8"    NA
 [9,] "MR2->V9"  "F3V9"    NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"  NA
[20,] "MR2<->MR1" "rF3F1"  NA
[21,] "MR2<->MR3" "rF3F2"  NA
[22,] "MR1<->MR1" NA       "1"
[23,] "MR3<->MR3" NA       "1"
[24,] "MR2<->MR2" NA       "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph (Figure 13).

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef function from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone, used in the SAS manual for PROC CALIS, and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                   Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27 Sentences <--- g
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30 Vocabulary <--- g
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26 SentCompletion <--- g
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18 FirstLetters <--- g
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17 FourLetterWords <--- g
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15 Suffixes <--- g
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15 LetterSeries <--- g
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21 Pedigrees <--- g
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13 LetterGroup <--- g
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09 Sentences <--- F1
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08 Vocabulary <--- F1
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06 SentCompletion <--- F1
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13 FirstLetters <--- F2
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10 FourLetterWords <--- F2
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07 Suffixes <--- F2
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06 LetterSeries <--- F3
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03 Pedigrees <--- F3
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04 LetterGroup <--- F3
e1                0.1723633 0.03405646  5.0611045 4.168346e-07 Sentences <--> Sentences
e2                0.1698419 0.03001233  5.6590697 1.521958e-08 Vocabulary <--> Vocabulary
e3                0.2684749 0.03316228  8.0957909 5.689350e-16 SentCompletion <--> SentCompletion
e4                0.2528108 0.07942835  3.1828791 1.458185e-03 FirstLetters <--> FirstLetters
e5                0.3873510 0.06317399  6.1314949 8.705712e-10 FourLetterWords <--> FourLetterWords
e6                0.5175679 0.05955079  8.6912013 3.586269e-18 Suffixes <--> Suffixes
e7                0.1496709 0.21861502  0.6846325 4.935759e-01 LetterSeries <--> LetterSeries
e8                0.5003855 0.05956551  8.4005902 4.442400e-17 Pedigrees <--> Pedigrees
e9                0.5517474 0.08455914  6.5249884 6.800680e-11 LetterGroup <--> LetterGroup

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)
                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                   Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

 Iterations =  54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi,digits=2)
                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)
LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
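The "6 more parameters" can be verified by counting. With 9 observed variables there are 9 x 10 / 2 = 45 distinct variances and covariances; the degrees of freedom of each model are 45 minus the number of free parameters listed in its summary. A base R sketch of that bookkeeping (the variable names are illustrative):

```r
p       <- 9
moments <- p * (p + 1) / 2        # 45 distinct variances and covariances

# Bifactor: 9 loadings on g, 9 group factor loadings, 9 error variances.
q_bi <- 9 + 9 + 9
# Hierarchical: 3 paths from g to the group factors, 9 loadings, 9 error variances.
q_hi <- 3 + 9 + 9

c(df_bi = moments - q_bi, df_hi = moments - q_hi, extra_parameters = q_bi - q_hi)
```

The resulting 18 and 24 degrees of freedom match the two sem summaries, and the difference of 6 is the Df used in the likelihood ratio test above.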

Similar analyses can be done with other data that are organized hierarchically. Examples include the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)
LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem combines both steps: it does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem. (Path diagram titled "Omega from SEM": a general factor g and three group factors F1, F2, F3 with paths to the nine Thurstone variables.)

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.


> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11 psychTools_2.0.8 psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5 splines_4.0.2 MASS_7.3-51.6 mnormt_2.0.1 statmod_1.4.34
 [6] pbivnorm_0.6.0 arm_1.11-1 lattice_0.20-41 mi_1.0 minqa_1.2.4
[11] tools_4.0.2 matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2 tmvnsim_1.0-2
[16] nlme_3.1-148 coda_0.19-3 abind_1.4-5 lme4_1.1-23 lavaan_0.6-6
[21] Matrix_1.2-18 nloptr_1.2.2.2 compiler_4.0.2 boot_1.3-25 stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


> cong <- sim.congeneric(N=100)
> round(cong,2)
     V1   V2   V3   V4
V1 1.00 0.57 0.53 0.46
V2 0.57 1.00 0.35 0.41
V3 0.53 0.35 1.00 0.43
V4 0.46 0.41 0.43 1.00
> plot.new()
> m1 <- structure.diagram(c("a","b","c","d"))

Figure 1: Tau equivalent tests are special cases of congeneric tests. Tau equivalence assumes a=b=c=d. (Path diagram: one latent variable X1 with paths a, b, c, d to x1, x2, x3, x4.)
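The sample matrix printed above differs from any population matrix because it is based on only 100 simulated cases. The following base R sketch illustrates how such data can arise from a congeneric model. It is not the code inside sim.congeneric; the loadings (.8, .7, .6, .5), the seed, and the sample size are assumptions chosen for illustration.

```r
# A minimal sketch of sampling from a congeneric (one factor) model.
# The loadings, the seed and N are illustrative assumptions.
set.seed(42)
loads <- c(.8, .7, .6, .5)
N <- 10000

theta <- rnorm(N)                              # latent true score
err <- matrix(rnorm(N * length(loads)), N)     # independent unique scores

# observed score = loading * true score + sqrt(1 - loading^2) * unique score
x <- outer(theta, loads) + sweep(err, 2, sqrt(1 - loads^2), "*")
colnames(x) <- paste0("V", seq_along(loads))

# population correlations implied by the model, for comparison
pop <- outer(loads, loads)
diag(pop) <- 1

round(cor(x), 2)   # sample correlations
round(pop, 2)      # population correlations
```

With a large N the sample correlations approach the population values; with N = 100, as in the example above, noticeably larger departures are expected. Setting all four loadings equal would give the tau equivalent special case.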

2.2 sim.hierarchical

The previous function, sim.congeneric, is used when one factor accounts for the pattern of correlations. A slightly more complicated model is when one broad factor and several narrower factors are observed. An example of this structure might be the structure of mental abilities, where there is a broad factor of general ability and several narrower factors (e.g., spatial ability, verbal ability, working memory capacity). Another example is in the measure of psychopathology, where a broad general factor of neuroticism is seen along with more specific anxiety, depression, and aggression factors. This kind of structure may be simulated with sim.hierarchical, specifying the loadings of each sub factor on a general factor (the g-loadings) as well as the loadings of individual items on the lower order factors (the f-loadings). An early paper describing a bifactor structure was by Holzinger and Swineford (1937). A helpful description of what makes a good general factor is that of Jensen and Weng (1994).

For those who prefer real data to simulated data, six data sets are included in the bifactor data set. One is the original 14 variable problem of Holzinger and Swineford (1937) (holzinger), a second is a nine variable problem adapted by Bechtoldt (1961) from Thurstone and Thurstone (1941) (the data set is used as an example in the SAS manual and discussed in great detail by McDonald (1999)), a third is from a recent paper by Reise et al. (2007) with 16 measures of patient reports of interactions with their health care provider.

> set.seed(42)
> gload=matrix(c(.9,.8,.7),nrow=3)
> fload <- matrix(c(.8,.7,.6,rep(0,9),.7,.6,.5,
+                   rep(0,9),.7,.6,.4), ncol=3)
> fload #echo it to see the structure
      [,1] [,2] [,3]
 [1,]  0.8  0.0  0.0
 [2,]  0.7  0.0  0.0
 [3,]  0.6  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.7
 [8,]  0.0  0.0  0.6
 [9,]  0.0  0.0  0.4
> bifact <- sim.hierarchical(gload=gload,fload=fload)
> round(bifact,2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 1.00 0.56 0.48 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.56 1.00 0.42 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.48 0.42 1.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 1.00 0.42 0.35 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.42 1.00 0.30 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.35 0.30 1.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 1.00 0.42 0.28
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.42 1.00 0.24
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.28 0.24 1.00

These data can be represented as either a bifactor (Figure 2, panel A) or hierarchical (Figure 2, panel B) factor solution. The analysis was done with the omega function.
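The population matrix for this example can also be obtained by hand, which may help in seeing what the g-loadings and f-loadings imply. The sketch below is plain base R and only illustrates the algebra; it is assumed, not confirmed, that sim.hierarchical proceeds in the same way internally. The factor intercorrelations implied by a single g are the products of the g-loadings, and the item correlations are then F Φ F' with a unit diagonal.

```r
# A sketch of the algebra behind a hierarchical (higher order) structure.
gload <- c(.9, .8, .7)                        # lower order factors on g
fload <- matrix(c(.8, .7, .6, rep(0, 9), .7, .6, .5,
                  rep(0, 9), .7, .6, .4), ncol = 3)   # items on factors

# correlations among the lower order factors implied by one general factor
Phi <- gload %*% t(gload)
diag(Phi) <- 1

# model implied item correlations; the diagonal is set to 1
R <- fload %*% Phi %*% t(fload)
diag(R) <- 1
dimnames(R) <- list(paste0("V", 1:9), paste0("V", 1:9))

round(R, 2)
```

For example, the V1 with V4 entry is .8 × (.9 × .8) × .7 = .4032, which rounds to the 0.40 shown in the output above.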

2.3 sim.item and sim.circ

Many personality questionnaires are thought to represent multiple, independent factors. A particularly interesting case is when there are two factors and the items either have simple structure or circumplex structure. Examples of such items with a circumplex structure are measures of emotion (Rafaeli and Revelle, 2006), where many different emotion terms can be arranged in a two dimensional space but where there is no obvious clustering of items. Typical personality scales are constructed to have simple structure, where items load on one and only one factor.

An additional challenge to measurement with emotion or personality items is that the items can be highly skewed and are assessed with a small number of discrete categories (do not agree, somewhat agree, strongly agree).

The more general sim.item function and the more specific sim.circ function simulate items with a two dimensional structure, with or without skew, and varying the number of categories for the items. An example of a circumplex structure is shown in Figure 3.

2.4 sim.structure

A more general case is to consider three matrices, fx, φxy, and fy, which describe a measurement model of x variables, fx, a measurement model of y variables, fy, and a covariance matrix between and within the two sets of factors, φxy. If fx is a vector and fy and phixy are NULL, then this is just the congeneric model. If fx is a matrix of loadings with n rows and c columns, then this is a measurement model for n variables across c factors. If phixy is not NULL, but fy is NULL, then the factors in fx are correlated. Finally, if all three matrices are not NULL, then the data show the standard linear structural relations (LISREL) structure.

Consider the following examples:

2.4.1 fx is a vector implies a congeneric model

> set.seed(42)
> fx <- c(.9,.8,.7,.6)
> cong1 <- sim.structure(fx)
> cong1
Call: sim.structure(fx = fx)
$model (Population correlation matrix)

> op <- par(mfrow=c(1,2))
> m.bi <- omega(bifact,title="A bifactor model")
> m.hi <- omega(bifact,sl=FALSE,title="A hierarchical model")
> op <- par(mfrow = c(1,1))

Figure 2: (Left panel) A bifactor solution represents each test in terms of a general factor and a residualized group factor. (Right panel) A hierarchical factor solution has g as a second order factor accounting for the correlations between the first order factors.

> circ <- sim.circ(16)
> f2 <- fa(circ,2)
> plot(f2,title="16 simulated variables in a circumplex pattern")

Figure 3: Emotion items or interpersonal items frequently show a circumplex structure. Data generated by sim.circ and factor loadings found by the principal axis algorithm using factor.pa. (Plot of the 16 variables on the MR1 and MR2 axes.)

     V1   V2   V3   V4
V1 1.00 0.72 0.63 0.54
V2 0.72 1.00 0.56 0.48
V3 0.63 0.56 1.00 0.42
V4 0.54 0.48 0.42 1.00
$reliability (population reliability)
[1] 0.81 0.64 0.49 0.36
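For the congeneric case the two parts of this output follow directly from the loadings. A short base R check is sketched below; it is an illustration of the arithmetic rather than the internals of sim.structure. Each off-diagonal correlation is the product of two loadings, and each population reliability is a squared loading.

```r
# Congeneric model: correlations are products of loadings,
# reliabilities are squared loadings.
fx <- c(.9, .8, .7, .6)

R <- outer(fx, fx)          # element (i, j) is fx[i] * fx[j]
diag(R) <- 1                # a correlation matrix has a unit diagonal
dimnames(R) <- list(paste0("V", 1:4), paste0("V", 1:4))

reliability <- fx^2         # proportion of variance due to the factor

round(R, 2)
round(reliability, 2)
```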

2.4.2 fx is a matrix implies an independent factors model:

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> three.fact <- sim.structure(fx)
> three.fact
Call: sim.structure(fx = fx)
$model (Population correlation matrix)
     V1   V2   V3   V4   V5   V6   V7  V8   V9
V1 1.00 0.72 0.63 0.00 0.00 0.00 0.00 0.0 0.00
V2 0.72 1.00 0.56 0.00 0.00 0.00 0.00 0.0 0.00
V3 0.63 0.56 1.00 0.00 0.00 0.00 0.00 0.0 0.00
V4 0.00 0.00 0.00 1.00 0.42 0.35 0.00 0.0 0.00
V5 0.00 0.00 0.00 0.42 1.00 0.30 0.00 0.0 0.00
V6 0.00 0.00 0.00 0.35 0.30 1.00 0.00 0.0 0.00
V7 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.3 0.24
V8 0.00 0.00 0.00 0.00 0.00 0.00 0.30 1.0 0.20
V9 0.00 0.00 0.00 0.00 0.00 0.00 0.24 0.2 1.00
$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model

> Phi = matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol=3)
> cor.f3 <- sim.structure(fx,Phi)
> fx
      [,1] [,2] [,3]
 [1,]  0.9  0.0  0.0
 [2,]  0.8  0.0  0.0
 [3,]  0.7  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.6
 [8,]  0.0  0.0  0.5
 [9,]  0.0  0.0  0.4
> Phi
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.3
[2,]  0.5  1.0  0.2
[3,]  0.3  0.2  1.0
> cor.f3

Figure 4: Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram. (Path diagram: X1 with paths 0.9, 0.8, 0.7 to x1-x3; X2 with paths 0.7, 0.6, 0.5 to x4-x6; X3 with paths 0.6, 0.5, 0.4 to x7-x9.)

Call: sim.structure(fx = fx, Phi = Phi)
$model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000
$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16
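The independent and the correlated factors examples differ only in Phi. As a sketch of the algebra in base R, the helper below (implied_cor is a hypothetical name, not a psych function) forms F Φ F' and sets the diagonal to 1; leaving Phi at the identity matrix gives the independent factors case.

```r
# Model implied correlations for simple structure loadings and factor
# intercorrelations Phi. implied_cor() is a hypothetical helper.
implied_cor <- function(f, phi = diag(ncol(f))) {
  r <- f %*% phi %*% t(f)
  diag(r) <- 1
  r
}

fx <- matrix(c(.9, .8, .7, rep(0, 9), .7, .6, .5,
               rep(0, 9), .6, .5, .4), ncol = 3)
Phi <- matrix(c(1, .5, .3, .5, 1, .2, .3, .2, 1), ncol = 3)

R_orth <- implied_cor(fx)        # independent factors: zero between blocks
R_obl  <- implied_cor(fx, Phi)   # correlated factors

R_orth[1, 4]   # no path between V1 and V4
R_obl[1, 4]    # .9 * .5 * .7
R_obl[1, 7]    # .9 * .3 * .6
```

The between-block correlations are the within-factor loadings multiplied by the relevant element of Phi, which is why V1 and V4 correlate 0.315 in the output above.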

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs #show the matrix
      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"
> Phis #show this one as well
   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.
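To make the labeling convention concrete, the base R sketch below builds the same kind of symbolic loading matrix by hand. The function symbolic_loadings is a hypothetical helper written for this illustration; it is not part of psych and is not claimed to reproduce structure.list in every detail. Each factor is given a letter (a, b, c, ...) and each salient item is labeled with that letter followed by its row number.

```r
# Hypothetical helper (not a psych function): build a character matrix
# of symbolic loadings from a list of item indices for each factor.
symbolic_loadings <- function(nvars, items) {
  m <- matrix("0", nrow = nvars, ncol = length(items),
              dimnames = list(NULL, names(items)))
  for (k in seq_along(items)) {
    # factor k uses the k-th letter; item j gets the label <letter><j>
    m[items[[k]], k] <- paste0(letters[k], items[[k]])
  }
  m
}

fxs_sketch <- symbolic_loadings(9, list(F1 = 1:3, F2 = 4:6, F3 = 7:9))
fxs_sketch
```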

> plot.new()
> corf3.mod <- structure.diagram(fxs,Phis)

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input.

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3.p <- Promax(fa(cor.f3$model,3))
> plot.new()
> mod.f3p <- structure.diagram(f3.p,cut=.2)

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y sets.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <- rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)
     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> hi.mod <- structure.diagram(fxh,Phi, fy)

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and, if n > 1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity = 1
Test of the hypothesis that 1 component is sufficient.
The root mean square of the residuals (RMSR) is 0.11
Fit based upon off diagonal values = 0.96

> fa(cong1$model)
Factor Analysis using method = minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity = 1
Test of the hypothesis that 1 factor is sufficient.
The degrees of freedom for the null model are 6 and the objective function was 1.65
The degrees of freedom for the model are 2 and the objective function was 0
The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0
Fit based upon off diagonal values = 1

Measures of factor score adequacy
                                                 MR1
Correlation of (regression) scores with factors 0.94
Multiple R square of scores with factors        0.88
Minimum correlation of possible factor scores   0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
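One way to see why the two procedures differ is that components summarize the full correlation matrix, with 1s on the diagonal, while common factor analysis models the reduced matrix, with communalities on the diagonal. The base R sketch below makes this comparison for the congeneric example. It is a simplification: the true communalities are inserted directly, whereas factoring procedures have to estimate them, so this is not the algorithm used by fa.

```r
# Components versus common factors for a one factor population model.
fx <- c(.9, .8, .7, .6)
R <- outer(fx, fx)
diag(R) <- 1

# first principal component of the full matrix
e_full <- eigen(R)
pc1 <- abs(e_full$vectors[, 1]) * sqrt(e_full$values[1])

# same decomposition after replacing the diagonal with the (known)
# communalities; in practice these must be estimated
R_reduced <- R
diag(R_reduced) <- fx^2
e_red <- eigen(R_reduced)
f1 <- abs(e_red$vectors[, 1]) * sqrt(e_red$values[1])

round(rbind(model = fx, component = pc1, factor = f1), 2)
```

The component loadings account for unique as well as common variance, so their sum of squares (about 2.69) exceeds that of the factor loadings (2.30), matching the SS loadings in the two outputs above.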

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ² value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity = 1.2
Test of the hypothesis that 3 components are sufficient.
The root mean square of the residuals (RMSR) is 0.1
Fit based upon off diagonal values = 0.88

> pa3
Factor Analysis using method = pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.
The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0
The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0
Fit based upon off diagonal values = 1

Measures of factor score adequacy
                                                PA1  PA3  PA2
Correlation of (regression) scores with factors 0.9 0.85 0.83
Multiple R square of scores with factors        0.8 0.72 0.69
Minimum correlation of possible factor scores   0.6 0.45 0.38

> ml3
Factor Analysis using method = ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.
The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0
The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0
Fit based upon off diagonal values = 1

Measures of factor score adequacy
                                                 ML1  ML3  ML2
Correlation of (regression) scores with factors 0.90 0.85 0.83
Multiple R square of scores with factors        0.80 0.72 0.69
Minimum correlation of possible factor scores   0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
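The congruence coefficient is the cosine of the angle between two columns of loadings: the sum of cross products divided by the square root of the product of the sums of squares. A base R sketch follows; the function name congruence is an assumption for this illustration, and the two vectors are the rounded RC1 and PA1 loadings printed above, so the result agrees with the table only to rounding.

```r
# Cosine of the angle between columns of two loading matrices.
# congruence() is a hypothetical helper, not the psych function.
congruence <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  num <- t(x) %*% y
  den <- sqrt(colSums(x^2)) %o% sqrt(colSums(y^2))
  num / den
}

# rounded loadings taken from the pc3 and pa3 output above
rc1 <- c(.75, .76, .78, .29, .20, .07, .26, .20, .00)
pa1 <- c(.8, .7, .6, 0, 0, 0, 0, 0, 0)

round(congruence(pa1, rc1), 2)
```

Unlike a correlation, the loadings are not centered before the cross products are taken, so two factors with uniformly positive loadings can show substantial congruence even when their patterns differ.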

By default, principal uses the varimax rotation criterion, whereas the fa solutions shown above were rotated obliquely (their output reports the factor correlations). It can also be useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957), with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> om.bi
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> om.bi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem.

> om.hi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than 0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max 4.67 max/min = 1.82
mean percent general = 0.65 with sd = 0.14 and cv of 0.21
Explained Common Variance of the general factor = 0.67

The degrees of freedom are 12 and the fit is 0
The root mean square of the residuals is 0
The df corrected root mean square of the residuals is 0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 0.23
The root mean square of the residuals is 0.07
The df corrected root mean square of the residuals is 0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
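Because bifact was generated from known loadings, the Schmid-Leiman loadings and the two omega coefficients in this output can be checked by hand. The base R sketch below works from the population model rather than from an estimated solution, so it illustrates the definitions and is not the estimation done by omega. The general loading of each item is its factor loading times the g-loading of its factor, the residualized group loading is the factor loading times sqrt(1 - g-loading²), ω_h is the squared sum of the general loadings divided by the total variance, and ω_t is the proportion of total variance that is common.

```r
# Schmid-Leiman loadings and omega from the population model.
gload <- c(.9, .8, .7)
fload <- matrix(c(.8, .7, .6, rep(0, 9), .7, .6, .5,
                  rep(0, 9), .7, .6, .4), ncol = 3)

g     <- as.vector(fload %*% gload)            # general factor loadings
group <- fload %*% diag(sqrt(1 - gload^2))     # residualized group loadings

# model implied correlation matrix with unit diagonal
Phi <- gload %*% t(gload)
diag(Phi) <- 1
R <- fload %*% Phi %*% t(fload)
diag(R) <- 1

Vx <- sum(R)                         # variance of the unit weighted total
h2 <- g^2 + rowSums(group^2)         # communalities
omega_h <- sum(g)^2 / Vx             # general factor saturation
omega_t <- (Vx - sum(1 - h2)) / Vx   # all common variance

round(cbind(g, group, h2), 2)
round(c(omega_h = omega_h, omega_t = omega_t), 2)
```

The results agree with the Omega Hierarchical (0.7) and Omega Total (0.82) values reported above.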

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST. (Cluster diagram titled "Hierarchical cluster analysis of bifact data": variables V1 to V9 joined into clusters C1 to C8.)

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)
$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00
$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method = minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the X set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the Y set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are 56 and the empirical chi square function was 6930.29
The degrees of freedom for the model are 7 and the empirical chi square function was 21.83
 with prob < 0.0027

The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.04
 with the empirical chi square 21.83 with prob < 0.0027
The total number of observations was 1000 with fitted Chi Square = 2175.06 with prob < 0

Empirical BIC = -26.53
ESABIC = -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.
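The probability and the empirical BIC in this output can be checked by hand from the reported chi square, degrees of freedom and sample size. The sketch below assumes that the BIC is computed as chi square minus df times log(N); that definition is an assumption here rather than something stated in the text, although it agrees with the printed value to rounding.

```r
chisq <- 21.83   # empirical chi square reported for the model
df    <- 7       # model degrees of freedom
n_obs <- 1000    # number of observations passed to esem

# Upper-tail probability of the chi square statistic.
p_value <- pchisq(chisq, df, lower.tail = FALSE)

# Assumed definition: BIC = chi square - df * log(N).
empirical_bic <- chisq - df * log(n_obs)

print(signif(p_value, 2))
print(round(empirical_bic, 2))
```

Small differences from the printed values are expected because the chi square used here has already been rounded to two decimals.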

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.
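The difference between the two models can be made concrete by counting free parameters and looking at the correlation pattern each one implies. The base R sketch below does both for four variables, with the factor variance fixed at 1 so that it is not a free parameter. The loading values and the helper names n_moments and implied_cor are hypothetical, chosen only for illustration.

```r
# Number of distinct variances and covariances among p observed variables.
n_moments <- function(p) p * (p + 1) / 2

p <- 4
# Tau equivalent: one common loading plus p error variances.
df_tau  <- n_moments(p) - (1 + p)
# Congeneric: p separate loadings plus p error variances.
df_cong <- n_moments(p) - (p + p)

# Implied correlation matrix of a one factor model with loadings lambda.
implied_cor <- function(lambda) {
  R <- tcrossprod(lambda)
  diag(R) <- 1
  R
}

# Hypothetical loadings, chosen only to show the two patterns.
R_tau  <- implied_cor(rep(0.7, p))
R_cong <- implied_cor(c(0.8, 0.7, 0.6, 0.5))

print(c(df_tau = df_tau, df_cong = df_cong))
print(R_tau)
print(R_cong)
```

Under tau equivalence every off-diagonal correlation is the same, while under the congeneric model the correlations fall off as the loadings decrease. The three extra loadings account for the three degrees of freedom that separate the models.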

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "a"       NA
[3,] "X1->V3"  "a"       NA
[4,] "X1->V4"  "a"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "b"       NA
[3,] "X1->V3"  "c"       NA
[4,] "X1->V4"  "d"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)  #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

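The anova comparison of the congeneric versus tau equivalent model shows that the change in χ² (3.65 for a change of 3 degrees of freedom, p = 0.30) is not significant in this sample of 100 cases. Thus, even though the tau equivalent hypothesis is known to be false for these data, it cannot be rejected here in favor of the congeneric model.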
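The likelihood ratio test reported by anova is simply the difference between the two model chi squares, referred to a chi square distribution with the difference in degrees of freedom. A minimal base R sketch, using the values printed in the two summaries above (the helper name lr_test is hypothetical):

```r
# Likelihood ratio test for two nested models.
# The restricted model has fewer free parameters and more degrees of freedom.
lr_test <- function(chisq_restricted, df_restricted, chisq_general, df_general) {
  lr_chisq <- chisq_restricted - chisq_general
  lr_df    <- df_restricted - df_general
  c(chisq = lr_chisq,
    df    = lr_df,
    p     = pchisq(lr_chisq, lr_df, lower.tail = FALSE))
}

# Tau equivalent (restricted) versus congeneric (general).
tau_vs_cong <- lr_test(6.593496, 5, 2.941678, 2)
print(round(tau_vs_cong, 4))
```

The same function applies to any pair of nested sem fits, provided the more restricted model is given first.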

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model can not be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
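The pattern of residuals can be reproduced from the population structure. An orthogonal three factor model implies zero correlation between variables from different groups, so every between-group residual equals the correlation that the higher order factor induces. The sketch below assumes the item loadings recovered in the output above and assumes g loadings of 0.9, 0.8 and 0.7 for the three group factors; those g loadings are inferred from the residuals and are not stated in this section.

```r
# Item loadings on the three group factors (taken from the estimates above).
fload <- matrix(0, nrow = 9, ncol = 3,
                dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
fload[1:3, 1] <- c(0.8, 0.7, 0.6)
fload[4:6, 2] <- c(0.7, 0.6, 0.5)
fload[7:9, 3] <- c(0.7, 0.6, 0.4)

# Assumed loadings of the group factors on the general factor.
gload <- c(0.9, 0.8, 0.7)

# Factor intercorrelations induced by the general factor.
Phi <- tcrossprod(gload)
diag(Phi) <- 1

# Population correlations under the hierarchical structure.
R_hier <- fload %*% Phi %*% t(fload)
diag(R_hier) <- 1

# Correlations implied by the orthogonal (varimax) three factor model.
R_orth <- tcrossprod(fload)
diag(R_orth) <- 1

# Residuals: what the orthogonal model fails to reproduce.
resid_orth <- R_hier - R_orth
print(round(resid_orth, 2))
```

Within each block of three variables the two matrices agree, so the residuals there are zero; the entire misfit lies in the between-group blocks, which is what allowing the factors to correlate is meant to repair.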

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)  #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)  #create the sem model
> mod.f3  #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
        Sentences        Vocabulary   Sent.Completion     First.Letters Four.Letter.Words
           0.8276            0.8302            0.7315            0.7472            0.6126
         Suffixes     Letter.Series         Pedigrees      Letter.Group
           0.4824            0.8503            0.4996            0.4483

 Parameter Estimates
                    Estimate  Std Error  z value    Pr(>|z|)
Sentences           0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary          0.7909248 0.06969232 11.3488087 7.518003e-30
Sent.Completion     0.7536211 0.07113218 10.5946585 3.154903e-26
First.Letters       0.6083814 0.07063841  8.6126138 7.141338e-18
Four.Letter.Words   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes            0.5717903 0.07157752  7.9884057 1.366950e-15
Letter.Series       0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees           0.6623314 0.07003035  9.4577757 3.145633e-21
Letter.Group        0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences         0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary        0.4523234 0.08353995  5.4144562 6.147524e-08
F1Sent.Completion   0.4044507 0.08727334  4.6342988 3.581494e-06
F2First.Letters     0.6140531 0.08471145  7.2487623 4.205973e-13
F2Four.Letter.Words 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes          0.3943208 0.07805383  5.0519075 4.374195e-07
F3Letter.Series     0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees         0.2468417 0.08677536  2.8446053 4.446649e-03
F3Letter.Group      0.4091495 0.11352380  3.6040854 3.132541e-04
e1                  0.1723633 0.03405646  5.0611045 4.168346e-07

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

e2                  0.1698419 0.03001233  5.6590697 1.521958e-08
e3                  0.2684749 0.03316228  8.0957909 5.689350e-16
e4                  0.2528108 0.07942835  3.1828791 1.458185e-03
e5                  0.3873510 0.06317399  6.1314949 8.705712e-10
e6                  0.5175679 0.05955079  8.6912013 3.586269e-18
e7                  0.1496709 0.21861502  0.6846325 4.935759e-01
e8                  0.5003855 0.05956551  8.4005902 4.442400e-17
e9                  0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences           Sentences <--- g
Vocabulary          Vocabulary <--- g
Sent.Completion     Sent.Completion <--- g
First.Letters       First.Letters <--- g
Four.Letter.Words   Four.Letter.Words <--- g
Suffixes            Suffixes <--- g
Letter.Series       Letter.Series <--- g
Pedigrees           Pedigrees <--- g
Letter.Group        Letter.Group <--- g
F1Sentences         Sentences <--- F1
F1Vocabulary        Vocabulary <--- F1
F1Sent.Completion   Sent.Completion <--- F1
F2First.Letters     First.Letters <--- F2
F2Four.Letter.Words Four.Letter.Words <--- F2
F2Suffixes          Suffixes <--- F2
F3Letter.Series     Letter.Series <--- F3
F3Pedigrees         Pedigrees <--- F3
F3Letter.Group      Letter.Group <--- F3
e1                  Sentences <--> Sentences
e2                  Vocabulary <--> Vocabulary
e3                  Sent.Completion <--> Sent.Completion
e4                  First.Letters <--> First.Letters
e5                  Four.Letter.Words <--> Four.Letter.Words
e6                  Suffixes <--> Suffixes
e7                  Letter.Series <--> Letter.Series
e8                  Pedigrees <--> Pedigrees
e9                  Letter.Group <--> Letter.Group

 Iterations =  72

> stdCoef(sem.bi,digits=2)

                       Std. Estimate
1  Sentences           0.7678671 Sentences <--- g
2  Vocabulary          0.7909246 Vocabulary <--- g
3  Sent.Completion     0.7536211 Sent.Completion <--- g
4  First.Letters       0.6083814 First.Letters <--- g
5  Four.Letter.Words   0.5973349 Four.Letter.Words <--- g
6  Suffixes            0.5717900 Suffixes <--- g
7  Letter.Series       0.5668950 Letter.Series <--- g
8  Pedigrees           0.6623317 Pedigrees <--- g
9  Letter.Group        0.5299523 Letter.Group <--- g
10 F1Sentences         0.4878697 Sentences <--- F1
11 F1Vocabulary        0.4523233 Vocabulary <--- F1
12 F1Sent.Completion   0.4044507 Sent.Completion <--- F1
13 F2First.Letters     0.6140531 First.Letters <--- F2
14 F2Four.Letter.Words 0.5058063 Four.Letter.Words <--- F2
15 F2Suffixes          0.3943206 Suffixes <--- F2
16 F3Letter.Series     0.7272957 Letter.Series <--- F3
17 F3Pedigrees         0.2468418 Pedigrees <--- F3
18 F3Letter.Group      0.4091494 Letter.Group <--- F3
19 e1                  0.1723633 Sentences <--> Sentences
20 e2                  0.1698418 Vocabulary <--> Vocabulary
21 e3                  0.2684748 Sent.Completion <--> Sent.Completion
22 e4                  0.2528108 First.Letters <--> First.Letters
23 e5                  0.3873510 Four.Letter.Words <--> Four.Letter.Words
24 e6                  0.5175675 Suffixes <--> Suffixes
25 e7                  0.1496710 Letter.Series <--> Letter.Series
26 e8                  0.5003859 Pedigrees <--> Pedigrees
27 e9                  0.5517473 Letter.Group <--> Letter.Group
28                     1.0000000 F1 <--> F1
29                     1.0000000 F2 <--> F2
30                     1.0000000 F3 <--> F3
31                     1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
               F1                F2                F3         Sentences        Vocabulary
           0.6758            0.6112            0.6642            0.8185            0.8351
  Sent.Completion     First.Letters Four.Letter.Words          Suffixes     Letter.Series
           0.7329            0.6985            0.6355            0.4936            0.6097
        Pedigrees      Letter.Group
           0.5186            0.4949

 Parameter Estimates
                    Estimate  Std Error  z value  Pr(>|z|)
gF1                 1.4438115 0.25653564 5.628113 1.821922e-08
gF2                 1.2538296 0.21136562 5.932041 2.991910e-09
gF3                 1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences         0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary        0.5203104 0.06338431 8.208820 2.233734e-16
F1Sent.Completion   0.4874316 0.06081528 8.014954 1.101786e-15
F2First.Letters     0.5211221 0.06106205 8.534304 1.410015e-17
F2Four.Letter.Words 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes          0.4380644 0.05595794 7.828458 4.938915e-15
F3Letter.Series     0.4524352 0.06596903 6.858297 6.968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees         0.4172903 0.06215816 6.713363 1.901887e-11
F3Letter.Group      0.4076312 0.06131399 6.648258 2.965820e-11
e1                  0.1814979 0.02847741 6.373397 1.848862e-10
e2                  0.1649304 0.02776938 5.939292 2.862558e-09
e3                  0.2671331 0.03336340 8.006771 1.177597e-15
e4                  0.3015024 0.05102191 5.909274 3.436179e-09
e5                  0.3645010 0.05263547 6.925008 4.359513e-12
e6                  0.5064150 0.05962608 8.493180 2.010593e-17
e7                  0.3903313 0.05933649 6.578268 4.759607e-11
e8                  0.4813697 0.06224844 7.733041 1.050075e-14
e9                  0.5051017 0.06332869 7.975875 1.513055e-15

gF1                 F1 <--- g
gF2                 F2 <--- g
gF3                 F3 <--- g
F1Sentences         Sentences <--- F1
F1Vocabulary        Vocabulary <--- F1
F1Sent.Completion   Sent.Completion <--- F1
F2First.Letters     First.Letters <--- F2
F2Four.Letter.Words Four.Letter.Words <--- F2
F2Suffixes          Suffixes <--- F2
F3Letter.Series     Letter.Series <--- F3
F3Pedigrees         Pedigrees <--- F3
F3Letter.Group      Letter.Group <--- F3
e1                  Sentences <--> Sentences
e2                  Vocabulary <--> Vocabulary
e3                  Sent.Completion <--> Sent.Completion
e4                  First.Letters <--> First.Letters
e5                  Four.Letter.Words <--> Four.Letter.Words
e6                  Suffixes <--> Suffixes
e7                  Letter.Series <--> Letter.Series
e8                  Pedigrees <--> Pedigrees
e9                  Letter.Group <--> Letter.Group

 Iterations =  54

> stdCoef(sem.hi,digits=2)

                       Std. Estimate
1  gF1                 0.8220754 F1 <--- g
2  gF2                 0.7817998 F2 <--- g
3  gF3                 0.8150140 F3 <--- g
4  F1Sentences         0.9047111 Sentences <--- F1
5  F1Vocabulary        0.9138214 Vocabulary <--- F1
6  F1Sent.Completion   0.8560764 Sent.Completion <--- F1
7  F2First.Letters     0.8357617 First.Letters <--- F2
8  F2Four.Letter.Words 0.7971819 Four.Letter.Words <--- F2
9  F2Suffixes          0.7025560 Suffixes <--- F2
10 F3Letter.Series     0.7808129 Letter.Series <--- F3
11 F3Pedigrees         0.7201599 Pedigrees <--- F3
12 F3Letter.Group      0.7034902 Letter.Group <--- F3
13 e1                  0.1814979 Sentences <--> Sentences
14 e2                  0.1649304 Vocabulary <--> Vocabulary
15 e3                  0.2671331 Sent.Completion <--> Sent.Completion
16 e4                  0.3015024 First.Letters <--> First.Letters
17 e5                  0.3645010 Four.Letter.Words <--> Four.Letter.Words
18 e6                  0.5064151 Suffixes <--> Suffixes
19 e7                  0.3903313 Letter.Series <--> Letter.Series
20 e8                  0.4813697 Pedigrees <--> Pedigrees
21 e9                  0.5051016 Letter.Group <--> Letter.Group
22                     0.3241920 F1 <--> F1
23                     0.3887891 F2 <--> F2
24                     0.3357521 F3 <--> F3
25                     1.0000000 g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
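The hierarchical and bifactor solutions are related through the Schmid-Leiman transformation (Schmid and Leiman, 1957): an item's loading on g is its loading on its group factor times that factor's loading on g, and its residualized group loading is its loading times the square root of the factor's unexplained variance. The base R sketch below applies this to the standardized coefficients of the hierarchical sem solution reported above. The variable names are illustrative, and the resulting loadings are those implied by the hierarchical model, not the freely estimated bifactor loadings.

```r
# Standardized loadings of the group factors on g (hierarchical sem solution).
g_on_factor <- c(F1 = 0.8220754, F2 = 0.7817998, F3 = 0.8150140)

# Standardized loadings of the items on their group factors.
item_on_factor <- c(Sentences = 0.9047111, Vocabulary = 0.9138214,
                    Sent.Completion = 0.8560764, First.Letters = 0.8357617,
                    Four.Letter.Words = 0.7971819, Suffixes = 0.7025560,
                    Letter.Series = 0.7808129, Pedigrees = 0.7201599,
                    Letter.Group = 0.7034902)
factor_of_item <- rep(c("F1", "F2", "F3"), each = 3)

# Schmid-Leiman: general loading and residualized group loading per item.
g_loading     <- item_on_factor * g_on_factor[factor_of_item]
group_loading <- item_on_factor * sqrt(1 - g_on_factor[factor_of_item]^2)

print(round(cbind(g = g_loading, group = group_loading,
                  h2 = g_loading^2 + group_loading^2), 2))
```

Because the hierarchical model forces the ratio of general to group loading to be constant within each group, it is a constrained version of the bifactor model, which is why the two can be compared with a likelihood ratio test.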

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                     g   F1   F2   F3   h2   u2   p2
Sentences         0.71 0.56           0.82 0.18 0.61
Vocabulary        0.73 0.55           0.84 0.16 0.63
Sent.Completion   0.68 0.52           0.74 0.26 0.63
First.Letters     0.65      0.56      0.73 0.27 0.57
Four.Letter.Words 0.62      0.49      0.63 0.37 0.61
Suffixes          0.56      0.41      0.50 0.50 0.63
Letter.Series     0.59           0.62 0.73 0.27 0.48
Pedigrees         0.58 0.24      0.34 0.51 0.49 0.66
Letter.Group      0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                     g   F1   F2   F3   h2   u2   p2
Sentences         0.77 0.49           0.82 0.18 0.72
Vocabulary        0.79 0.45           0.83 0.17 0.75
Sent.Completion   0.75 0.40           0.73 0.27 0.77
First.Letters     0.61      0.61      0.74 0.26 0.50
Four.Letter.Words 0.60      0.50      0.61 0.39 0.59
Suffixes          0.57      0.39      0.48 0.52 0.68
Letter.Series     0.57           0.73 0.85 0.15 0.38
Pedigrees         0.66           0.25 0.50 0.50 0.87
Letter.Group      0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
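The omega coefficients can be approximated directly from a table of bifactor loadings: ωh is the squared sum of the general factor loadings divided by the variance of the total score, and ω total is one minus the summed uniquenesses divided by that same variance. The sketch below uses the model-implied total variance and the two-decimal CFA loadings printed above, so it is only an approximation of what omegaSem computes; the function name omega_from_bifactor is hypothetical.

```r
# Approximate omega coefficients from a bifactor loading table.
# g: general factor loadings; group: items x group-factor loading matrix;
# u2: uniquenesses. The total variance is the model-implied variance of the
# unit-weighted sum of the standardized items.
omega_from_bifactor <- function(g, group, u2) {
  general_var <- sum(g)^2
  group_var   <- sum(colSums(group)^2)
  total_var   <- general_var + group_var + sum(u2)
  c(omega_h     = general_var / total_var,
    omega_total = 1 - sum(u2) / total_var)
}

# Rounded loadings from the confirmatory solution reported above.
g <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
group <- matrix(0, nrow = 9, ncol = 3)
group[1:3, 1] <- c(0.49, 0.45, 0.40)
group[4:6, 2] <- c(0.61, 0.50, 0.39)
group[7:9, 3] <- c(0.73, 0.25, 0.41)
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)

omega_cfa <- omega_from_bifactor(g, group, u2)
print(round(omega_cfa, 2))
```

With these rounded inputs the result agrees with the reported confirmatory values of about 0.79 and 0.93. The same function applied to the exploratory loadings gives a somewhat lower ωh, in line with the point made above.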

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.


> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


2.2 sim.hierarchical

The previous function, sim.congeneric, is used when one factor accounts for the pattern of correlations. A slightly more complicated model is when one broad factor and several narrower factors are observed. An example of this structure might be the structure of mental abilities, where there is a broad factor of general ability and several narrower factors (e.g., spatial ability, verbal ability, working memory capacity). Another example is in the measure of psychopathology, where a broad general factor of neuroticism is seen along with more specific anxiety, depression, and aggression factors. This kind of structure may be simulated with sim.hierarchical, specifying the loadings of each sub factor on a general factor (the g-loadings) as well as the loadings of individual items on the lower order factors (the f-loadings). An early paper describing a bifactor structure was by Holzinger and Swineford (1937). A helpful description of what makes a good general factor is that of Jensen and Weng (1994).

For those who prefer real data to simulated data, six data sets are included in the bifactor data set. One is the original 14 variable problem of Holzinger and Swineford (1937) (holzinger), a second is a nine variable problem adapted by Bechtoldt (1961) from Thurstone and Thurstone (1941) (the data set is used as an example in the SAS manual and discussed in great detail by McDonald (1999)), a third is from a recent paper by Reise et al. (2007) with 16 measures of patient reports of interactions with their health care provider.

> set.seed(42)
> gload=matrix(c(.9,.8,.7),nrow=3)
> fload <- matrix(c(.8,.7,.6,rep(0,9),.7,.6,.5,
+ rep(0,9),.7,.6,.4), ncol=3)
> fload #echo it to see the structure
      [,1] [,2] [,3]
 [1,]  0.8  0.0  0.0
 [2,]  0.7  0.0  0.0
 [3,]  0.6  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.7
 [8,]  0.0  0.0  0.6
 [9,]  0.0  0.0  0.4
> bifact <- sim.hierarchical(gload=gload,fload=fload)
> round(bifact,2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 1.00 0.56 0.48 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.56 1.00 0.42 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.48 0.42 1.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 1.00 0.42 0.35 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.42 1.00 0.30 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.35 0.30 1.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 1.00 0.42 0.28
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.42 1.00 0.24
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.28 0.24 1.00

These data can be represented as either a bifactor (Figure 2, panel A) or hierarchical (Figure 2, Panel B) factor solution. The analysis was done with the omega function.
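The correlations in this example can be reproduced from the two sets of loadings with a few lines of base R. This is a minimal sketch of the arithmetic, not the source of sim.hierarchical: it assumes that the g-loadings imply the correlations among the lower order factors and that each item loads on a single lower order factor. The names fcor and R are introduced here for illustration only.

```r
# Loadings copied from the example (g-loadings and f-loadings)
gload <- c(.9, .8, .7)
fload <- matrix(c(.8, .7, .6, rep(0, 9), .7, .6, .5,
                  rep(0, 9), .7, .6, .4), ncol = 3)

# Correlations among the three lower order factors implied by g
fcor <- gload %*% t(gload)
diag(fcor) <- 1

# Model-implied correlations among the nine variables
R <- fload %*% fcor %*% t(fload)
diag(R) <- 1
dimnames(R) <- list(paste0("V", 1:9), paste0("V", 1:9))
round(R, 2)
```

Within a group factor the correlation is the product of the two f-loadings (V1 with V2: 0.8 x 0.7 = 0.56); across group factors it is also multiplied by the two g-loadings (V1 with V4: 0.8 x 0.9 x 0.8 x 0.7, about 0.40).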

2.3 sim.item and sim.circ

Many personality questionnaires are thought to represent multiple, independent factors. A particularly interesting case is when there are two factors and the items either have simple structure or circumplex structure. Examples of such items with a circumplex structure are measures of emotion (Rafaeli and Revelle, 2006) where many different emotion terms can be arranged in a two dimensional space, but where there is no obvious clustering of items. Typical personality scales are constructed to have simple structure, where items load on one and only one factor.

An additional challenge to measurement with emotion or personality items is that the items can be highly skewed and are assessed with a small number of discrete categories (do not agree, somewhat agree, strongly agree).

The more general sim.item function, and the more specific, sim.circ functions simulate items with a two dimensional structure, with or without skew, and varying the number of categories for the items. An example of a circumplex structure is shown in Figure 3.

2.4 sim.structure

A more general case is to consider three matrices, fx, fy, and φxy, which describe, in turn, a measurement model of x variables, fx, a measurement model of y variables, fy, and a covariance matrix between and within the two sets of factors. If fx is a vector and fy and phixy are NULL, then this is just the congeneric model. If fx is a matrix of loadings with n rows and c columns, then this is a measurement model for n variables across c factors. If phixy is not null, but fy is NULL, then the factors in fx are correlated. Finally, if all three matrices are not NULL, then the data show the standard linear structural relations (LISREL) structure.

Consider the following examples:

2.4.1 fx is a vector implies a congeneric model

> set.seed(42)
> fx <- c(.9,.8,.7,.6)
> cong1 <- sim.structure(fx)
> cong1
Call: sim.structure(fx = fx)

$model (Population correlation matrix)

> op <- par(mfrow=c(1,2))
> mbi <- omega(bifact,title="A bifactor model")
> mhi <- omega(bifact,sl=FALSE,title="A hierarchical model")
> op <- par(mfrow = c(1,1))

[Path diagrams: "A bifactor model" (left) and "A hierarchical model" (right), each relating g and F1, F2, F3 to V1 ... V9]

Figure 2: (Left panel) A bifactor solution represents each test in terms of a general factor and a residualized group factor. (Right Panel) A hierarchical factor solution has g as a second order factor accounting for the correlations between the first order factors.

> circ <- sim.circ(16)
> f2 <- fa(circ,2)
> plot(f2,title="16 simulated variables in a circumplex pattern")

[Plot: "16 simulated variables in a circumplex pattern", axes MR1 and MR2, points labeled 1 ... 16]

Figure 3: Emotion items or interpersonal items frequently show a circumplex structure. Data generated by sim.circ and factor loadings found by the principal axis algorithm using factor.pa.

     V1   V2   V3   V4
V1 1.00 0.72 0.63 0.54
V2 0.72 1.00 0.56 0.48
V3 0.63 0.56 1.00 0.42
V4 0.54 0.48 0.42 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.36

2.4.2 fx is a matrix implies an independent factors model:

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> threefact <- sim.structure(fx)
> threefact
Call: sim.structure(fx = fx)

$model (Population correlation matrix)
     V1   V2   V3   V4   V5   V6   V7  V8   V9
V1 1.00 0.72 0.63 0.00 0.00 0.00 0.00 0.0 0.00
V2 0.72 1.00 0.56 0.00 0.00 0.00 0.00 0.0 0.00
V3 0.63 0.56 1.00 0.00 0.00 0.00 0.00 0.0 0.00
V4 0.00 0.00 0.00 1.00 0.42 0.35 0.00 0.0 0.00
V5 0.00 0.00 0.00 0.42 1.00 0.30 0.00 0.0 0.00
V6 0.00 0.00 0.00 0.35 0.30 1.00 0.00 0.0 0.00
V7 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.3 0.24
V8 0.00 0.00 0.00 0.00 0.00 0.00 0.30 1.0 0.20
V9 0.00 0.00 0.00 0.00 0.00 0.00 0.24 0.2 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model

> Phi = matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol=3)
> corf3 <- sim.structure(fx,Phi)
> fx
      [,1] [,2] [,3]
 [1,]  0.9  0.0  0.0
 [2,]  0.8  0.0  0.0
 [3,]  0.7  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.6
 [8,]  0.0  0.0  0.5
 [9,]  0.0  0.0  0.4
> Phi
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.3
[2,]  0.5  1.0  0.2
[3,]  0.3  0.2  1.0
> corf3

[Path diagram: "Structural model", factors X1, X2, X3 with loadings on x1 ... x9]

Figure 4: Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram.

Call: sim.structure(fx = fx, Phi = Phi)

$model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16
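The population matrix printed here follows from the loadings and the factor intercorrelations. The base R sketch below shows that arithmetic under the assumption that the model is simply fx Phi fx' with the diagonal reset to 1; it is an illustration, not the code of sim.structure, and the names model and reliability are chosen here to mirror the printed output.

```r
# Loadings and factor intercorrelations copied from the example
fx  <- matrix(c(.9, .8, .7, rep(0, 9), .7, .6, .5,
                rep(0, 9), .6, .5, .4), ncol = 3)
Phi <- matrix(c(1, .5, .3, .5, 1, .2, .3, .2, 1), ncol = 3)

# Common-factor part of the correlation matrix
model <- fx %*% Phi %*% t(fx)

# With one loading per variable the diagonal holds the squared
# loadings, which the output labels as the population reliability
reliability <- diag(model)

# A correlation matrix has a unit diagonal
diag(model) <- 1
round(model, 3)
round(reliability, 2)
```

For example, V1 and V4 load 0.9 and 0.7 on factors that correlate 0.5, so their correlation is 0.9 x 0.5 x 0.7 = 0.315. Setting Phi to the identity matrix gives the independent factors model of the previous section.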

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs #show the matrix
      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"
> Phis #show this one as well
   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.

> plot.new()
> corf3mod <- structure.diagram(fxs,Phis)

[Path diagram: "Structural model", factors F1, F2, F3 with symbolic loadings a1 ... c9 on x1 ... x9 and factor correlations rab, rac, rbc]

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input.

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3p <- Promax(fa(corf3$model,3))
> plot.new()
> modf3p <- structure.diagram(f3p,cut=.2)

[Path diagram: "Structural model", factors MR1, MR2, MR3 with loadings on V1 ... V9]

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y set.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <-rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)
     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

[Path diagram: "Structural model", factors X1, X2, X3 with loadings a1 ... c9 on x1 ... x9, latent Y with loadings Ya1, Ya2, Ya3 on y1, y2, y3, and paths rad, rbd, rcd]

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> himod <-structure.diagram(fxh,Phi, fy)

[Path diagram: "Structural model", group factors X1, X2, X3 with loadings a1 ... c9 on x1 ... x9, general factor g with paths a, b, c, and latent Y with loadings a1, a2, a3 on y1, y2, y3 and path r]

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)
Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   MR1
Correlation of (regression) scores with factors   0.94
Multiple R square of scores with factors          0.88
Minimum correlation of possible factor scores     0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ2 value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3
Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38

> ml3
Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
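The congruence coefficient in this table is the cosine of the angle between two columns of loadings: the sum of cross products divided by the square root of the product of the two sums of squares, with no centering. A minimal base R sketch follows. The helper congruence and the loading matrices f1 and f2 are hypothetical names and values introduced here for illustration; this is not the source of factor.congruence.

```r
# Cosine of the angle between each column of x and each column of y
congruence <- function(x, y = x) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  crossprod(x, y) / sqrt(outer(colSums(x^2), colSums(y^2)))
}

# Two hypothetical loading matrices with the same pattern but different size
f1 <- matrix(c(.8, .7, .6, 0, 0, 0,
               0, 0, 0, .7, .6, .5), ncol = 2)
f2 <- f1 * 0.5

congruence(f1, f2)
```

Because the coefficient depends only on the direction of the loading vectors, halving every loading leaves the congruence of matching factors at 1, while factors with no overlapping loadings have a congruence of 0.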

By default, all three of these procedures use the varimax rotation criterion. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data set include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> ombi
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> ombi <- omega(bifact)

[Path diagram: "Omega", general factor g and group factors F1, F2, F3 each loading directly on V1 ... V9]

Figure 9: An exploratory bifactor solution to the nine variable problem.

> omhi <- omega(bifact,sl=FALSE)

[Path diagram: "Omega", g as a second order factor above F1, F2, F3, which load on V1 ... V9]

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0

The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23

The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
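Because the bifact matrix was generated from known loadings, the two headline coefficients can be checked by hand. The base R sketch below assumes the usual definitions: omega hierarchical is the squared sum of the general factor loadings divided by the variance of the total score, and omega total is one minus the summed uniquenesses divided by that same variance. The variable names (g, u2, Vt, omega_h, omega_t) are introduced here for illustration; the population loadings are used in place of the estimates that omega finds.

```r
# Population loadings used to create bifact
gload <- c(.9, .8, .7)
fload <- matrix(c(.8, .7, .6, rep(0, 9), .7, .6, .5,
                  rep(0, 9), .7, .6, .4), ncol = 3)

# Rebuild the population correlation matrix
fcor <- gload %*% t(gload)
diag(fcor) <- 1
R <- fload %*% fcor %*% t(fload)
diag(R) <- 1

# Schmid-Leiman style quantities
g  <- as.vector(fload %*% gload)   # general factor loadings (0.72, 0.63, ...)
h2 <- rowSums(fload^2)             # communalities
u2 <- 1 - h2                       # uniquenesses

Vt <- sum(R)                       # variance of the unit-weighted total score
omega_h <- sum(g)^2 / Vt           # general factor saturation
omega_t <- 1 - sum(u2) / Vt        # total reliable variance
round(c(omega_h = omega_h, omega_t = omega_t), 2)
```

Under these assumptions the two values round to 0.70 and 0.82, in line with the Omega Hierarchical and Omega Total entries printed above.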

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

[Cluster diagram: "Hierarchical cluster analysis of bifact data", clusters C1 ... C8 formed from V1 ... V9, each labeled with its α]

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gregpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

[Path diagram: "Exploratory Structural Model", factors X1 and X2 with loadings on V, Q, A, nach, Anx, factor Y1 with loadings on gpa, Pre, MA, and paths from X1 and X2 to Y1]

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function, which requires Rgraphviz, or structure.diagram or the structure.sem functions, which do not, may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> modtau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> modtau #show it
     Path       Parameter Value
[1,] "X1->V1"   "a"       NA
[2,] "X1->V2"   "a"       NA
[3,] "X1->V3"   "a"       NA
[4,] "X1->V4"   "a"       NA
[5,] "V1<->V1"  "x1e"     NA
[6,] "V2<->V2"  "x2e"     NA
[7,] "V3<->V3"  "x3e"     NA
[8,] "V4<->V4"  "x4e"     NA
[9,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"
> semtau <- sem(modtau,cong,100)
> summary(semtau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> modcong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> modcong #show the model
     Path       Parameter Value
[1,] "X1->V1"   "a"       NA
[2,] "X1->V2"   "b"       NA
[3,] "X1->V3"   "c"       NA
[4,] "X1->V4"   "d"       NA
[5,] "V1<->V1"  "x1e"     NA
[6,] "V2<->V2"  "x2e"     NA
[7,] "V3<->V3"  "x3e"     NA
[8,] "V4<->V4"  "x4e"     NA
[9,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"
> semcong <- sem(modcong,cong,100)
> summary(semcong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12
> anova(semcong,semtau) #test the difference between the two models
LR Test for Difference Between Models

        Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semcong        2      2.9417
semtau         5      6.5935  3   3.6518     0.3016

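The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65 with 3 degrees of freedom, p = 0.30) is not significant. Even though the tau equivalent hypothesis is false for these data, with only 100 subjects it cannot be rejected in favor of the congeneric model.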
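The likelihood ratio test reported by anova can be checked by hand. The following is a minimal sketch in base R that uses only the chi square and degrees of freedom values printed in the two summaries above; the variable names are illustrative.

```r
# Likelihood ratio (chi-square difference) test for two nested sem fits.
# The chi-square and df values are copied from the summaries printed above.
chisq_tau  <- 6.593496; df_tau  <- 5   # tau equivalent model (more restricted)
chisq_cong <- 2.941678; df_cong <- 2   # congeneric model (less restricted)

lr_chisq <- chisq_tau - chisq_cong     # change in chi-square
lr_df    <- df_tau - df_cong           # change in degrees of freedom
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

round(c(LR.Chisq = lr_chisq, Df = lr_df, p = p_value), 4)
```

The resulting p value agrees with the Pr(>Chisq) column of the anova table, which is well above conventional significance levels.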

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model can not be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
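The AIC and BIC values in the summary above can be recovered from the model chi square. The sketch below assumes the formulas AIC = χ2 + 2t (t free parameters) and BIC = χ2 - df log(N); these are inferred from the printed values, which they reproduce, and are not quoted from the sem documentation.

```r
# Reproduce the AIC and BIC printed by summary(sem.one) from the chi-square.
# Assumed formulas (they match the printed values, but are not taken from
# the sem documentation): AIC = chisq + 2 * t, BIC = chisq - df * log(N).
chisq  <- 21.16848   # Model Chisquare for the one factor model
df     <- 27         # model degrees of freedom
n_obs  <- 100        # number of subjects passed to sem()
n_vars <- 9          # observed variables V1 ... V9

# 45 observed variances and covariances minus 27 df leaves 18 free parameters
n_free <- n_vars * (n_vars + 1) / 2 - df

aic <- chisq + 2 * n_free
bic <- chisq - df * log(n_obs)
round(c(AIC = aic, BIC = bic), 4)
```

A strongly negative BIC here reflects the large number of degrees of freedom relative to the small chi square, not the quality of the residuals, which is why the residual matrix is still worth inspecting.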

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
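The pattern in the residual matrix follows directly from the algebra of an orthogonal model. The sketch below rebuilds a hierarchical population correlation matrix in base R and subtracts what three uncorrelated factors can reproduce. The first order loadings are those recovered in the sem output above; the second order loadings of .9, .8 and .7 are an assumption made for this illustration (they reproduce the printed residuals), and the object names are hypothetical.

```r
# Population correlations for a hierarchical structure like bifact, rebuilt
# from assumed loadings: first-order loadings as recovered in the sem output
# above, and hypothetical second-order (g) loadings of .9, .8 and .7.
lambda <- matrix(0, nrow = 9, ncol = 3,
                 dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
lambda[1:3, 1] <- c(.8, .7, .6)
lambda[4:6, 2] <- c(.7, .6, .5)
lambda[7:9, 3] <- c(.7, .6, .4)
g_load <- c(.9, .8, .7)

phi <- tcrossprod(g_load)          # factor correlations implied by g
diag(phi) <- 1
R <- lambda %*% phi %*% t(lambda)  # hierarchical population correlations
diag(R) <- 1

# An orthogonal three factor model can only imply lambda %*% t(lambda),
# which is zero for every pair of variables from different factors.
implied <- tcrossprod(lambda)
diag(implied) <- 1
resid_orth <- R - implied
round(resid_orth, 2)
```

Within each block the residuals vanish, while every between-block residual equals the full population correlation, because the orthogonal model has no path through which variables on different factors can correlate.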

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3) #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3) #create the sem model
> mod.f3 #show it
      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

[Figure: path diagram titled "Structural model". V1-V3 load on MR1 (0.8, 0.7, 0.6), V4-V6 on MR3 (0.7, 0.6, 0.5), and V7-V9 on MR2 (0.7, 0.6, 0.4); the three factors are correlated.]

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213) #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27 Sentences <--- g
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30 Vocabulary <--- g
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26 SentCompletion <--- g
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18 FirstLetters <--- g
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17 FourLetterWords <--- g
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15 Suffixes <--- g
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15 LetterSeries <--- g
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21 Pedigrees <--- g
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13 LetterGroup <--- g
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09 Sentences <--- F1
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08 Vocabulary <--- F1
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06 SentCompletion <--- F1
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13 FirstLetters <--- F2
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10 FourLetterWords <--- F2
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07 Suffixes <--- F2
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06 LetterSeries <--- F3
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03 Pedigrees <--- F3
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04 LetterGroup <--- F3
e1                0.1723633 0.03405646  5.0611045 4.168346e-07 Sentences <--> Sentences
e2                0.1698419 0.03001233  5.6590697 1.521958e-08 Vocabulary <--> Vocabulary
e3                0.2684749 0.03316228  8.0957909 5.689350e-16 SentCompletion <--> SentCompletion
e4                0.2528108 0.07942835  3.1828791 1.458185e-03 FirstLetters <--> FirstLetters
e5                0.3873510 0.06317399  6.1314949 8.705712e-10 FourLetterWords <--> FourLetterWords
e6                0.5175679 0.05955079  8.6912013 3.586269e-18 Suffixes <--> Suffixes
e7                0.1496709 0.21861502  0.6846325 4.935759e-01 LetterSeries <--> LetterSeries
e8                0.5003855 0.05956551  8.4005902 4.442400e-17 Pedigrees <--> Pedigrees
e9                0.5517474 0.08455914  6.5249884 6.800680e-11 LetterGroup <--> LetterGroup

 Iterations =  72

[Figure: omega path diagram titled "Omega". The nine Thurstone variables load on g (0.54 to 0.73) and on the group factors F1, F2, and F3.]

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability, the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)
                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

 Iterations =  54

[Figure: omega path diagram titled "Omega". Hierarchical solution with g above F1, F2, and F3, which in turn load on the nine Thurstone variables.]

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi,digits=2)
                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)
LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)
LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
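The degrees of freedom in these comparisons can be counted by hand, which makes it clear where the extra parameters of the bifactor model come from. The sketch below is an illustration in base R under stated assumptions: standardized latent variables, one group factor loading per variable, and no cross loadings. The helper model_df is hypothetical and is not part of psych or sem.

```r
# Degrees of freedom for the two confirmatory models, counted by hand.
# Assumes standardized latent variables, p observed variables, k group
# factors, and one group-factor loading per variable (no cross loadings).
model_df <- function(p, k) {
  moments      <- p * (p + 1) / 2   # unique variances and covariances
  bifactor     <- 3 * p             # p g loadings + p group loadings + p errors
  hierarchical <- 2 * p + k         # k g->factor paths + p loadings + p errors
  c(bifactor = moments - bifactor, hierarchical = moments - hierarchical)
}

model_df(p = 9,  k = 3)   # Thurstone
model_df(p = 14, k = 4)   # Holzinger

# Chi-square difference test for the Thurstone comparison printed above
p_thurstone <- pchisq(38.1963 - 24.2163, df = 24 - 18, lower.tail = FALSE)
round(p_thurstone, 5)
```

In both data sets the hierarchical model replaces p direct loadings on g with k paths from g to the group factors, so the difference in degrees of freedom is p - k: 6 for Thurstone and 10 for Holzinger.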

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem combines both an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
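The confirmatory ωh and ωt values can be approximated from the printed loadings. The sketch below is an illustration in base R, not the computation omegaSem performs internally: it uses the rounded confirmatory loadings and uniquenesses reported above and takes the total test variance from the model itself (an assumption) rather than from the observed correlation matrix.

```r
# Omega hierarchical and omega total from a bifactor loading matrix.
# Loadings and uniquenesses (u2) are the rounded confirmatory values printed
# above; the total variance is taken from the model (an assumption of this
# sketch), not from the observed correlation matrix.
g   <- c(.77, .79, .75, .61, .60, .57, .57, .66, .53)
grp <- list(F1 = c(.49, .45, .40),   # Sentences, Vocabulary, SentCompletion
            F2 = c(.61, .50, .39),   # FirstLetters, FourLetterWords, Suffixes
            F3 = c(.73, .25, .41))   # LetterSeries, Pedigrees, LetterGroup
u2  <- c(.18, .17, .27, .26, .39, .52, .15, .50, .55)

general_var <- sum(g)^2                                  # variance due to g
group_var   <- sum(sapply(grp, function(x) sum(x)^2))    # variance due to groups
total_var   <- general_var + group_var + sum(u2)         # model-implied total

omega_h <- general_var / total_var
omega_t <- (general_var + group_var) / total_var
round(c(omega_h = omega_h, omega_t = omega_t), 2)
```

Because each variable loads on g and on exactly one group factor, every covariance between variables in different groups is carried by g alone, which is consistent with the larger ωh of the confirmatory solution.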

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application. The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

[Figure: path diagram titled "Omega from SEM". The nine Thurstone variables load on g and on the group factors F1, F2, and F3.]

Figure 16: Confirmatory Omega structure using omegaSem.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


V8 0.30 0.26 0.23 0.24 0.20 0.17 0.42 1.00 0.24
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.28 0.24 1.00

These data can be represented as either a bifactor (Figure 2 panel A) or hierarchical (Figure 2 Panel B) factor solution. The analysis was done with the omega function.

2.3 sim.item and sim.circ

Many personality questionnaires are thought to represent multiple, independent factors. A particularly interesting case is when there are two factors and the items either have simple structure or circumplex structure. Examples of such items with a circumplex structure are measures of emotion (Rafaeli and Revelle, 2006) where many different emotion terms can be arranged in a two dimensional space, but where there is no obvious clustering of items. Typical personality scales are constructed to have simple structure, where items load on one and only one factor.

An additional challenge to measurement with emotion or personality items is that the items can be highly skewed and are assessed with a small number of discrete categories (do not agree, somewhat agree, strongly agree).

The more general sim.item function, and the more specific sim.circ function, simulate items with a two dimensional structure, with or without skew, and varying the number of categories for the items. An example of a circumplex structure is shown in Figure 3.
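To make the idea of a circumplex concrete, the following sketch in base R places 16 items at equally spaced angles in a two factor space and computes the population correlations this implies. It is an illustration only and does not reproduce the algorithm used by sim.circ; the loading length of 0.6 and all object names are assumptions.

```r
# Loadings for 16 items spread evenly around a circle in a two factor space.
# The communality of 0.36 (loading length 0.6) is an illustrative assumption.
n_items <- 16
theta   <- 2 * pi * (0:(n_items - 1)) / n_items   # equally spaced angles
radius  <- 0.6
circ_loadings <- radius * cbind(F1 = cos(theta), F2 = sin(theta))

# Implied population correlations: r_ij = radius^2 * cos(theta_i - theta_j)
R_circ <- tcrossprod(circ_loadings)
diag(R_circ) <- 1
round(R_circ[1, 1:9], 2)
```

The correlations of the first item with the others fall off smoothly with angular distance, reach zero at 90 degrees, and become negative beyond that, so there is no natural break that would define clusters of items.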

2.4 sim.structure

A more general case is to consider three matrices, fx, φxy, fy, which describe, in turn, a measurement model of x variables, fx, a measurement model of y variables, fy, and a covariance matrix between and within the two sets of factors. If fx is a vector and fy and phixy are NULL, then this is just the congeneric model. If fx is a matrix of loadings with n rows and c columns, then this is a measurement model for n variables across c factors. If phixy is not null, but fy is NULL, then the factors in fx are correlated. Finally, if all three matrices are not NULL, then the data show the standard linear structural relations (LISREL) structure.
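The cases with only x variables can be sketched in a few lines of base R: the model-implied correlation matrix is the loading matrix times the factor correlation matrix times the transposed loading matrix, with the diagonal set to 1. The helper implied_cor below is hypothetical and written for this illustration; it is not a psych function and does not handle the fy case.

```r
# Model-implied correlations for the factor structures described above.
# implied_cor() is a hypothetical helper written for this sketch; it is not
# a psych function.
implied_cor <- function(fx, Phi = NULL) {
  fx <- as.matrix(fx)                       # a vector becomes one factor
  if (is.null(Phi)) Phi <- diag(ncol(fx))   # default: uncorrelated factors
  R <- fx %*% Phi %*% t(fx)
  diag(R) <- 1                              # unit variances
  R
}

# Congeneric case: one factor, four variables
R_cong <- implied_cor(c(.9, .8, .7, .6))
round(R_cong, 2)

# Two correlated factors (illustrative loadings, factor correlation .5)
fx2   <- cbind(c(.9, .8, 0, 0), c(0, 0, .7, .6))
Phi2  <- matrix(c(1, .5, .5, 1), ncol = 2)
R_two <- implied_cor(fx2, Phi2)
round(R_two, 3)
```

With uncorrelated factors, variables that load on different factors have an implied correlation of zero; a nonzero factor correlation scales the product of their loadings, for example .9 × .5 × .7 = .315 in the second case.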

Consider the following examples:

2.4.1 fx is a vector implies a congeneric model

> set.seed(42)
> fx <- c(.9,.8,.7,.6)
> cong1 <- sim.structure(fx)
> cong1
Call: sim.structure(fx = fx)

 $model (Population correlation matrix)

> op <- par(mfrow=c(1,2))
> m.bi <- omega(bifact,title="A bifactor model")
> m.hi <- omega(bifact,sl=FALSE,title="A hierarchical model")
> op <- par(mfrow = c(1,1))

[Figure: two omega path diagrams for V1-V9. Left panel, "A bifactor model": g plus group factors F1, F2, and F3. Right panel, "A hierarchical model": g as a second order factor above F1, F2, and F3.]

Figure 2: (Left panel) A bifactor solution represents each test in terms of a general factor and a residualized group factor. (Right Panel) A hierarchical factor solution has g as a second order factor accounting for the correlations between the first order factors.

> circ <- sim.circ(16)
> f2 <- fa(circ,2)
> plot(f2,title="16 simulated variables in a circumplex pattern")

[Figure: scatter plot titled "16 simulated variables in a circumplex pattern". Loadings of the 16 items on MR1 (x axis) and MR2 (y axis), each axis running from -0.6 to 0.6.]

Figure 3: Emotion items or interpersonal items frequently show a circumplex structure. Data generated by sim.circ and factor loadings found by the principal axis algorithm using factor.pa.

     V1   V2   V3   V4
V1 1.00 0.72 0.63 0.54
V2 0.72 1.00 0.56 0.48
V3 0.63 0.56 1.00 0.42
V4 0.54 0.48 0.42 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.36

2.4.2 fx is a matrix implies an independent factors model:

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> threefact <- sim.structure(fx)
> threefact
Call: sim.structure(fx = fx)

$model (Population correlation matrix)

     V1   V2   V3   V4   V5   V6   V7  V8   V9
V1 1.00 0.72 0.63 0.00 0.00 0.00 0.00 0.0 0.00
V2 0.72 1.00 0.56 0.00 0.00 0.00 0.00 0.0 0.00
V3 0.63 0.56 1.00 0.00 0.00 0.00 0.00 0.0 0.00
V4 0.00 0.00 0.00 1.00 0.42 0.35 0.00 0.0 0.00
V5 0.00 0.00 0.00 0.42 1.00 0.30 0.00 0.0 0.00
V6 0.00 0.00 0.00 0.35 0.30 1.00 0.00 0.0 0.00
V7 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.3 0.24
V8 0.00 0.00 0.00 0.00 0.00 0.00 0.30 1.0 0.20
V9 0.00 0.00 0.00 0.00 0.00 0.00 0.24 0.2 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model

> Phi = matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol=3)
> corf3 <- sim.structure(fx,Phi)
> fx
      [,1] [,2] [,3]
 [1,]  0.9  0.0  0.0
 [2,]  0.8  0.0  0.0
 [3,]  0.7  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.6
 [8,]  0.0  0.0  0.5
 [9,]  0.0  0.0  0.4
> Phi
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.3
[2,]  0.5  1.0  0.2
[3,]  0.3  0.2  1.0
> corf3

[Figure 4 image omitted. Path diagram titled "Structural model": x1-x3 load on X1 (0.9, 0.8, 0.7), x4-x6 on X2 (0.7, 0.6, 0.5), and x7-x9 on X3 (0.6, 0.5, 0.4).]

Figure 4: Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram.

Call: sim.structure(fx = fx, Phi = Phi)

$model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs  #show the matrix
      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"
> Phis  #show this one as well
   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.
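The compact specification can be unpacked with a few lines of base R. The function below, symbolic_loadings, is a hypothetical stand-in written only to show the idea of turning a list of item indices into a character matrix of path labels. It is not the psych implementation of structure.list and handles only the simple case shown here.

```r
# symbolic_loadings() is a hypothetical illustration, not psych::structure.list.
symbolic_loadings <- function(nvars, items) {
  f <- matrix("0", nrow = nvars, ncol = length(items),
              dimnames = list(NULL, names(items)))
  for (j in seq_along(items)) {
    # label each free loading with a letter for the factor and the item number
    f[items[[j]], j] <- paste0(letters[j], items[[j]])
  }
  f
}

fxs_sketch <- symbolic_loadings(9, list(F1 = 1:3, F2 = 4:6, F3 = 7:9))
fxs_sketch
```

Fixed zeros mark paths that are not estimated, while each label names a free parameter, which is the same information a path diagram conveys.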

> plot.new()
> corf3mod <- structure.diagram(fxs,Phis)

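[Figure 5 image omitted. Path diagram titled "Structural model": x1-x3 load on F1 (a1, a2, a3), x4-x6 on F2 (b4, b5, b6), and x7-x9 on F3 (c7, c8, c9), with factor correlations rab, rac, and rbc.]

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input.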

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3p <- Promax(fa(corf3$model,3))
> plot.new()
> modf3p <- structure.diagram(f3p,cut=.2)

[Figure 6 image omitted. Path diagram titled "Structural model": V1-V3 load on MR1 (0.9, 0.8, 0.7), V4-V6 on MR2 (0.7, 0.6, 0.5), and V7-V9 on MR3 (0.6, 0.5, 0.4).]

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y sets.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <- rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)

     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00
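The twelve variable matrix can be checked by hand with base R. The sketch below assumes that combining fx and fy amounts to stacking them into one block-diagonal loading matrix with the Y factor as a fourth column. That is an assumption about the model algebra, not a description of the internal code of sim.structure, and the object names are chosen for this example only.

```r
# Base R check of the X and Y model; object names are illustrative only.
fx  <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol = 3)
fy  <- matrix(c(.6,.5,.4), ncol = 1)
Phi <- matrix(c(1,.48,.32,.4, .48,1,.32,.3, .32,.32,1,.2, .4,.3,.2,1), ncol = 4)

# 12 x 4 loading matrix: X variables on factors 1-3, Y variables on factor 4
f <- rbind(cbind(fx, 0), cbind(matrix(0, 3, 3), fy))

R12 <- f %*% Phi %*% t(f)   # model-implied correlations
diag(R12) <- 1
vnames <- c(paste0("x", 1:9), paste0("y", 1:3))
dimnames(R12) <- list(vnames, vnames)
round(R12, 2)
```

For example, the x1 with y1 entry is the product of the two loadings and the X1 with Y factor correlation, .9 * .4 * .6.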

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well a particular model recovers the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

[Figure 7 image omitted. Path diagram titled "Structural model": x1-x9 load on X1-X3 (a1-a3, b4-b6, c7-c9), y1-y3 load on Y (Ya1-Ya3), and X1-X3 are linked to Y by rad, rbd, and rcd.]

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> himod <- structure.diagram(fxh,Phi, fy)

[Figure 8 image omitted. Path diagram titled "Structural model": x1-x9 load on X1-X3 (a1-a3, b4-b6, c7-c9), X1-X3 are linked to g by a, b, and c, y1-y3 load on Y (a1-a3), and g is linked to Y by r.]

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)
Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  MR1
Correlation of (regression) scores with factors  0.94
Multiple R square of scores with factors         0.88
Minimum correlation of possible factor scores    0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
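The component result can be reproduced with base R alone. The sketch below rebuilds the congeneric population matrix from the loadings .9, .8, .7, .6 used earlier and takes the first eigenvector scaled by the square root of its eigenvalue. It is meant only to show why component loadings exceed the generating loadings here (components also absorb unique variance). It is not the code of principal.

```r
# First principal component of the congeneric population matrix (base R only).
lambda <- c(.9, .8, .7, .6)          # generating factor loadings
Rc <- tcrossprod(lambda)             # common part: lambda %*% t(lambda)
diag(Rc) <- 1

e   <- eigen(Rc, symmetric = TRUE)
pc1 <- abs(e$vectors[, 1]) * sqrt(e$values[1])  # component loadings (sign fixed)

round(cbind(factor = lambda, component = pc1), 2)
round(e$values[1], 2)                # sum of squared component loadings
```

The first eigenvalue corresponds to the "SS loadings" line of the principal output, while the factor model reproduces the off-diagonal correlations exactly from the smaller generating loadings.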

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ² value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")

> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3
Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 PA1  PA3  PA2
Correlation of (regression) scores with factors  0.9 0.85 0.83
Multiple R square of scores with factors         0.8 0.72 0.69
Minimum correlation of possible factor scores    0.6 0.45 0.38

> ml3
Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  ML1  ML3  ML2
Correlation of (regression) scores with factors  0.90 0.85 0.83
Multiple R square of scores with factors         0.80 0.72 0.69
Minimum correlation of possible factor scores    0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
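The congruence coefficient is simple to compute by hand. The base R sketch below defines a small hypothetical helper, congruence, as the cosine between columns of loadings (cross products divided by the square roots of the sums of squares) and applies it to the rounded RC1 and PA loadings printed above. It is an illustration of the formula rather than the code of factor.congruence.

```r
# congruence() is a hypothetical helper illustrating the cosine formula.
congruence <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  crossprod(x, y) / sqrt(outer(colSums(x^2), colSums(y^2)))
}

# Rounded loadings copied from the printed output above
rc1 <- c(.75, .76, .78, .29, .20, .07, .26, .20, .00)
pa1 <- c(.8, .7, .6, 0, 0, 0, 0, 0, 0)
pa3 <- c(0, 0, 0, .7, .6, .5, 0, 0, 0)

round(congruence(rc1, cbind(PA1 = pa1, PA3 = pa3)), 2)
round(congruence(pa1, pa3), 2)
```

Unlike a correlation, the coefficient is not centered on the column means, so two loading vectors with no overlapping nonzero entries have a congruence of exactly zero.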

By default, all three of these procedures use the varimax rotation criterion. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measurement of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> ombi
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> ombi <- omega(bifact)

[Figure 9 image omitted. Path diagram titled "Omega": V1-V9 load on g and on the group factors F1-F3.]

Figure 9: An exploratory bifactor solution to the nine variable problem.

> omhi <- omega(bifact,sl=FALSE)

[Figure 10 image omitted. Path diagram titled "Omega": V1-V9 load on F1-F3, which in turn load on g.]

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0

The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23

The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
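The omega coefficients for the total score can be approximated from the printed Schmid-Leiman loadings. The sketch below assumes the usual definitions: omega hierarchical is the squared sum of the general factor loadings divided by the total variance of the composite, and omega total adds the squared sums of the group factor loadings to the numerator. It works from the rounded loadings shown above, so the results agree with the omega output only to rounding error.

```r
# Omega from rounded Schmid-Leiman loadings (values copied from the output above).
g   <- c(.72, .63, .54, .56, .48, .40, .49, .42, .28)   # general factor loadings
grp <- c(.35, .31, .26, .42, .36, .30, .50, .43, .29)   # group factor loadings
fid <- rep(1:3, each = 3)                               # group factor of each variable
u2  <- c(.36, .51, .64, .51, .64, .75, .51, .64, .84)   # uniquenesses

var_g   <- sum(g)^2                       # general factor variance in the total score
var_grp <- sum(tapply(grp, fid, sum)^2)   # variance due to the three group factors
Vx      <- var_g + var_grp + sum(u2)      # total variance of the unit-weighted sum

omega_h <- var_g / Vx
omega_t <- (var_g + var_grp) / Vx
round(c(omega_h = omega_h, omega_t = omega_t), 2)
```

The gap between the two values is the share of total score variance that is reliable but attributable to the group factors rather than to g.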

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

[Figure 11 image omitted. Cluster diagram titled "Hierarchical cluster analysis of bifact data": variables V1-V9 merge into clusters C1-C8, each labeled with its α value.]

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found, and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gregpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.
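Two of the fit statistics in this output can be recovered from the others with base R. The sketch below assumes that the reported probability is the upper tail of a χ² distribution with the model degrees of freedom, and that the empirical BIC is the empirical χ² minus df times the natural log of the number of observations. Both assumptions reproduce the printed values to rounding. The inputs are the rounded numbers shown above.

```r
# Recomputing fit statistics from the printed ESEM output (rounded inputs).
chisq <- 21.83    # empirical chi square for the model
df    <- 7        # model degrees of freedom
n_obs <- 1000     # number of observations

p_value <- pchisq(chisq, df, lower.tail = FALSE)  # upper tail probability
ebic    <- chisq - df * log(n_obs)                # assumed form of the empirical BIC

round(p_value, 4)
round(ebic, 2)
```

A negative BIC of this kind indicates that the χ² is small relative to the penalty for the degrees of freedom left unused by the model.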

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

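[Figure 12 image omitted. Path diagram titled "Exploratory Structural Model": V, Q, A, nach, and Anx load on X1 and X2; gpa, Pre, and MA load on Y1 (0.6, 0.5, 0.4); X1 and X2 are each linked to Y1 by 0.7.]

Figure 12: An example of an Exploratory Structural Equation Model.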

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function, which requires Rgraphviz, or the structure.diagram or structure.sem functions, which do not, may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> modtau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> modtau  #show it
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> semtau <- sem(modtau,cong,100)
> summary(semtau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> modcong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> modcong  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> semcong <- sem(modcong,cong,100)
> summary(semcong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12
> anova(semcong,semtau)  #test the difference between the two models
LR Test for Difference Between Models

        Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semcong        2      2.9417
semtau         5      6.5935  3   3.6518     0.3016

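The anova comparison of the congeneric versus tau equivalent model shows that the change in χ² (3.65 with 3 degrees of freedom, p = 0.30) is not significant given the change in degrees of freedom. With only 100 simulated subjects, the (false) hypothesis of tau equivalence cannot be rejected.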
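The likelihood ratio test reported by anova can be reproduced by hand in base R from the two model χ² values and their degrees of freedom. The numbers below are the rounded values printed in the anova table. The variable names are chosen for this example only.

```r
# Chi square difference test for two nested models (values from the anova table).
chisq_tau  <- 6.5935   # tau equivalent model (more constrained)
df_tau     <- 5
chisq_cong <- 2.9417   # congeneric model (less constrained)
df_cong    <- 2

lr_chisq <- chisq_tau - chisq_cong            # change in chi square
lr_df    <- df_tau - df_cong                  # change in degrees of freedom
lr_p     <- pchisq(lr_chisq, lr_df, lower.tail = FALSE)

round(c(LR = lr_chisq, df = lr_df, p = lr_p), 4)
```

The test asks whether freeing the three equality constraints on the loadings improves the fit by more than would be expected by chance.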

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly, the next several create models based upon exploratory factor analyses. modone is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects, this model can not be rejected. However, an examination of the residuals suggests serious problems with the model.

> modone <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> modone  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> semone <- sem(modone,bifact,100)
> summary(semone,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11
> round(residuals(semone),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
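The pattern in these residuals can be reproduced with base R. The sketch below assumes that the bifact matrix is the one implied by the three factor solution recovered exactly in section 3.1 (loadings of .8, .7, .6, .7, .6, .5, .7, .6, .4 with factor correlations of .72, .63, and .56), and subtracts the correlations implied by the single factor estimates printed above. The object names are illustrative only.

```r
# Residuals of the one factor model, rebuilt from values printed in the text.
fx3  <- matrix(c(.8,.7,.6,rep(0,9),.7,.6,.5,rep(0,9),.7,.6,.4), ncol = 3)
Phi3 <- matrix(c(1,.72,.63, .72,1,.56, .63,.56,1), ncol = 3)
Rb <- fx3 %*% Phi3 %*% t(fx3)     # assumed population matrix (see lead-in)
diag(Rb) <- 1

# single factor loadings a ... i from the sem output above
one <- c(0.7507129, 0.6726412, 0.5811209, 0.5737425, 0.5021915,
         0.4239908, 0.5067957, 0.4450171, 0.3052415)

res_one <- Rb - tcrossprod(one)   # observed minus model-implied correlations
diag(res_one) <- 0                # the diagonal is fit by the error variances
round(res_one, 2)
```

The positive residuals cluster within the three groups of variables, which is the signature of group factors that a single general factor cannot absorb.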

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> modf1 <- structure.sem(f1)
> semf1 <- sem(modf1,bifact,100)
> semf1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, factanal could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <-fa(bifact,3,rotate="varimax")
> modf3 <- structure.sem(f3)
> semf3 <- sem(modf3,bifact,100)
> summary(semf3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24
> round(residuals(semf3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model Although the residuals within eachof the three factors are zero the residuals between groups are much too large

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3) #extract three factors and do an oblique rotation
> modf3 <- structure.sem(f3) #create the sem model
> modf3 #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

[Figure 13 (path diagram titled "Structural model"): A three factor oblique solution.]

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sembi <- sem(omthbi$model$sem,Thurstone,213) #use the model created by omega
> summary(sembi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27 Sentences <--- g
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30 Vocabulary <--- g
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26 SentCompletion <--- g
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18 FirstLetters <--- g
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17 FourLetterWords <--- g
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15 Suffixes <--- g
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15 LetterSeries <--- g
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21 Pedigrees <--- g
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13 LetterGroup <--- g
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09 Sentences <--- F1
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08 Vocabulary <--- F1
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06 SentCompletion <--- F1
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13 FirstLetters <--- F2
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10 FourLetterWords <--- F2
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07 Suffixes <--- F2
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06 LetterSeries <--- F3
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03 Pedigrees <--- F3
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04 LetterGroup <--- F3
e1                0.1723633 0.03405646  5.0611045 4.168346e-07 Sentences <--> Sentences
e2                0.1698419 0.03001233  5.6590697 1.521958e-08 Vocabulary <--> Vocabulary
e3                0.2684749 0.03316228  8.0957909 5.689350e-16 SentCompletion <--> SentCompletion
e4                0.2528108 0.07942835  3.1828791 1.458185e-03 FirstLetters <--> FirstLetters
e5                0.3873510 0.06317399  6.1314949 8.705712e-10 FourLetterWords <--> FourLetterWords
e6                0.5175679 0.05955079  8.6912013 3.586269e-18 Suffixes <--> Suffixes
e7                0.1496709 0.21861502  0.6846325 4.935759e-01 LetterSeries <--> LetterSeries
e8                0.5003855 0.05956551  8.4005902 4.442400e-17 Pedigrees <--> Pedigrees
e9                0.5517474 0.08455914  6.5249884 6.800680e-11 LetterGroup <--> LetterGroup

 Iterations =  72

[Figure 14 (path diagram titled "Omega"): A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability, the residual factors account for the correlations between items within groups.]

> stdCoef(sembi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671     Sentences <--- g
2  Vocabulary        0.7909246     Vocabulary <--- g
3  SentCompletion    0.7536211     SentCompletion <--- g
4  FirstLetters      0.6083814     FirstLetters <--- g
5  FourLetterWords   0.5973349     FourLetterWords <--- g
6  Suffixes          0.5717900     Suffixes <--- g
7  LetterSeries      0.5668950     LetterSeries <--- g
8  Pedigrees         0.6623317     Pedigrees <--- g
9  LetterGroup       0.5299523     LetterGroup <--- g
10 F1Sentences       0.4878697     Sentences <--- F1
11 F1Vocabulary      0.4523233     Vocabulary <--- F1
12 F1SentCompletion  0.4044507     SentCompletion <--- F1
13 F2FirstLetters    0.6140531     FirstLetters <--- F2
14 F2FourLetterWords 0.5058063     FourLetterWords <--- F2
15 F2Suffixes        0.3943206     Suffixes <--- F2
16 F3LetterSeries    0.7272957     LetterSeries <--- F3
17 F3Pedigrees       0.2468418     Pedigrees <--- F3
18 F3LetterGroup     0.4091494     LetterGroup <--- F3
19 e1                0.1723633     Sentences <--> Sentences
20 e2                0.1698418     Vocabulary <--> Vocabulary
21 e3                0.2684748     SentCompletion <--> SentCompletion
22 e4                0.2528108     FirstLetters <--> FirstLetters
23 e5                0.3873510     FourLetterWords <--> FourLetterWords
24 e6                0.5175675     Suffixes <--> Suffixes
25 e7                0.1496710     LetterSeries <--> LetterSeries
26 e8                0.5003859     Pedigrees <--> Pedigrees
27 e9                0.5517473     LetterGroup <--> LetterGroup
28                   1.0000000     F1 <--> F1
29                   1.0000000     F2 <--> F2
30                   1.0000000     F3 <--> F3
31                   1.0000000     g <--> g

Compare this solution to the one reported below, and to the sem manual.
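The AIC and BIC printed by summary for the bifactor model can be checked by hand from the chi square, its degrees of freedom, and the sample size. The sketch below assumes the definitions AIC = chi square + 2t (t free parameters) and BIC = chi square - df log(N); these formulas are stated here as assumptions, although they do reproduce the values printed above. The object names are illustrative.

```r
# Values copied from the summary output of the bifactor model
chisq <- 24.2163
df    <- 18
n_obs <- 213
n_var <- 9

# Number of distinct variances and covariances among 9 variables
n_moments <- n_var * (n_var + 1) / 2

# Free parameters = moments minus degrees of freedom
n_par <- n_moments - df

aic <- chisq + 2 * n_par          # assumed definition: chi square + 2 t
bic <- chisq - df * log(n_obs)    # assumed definition: chi square - df log(N)

cat("free parameters:", n_par, "\n")
cat("AIC:", round(aic, 4), " BIC:", round(bic, 4), "\n")
```

The same arithmetic applied to the hierarchical model (chi square 38.1963, 24 degrees of freedom) gives the AIC and BIC reported for that model in the next section.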

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> semhi <- sem(omhi$model$sem,Thurstone,213)
> summary(semhi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

 Iterations =  54

[Figure 15 (path diagram titled "Omega"): Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.]

> stdCoef(semhi,digits=2)

                     Std. Estimate
1  gF1               0.8220754     F1 <--- g
2  gF2               0.7817998     F2 <--- g
3  gF3               0.8150140     F3 <--- g
4  F1Sentences       0.9047111     Sentences <--- F1
5  F1Vocabulary      0.9138214     Vocabulary <--- F1
6  F1SentCompletion  0.8560764     SentCompletion <--- F1
7  F2FirstLetters    0.8357617     FirstLetters <--- F2
8  F2FourLetterWords 0.7971819     FourLetterWords <--- F2
9  F2Suffixes        0.7025560     Suffixes <--- F2
10 F3LetterSeries    0.7808129     LetterSeries <--- F3
11 F3Pedigrees       0.7201599     Pedigrees <--- F3
12 F3LetterGroup     0.7034902     LetterGroup <--- F3
13 e1                0.1814979     Sentences <--> Sentences
14 e2                0.1649304     Vocabulary <--> Vocabulary
15 e3                0.2671331     SentCompletion <--> SentCompletion
16 e4                0.3015024     FirstLetters <--> FirstLetters
17 e5                0.3645010     FourLetterWords <--> FourLetterWords
18 e6                0.5064151     Suffixes <--> Suffixes
19 e7                0.3903313     LetterSeries <--> LetterSeries
20 e8                0.4813697     Pedigrees <--> Pedigrees
21 e9                0.5051016     LetterGroup <--> LetterGroup
22                   0.3241920     F1 <--> F1
23                   0.3887891     F2 <--> F2
24                   0.3357521     F3 <--> F3
25                   1.0000000     g <--> g

> anova(semhi,sembi)

LR Test for Difference Between Models

      Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semhi       24      38.196
sembi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
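The likelihood ratio comparison reported by anova can be reproduced from the two chi square values alone. This is a minimal base R sketch with illustrative variable names; it takes the difference in chi square between the nested models, refers it to a chi square distribution on the difference in degrees of freedom, and uses the rounded values printed in the anova table.

```r
# Fit of the two nested models, copied from the anova table
chisq_hi <- 38.196; df_hi <- 24   # hierarchical model (more restricted)
chisq_bi <- 24.216; df_bi <- 18   # bifactor model (6 more free parameters)

lr_chisq <- chisq_hi - chisq_bi   # likelihood ratio chi square
lr_df    <- df_hi - df_bi         # difference in degrees of freedom

# Upper tail probability of the chi square difference
p_value <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

cat("LR chi square =", round(lr_chisq, 2),
    "on", lr_df, "df, p =", signif(p_value, 4), "\n")
```

A p value below .05 is what supports the statement that the six additional parameters of the bifactor model improve the fit.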

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> omholzbi <- omega(Holzinger,4)
> semholzbi <- sem(omholzbi$model$sem,Holzinger,355)
> omholzhi <- omega(Holzinger,4,sl=FALSE)
> semholzhi <- sem(omholzhi$model$sem,Holzinger,355)
> anova(semholzbi,semholzhi)

LR Test for Difference Between Models

          Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semholzbi       63      147.66
semholzhi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> omsem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
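To see where the confirmatory values of ωh and ωt come from, they can be approximated from the rounded loadings printed above. The sketch below is a base R illustration under stated assumptions: it uses the usual definitions ωh = (sum of g loadings)² / Vt and ωt = 1 - (sum of uniquenesses) / Vt, and it takes Vt to be the total variance implied by the bifactor model rather than the observed correlation matrix. Because the inputs are rounded to two decimals, the results only approximate what omegaSem reports.

```r
# Rounded loadings from the confirmatory solution printed above
g_load   <- c(.77, .79, .75, .61, .60, .57, .57, .66, .53)  # general factor
grp_load <- c(.49, .45, .40, .61, .50, .39, .73, .25, .41)  # one group factor per item
u2       <- c(.18, .17, .27, .26, .39, .52, .15, .50, .55)  # uniquenesses
member   <- rep(1:3, each = 3)                              # group factor of each item

# Model-implied variance of the total score:
# general part + group parts + unique part
v_general <- sum(g_load)^2
v_group   <- sum(tapply(grp_load, member, sum)^2)
v_total   <- v_general + v_group + sum(u2)

omega_h <- v_general / v_total        # share of total variance due to g
omega_t <- 1 - sum(u2) / v_total      # share due to all common factors

cat("omega_h =", round(omega_h, 2), " omega_t =", round(omega_t, 2), "\n")
```

With these inputs the general factor contributes 5.85² = 34.22 of a total of about 43.19, which is why any loading that moves from a group factor to g raises ωh.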

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

[Figure 16 (path diagram titled "Omega from SEM"): Confirmatory Omega structure using omegaSem.]

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


> op <- par(mfrow=c(1,2))
> mbi <- omega(bifact,title="A bifactor model")
> mhi <- omega(bifact,sl=FALSE,title="A hierarchical model")
> op <- par(mfrow = c(1,1))

[Figure 2 (two path diagrams titled "A bifactor model" and "A hierarchical model"): (Left panel) A bifactor solution represents each test in terms of a general factor and a residualized group factor. (Right panel) A hierarchical factor solution has g as a second order factor accounting for the correlations between the first order factors.]
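The relation between the two panels can be written out directly. The sketch below is a base R illustration of the Schmid-Leiman transformation, assuming the hierarchical solution has first-order loadings of .8, .7, .6, .7, .6, .5, .7, .6, .4 and second-order loadings of .9, .8, .7 (the values plotted in the hierarchical panel); all object names are illustrative. Each general factor loading is the product of the two levels, and each residualized group loading is what remains after g is removed.

```r
# Assumed hierarchical solution (values as plotted in the hierarchical panel)
lambda <- c(.8, .7, .6, .7, .6, .5, .7, .6, .4)  # loadings of V1..V9 on F1..F3
member <- rep(1:3, each = 3)                     # group factor of each variable
gamma  <- c(.9, .8, .7)                          # loadings of F1..F3 on g

# Schmid-Leiman: loadings on g and on the residualized group factors
g_loading     <- lambda * gamma[member]
group_loading <- lambda * sqrt(1 - gamma[member]^2)

sl <- cbind(g = g_loading, group = group_loading,
            h2 = g_loading^2 + group_loading^2)
rownames(sl) <- paste0("V", 1:9)
print(round(sl, 2))
```

The communalities are unchanged by the transformation (h2 equals the squared first-order loading), and rounding the two loading columns to one decimal gives the path values drawn in the bifactor panel.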


> circ <- sim.circ(16)
> f2 <- fa(circ,2)
> plot(f2,title="16 simulated variables in a circumplex pattern")

[Figure 3 (plot titled "16 simulated variables in a circumplex pattern", axes MR1 and MR2): Emotion items or interpersonal items frequently show a circumplex structure. Data generated by sim.circ and factor loadings found by the principal axis algorithm using factor.pa.]

     V1   V2   V3   V4
V1 1.00 0.72 0.63 0.54
V2 0.72 1.00 0.56 0.48
V3 0.63 0.56 1.00 0.42
V4 0.54 0.48 0.42 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.36

2.4.2 fx is a matrix implies an independent factors model:

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> threefact <- sim.structure(fx)
> threefact

Call: sim.structure(fx = fx)

 $model (Population correlation matrix)
     V1   V2   V3   V4   V5   V6   V7  V8   V9
V1 1.00 0.72 0.63 0.00 0.00 0.00 0.00 0.0 0.00
V2 0.72 1.00 0.56 0.00 0.00 0.00 0.00 0.0 0.00
V3 0.63 0.56 1.00 0.00 0.00 0.00 0.00 0.0 0.00
V4 0.00 0.00 0.00 1.00 0.42 0.35 0.00 0.0 0.00
V5 0.00 0.00 0.00 0.42 1.00 0.30 0.00 0.0 0.00
V6 0.00 0.00 0.00 0.35 0.30 1.00 0.00 0.0 0.00
V7 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.3 0.24
V8 0.00 0.00 0.00 0.00 0.00 0.00 0.30 1.0 0.20
V9 0.00 0.00 0.00 0.00 0.00 0.00 0.24 0.2 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model

> Phi = matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol=3)
> corf3 <- sim.structure(fx,Phi)

> fx

      [,1] [,2] [,3]
 [1,]  0.9  0.0  0.0
 [2,]  0.8  0.0  0.0
 [3,]  0.7  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.6
 [8,]  0.0  0.0  0.5
 [9,]  0.0  0.0  0.4

> Phi

     [,1] [,2] [,3]
[1,]  1.0  0.5  0.3
[2,]  0.5  1.0  0.2
[3,]  0.3  0.2  1.0

> corf3

[Figure 4 (path diagram titled "Structural model"): Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram.]

Call: sim.structure(fx = fx, Phi = Phi)

 $model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16
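The population matrix above follows from the factor model R = fx Phi fx' with the diagonal set to 1. The sketch below is a base R illustration of that computation, not the internal code of sim.structure. The names model and reliability are chosen to mirror the printed output, and reliability is taken here to be the communality of each variable, which is an assumption about how that element is defined.

```r
# Loadings and factor correlations as defined in the text
fx  <- matrix(c(.9, .8, .7, rep(0, 9), .7, .6, .5, rep(0, 9), .6, .5, .4), ncol = 3)
Phi <- matrix(c(1, .5, .3, .5, 1, .2, .3, .2, 1), ncol = 3)

# Common part of the correlation matrix
model <- fx %*% Phi %*% t(fx)

# The diagonal of the common part is the communality of each variable
reliability <- diag(model)

# A correlation matrix has unit diagonal
diag(model) <- 1
dimnames(model) <- list(paste0("V", 1:9), paste0("V", 1:9))

print(round(model, 3))
print(round(reliability, 2))
```

For example, the correlation of V1 with V4 is .9 * .5 * .7 = .315: the product of the two loadings and the correlation between their factors.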

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs #show the matrix

      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"

> Phis #show this one as well

   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.

> plot.new()
> corf3mod <- structure.diagram(fxs,Phis)

[Figure 5 (path diagram titled "Structural model"): Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input.]

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3p <- Promax(fa(corf3$model,3))
> plot.new()
> modf3p <- structure.diagram(f3p,cut=.2)

[Figure 6 (path diagram titled "Structural model"): The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.]

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y set.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <- rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)

     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

[Figure 7 (path diagram titled "Structural model"): A symbolic structural model. Three independent latent variables are regressed on a latent Y.]

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> himod <- structure.diagram(fxh,Phi, fy)

[Figure 8 (path diagram titled "Structural model"): A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.]

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)

Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)

Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   MR1
Correlation of (regression) scores with factors   0.94
Multiple R square of scores with factors          0.88
Minimum correlation of possible factor scores     0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
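The contrast can be checked without psych. The sketch below is an illustration under stated assumptions, not the vignette's own analysis: it uses base R's factanal (a maximum likelihood procedure, standing in for principal axes or minres) on a correlation matrix generated from assumed loadings of 0.9, 0.8, 0.7 and 0.6, and compares the result with the first principal component.

```r
# Assumed generating loadings (illustrative)
lambda <- c(0.9, 0.8, 0.7, 0.6)
R <- lambda %o% lambda
diag(R) <- 1
dimnames(R) <- list(paste0("V", 1:4), paste0("V", 1:4))

# One factor maximum likelihood solution fitted directly to the matrix
fit <- factanal(covmat = R, factors = 1)
ml_loadings <- abs(as.vector(loadings(fit)))   # sign is arbitrary

# First principal component loadings for comparison
ev <- eigen(R, symmetric = TRUE)
pc_loadings <- abs(ev$vectors[, 1]) * sqrt(ev$values[1])

round(cbind(model = lambda, ML = ml_loadings, PC = pc_loadings), 2)
```

Because the matrix fits a one factor model exactly, the factor loadings should match the generating values to within the optimizer's tolerance, while the component loadings do not.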

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ2 value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity = 1.2
Test of the hypothesis that 3 components are sufficient.
The root mean square of the residuals (RMSR) is 0.1
Fit based upon off diagonal values = 0.88

> pa3
Factor Analysis using method = pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.
The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0
The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0
Fit based upon off diagonal values = 1

Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38

> ml3
Factor Analysis using method = ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.
The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0
The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0
Fit based upon off diagonal values = 1

Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
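The congruence coefficient itself is simple to compute. The following is a minimal base R sketch of the cosine between columns of two loading matrices; the function name congruence and the two small loading matrices are hypothetical and serve only to illustrate the calculation, not to reproduce factor.congruence.

```r
# Cosine of the angle between each pair of columns of two loading matrices
congruence <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  crossprod(x, y) / sqrt(outer(colSums(x^2), colSums(y^2)))
}

# Hypothetical loading matrices for six variables and two dimensions
A <- cbind(F1 = c(0.8, 0.7, 0.6, 0.0, 0.0, 0.0),
           F2 = c(0.0, 0.0, 0.0, 0.7, 0.6, 0.5))
B <- cbind(G1 = c(0.75, 0.76, 0.78, 0.29, 0.20, 0.07),
           G2 = c(0.27, 0.21, 0.11, 0.69, 0.71, 0.76))

cong_AB <- congruence(A, B)
round(cong_AB, 2)
```

Unlike a correlation, the coefficient does not center the loadings, so two factors with uniformly positive loadings can show substantial congruence even when they mark different variables.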

By default, principal applies the varimax rotation criterion, whereas fa applies oblimin (which is why pa3 and ml3 report factor correlations above). It can also be useful to apply a different oblique transformation, such as Promax, to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957), with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measurement of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.
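To make the Schmid-Leiman idea concrete, here is a minimal base R sketch. It assumes a higher order structure (group loadings of 0.8, 0.7, 0.6; 0.7, 0.6, 0.5; 0.7, 0.6, 0.4 and second order loadings of 0.9, 0.8, 0.7) that is consistent with the pattern and factor correlations printed for the bifact matrix earlier; these values and all object names are assumptions for illustration, and the code is not what omega runs internally.

```r
# Assumed first order loadings of nine variables on three group factors
Fload <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
Fload[1:3, 1] <- c(0.8, 0.7, 0.6)
Fload[4:6, 2] <- c(0.7, 0.6, 0.5)
Fload[7:9, 3] <- c(0.7, 0.6, 0.4)
gload <- c(F1 = 0.9, F2 = 0.8, F3 = 0.7)   # assumed loadings of the group factors on g

# Schmid-Leiman transformation
g_loadings     <- as.vector(Fload %*% gload)          # direct loadings on g
group_loadings <- Fload %*% diag(sqrt(1 - gload^2))   # residualized group loadings
sl <- cbind(g_loadings, group_loadings)
dimnames(sl) <- list(rownames(Fload), c("g", "F1", "F2", "F3"))

# Implied correlation matrix and omega coefficients for the total score
R <- sl %*% t(sl)
diag(R) <- 1
omega_h <- sum(sl[, "g"])^2 / sum(R)             # variance due to g alone
omega_t <- 1 - sum(1 - rowSums(sl^2)) / sum(R)   # variance due to all common factors

round(sl, 2)
round(c(omega_h = omega_h, omega_t = omega_t), 2)   # 0.70 and 0.82
```

Each variable's loading on g is its group loading times that group's loading on g, and what remains of the group factor is scaled by the square root of one minus the squared second order loading.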

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> om.bi
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85


> om.bi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem.


> om.hi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem.


Omega Total            0.82

Schmid Leiman Factor loadings greater than 0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max 4.67   max/min = 1.82
mean percent general = 0.65 with sd = 0.14 and cv of 0.21
Explained Common Variance of the general factor = 0.67

The degrees of freedom are 12 and the fit is 0
The root mean square of the residuals is 0
The df corrected root mean square of the residuals is 0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 0.23
The root mean square of the residuals is 0.07
The df corrected root mean square of the residuals is 0.08

Measures of factor score adequacy
                                                  g    F1    F2    F3
Correlation of scores with factors             0.86  0.47  0.57  0.64
Multiple R square of scores with factors       0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates  0.47 -0.56 -0.35 -0.18

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
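A population matrix of this kind can be written out directly as Lambda Phi Lambda' with a unit diagonal. The base R sketch below is one way to do so; the loadings and latent correlations are not given in this section and have been inferred from the printed correlations and reliabilities, so they should be read as assumptions, as should every object name.

```r
# Loadings inferred from the printed matrix (assumptions, not the original fx and fy)
vars <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")
Lambda <- matrix(0, 8, 3, dimnames = list(vars, c("X1", "X2", "Y")))
Lambda[c("V", "Q", "A"), "X1"]      <- c(0.9, 0.8, 0.6)
Lambda[c("A", "nach", "Anx"), "X2"] <- c(0.6, 0.8, -0.7)
Lambda[c("gpa", "Pre", "MA"), "Y"]  <- c(0.6, 0.5, 0.4)

# Assumed latent correlations: X1 and X2 uncorrelated, each correlating 0.7 with Y
Phi <- matrix(c(1.0, 0.0, 0.7,
                0.0, 1.0, 0.7,
                0.7, 0.7, 1.0), 3, 3,
              dimnames = list(colnames(Lambda), colnames(Lambda)))

# Model implied correlations; the diagonal before replacement holds the communalities
R <- Lambda %*% Phi %*% t(Lambda)
reliability <- diag(R)
diag(R) <- 1

round(R, 2)
round(reliability, 2)
```

Under these assumptions the correlation between an ability variable and an outcome is the product of the two loadings and the latent correlation, for example 0.9 x 0.7 x 0.6 = 0.378 for V and gpa.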

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method = minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are 56 and the empirical chi square function was 6930.29
The degrees of freedom for the model are 7 and the empirical chi square function was 21.83
 with prob < 0.0027

The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.04
 with the empirical chi square 21.83 with prob < 0.0027
The total number of observations was 1000 with fitted Chi Square = 2175.06 with prob < 0

Empirical BIC = -26.53
ESABIC = -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).


Figure 12: An example of an Exploratory Structural Equation Model.


5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.
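The two models differ only in how many loadings are free, which determines their degrees of freedom. A short sketch of that count follows; the helper name model_df is hypothetical, and the count assumes the factor variance is fixed at 1, as in the models used in this section.

```r
# Degrees of freedom of a covariance structure model:
# number of distinct variances and covariances minus number of free parameters
model_df <- function(n_vars, n_free) {
  n_vars * (n_vars + 1) / 2 - n_free
}

# Tau equivalent: one common loading plus four error variances
df_tau <- model_df(4, 1 + 4)
# Congeneric: four separate loadings plus four error variances
df_cong <- model_df(4, 4 + 4)

c(tau = df_tau, congeneric = df_cong)   # 5 and 2
```

The tau equivalent model is nested within the congeneric model, so the two can be compared with a likelihood ratio test on the difference of 3 degrees of freedom.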

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)
 Model Chisquare = 6.593496   Df = 5   Pr(>Chisq) = 0.2526696
 AIC = 16.5935
 BIC = -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations = 11

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)
 Model Chisquare = 2.941678   Df = 2   Pr(>Chisq) = 0.2297327
 AIC = 18.94168
 BIC = -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations = 12
> anova(sem.cong,sem.tau)  #test the difference between the two models
LR Test for Difference Between Models
         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

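The anova comparison of the congeneric versus tau equivalent model shows that, for this sample of 100, the change in χ2 (3.65) is not significant given the change in degrees of freedom (3), p = 0.30. Although the tau equivalent model is false in the population, it cannot be rejected with these data.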
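The likelihood ratio test reported by anova can be reproduced by hand from the two model χ2 values printed above. This is a minimal sketch with illustrative variable names; it takes the fit statistics from the printed output rather than refitting the models.

```r
# Fit statistics copied from the two summaries above
chisq_tau  <- 6.5935; df_tau  <- 5   # tau equivalent model (more constrained)
chisq_cong <- 2.9417; df_cong <- 2   # congeneric model

# Difference in chi square, referred to the difference in degrees of freedom
lr_chisq <- chisq_tau - chisq_cong
lr_df    <- df_tau - df_cong
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

round(c(LR = lr_chisq, df = lr_df, p = p_value), 4)
```

The result agrees with the LR Chisq of 3.6518 and the probability of 0.3016 shown in the anova table.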

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)
 Model Chisquare = 21.16848   Df = 27   Pr(>Chisq) = 0.778334
 AIC = 57.16848
 BIC = -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations = 11

> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1
 Model Chisquare = 21.16848   Df = 27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations = 11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)
 Model Chisquare = 53.86635   Df = 27   Pr(>Chisq) = 0.001579738
 AIC = 89.86635
 BIC = -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations = 24

> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
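Why the between group residuals are so large can be seen by comparing the correlations implied by correlated and by orthogonal factors. The base R sketch below assumes the loadings and factor correlations reported for the oblique three factor solution earlier in this document; the helper implied and the other names are illustrative, and no model is actually fitted.

```r
# Assumed loadings and factor correlations (taken from the oblique solution shown earlier)
Fload <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
Fload[1:3, 1] <- c(0.8, 0.7, 0.6)
Fload[4:6, 2] <- c(0.7, 0.6, 0.5)
Fload[7:9, 3] <- c(0.7, 0.6, 0.4)
Phi <- matrix(c(1.00, 0.72, 0.63,
                0.72, 1.00, 0.56,
                0.63, 0.56, 1.00), 3, 3)

# Model implied correlation matrix for loadings L and factor correlations P
implied <- function(L, P) {
  R <- L %*% P %*% t(L)
  diag(R) <- 1
  R
}

R_true <- implied(Fload, Phi)       # correlated factors
R_orth <- implied(Fload, diag(3))   # same loadings, factors forced to be orthogonal
res    <- R_true - R_orth           # what the orthogonal model cannot explain

round(res, 2)
```

An orthogonal model implies zero correlation between variables in different groups, so every between group correlation (for example 0.8 x 0.72 x 0.7 = 0.40 for V1 and V4) is left entirely in the residuals.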

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)  #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)  #create the sem model
> mod.f3  #show it
      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.


Figure 13: A three factor, oblique solution.


5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)
 Model Chisquare = 24.2163   Df = 18   Pr(>Chisq) = 0.1480685
 AIC = 78.2163
 BIC = -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07


Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.


e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations = 72

> stdCoef(sem.bi,digits=2)
                      Std. Estimate
1  Sentences          0.7678671 Sentences <--- g
2  Vocabulary         0.7909246 Vocabulary <--- g
3  SentCompletion     0.7536211 SentCompletion <--- g
4  FirstLetters       0.6083814 FirstLetters <--- g
5  FourLetterWords    0.5973349 FourLetterWords <--- g
6  Suffixes           0.5717900 Suffixes <--- g
7  LetterSeries       0.5668950 LetterSeries <--- g
8  Pedigrees          0.6623317 Pedigrees <--- g
9  LetterGroup        0.5299523 LetterGroup <--- g
10 F1Sentences        0.4878697 Sentences <--- F1
11 F1Vocabulary       0.4523233 Vocabulary <--- F1
12 F1SentCompletion   0.4044507 SentCompletion <--- F1
13 F2FirstLetters     0.6140531 FirstLetters <--- F2
14 F2FourLetterWords  0.5058063 FourLetterWords <--- F2
15 F2Suffixes         0.3943206 Suffixes <--- F2
16 F3LetterSeries     0.7272957 LetterSeries <--- F3
17 F3Pedigrees        0.2468418 Pedigrees <--- F3
18 F3LetterGroup      0.4091494 LetterGroup <--- F3
19 e1                 0.1723633 Sentences <--> Sentences
20 e2                 0.1698418 Vocabulary <--> Vocabulary
21 e3                 0.2684748 SentCompletion <--> SentCompletion
22 e4                 0.2528108 FirstLetters <--> FirstLetters
23 e5                 0.3873510 FourLetterWords <--> FourLetterWords
24 e6                 0.5175675 Suffixes <--> Suffixes
25 e7                 0.1496710 LetterSeries <--> LetterSeries
26 e8                 0.5003859 Pedigrees <--> Pedigrees
27 e9                 0.5517473 LetterGroup <--> LetterGroup
28                    1.0000000 F1 <--> F1
29                    1.0000000 F2 <--> F2
30                    1.0000000 F3 <--> F3
31                    1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)
 Model Chisquare = 38.1963   Df = 24   Pr(>Chisq) = 0.03310059
 AIC = 80.1963
 BIC = -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12


Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.


F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations = 54

> stdCoef(semhi,digits=2)
                     Std. Estimate
1                gF1     0.8220754 F1 <--- g
2                gF2     0.7817998 F2 <--- g
3                gF3     0.8150140 F3 <--- g
4        F1Sentences     0.9047111 Sentences <--- F1
5       F1Vocabulary     0.9138214 Vocabulary <--- F1
6   F1SentCompletion     0.8560764 SentCompletion <--- F1
7     F2FirstLetters     0.8357617 FirstLetters <--- F2
8  F2FourLetterWords     0.7971819 FourLetterWords <--- F2
9         F2Suffixes     0.7025560 Suffixes <--- F2
10    F3LetterSeries     0.7808129 LetterSeries <--- F3
11       F3Pedigrees     0.7201599 Pedigrees <--- F3
12     F3LetterGroup     0.7034902 LetterGroup <--- F3
13                e1     0.1814979 Sentences <--> Sentences
14                e2     0.1649304 Vocabulary <--> Vocabulary
15                e3     0.2671331 SentCompletion <--> SentCompletion
16                e4     0.3015024 FirstLetters <--> FirstLetters
17                e5     0.3645010 FourLetterWords <--> FourLetterWords
18                e6     0.5064151 Suffixes <--> Suffixes
19                e7     0.3903313 LetterSeries <--> LetterSeries
20                e8     0.4813697 Pedigrees <--> Pedigrees
21                e9     0.5051016 LetterGroup <--> LetterGroup
22                       0.3241920 F1 <--> F1
23                       0.3887891 F2 <--> F2
24                       0.3357521 F3 <--> F3
25                       1.0000000 g <--> g

> anova(semhi,sembi)
LR Test for Difference Between Models

      Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semhi       24      38.196
sembi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
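
The likelihood ratio comparison reported by anova can be checked by hand: the test statistic is the difference between the two model chi squares, referred to a chi square distribution whose degrees of freedom are the difference in model degrees of freedom. The following is a minimal sketch in base R using the rounded values from the anova output; the variable names are illustrative and are not part of psych or sem.

```r
# Model chi squares and degrees of freedom, copied from the anova output.
chisq_hi <- 38.196; df_hi <- 24   # hierarchical model
chisq_bi <- 24.216; df_bi <- 18   # bifactor model

lr_chisq <- chisq_hi - chisq_bi   # likelihood ratio statistic
lr_df    <- df_hi - df_bi         # extra parameters in the bifactor model
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

round(c(LR = lr_chisq, df = lr_df, p = p_value), 4)
```

Because the inputs are rounded, the result should agree with the bifactor line of the anova table only up to rounding.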

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> omholzbi <- omega(Holzinger,4)
> semholzbi <- sem(omholzbi$model$sem,Holzinger,355)
> omholzhi <- omega(Holzinger,4,sl=FALSE)
> semholzhi <- sem(omholzhi$model$sem,Holzinger,355)
> anova(semholzbi,semholzhi)
LR Test for Difference Between Models

          Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semholzbi       63      147.66
semholzhi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> omsem <- omegaSem(Thurstone,n.obs=213)
Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g  F1*  F2*  F3*   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g  F1*  F2*  F3*
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g  F1*  F2*  F3*
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the lavaan package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total from a confirmatory model using sem =  0.93
With loadings of
                   g  F1*  F2*  F3*   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g  F1*  F2*  F3*
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g   F1*  F2*  F3*
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
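
The confirmatory coefficients can be recovered, at least approximately, from the confirmatory loadings reported by omegaSem. In this sketch ωh is taken as the squared sum of the general factor loadings divided by the total variance, and ωt as the general plus group variance divided by the total variance. Taking the total variance to be the model implied one (general + group + unique) is an assumption of this illustration; psych may work from the observed correlation matrix, so small differences are possible. All object names are illustrative.

```r
# Loadings and uniquenesses copied (rounded) from the confirmatory omegaSem output,
# in the order Sentences ... LetterGroup.
g   <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)  # general factor
grp <- c(0.49, 0.45, 0.40, 0.61, 0.50, 0.39, 0.73, 0.25, 0.41)  # own group factor
u2  <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)  # unique variances
factor_of <- rep(1:3, each = 3)       # which group factor each variable belongs to

general_var <- sum(g)^2                               # variance due to g
group_var   <- sum(tapply(grp, factor_of, sum)^2)     # variance due to group factors
total_var   <- general_var + group_var + sum(u2)      # model implied total variance

omega_h <- general_var / total_var
omega_t <- (general_var + group_var) / total_var
round(c(omega_h = omega_h, omega_t = omega_t), 2)
```

With these rounded inputs the sketch gives values close to the 0.79 and 0.93 reported for the confirmatory model.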

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


> circ <- sim.circ(16)
> f2 <- fa(circ,2)
> plot(f2,title="16 simulated variables in a circumplex pattern")

Figure 3: Emotion items or interpersonal items frequently show a circumplex structure. Data generated by sim.circ and factor loadings found by the principal axis algorithm using factor.pa. (Plot of the 16 variables on the MR1 and MR2 axes.)

     V1   V2   V3   V4
V1 1.00 0.72 0.63 0.54
V2 0.72 1.00 0.56 0.48
V3 0.63 0.56 1.00 0.42
V4 0.54 0.48 0.42 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.36

2.4.2 fx is a matrix implies an independent factors model

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> threefact <- sim.structure(fx)
> threefact
Call: sim.structure(fx = fx)

$model (Population correlation matrix)
     V1   V2   V3   V4   V5   V6   V7  V8   V9
V1 1.00 0.72 0.63 0.00 0.00 0.00 0.00 0.0 0.00
V2 0.72 1.00 0.56 0.00 0.00 0.00 0.00 0.0 0.00
V3 0.63 0.56 1.00 0.00 0.00 0.00 0.00 0.0 0.00
V4 0.00 0.00 0.00 1.00 0.42 0.35 0.00 0.0 0.00
V5 0.00 0.00 0.00 0.42 1.00 0.30 0.00 0.0 0.00
V6 0.00 0.00 0.00 0.35 0.30 1.00 0.00 0.0 0.00
V7 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.3 0.24
V8 0.00 0.00 0.00 0.00 0.00 0.00 0.30 1.0 0.20
V9 0.00 0.00 0.00 0.00 0.00 0.00 0.24 0.2 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model

> Phi = matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol=3)
> corf3 <- sim.structure(fx,Phi)
> fx
      [,1] [,2] [,3]
 [1,]  0.9  0.0  0.0
 [2,]  0.8  0.0  0.0
 [3,]  0.7  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.6
 [8,]  0.0  0.0  0.5
 [9,]  0.0  0.0  0.4
> Phi
     [,1] [,2] [,3]
[1,]  1.0  0.5  0.3
[2,]  0.5  1.0  0.2
[3,]  0.3  0.2  1.0
> corf3

Figure 4: Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram.

Call: sim.structure(fx = fx, Phi = Phi)

$model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16
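
The population matrix above follows from the factor model itself: the common part of the correlations is fx Phi fx', and the diagonal is then set to 1. The sketch below reproduces this in base R without calling psych; it is meant only to show the algebra, and the object names (common, communality, R) are illustrative. Identifying the printed $reliability values with the diagonal of fx Phi fx' is an inference from the numbers shown, not a statement about how sim.structure is coded.

```r
# Factor loadings and factor intercorrelations as defined in the text.
fx  <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol = 3)
Phi <- matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol = 3)

common      <- fx %*% Phi %*% t(fx)   # variance and covariance due to the factors
communality <- diag(common)           # matches the printed $reliability values
R           <- common
diag(R)     <- 1                      # correlations have unit diagonal
dimnames(R) <- list(paste0("V", 1:9), paste0("V", 1:9))

round(R, 3)
round(communality, 2)
```

For example, the V1 with V4 entry is .9 x .5 x .7 = .315, the product of the two loadings and the correlation between their factors.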

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs  #show the matrix
      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"
> Phis  #show this one as well
   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.
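
To make the idea of a symbolic loading matrix concrete, here is a small base R sketch that builds a matrix of the same shape as fxs. The helper symbolic_loadings is hypothetical and written only for this illustration; it is not the psych implementation of structure.list and covers only the simple case of one letter per factor.

```r
# Hypothetical helper: build a character matrix of symbolic loadings.
# nvars  = number of observed variables
# groups = named list giving the variables that load on each factor
symbolic_loadings <- function(nvars, groups) {
  m <- matrix("0", nrow = nvars, ncol = length(groups),
              dimnames = list(NULL, names(groups)))
  for (j in seq_along(groups)) {
    # label each path with the factor's letter and the variable number
    m[groups[[j]], j] <- paste0(letters[j], groups[[j]])
  }
  m
}

fxs_sketch <- symbolic_loadings(9, list(F1 = 1:3, F2 = 4:6, F3 = 7:9))
fxs_sketch
```

Each nonzero cell names a free path (a1, b4, c7, and so on), while "0" marks a loading fixed at zero, which is the information a path diagram needs.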

> plot.new()
> corf3mod <- structure.diagram(fxs,Phis)

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input.

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary “cut” value of .2.

> f3p <- Promax(fa(corf3$model,3))
> plot.new()
> modf3p <- structure.diagram(f3p,cut=.2)

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y sets.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <-rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)
     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> himod <-structure.diagram(fxh,Phi, fy)

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)
Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   MR1
Correlation of (regression) scores with factors   0.94
Multiple R square of scores with factors          0.88
Minimum correlation of possible factor scores     0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
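
The difference can be seen with nothing more than an eigenvalue decomposition. The sketch below assumes that cong1 was generated from loadings of .9, .8, .7 and .6 (as the fa output suggests) and computes the first principal component loadings as the first eigenvector scaled by the square root of its eigenvalue. This is base R only and is offered as an illustration of the idea, not as the code used by principal.

```r
lambda <- c(0.9, 0.8, 0.7, 0.6)       # assumed congeneric loadings
R <- lambda %o% lambda                # model implied correlations
diag(R) <- 1                          # unit diagonal, as in a correlation matrix

e   <- eigen(R)
pc1 <- abs(e$vectors[, 1]) * sqrt(e$values[1])   # first component loadings

round(rbind(model = lambda, component = pc1), 2)
round(e$values[1], 2)                 # compare with "SS loadings" for PC1
```

The component loadings differ from the generating loadings because components analyze the full correlation matrix, including the unit diagonal, whereas factor analysis models only the common variance.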

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ² value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3
Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38

> ml3
Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
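
A congruence coefficient of this kind is the cosine between two columns of loadings: the sum of cross products divided by the square root of the product of the two sums of squares. The sketch below applies that formula to the (rounded) RC1 and PA1 loadings printed earlier. The function congruence is written here for illustration and is not the psych function.

```r
# Cosine between two loading vectors (illustrative helper, not psych code).
congruence <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Loadings on RC1 (principal components) and PA1 (principal axes), V1 to V9.
rc1 <- c(0.75, 0.76, 0.78, 0.29, 0.20, 0.07, 0.26, 0.20, 0.00)
pa1 <- c(0.80, 0.70, 0.60, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00)

round(congruence(rc1, pa1), 2)
```

Because the inputs are rounded to two decimals, the result agrees with the RC1 by PA1 entry of the table only to about that precision.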

By default, all three of these procedures use the varimax rotation criterion. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measurement of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.
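
For a higher order model with simple structure, the Schmid-Leiman transformation has a simple form: each variable's general factor loading is its first order loading times the loading of its group factor on g, and its residualized group loading is the first order loading times the square root of one minus that squared second order loading. The base R sketch below applies this to the nine variable bifact example, taking the first order loadings (.8, .7, .6, .7, .6, .5, .7, .6, .4) and second order loadings (.9, .8, .7) from the hierarchical solution shown in this section. It is an illustration of the algebra, not the code used by omega, and the object names are illustrative.

```r
first_order <- c(.8, .7, .6, .7, .6, .5, .7, .6, .4)  # variable on group factor
gamma       <- rep(c(.9, .8, .7), each = 3)           # group factor on g

g_loading     <- first_order * gamma                  # general factor loadings
group_loading <- first_order * sqrt(1 - gamma^2)      # residualized group loadings
p2 <- g_loading^2 / (g_loading^2 + group_loading^2)   # share of common variance due to g

sl <- round(cbind(g = g_loading, group = group_loading, p2 = p2), 2)
rownames(sl) <- paste0("V", 1:9)
sl
```

The resulting values can be compared with the g, group factor, and p2 columns of the Schmid Leiman loadings reported by omega for the bifact data.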

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> ombi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem.

> omhi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem.

> ombi
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85
Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g  F1*  F2*  F3*   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g  F1*  F2*  F3*
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0
The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23
The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g   F1*   F2*   F3*
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found, and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
         V     Q     A  nach   Anx   gpa   Pre    MA
V     1.00  0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q     0.72  1.00  0.48  0.00  0.00  0.34  0.28  0.22
A     0.54  0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach  0.00  0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx   0.00  0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa   0.38  0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre   0.32  0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA    0.25  0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method = minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are 56 and the empirical chi square function was 6930.29
The degrees of freedom for the model are 7 and the empirical chi square function was 21.83
with prob < 0.0027

The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.04
with the empirical chi square 21.83 with prob < 0.0027
The total number of observations was 1000 with fitted Chi Square = 2175.06 with prob < 0

Empirical BIC = -26.53
ESABIC = -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.
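The difference between the two models comes down to how many distinct loading parameters are estimated. The following base R sketch counts degrees of freedom for a one factor model, assuming the factor variance is fixed at 1 and each variable has its own error variance. The helper name one_factor_df is hypothetical and is not part of psych or sem.

```r
# Count degrees of freedom for a one factor model of p observed variables.
# one_factor_df is an illustrative helper, not a psych or sem function.
# n_loadings is the number of *distinct* free loading parameters.
one_factor_df <- function(p, n_loadings) {
  n_moments <- p * (p + 1) / 2   # unique variances and covariances
  n_free    <- n_loadings + p    # loadings plus p error variances
  n_moments - n_free
}

df_tau  <- one_factor_df(p = 4, n_loadings = 1)  # one shared loading "a"
df_cong <- one_factor_df(p = 4, n_loadings = 4)  # loadings "a", "b", "c", "d"
c(tau = df_tau, congeneric = df_cong)
```

With four variables this gives 5 degrees of freedom for the tau equivalent model and 2 for the congeneric model; the same count for nine variables with nine free loadings gives 27.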

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau #show it

      Path     Parameter Value
 [1,] "X1->V1" "a"       NA
 [2,] "X1->V2" "a"       NA
 [3,] "X1->V3" "a"       NA
 [4,] "X1->V4" "a"       NA
 [5,] "V1<->V1" "x1e"    NA
 [6,] "V2<->V2" "x2e"    NA
 [7,] "V3<->V3" "x3e"    NA
 [8,] "V4<->V4" "x4e"    NA
 [9,] "X1<->X1" NA       "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare = 6.593496 Df = 5 Pr(>Chisq) = 0.2526696
 AIC = 16.5935
 BIC = -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations = 11

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong #show the model

      Path     Parameter Value
 [1,] "X1->V1" "a"       NA
 [2,] "X1->V2" "b"       NA
 [3,] "X1->V3" "c"       NA
 [4,] "X1->V4" "d"       NA
 [5,] "V1<->V1" "x1e"    NA
 [6,] "V2<->V2" "x2e"    NA
 [7,] "V3<->V3" "x3e"    NA
 [8,] "V4<->V4" "x4e"    NA
 [9,] "X1<->X1" NA       "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare = 2.941678 Df = 2 Pr(>Chisq) = 0.2297327
 AIC = 18.94168
 BIC = -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations = 12

> anova(sem.cong,sem.tau) #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

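The anova comparison of the congeneric versus tau equivalent model shows that the change in χ² (3.65 with 3 degrees of freedom, p = 0.30) is not significant. With only 100 subjects, the tau equivalent model cannot be rejected in favor of the congeneric model, even though it is known to be false.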
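The likelihood ratio test reported by anova can be checked by hand. The sketch below uses only base R and the χ² and Df values printed in the two summaries above; the variable names are illustrative.

```r
# Chi-square difference (likelihood ratio) test for two nested models,
# using the values printed by summary() for the tau equivalent and
# congeneric models. Variable names are illustrative.
chisq_tau  <- 6.593496; df_tau  <- 5
chisq_cong <- 2.941678; df_cong <- 2

lr_chisq <- chisq_tau - chisq_cong   # change in chi-square
lr_df    <- df_tau - df_cong         # change in degrees of freedom
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

round(c(LR = lr_chisq, df = lr_df, p = p_value), 4)
```

The result should agree with the LR Chisq and Pr(>Chisq) columns of the anova table to the precision shown there.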

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare = 21.16848 Df = 27 Pr(>Chisq) = 0.778334
 AIC = 57.16848
 BIC = -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations = 11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare = 21.16848 Df = 27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations = 11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare = 53.86635 Df = 27 Pr(>Chisq) = 0.001579738
 AIC = 89.86635
 BIC = -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations = 24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
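The pattern in the residual matrix follows directly from the orthogonality constraint. As a minimal base R sketch, assuming the rounded loadings from the parameter estimates above (0.8, 0.7, 0.6; 0.7, 0.6, 0.5; 0.7, 0.6, 0.4) and uncorrelated factors, the model-implied correlation between any two variables from different groups is exactly zero. The factor labels F1 to F3 are illustrative; the sem output calls the factors MR1, MR3, and MR2.

```r
# Model-implied correlations for three orthogonal factors, each
# measured by three variables. Loadings are rounded values from the
# sem output; factor labels F1..F3 are illustrative.
loadings <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
group    <- rep(1:3, each = 3)   # V1-V3, V4-V6, V7-V9

Lambda <- matrix(0, nrow = 9, ncol = 3,
                 dimnames = list(paste0("V", 1:9), paste0("F", 1:3)))
Lambda[cbind(1:9, group)] <- loadings

implied <- Lambda %*% t(Lambda)  # factors uncorrelated, so Phi = I
diag(implied) <- 1               # error variances fill the diagonal
round(implied, 2)
```

Because every between-group entry of the implied matrix is zero, the between-group residuals are simply the observed correlations, which is why they are so large.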

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3) #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3) #create the sem model
> mod.f3 #show it

      Path       Parameter Value
 [1,] "MR1->V1"  "F1V1"    NA
 [2,] "MR1->V2"  "F1V2"    NA
 [3,] "MR1->V3"  "F1V3"    NA
 [4,] "MR3->V4"  "F2V4"    NA
 [5,] "MR3->V5"  "F2V5"    NA
 [6,] "MR3->V6"  "F2V6"    NA
 [7,] "MR2->V7"  "F3V7"    NA
 [8,] "MR2->V8"  "F3V8"    NA
 [9,] "MR2->V9"  "F3V9"    NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"  NA
[20,] "MR2<->MR1" "rF3F1"  NA
[21,] "MR2<->MR3" "rF3F2"  NA
[22,] "MR1<->MR1" NA       "1"
[23,] "MR3<->MR3" NA       "1"
[24,] "MR2<->MR2" NA       "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph (Figure 13).

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213) #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare = 24.2163 Df = 18 Pr(>Chisq) = 0.1480685
 AIC = 78.2163
 BIC = -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations = 72

> stdCoef(sem.bi,digits=2)

                      Std. Estimate
1  Sentences          0.7678671 Sentences <--- g
2  Vocabulary         0.7909246 Vocabulary <--- g
3  SentCompletion     0.7536211 SentCompletion <--- g
4  FirstLetters       0.6083814 FirstLetters <--- g
5  FourLetterWords    0.5973349 FourLetterWords <--- g
6  Suffixes           0.5717900 Suffixes <--- g
7  LetterSeries       0.5668950 LetterSeries <--- g
8  Pedigrees          0.6623317 Pedigrees <--- g
9  LetterGroup        0.5299523 LetterGroup <--- g
10 F1Sentences        0.4878697 Sentences <--- F1
11 F1Vocabulary       0.4523233 Vocabulary <--- F1
12 F1SentCompletion   0.4044507 SentCompletion <--- F1
13 F2FirstLetters     0.6140531 FirstLetters <--- F2
14 F2FourLetterWords  0.5058063 FourLetterWords <--- F2
15 F2Suffixes         0.3943206 Suffixes <--- F2
16 F3LetterSeries     0.7272957 LetterSeries <--- F3
17 F3Pedigrees        0.2468418 Pedigrees <--- F3
18 F3LetterGroup      0.4091494 LetterGroup <--- F3
19 e1                 0.1723633 Sentences <--> Sentences
20 e2                 0.1698418 Vocabulary <--> Vocabulary
21 e3                 0.2684748 SentCompletion <--> SentCompletion
22 e4                 0.2528108 FirstLetters <--> FirstLetters
23 e5                 0.3873510 FourLetterWords <--> FourLetterWords
24 e6                 0.5175675 Suffixes <--> Suffixes
25 e7                 0.1496710 LetterSeries <--> LetterSeries
26 e8                 0.5003859 Pedigrees <--> Pedigrees
27 e9                 0.5517473 LetterGroup <--> LetterGroup
28                    1.0000000 F1 <--> F1
29                    1.0000000 F2 <--> F2
30                    1.0000000 F3 <--> F3
31                    1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare = 38.1963 Df = 24 Pr(>Chisq) = 0.03310059
 AIC = 80.1963
 BIC = -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations = 54

> stdCoef(sem.hi,digits=2)

                      Std. Estimate
1  gF1                0.8220754 F1 <--- g
2  gF2                0.7817998 F2 <--- g
3  gF3                0.8150140 F3 <--- g
4  F1Sentences        0.9047111 Sentences <--- F1
5  F1Vocabulary       0.9138214 Vocabulary <--- F1
6  F1SentCompletion   0.8560764 SentCompletion <--- F1
7  F2FirstLetters     0.8357617 FirstLetters <--- F2
8  F2FourLetterWords  0.7971819 FourLetterWords <--- F2
9  F2Suffixes         0.7025560 Suffixes <--- F2
10 F3LetterSeries     0.7808129 LetterSeries <--- F3
11 F3Pedigrees        0.7201599 Pedigrees <--- F3
12 F3LetterGroup      0.7034902 LetterGroup <--- F3
13 e1                 0.1814979 Sentences <--> Sentences
14 e2                 0.1649304 Vocabulary <--> Vocabulary
15 e3                 0.2671331 SentCompletion <--> SentCompletion
16 e4                 0.3015024 FirstLetters <--> FirstLetters
17 e5                 0.3645010 FourLetterWords <--> FourLetterWords
18 e6                 0.5064151 Suffixes <--> Suffixes
19 e7                 0.3903313 LetterSeries <--> LetterSeries
20 e8                 0.4813697 Pedigrees <--> Pedigrees
21 e9                 0.5051016 LetterGroup <--> LetterGroup
22                    0.3241920 F1 <--> F1
23                    0.3887891 F2 <--> F2
24                    0.3357521 F3 <--> F3
25                    1.0000000 g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
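The AIC and BIC values printed in the two summaries can be reproduced from the χ² values. The sketch below assumes AIC = χ² + 2q (q free parameters) and BIC = χ² − df·ln(N); these formulas are inferred from the printed numbers, not taken from the sem documentation, and the helper name fit_indices is hypothetical.

```r
# Reproduce the AIC and BIC printed by summary() for the bifactor and
# hierarchical models. The formulas are an inference from the printed
# output; fit_indices is an illustrative helper, not a sem function.
fit_indices <- function(chisq, df, n_par, n_obs) {
  c(AIC = chisq + 2 * n_par,
    BIC = chisq - df * log(n_obs))
}

# 9 variables give 45 unique moments, so n_par = 45 - df.
bifactor     <- fit_indices(24.2163, df = 18, n_par = 27, n_obs = 213)
hierarchical <- fit_indices(38.1963, df = 24, n_par = 21, n_obs = 213)

round(rbind(bifactor, hierarchical), 4)
```

Under these assumptions the bifactor model has the lower AIC while the hierarchical model has the lower BIC, so the preference between them depends on how heavily the six extra parameters are penalized.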

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of Holzinger and the 16 variables of Reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem combines both steps: it first runs an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than 0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max 3.73 max/min = 1.34
mean percent general = 0.6 with sd = 0.05 and cv of 0.09
Explained Common Variance of the general factor = 0.6

The degrees of freedom are 12 and the fit is 0.01
The number of observations was 213 with Chi Square = 2.98 with prob < 1
The root mean square of the residuals is 0.01
The df corrected root mean square of the residuals is 0.01
RMSEA index = 0 and the 10 % confidence intervals are 0 0
BIC = -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 1.48
The number of observations was 213 with Chi Square = 306.71 with prob < 3.3e-49
The root mean square of the residuals is 0.14
The df corrected root mean square of the residuals is 0.16
RMSEA index = 0.22 and the 10 % confidence intervals are 0.199 0.244
BIC = 161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

The following analyses were done using the lavaan package

Omega Hierarchical from a confirmatory model using sem = 0.79
Omega Total from a confirmatory model using sem = 0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are 18 and the fit is 24.33052 with p = 0.1444947
general/max 4.92 max/min = 1.3
mean percent general = 0.65 with sd = 0.15 and cv of 0.23
Explained Common Variance of the general factor = 0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
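The confirmatory ωh and ωt can be approximated by hand from the printed CFA loadings. This is a minimal base R sketch, assuming orthogonal unit-variance factors and using the model-implied (rather than observed) variance of the total score; omega_from_loadings is a hypothetical helper, and the rounded two-decimal loadings make the result approximate.

```r
# Approximate omega hierarchical and omega total from a bifactor
# loading matrix. Assumes orthogonal factors with unit variance and
# uses the model-implied total variance. omega_from_loadings is an
# illustrative helper, not a psych function.
omega_from_loadings <- function(L, u2) {
  sums <- colSums(L)                  # summed loadings per factor
  total_var <- sum(sums^2) + sum(u2)  # implied variance of the total score
  c(omega_h = unname(sums[1]^2 / total_var),
    omega_t = (total_var - sum(u2)) / total_var)
}

# Rounded CFA loadings, in the order the variables are printed above.
g   <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
grp <- c(0.49, 0.45, 0.40, 0.61, 0.50, 0.39, 0.73, 0.25, 0.41)
u2  <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)

L <- cbind(g = g, F1 = 0, F2 = 0, F3 = 0)
L[cbind(1:9, rep(2:4, each = 3))] <- grp  # three variables per group factor

cfa <- omega_from_loadings(L, u2)
round(cfa, 2)
```

The general factor's share is the squared sum of the g loadings divided by the total variance, so anything that moves shared variance from the group factors onto g raises ωh while leaving ωt almost unchanged.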

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11 psychTools_2.0.8 psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago Press, Chicago, Ill.


     V1   V2   V3   V4
V1 1.00 0.72 0.63 0.54
V2 0.72 1.00 0.56 0.48
V3 0.63 0.56 1.00 0.42
V4 0.54 0.48 0.42 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.36

2.4.2 fx is a matrix implies an independent factors model

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4),ncol=3)
> threefact <- sim.structure(fx)
> threefact

Call: sim.structure(fx = fx)

$model (Population correlation matrix)
     V1   V2   V3   V4   V5   V6   V7  V8   V9
V1 1.00 0.72 0.63 0.00 0.00 0.00 0.00 0.0 0.00
V2 0.72 1.00 0.56 0.00 0.00 0.00 0.00 0.0 0.00
V3 0.63 0.56 1.00 0.00 0.00 0.00 0.00 0.0 0.00
V4 0.00 0.00 0.00 1.00 0.42 0.35 0.00 0.0 0.00
V5 0.00 0.00 0.00 0.42 1.00 0.30 0.00 0.0 0.00
V6 0.00 0.00 0.00 0.35 0.30 1.00 0.00 0.0 0.00
V7 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.3 0.24
V8 0.00 0.00 0.00 0.00 0.00 0.00 0.30 1.0 0.20
V9 0.00 0.00 0.00 0.00 0.00 0.00 0.24 0.2 1.00

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

2.4.3 fx is a matrix and Phi ≠ I is a correlated factors model

> Phi = matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol=3)
> corf3 <- sim.structure(fx,Phi)
> fx

      [,1] [,2] [,3]
 [1,]  0.9  0.0  0.0
 [2,]  0.8  0.0  0.0
 [3,]  0.7  0.0  0.0
 [4,]  0.0  0.7  0.0
 [5,]  0.0  0.6  0.0
 [6,]  0.0  0.5  0.0
 [7,]  0.0  0.0  0.6
 [8,]  0.0  0.0  0.5
 [9,]  0.0  0.0  0.4

> Phi

     [,1] [,2] [,3]
[1,]  1.0  0.5  0.3
[2,]  0.5  1.0  0.2
[3,]  0.3  0.2  1.0

> corf3

Figure 4: Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram.

Call: sim.structure(fx = fx, Phi = Phi)

$model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16
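The population matrix for the correlated factors model can be reproduced with a few lines of matrix algebra. The following is a minimal sketch in base R, not the sim.structure source code: it assumes that the common part of the correlations is F Phi F', that the reliabilities are the diagonal of that product, and that the diagonal of the correlation matrix is then set to 1. The names common, reliability and R are introduced here for illustration only.

```r
# Loadings and factor correlations copied from the example above
fx  <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol = 3)
Phi <- matrix(c(1,.5,.3,.5,1,.2,.3,.2,1), ncol = 3)

# Common-factor part of the correlations: F Phi F'
common <- fx %*% Phi %*% t(fx)

# Population reliabilities (communalities) are the diagonal of the common part
reliability <- diag(common)

# The population correlation matrix has 1s on the diagonal
R <- common
diag(R) <- 1
dimnames(R) <- list(paste0("V", 1:9), paste0("V", 1:9))

round(R, 3)
round(reliability, 2)
```

For example, the correlation of V1 with V4 is 0.9 x 0.5 x 0.7 = 0.315, the product of the two loadings and the correlation between their factors. Setting Phi to an identity matrix gives the independent factors model of the previous section.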

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs  #show the matrix

      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"

> Phis  #show this one as well

   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.

> plot.new()
> corf3mod <- structure.diagram(fxs,Phis)

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input.

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3p <- Promax(fa(corf3$model,3))
> plot.new()
> modf3p <- structure.diagram(f3p,cut=.2)

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y set.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4),ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1),ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <- rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)

     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00
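The twelve variable matrix above can be checked by hand once the x and y loadings are stacked into a single block-diagonal loading matrix. The sketch below uses base R only and is an illustration under that assumption, not the internals of sim.structure. The names f and R are hypothetical.

```r
# Values copied from the example above
fx  <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol = 3)
fy  <- c(.6,.5,.4)
Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol = 4)

# Stack the loadings block-diagonally:
# nine x variables on factors 1-3, three y variables on factor 4
f <- matrix(0, nrow = 12, ncol = 4)
f[1:9, 1:3] <- fx
f[10:12, 4] <- fy

# Model-implied correlations: F Phi F' with a unit diagonal
R <- f %*% Phi %*% t(f)
diag(R) <- 1
dimnames(R) <- list(c(paste0("x", 1:9), paste0("y", 1:3)),
                    c(paste0("x", 1:9), paste0("y", 1:3)))

round(R, 2)
```

The fourth row and column of Phi carry the correlations of the three X factors with the Y factor, so the correlation of x1 with y1 is 0.9 x 0.4 x 0.6 = 0.216, which prints as 0.22.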

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> himod <- structure.diagram(fxh,Phi, fy)

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)

Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)

Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   MR1
Correlation of (regression) scores with factors   0.94
Multiple R square of scores with factors          0.88
Minimum correlation of possible factor scores     0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
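The difference between component and factor loadings can be seen without any package. The following sketch in base R assumes that cong1$model is the population matrix implied by loadings of .9, .8, .7 and .6 (the values that fa recovers above), and takes the first principal component loadings to be the first eigenvector scaled by the square root of its eigenvalue. The names f, e and pc1 are introduced here for illustration.

```r
# Population congeneric model: loadings recovered by fa() above
f <- c(.9, .8, .7, .6)
R <- f %*% t(f)
diag(R) <- 1

# First principal component: eigenvector scaled by sqrt(eigenvalue)
e   <- eigen(R)
pc1 <- e$vectors[, 1] * sqrt(e$values[1])
pc1 <- pc1 * sign(sum(pc1))   # eigenvectors are only defined up to sign

round(pc1, 2)          # component loadings, compare with the PC1 column above
round(e$values[1], 2)  # first eigenvalue, the "SS loadings" of the component
f                      # the generating factor loadings
```

The component loadings differ from the generating loadings because a component accounts for the full variance of each variable (the 1s on the diagonal), whereas a factor accounts only for the common variance.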

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ2 value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3

Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3

Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38

> ml3

Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38
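The degrees of freedom reported in the fa output above (6 and 2 for four variables and one factor, 36 and 12 for nine variables and three factors) follow from counting the distinct correlations and the free parameters. A small sketch using the standard counting rule for an exploratory factor model with p variables and k factors is shown here. The helper names efa_df and null_df are hypothetical and are not psych functions.

```r
# Degrees of freedom of an exploratory factor model with p variables and k factors:
# p(p-1)/2 correlations, minus p*k loadings, plus k(k-1)/2 rotational constraints,
# which simplifies to ((p - k)^2 - (p + k)) / 2
efa_df <- function(p, k) ((p - k)^2 - (p + k)) / 2

# Degrees of freedom of the null model (all off-diagonal correlations are zero)
null_df <- function(p) p * (p - 1) / 2

null_df(4); efa_df(4, 1)   # the congeneric example
null_df(9); efa_df(9, 3)   # the three factor bifact example
efa_df(9, 1)               # one factor for the nine bifact variables
```

The last value, 27, is also the degrees of freedom of a single factor model for nine variables, as reported in the omega and sem output elsewhere in this document.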

> factor.congruence(list(pc3,pa3,ml3))

     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
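The congruence coefficient in this table is the cosine of the angle between two columns of loadings, that is, the sum of cross products divided by the square root of the product of the sums of squares. The sketch below computes it directly in base R for a few columns, using the rounded loadings printed above. The function name congruence is hypothetical and is not the psych implementation.

```r
# Cosine of the angle between two loading vectors
congruence <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Rounded loadings copied from the pc3 and pa3 output above
rc1 <- c(.75, .76, .78, .29, .20, .07, .26, .20, .00)
pa1 <- c(.8, .7, .6, 0, 0, 0, 0, 0, 0)
pa3 <- c(0, 0, 0, .7, .6, .5, 0, 0, 0)

congruence(pa1, pa1)            # a factor with itself
congruence(pa1, pa3)            # non-overlapping simple structure factors
round(congruence(rc1, pa1), 2)  # rotated component versus factor
round(congruence(rc1, pa3), 2)
```

Because the coefficient uses raw cross products rather than deviations from the mean, it is not a correlation of the loadings: two factors with no variables in common have a congruence of exactly 0.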

By default, principal uses the varimax rotation criterion, while fa (as the factor correlations reported above show) applies the oblique oblimin transformation. Perhaps it is useful to apply another oblique transformation such as Promax to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p

Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> ombi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> ombi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem.

> omhi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0

The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23

The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                  g    F1    F2    F3
Correlation of scores with factors             0.86  0.47  0.57  0.64
Multiple R square of scores with factors       0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates  0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
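The two headline coefficients can be reproduced by hand from the hierarchical model. The sketch below is an illustration in base R, not the omega source code. It assumes the lower-level loadings recovered by fa above (.8, .7, .6; .7, .6, .5; .7, .6, .4) and higher-order loadings of .9, .8 and .7, which are inferred here from the factor correlations of .72, .63 and .56. It then uses the usual definitions: omega hierarchical is the squared sum of the general factor loadings divided by the total variance, and omega total is one minus the summed uniquenesses divided by the total variance. The names Fx, gload, Vt, omega_h and omega_t are hypothetical.

```r
# Lower-level loadings of the nine variables on the three group factors
Fx <- matrix(0, nrow = 9, ncol = 3)
Fx[1:3, 1] <- c(.8, .7, .6)
Fx[4:6, 2] <- c(.7, .6, .5)
Fx[7:9, 3] <- c(.7, .6, .4)

# Higher-order loadings of the group factors on g (inferred, see lead-in)
gload <- c(.9, .8, .7)

# Factor correlations implied by g, then the population correlation matrix
Phi <- gload %*% t(gload)
diag(Phi) <- 1
R <- Fx %*% Phi %*% t(Fx)
diag(R) <- 1

# Schmid-Leiman general factor loadings and uniquenesses
g  <- as.vector(Fx %*% gload)
u2 <- 1 - rowSums(Fx^2)

# Total variance of the unit-weighted composite
Vt <- sum(R)

omega_h <- sum(g)^2 / Vt      # general factor saturation
omega_t <- 1 - sum(u2) / Vt   # total reliable variance

round(g, 2)
round(c(omega_h = omega_h, omega_t = omega_t), 2)
```

Under these assumptions the general factor loadings match the g column of the Schmid-Leiman table, and the two coefficients round to the 0.70 and 0.82 reported by omega.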

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST. (Cluster diagram titled "Hierarchical cluster analysis of bifact data": variables V1 to V9 are joined into clusters C1 to C8, each labelled with its α.)

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gregpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structure Equation Model. (Path diagram titled "Exploratory Structural Model": V, Q, A, nach, and Anx load on X1 and X2; gpa, Pre, and MA load on Y1.)

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function (which requires Rgraphviz), or the structure.diagram or structure.sem functions (which do not), may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> modtau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> modtau  #show it

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semtau <- sem(modtau,cong,100)
> summary(semtau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> modcong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> modcong  #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semcong <- sem(modcong,cong,100)
> summary(semcong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(semcong,semtau)  #test the difference between the two models

LR Test for Difference Between Models

        Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semcong        2      2.9417
semtau         5      6.5935  3   3.6518     0.3016

The anova comparison of the congeneric versus tau equivalent model shows that, for this sample of 100, the change in χ2 (3.65) is not significant given the change in degrees of freedom (3, p = 0.30), so the tau equivalent model cannot be rejected even though it is known to be false.
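The likelihood ratio test in the anova table is just the difference between the two model χ2 values, referred to a χ2 distribution with degrees of freedom equal to the difference in model degrees of freedom. A short sketch in base R, with the values copied from the two summaries above and variable names chosen here for illustration:

```r
# Model chi squares and degrees of freedom copied from the sem summaries above
chisq_tau  <- 6.593496; df_tau  <- 5   # tau equivalent (equal loadings)
chisq_cong <- 2.941678; df_cong <- 2   # congeneric (free loadings)

# The tau equivalent model is nested within the congeneric model
lr      <- chisq_tau - chisq_cong
df_diff <- df_tau - df_cong

# Upper tail probability of the chi square difference
p <- pchisq(lr, df = df_diff, lower.tail = FALSE)

round(c(LR = lr, df = df_diff, p = p), 4)
```

The resulting probability agrees with the Pr(>Chisq) column of the anova output.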

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly, the next several create models based upon exploratory factor analyses. modone is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects, this model can not be rejected. However, an examination of the residuals suggests serious problems with the model.

> modone <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> modone  #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semone <- sem(modone,bifact,100)
> summary(semone,digits=2)

Model Chisquare = 2116848 Df = 27 Pr(gtChisq) = 0778334

AIC = 5716848

BIC = -1031711

Normalized Residuals

Min 1st Qu Median Mean 3rd Qu Max

33

-03336510 -02923555 -01940195 00369476 00000019 18875159

R-square for Endogenous Variables

V1 V2 V3 V4 V5 V6 V7 V8 V9

05636 04524 03377 03292 02522 01798 02568 01980 00932

Parameter Estimates

Estimate Std Error z value Pr(gt|z|)

a 07507129 009512871 7891549 2984584e-15 V1 lt--- X1

b 06726412 009807150 6858682 6949882e-12 V2 lt--- X1

c 05811209 010137932 5732145 9916850e-09 V3 lt--- X1

d 05737425 010163173 5645309 1648847e-08 V4 lt--- X1

e 05021915 010392785 4832116 1350893e-06 V5 lt--- X1

f 04239908 010609489 3996335 6433059e-05 V6 lt--- X1

g 05067957 010378883 4882950 1045102e-06 V7 lt--- X1

h 04450171 010554867 4216227 2484236e-05 V8 lt--- X1

i 03052415 010867168 2808841 4972015e-03 V9 lt--- X1

x1e 04364302 008884720 4912143 9008624e-07 V1 lt--gt V1

x2e 05475539 009653118 5672300 1408927e-08 V2 lt--gt V2

x3e 06622982 010678972 6201891 5578883e-10 V3 lt--gt V3

x4e 06708197 010761127 6233731 4554546e-10 V4 lt--gt V4

x5e 07478036 011527456 6487153 8747374e-11 V5 lt--gt V5

x6e 08202314 012278021 6680485 2381530e-11 V6 lt--gt V6

x7e 07431581 011480162 6473411 9581477e-11 V7 lt--gt V7

x8e 08019593 012086557 6635134 3242077e-11 V8 lt--gt V8

x9e 09068284 013200380 6869714 6433079e-12 V9 lt--gt V9

Iterations = 11

> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
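The fit indices reported for the one factor model can be reproduced by hand from the chi-square, its degrees of freedom, and the sample size. The following is a minimal base R sketch, not code from the sem package. It assumes that the AIC is the chi-square plus twice the number of free parameters and that the BIC is the chi-square minus df times log(N); both assumptions agree with the printed values. The variable names are illustrative.

```r
# Values copied from the summary of the one factor model above
chisq <- 21.16848   # model chi-square
df    <- 27         # degrees of freedom
N     <- 100        # number of observations passed to sem
q     <- 18         # free parameters: 9 loadings (a-i) and 9 error variances (x1e-x9e)

p.value <- pchisq(chisq, df, lower.tail = FALSE)  # probability of a larger chi-square
aic     <- chisq + 2 * q                           # assumed form of the AIC
bic     <- chisq - df * log(N)                     # assumed form of the BIC

round(c(p = p.value, AIC = aic, BIC = bic), 4)
```

The same arithmetic applies to the other models fitted with 100 observations; only the chi-square and the parameter count change.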

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
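The size of the between-group residuals indicates how strongly the factors would have to correlate. With orthogonal factors the model implies a zero correlation between variables from different groups, so each cross-group residual is approximately the product of the two loadings and the missing factor correlation. The base R sketch below divides the printed residuals by the loading products for one marker variable per factor; the object names are illustrative, and the results are only as precise as the two-decimal residuals they are computed from.

```r
# Loadings of the first variable of each factor (from the Parameter Estimates above)
lambda <- c(V1 = 0.8, V4 = 0.7, V7 = 0.7)

# Residual correlations among those three variables (from the rounded residual matrix)
resid <- matrix(c(0.00, 0.40, 0.35,
                  0.40, 0.00, 0.27,
                  0.35, 0.27, 0.00),
                nrow = 3, byrow = TRUE,
                dimnames = list(names(lambda), names(lambda)))

# residual_ij is roughly lambda_i * phi * lambda_j, so phi is residual / (lambda_i * lambda_j)
phi.needed <- resid / outer(lambda, lambda)
diag(phi.needed) <- 1
round(phi.needed, 2)
```

Factor correlations of this size are far from zero, which is why an oblique solution is the natural next step.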

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3) #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3) #create the sem model
> mod.f3 #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

[Figure: structural model path diagram. V1-V3 load on MR1 (0.8, 0.7, 0.6), V4-V6 on MR3 (0.7, 0.6, 0.5), and V7-V9 on MR2 (0.7, 0.6, 0.4); the three factors are correlated.]

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213) #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27 Sentences <--- g
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30 Vocabulary <--- g
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26 SentCompletion <--- g
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18 FirstLetters <--- g
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17 FourLetterWords <--- g
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15 Suffixes <--- g
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15 LetterSeries <--- g
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21 Pedigrees <--- g
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13 LetterGroup <--- g
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09 Sentences <--- F1
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08 Vocabulary <--- F1
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06 SentCompletion <--- F1
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13 FirstLetters <--- F2
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10 FourLetterWords <--- F2
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07 Suffixes <--- F2
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06 LetterSeries <--- F3
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03 Pedigrees <--- F3
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04 LetterGroup <--- F3
e1                0.1723633 0.03405646  5.0611045 4.168346e-07 Sentences <--> Sentences
e2                0.1698419 0.03001233  5.6590697 1.521958e-08 Vocabulary <--> Vocabulary
e3                0.2684749 0.03316228  8.0957909 5.689350e-16 SentCompletion <--> SentCompletion
e4                0.2528108 0.07942835  3.1828791 1.458185e-03 FirstLetters <--> FirstLetters
e5                0.3873510 0.06317399  6.1314949 8.705712e-10 FourLetterWords <--> FourLetterWords
e6                0.5175679 0.05955079  8.6912013 3.586269e-18 Suffixes <--> Suffixes
e7                0.1496709 0.21861502  0.6846325 4.935759e-01 LetterSeries <--> LetterSeries
e8                0.5003855 0.05956551  8.4005902 4.442400e-17 Pedigrees <--> Pedigrees
e9                0.5517474 0.08455914  6.5249884 6.800680e-11 LetterGroup <--> LetterGroup

 Iterations =  72

[Figure: Omega path diagram. The nine Thurstone variables load on a general factor g and on one of three group factors F1, F2, F3.]

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)
                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

 Iterations =  54

[Figure: Omega path diagram. The nine Thurstone variables load on three group factors F1, F2, F3, which in turn load on a general factor g.]

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi,digits=2)
                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g
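The standardized coefficients of the hierarchical model follow from the unstandardized estimates once the variance of each group factor is known. The base R sketch below assumes that g has unit variance and that the residual variance of each group factor is fixed at 1 in the model built by omega; that assumption is not stated in the output but reproduces the printed standardized values. The variable names are illustrative.

```r
# Unstandardized estimates copied from the summary of the hierarchical model
b.gF1  <- 1.4438115   # F1 <--- g
b.sent <- 0.5151232   # Sentences <--- F1

# Assumption: Var(g) = 1 and residual Var(F1) = 1, so Var(F1) = b^2 + 1
var.F1 <- b.gF1^2 + 1

std.gF1  <- b.gF1 / sqrt(var.F1)    # standardized loading of F1 on g
resid.F1 <- 1 / var.F1              # standardized residual variance of F1
std.sent <- b.sent * sqrt(var.F1)   # standardized loading of Sentences on F1
                                    # (Sentences has unit variance in a correlation matrix)

round(c(gF1 = std.gF1, F1.resid = resid.F1, F1Sentences = std.sent), 4)
```

The squared standardized loading of F1 on g is also the R-square reported for F1 as an endogenous variable.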

> anova(sem.hi,sem.bi)
LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are the 14 variables of the Holzinger data set and the 16 variables of the Reise data set. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)
LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
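The likelihood ratio tests reported by anova are differences between the chi-squares of two nested models, evaluated on the difference in their degrees of freedom. A minimal base R sketch follows; lr.test is a hypothetical helper written for this illustration, not a function of psych or sem, and its inputs are the rounded values printed above.

```r
# Hypothetical helper: chi-square difference test for two nested models
lr.test <- function(chisq.restricted, df.restricted, chisq.general, df.general) {
  lr  <- chisq.restricted - chisq.general   # LR chi-square
  ddf <- df.restricted - df.general         # difference in degrees of freedom
  c(LR = lr, df = ddf, p = pchisq(lr, ddf, lower.tail = FALSE))
}

# Thurstone: hierarchical (restricted) versus bifactor (general)
thurstone <- lr.test(38.1963, 24, 24.2163, 18)

# Holzinger: hierarchical (restricted) versus bifactor (general)
holzinger <- lr.test(178.79, 73, 147.66, 63)

signif(rbind(thurstone, holzinger), 4)
```

In both data sets the extra parameters of the bifactor model buy a reliably better fit, although far more clearly for the Holzinger data than for the Thurstone data.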

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
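The confirmatory values of ωh and ωt can be checked from the printed loadings. The sketch below is base R and is not the code used by omegaSem. It assumes that ωh is the squared sum of the general factor loadings divided by the model-implied variance of the total score, and that ωt is the general plus group factor variance divided by that same total. The inputs are the two-decimal loadings and uniquenesses from the confirmatory table above, so the results are approximate.

```r
# Confirmatory (lavaan) loadings and uniquenesses, in the order of the printed table
g   <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)  # general factor
grp <- c(0.49, 0.45, 0.40, 0.61, 0.50, 0.39, 0.73, 0.25, 0.41)  # each item's group factor
u2  <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)  # uniquenesses
group <- rep(1:3, each = 3)                                     # F1, F2, F3 membership

general.var <- sum(g)^2                          # variance of the total score due to g
group.var   <- sum(tapply(grp, group, sum)^2)    # variance due to the three group factors
total.var   <- general.var + group.var + sum(u2) # model-implied variance of the total score

omega.h <- general.var / total.var
omega.t <- (general.var + group.var) / total.var
round(c(omega.h = omega.h, omega.t = omega.t), 2)
```

Because every item has exactly one group loading here, the group factors contribute nothing to the correlations between groups, and all of that covariance is credited to g.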

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

[Figure: "Omega from SEM" path diagram. The nine Thurstone variables load on a general factor g and on one of three group factors F1, F2, F3.]

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


[Figure: structural model path diagram. x1-x3 load on X1 (0.9, 0.8, 0.7), x4-x6 on X2 (0.7, 0.6, 0.5), and x7-x9 on X3 (0.6, 0.5, 0.4).]

Figure 4: Three uncorrelated factors generated using the sim.structure function and drawn using structure.diagram.

Call: sim.structure(fx = fx, Phi = Phi)

 $model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000

$reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9,list(F1=c(1,2,3),F2=c(4,5,6),F3=c(7,8,9)))
> Phis <- phi.list(3,list(F1=c(2,3),F2=c(1,3),F3=c(1,2)))
> fxs #show the matrix
      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"
> Phis #show this one as well
   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.
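To make the compact notation concrete, the following base R sketch builds a symbolic loading matrix of the same shape as the fxs matrix printed above. symbolic.loadings is a hypothetical helper written only for this illustration; it is not the psych implementation of structure.list and covers only the simple case of one letter per factor and no cross loadings.

```r
# Hypothetical helper: one letter per factor, suffixed with the item number
symbolic.loadings <- function(nvars, groups) {
  fx <- matrix("0", nrow = nvars, ncol = length(groups),
               dimnames = list(NULL, names(groups)))
  for (j in seq_along(groups)) {
    items <- groups[[j]]                       # items that load on factor j
    fx[items, j] <- paste0(letters[j], items)  # e.g. "a1", "a2", "a3"
  }
  fx
}

fxs.sketch <- symbolic.loadings(9, list(F1 = 1:3, F2 = 4:6, F3 = 7:9))
fxs.sketch
```

Each non-zero cell names a free parameter, which is what lets the same matrix serve both as a diagram specification and as the starting point for a sem model.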

> plot.new()
> corf3.mod <- structure.diagram(fxs,Phis)

[Figure: structural model path diagram. x1-x3 load on F1 (a1, a2, a3), x4-x6 on F2 (b4, b5, b6), and x7-x9 on F3 (c7, c8, c9); the factor correlations are rab, rac, and rbc.]

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input.

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3.p <- Promax(fa(corf3$model,3))
> plot.new()
> mod.f3p <- structure.diagram(f3.p,cut=.2)

[Figure: structural model path diagram. V1-V3 load on MR1 (0.9, 0.8, 0.7), V4-V6 on MR2 (0.7, 0.6, 0.5), and V7-V9 on MR3 (0.6, 0.5, 0.4).]

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y set.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <-rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)
     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00
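The population matrix for the twelve variables follows from the loadings and the factor intercorrelations: stacking the X and Y loadings into one pattern matrix F, the off-diagonal correlations are the corresponding elements of F Φ F'. The base R sketch below reproduces that calculation without calling psych. It illustrates the algebra and is not the code of sim.structure; the object names f and R are illustrative, and the loading and Phi values are the ones used in this section.

```r
# Loadings and factor intercorrelations for the three X factors and the single Y factor
fx  <- matrix(c(.9, .8, .7, rep(0, 9), .7, .6, .5, rep(0, 9), .6, .5, .4), ncol = 3)
fy  <- c(.6, .5, .4)
Phi <- matrix(c(1, .48, .32, .4,
                .48, 1, .32, .3,
                .32, .32, 1, .2,
                .4, .3, .2, 1), ncol = 4)

# Stack the X and Y loadings into one 12 x 4 pattern matrix
f <- rbind(cbind(fx, 0),
           cbind(matrix(0, 3, 3), fy, deparse.level = 0))

# Model-implied correlations: F Phi F', with unit variances on the diagonal
R <- f %*% Phi %*% t(f)
diag(R) <- 1
dimnames(R) <- list(c(paste0("x", 1:9), paste0("y", 1:3)),
                    c(paste0("x", 1:9), paste0("y", 1:3)))
round(R, 2)
```

For example, the correlation of x1 with y1 is the product of the two loadings and the correlation of the first X factor with Y: 0.9 times 0.4 times 0.6.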

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

[Figure: structural model path diagram. x1-x3 load on X1 (a1, a2, a3), x4-x6 on X2 (b4, b5, b6), and x7-x9 on X3 (c7, c8, c9); y1-y3 load on Y (Ya1, Ya2, Ya3); X1, X2, and X3 are linked to Y by the paths rad, rbd, and rcd.]

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> hi.mod <-structure.diagram(fxh,Phi, fy)

[Figure: structural model path diagram. x1-x3 load on X1 (a1, a2, a3), x4-x6 on X2 (b4, b5, b6), and x7-x9 on X3 (c7, c8, c9); X1, X2, and X3 are linked to g by the paths a, b, and c; y1-y3 load on Y (a1, a2, a3); g is linked to Y by the path r.]

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity = 1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is 0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)
Factor Analysis using method = minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity = 1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are 6 and the objective function was 1.65
The degrees of freedom for the model are 2 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 MR1
Correlation of (regression) scores with factors 0.94
Multiple R square of scores with factors        0.88
Minimum correlation of possible factor scores   0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
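The contrast between the two outputs can be checked with a few lines of base R. This is only a sketch: it assumes the population loadings are the 0.9, 0.8, 0.7 and 0.6 reported by fa, and the object names (f, R, pc1) are introduced here for illustration and are not part of psych.

```r
# Assumed population loadings (the values reported in the fa output)
f <- c(V1 = 0.9, V2 = 0.8, V3 = 0.7, V4 = 0.6)

# Model-implied correlation matrix: f f' off the diagonal, 1 on the diagonal
R <- f %o% f
diag(R) <- 1

# In the factor model, communalities are squared loadings
h2 <- f^2
u2 <- 1 - h2

# First principal component loadings: eigenvector scaled by sqrt(eigenvalue)
e <- eigen(R)
pc1 <- abs(e$vectors[, 1]) * sqrt(e$values[1])

# Side by side: the component loadings do not reproduce f
round(cbind(factor = f, h2 = h2, u2 = u2, PC1 = pc1), 2)
```

Because the components also account for the unique variance on the diagonal, their loadings differ from the generating factor loadings even though R was built exactly from f.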

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ2 value. The factor.congruence function reports the cosine of the angle between the factors.
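The "cosine of the angle" can be written out directly. The following base R sketch is not the psych implementation; the helper name congruence and the two small loading matrices are hypothetical and serve only to show the computation.

```r
# Congruence coefficient: cosine of the angle between columns of two
# loading matrices (cross products divided by the column lengths)
congruence <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  crossprod(x, y) / sqrt(outer(colSums(x^2), colSums(y^2)))
}

# Two made-up loading matrices for three variables
A <- cbind(F1 = c(0.8, 0.7, 0.0), F2 = c(0.0, 0.1, 0.6))
B <- cbind(G1 = c(0.7, 0.8, 0.1), G2 = c(0.1, 0.0, 0.5))

round(congruence(A, B), 2)
```

Unlike a correlation, the coefficient does not center the loadings, so it depends only on the direction of each loading vector, not on its length.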

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")

> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity = 1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is 0.1

Fit based upon off diagonal values = 0.88

> pa3
Factor Analysis using method = pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                PA1  PA3  PA2
Correlation of (regression) scores with factors 0.9 0.85 0.83
Multiple R square of scores with factors        0.8 0.72 0.69
Minimum correlation of possible factor scores   0.6 0.45 0.38

> ml3
Factor Analysis using method = ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 ML1  ML3  ML2
Correlation of (regression) scores with factors 0.90 0.85 0.83
Multiple R square of scores with factors        0.80 0.72 0.69
Minimum correlation of possible factor scores   0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00

By default, all three of these procedures use the varimax rotation criterion. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.
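The Schmid-Leiman transformation itself is simple matrix arithmetic. The sketch below uses base R only and assumes the simulated structure used for bifact: first-order loadings of 0.8, 0.7, 0.6 / 0.7, 0.6, 0.5 / 0.7, 0.6, 0.4 on three group factors, which in turn load 0.9, 0.8 and 0.7 on g. The object names (fo, gload, sl) are illustrative and not part of psych.

```r
# Assumed first-order pattern: V1-V3 on F1, V4-V6 on F2, V7-V9 on F3
load  <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
group <- rep(1:3, each = 3)
fo <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), paste0("F", 1:3)))
fo[cbind(1:9, group)] <- load

# Assumed loadings of the three group factors on the general factor
gload <- c(0.9, 0.8, 0.7)

# Schmid-Leiman: general loadings are the product of the two levels;
# group loadings are rescaled by the part of each factor not due to g
g  <- drop(fo %*% gload)
gr <- fo %*% diag(sqrt(1 - gload^2))
colnames(gr) <- colnames(fo)

sl <- cbind(g = g, gr)
round(sl, 2)
```

The communalities are unchanged by the transformation (rowSums(sl^2) equals load^2); the common variance is only reapportioned between the general and the residualized group factors.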

> om.bi
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> om.bi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem.

> om.hi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem.


Omega Total            0.82

Schmid Leiman Factor loadings greater than 0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max 4.67   max/min = 1.82
mean percent general = 0.65 with sd = 0.14 and cv of 0.21
Explained Common Variance of the general factor = 0.67

The degrees of freedom are 12 and the fit is 0

The root mean square of the residuals is 0
The df corrected root mean square of the residuals is 0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 0.23

The root mean square of the residuals is 0.07
The df corrected root mean square of the residuals is 0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
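The headline coefficients in this output can be recovered by hand from the Schmid-Leiman loadings. The following base R sketch assumes the same simulated structure as bifact (first-order loadings 0.8 to 0.4, group factors loading 0.9, 0.8 and 0.7 on g); it illustrates the definitions and is not the code used by omega.

```r
# Assumed loadings of the simulated structure
load  <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
gload <- rep(c(0.9, 0.8, 0.7), each = 3)
group <- rep(1:3, each = 3)

g   <- load * gload               # general factor loadings
grp <- load * sqrt(1 - gload^2)   # residualized group loadings
u2  <- 1 - load^2                 # uniquenesses

# Variance of the unit-weighted total score (sum of the model matrix)
Vt <- sum(g)^2 + sum(tapply(grp, group, sum)^2) + sum(u2)

omega_h <- sum(g)^2 / Vt          # variance due to the general factor
omega_t <- 1 - sum(u2) / Vt       # variance due to all common factors
ecv     <- sum(g^2) / sum(load^2) # explained common variance of g

round(c(omega_h = omega_h, omega_t = omega_t, ECV = ecv), 2)
```

Under these assumptions the three values round to 0.70, 0.82 and 0.67, matching the Omega Hierarchical, Omega Total and Explained Common Variance lines of the output.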

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method = minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are 56 and the empirical chi square function was 6930.29
The degrees of freedom for the model are 7 and the empirical chi square function was 21.83
 with prob < 0.0027

The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.04
 with the empirical chi square 21.83 with prob < 0.0027
The total number of observations was 1000 with fitted Chi Square = 2175.06 with prob < 0

Empirical BIC = -26.53
ESABIC = -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.
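The probability and the empirical BIC in this output follow from the reported chi square, its degrees of freedom, and the number of observations. The sketch below uses base R and assumes the usual definition BIC = χ2 − df · ln(N); because it starts from the rounded chi square of 21.83, it reproduces the printed values only up to rounding.

```r
chisq <- 21.83   # empirical chi square from the esem output (rounded)
df    <- 7       # degrees of freedom of the model
n     <- 1000    # number of observations

# Upper-tail probability of the chi square statistic
p <- pchisq(chisq, df, lower.tail = FALSE)

# Empirical BIC under the assumed definition
bic <- chisq - df * log(n)

round(c(p = p, BIC = bic), 4)
```

A negative BIC here means that the chi square is small relative to the penalty df · ln(N), even though with 1000 observations the chi square itself is statistically significant.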

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function, which requires Rgraphviz, or the structure.diagram or structure.sem functions, which do not, may be used.
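The difference between the two models shows up directly in the correlation matrices they imply and in their degrees of freedom. This base R sketch uses made-up loadings; the helper implied and the objects R.cong and R.tau are hypothetical names used only for illustration.

```r
# Correlation matrix implied by a one factor model with loadings f
implied <- function(f) {
  R <- tcrossprod(f)
  diag(R) <- 1
  R
}

R.cong <- implied(c(0.9, 0.8, 0.7, 0.6))  # congeneric: loadings may differ
R.tau  <- implied(rep(0.7, 4))            # tau equivalent: one common loading

# Tau equivalence forces every off-diagonal correlation to be equal
unique(round(R.tau[lower.tri(R.tau)], 10))
range(R.cong[lower.tri(R.cong)])

# Degrees of freedom: distinct moments minus free parameters
k <- 4
moments <- k * (k + 1) / 2
df.cong <- moments - (k + k)   # k loadings + k error variances
df.tau  <- moments - (1 + k)   # 1 loading  + k error variances
c(congeneric = df.cong, tau = df.tau)
```

The tau equivalent model is nested within the congeneric model, which is why the two can be compared with a likelihood ratio test on the difference in degrees of freedom.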

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it
      Path       Parameter Value
 [1,] "X1->V1"   "a"       NA
 [2,] "X1->V2"   "a"       NA
 [3,] "X1->V3"   "a"       NA
 [4,] "X1->V4"   "a"       NA
 [5,] "V1<->V1"  "x1e"     NA
 [6,] "V2<->V2"  "x2e"     NA
 [7,] "V3<->V3"  "x3e"     NA
 [8,] "V4<->V4"  "x4e"     NA
 [9,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare = 6.593496   Df = 5   Pr(>Chisq) = 0.2526696
 AIC = 16.5935
 BIC = -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations = 11

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model
      Path       Parameter Value
 [1,] "X1->V1"   "a"       NA
 [2,] "X1->V2"   "b"       NA
 [3,] "X1->V3"   "c"       NA
 [4,] "X1->V4"   "d"       NA
 [5,] "V1<->V1"  "x1e"     NA
 [6,] "V2<->V2"  "x2e"     NA
 [7,] "V3<->V3"  "x3e"     NA
 [8,] "V4<->V4"  "x4e"     NA
 [9,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare = 2.941678   Df = 2   Pr(>Chisq) = 0.2297327
 AIC = 18.94168
 BIC = -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations = 12

> anova(sem.cong,sem.tau)  #test the difference between the two models
LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

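The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 is not significant given the change in degrees of freedom (LR χ2 = 3.65, df = 3, p = 0.30). With only 100 observations, the tau equivalent hypothesis, although false, cannot be rejected.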
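The likelihood ratio test reported by anova can be reproduced from the two model chi squares. A minimal base R sketch, using the values printed in the two summaries (the object names are illustrative):

```r
# Model chi squares and degrees of freedom from the two summaries
chisq.tau  <- 6.593496; df.tau  <- 5
chisq.cong <- 2.941678; df.cong <- 2

# The difference in chi square is itself distributed as chi square
# with df equal to the difference in degrees of freedom
lr  <- chisq.tau - chisq.cong
ddf <- df.tau - df.cong
p   <- pchisq(lr, ddf, lower.tail = FALSE)

round(c(LR = lr, df = ddf, p = p), 4)
```

The three equality constraints of the tau equivalent model raise χ2 by about 3.65, which is well within what would be expected by chance for 3 degrees of freedom.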

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects, this model can not be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model
      Path       Parameter Value
 [1,] "X1->V1"   "a"       NA
 [2,] "X1->V2"   "b"       NA
 [3,] "X1->V3"   "c"       NA
 [4,] "X1->V4"   "d"       NA
 [5,] "X1->V5"   "e"       NA
 [6,] "X1->V6"   "f"       NA
 [7,] "X1->V7"   "g"       NA
 [8,] "X1->V8"   "h"       NA
 [9,] "X1->V9"   "i"       NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare = 21.16848   Df = 27   Pr(>Chisq) = 0.778334
 AIC = 57.16848
 BIC = -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations = 11

> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
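The block pattern in these residuals can be reproduced without sem. The base R sketch below assumes that bifact is the population matrix implied by first-order loadings of 0.8 to 0.4 on three factors correlated 0.72, 0.63 and 0.56, and it copies the one factor loadings a to i from the summary above; the object names are illustrative.

```r
# Assumed population structure behind the bifact matrix
load <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
fx <- matrix(0, 9, 3)
fx[cbind(1:9, rep(1:3, each = 3))] <- load
Phi <- matrix(c(1.00, 0.72, 0.63,
                0.72, 1.00, 0.56,
                0.63, 0.56, 1.00), 3, 3)
R <- fx %*% Phi %*% t(fx)
diag(R) <- 1

# One factor loadings estimated by sem (parameters a ... i)
lambda <- c(0.7507129, 0.6726412, 0.5811209, 0.5737425, 0.5021915,
            0.4239908, 0.5067957, 0.4450171, 0.3052415)

# Residual = observed correlation minus the one factor reproduction
res <- R - tcrossprod(lambda)
diag(res) <- 0
dimnames(res) <- list(paste0("V", 1:9), paste0("V", 1:9))
round(res, 2)
```

A single factor underestimates the correlations within each group of three variables (positive residuals on the diagonal blocks) and slightly overestimates those between groups, which is the signature of omitted group factors.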

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare = 21.16848   Df = 27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations = 11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare = 53.86635   Df = 27   Pr(>Chisq) = 0.001579738
 AIC = 89.86635
 BIC = -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations = 24

> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)  #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)  #create the sem model
> mod.f3  #show it
      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare = 24.2163   Df = 18   Pr(>Chisq) = 0.1480685
 AIC = 78.2163
 BIC = -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations = 72

> stdCoef(sem.bi,digits=2)
                     Std. Estimate
1  Sentences             0.7678671 Sentences <--- g
2  Vocabulary            0.7909246 Vocabulary <--- g
3  SentCompletion        0.7536211 SentCompletion <--- g
4  FirstLetters          0.6083814 FirstLetters <--- g
5  FourLetterWords       0.5973349 FourLetterWords <--- g
6  Suffixes              0.5717900 Suffixes <--- g
7  LetterSeries          0.5668950 LetterSeries <--- g
8  Pedigrees             0.6623317 Pedigrees <--- g
9  LetterGroup           0.5299523 LetterGroup <--- g
10 F1Sentences           0.4878697 Sentences <--- F1
11 F1Vocabulary          0.4523233 Vocabulary <--- F1
12 F1SentCompletion      0.4044507 SentCompletion <--- F1
13 F2FirstLetters        0.6140531 FirstLetters <--- F2
14 F2FourLetterWords     0.5058063 FourLetterWords <--- F2
15 F2Suffixes            0.3943206 Suffixes <--- F2
16 F3LetterSeries        0.7272957 LetterSeries <--- F3
17 F3Pedigrees           0.2468418 Pedigrees <--- F3
18 F3LetterGroup         0.4091494 LetterGroup <--- F3
19 e1                    0.1723633 Sentences <--> Sentences
20 e2                    0.1698418 Vocabulary <--> Vocabulary
21 e3                    0.2684748 SentCompletion <--> SentCompletion
22 e4                    0.2528108 FirstLetters <--> FirstLetters
23 e5                    0.3873510 FourLetterWords <--> FourLetterWords
24 e6                    0.5175675 Suffixes <--> Suffixes
25 e7                    0.1496710 LetterSeries <--> LetterSeries
26 e8                    0.5003859 Pedigrees <--> Pedigrees
27 e9                    0.5517473 LetterGroup <--> LetterGroup
28                       1.0000000 F1 <--> F1
29                       1.0000000 F2 <--> F2
30                       1.0000000 F3 <--> F3
31                       1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare = 38.1963   Df = 24   Pr(>Chisq) = 0.03310059
 AIC = 80.1963
 BIC = -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

41

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package. (Path diagram titled "Omega": g, the group factors F1, F2, F3, and the nine variables.)

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

Iterations = 54

> stdCoef(semhi, digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(semhi, sembi)

LR Test for Difference Between Models

      Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semhi       24      38.196
sembi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> omholzbi <- omega(Holzinger, 4)
> semholzbi <- sem(omholzbi$model$sem, Holzinger, 355)
> omholzhi <- omega(Holzinger, 4, sl=FALSE)
> semholzhi <- sem(omholzhi$model$sem, Holzinger, 355)
> anova(semholzbi, semholzhi)

LR Test for Difference Between Models

          Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semholzbi       63      147.66
semholzhi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
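The likelihood ratio comparisons reported by anova can be reproduced by hand: the test statistic is the difference between the two model chi squares, referred to a chi square distribution whose degrees of freedom are the difference in model degrees of freedom. The following is a minimal base R sketch using the rounded values printed above; the helper name lr.test is made up for this illustration and is not part of sem or psych.

```r
# Hypothetical helper (not part of sem or psych): likelihood ratio test
# for two nested models from their chi squares and degrees of freedom.
lr.test <- function(chisq.restricted, df.restricted, chisq.full, df.full) {
  lr <- chisq.restricted - chisq.full        # difference in chi square
  df <- df.restricted - df.full              # difference in degrees of freedom
  p  <- pchisq(lr, df, lower.tail = FALSE)   # upper tail probability
  c(LR = lr, Df = df, p = p)
}

# Thurstone: hierarchical (df = 24) versus bifactor (df = 18)
thurstone <- lr.test(38.196, 24, 24.216, 18)
# Holzinger: hierarchical (df = 73) versus bifactor (df = 63)
holzinger <- lr.test(178.79, 73, 147.66, 63)

round(thurstone, 5)
round(holzinger, 7)
```

Because the inputs are rounded, the results should agree with the anova output only to the printed precision.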

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first runs an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> omsem <- omegaSem(Thurstone, n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1*   F2*   F3*   h2   u2   p2
Sentences       0.71  0.56             0.82 0.18 0.61
Vocabulary      0.73  0.55             0.84 0.16 0.63
SentCompletion  0.68  0.52             0.74 0.26 0.63
FirstLetters    0.65        0.56       0.73 0.27 0.57
FourLetterWords 0.62        0.49       0.63 0.37 0.61
Suffixes        0.56        0.41       0.50 0.50 0.63
LetterSeries    0.59              0.62 0.73 0.27 0.48
Pedigrees       0.58  0.24        0.34 0.51 0.49 0.66
LetterGroup     0.54              0.46 0.52 0.48 0.56

With eigenvalues of:
   g  F1*  F2*  F3*
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g  F1*  F2*  F3*
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1*   F2*   F3*   h2   u2   p2
Sentences       0.77  0.49             0.82 0.18 0.72
Vocabulary      0.79  0.45             0.83 0.17 0.75
SentCompletion  0.75  0.40             0.73 0.27 0.77
FirstLetters    0.61        0.61       0.74 0.26 0.50
FourLetterWords 0.60        0.50       0.61 0.39 0.59
Suffixes        0.57        0.39       0.48 0.52 0.68
LetterSeries    0.57              0.73 0.85 0.15 0.38
Pedigrees       0.66              0.25 0.50 0.50 0.87
LetterGroup     0.53              0.41 0.45 0.55 0.62

With eigenvalues of:
   g  F1*  F2*  F3*
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g   F1*  F2*  F3*
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem. (Path diagram titled "Omega from SEM": g, the group factors F1, F2, F3, and the nine variables.)

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago Press, Chicago, Ill.


Call: sim.structure(fx = fx, Phi = Phi)

 $model (Population correlation matrix)
     V1    V2    V3    V4    V5   V6    V7   V8    V9
V1 1.00 0.720 0.630 0.315 0.270 0.23 0.162 0.14 0.108
V2 0.72 1.000 0.560 0.280 0.240 0.20 0.144 0.12 0.096
V3 0.63 0.560 1.000 0.245 0.210 0.17 0.126 0.10 0.084
V4 0.32 0.280 0.245 1.000 0.420 0.35 0.084 0.07 0.056
V5 0.27 0.240 0.210 0.420 1.000 0.30 0.072 0.06 0.048
V6 0.23 0.200 0.175 0.350 0.300 1.00 0.060 0.05 0.040
V7 0.16 0.144 0.126 0.084 0.072 0.06 1.000 0.30 0.240
V8 0.14 0.120 0.105 0.070 0.060 0.05 0.300 1.00 0.200
V9 0.11 0.096 0.084 0.056 0.048 0.04 0.240 0.20 1.000

 $reliability (population reliability)
[1] 0.81 0.64 0.49 0.49 0.36 0.25 0.36 0.25 0.16

Using symbolic loadings and path coefficients. For some purposes, it is helpful not to specify particular values for the paths, but rather to think of them symbolically. This can be shown with symbolic loadings and path coefficients by using the structure.list and phi.list functions to create the fx and Phi matrices (Figure 5).

> fxs <- structure.list(9, list(F1=c(1,2,3), F2=c(4,5,6), F3=c(7,8,9)))
> Phis <- phi.list(3, list(F1=c(2,3), F2=c(1,3), F3=c(1,2)))
> fxs  # show the matrix

      F1   F2   F3
 [1,] "a1" "0"  "0"
 [2,] "a2" "0"  "0"
 [3,] "a3" "0"  "0"
 [4,] "0"  "b4" "0"
 [5,] "0"  "b5" "0"
 [6,] "0"  "b6" "0"
 [7,] "0"  "0"  "c7"
 [8,] "0"  "0"  "c8"
 [9,] "0"  "0"  "c9"

> Phis  # show this one as well

   F1    F2    F3
F1 "1"   "rba" "rca"
F2 "rab" "1"   "rcb"
F3 "rac" "rbc" "1"

The structure.list and phi.list functions allow for creation of fx, Phi, and fy matrices in a very compact form, just by specifying the relevant variables.

> plot.new()
> corf3mod <- structure.diagram(fxs, Phis)

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input. (Path diagram titled "Structural model": x1 to x9 load on F1, F2, F3 with paths a1 to c9 and factor correlations rab, rac, rbc.)

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3p <- Promax(fa(corf3$model, 3))
> plot.new()
> modf3p <- structure.diagram(f3p, cut=.2)

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown. (Path diagram titled "Structural model": V1 to V9 load on MR1, MR2, MR3.)

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y sets.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx, Phi, fy)$model
> colnames(twelveV) <- rownames(twelveV) <- c(paste("x",1:9,sep=""), paste("y",1:3,sep=""))
> round(twelveV, 2)

     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00
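The population matrix printed above follows directly from the factor model: stacking the x loadings and the y loadings into one 12 by 4 loading matrix F, the implied correlations are F Phi F' with ones placed on the diagonal. The sketch below uses base R only and assumes nothing beyond this standard formula; it is an illustration, not the code of sim.structure (the name Fxy is introduced here).

```r
# Loadings of the 9 x variables on 3 X factors, and of 3 y variables on Y
fx  <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol = 3)
fy  <- c(.6,.5,.4)
Phi <- matrix(c(1,.48,.32,.4, .48,1,.32,.3, .32,.32,1,.2, .4,.3,.2,1), ncol = 4)

# Combined 12 x 4 loading matrix: x block on factors 1-3, y block on factor 4
Fxy <- matrix(0, nrow = 12, ncol = 4)
Fxy[1:9, 1:3] <- fx
Fxy[10:12, 4] <- fy

# Model-implied correlations: F Phi F' with unit diagonal
R <- Fxy %*% Phi %*% t(Fxy)
diag(R) <- 1
dimnames(R) <- list(c(paste0("x", 1:9), paste0("y", 1:3)),
                    c(paste0("x", 1:9), paste0("y", 1:3)))
round(R, 2)
```

For example, the x1 with y1 entry is .9 × .4 × .6 = .216, which rounds to the 0.22 shown in the table.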

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9, list(X1=c(1,2,3), X2=c(4,5,6), X3=c(7,8,9)))
> phi <- phi.list(4, list(F1=c(4), F2=c(4), F3=c(4), F4=c(1,2,3)))
> fyx <- structure.list(3, list(Y=c(1,2,3)), "Y")

2.4.5 A hierarchical structure among the latent predictors

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question is how well a particular model recovers the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs, phi, fyx)

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y. (Path diagram titled "Structural model": x1 to x9 load on X1, X2, X3; y1 to y3 load on Y; paths rad, rbd, rcd link the X factors to Y.)

> fxh <- structure.list(9, list(X1=c(1:3), X2=c(4:6), X3=c(7:9), g=NULL))
> fy <- structure.list(3, list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> himod <- structure.diagram(fxh, Phi, fy)

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable. (Path diagram titled "Structural model": x1 to x9 load on X1, X2, X3, which load on g with paths a, b, c; g is linked to Y by path r.)

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and, if n > 1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)

Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)

Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   MR1
Correlation of (regression) scores with factors   0.94
Multiple R square of scores with factors          0.88
Minimum correlation of possible factor scores     0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
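This difference can be seen without fitting anything. The base R sketch below assumes that cong1$model is the population matrix implied by loadings of .9, .8, .7 and .6 on a single factor (the values recovered by fa above; the generating call is not shown in this section). A single factor then reproduces every off-diagonal correlation exactly, while the first principal component, which also has to account for the unit diagonal, does not.

```r
f <- c(.9, .8, .7, .6)             # assumed generating loadings
R <- f %*% t(f)                    # common part of the correlations
diag(R) <- 1                       # model-implied correlation matrix

# Loadings on the first principal component: eigenvector scaled by sqrt(eigenvalue)
e   <- eigen(R, symmetric = TRUE)
pc1 <- abs(e$vectors[, 1]) * sqrt(e$values[1])

# Off-diagonal residuals left by one factor and by one component
resid.factor <- R - f %*% t(f);     diag(resid.factor) <- 0
resid.pc     <- R - pc1 %*% t(pc1); diag(resid.pc) <- 0

round(rbind(factor = f, component = pc1), 2)
round(c(SS.pc = e$values[1],
        max.resid.factor = max(abs(resid.factor)),
        max.resid.pc = max(abs(resid.pc))), 2)
```

Under that assumption, the component loadings and the sum of squared loadings should agree with the principal output above to the printed two decimals.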

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ² value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact, 3)
> pa3 <- fa(bifact, 3, fm="pa")
> ml3 <- fa(bifact, 3, fm="ml")
> pc3

Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3

Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38

> ml3

Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38

> factor.congruence(list(pc3, pa3, ml3))

     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
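The congruence coefficient is the cosine between two columns of loadings, that is, the sum of cross products divided by the square root of the product of the sums of squares, computed without centering. The base R sketch below applies this to the rounded loadings printed above for the first rotated component (RC1) and two of the principal axis factors; the helper name congruence is hypothetical and is not the psych function.

```r
# Hypothetical helper: cosine of the angle between two loading vectors
congruence <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Rounded loadings copied from the pc3 and pa3 output
RC1 <- c(.75, .76, .78, .29, .20, .07, .26, .20, .00)
PA1 <- c(.8, .7, .6, 0, 0, 0, 0, 0, 0)
PA3 <- c(0, 0, 0, .7, .6, .5, 0, 0, 0)

round(c(RC1.PA1 = congruence(RC1, PA1),
        RC1.PA3 = congruence(RC1, PA3),
        PA1.PA3 = congruence(PA1, PA3)), 2)
```

Since the inputs are rounded to two decimals, agreement with the factor.congruence table is expected only to about that precision.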

By default, principal uses the varimax rotation criterion, whereas fa uses an oblique (oblimin) rotation, which is why the pa and ml solutions above report factor correlations. It can also be useful to apply another oblique transformation, such as Promax, to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p

Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957), with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measurement of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> ombi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> ombi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem. (Path diagram titled "Omega": g and the group factors F1, F2, F3 each load directly on V1 to V9.)

> omhi <- omega(bifact, sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem. (Path diagram titled "Omega": g loads on F1, F2, F3, which load on V1 to V9.)

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g  F1*  F2*  F3*   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g  F1*  F2*  F3*
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0

The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23

The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g   F1*   F2*   F3*
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
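The Schmid-Leiman loadings and the omega coefficients reported above can be checked against a population model. The base R sketch below rests on an assumption that is not stated in this section: that bifact has first-order loadings of .8, .7, .6, .7, .6, .5, .7, .6, .4 (as recovered by fa) on three factors whose correlations (.72, .63, .56) arise from second-order loadings of .9, .8 and .7 on g. It illustrates the arithmetic and is not the code of omega.

```r
# Assumed population values, read off the fa output (not taken from psych)
f1    <- c(.8, .7, .6, .7, .6, .5, .7, .6, .4)  # first-order loadings
grp   <- rep(1:3, each = 3)                     # group factor of each variable
gload <- c(.9, .8, .7)                          # second-order loadings on g

# Schmid-Leiman: general loadings and residualized group loadings
g  <- f1 * gload[grp]
gr <- f1 * sqrt(1 - gload[grp]^2)
h2 <- g^2 + gr^2     # communality
p2 <- g^2 / h2       # share of common variance due to g

# Model-implied correlation matrix of the nine variables
Fx <- matrix(0, 9, 3)
Fx[cbind(1:9, grp)] <- f1
Phi <- gload %*% t(gload); diag(Phi) <- 1
R <- Fx %*% Phi %*% t(Fx); diag(R) <- 1

omega.h <- sum(g)^2 / sum(R)          # variance of the total score due to g
omega.t <- 1 - sum(1 - h2) / sum(R)   # all common variance in the total score

round(cbind(g, group = gr, h2, u2 = 1 - h2, p2), 2)
round(c(omega.h = omega.h, omega.t = omega.t), 2)
```

Under these assumptions the loadings, h2, u2 and p2 should match the Schmid-Leiman table, and the two coefficients should match Omega Hierarchical and Omega Total, to the printed two decimals.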

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST. (Cluster diagram titled "Hierarchical cluster analysis of bifact data": V1 to V9 are combined into clusters C1 to C8, each labelled with its α.)

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call simstructural(fx = fx Phi = Phi fy = fy)

$model (Population correlation matrix)

V Q A nach Anx gpa Pre MA

V 100 072 054 000 000 038 032 025

Q 072 100 048 000 000 034 028 022

A 054 048 100 048 -042 050 042 034

nach 000 000 048 100 -056 034 028 022

Anx 000 000 -042 -056 100 -029 -024 -020

gpa 038 034 050 034 -029 100 030 024

Pre 032 028 042 028 -024 030 100 020

MA 025 022 034 022 -020 024 020 100

$reliability (population reliability)

V Q A nach Anx gpa Pre MA

081 064 072 064 049 036 025 016
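The population matrix above can be reproduced in base R from a factor model of the form R = F Φ F'. The following is a minimal sketch, not the package's own code: the values of `fx`, `fy` and `Phi` are assumptions chosen because they reproduce the printed correlations and reliabilities (they are not listed in this section), and the variable names `f`, `common`, `model` and `reliability` are illustrative.

```r
# Assumed loadings of the five X variables on two factors and of the
# three Y variables on one factor (values inferred from the printed matrix).
fx <- matrix(c(.9, .8, .6,  0,   0,
                0,  0, .6, .8, -.7), ncol = 2,
             dimnames = list(c("V", "Q", "A", "nach", "Anx"), c("X1", "X2")))
fy <- matrix(c(.6, .5, .4), ncol = 1,
             dimnames = list(c("gpa", "Pre", "MA"), "Y1"))

# Assumed factor intercorrelations: X1 and X2 are independent,
# and each correlates .7 with Y1.
Phi <- matrix(c( 1,  0, .7,
                 0,  1, .7,
                .7, .7,  1), ncol = 3)

# Stack the two measurement models into one 8 x 3 loading matrix.
f <- rbind(cbind(fx, 0),
           cbind(matrix(0, nrow(fy), ncol(fx)), fy))
colnames(f) <- c("X1", "X2", "Y1")

# Common variance implied by the model; its diagonal is the reliability
# (communality) of each variable.
common <- f %*% Phi %*% t(f)
reliability <- diag(common)

# Population correlation matrix: same off-diagonal values, unit diagonal.
model <- common
diag(model) <- 1

round(model, 2)
round(reliability, 2)
```

Rounding `model` and `reliability` to two decimals should give the same values as the `$model` and `$reliability` elements shown above, which is a quick way to check that a set of loadings describes the intended structure.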

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method = minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the X set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the Y set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are 56 and the empirical chi square function was 6930.29
The degrees of freedom for the model are 7 and the empirical chi square function was 21.83
 with prob < 0.0027
The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.04
 with the empirical chi square 21.83 with prob < 0.0027
The total number of observations was 1000 with fitted Chi Square = 2175.06 with prob < 0
Empirical BIC = -26.53
ESABIC = -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model. (Plot title: "Exploratory Structural Model"; V, Q, A, nach and Anx load on X1 and X2; gpa, Pre and MA load on Y1; X1 and X2 each correlate about 0.7 with Y1.)

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see Section 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12
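The AIC and BIC values printed in the two summaries can be recovered from the model χ², the degrees of freedom, the number of free parameters and the sample size. The sketch below assumes the definitions AIC = χ² + 2q and BIC = χ² − df · ln(N); these are inferred from the printed numbers rather than taken from the sem documentation, and `fit_indices` is a hypothetical helper written for this illustration.

```r
# Hypothetical helper: information criteria from summary statistics.
# chisq  = model chi square, df = model degrees of freedom,
# n_free = number of free parameters, n_obs = sample size.
fit_indices <- function(chisq, df, n_free, n_obs) {
  c(AIC = chisq + 2 * n_free,
    BIC = chisq - df * log(n_obs))
}

# Tau equivalent model: one common loading plus four error variances.
tau <- fit_indices(chisq = 6.593496, df = 5, n_free = 5, n_obs = 100)

# Congeneric model: four loadings plus four error variances.
cong_fit <- fit_indices(chisq = 2.941678, df = 2, n_free = 8, n_obs = 100)

round(rbind(tau = tau, congeneric = cong_fit), 4)
```

Under these assumed definitions the more constrained tau equivalent model has the lower AIC and BIC, because its slightly larger χ² is offset by having three fewer free parameters.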

> anova(sem.cong,sem.tau)  #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

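The anova comparison of the congeneric versus tau equivalent model shows that, in this sample of 100, the change in χ² (3.65) is not significant given the change in degrees of freedom (3), p = 0.30. Thus, although the tau equivalent hypothesis is false in the population, it cannot be rejected with a sample of this size.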
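The likelihood ratio test reported by anova for nested models can be reproduced by hand from the two model χ² values and their degrees of freedom. A minimal base R sketch, using the values printed in the two summaries (the variable names are illustrative):

```r
# Chi square and degrees of freedom of the two nested models,
# copied from the summary output.
chisq_tau  <- 6.593496; df_tau  <- 5   # more constrained (tau equivalent)
chisq_cong <- 2.941678; df_cong <- 2   # less constrained (congeneric)

# The difference in chi square is itself distributed as chi square
# with degrees of freedom equal to the difference in model df.
lr_chisq <- chisq_tau - chisq_cong
lr_df    <- df_tau - df_cong
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

round(c(LR.Chisq = lr_chisq, Df = lr_df, p = p_value), 4)
```

The resulting p value of about 0.30 is the probability of a χ² difference at least this large if the three equality constraints on the loadings held in the population.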

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although this model is not correct, with 100 subjects it cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
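The pattern in the residual matrix is easier to read once it is clear what the one factor model implies. Under a single factor, the implied correlation between two variables is the product of their loadings, and the residual is the observed correlation minus that product. The base R sketch below uses the loadings printed in the summary; the object names are illustrative.

```r
# Estimated loadings of V1 ... V9 on the single factor X1 (from the summary).
lambda <- c(V1 = 0.7507129, V2 = 0.6726412, V3 = 0.5811209,
            V4 = 0.5737425, V5 = 0.5021915, V6 = 0.4239908,
            V7 = 0.5067957, V8 = 0.4450171, V9 = 0.3052415)

# Model implied correlations: outer product of the loadings, unit diagonal.
implied <- tcrossprod(lambda)
dimnames(implied) <- list(names(lambda), names(lambda))
diag(implied) <- 1

# With a unit variance factor, the R-square of each variable is its
# squared loading.
communality <- lambda^2

round(communality, 4)
round(implied[c("V7", "V8", "V9"), c("V7", "V8", "V9")], 2)
```

Adding the printed residual for a pair back to its implied value recovers the observed correlation; for V7 and V8 this is about 0.23 + 0.19 = 0.42. A single factor therefore accounts for only about half of the correlation within that group of variables, which is why the positive residuals cluster in three blocks along the diagonal.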

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
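Why an orthogonal model leaves such large between-group residuals can be seen from the model implied correlations ΛΦΛ'. The following base R sketch uses the loadings printed above, rounded to one decimal. The factor intercorrelations in `Phi_obl` (0.7, 0.6 and 0.6) are illustrative assumptions suggested by the size of the residuals, not estimates taken from a fitted model, and `implied_r` is a hypothetical helper.

```r
# Simple structure loadings: three variables per factor (rounded estimates).
loadings <- c(.8, .7, .6, .7, .6, .5, .7, .6, .4)
group <- rep(1:3, each = 3)
Lambda <- matrix(0, nrow = 9, ncol = 3,
                 dimnames = list(paste0("V", 1:9), c("MR1", "MR3", "MR2")))
Lambda[cbind(1:9, group)] <- loadings

# Hypothetical helper: correlations implied by loadings and factor correlations.
implied_r <- function(Lambda, Phi) {
  r <- Lambda %*% Phi %*% t(Lambda)
  diag(r) <- 1
  r
}

Phi_orth <- diag(3)                      # orthogonal factors
Phi_obl  <- matrix(c( 1, .7, .6,         # assumed factor intercorrelations
                     .7,  1, .6,
                     .6, .6,  1), ncol = 3)

r_orth <- implied_r(Lambda, Phi_orth)
r_obl  <- implied_r(Lambda, Phi_obl)

# Correlations between the first group (V1-V3) and the second (V4-V6).
round(r_orth[1:3, 4:6], 2)
round(r_obl[1:3, 4:6], 2)
```

With orthogonal factors every between-group correlation is implied to be exactly zero, so the whole observed correlation ends up in the residual matrix. Once the factors are allowed to correlate, the implied value for V1 and V4 (0.8 × 0.7 × 0.7 ≈ 0.39 under the assumed values) is close to the 0.40 residual left by the orthogonal model.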

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)           #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)  #create the sem model
> mod.f3                       #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution. (Plot title: "Structural model"; V1 to V3 load on MR1, V4 to V6 on MR3, and V7 to V9 on MR2, with correlated factors.)

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef function from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

> stdCoef(sem.bi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

> stdCoef(sem.hi,digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
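The link between the two models can be made concrete with the Schmid and Leiman (1957) transformation: in a hierarchical model, each item's loading on g is the product of its loading on its group factor and that factor's loading on g. The base R sketch below applies this to the standardized hierarchical estimates printed by stdCoef; the object names are illustrative and the calculation is a sketch of the idea rather than the omega function's own code.

```r
# Standardized loadings of the three group factors on g (from stdCoef).
g_on_F <- c(F1 = 0.8220754, F2 = 0.7817998, F3 = 0.8150140)

# Standardized loadings of the nine items on their group factor.
first_order <- c(Sentences = 0.9047111, Vocabulary = 0.9138214,
                 SentCompletion = 0.8560764, FirstLetters = 0.8357617,
                 FourLetterWords = 0.7971819, Suffixes = 0.7025560,
                 LetterSeries = 0.7808129, Pedigrees = 0.7201599,
                 LetterGroup = 0.7034902)
factor_of <- rep(c("F1", "F2", "F3"), each = 3)

# Schmid-Leiman: the general loading is the product of the two paths; the
# residualized group loading uses the part of the factor not due to g.
g_loading     <- first_order * g_on_F[factor_of]
group_loading <- first_order * sqrt(1 - g_on_F[factor_of]^2)

round(cbind(g = g_loading, group = group_loading,
            h2 = g_loading^2 + group_loading^2), 2)
```

Within each group, the ratio of the general to the group loading is the same for every item. The bifactor model drops this proportionality constraint, which is where its 6 additional free parameters come from (18 loadings rather than 12).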

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of Holzinger and the 16 variables of Reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
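How ωh and ωt follow from a bifactor loading matrix can be illustrated directly. For a unit-weighted total score, the variance due to the general factor is the squared sum of the g loadings, the variance due to each group factor is the squared sum of its loadings, and the unique variances add up item by item. The base R sketch below uses the rounded confirmatory loadings printed above; it is an approximation based on two-decimal values, not the exact computation done inside omegaSem, and the object names are illustrative.

```r
# Rounded loadings from the confirmatory solution, items in printed order.
g   <- c(.77, .79, .75, .61, .60, .57, .57, .66, .53)  # general factor
grp <- c(.49, .45, .40, .61, .50, .39, .73, .25, .41)  # own group factor
u2  <- c(.18, .17, .27, .26, .39, .52, .15, .50, .55)  # uniquenesses
factor_of <- rep(c("F1", "F2", "F3"), each = 3)

# Variance of the unit-weighted total score, split by source.
general_var <- sum(g)^2
group_var   <- sum(tapply(grp, factor_of, sum)^2)
total_var   <- general_var + group_var + sum(u2)

omega_h <- general_var / total_var                 # general factor only
omega_t <- (general_var + group_var) / total_var   # all common factors

round(c(omega_h = omega_h, omega_t = omega_t), 2)
```

With these rounded loadings the sketch gives values close to the 0.79 and 0.93 reported for the confirmatory model. Because ωh depends on the squared sum of the g loadings, any constraint that moves common variance from the group factors onto g raises ωh while leaving ωt nearly unchanged.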

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem. (Plot title: "Omega from SEM".)

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


> plot.new()
> corf3.mod <- structure.diagram(fxs,Phis)

[Figure: path diagram titled "Structural model". Variables x1-x9 load on F1 (paths a1, a2, a3), F2 (b4, b5, b6), and F3 (c7, c8, c9); the factors are correlated rab, rac, and rbc.]

Figure 5: Three correlated factors with symbolic paths. Created using structure.diagram and structure.list and phi.list for ease of input.

Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary "cut" value of .2.

> f3.p <- Promax(fa(cor.f3$model,3))
> plot.new()
> mod.f3p <- structure.diagram(f3.p,cut=.2)

[Figure: path diagram titled "Structural model". Variables V1-V9 load on MR1 (0.9, 0.8, 0.7), MR2 (0.7, 0.6, 0.5), and MR3 (0.6, 0.5, 0.4).]

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y set.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <- rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)
     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00
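The population matrix above is what a factor model of the form R = F Phi F' (with ones on the diagonal) implies, where F stacks the x and y loadings in block form. The following base R sketch reproduces a few of the printed entries under that assumption; the names `f` and `model` are illustrative and are not taken from psych.

```r
# Loadings of the nine x variables on three X factors (values from the text)
fx <- matrix(c(.9, .8, .7, rep(0, 9), .7, .6, .5, rep(0, 9), .6, .5, .4), ncol = 3)
# Loadings of the three y variables on the single Y factor
fy <- c(.6, .5, .4)
# Correlations among the three X factors and the Y factor
Phi <- matrix(c(1, .48, .32, .4,
                .48, 1, .32, .3,
                .32, .32, 1, .2,
                .4, .3, .2, 1), ncol = 4)

# Block loading matrix: x variables load on factors 1-3, y variables on factor 4
f <- rbind(cbind(fx, 0), cbind(matrix(0, 3, 3), fy))
model <- f %*% Phi %*% t(f)
diag(model) <- 1   # a correlation matrix has a unit diagonal
labels <- c(paste0("x", 1:9), paste0("y", 1:3))
dimnames(model) <- list(labels, labels)

# A few representative entries to compare with the printed matrix
round(model[c("x1", "x4", "x7", "y1"), c("x1", "x4", "x7", "y1")], 2)
```

For example, the x1-x4 entry is .9 × .7 × .48, which rounds to the 0.30 shown in the table.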

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

[Figure: path diagram titled "Structural model". Variables x1-x9 load on X1 (a1, a2, a3), X2 (b4, b5, b6), and X3 (c7, c8, c9); y1-y3 load on Y (Ya1, Ya2, Ya3); X1, X2, and X3 are related to Y by rad, rbd, and rcd.]

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> hi.mod <-structure.diagram(fxh,Phi, fy)

[Figure: path diagram titled "Structural model". Variables x1-x9 load on X1 (a1, a2, a3), X2 (b4, b5, b6), and X3 (c7, c8, c9); the group factors are related to g by paths a, b, and c; g is related to Y by r; y1-y3 load on Y (a1, a2, a3).]

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)
Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   MR1
Correlation of (regression) scores with factors   0.94
Multiple R square of scores with factors          0.88
Minimum correlation of possible factor scores     0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
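The difference between the two sets of loadings can be checked without psych. The base R sketch below assumes that the cong1 matrix was generated from a single factor with loadings .9, .8, .7, and .6 (the values recovered by fa above) and extracts the first principal component with eigen; the variable names are illustrative.

```r
# Population correlations implied by one factor with loadings .9, .8, .7, .6
lambda <- c(.9, .8, .7, .6)
R <- tcrossprod(lambda)
diag(R) <- 1

# First principal component: eigenvector scaled by the root of its eigenvalue
e <- eigen(R, symmetric = TRUE)
pc1 <- abs(e$vectors[, 1]) * sqrt(e$values[1])   # abs() fixes the arbitrary sign

# Generating factor loadings next to the component loadings
round(rbind(factor = lambda, component = pc1), 2)
```

The component loadings differ from the generating loadings because components account for all of the variance of each variable, including its unique variance, while the factor model reproduces only the off-diagonal correlations.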

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ2 value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3
Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38

> ml3
Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
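The congruence coefficient is the cosine of the angle between two columns of loadings. As a rough check, the sketch below recomputes two entries of the table from the rounded loadings printed above; `congruence` is a hypothetical helper written for this illustration, not the psych function, and the results agree with the table only to the precision of the rounded inputs.

```r
# Tucker congruence: cosine of the angle between two loading vectors
congruence <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Loadings copied from the printed solutions above
rc1 <- c(.75, .76, .78, .29, .20, .07, .26, .20, .00)  # first rotated component
pa1 <- c(.8, .7, .6, 0, 0, 0, 0, 0, 0)                 # principal axis factor PA1
pa3 <- c(0, 0, 0, .7, .6, .5, 0, 0, 0)                 # principal axis factor PA3

# Compare with the RC1 row of the table
round(c(RC1_PA1 = congruence(rc1, pa1), RC1_PA3 = congruence(rc1, pa3)), 2)
```

Unlike a correlation, the congruence coefficient does not center the loadings, so two factors with non-overlapping loadings (such as PA1 and PA3) have a congruence of exactly zero.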

By default, principal uses the varimax rotation criterion, whereas fa uses the oblique oblimin criterion (which is why the fa solutions above report factor correlations). Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.
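The relation between a hierarchical structure and its bifactor form can be sketched in base R. The example assumes the structure of the simulated bifact matrix: nine variables loading .8, .7, .6, .7, .6, .5, .7, .6, .4 on three group factors, which in turn load .9, .8, .7 on g (the values displayed in the hierarchical diagram, Figure 10). It applies the Schmid-Leiman idea that a variable's g loading is the product of the two levels, while its residualized group loading is what remains once g is removed. The variable names are illustrative.

```r
# Lower-order loadings of V1-V9 on their group factors
lower <- c(.8, .7, .6, .7, .6, .5, .7, .6, .4)
group <- rep(1:3, each = 3)      # which group factor each variable loads on
# Loadings of the three group factors on the higher-order g
higher <- c(.9, .8, .7)

# Schmid-Leiman: the g loading is the product of the two levels;
# the residualized group loading is what is left after removing g
g_loading     <- lower * higher[group]
group_loading <- lower * sqrt(1 - higher[group]^2)

round(cbind(g = g_loading, group = group_loading), 2)
```

For V1 this gives a g loading of .8 × .9 = .72 and a residual group loading of about .35, so the squared loadings still sum to the communality of .64.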

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> om.bi
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

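> om.bi <- omega(bifact)

[Figure: bifactor diagram titled "Omega". V1-V9 load on the general factor g (0.7, 0.6, 0.5, 0.6, 0.5, 0.4, 0.5, 0.4, 0.3) and on the residualized group factors F1, F2, and F3 (0.3, 0.3, 0.3; 0.4, 0.4, 0.3; 0.5, 0.4, 0.3).]

Figure 9: An exploratory bifactor solution to the nine variable problem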

> om.hi <- omega(bifact,sl=FALSE)

[Figure: hierarchical diagram titled "Omega". V1-V9 load on F1 (0.8, 0.7, 0.6), F2 (0.7, 0.6, 0.5), and F3 (0.7, 0.6, 0.4); F1, F2, and F3 load on g (0.9, 0.8, 0.7).]

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g  F1*  F2*  F3*   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g  F1*  F2*  F3*
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0

The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23

The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g   F1*   F2*   F3*
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
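The omega coefficients for the total score can be approximated directly from the Schmid-Leiman loadings. The base R sketch below assumes the usual definitions: omega hierarchical is the squared sum of the g loadings divided by the variance of the total score, and omega total adds the variance due to the group factors. It uses the rounded loadings printed above, so it matches the reported values only to about two decimals; the variable names are illustrative.

```r
# Schmid-Leiman loadings and uniquenesses copied from the omega output above
g  <- c(.72, .63, .54, .56, .48, .40, .49, .42, .28)
f  <- c(.35, .31, .26, .42, .36, .30, .50, .43, .29)   # group factor loadings
u2 <- c(.36, .51, .64, .51, .64, .75, .51, .64, .84)
group <- rep(1:3, each = 3)

general_var <- sum(g)^2                       # total score variance due to g
group_var   <- sum(tapply(f, group, sum)^2)   # variance due to the group factors
total_var   <- general_var + group_var + sum(u2)

omega_h <- general_var / total_var
omega_t <- (general_var + group_var) / total_var
round(c(omega_h = omega_h, omega_t = omega_t), 2)
```

The gap between the two coefficients (about .12) is the share of total score variance that is reliable but attributable to the group factors rather than to g.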

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

[Figure: ICLUST dendrogram titled "Hierarchical cluster analysis of bifact data". V1-V9 are joined into clusters C1-C8, each labelled with its α value.]

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

 $model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

 $reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square  function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square  function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.
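Two of the fit statistics above can be recovered by hand from the empirical chi square. The sketch assumes that the reported probability is the upper tail of a chi square distribution with the model degrees of freedom and that the empirical BIC is defined as chi square minus df times log(N); both assumptions reproduce the printed values to within the rounding of the chi square. The variable names are illustrative.

```r
chisq <- 21.83   # empirical chi square reported above (rounded)
df    <- 7       # model degrees of freedom
n_obs <- 1000    # number of observations passed to esem

# Upper-tail probability of the chi square statistic
p_value <- pchisq(chisq, df, lower.tail = FALSE)
# BIC penalizes each degree of freedom by log(N)
ebic <- chisq - df * log(n_obs)

round(c(p = p_value, eBIC = ebic), 4)
```

A negative BIC here reflects that the misfit is small relative to the 7 × log(1000) penalty, even though the chi square itself is significant with 1000 observations.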

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

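[Figure: path diagram titled "Exploratory Structural Model". V and Q load on X1 (0.9, 0.8); A, nach, and Anx load on X2 (0.6, 0.8, -0.7); gpa, Pre, and MA load on Y1 (0.6, 0.5, 0.4); X1 and X2 are each related to Y1 (0.7, 0.7).]

Figure 12: An example of an Exploratory Structural Equation Model.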

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.
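To make the shape of these commands concrete, the sketch below builds by hand the kind of three-column (Path, Parameter, Value) matrix that the listings in the following sections display for a one-factor model. `make_one_factor_model` is a hypothetical helper written for this illustration; it is not part of psych or sem, and it does not attach the "mod" class that the real functions add.

```r
# Hypothetical helper: a one-factor model description with columns
# Path, Parameter, Value (NA value = free parameter, "1" = fixed at 1)
make_one_factor_model <- function(params, labels, factor = "X1") {
  stopifnot(length(params) == length(labels))
  loadings <- cbind(paste0(factor, "->", labels), params, NA)
  errors   <- cbind(paste0(labels, "<->", labels),
                    paste0("x", seq_along(labels), "e"), NA)
  variance <- cbind(paste0(factor, "<->", factor), NA, "1")
  model <- rbind(loadings, errors, variance)
  colnames(model) <- c("Path", "Parameter", "Value")
  model
}

# Equal parameter names force equal loadings (tau equivalence);
# distinct names (a, b, c, d) would give a congeneric model instead
make_one_factor_model(c("a", "a", "a", "a"), paste0("V", 1:4))
```

The design point is that constraints are expressed purely through parameter names: paths that share a name are estimated as a single parameter.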

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function, which requires Rgraphviz, or the structure.diagram or structure.sem functions, which do not, may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it
     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "a"       NA
[3,] "X1->V3"  "a"       NA
[4,] "X1->V4"  "a"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11
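The degrees of freedom and information criteria in this summary follow from simple counting. The sketch assumes the conventional definitions AIC = chi square + 2q (q free parameters) and BIC = chi square - df × log(N), which reproduce the printed values; the variable names are illustrative.

```r
n_var   <- 4
moments <- n_var * (n_var + 1) / 2   # 10 distinct variances and covariances
n_free  <- 5                         # one shared loading plus four error variances
df      <- moments - n_free          # degrees of freedom of the tau model

chisq <- 6.593496                    # model chi square from the summary
n_obs <- 100                         # sample size passed to sem

aic <- chisq + 2 * n_free
bic <- chisq - df * log(n_obs)
c(df = df, AIC = aic, BIC = bic)
```

The same counting gives 10 - 8 = 2 degrees of freedom for a congeneric model, in which each of the four loadings is a separate parameter.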

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model
     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "b"       NA
[3,] "X1->V3"  "c"       NA
[4,] "X1->V4"  "d"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12
> anova(sem.cong,sem.tau)  #test the difference between the two models
LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

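The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65 for a change of 3 degrees of freedom, p = .30) is not significant. With only 100 subjects, the tau equivalent model cannot be rejected in favor of the congeneric model, even though it is known to be false.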
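The likelihood ratio test reported by anova can be reproduced from the two model chi squares with base R. This is a sketch of the standard nested-model comparison using the values printed in the two summaries; the variable names are illustrative.

```r
chisq_tau  <- 6.593496; df_tau  <- 5   # equal loadings (more constrained)
chisq_cong <- 2.941678; df_cong <- 2   # free loadings

# Difference in chi square is itself chi square distributed,
# with df equal to the number of constraints released
lr_chisq <- chisq_tau - chisq_cong
lr_df    <- df_tau - df_cong
p_value  <- pchisq(lr_chisq, lr_df, lower.tail = FALSE)

round(c(LR = lr_chisq, df = lr_df, p = p_value), 4)
```

Freeing three loadings lowers the chi square by only 3.65, which is about what would be expected by chance for three additional parameters.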

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly, the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects, this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11
> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
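The block pattern in these residuals can be reproduced by hand as observed minus model-implied correlations. The sketch below assumes that bifact has the three-factor population structure recovered by fa earlier (group loadings .8, .7, .6, .7, .6, .5, .7, .6, .4 and factor correlations .72, .63, .56) and uses the one-factor loadings a to i from the summary; the variable names are illustrative.

```r
# Population structure of bifact (loadings and factor correlations from the fa output)
load3 <- matrix(0, 9, 3)
load3[1:3, 1] <- c(.8, .7, .6)
load3[4:6, 2] <- c(.7, .6, .5)
load3[7:9, 3] <- c(.7, .6, .4)
phi <- matrix(c(1, .72, .63, .72, 1, .56, .63, .56, 1), 3)
observed <- load3 %*% phi %*% t(load3)
diag(observed) <- 1

# One-factor loadings estimated by sem (parameters a to i above)
one <- c(0.7507129, 0.6726412, 0.5811209, 0.5737425, 0.5021915,
         0.4239908, 0.5067957, 0.4450171, 0.3052415)
implied <- tcrossprod(one)   # a single factor implies r_ij = lambda_i * lambda_j
diag(implied) <- 1

resid <- observed - implied
dimnames(resid) <- list(paste0("V", 1:9), paste0("V", 1:9))
round(resid[c("V1", "V4", "V7"), c("V2", "V5", "V8")], 2)
```

The single factor underestimates the correlations within each group (positive residuals, largest for V7 to V9) and slightly overestimates those between groups, which is the signature of omitted group factors.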

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <-fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24
> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
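The size of the between-group residuals has a simple explanation: with orthogonal factors the model implies zero correlation between variables in different groups, so each residual is the full population correlation, the product of the two loadings and the factor correlation. A base R sketch, assuming the loadings and factor correlations recovered by the exploratory analysis (the variable names are illustrative):

```r
lambda <- c(.8, .7, .6, .7, .6, .5, .7, .6, .4)   # loadings of V1-V9
group  <- rep(1:3, each = 3)                      # factor membership
phi    <- matrix(c(1, .72, .63, .72, 1, .56, .63, .56, 1), 3)

# Population correlation for each pair: lambda_i * lambda_j * phi[group_i, group_j]
population <- tcrossprod(lambda) * phi[group, group]
# Orthogonal model: same loadings, but factor correlations forced to zero
orthogonal <- tcrossprod(lambda) * diag(3)[group, group]

resid <- population - orthogonal
round(resid[c(1, 4, 7), c(1, 4, 7)], 2)   # V1, V4, V7 against each other
```

The V1-V4 residual, for example, is .8 × .7 × .72, which rounds to the 0.40 shown above.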

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <-fa(bifact,3)  #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)  #create the sem model
> mod.f3  #show it
      Path       Parameter Value
 [1,] "MR1->V1"  "F1V1"    NA
 [2,] "MR1->V2"  "F1V2"    NA
 [3,] "MR1->V3"  "F1V3"    NA
 [4,] "MR3->V4"  "F2V4"    NA
 [5,] "MR3->V5"  "F2V5"    NA
 [6,] "MR3->V6"  "F2V6"    NA
 [7,] "MR2->V7"  "F3V7"    NA
 [8,] "MR2->V8"  "F3V8"    NA
 [9,] "MR2->V9"  "F3V9"    NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"  NA
[20,] "MR2<->MR1" "rF3F1"  NA
[21,] "MR2<->MR3" "rF3F2"  NA
[22,] "MR1<->MR1" NA       "1"
[23,] "MR3<->MR3" NA       "1"
[24,] "MR2<->MR2" NA       "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone, used in the SAS manual for PROC CALIS, and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is “lazy loaded”). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sembi <- sem(omthbi$model$sem,Thurstone,213) #use the model created by omega

> summary(sembi,digits=2)

 Model Chisquare = 24.2163 Df = 18 Pr(>Chisq) = 0.1480685
 AIC = 78.2163
 BIC = -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations = 72

> stdCoef(sembi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> semhi <- sem(omhi$model$sem,Thurstone,213)

> summary(semhi,digits=2)

 Model Chisquare = 38.1963 Df = 24 Pr(>Chisq) = 0.03310059
 AIC = 80.1963
 BIC = -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations = 54

> stdCoef(semhi,digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(semhi,sembi)

LR Test for Difference Between Models

      Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semhi       24      38.196
sembi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
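The likelihood ratio comparison can be checked from the two model summaries alone. The sketch below is not part of the original analysis; it uses only base R and the chi square values and degrees of freedom printed above, with variable names chosen for illustration.

```r
# Fit statistics copied from the two sem summaries reported in the text.
chisq_hierarchical <- 38.1963
df_hierarchical <- 24
chisq_bifactor <- 24.2163
df_bifactor <- 18

# The hierarchical model is nested in the bifactor model, so the difference
# in chi square is referred to a chi square distribution whose degrees of
# freedom equal the difference in model degrees of freedom.
lr_chisq <- chisq_hierarchical - chisq_bifactor
lr_df <- df_hierarchical - df_bifactor
p_value <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

round(c(LR = lr_chisq, df = lr_df, p = p_value), 5)
```

The difference of 13.98 on 6 degrees of freedom corresponds to a probability of about .03, in agreement with the anova output.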

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> omholzbi <- omega(Holzinger,4)

> semholzbi <- sem(omholzbi$model$sem,Holzinger,355)

> omholzhi <- omega(Holzinger,4,sl=FALSE)

> semholzhi <- sem(omholzhi$model$sem,Holzinger,355)

> anova(semholzbi,semholzhi)

LR Test for Difference Between Models

          Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semholzbi       63      147.66
semholzhi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> omsem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than 0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max 3.73 max/min = 1.34
mean percent general = 0.6 with sd = 0.05 and cv of 0.09
Explained Common Variance of the general factor = 0.6

The degrees of freedom are 12 and the fit is 0.01
The number of observations was 213 with Chi Square = 2.98 with prob < 1
The root mean square of the residuals is 0.01
The df corrected root mean square of the residuals is 0.01
RMSEA index = 0 and the 10 % confidence intervals are 0 0
BIC = -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 1.48
The number of observations was 213 with Chi Square = 306.71 with prob < 3.3e-49
The root mean square of the residuals is 0.14
The df corrected root mean square of the residuals is 0.16
RMSEA index = 0.22 and the 10 % confidence intervals are 0.199 0.244
BIC = 161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

The following analyses were done using the lavaan package

Omega Hierarchical from a confirmatory model using sem = 0.79
Omega Total from a confirmatory model using sem = 0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are 18 and the fit is 24.33052 with p = 0.1444947
general/max 4.92 max/min = 1.3
mean percent general = 0.65 with sd = 0.15 and cv of 0.23
Explained Common Variance of the general factor = 0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
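To see where the confirmatory values come from, ωh and ωt can be recomputed from the loadings printed above. This is a base R sketch, not the omegaSem implementation: it assumes the standard definition for orthogonal bifactor loadings, in which the variance of the total score is the sum over factors of the squared sum of loadings plus the sum of the uniquenesses, and it uses the two-decimal loadings as printed, so the results are approximate.

```r
# Confirmatory loadings and uniquenesses as printed in the omegaSem output.
items <- c("Sentences", "Vocabulary", "SentCompletion", "FirstLetters",
           "FourLetterWords", "Suffixes", "LetterSeries", "Pedigrees",
           "LetterGroup")
cfa <- matrix(0, nrow = 9, ncol = 4,
              dimnames = list(items, c("g", "F1", "F2", "F3")))
cfa[, "g"]    <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
cfa[1:3, "F1"] <- c(0.49, 0.45, 0.40)
cfa[4:6, "F2"] <- c(0.61, 0.50, 0.39)
cfa[7:9, "F3"] <- c(0.73, 0.25, 0.41)
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)

# Variance in the total score contributed by each (orthogonal) factor:
# the squared sum of that factor's loadings.
common <- colSums(cfa)^2
total_variance <- sum(common) + sum(u2)

omega_h <- unname(common["g"]) / total_variance  # general factor only
omega_t <- sum(common) / total_variance          # all common factors

round(c(omega_h = omega_h, omega_t = omega_t), 2)
```

With these rounded inputs the two ratios come out at roughly .79 and .93, the values reported for the confirmatory model.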

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11 psychTools_2.0.8 psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5 splines_4.0.2 MASS_7.3-51.6 mnormt_2.0.1 statmod_1.4.34
 [6] pbivnorm_0.6.0 arm_1.11-1 lattice_0.20-41 mi_1.0 minqa_1.2.4
[11] tools_4.0.2 matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2 tmvnsim_1.0-2
[16] nlme_3.1-148 coda_0.19-3 abind_1.4-5 lme4_1.1-23 lavaan_0.6-6
[21] Matrix_1.2-18 nloptr_1.2.2.2 compiler_4.0.2 boot_1.3-25 stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


Drawing path models from Exploratory Factor Analysis solutions. Alternatively, this result can represent the estimated factor loadings and oblique correlations found using factanal (Maximum Likelihood factoring) or fa (Principal axis or minimum residual (minres) factoring) followed by a promax rotation using the Promax function (Figure 6). Comparing this figure with the previous one (Figure 5), it will be seen that one path was dropped because it was less than the arbitrary “cut” value of .2.

> f3p <- Promax(fa(corf3$model,3))

> plot.new()

> modf3p <- structure.diagram(f3p,cut=.2)

Figure 6: The empirically fitted structural model. Paths less than cut (.2 in this case, the default is .3) are not shown.

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y set.

> set.seed(42)

> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)

> fy <- c(.6,.5,.4)

> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)

> twelveV <- sim.structure(fx,Phi, fy)$model

> colnames(twelveV) <-rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))

> round(twelveV,2)

     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00
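The entries of this matrix follow directly from the loadings and Phi. As a check that does not require psych, the base R sketch below rebuilds the population correlations by stacking fx and fy into one 12 x 4 pattern matrix and applying the common factor identity R = F Phi F' with a unit diagonal. It is an illustration of that identity under the stated assumption, not a description of how sim.structure is implemented; the names `pattern` and `implied` are chosen here.

```r
# Loadings of x1..x9 on three X factors and of y1..y3 on one Y factor.
fx <- matrix(c(.9, .8, .7, rep(0, 9), .7, .6, .5, rep(0, 9), .6, .5, .4), ncol = 3)
fy <- c(.6, .5, .4)
Phi <- matrix(c(1, .48, .32, .4,
                .48, 1, .32, .3,
                .32, .32, 1, .2,
                .4, .3, .2, 1), ncol = 4)

# Stack the X and Y loadings into a single 12 x 4 pattern matrix.
vars <- c(paste0("x", 1:9), paste0("y", 1:3))
pattern <- matrix(0, nrow = 12, ncol = 4, dimnames = list(vars, NULL))
pattern[1:9, 1:3] <- fx
pattern[10:12, 4] <- fy

# Model-implied correlations: F Phi F' with ones on the diagonal.
implied <- pattern %*% Phi %*% t(pattern)
diag(implied) <- 1

round(implied, 2)
```

For example, the x1 and x4 entry is .9 × .48 × .7 ≈ .30 and the x1 and y1 entry is .9 × .4 × .6 ≈ .22, matching the values printed above.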

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))

> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))

> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.

> plot.new()

> sg3 <- structure.diagram(fxs,phi,fyx)

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.

> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))

> fy <- structure.list(3,list(Y=c(1,2,3)))

> Phi <- diag(1,5,5)

> Phi[4,c(1:3)] <- letters[1:3]

> Phi[5,4] <- "r"

> plot.new()

> himod <-structure.diagram(fxh,Phi, fy)

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)

Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity = 1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is 0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)

Factor Analysis using method = minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity = 1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are 6 and the objective function was 1.65
The degrees of freedom for the model are 2 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 MR1
Correlation of (regression) scores with factors 0.94
Multiple R square of scores with factors        0.88
Minimum correlation of possible factor scores   0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
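The difference can be illustrated without psych. The sketch below assumes that the congeneric correlation matrix is generated by loadings of .9, .8, .7 and .6 (the values recovered by fa above) and uses base R's eigen to form the first principal component. It is a minimal illustration of the comparison, not the code used by principal or fa.

```r
# Congeneric population matrix implied by loadings .9, .8, .7, .6.
loads <- c(V1 = 0.9, V2 = 0.8, V3 = 0.7, V4 = 0.6)
r <- tcrossprod(loads)
diag(r) <- 1

# First principal component: eigenvector scaled by the root of its eigenvalue.
e <- eigen(r, symmetric = TRUE)
pc1 <- e$vectors[, 1] * sqrt(e$values[1])
pc1 <- pc1 * sign(sum(pc1))  # eigenvectors are defined only up to sign

# Off-diagonal residuals left by one component versus the one-factor model.
residual_pc <- r - tcrossprod(pc1)
residual_fa <- r - tcrossprod(loads)
off_diagonal <- upper.tri(r)
rmsr_pc <- sqrt(mean(residual_pc[off_diagonal]^2))
rmsr_fa <- sqrt(mean(residual_fa[off_diagonal]^2))

round(rbind(component = pc1, factor = loads), 2)
round(c(rmsr_pc = rmsr_pc, rmsr_fa = rmsr_fa), 2)
```

Because the component must also account for the unit diagonal, its loadings differ from the generating values and leave off-diagonal residuals of roughly .11, whereas the factor loadings reproduce the off-diagonal correlations exactly.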

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ2 value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)

> pa3 <- fa(bifact,3,fm="pa")

> ml3 <- fa(bifact,3,fm="ml")

> pc3

Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity = 1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is 0.1

Fit based upon off diagonal values = 0.88

> pa3

Factor Analysis using method = pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                PA1  PA3  PA2
Correlation of (regression) scores with factors 0.9 0.85 0.83
Multiple R square of scores with factors        0.8 0.72 0.69
Minimum correlation of possible factor scores   0.6 0.45 0.38

> ml3

Factor Analysis using method = ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity = 1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are 36 and the objective function was 1.88
The degrees of freedom for the model are 12 and the objective function was 0

The root mean square of the residuals (RMSR) is 0
The df corrected root mean square of the residuals is 0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 ML1  ML3  ML2
Correlation of (regression) scores with factors 0.90 0.85 0.83
Multiple R square of scores with factors        0.80 0.72 0.69
Minimum correlation of possible factor scores   0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))

     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
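A single entry of this table can be checked by hand. The sketch below, which is not the psych implementation, computes the cosine between two loading vectors as the sum of cross products divided by the square root of the product of the sums of squares. The function name `congruence` is chosen here for illustration, and the inputs are the two-decimal RC1 and PA1 loadings printed above, so the result is approximate.

```r
# Cosine of the angle between two loading vectors.
congruence <- function(x, y) {
  sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

# First rotated component and first principal axis factor, as printed above.
rc1 <- c(0.75, 0.76, 0.78, 0.29, 0.20, 0.07, 0.26, 0.20, 0.00)
pa1 <- c(0.8, 0.7, 0.6, 0, 0, 0, 0, 0, 0)

round(congruence(rc1, pa1), 2)
```

The result is close to the .93 reported for RC1 and PA1. Unlike a correlation, the coefficient does not centre the loadings, which is why it is interpreted as a cosine.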

By default, all three of these procedures use the varimax rotation criterion. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)

> ml3p

Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957), with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.
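To make the relation between the hierarchical and bifactor representations concrete, the following base R sketch applies the Schmid-Leiman transformation by hand. It assumes the simulated structure used in this section: lower order loadings of 0.8, 0.7, 0.6 / 0.7, 0.6, 0.5 / 0.7, 0.6, 0.4 and loadings of the three group factors on g of 0.9, 0.8 and 0.7. The variable names (lower, group, g_on_F, sl) are illustrative and are not part of psych.

```r
# Lower order (group factor) loadings for V1..V9 (assumed population values)
lower  <- c(0.8, 0.7, 0.6,  0.7, 0.6, 0.5,  0.7, 0.6, 0.4)
# Group factor to which each variable belongs
group  <- rep(1:3, each = 3)
# Loadings of the three group factors on the higher order factor g
g_on_F <- c(0.9, 0.8, 0.7)

# Schmid-Leiman: general loading = lower order loading * loading of its factor on g
g_loading <- lower * g_on_F[group]
# Residualized group loading = lower order loading * sqrt(1 - (factor loading on g)^2)
group_loading <- lower * sqrt(1 - g_on_F[group]^2)

sl <- data.frame(g     = g_loading,
                 group = group_loading,
                 h2    = g_loading^2 + group_loading^2,
                 row.names = paste0("V", 1:9))
sl$u2 <- 1 - sl$h2          # uniqueness
sl$p2 <- sl$g^2 / sl$h2     # proportion of common variance due to g
print(round(sl, 2))
```

The g and group columns produced this way can be compared with the Schmid Leiman factor loadings that omega reports for the bifact data.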

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> om.bi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> om.bi <- omega(bifact)

[Figure: path diagram titled "Omega", with a general factor g and three group factors F1, F2, F3 loading on V1 to V9.]

Figure 9: An exploratory bifactor solution to the nine variable problem.

> om.hi <- omega(bifact,sl=FALSE)

[Figure: path diagram titled "Omega", with V1 to V9 loading on F1, F2, F3, which in turn load on g.]

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than 0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max 4.67 max/min = 1.82
mean percent general = 0.65 with sd = 0.14 and cv of 0.21
Explained Common Variance of the general factor = 0.67

The degrees of freedom are 12 and the fit is 0

The root mean square of the residuals is 0
The df corrected root mean square of the residuals is 0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 0.23

The root mean square of the residuals is 0.07
The df corrected root mean square of the residuals is 0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
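The reliability estimates in this output can be approximated by hand from the population structure. The sketch below is base R only and assumes the same simulated loadings as the bifact example (lower order loadings 0.8 to 0.4, group factors loading 0.9, 0.8, 0.7 on g); it uses the usual definitions of alpha, omega hierarchical, omega total and explained common variance rather than calling omega itself, so small differences from the printed values are possible.

```r
# Assumed population structure for the nine variable example
lower  <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
group  <- rep(1:3, each = 3)
g_on_F <- c(0.9, 0.8, 0.7)

# Pattern matrix (9 x 3) and the factor intercorrelations implied by g
L <- matrix(0, 9, 3)
L[cbind(1:9, group)] <- lower
Phi <- tcrossprod(g_on_F)
diag(Phi) <- 1

# Model implied correlation matrix
R <- L %*% Phi %*% t(L)
diag(R) <- 1

k  <- nrow(R)
Vt <- sum(R)                          # variance of the total score
g_loading <- lower * g_on_F[group]    # Schmid-Leiman general loadings
u2 <- 1 - lower^2                     # uniquenesses

alpha   <- k / (k - 1) * (1 - k / Vt)        # standardized alpha
omega_h <- sum(g_loading)^2 / Vt             # general factor saturation
omega_t <- (Vt - sum(u2)) / Vt               # total reliable variance
ecv     <- sum(g_loading^2) / sum(lower^2)   # explained common variance of g

print(round(c(alpha = alpha, omega_h = omega_h, omega_t = omega_t, ECV = ecv), 2))
```

Under these assumptions the four values round to the Alpha, Omega Hierarchical, Omega Total and Explained Common Variance figures shown in the omega output.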

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

[Figure: ICLUST dendrogram titled "Hierarchical cluster analysis of bifact data", joining V1 to V9 into clusters C1 to C8.]

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method = minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are 56 and the empirical chi square function was 6930.29
The degrees of freedom for the model are 7 and the empirical chi square function was 21.83
 with prob < 0.0027

The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.04
 with the empirical chi square 21.83 with prob < 0.0027
The total number of observations was 1000 with fitted Chi Square = 2175.06 with prob < 0

Empirical BIC = -26.53
ESABIC = -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

[Figure: path diagram titled "Exploratory Structural Model" relating the X factors (X1, X2; indicators V, Q, A, nach, Anx) to the Y factor (Y1; indicators gpa, Pre, MA).]

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function, which requires Rgraphviz, or the structure.diagram or structure.sem functions, which do not, may be used.
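The difference between the two models can be illustrated without any fitting. The base R sketch below builds the correlation matrix implied by a one factor model (each correlation is the product of the two loadings) and counts degrees of freedom for four variables. The loadings are hypothetical values chosen only for illustration; they are not the loadings of the cong data set.

```r
# Hypothetical loadings, for illustration only
lambda_cong <- c(0.8, 0.7, 0.6, 0.5)   # congeneric: loadings may differ
lambda_tau  <- rep(0.65, 4)            # tau equivalent: one common loading

# Correlations implied by a single factor: r_ij = lambda_i * lambda_j
implied <- function(lambda) {
  R <- tcrossprod(lambda)
  diag(R) <- 1
  R
}
R_cong <- implied(lambda_cong)
R_tau  <- implied(lambda_tau)
print(round(R_cong, 2))
print(round(R_tau, 2))

# Degrees of freedom = unique (co)variance elements - free parameters
k        <- 4
n_unique <- k * (k + 1) / 2     # 10 unique elements
df_cong  <- n_unique - (k + k)  # 4 loadings + 4 error variances
df_tau   <- n_unique - (1 + k)  # 1 loading  + 4 error variances
print(c(congeneric = df_cong, tau_equivalent = df_tau))
```

The tau equivalent model frees three fewer parameters, which is why the two models differ by three degrees of freedom when they are compared.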

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "a"       NA
[3,] "X1->V3"  "a"       NA
[4,] "X1->V4"  "a"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "b"       NA
[3,] "X1->V3"  "c"       NA
[4,] "X1->V4"  "d"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)  #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

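The anova comparison of the congeneric versus tau equivalent model shows that the change in χ² (3.65 for a change of 3 degrees of freedom, p = 0.30) is not significant. That is, although the tau equivalent hypothesis is actually false, with 100 subjects it cannot be rejected in favor of the congeneric model.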
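The likelihood ratio test reported by anova can be reproduced by hand from the two model χ² values and their degrees of freedom. This is a minimal base R sketch that uses only the numbers printed in the two summaries above; the variable names are illustrative.

```r
# Chi square and df of the two nested models, copied from the summaries
chisq_tau  <- 6.593496; df_tau  <- 5   # tau equivalent (more restricted)
chisq_cong <- 2.941678; df_cong <- 2   # congeneric (less restricted)

# Likelihood ratio (chi square difference) test
lr    <- chisq_tau - chisq_cong
lr_df <- df_tau - df_cong
p     <- pchisq(lr, df = lr_df, lower.tail = FALSE)

print(round(c(LR = lr, df = lr_df, p = p), 4))
```

A large p value here means that freeing the three extra loadings does not improve the fit by more than would be expected by chance.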

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
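Why the between group residuals are so large can be seen from the population model itself. The following base R sketch assumes the simulated structure behind bifact (loadings 0.8 to 0.4 on three factors whose intercorrelations are implied by g loadings of 0.9, 0.8, 0.7) and subtracts the correlations implied by an orthogonal three factor model. It is an illustration of the algebra, not a call to sem.

```r
# Assumed population structure for the nine variable example
lower  <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
group  <- rep(1:3, each = 3)
g_on_F <- c(0.9, 0.8, 0.7)

# Pattern matrix and correlated factors
L <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), paste0("F", 1:3)))
L[cbind(1:9, group)] <- lower
Phi <- tcrossprod(g_on_F)
diag(Phi) <- 1

# Correlations implied by the oblique (true) and orthogonal (forced) models
R_oblique <- L %*% Phi %*% t(L)
diag(R_oblique) <- 1
R_orthogonal <- tcrossprod(L)   # same loadings, but Phi = I
diag(R_orthogonal) <- 1

# Residuals: what the orthogonal model fails to reproduce
resid_orth <- R_oblique - R_orthogonal
print(round(resid_orth, 2))
```

Within each group the two models imply the same correlations, so those residuals vanish; between groups the orthogonal model implies zero, so the whole correlation is left as a residual.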

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)  #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)  #create the sem model
> mod.f3  #show it

      Path       Parameter Value
 [1,] "MR1->V1"  "F1V1"    NA
 [2,] "MR1->V2"  "F1V2"    NA
 [3,] "MR1->V3"  "F1V3"    NA
 [4,] "MR3->V4"  "F2V4"    NA
 [5,] "MR3->V5"  "F2V5"    NA
 [6,] "MR3->V6"  "F2V6"    NA
 [7,] "MR2->V7"  "F3V7"    NA
 [8,] "MR2->V8"  "F3V8"    NA
 [9,] "MR2->V9"  "F3V9"    NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"  NA
[20,] "MR2<->MR1" "rF3F1"  NA
[21,] "MR2<->MR3" "rF3F2"  NA
[22,] "MR1<->MR1" NA       "1"
[23,] "MR3<->MR3" NA       "1"
[24,] "MR2<->MR2" NA       "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

[Figure: path diagram titled "Structural model", with V1 to V3 loading on MR1, V4 to V6 on MR3, V7 to V9 on MR2, and the three factors correlated.]

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

[Figure: path diagram titled "Omega", with a general factor g and group factors F1, F2, F3 loading on the nine Thurstone variables.]

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

> stdCoef(sem.bi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

[Figure: path diagram titled "Omega", with the nine Thurstone variables loading on F1, F2, F3, which in turn load on g.]

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

> stdCoef(sem.hi,digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
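The AIC and BIC values printed in the two summaries offer another way to compare the models. The base R sketch below recomputes them from the reported χ², degrees of freedom and sample size. The formulas (AIC = χ² + 2 × number of free parameters, BIC = χ² − df × log(n)) are an inference from the printed numbers, which they reproduce, and are not quoted from the sem documentation; the parameter counts are obtained by counting rows in the Parameter Estimates tables.

```r
n <- 213   # number of observations for the Thurstone data

fits <- data.frame(
  model = c("hierarchical", "bifactor"),
  chisq = c(38.1963, 24.2163),   # Model Chisquare from each summary
  df    = c(24, 18),             # model degrees of freedom
  npar  = c(21, 27)              # free parameters counted in each summary
)

# Assumed definitions, chosen because they reproduce the printed values
fits$AIC <- fits$chisq + 2 * fits$npar
fits$BIC <- fits$chisq - fits$df * log(n)

print(fits)
```

On these numbers AIC is slightly lower for the bifactor model while BIC is lower for the hierarchical model, so the two criteria weigh the six extra parameters differently.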

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than 0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max 3.73 max/min = 1.34
mean percent general = 0.6 with sd = 0.05 and cv of 0.09
Explained Common Variance of the general factor = 0.6

The degrees of freedom are 12 and the fit is 0.01
The number of observations was 213 with Chi Square = 2.98 with prob < 1
The root mean square of the residuals is 0.01
The df corrected root mean square of the residuals is 0.01
RMSEA index = 0 and the 10 % confidence intervals are 0 0
BIC = -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 1.48
The number of observations was 213 with Chi Square = 306.71 with prob < 3.3e-49
The root mean square of the residuals is 0.14
The df corrected root mean square of the residuals is 0.16

RMSEA index = 0.22 and the 10 % confidence intervals are 0.199 0.244
BIC = 161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                     g   F1   F2   F3   h2   u2   p2
Sentences         0.77 0.49           0.82 0.18 0.72
Vocabulary        0.79 0.45           0.83 0.17 0.75
Sent.Completion   0.75 0.40           0.73 0.27 0.77
First.Letters     0.61      0.61      0.74 0.26 0.50
Four.Letter.Words 0.60      0.50      0.61 0.39 0.59
Suffixes          0.57      0.39      0.48 0.52 0.68
Letter.Series     0.57           0.73 0.85 0.15 0.38
Pedigrees         0.66           0.25 0.50 0.50 0.87
Letter.Group      0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object.

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.


[Path diagram "Omega from SEM": a general factor g and three group factors F1, F2, F3 with loadings on the nine Thurstone variables]

Figure 16: Confirmatory Omega structure using omegaSem


The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.


> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2  grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


  • The psych package
    • Preface
    • Creating and modeling structural relations
  • Functions for generating correlational matrices with a particular structure
    • sim.congeneric
    • sim.hierarchical
    • sim.item and sim.circ
    • sim.structure
      • fx is a vector implies a congeneric model
      • fx is a matrix implies an independent factors model
      • fx is a matrix and Phi ≠ I is a correlated factors model
      • fx and fy are matrices, and Phi ≠ I represents their correlations
      • A hierarchical structure among the latent predictors
  • Exploratory functions for analyzing structure
    • Exploratory simple structure models
    • Exploratory hierarchical models
      • A bifactor solution
      • A hierarchical solution
  • Exploratory Structural Equation Modeling (ESEM)
  • Confirmatory models
    • Using psych as a front end for the sem package
    • Testing a congeneric model versus a tau equivalent model
    • Testing the dimensionality of a hierarchical data set by creating the model
    • Testing the dimensionality based upon an exploratory analysis
    • Specifying a three factor model
    • Allowing for an oblique solution
    • Extract a bifactor solution using omega and then test that model using sem
      • sem of Thurstone 9 variable problem
    • Examining a hierarchical solution
    • Estimating Omega using EFA followed by CFA
  • Summary and conclusion

2.4.4 fx and fy are matrices, and Phi ≠ I represents their correlations

A more complicated model is when there is a fy vector or matrix representing a set of Y latent variables that are associated with a set of y variables. In this case, the Phi matrix is a set of correlations within the X set and between the X and Y set.

> set.seed(42)
> fx <- matrix(c(.9,.8,.7,rep(0,9),.7,.6,.5,rep(0,9),.6,.5,.4), ncol=3)
> fy <- c(.6,.5,.4)
> Phi <- matrix(c(1,.48,.32,.4,.48,1,.32,.3,.32,.32,1,.2,.4,.3,.2,1), ncol=4)
> twelveV <- sim.structure(fx,Phi, fy)$model
> colnames(twelveV) <-rownames(twelveV) <- c(paste("x",1:9,sep=""),paste("y",1:3,sep=""))
> round(twelveV,2)

     x1   x2   x3   x4   x5   x6   x7   x8   x9   y1   y2   y3
x1 1.00 0.72 0.63 0.30 0.26 0.22 0.17 0.14 0.12 0.22 0.18 0.14
x2 0.72 1.00 0.56 0.27 0.23 0.19 0.15 0.13 0.10 0.19 0.16 0.13
x3 0.63 0.56 1.00 0.24 0.20 0.17 0.13 0.11 0.09 0.17 0.14 0.11
x4 0.30 0.27 0.24 1.00 0.42 0.35 0.13 0.11 0.09 0.13 0.10 0.08
x5 0.26 0.23 0.20 0.42 1.00 0.30 0.12 0.10 0.08 0.11 0.09 0.07
x6 0.22 0.19 0.17 0.35 0.30 1.00 0.10 0.08 0.06 0.09 0.07 0.06
x7 0.17 0.15 0.13 0.13 0.12 0.10 1.00 0.30 0.24 0.07 0.06 0.05
x8 0.14 0.13 0.11 0.11 0.10 0.08 0.30 1.00 0.20 0.06 0.05 0.04
x9 0.12 0.10 0.09 0.09 0.08 0.06 0.24 0.20 1.00 0.05 0.04 0.03
y1 0.22 0.19 0.17 0.13 0.11 0.09 0.07 0.06 0.05 1.00 0.30 0.24
y2 0.18 0.16 0.14 0.10 0.09 0.07 0.06 0.05 0.04 0.30 1.00 0.20
y3 0.14 0.13 0.11 0.08 0.07 0.06 0.05 0.04 0.03 0.24 0.20 1.00
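As a rough check on the matrix above, the population correlations can be reproduced in base R. This is only a sketch: it assumes, as the printed values suggest, that the model matrix is F Φ F′ with the diagonal set to 1, where F stacks fx and fy. The names f_all and R_implied are illustrative and are not part of psych.

```r
# Loadings of the nine x variables on the three X factors (values from the example)
fx <- matrix(c(.9, .8, .7, rep(0, 9), .7, .6, .5, rep(0, 9), .6, .5, .4), ncol = 3)
# Loadings of the three y variables on the single Y factor
fy <- c(.6, .5, .4)
# Correlations among the three X factors and the Y factor
Phi <- matrix(c(1,   .48, .32, .4,
                .48, 1,   .32, .3,
                .32, .32, 1,   .2,
                .4,  .3,  .2,  1), ncol = 4)

# Stack fx and fy into one 12 x 4 loading matrix (assumed layout)
f_all <- rbind(cbind(fx, 0), cbind(matrix(0, 3, 3), fy))
dimnames(f_all) <- list(c(paste0("x", 1:9), paste0("y", 1:3)),
                        c("X1", "X2", "X3", "Y"))

# Model-implied correlations: F Phi F', with a unit diagonal
R_implied <- f_all %*% Phi %*% t(f_all)
diag(R_implied) <- 1
print(round(R_implied, 2))
```

For example, the x1-y1 entry is 0.9 × 0.4 × 0.6 = 0.216, which rounds to the 0.22 shown in the first row.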

Data with this structure may be created using the sim.structure function, and shown either with the numeric values or symbolically using the structure.diagram function (Figure 7).

> fxs <- structure.list(9,list(X1=c(1,2,3), X2 =c(4,5,6),X3 = c(7,8,9)))
> phi <- phi.list(4,list(F1=c(4),F2=c(4),F3=c(4),F4=c(1,2,3)))
> fyx <- structure.list(3,list(Y=c(1,2,3)),"Y")

2.4.5 A hierarchical structure among the latent predictors.

Measures of intelligence and psychopathology frequently have a general factor as well as multiple group factors. The general factor then is thought to predict some dependent latent variable. Compare this with the previous model (see Figure 7).

These two models can be compared using structural modeling procedures (see below).

3 Exploratory functions for analyzing structure

Given correlation matrices such as those seen above for congeneric or bifactor models, the question becomes how best to estimate the underlying structure. Because these data sets were generated from a known model, the question becomes how well does a particular model recover the underlying structure.


> plot.new()
> sg3 <- structure.diagram(fxs,phi,fyx)

[Path diagram "Structural model": x1 to x9 load on the latent variables X1, X2, X3 (paths a1 to a3, b4 to b6, c7 to c9); y1 to y3 load on the latent Y (paths Ya1 to Ya3); X1, X2, X3 are linked to Y by rad, rbd, rcd]

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.


> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> hi.mod <-structure.diagram(fxh,Phi, fy)

[Path diagram "Structural model": x1 to x9 load on X1, X2, X3 (paths a1 to a3, b4 to b6, c7 to c9); X1, X2, X3 load on a general factor g (paths a, b, c); g is linked by path r to the latent Y, which is measured by y1 to y3]

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.


3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and, if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)

Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)

Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 MR1
Correlation of (regression) scores with factors 0.94
Multiple R square of scores with factors        0.88
Minimum correlation of possible factor scores   0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
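The contrast between the two outputs can be illustrated with base R alone. The sketch below assumes the congeneric matrix was generated from loadings of 0.9, 0.8, 0.7 and 0.6 (the values that fa recovers above), and it uses eigen rather than the psych functions; the object names are illustrative.

```r
# Congeneric loadings implied by the fa output above
lambda <- c(.9, .8, .7, .6)
R <- tcrossprod(lambda)   # off-diagonal elements are lambda_i * lambda_j
diag(R) <- 1

# First principal component: eigenvector scaled by the square root of its eigenvalue
e <- eigen(R, symmetric = TRUE)
pc1 <- abs(e$vectors[, 1]) * sqrt(e$values[1])

# Off-diagonal residuals left by the component and by the factor model
off <- row(R) != col(R)
resid_pc <- (R - tcrossprod(pc1))[off]
resid_fa <- (R - tcrossprod(lambda))[off]

print(round(pc1, 2))                    # component loadings
print(round(sqrt(mean(resid_pc^2)), 2)) # clearly non-zero for the component
print(max(abs(resid_fa)))               # zero for the one factor model
```

The component has to account for the unit diagonal as well as the correlations, so it cannot also reproduce the off-diagonal elements exactly; the factor model fits only the off-diagonal elements and leaves the remainder as uniqueness.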

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ2 value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")

> pc3

Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3

Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                PA1  PA3  PA2
Correlation of (regression) scores with factors 0.9 0.85 0.83
Multiple R square of scores with factors        0.8 0.72 0.69
Minimum correlation of possible factor scores   0.6 0.45 0.38

> ml3

Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 ML1  ML3  ML2
Correlation of (regression) scores with factors 0.90 0.85 0.83
Multiple R square of scores with factors        0.80 0.72 0.69
Minimum correlation of possible factor scores   0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))

     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
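The coefficients in this table are cosines between columns of loadings. A minimal base R sketch of that calculation follows; the helper name congruence is hypothetical (it is not the psych implementation), and the two loading vectors are copied from the rounded RC1 and PA1 columns printed above.

```r
# Cosine of the angle between loading vectors (no centering, unlike a correlation)
congruence <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  crossprod(x, y) / sqrt(tcrossprod(colSums(x^2), colSums(y^2)))
}

# Rounded loadings of V1..V9 on RC1 and PA1, taken from the output above
rc1 <- c(.75, .76, .78, .29, .20, .07, .26, .20, .00)
pa1 <- c(.8, .7, .6, 0, 0, 0, 0, 0, 0)

print(round(congruence(rc1, pa1), 2))
```

With these rounded inputs the result is about 0.93, in line with the RC1-PA1 entry of the table. Because the loadings are not centered, two factors with uniformly positive loadings can show a high congruence even when their profiles differ.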

By default, principal rotates using the varimax criterion, whereas fa applies the oblique oblimin transformation, which is why the pa3 and ml3 solutions above report factor correlations. Perhaps it is useful to apply an alternative oblique transformation such as Promax to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p

Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> om.bi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85


> om.bi <- omega(bifact)

[Path diagram "Omega": a general factor g and three group factors F1, F2, F3, each with loadings on V1 through V9]

Figure 9: An exploratory bifactor solution to the nine variable problem


> om.hi <- omega(bifact,sl=FALSE)

[Path diagram "Omega": V1 through V9 load on the group factors F1, F2, F3, which in turn load on the general factor g]

Figure 10: An exploratory hierarchical solution to the nine variable problem.


Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0

The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23

The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
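To see where the reliability estimates come from, the Schmid-Leiman loadings and the two omega coefficients can be worked out by hand. The sketch below is base R only and rests on an assumption read off the printed output: that the bifact matrix follows a hierarchical model with first-order loadings of 0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4 and second-order loadings of 0.9, 0.8, 0.7 on g. It is not the omega function's own code, and the object names are illustrative.

```r
# Assumed hierarchical model behind the bifact example
primary <- c(.8, .7, .6, .7, .6, .5, .7, .6, .4)  # V1..V9 on their group factor
group   <- rep(1:3, each = 3)                     # V1-V3 on F1, V4-V6 on F2, V7-V9 on F3
g_on_f  <- c(.9, .8, .7)                          # F1..F3 on the general factor

# Schmid-Leiman transformation: general and residualized group loadings
g_load   <- primary * g_on_f[group]
grp_load <- primary * sqrt(1 - g_on_f[group]^2)
u2       <- 1 - g_load^2 - grp_load^2

# Variance of the unit-weighted total score, built from the bifactor decomposition
grp_sums <- tapply(grp_load, group, sum)
Vt <- sum(g_load)^2 + sum(grp_sums^2) + sum(u2)

omega_h <- sum(g_load)^2 / Vt     # share of total variance due to g
omega_t <- (Vt - sum(u2)) / Vt    # share due to all common factors
print(round(cbind(g = g_load, group = grp_load, u2 = u2), 2))
print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

Under these assumptions the general loadings match the g column above (0.72, 0.63, ...), and the two coefficients come out close to the reported Omega Hierarchical of 0.7 and Omega Total of 0.82.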

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

[Cluster diagram "Hierarchical cluster analysis of bifact data": V1 through V9 are merged into clusters C1 through C8, each cluster labeled with its α and joined by path coefficients]

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

 $model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the X set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the Y set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).


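[Path diagram "Exploratory Structural Model": the X variables V, Q, A, nach, Anx load on the factors X1 and X2; gpa, Pre, MA load on Y1 (0.6, 0.5, 0.4); X1 and X2 are each linked to Y1 by a path of 0.7]

Figure 12: An example of an Exploratory Structural Equation Model.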


5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function, which requires Rgraphviz, or the structure.diagram or structure.sem functions, which do not, may be used.
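The two models differ only in how many loadings are free, which fixes their degrees of freedom before any data are seen. A small base R sketch of the count, assuming the factor variance is fixed at 1 (as in the models of this section) and using a hypothetical helper named model_df:

```r
# Degrees of freedom = unique variances and covariances minus free parameters
model_df <- function(n_vars, n_free) {
  n_vars * (n_vars + 1) / 2 - n_free
}

# Tau equivalent: one common loading plus four error variances
df_tau <- model_df(4, 1 + 4)
# Congeneric: four separate loadings plus four error variances
df_cong <- model_df(4, 4 + 4)

print(c(tau = df_tau, congeneric = df_cong))
```

The two models therefore differ by three degrees of freedom, the number of equality constraints placed on the loadings.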

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau   #show it

     Path       Parameter Value
[1,] "X1->V1"   "a"       NA
[2,] "X1->V2"   "a"       NA
[3,] "X1->V3"   "a"       NA
[4,] "X1->V4"   "a"       NA
[5,] "V1<->V1"  "x1e"     NA
[6,] "V2<->V2"  "x2e"     NA
[7,] "V3<->V3"  "x3e"     NA
[8,] "V4<->V4"  "x4e"     NA
[9,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong   #show the model

     Path       Parameter Value
[1,] "X1->V1"   "a"       NA
[2,] "X1->V2"   "b"       NA
[3,] "X1->V3"   "c"       NA
[4,] "X1->V4"   "d"       NA
[5,] "V1<->V1"  "x1e"     NA
[6,] "V2<->V2"  "x2e"     NA
[7,] "V3<->V3"  "x3e"     NA
[8,] "V4<->V4"  "x4e"     NA
[9,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)  #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

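The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65) is not significant given the change in degrees of freedom (3, p = 0.30). That is, although the tau equivalent hypothesis is actually false, with 100 subjects it can not be rejected in favor of the congeneric model.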
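The likelihood ratio test reported by anova can be reproduced by hand from the two model summaries. The sketch below uses only base R and the χ2 and Df values printed above; the variable names are illustrative.

```r
# Model chi squares and degrees of freedom copied from the two summaries
chisq_tau  <- 6.593496
df_tau     <- 5
chisq_cong <- 2.941678
df_cong    <- 2

# The difference of nested model chi squares is itself chi square distributed
lr_chisq <- chisq_tau - chisq_cong
lr_df    <- df_tau - df_cong
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

print(round(c(LR = lr_chisq, df = lr_df, p = p_value), 4))
```

The resulting p value is the quantity shown in the Pr(>Chisq) column of the anova table, and is the number to judge against the chosen significance level.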

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects, this model can not be rejected. However, an examination of the residuals suggests serious problems with the model.

gt modone lt- structuresem(letters[19]labels=paste(V19sep=))

gt modone show the model

Path Parameter Value

[1] X1-gtV1 a NA

[2] X1-gtV2 b NA

[3] X1-gtV3 c NA

[4] X1-gtV4 d NA

[5] X1-gtV5 e NA

[6] X1-gtV6 f NA

[7] X1-gtV7 g NA

[8] X1-gtV8 h NA

[9] X1-gtV9 i NA

[10] V1lt-gtV1 x1e NA

[11] V2lt-gtV2 x2e NA

[12] V3lt-gtV3 x3e NA

[13] V4lt-gtV4 x4e NA

[14] V5lt-gtV5 x5e NA

[15] V6lt-gtV6 x6e NA

[16] V7lt-gtV7 x7e NA

[17] V8lt-gtV8 x8e NA

[18] V9lt-gtV9 x9e NA

[19] X1lt-gtX1 NA 1

attr(class)

[1] mod

> sem.one <- sem(mod.one, bifact, 100)

> summary(sem.one, digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one), 2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
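The fit statistics in the summary output for the one factor model can be checked by hand. The following is a minimal base R sketch, not code from the sem package: it assumes that the reported AIC is the chi-square plus twice the number of free parameters and that the reported BIC is the chi-square minus df times log(N). Both assumptions reproduce the values printed for this model. The variable names are illustrative.

```r
# Values copied from the summary of the one factor model (mod.one)
chisq  <- 21.16848   # model chi-square
df     <- 27         # degrees of freedom
n_obs  <- 100        # number of observations passed to sem()
n_free <- 18         # 9 loadings + 9 residual variances (factor variance fixed at 1)

# Degrees of freedom: unique elements of a 9 x 9 covariance matrix
# minus the number of free parameters
n_moments <- 9 * (9 + 1) / 2
n_moments - n_free

# Probability of a chi-square this large or larger under the model
p_value <- pchisq(chisq, df, lower.tail = FALSE)

# Information criteria (assumed formulas, see the lead-in)
aic <- chisq + 2 * n_free
bic <- chisq - df * log(n_obs)

round(c(p = p_value, AIC = aic, BIC = bic), 4)
```

A large p value only says that 100 observations are too few to reject the model; the block structure of the residuals is what reveals the misfit.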

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)

> mod.f1 <- structure.sem(f1)

> sem.f1 <- sem(mod.f1, bifact, 100)

> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, factanal could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact, 3, rotate="varimax")

> mod.f3 <- structure.sem(f3)

> sem.f3 <- sem(mod.f3, bifact, 100)

> summary(sem.f3, digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3), 2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
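The pattern of residuals follows directly from the algebra: an orthogonal model implies zero correlation between variables that load on different factors, so every between-group correlation is left unexplained. The base R sketch below illustrates this. It assumes that the bifact matrix was generated from the loadings shown in the parameter estimates and from factor intercorrelations of 0.72, 0.63 and 0.56; these generating values are an assumption of the sketch, not taken from the sem output.

```r
# Loadings of the nine variables on three group factors
fx <- matrix(0, nrow = 9, ncol = 3,
             dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
fx[1:3, 1] <- c(0.8, 0.7, 0.6)
fx[4:6, 2] <- c(0.7, 0.6, 0.5)
fx[7:9, 3] <- c(0.7, 0.6, 0.4)

# Assumed factor intercorrelations of the generating (oblique) model
Phi <- matrix(c(1.00, 0.72, 0.63,
                0.72, 1.00, 0.56,
                0.63, 0.56, 1.00), nrow = 3, byrow = TRUE)

# Correlations implied by the oblique model and by the orthogonal model
R_oblique <- fx %*% Phi %*% t(fx)
diag(R_oblique) <- 1
R_orthogonal <- fx %*% t(fx)
diag(R_orthogonal) <- 1

# Residuals left over when the orthogonal model is fit to oblique data
resid_orth <- round(R_oblique - R_orthogonal, 2)
resid_orth
```

Within each block of three variables the residuals are zero; between blocks each residual is the product of the two loadings and the omitted factor correlation.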

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact, 3)            #extract three factors and do an oblique rotation

> mod.f3 <- structure.sem(f3)    #create the sem model

> mod.f3                         #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone, used in the SAS manual for PROC CALIS, and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem, Thurstone, 213)  #use the model created by omega

> summary(sem.bi, digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07
e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi, digits=2)

                     Std. Estimate
1  Sentences             0.7678671   Sentences <--- g
2  Vocabulary            0.7909246   Vocabulary <--- g
3  SentCompletion        0.7536211   SentCompletion <--- g
4  FirstLetters          0.6083814   FirstLetters <--- g
5  FourLetterWords       0.5973349   FourLetterWords <--- g
6  Suffixes              0.5717900   Suffixes <--- g
7  LetterSeries          0.5668950   LetterSeries <--- g
8  Pedigrees             0.6623317   Pedigrees <--- g
9  LetterGroup           0.5299523   LetterGroup <--- g
10 F1Sentences           0.4878697   Sentences <--- F1
11 F1Vocabulary          0.4523233   Vocabulary <--- F1
12 F1SentCompletion      0.4044507   SentCompletion <--- F1
13 F2FirstLetters        0.6140531   FirstLetters <--- F2
14 F2FourLetterWords     0.5058063   FourLetterWords <--- F2
15 F2Suffixes            0.3943206   Suffixes <--- F2
16 F3LetterSeries        0.7272957   LetterSeries <--- F3
17 F3Pedigrees           0.2468418   Pedigrees <--- F3
18 F3LetterGroup         0.4091494   LetterGroup <--- F3
19 e1                    0.1723633   Sentences <--> Sentences
20 e2                    0.1698418   Vocabulary <--> Vocabulary
21 e3                    0.2684748   SentCompletion <--> SentCompletion
22 e4                    0.2528108   FirstLetters <--> FirstLetters
23 e5                    0.3873510   FourLetterWords <--> FourLetterWords
24 e6                    0.5175675   Suffixes <--> Suffixes
25 e7                    0.1496710   LetterSeries <--> LetterSeries
26 e8                    0.5003859   Pedigrees <--> Pedigrees
27 e9                    0.5517473   LetterGroup <--> LetterGroup
28                       1.0000000   F1 <--> F1
29                       1.0000000   F2 <--> F2
30                       1.0000000   F3 <--> F3
31                       1.0000000   g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem, Thurstone, 213)

> summary(sem.hi, digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi, digits=2)

                     Std. Estimate
1  gF1                   0.8220754   F1 <--- g
2  gF2                   0.7817998   F2 <--- g
3  gF3                   0.8150140   F3 <--- g
4  F1Sentences           0.9047111   Sentences <--- F1
5  F1Vocabulary          0.9138214   Vocabulary <--- F1
6  F1SentCompletion      0.8560764   SentCompletion <--- F1
7  F2FirstLetters        0.8357617   FirstLetters <--- F2
8  F2FourLetterWords     0.7971819   FourLetterWords <--- F2
9  F2Suffixes            0.7025560   Suffixes <--- F2
10 F3LetterSeries        0.7808129   LetterSeries <--- F3
11 F3Pedigrees           0.7201599   Pedigrees <--- F3
12 F3LetterGroup         0.7034902   LetterGroup <--- F3
13 e1                    0.1814979   Sentences <--> Sentences
14 e2                    0.1649304   Vocabulary <--> Vocabulary
15 e3                    0.2671331   SentCompletion <--> SentCompletion
16 e4                    0.3015024   FirstLetters <--> FirstLetters
17 e5                    0.3645010   FourLetterWords <--> FourLetterWords
18 e6                    0.5064151   Suffixes <--> Suffixes
19 e7                    0.3903313   LetterSeries <--> LetterSeries
20 e8                    0.4813697   Pedigrees <--> Pedigrees
21 e9                    0.5051016   LetterGroup <--> LetterGroup
22                       0.3241920   F1 <--> F1
23                       0.3887891   F2 <--> F2
24                       0.3357521   F3 <--> F3
25                       1.0000000   g <--> g
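The relation between the unstandardized estimates from summary and the standardized estimates from stdCoef can be traced by hand for one branch of the hierarchy. The base R sketch below is illustrative only. It assumes that g has unit variance and that the disturbance of F1 was fixed at 1 in the unstandardized model, an assumption that reproduces the printed standardized values; the variable names are hypothetical.

```r
# Unstandardized estimates copied from the summary of the hierarchical model
g_F1         <- 1.4438115   # F1 <--- g
F1_Sentences <- 0.5151232   # Sentences <--- F1

# Implied variance of F1: squared loading on g (variance 1)
# plus a disturbance variance assumed to be fixed at 1
var_F1 <- g_F1^2 + 1

# Standardizing rescales F1 to unit variance
std_g_F1         <- g_F1 / sqrt(var_F1)          # F1 <--- g
std_F1_Sentences <- F1_Sentences * sqrt(var_F1)  # Sentences <--- F1 (Sentences has unit variance)
std_resid_F1     <- 1 / var_F1                   # F1 <--> F1
r2_F1            <- std_g_F1^2                   # R-square for F1

round(c(std_g_F1, std_F1_Sentences, std_resid_F1, r2_F1), 4)
```

The R-square for F1 and the standardized residual variance of F1 sum to one, as they must.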

> anova(sem.hi, sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
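The comparison of the two nested models is a chi-square difference test, which can be reproduced from the two model chi-squares alone. The following base R sketch uses a hypothetical helper, lr_test, that is not part of psych or sem; the numbers are the Thurstone values reported for the hierarchical and bifactor fits.

```r
# Chi-square difference (likelihood ratio) test for two nested models.
# The restricted model has fewer free parameters and more degrees of freedom.
lr_test <- function(chisq_restricted, df_restricted, chisq_general, df_general) {
  lr_chisq <- chisq_restricted - chisq_general
  lr_df    <- df_restricted - df_general
  c(chisq = lr_chisq,
    df    = lr_df,
    p     = pchisq(lr_chisq, lr_df, lower.tail = FALSE))
}

# Hierarchical (restricted) versus bifactor (general) model for the Thurstone data
thurstone <- lr_test(chisq_restricted = 38.1963, df_restricted = 24,
                     chisq_general    = 24.2163, df_general    = 18)
round(thurstone, 4)
```

The difference in degrees of freedom, 6, is the number of additional parameters estimated by the bifactor model.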

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger, 4)

> sem.holz.bi <- sem(om.holz.bi$model$sem, Holzinger, 355)

> om.holz.hi <- omega(Holzinger, 4, sl=FALSE)

> sem.holz.hi <- sem(om.holz.hi$model$sem, Holzinger, 355)

> anova(sem.holz.bi, sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first runs an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables.

> om.sem <- omegaSem(Thurstone, n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
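The difference between the two values of ωh can be illustrated directly from the loadings printed by omegaSem. The base R sketch below is an approximation under stated assumptions, not the computation done inside psych: it uses only the rounded loadings that are printed (those greater than 0.2), and it uses the total variance implied by the model rather than the sum of the observed correlations. The helper omega_from_loadings is hypothetical.

```r
# Model-implied omega hierarchical and omega total from a bifactor pattern.
# g:      loadings on the general factor
# groups: list with the salient loadings on each group factor
# u2:     uniquenesses
omega_from_loadings <- function(g, groups, u2) {
  general_var <- sum(g)^2
  group_var   <- sum(vapply(groups, function(l) sum(l)^2, numeric(1)))
  total_var   <- general_var + group_var + sum(u2)
  c(omega_h = general_var / total_var,
    omega_t = (general_var + group_var) / total_var)
}

# Exploratory (Schmid Leiman) solution; Pedigrees cross loads on F1
efa <- omega_from_loadings(
  g      = c(0.71, 0.73, 0.68, 0.65, 0.62, 0.56, 0.59, 0.58, 0.54),
  groups = list(F1 = c(0.56, 0.55, 0.52, 0.24),
                F2 = c(0.56, 0.49, 0.41),
                F3 = c(0.62, 0.34, 0.46)),
  u2     = c(0.18, 0.16, 0.26, 0.27, 0.37, 0.50, 0.27, 0.49, 0.48))

# Confirmatory solution; no cross loadings
cfa <- omega_from_loadings(
  g      = c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53),
  groups = list(F1 = c(0.49, 0.45, 0.40),
                F2 = c(0.61, 0.50, 0.39),
                F3 = c(0.73, 0.25, 0.41)),
  u2     = c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55))

round(rbind(efa = efa, cfa = cfa), 2)
```

Because of rounding and the suppressed small loadings, the exploratory value is only close to, not equal to, the reported ωh; the ordering of the two solutions is what the sketch is meant to show.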

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


> plot.new()

> sg3 <- structure.diagram(fxs, phi, fyx)

Figure 7: A symbolic structural model. Three independent latent variables are regressed on a latent Y.

> fxh <- structure.list(9, list(X1=c(1:3), X2=c(4:6), X3=c(7:9), g=NULL))

> fy <- structure.list(3, list(Y=c(1,2,3)))

> Phi <- diag(1,5,5)

> Phi[4,c(1:3)] <- letters[1:3]

> Phi[5,4] <- "r"

> plot.new()

> hi.mod <- structure.diagram(fxh, Phi, fy)

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and, if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)

Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)

Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 MR1
Correlation of (regression) scores with factors 0.94
Multiple R square of scores with factors        0.88
Minimum correlation of possible factor scores   0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
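The contrast between components and factors can be reproduced without psych. The base R sketch below assumes that cong1$model is the correlation matrix implied by one factor with loadings 0.9, 0.8, 0.7 and 0.6, which is what the factor output suggests. It uses eigen for the first principal component and stats::factanal (maximum likelihood) as a stand-in for fa; the n.obs value passed to factanal is arbitrary.

```r
# Congeneric model: one factor, loadings 0.9, 0.8, 0.7, 0.6 (assumed)
lambda <- c(V1 = 0.9, V2 = 0.8, V3 = 0.7, V4 = 0.6)
R <- lambda %o% lambda   # common part of the correlations
diag(R) <- 1             # unit variances

# First principal component: eigenvector scaled by the root of its eigenvalue
e   <- eigen(R, symmetric = TRUE)
pc1 <- abs(e$vectors[, 1]) * sqrt(e$values[1])

# One factor maximum likelihood solution (stand-in for psych::fa)
f1  <- factanal(covmat = R, factors = 1, n.obs = 100)
fa1 <- abs(as.vector(f1$loadings[, 1]))

round(cbind(model = lambda, component = pc1, factor = fa1), 2)
```

The component loadings are inflated because components account for all of the variance of each variable, including the unique variance, whereas the factor model separates the two.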

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ2 value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact, 3)

> pa3 <- fa(bifact, 3, fm="pa")

> ml3 <- fa(bifact, 3, fm="ml")

> pc3

Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3

Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                PA1  PA3  PA2
Correlation of (regression) scores with factors 0.9 0.85 0.83
Multiple R square of scores with factors        0.8 0.72 0.69
Minimum correlation of possible factor scores   0.6 0.45 0.38

> ml3

Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 ML1  ML3  ML2
Correlation of (regression) scores with factors 0.90 0.85 0.83
Multiple R square of scores with factors        0.80 0.72 0.69
Minimum correlation of possible factor scores   0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))

     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
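The entries of a factor congruence table of this kind are usually Tucker's coefficient, the cosine of the angle between two columns of loadings. The following is a minimal base R sketch of that coefficient, not the psych implementation: the helper name `congruence` and the two idealized loading vectors are assumptions made for illustration.

```r
# Tucker's congruence coefficient between two loading vectors
# (hypothetical helper, written only for this illustration).
congruence <- function(x, y) {
  sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

# Idealized simple-structure loadings for two factors over nine variables
# (assumed values, patterned on the loadings printed above).
f_a <- c(0.8, 0.7, 0.6, 0, 0, 0, 0, 0, 0)
f_b <- c(0, 0, 0, 0.7, 0.6, 0.5, 0, 0, 0)

same  <- congruence(f_a, f_a)   # a factor compared with itself
cross <- congruence(f_a, f_b)   # factors with no variables in common
print(c(same = same, cross = cross))
```

Because the coefficient depends only on the pattern of loadings, two factors that load on disjoint sets of variables have a congruence of zero even when the factors themselves are correlated.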

By default, principal uses the varimax rotation criterion, whereas fa applies the oblique oblimin transformation, which is why the two fa solutions above report factor correlations. It may also be useful to apply another oblique transformation, such as Promax, to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p

Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957), with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measurement of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.
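The link between a hierarchical structure and a bifactor solution can be illustrated with a hand-computed Schmid-Leiman transformation. The sketch below uses base R only and is not the omega algorithm itself. The group factor loadings are the ones printed for the bifact example, while the higher order loadings of 0.9, 0.8 and 0.7 are an assumption, chosen because their products reproduce the factor correlations of 0.72, 0.63 and 0.56 reported earlier.

```r
# First order (group factor) loadings of the nine variable example.
fload <- matrix(0, nrow = 9, ncol = 3,
                dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
fload[1:3, "F1"] <- c(0.8, 0.7, 0.6)
fload[4:6, "F2"] <- c(0.7, 0.6, 0.5)
fload[7:9, "F3"] <- c(0.7, 0.6, 0.4)

# Assumed loadings of the three group factors on the general factor g.
gload <- c(F1 = 0.9, F2 = 0.8, F3 = 0.7)

# Schmid-Leiman: general loadings are the products of the two levels;
# residualized group loadings are scaled by sqrt(1 - gload^2).
g_loading <- as.vector(fload %*% gload)
group     <- fload %*% diag(sqrt(1 - gload^2))
colnames(group) <- colnames(fload)

sl <- cbind(g = g_loading, group)
print(round(sl, 2))
```

Under these assumptions the first variable has a general loading of 0.8 x 0.9 = 0.72 and a residual group loading of about 0.35.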

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> om.bi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> om.bi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem. (Diagram titled "Omega": a general factor g and group factors F1, F2 and F3 with paths to V1 through V9.)

> om.hi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem. (Diagram titled "Omega": a general factor g with paths to F1, F2 and F3, which in turn have paths to V1 through V9.)

Omega Total            0.82

Schmid Leiman Factor loadings greater than 0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max 4.67 max/min = 1.82
mean percent general = 0.65 with sd = 0.14 and cv of 0.21
Explained Common Variance of the general factor = 0.67

The degrees of freedom are 12 and the fit is 0

The root mean square of the residuals is 0
The df corrected root mean square of the residuals is 0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 0.23

The root mean square of the residuals is 0.07
The df corrected root mean square of the residuals is 0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
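The Omega Hierarchical and Omega Total values reported above can be checked by hand from a population model. The following base R sketch assumes the same structure as the earlier printed loadings (group factor loadings as shown, and higher order loadings of 0.9, 0.8 and 0.7, which is an assumption inferred from the factor correlations). It is an illustration of the definitions, not the code used by omega.

```r
# Group factor loadings and assumed higher order loadings.
fload <- matrix(0, nrow = 9, ncol = 3,
                dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
fload[1:3, "F1"] <- c(0.8, 0.7, 0.6)
fload[4:6, "F2"] <- c(0.7, 0.6, 0.5)
fload[7:9, "F3"] <- c(0.7, 0.6, 0.4)
gload <- c(0.9, 0.8, 0.7)

# Factor correlations implied by the higher order factor.
Phi <- gload %*% t(gload)
diag(Phi) <- 1

# Model implied correlation matrix: common part plus unit diagonal.
common <- fload %*% Phi %*% t(fload)
h2 <- diag(common)            # communalities
R <- common
diag(R) <- 1

# General factor loadings (Schmid-Leiman) and the two omega coefficients.
g <- as.vector(fload %*% gload)
total_var <- sum(R)                       # variance of the unit weighted sum
omega_h <- sum(g)^2 / total_var           # general factor variance share
omega_t <- 1 - sum(1 - h2) / total_var    # all common variance share

print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

With these values the sum of the general loadings is 4.52 and the total variance of the composite is about 29.31, which gives roughly 0.70 for Omega Hierarchical and 0.82 for Omega Total.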

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST. (Plot titled "Hierarchical cluster analysis of bifact data": variables V1 through V9 joined into clusters C1 through C8, each labeled with its alpha.)

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
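A population correlation matrix of this form follows from the loadings and the latent correlations as R = F Phi F', with the diagonal set to 1. The sketch below rebuilds the matrix in base R. The loadings and latent correlations used here are assumptions inferred from the printed reliabilities and correlations (they are not copied from the fx, Phi and fy objects), and the latent variable names are hypothetical labels.

```r
vars <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")
lat  <- c("ability", "motivation", "outcome")   # hypothetical labels

# Assumed loadings: A loads on both ability and motivation.
Fmat <- matrix(0, nrow = 8, ncol = 3, dimnames = list(vars, lat))
Fmat[c("V", "Q", "A"), "ability"]         <- c(0.9, 0.8, 0.6)
Fmat[c("A", "nach", "Anx"), "motivation"] <- c(0.6, 0.8, -0.7)
Fmat[c("gpa", "Pre", "MA"), "outcome"]    <- c(0.6, 0.5, 0.4)

# Assumed latent correlations: the predictors are uncorrelated with each
# other and each correlates 0.7 with the outcome factor.
Phi <- matrix(c(1,   0,   0.7,
                0,   1,   0.7,
                0.7, 0.7, 1), nrow = 3, byrow = TRUE,
              dimnames = list(lat, lat))

R <- Fmat %*% Phi %*% t(Fmat)
reliability <- diag(R)          # common variance of each variable
names(reliability) <- vars
diag(R) <- 1

print(round(R, 2))
print(round(reliability, 2))
```

Under these assumptions the correlation of V with Q is 0.9 x 0.8 = 0.72 and the reliability of A is 0.36 + 0.36 = 0.72, in agreement with the values printed above.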

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method = minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are 56 and the empirical chi square function was 6930.29
The degrees of freedom for the model are 7 and the empirical chi square function was 21.83
 with prob < 0.0027

The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.04
 with the empirical chi square 21.83 with prob < 0.0027
The total number of observations was 1000 with fitted Chi Square = 2175.06 with prob < 0

Empirical BIC = -26.53
ESABIC = -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model. (Diagram titled "Exploratory Structural Model": factors X1 and X2 with paths to V, Q, A, nach and Anx, and paths to Y1, which has paths to gpa, Pre and MA.)

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.
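The two models differ only in how many loadings are free, which fixes the degrees of freedom of each test. A minimal sketch of the counting rule in base R follows. The helper name `model_df` is hypothetical, and the rule assumes a correlation or covariance matrix of p variables supplies p(p+1)/2 distinct elements.

```r
# Degrees of freedom = distinct variances and covariances minus the
# number of freely estimated parameters (hypothetical helper).
model_df <- function(n_var, n_free) {
  n_var * (n_var + 1) / 2 - n_free
}

# Tau equivalent: one common loading plus four error variances.
df_tau  <- model_df(4, 1 + 4)
# Congeneric: four separate loadings plus four error variances.
df_cong <- model_df(4, 4 + 4)
# One factor for nine variables: nine loadings plus nine error variances.
df_one  <- model_df(9, 9 + 9)

print(c(tau = df_tau, congeneric = df_cong, one_factor = df_one))
```

The factor variance is fixed at 1 in these models, so it does not count as a free parameter.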

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau #show it

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "a"       NA
[3,] "X1->V3"  "a"       NA
[4,] "X1->V4"  "a"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare = 6.593496 Df = 5 Pr(>Chisq) = 0.2526696
 AIC = 16.5935
 BIC = -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations = 11

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong #show the model

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "b"       NA
[3,] "X1->V3"  "c"       NA
[4,] "X1->V4"  "d"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare = 2.941678 Df = 2 Pr(>Chisq) = 0.2297327
 AIC = 18.94168
 BIC = -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations = 12

> anova(sem.cong,sem.tau) #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

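The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65) is not significant given the change in degrees of freedom (3, p = 0.30). With only 100 subjects, the (false) hypothesis of tau equivalence cannot be rejected.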
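The likelihood ratio test printed by anova can be reproduced from the two model χ2 values alone, because the difference between nested models is itself distributed as χ2 with degrees of freedom equal to the difference in model degrees of freedom. A small base R check, using the values from the two summaries above (the variable names are illustrative):

```r
# Chi square and degrees of freedom copied from the two sem summaries.
chisq_tau  <- 6.593496; df_tau  <- 5   # tau equivalent (more restricted)
chisq_cong <- 2.941678; df_cong <- 2   # congeneric (less restricted)

# Difference test for nested models.
lr_chisq <- chisq_tau - chisq_cong
lr_df    <- df_tau - df_cong
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

print(round(c(LR = lr_chisq, df = lr_df, p = p_value), 4))
```

The probability is about 0.30, so the equality constraint on the loadings does not significantly worsen the fit in a sample of this size.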

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although this model is not correct, with 100 subjects it cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))

> mod.one #show the model

Path Parameter Value

[1] X1-gtV1 a NA

[2] X1-gtV2 b NA

[3] X1-gtV3 c NA

[4] X1-gtV4 d NA

[5] X1-gtV5 e NA

[6] X1-gtV6 f NA

[7] X1-gtV7 g NA

[8] X1-gtV8 h NA

[9] X1-gtV9 i NA

[10] V1lt-gtV1 x1e NA

[11] V2lt-gtV2 x2e NA

[12] V3lt-gtV3 x3e NA

[13] V4lt-gtV4 x4e NA

[14] V5lt-gtV5 x5e NA

[15] V6lt-gtV6 x6e NA

[16] V7lt-gtV7 x7e NA

[17] V8lt-gtV8 x8e NA

[18] V9lt-gtV9 x9e NA

[19] X1lt-gtX1 NA 1

attr(class)

[1] mod

> sem.one <- sem(mod.one,bifact,100)

> summary(sem.one,digits=2)

 Model Chisquare = 21.16848 Df = 27 Pr(>Chisq) = 0.778334

 AIC = 57.16848

 BIC = -103.1711

Normalized Residuals

Min 1st Qu Median Mean 3rd Qu Max


-03336510 -02923555 -01940195 00369476 00000019 18875159

R-square for Endogenous Variables

V1 V2 V3 V4 V5 V6 V7 V8 V9

05636 04524 03377 03292 02522 01798 02568 01980 00932

Parameter Estimates

Estimate Std Error z value Pr(gt|z|)

a 07507129 009512871 7891549 2984584e-15 V1 lt--- X1

b 06726412 009807150 6858682 6949882e-12 V2 lt--- X1

c 05811209 010137932 5732145 9916850e-09 V3 lt--- X1

d 05737425 010163173 5645309 1648847e-08 V4 lt--- X1

e 05021915 010392785 4832116 1350893e-06 V5 lt--- X1

f 04239908 010609489 3996335 6433059e-05 V6 lt--- X1

g 05067957 010378883 4882950 1045102e-06 V7 lt--- X1

h 04450171 010554867 4216227 2484236e-05 V8 lt--- X1

i 03052415 010867168 2808841 4972015e-03 V9 lt--- X1

x1e 04364302 008884720 4912143 9008624e-07 V1 lt--gt V1

x2e 05475539 009653118 5672300 1408927e-08 V2 lt--gt V2

x3e 06622982 010678972 6201891 5578883e-10 V3 lt--gt V3

x4e 06708197 010761127 6233731 4554546e-10 V4 lt--gt V4

x5e 07478036 011527456 6487153 8747374e-11 V5 lt--gt V5

x6e 08202314 012278021 6680485 2381530e-11 V6 lt--gt V6

x7e 07431581 011480162 6473411 9581477e-11 V7 lt--gt V7

x8e 08019593 012086557 6635134 3242077e-11 V8 lt--gt V8

x9e 09068284 013200380 6869714 6433079e-12 V9 lt--gt V9

Iterations = 11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare = 21.16848 Df = 27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations = 11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, factanal could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare = 53.86635 Df = 27 Pr(>Chisq) = 0.001579738
 AIC = 89.86635
 BIC = -70.47325

Normalized Residuals

Min 1st Qu Median Mean 3rd Qu Max

-0000003 0000000 1950175 1642171 2632737 4011789

R-square for Endogenous Variables

V1 V2 V3 V4 V5 V6 V7 V8 V9

064 049 036 049 036 025 049 036 016

Parameter Estimates

Estimate Std Error z value Pr(gt|z|)

F1V1 08000000 01114517 7177994 7074151e-13 V1 lt--- MR1

F1V2 07000001 01089845 6422931 1336754e-10 V2 lt--- MR1

F1V3 06000000 01068002 5617968 1932167e-08 V3 lt--- MR1

F2V4 06999999 01427544 4903527 9413091e-07 V4 lt--- MR3

F2V5 06000001 01328610 4515998 6301927e-06 V5 lt--- MR3

F2V6 04999995 01238740 4036354 5428827e-05 V6 lt--- MR3

F3V7 07000001 01680059 4166522 3092827e-05 V7 lt--- MR2

F3V8 06000000 01530271 3920873 8822871e-05 V8 lt--- MR2

F3V9 04000005 01265677 3160368 1575701e-03 V9 lt--- MR2

x1e 03600000 01297434 2774707 5525146e-03 V1 lt--gt V1

x2e 05099999 01165643 4375268 1212834e-05 V2 lt--gt V2

x3e 06399999 01130156 5662936 1488043e-08 V3 lt--gt V3

x4e 05100000 01739239 2932316 3364440e-03 V4 lt--gt V4

x5e 06399998 01475345 4337967 1438068e-05 V5 lt--gt V5

x6e 07500007 01336788 5610466 2017821e-08 V6 lt--gt V6

x7e 05100000 02136118 2387509 1696298e-02 V7 lt--gt V7

x8e 06400001 01734024 3690837 2235172e-04 V8 lt--gt V8

x9e 08400000 01362332 6165898 7008420e-10 V9 lt--gt V9

Iterations = 24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
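The pattern in this residual matrix follows directly from forcing the factors to be orthogonal: the within-factor correlations are reproduced exactly, while every between-factor correlation is left unexplained. A base R sketch of the population case is given below. The loadings are those printed for the three factor solution, and the factor correlations of 0.72, 0.63 and 0.56 are taken from the oblique solutions shown earlier; treating them as the generating values is an assumption.

```r
# Simple structure loadings of the nine variable example.
fload <- matrix(0, nrow = 9, ncol = 3,
                dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
fload[1:3, "F1"] <- c(0.8, 0.7, 0.6)
fload[4:6, "F2"] <- c(0.7, 0.6, 0.5)
fload[7:9, "F3"] <- c(0.7, 0.6, 0.4)

# Factor correlations assumed to have generated the data.
Phi <- matrix(c(1,    0.72, 0.63,
                0.72, 1,    0.56,
                0.63, 0.56, 1), nrow = 3, byrow = TRUE)

# Correlations implied by the correlated and by the orthogonal model.
R_oblique    <- fload %*% Phi %*% t(fload)
R_orthogonal <- fload %*% t(fload)
diag(R_oblique) <- 1
diag(R_orthogonal) <- 1

# What an orthogonal three factor model cannot account for.
resid_orth <- R_oblique - R_orthogonal
print(round(resid_orth, 2))
```

For example, the residual for V1 and V4 is 0.8 x 0.72 x 0.7, or about 0.40, which is the whole of their correlation.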

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3) #extract three factors and do an oblique rotation

> mod.f3 <- structure.sem(f3) #create the sem model

> mod.f3 #show it

Path Parameter Value

[1] MR1-gtV1 F1V1 NA

[2] MR1-gtV2 F1V2 NA

[3] MR1-gtV3 F1V3 NA

[4] MR3-gtV4 F2V4 NA

[5] MR3-gtV5 F2V5 NA

[6] MR3-gtV6 F2V6 NA

[7] MR2-gtV7 F3V7 NA

[8] MR2-gtV8 F3V8 NA

[9] MR2-gtV9 F3V9 NA

[10] V1lt-gtV1 x1e NA

[11] V2lt-gtV2 x2e NA

[12] V3lt-gtV3 x3e NA

[13] V4lt-gtV4 x4e NA

[14] V5lt-gtV5 x5e NA

[15] V6lt-gtV6 x6e NA

[16] V7lt-gtV7 x7e NA

[17] V8lt-gtV8 x8e NA

[18] V9lt-gtV9 x9e NA

[19] MR3lt-gtMR1 rF2F1 NA

[20] MR2lt-gtMR1 rF3F1 NA

[21] MR2lt-gtMR3 rF3F2 NA

[22] MR1lt-gtMR1 NA 1

[23] MR3lt-gtMR3 NA 1

[24] MR2lt-gtMR2 NA 1

attr(class)

[1] mod

The structure being tested may be seen using structure.graph (Figure 13).

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor oblique solution. (Diagram titled "Structural model": correlated factors MR1, MR3 and MR2 with paths to V1 through V9.)

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone, used in the SAS manual for PROC CALIS, and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213) #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare = 24.2163 Df = 18 Pr(>Chisq) = 0.1480685
 AIC = 78.2163
 BIC = -72.28696

Normalized Residuals

Min 1st Qu Median Mean 3rd Qu Max

-08211748 -03341044 -00000009 00281693 01561983 17968088

R-square for Endogenous Variables

Sentences Vocabulary SentCompletion FirstLetters FourLetterWords

08276 08302 07315 07472 06126

Suffixes LetterSeries Pedigrees LetterGroup

04824 08503 04996 04483

Parameter Estimates

Estimate Std Error z value Pr(gt|z|)

Sentences 07678671 007059396 108772353 1479833e-27

Vocabulary 07909248 006969232 113488087 7518003e-30

SentCompletion 07536211 007113218 105946585 3154903e-26

FirstLetters 06083814 007063841 86126138 7141338e-18

FourLetterWords 05973349 007092937 84215455 3715499e-17

Suffixes 05717903 007157752 79884057 1366950e-15

LetterSeries 05668949 007249339 78199523 5284337e-15

Pedigrees 06623314 007003035 94577757 3145633e-21

LetterGroup 05299524 007332494 72274501 4921470e-13

F1Sentences 04878698 008141095 59926801 2064107e-09

F1Vocabulary 04523234 008353995 54144562 6147524e-08

F1SentCompletion 04044507 008727334 46342988 3581494e-06

F2FirstLetters 06140531 008471145 72487623 4205973e-13

F2FourLetterWords 05058063 008145488 62096500 5310276e-10

F2Suffixes 03943208 007805383 50519075 4374195e-07

F3LetterSeries 07272955 015844866 45901015 4430304e-06

F3Pedigrees 02468417 008677536 28446053 4446649e-03

F3LetterGroup 04091495 011352380 36040854 3132541e-04

e1 01723633 003405646 50611045 4168346e-07

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups. (Diagram titled "Omega".)

e2 01698419 003001233 56590697 1521958e-08

e3 02684749 003316228 80957909 5689350e-16

e4 02528108 007942835 31828791 1458185e-03

e5 03873510 006317399 61314949 8705712e-10

e6 05175679 005955079 86912013 3586269e-18

e7 01496709 021861502 06846325 4935759e-01

e8 05003855 005956551 84005902 4442400e-17

e9 05517474 008455914 65249884 6800680e-11

Sentences Sentences lt--- g

Vocabulary Vocabulary lt--- g

SentCompletion SentCompletion lt--- g

FirstLetters FirstLetters lt--- g

FourLetterWords FourLetterWords lt--- g

Suffixes Suffixes lt--- g

LetterSeries LetterSeries lt--- g

Pedigrees Pedigrees lt--- g

LetterGroup LetterGroup lt--- g

F1Sentences Sentences lt--- F1

F1Vocabulary Vocabulary lt--- F1

F1SentCompletion SentCompletion lt--- F1

F2FirstLetters FirstLetters lt--- F2

F2FourLetterWords FourLetterWords lt--- F2

F2Suffixes Suffixes lt--- F2

F3LetterSeries LetterSeries lt--- F3

F3Pedigrees Pedigrees lt--- F3

F3LetterGroup LetterGroup lt--- F3

e1 Sentences lt--gt Sentences

e2 Vocabulary lt--gt Vocabulary

e3 SentCompletion lt--gt SentCompletion

e4 FirstLetters lt--gt FirstLetters

e5 FourLetterWords lt--gt FourLetterWords

e6 Suffixes lt--gt Suffixes

e7 LetterSeries lt--gt LetterSeries

e8 Pedigrees lt--gt Pedigrees

e9 LetterGroup lt--gt LetterGroup

Iterations = 72

> stdCoef(sem.bi,digits=2)

Std Estimate

1 Sentences 07678671 Sentences lt--- g

2 Vocabulary 07909246 Vocabulary lt--- g

3 SentCompletion 07536211 SentCompletion lt--- g

4 FirstLetters 06083814 FirstLetters lt--- g

5 FourLetterWords 05973349 FourLetterWords lt--- g

6 Suffixes 05717900 Suffixes lt--- g

7 LetterSeries 05668950 LetterSeries lt--- g

8 Pedigrees 06623317 Pedigrees lt--- g

9 LetterGroup 05299523 LetterGroup lt--- g

10 F1Sentences 04878697 Sentences lt--- F1

11 F1Vocabulary 04523233 Vocabulary lt--- F1

12 F1SentCompletion 04044507 SentCompletion lt--- F1

13 F2FirstLetters 06140531 FirstLetters lt--- F2

14 F2FourLetterWords 05058063 FourLetterWords lt--- F2

15 F2Suffixes 03943206 Suffixes lt--- F2

16 F3LetterSeries 07272957 LetterSeries lt--- F3


17 F3Pedigrees 02468418 Pedigrees lt--- F3

18 F3LetterGroup 04091494 LetterGroup lt--- F3

19 e1 01723633 Sentences lt--gt Sentences

20 e2 01698418 Vocabulary lt--gt Vocabulary

21 e3 02684748 SentCompletion lt--gt SentCompletion

22 e4 02528108 FirstLetters lt--gt FirstLetters

23 e5 03873510 FourLetterWords lt--gt FourLetterWords

24 e6 05175675 Suffixes lt--gt Suffixes

25 e7 01496710 LetterSeries lt--gt LetterSeries

26 e8 05003859 Pedigrees lt--gt Pedigrees

27 e9 05517473 LetterGroup lt--gt LetterGroup

28 10000000 F1 lt--gt F1

29 10000000 F2 lt--gt F2

30 10000000 F3 lt--gt F3

31 10000000 g lt--gt g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare = 38.1963 Df = 24 Pr(>Chisq) = 0.03310059
 AIC = 80.1963
 BIC = -90.47471

Normalized Residuals

Min 1st Qu Median Mean 3rd Qu Max

-09724643 -04164673 -00000001 00401007 00938588 16274666

R-square for Endogenous Variables

F1 F2 F3 Sentences Vocabulary

06758 06112 06642 08185 08351

SentCompletion FirstLetters FourLetterWords Suffixes LetterSeries

07329 06985 06355 04936 06097

Pedigrees LetterGroup

05186 04949

Parameter Estimates

Estimate Std Error z value Pr(gt|z|)

gF1 14438115 025653564 5628113 1821922e-08

gF2 12538296 021136562 5932041 2991910e-09

gF3 14065517 026890804 5230605 1689563e-07

F1Sentences 05151232 006292248 8186632 2686376e-16

F1Vocabulary 05203104 006338431 8208820 2233734e-16

F1SentCompletion 04874316 006081528 8014954 1101786e-15

F2FirstLetters 05211221 006106205 8534304 1410015e-17

F2FourLetterWords 04970664 005902388 8421446 3718664e-17

F2Suffixes 04380644 005595794 7828458 4938915e-15

F3LetterSeries 04524352 006596903 6858297 6968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package. (Diagram titled "Omega".)

F3Pedigrees 04172903 006215816 6713363 1901887e-11

F3LetterGroup 04076312 006131399 6648258 2965820e-11

e1 01814979 002847741 6373397 1848862e-10

e2 01649304 002776938 5939292 2862558e-09

e3 02671331 003336340 8006771 1177597e-15

e4 03015024 005102191 5909274 3436179e-09

e5 03645010 005263547 6925008 4359513e-12

e6 05064150 005962608 8493180 2010593e-17

e7 03903313 005933649 6578268 4759607e-11

e8 04813697 006224844 7733041 1050075e-14

e9 05051017 006332869 7975875 1513055e-15

gF1 F1 lt--- g

gF2 F2 lt--- g

gF3 F3 lt--- g

F1Sentences Sentences lt--- F1

F1Vocabulary Vocabulary lt--- F1

F1SentCompletion SentCompletion lt--- F1

F2FirstLetters FirstLetters lt--- F2

F2FourLetterWords FourLetterWords lt--- F2

F2Suffixes Suffixes lt--- F2

F3LetterSeries LetterSeries lt--- F3

F3Pedigrees Pedigrees lt--- F3

F3LetterGroup LetterGroup lt--- F3

e1 Sentences lt--gt Sentences

e2 Vocabulary lt--gt Vocabulary

e3 SentCompletion lt--gt SentCompletion

e4 FirstLetters lt--gt FirstLetters

e5 FourLetterWords lt--gt FourLetterWords

e6 Suffixes lt--gt Suffixes

e7 LetterSeries lt--gt LetterSeries

e8 Pedigrees lt--gt Pedigrees

e9 LetterGroup lt--gt LetterGroup

Iterations = 54

> stdCoef(sem.hi,digits=2)

Std Estimate

1 gF1 08220754 F1 lt--- g

2 gF2 07817998 F2 lt--- g

3 gF3 08150140 F3 lt--- g

4 F1Sentences 09047111 Sentences lt--- F1

5 F1Vocabulary 09138214 Vocabulary lt--- F1

6 F1SentCompletion 08560764 SentCompletion lt--- F1

7 F2FirstLetters 08357617 FirstLetters lt--- F2

8 F2FourLetterWords 07971819 FourLetterWords lt--- F2

9 F2Suffixes 07025560 Suffixes lt--- F2

10 F3LetterSeries 07808129 LetterSeries lt--- F3

11 F3Pedigrees 07201599 Pedigrees lt--- F3

12 F3LetterGroup 07034902 LetterGroup lt--- F3

13 e1 01814979 Sentences lt--gt Sentences

14 e2 01649304 Vocabulary lt--gt Vocabulary

15 e3 02671331 SentCompletion lt--gt SentCompletion

16 e4 03015024 FirstLetters lt--gt FirstLetters

17 e5 03645010 FourLetterWords lt--gt FourLetterWords

18 e6 05064151 Suffixes lt--gt Suffixes

19 e7 03903313 LetterSeries lt--gt LetterSeries


20 e8 04813697 Pedigrees lt--gt Pedigrees

21 e9 05051016 LetterGroup lt--gt LetterGroup

22 03241920 F1 lt--gt F1

23 03887891 F2 lt--gt F2

24 03357521 F3 lt--gt F3

25 10000000 g lt--gt g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
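The fit indices in the two summaries can be recomputed from the model χ2, the degrees of freedom, the number of free parameters and the sample size. The sketch below assumes that AIC is χ2 plus twice the number of free parameters and that BIC is χ2 minus the degrees of freedom times log(N). These conventions are inferred from the printed values rather than taken from the sem documentation, and the helper name `fit_indices` is hypothetical.

```r
# Fit indices from summary statistics (assumed conventions, see above).
fit_indices <- function(chisq, df, n_par, n_obs) {
  c(AIC = chisq + 2 * n_par,
    BIC = chisq - df * log(n_obs))
}

n_obs <- 213
# Bifactor: 9 general loadings + 9 group loadings + 9 error variances.
bi <- fit_indices(chisq = 24.2163, df = 18, n_par = 27, n_obs = n_obs)
# Hierarchical: 3 higher order + 9 loadings + 9 error variances.
hi <- fit_indices(chisq = 38.1963, df = 24, n_par = 21, n_obs = n_obs)

# Likelihood ratio test of the six additional bifactor parameters.
lr_chisq <- 38.1963 - 24.2163
lr_df    <- 24 - 18
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

print(round(rbind(bifactor = bi, hierarchical = hi), 2))
print(round(c(LR = lr_chisq, df = lr_df, p = p_value), 4))
```

The likelihood ratio test favors the bifactor model (p of about 0.03), whereas BIC, which penalizes the extra parameters more heavily, is lower for the hierarchical model.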

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

 To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.


> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


> fxh <- structure.list(9,list(X1=c(1:3),X2=c(4:6),X3=c(7:9),g=NULL))
> fy <- structure.list(3,list(Y=c(1,2,3)))
> Phi <- diag(1,5,5)
> Phi[4,c(1:3)] <- letters[1:3]
> Phi[5,4] <- "r"
> plot.new()
> himod <- structure.diagram(fxh,Phi, fy)

Figure 8: A symbolic structural model with a general factor and three group factors. The general factor is regressed on the latent Y variable.

3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and, if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)
Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)
Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   MR1
Correlation of (regression) scores with factors   0.94
Multiple R square of scores with factors          0.88
Minimum correlation of possible factor scores     0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
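The difference between the two solutions can be reproduced without any package. The sketch below, in base R only, builds a population correlation matrix from the one factor loadings recovered by fa above (0.9, 0.8, 0.7, 0.6; we assume this is the structure of cong1$model) and then finds the first principal component as the first eigenvector rescaled by the square root of its eigenvalue. The object names are illustrative.

```r
# Population correlation matrix for a one-factor (congeneric) model.
# The loadings are those recovered by fa() in the output above.
lambda <- c(V1 = 0.9, V2 = 0.8, V3 = 0.7, V4 = 0.6)
R <- lambda %o% lambda            # common part: lambda lambda'
diag(R) <- 1                      # unit variances on the diagonal

# First principal component: eigenvector scaled by sqrt(eigenvalue)
e   <- eigen(R, symmetric = TRUE)
pc1 <- abs(e$vectors[, 1]) * sqrt(e$values[1])
names(pc1) <- names(lambda)

print(round(rbind(model = lambda, PC1 = pc1), 2))
print(round(e$values[1], 2))      # sum of squared PC1 loadings
```

Because the component has to account for the unit diagonal as well as the correlations, its loadings differ from the generating loadings, most noticeably for the variables with the smaller communalities.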

Consider the case of three underlying factors, as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ2 value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")

> pc3
Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3
Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38

> ml3
Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))
     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
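The congruence coefficient is just the cosine of the angle between two columns of loadings, that is, the sum of cross products divided by the square root of the product of the two sums of squares. As a minimal sketch (the function name congruence is hypothetical; psych users would call factor.congruence), two entries of the table above can be recomputed from the printed, rounded loadings:

```r
# Congruence coefficient: cosine of the angle between two loading vectors.
congruence <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Loadings copied from the pc3 (RC1, RC3) and pa3 (PA1) output above
RC1 <- c(0.75, 0.76, 0.78, 0.29, 0.20, 0.07, 0.26, 0.20, 0.00)
RC3 <- c(0.27, 0.21, 0.11, 0.69, 0.71, 0.76, 0.16, 0.11, 0.06)
PA1 <- c(0.8, 0.7, 0.6, 0, 0, 0, 0, 0, 0)

print(round(congruence(RC1, PA1), 2))   # component 1 with factor 1
print(round(congruence(RC1, RC3), 2))   # two components with each other
```

Unlike a correlation, the coefficient does not remove the column means, so components that all carry small positive cross loadings show non-zero congruence with each other.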

By default, principal uses the varimax rotation criterion, while fa uses the oblique oblimin transformation, which is why factor correlations are reported for pa3 and ml3. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p
Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957), with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.
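The Schmid-Leiman transformation itself is simple arithmetic. A minimal sketch in base R follows; it is not the code used by omega. The first order loadings are those printed for pa3 in section 3.1. The loadings of the three factors on g (0.9, 0.8, 0.7) are an assumption, chosen because their products reproduce the factor correlations of 0.72, 0.63 and 0.56 reported there. All object names are illustrative.

```r
# First-order pattern loadings (from the pa3 output) on three oblique factors
Fx <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
Fx[1:3, 1] <- c(0.8, 0.7, 0.6)
Fx[4:6, 2] <- c(0.7, 0.6, 0.5)
Fx[7:9, 3] <- c(0.7, 0.6, 0.4)

# Assumed loadings of the three factors on g; their pairwise products give
# the factor correlations 0.72, 0.63 and 0.56
gload <- c(0.9, 0.8, 0.7)

# Schmid-Leiman: general loadings are Fx %*% gload; group loadings are the
# first-order loadings scaled by the residual sd of each factor
g     <- as.vector(Fx %*% gload)
group <- Fx %*% diag(sqrt(1 - gload^2))
sl    <- cbind(g, group)
colnames(sl) <- c("g", "F1", "F2", "F3")
print(round(sl, 2))
```

The resulting g and residualized group loadings can be compared with the Schmid Leiman loadings printed by omega for the bifact data.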

> ombi
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> ombi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem.

> omhi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0
The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23
The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
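The two omega coefficients for the total score can be recovered from the printed loadings. For a unit weighted sum of the items, the variance due to g is the squared sum of the g loadings, the variance due to each group factor is the squared sum of its loadings, and the remaining variance is the sum of the uniquenesses. The sketch below is a hand calculation under that decomposition, using the rounded values printed above (loadings below 0.2 are treated as zero); it is not the code used by omega.

```r
# Schmid-Leiman loadings and uniquenesses copied from the omega output above
g  <- c(0.72, 0.63, 0.54, 0.56, 0.48, 0.40, 0.49, 0.42, 0.28)
F1 <- c(0.35, 0.31, 0.26)      # group loadings of V1-V3
F2 <- c(0.42, 0.36, 0.30)      # group loadings of V4-V6
F3 <- c(0.50, 0.43, 0.29)      # group loadings of V7-V9
u2 <- c(0.36, 0.51, 0.64, 0.51, 0.64, 0.75, 0.51, 0.64, 0.84)

general <- sum(g)^2                            # variance due to g
groups  <- sum(F1)^2 + sum(F2)^2 + sum(F3)^2   # variance due to group factors
total   <- general + groups + sum(u2)          # variance of the unit-weighted sum

omega_h <- general / total
omega_t <- (general + groups) / total
print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

These agree, to two decimals, with the Omega Hierarchical and Omega Total values reported for the bifact data.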

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

 $model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

 $reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gregpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function, which requires Rgraphviz, or the structure.diagram or structure.sem functions, which do not, may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> modtau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> modtau  #show it
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semtau <- sem(modtau,cong,100)
> summary(semtau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> modcong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> modcong  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semcong <- sem(modcong,cong,100)
> summary(semcong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(semcong,semtau)  #test the difference between the two models

LR Test for Difference Between Models

        Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semcong        2      2.9417
semtau         5      6.5935  3   3.6518     0.3016

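The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65) is not significant given the change in degrees of freedom (3, p = 0.30). With only 100 subjects, the (false) hypothesis of tau equivalence cannot be rejected.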
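The degrees of freedom of the two models follow from counting parameters: four observed variables supply 4*5/2 = 10 unique variances and covariances, the tau equivalent model estimates 5 parameters (one common loading and four error variances), and the congeneric model estimates 8. The sketch below, in base R, also recomputes the AIC and BIC lines of the two summaries. The formulas used (AIC = χ2 + 2q and BIC = χ2 - df log N) are inferred from the printed values rather than taken from the sem source, and the function name fit_indices is hypothetical.

```r
# Fit summaries that follow from the chi-square, the number of observed
# variables p, the number of free parameters q, and the sample size N.
# The AIC and BIC formulas are assumptions inferred from the printed output.
fit_indices <- function(chisq, p, q, N) {
  df <- p * (p + 1) / 2 - q          # unique (co)variances minus free parameters
  c(df  = df,
    AIC = chisq + 2 * q,
    BIC = chisq - df * log(N))
}

# tau equivalent: one common loading + 4 error variances = 5 parameters
tau  <- fit_indices(6.593496, p = 4, q = 5, N = 100)
# congeneric: 4 loadings + 4 error variances = 8 parameters
cong_fit <- fit_indices(2.941678, p = 4, q = 8, N = 100)
print(round(rbind(tau = tau, congeneric = cong_fit), 4))
```

The same count applies to larger models: nine variables give 45 unique elements, so a one factor model with nine loadings and nine error variances has 27 degrees of freedom.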

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. modone is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> modone <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> modone  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semone <- sem(modone,bifact,100)
> summary(semone,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(semone),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

gt f1 lt- fa(bifact)

gt modf1 lt- structuresem(f1)

gt semf1 lt- sem(modf1bifact100)

gt semf1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
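Why the between-group residuals are so large can be seen with a few lines of base R. The sketch below takes the loadings estimated above and assumes the factor intercorrelations are 0.72, 0.63 and 0.56 (the values the oblique exploratory solutions report for these data); it then compares the correlations implied by correlated factors with those implied by orthogonal factors. The helper name `implied` is introduced here for illustration only.

```r
# First-order loadings as estimated above (rows V1..V9, columns MR1, MR3, MR2)
fx <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), c("MR1", "MR3", "MR2")))
fx[1:3, "MR1"] <- c(0.8, 0.7, 0.6)
fx[4:6, "MR3"] <- c(0.7, 0.6, 0.5)
fx[7:9, "MR2"] <- c(0.7, 0.6, 0.4)

# Assumption: intercorrelations of the three factors in the generating model
Phi <- matrix(c(1.00, 0.72, 0.63,
                0.72, 1.00, 0.56,
                0.63, 0.56, 1.00), 3, 3)

# Model-implied correlation matrix: F Phi F' with a unit diagonal
implied <- function(f, phi) {
  r <- f %*% phi %*% t(f)
  diag(r) <- 1
  r
}

R.oblique    <- implied(fx, Phi)      # factors allowed to correlate
R.orthogonal <- implied(fx, diag(3))  # factors forced to be independent

# What an orthogonal model cannot reproduce
resid <- round(R.oblique - R.orthogonal, 2)
resid[c("V1", "V4", "V7"), c("V1", "V4", "V7")]
```

Under these assumptions the within-factor blocks of `resid` are zero and the between-factor blocks carry the whole of the factor correlations, which is the pattern seen in the residual matrix above.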

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)            #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)   #create the sem model
> mod.f3                        #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"
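To make the layout of this model table concrete, the following sketch builds the first nine rows (the factor to variable paths) from a loading matrix. It is a hypothetical helper written for illustration, not the code used by structure.sem; the function name `loadings_to_paths` and the 0.3 cut-off are assumptions.

```r
# Hypothetical helper for illustration only; this is NOT psych's structure.sem.
# Every loading larger than `cut` becomes one row: path, parameter name, start value.
loadings_to_paths <- function(L, cut = 0.3) {
  rows <- list()
  for (j in seq_len(ncol(L))) {
    for (i in seq_len(nrow(L))) {
      if (abs(L[i, j]) > cut) {
        rows[[length(rows) + 1]] <- c(
          Path      = paste0(colnames(L)[j], "->", rownames(L)[i]),
          Parameter = paste0("F", j, rownames(L)[i]),
          Value     = NA_character_)   # NA marks a free parameter
      }
    }
  }
  do.call(rbind, rows)
}

# Loading pattern of the three factor solution shown above
L <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), c("MR1", "MR3", "MR2")))
L[1:3, 1] <- c(0.8, 0.7, 0.6)
L[4:6, 2] <- c(0.7, 0.6, 0.5)
L[7:9, 3] <- c(0.7, 0.6, 0.4)

paths <- loadings_to_paths(L)
paths
```

The error variances and factor covariances in the full table follow the same three-column convention, with fixed parameters carrying a value instead of a name.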

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor oblique solution. [Path diagram titled "Structural model": correlated factors MR1, MR3 and MR2, each with three of the indicators V1 to V9.]

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is “lazy loaded”). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07
e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability, the residual factors account for the correlations between items within groups. [Omega path diagram: general factor g and group factors F1, F2 and F3, with the nine Thurstone variables as indicators.]

> stdCoef(sem.bi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.
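The AIC and BIC reported in the summary for the bifactor model can be recovered from the χ² value. The sketch below assumes the conventions that the printed numbers are consistent with, namely AIC = χ² + 2t (t free parameters) and BIC = χ² − df·log(N); these formulas are inferred from the output rather than quoted from the sem documentation.

```r
# Values taken from the bifactor summary for the Thurstone data
chisq <- 24.2163
df    <- 18
n.obs <- 213
n.par <- 27   # 9 general loadings + 9 group loadings + 9 error variances

# Assumed conventions (inferred from the printed output)
aic <- chisq + 2 * n.par
bic <- chisq - df * log(n.obs)

round(c(AIC = aic, BIC = bic), 4)
```

A negative BIC under this convention means that the χ² is small relative to what the degrees of freedom and sample size would allow.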

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package. [Omega path diagram: g with second-order paths to F1, F2 and F3, which in turn load on the nine Thurstone variables.]

> stdCoef(sem.hi,digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g
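The link between the raw and the standardized estimates of the hierarchical model can be checked by hand. The sketch below assumes that the variance of g and the residual variances of F1, F2 and F3 were all fixed at 1 when the model was fit (an assumption, although the standardized values printed above are consistent with it), so that the total variance of each lower-order factor is b² + 1.

```r
# Unstandardized g -> F paths from the parameter estimates above
b <- c(gF1 = 1.4438115, gF2 = 1.2538296, gF3 = 1.4065517)

resid.var <- 1                      # assumed fixed residual variance of F1, F2, F3
total.var <- b^2 * 1 + resid.var    # var(F) = b^2 * var(g) + residual, with var(g) = 1

std.b     <- b / sqrt(total.var)    # standardized g -> F loadings
std.resid <- resid.var / total.var  # standardized residual variances of the F's

# A first-order loading is rescaled by the standard deviation of its factor
std.sentences <- 0.5151232 * sqrt(total.var[["gF1"]])

round(std.b, 4)
round(std.resid, 4)
round(std.sentences, 4)
```

Under this assumption the results agree with rows 1 to 4 and 22 to 24 of the stdCoef output.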

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
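The likelihood ratio test behind this comparison is just a χ² difference test for nested models, and it can be reproduced in base R from the two model summaries. This is a minimal sketch using the printed fit statistics; the variable names are chosen here for illustration.

```r
# Fit statistics from the two summaries (hierarchical and bifactor models)
chisq.hi <- 38.1963; df.hi <- 24
chisq.bi <- 24.2163; df.bi <- 18

# The hierarchical model is nested within the bifactor model,
# so the difference in chi square is itself distributed as chi square
lr.chisq <- chisq.hi - chisq.bi
lr.df    <- df.hi - df.bi
p.value  <- pchisq(lr.chisq, df = lr.df, lower.tail = FALSE)

round(c(LR.Chisq = lr.chisq, Df = lr.df, p = p.value), 5)
```

The difference of 13.98 on 6 degrees of freedom gives a probability just under .03, matching the anova output.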

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g  F1*  F2*  F3*   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g  F1*  F2*  F3*
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g  F1*  F2*  F3*
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g  F1*  F2*  F3*   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g  F1*  F2*  F3*
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g   F1*  F2*  F3*
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ω_h are not identical from the EFA and CFA models. The CFA solution yields higher values of ω_h because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
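How ω_h and ω_t follow from a bifactor loading pattern can be sketched with the confirmatory loadings reported by omegaSem. The calculation below assumes the usual model-based definitions for a pure cluster solution: the general variance is the squared sum of the g loadings, each group contributes the squared sum of its loadings, and the total variance adds the uniquenesses. It uses the two-decimal loadings as printed, so it is an approximation to what the function computes from the full solution.

```r
# Confirmatory loadings as printed (order: Sentences ... LetterGroup)
g   <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
grp <- list(F1 = c(0.49, 0.45, 0.40),   # Sentences, Vocabulary, SentCompletion
            F2 = c(0.61, 0.50, 0.39),   # FirstLetters, FourLetterWords, Suffixes
            F3 = c(0.73, 0.25, 0.41))   # LetterSeries, Pedigrees, LetterGroup
u2  <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)

v.general <- sum(g)^2                                # variance due to g
v.group   <- sum(sapply(grp, function(x) sum(x)^2))  # variance due to group factors
v.total   <- v.general + v.group + sum(u2)           # model-implied total variance

omega.h <- v.general / v.total
omega.t <- (v.general + v.group) / v.total

round(c(omega.h = omega.h, omega.t = omega.t), 2)
```

With no cross loadings, all covariance between clusters has to pass through g, which is why the general variance (and hence ω_h) is larger in the confirmatory solution than in the exploratory one.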

Figure 16: Confirmatory Omega structure using omegaSem. [Path diagram titled "Omega from SEM": general factor g and group factors F1, F2 and F3, with the nine Thurstone variables as indicators.]

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application. The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


3.1 Exploratory simple structure models

The technique of principal components provides a set of weighted linear composites that best approximates a particular correlation or covariance matrix. If these are then rotated to provide a more interpretable solution, the components are no longer the principal components. The principal function will extract the first n principal components (default value is 1) and, if n>1, rotate to simple structure using a varimax, quartimin, or Promax criterion.

> principal(cong1$model)

Principal Components Analysis
Call: principal(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
    PC1   h2   u2 com
V1 0.89 0.80 0.20   1
V2 0.85 0.73 0.27   1
V3 0.80 0.64 0.36   1
V4 0.73 0.53 0.47   1

                PC1
SS loadings    2.69
Proportion Var 0.67

Mean item complexity =  1
Test of the hypothesis that 1 component is sufficient.

The root mean square of the residuals (RMSR) is  0.11

Fit based upon off diagonal values = 0.96

> fa(cong1$model)

Factor Analysis using method =  minres
Call: fa(r = cong1$model)
Standardized loadings (pattern matrix) based upon correlation matrix
   MR1   h2   u2 com
V1 0.9 0.81 0.19   1
V2 0.8 0.64 0.36   1
V3 0.7 0.49 0.51   1
V4 0.6 0.36 0.64   1

                MR1
SS loadings    2.30
Proportion Var 0.57

Mean item complexity =  1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are  6  and the objective function was  1.65
The degrees of freedom for the model are 2  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 MR1
Correlation of (regression) scores with factors 0.94
Multiple R square of scores with factors        0.88
Minimum correlation of possible factor scores   0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.
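The difference between the two approaches can be seen without any package. The sketch below rebuilds a congeneric correlation matrix from the loadings recovered by fa above (0.9, 0.8, 0.7, 0.6, taken here as the generating values) and extracts the first principal component with base R's eigen function.

```r
# Generating loadings of the congeneric model (as recovered by fa above)
loads <- c(V1 = 0.9, V2 = 0.8, V3 = 0.7, V4 = 0.6)

# Correlation matrix implied by one factor: off-diagonals are products of loadings
R <- loads %o% loads
diag(R) <- 1

# First principal component: eigenvector scaled by the root of its eigenvalue
e   <- eigen(R, symmetric = TRUE)
pc1 <- abs(e$vectors[, 1]) * sqrt(e$values[1])   # abs() removes the arbitrary sign
names(pc1) <- names(loads)

round(rbind(factor = loads, component = pc1), 2)

# The factor loadings reproduce every off-diagonal correlation exactly
all.equal((loads %o% loads)[upper.tri(R)], R[upper.tri(R)])
```

The component has to account for the unit diagonal as well as the correlations, so its loadings are pulled away from the generating values; the factor model only has to fit the off-diagonal elements and recovers them exactly.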

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ² value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3

Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3

Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                PA1  PA3  PA2
Correlation of (regression) scores with factors 0.9 0.85 0.83
Multiple R square of scores with factors        0.8 0.72 0.69
Minimum correlation of possible factor scores   0.6 0.45 0.38

> ml3

Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 ML1  ML3  ML2
Correlation of (regression) scores with factors 0.90 0.85 0.83
Multiple R square of scores with factors        0.80 0.72 0.69
Minimum correlation of possible factor scores   0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))

     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
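The congruence coefficient itself is simple to compute: it is the cosine of the angle between two columns of loadings, that is, their cross product divided by the product of their lengths. The sketch below applies that definition to the printed (two-decimal) loadings of the first rotated component and of two of the principal axis factors; the function name `congruence` and the vector names are introduced here for illustration.

```r
# Cosine of the angle between two loading vectors (not mean centered)
congruence <- function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# Printed loadings for V1..V9
rc1     <- c(0.75, 0.76, 0.78, 0.29, 0.20, 0.07, 0.26, 0.20, 0.00)  # component RC1
pa1.col <- c(0.8, 0.7, 0.6, 0, 0, 0, 0, 0, 0)                        # factor PA1
pa3.col <- c(0, 0, 0, 0.7, 0.6, 0.5, 0, 0, 0)                        # factor PA3

round(c(RC1.PA1 = congruence(rc1, pa1.col),
        RC1.PA3 = congruence(rc1, pa3.col),
        PA1.PA3 = congruence(pa1.col, pa3.col)), 2)
```

Because the loadings are not centered, two factors with non-overlapping patterns have a congruence of exactly zero even when the factors themselves are correlated, which is why PA1, PA3 and PA2 show zeros in the table.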

By default, all three of these procedures use the varimax rotation criterion. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p

Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

322 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega functionThe first was done using a Schmid-Leiman transformation the second was not As will beseen later the objects returned from these two analyses may be used as models for a sem

analysis It is also useful to examine the estimates of reliability reported by omega

> ombi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> ombi <- omega(bifact)

[Path diagram titled "Omega": variables V1 to V9 load on the general factor g and on the residual group factors F1, F2 and F3.]

Figure 9: An exploratory bifactor solution to the nine variable problem.

> omhi <- omega(bifact,sl=FALSE)

[Path diagram titled "Omega": V1 to V3 load on F1 (0.8, 0.7, 0.6), V4 to V6 on F2 (0.7, 0.6, 0.5), V7 to V9 on F3 (0.7, 0.6, 0.4); F1, F2 and F3 load on g (0.9, 0.8, 0.7).]

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0
The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23
The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
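The omega estimates reported above can be approximated by hand from the Schmid-Leiman loadings. The following is a minimal sketch in base R, assuming the usual definitions (omega hierarchical is the squared sum of the general factor loadings divided by the variance of the total score; omega total is one minus the summed uniquenesses divided by that variance). The loadings are copied from the two-decimal table, so the results agree with the printed values only to rounding; the variable names (g, grp, group, Vx) are illustrative and are not objects returned by omega.

```r
# Schmid-Leiman loadings copied from the omega output (rounded to 2 decimals)
g   <- c(0.72, 0.63, 0.54, 0.56, 0.48, 0.40, 0.49, 0.42, 0.28)  # general factor
grp <- c(0.35, 0.31, 0.26, 0.42, 0.36, 0.30, 0.50, 0.43, 0.29)  # own group factor
group <- rep(1:3, each = 3)       # V1-V3 on F1, V4-V6 on F2, V7-V9 on F3

# Uniqueness of each variable under the bifactor model
u2 <- 1 - g^2 - grp^2

# Variance of the unit-weighted total score implied by the model:
# general part + one part per group factor + unique variance
Vx <- sum(g)^2 + sum(tapply(grp, group, sum)^2) + sum(u2)

omega_h <- sum(g)^2 / Vx          # omega hierarchical
omega_t <- 1 - sum(u2) / Vx       # omega total
omega_h_asymptotic <- omega_h / omega_t

round(c(omega_h = omega_h, omega_h_asymptotic = omega_h_asymptotic,
        omega_t = omega_t), 2)
```

Under these assumptions the three values should round to the Omega Hierarchical, Omega H asymptotic and Omega Total lines of the output.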

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

[Cluster diagram titled "Hierarchical cluster analysis of bifact data": variables V1 to V9 are combined into clusters C1 to C8, each labeled with its α.]

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found, and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
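To see where this population matrix comes from, the structural model can be written out in plain matrix algebra. The sketch below uses base R only and is not the sim.structural code itself. The values of fx, fy and Phi are not shown in this section; the ones used here are an assumption, inferred from the printed correlations and reliabilities.

```r
# Assumed measurement model for the X set (inferred from the printed matrix)
fx <- matrix(c(0.9, 0.8, 0.6, 0.0,  0.0,
               0.0, 0.0, 0.6, 0.8, -0.7), ncol = 2)
# Assumed measurement model for the Y set
fy <- matrix(c(0.6, 0.5, 0.4), ncol = 1)
# Assumed correlations among the latent variables X1, X2 and Y1
Phi <- matrix(c(1.0, 0.0, 0.7,
                0.0, 1.0, 0.7,
                0.7, 0.7, 1.0), ncol = 3)

# Combined loading matrix: X variables load on X1/X2, Y variables on Y1
L <- rbind(cbind(fx, 0), cbind(matrix(0, nrow(fy), ncol(fx)), fy))
dimnames(L) <- list(c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA"),
                    c("X1", "X2", "Y1"))

# Model-implied correlations; the diagonal of L Phi L' holds the
# common (reliable) variance of each variable
model <- L %*% Phi %*% t(L)
reliability <- diag(model)
diag(model) <- 1

round(model, 2)
round(reliability, 2)
```

With these assumed values, the rounded model and reliability objects should correspond to the $model and $reliability elements printed above.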

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gregpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the X set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the Y set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

[Path diagram titled "Exploratory Structural Model": V, Q and A load on X1; A, nach and Anx load on X2; gpa, Pre and MA load on Y1; X1 and X2 are each connected to Y1.]

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function (which requires Rgraphviz), or the structure.diagram or structure.sem functions (which do not), may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> modtau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> modtau   #show it

      Path     Parameter Value
 [1,] "X1->V1" "a"       NA
 [2,] "X1->V2" "a"       NA
 [3,] "X1->V3" "a"       NA
 [4,] "X1->V4" "a"       NA
 [5,] "V1<->V1" "x1e"    NA
 [6,] "V2<->V2" "x2e"    NA
 [7,] "V3<->V3" "x3e"    NA
 [8,] "V4<->V4" "x4e"    NA
 [9,] "X1<->X1" NA       "1"
attr(,"class")
[1] "mod"

> semtau <- sem(modtau,cong,100)
> summary(semtau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> modcong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> modcong   #show the model

      Path     Parameter Value
 [1,] "X1->V1" "a"       NA
 [2,] "X1->V2" "b"       NA
 [3,] "X1->V3" "c"       NA
 [4,] "X1->V4" "d"       NA
 [5,] "V1<->V1" "x1e"    NA
 [6,] "V2<->V2" "x2e"    NA
 [7,] "V3<->V3" "x3e"    NA
 [8,] "V4<->V4" "x4e"    NA
 [9,] "X1<->X1" NA       "1"
attr(,"class")
[1] "mod"

> semcong <- sem(modcong,cong,100)
> summary(semcong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(semcong,semtau)   #test the difference between the two models
LR Test for Difference Between Models

        Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semcong        2      2.9417
semtau         5      6.5935  3   3.6518     0.3016

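The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65) is not significant given the change in degrees of freedom (3, p = 0.30). With only 100 subjects, the (false) tau equivalent model cannot be rejected in favor of the congeneric model.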
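The entries of the anova table, and the AIC and BIC lines of the two summaries, can be reproduced from the printed chi-square values. This is a minimal sketch in base R rather than a call into the sem package. It assumes that AIC is computed as chi-square plus twice the number of free parameters and BIC as chi-square minus df times log(N), which is consistent with the values printed above. The helper fit_ic and the list names are hypothetical.

```r
# Values copied from the two summary() outputs; N = 100 subjects
n    <- 100
tau  <- list(chisq = 6.593496, df = 5, npar = 5)  # a + four error variances
cong <- list(chisq = 2.941678, df = 2, npar = 8)  # a, b, c, d + four error variances

# Likelihood ratio (chi-square difference) test of the nested models
lr_chisq <- tau$chisq - cong$chisq
lr_df    <- tau$df - cong$df
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

# Hypothetical helper: information criteria under the assumptions stated above
fit_ic <- function(m, n) {
  c(AIC = m$chisq + 2 * m$npar,
    BIC = m$chisq - m$df * log(n))
}

round(c(LR.Chisq = lr_chisq, Df = lr_df, p = p_value), 4)
round(rbind(tau = fit_ic(tau, n), cong = fit_ic(cong, n)), 4)
```

The first line of output corresponds to the LR Chisq, Df and Pr(>Chisq) columns of the anova table; the second table corresponds to the AIC and BIC lines of the tau equivalent and congeneric summaries.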

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. modone is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model can not be rejected. However, an examination of the residuals suggests serious problems with the model.

> modone <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> modone   #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semone <- sem(modone,bifact,100)
> summary(semone,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(semone),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
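The pattern in these residuals (positive within each group of three variables, slightly negative between groups) follows directly from fitting one factor to a hierarchical structure. The sketch below is a base R illustration, not sem output. It assumes the bifact matrix is generated by the hierarchical loadings shown in Figure 10 (fload and gload are names chosen here for illustration) and uses the one factor estimates a to i printed above.

```r
# Lower-order loadings (V1..V9 on F1..F3), assumed from the hierarchical solution
fload <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), paste0("F", 1:3)))
fload[1:3, 1] <- c(0.8, 0.7, 0.6)
fload[4:6, 2] <- c(0.7, 0.6, 0.5)
fload[7:9, 3] <- c(0.7, 0.6, 0.4)
# Higher-order loadings of F1..F3 on g, also assumed from the figure
gload <- c(0.9, 0.8, 0.7)

# Factor intercorrelations implied by g, then the population correlations
Phi <- outer(gload, gload)
diag(Phi) <- 1
R <- fload %*% Phi %*% t(fload)
diag(R) <- 1

# One factor loadings as estimated by sem (parameters a ... i)
one <- c(0.7507129, 0.6726412, 0.5811209, 0.5737425, 0.5021915,
         0.4239908, 0.5067957, 0.4450171, 0.3052415)

# Residual = observed correlation minus the correlation implied by one factor
resid_one <- R - outer(one, one)
diag(resid_one) <- 0
round(resid_one, 2)
```

A single factor cannot reproduce both the higher within-group correlations and the lower between-group correlations, so it underestimates the former and overestimates the latter.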

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> modf1 <- structure.sem(f1)
> semf1 <- sem(modf1,bifact,100)
> semf1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> modf3 <- structure.sem(f3)
> semf3 <- sem(modf3,bifact,100)
> summary(semf3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(semf3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)   #extract three factors and do an oblique rotation
> modf3 <- structure.sem(f3)   #create the sem model
> modf3   #show it

      Path       Parameter Value
 [1,] "MR1->V1"  "F1V1"    NA
 [2,] "MR1->V2"  "F1V2"    NA
 [3,] "MR1->V3"  "F1V3"    NA
 [4,] "MR3->V4"  "F2V4"    NA
 [5,] "MR3->V5"  "F2V5"    NA
 [6,] "MR3->V6"  "F2V6"    NA
 [7,] "MR2->V7"  "F3V7"    NA
 [8,] "MR2->V8"  "F3V8"    NA
 [9,] "MR2->V9"  "F3V9"    NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"  NA
[20,] "MR2<->MR1" "rF3F1"  NA
[21,] "MR2<->MR3" "rF3F2"  NA
[22,] "MR1<->MR1" NA       "1"
[23,] "MR3<->MR3" NA       "1"
[24,] "MR2<->MR2" NA       "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

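[Path diagram titled "Structural model": V1 to V3 load on MR1 (0.8, 0.7, 0.6), V4 to V6 on MR3 (0.7, 0.6, 0.5), V7 to V9 on MR2 (0.7, 0.6, 0.4); the three factors are correlated (0.7, 0.6, 0.6).]

Figure 13: A three factor, oblique solution.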

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sembi <- sem(omthbi$model$sem,Thurstone,213)   #use the model created by omega
> summary(sembi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27 Sentences <--- g
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30 Vocabulary <--- g
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26 SentCompletion <--- g
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18 FirstLetters <--- g
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17 FourLetterWords <--- g
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15 Suffixes <--- g
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15 LetterSeries <--- g
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21 Pedigrees <--- g
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13 LetterGroup <--- g
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09 Sentences <--- F1
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08 Vocabulary <--- F1
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06 SentCompletion <--- F1
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13 FirstLetters <--- F2
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10 FourLetterWords <--- F2
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07 Suffixes <--- F2
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06 LetterSeries <--- F3
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03 Pedigrees <--- F3
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04 LetterGroup <--- F3
e1                0.1723633 0.03405646  5.0611045 4.168346e-07 Sentences <--> Sentences
e2                0.1698419 0.03001233  5.6590697 1.521958e-08 Vocabulary <--> Vocabulary
e3                0.2684749 0.03316228  8.0957909 5.689350e-16 SentCompletion <--> SentCompletion
e4                0.2528108 0.07942835  3.1828791 1.458185e-03 FirstLetters <--> FirstLetters
e5                0.3873510 0.06317399  6.1314949 8.705712e-10 FourLetterWords <--> FourLetterWords
e6                0.5175679 0.05955079  8.6912013 3.586269e-18 Suffixes <--> Suffixes
e7                0.1496709 0.21861502  0.6846325 4.935759e-01 LetterSeries <--> LetterSeries
e8                0.5003855 0.05956551  8.4005902 4.442400e-17 Pedigrees <--> Pedigrees
e9                0.5517474 0.08455914  6.5249884 6.800680e-11 LetterGroup <--> LetterGroup

 Iterations =  72

[Path diagram titled "Omega": the nine Thurstone variables load on the general factor g and on the group factors F1, F2 and F3.]

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability, the residual factors account for the correlations between items within groups.

> stdCoef(sembi,digits=2)

                     Std. Estimate
1  Sentences             0.7678671 Sentences <--- g
2  Vocabulary            0.7909246 Vocabulary <--- g
3  SentCompletion        0.7536211 SentCompletion <--- g
4  FirstLetters          0.6083814 FirstLetters <--- g
5  FourLetterWords       0.5973349 FourLetterWords <--- g
6  Suffixes              0.5717900 Suffixes <--- g
7  LetterSeries          0.5668950 LetterSeries <--- g
8  Pedigrees             0.6623317 Pedigrees <--- g
9  LetterGroup           0.5299523 LetterGroup <--- g
10 F1Sentences           0.4878697 Sentences <--- F1
11 F1Vocabulary          0.4523233 Vocabulary <--- F1
12 F1SentCompletion      0.4044507 SentCompletion <--- F1
13 F2FirstLetters        0.6140531 FirstLetters <--- F2
14 F2FourLetterWords     0.5058063 FourLetterWords <--- F2
15 F2Suffixes            0.3943206 Suffixes <--- F2
16 F3LetterSeries        0.7272957 LetterSeries <--- F3
17 F3Pedigrees           0.2468418 Pedigrees <--- F3
18 F3LetterGroup         0.4091494 LetterGroup <--- F3
19 e1                    0.1723633 Sentences <--> Sentences
20 e2                    0.1698418 Vocabulary <--> Vocabulary
21 e3                    0.2684748 SentCompletion <--> SentCompletion
22 e4                    0.2528108 FirstLetters <--> FirstLetters
23 e5                    0.3873510 FourLetterWords <--> FourLetterWords
24 e6                    0.5175675 Suffixes <--> Suffixes
25 e7                    0.1496710 LetterSeries <--> LetterSeries
26 e8                    0.5003859 Pedigrees <--> Pedigrees
27 e9                    0.5517473 LetterGroup <--> LetterGroup
28                       1.0000000 F1 <--> F1
29                       1.0000000 F2 <--> F2
30                       1.0000000 F3 <--> F3
31                       1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> semhi <- sem(omhi$model$sem,Thurstone,213)
> summary(semhi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

 Iterations =  54

[Path diagram titled "Omega": the nine Thurstone variables load on the group factors F1, F2 and F3, which in turn load on the general factor g.]

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(semhi,digits=2)

                     Std. Estimate
1  gF1                   0.8220754 F1 <--- g
2  gF2                   0.7817998 F2 <--- g
3  gF3                   0.8150140 F3 <--- g
4  F1Sentences           0.9047111 Sentences <--- F1
5  F1Vocabulary          0.9138214 Vocabulary <--- F1
6  F1SentCompletion      0.8560764 SentCompletion <--- F1
7  F2FirstLetters        0.8357617 FirstLetters <--- F2
8  F2FourLetterWords     0.7971819 FourLetterWords <--- F2
9  F2Suffixes            0.7025560 Suffixes <--- F2
10 F3LetterSeries        0.7808129 LetterSeries <--- F3
11 F3Pedigrees           0.7201599 Pedigrees <--- F3
12 F3LetterGroup         0.7034902 LetterGroup <--- F3
13 e1                    0.1814979 Sentences <--> Sentences
14 e2                    0.1649304 Vocabulary <--> Vocabulary
15 e3                    0.2671331 SentCompletion <--> SentCompletion
16 e4                    0.3015024 FirstLetters <--> FirstLetters
17 e5                    0.3645010 FourLetterWords <--> FourLetterWords
18 e6                    0.5064151 Suffixes <--> Suffixes
19 e7                    0.3903313 LetterSeries <--> LetterSeries
20 e8                    0.4813697 Pedigrees <--> Pedigrees
21 e9                    0.5051016 LetterGroup <--> LetterGroup
22                       0.3241920 F1 <--> F1
23                       0.3887891 F2 <--> F2
24                       0.3357521 F3 <--> F3
25                       1.0000000 g <--> g

> anova(semhi,sembi)
LR Test for Difference Between Models

      Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semhi       24      38.196
sembi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
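The comparison reported by anova can be checked by hand from the two model summaries. The following is only a worked restatement of the printed values, not additional output:

```latex
\Delta\chi^2 = \chi^2_{\mathrm{hierarchical}} - \chi^2_{\mathrm{bifactor}}
             = 38.196 - 24.216 = 13.98,
\qquad
\Delta df = 24 - 18 = 6,
\qquad
p = P\left(\chi^2_{6} \ge 13.98\right) \approx 0.030
```

The six degrees of freedom are the six additional parameters of the bifactor model, and the p value corresponds to the Pr(>Chisq) entry of the anova table.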

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> omholzbi <- omega(Holzinger,4)
> semholzbi <- sem(omholzbi$model$sem,Holzinger,355)
> omholzhi <- omega(Holzinger,4,sl=FALSE)
> semholzhi <- sem(omholzhi$model$sem,Holzinger,355)
> anova(semholzbi,semholzhi)
LR Test for Difference Between Models

          Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semholzbi       63      147.66
semholzhi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> omsem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

The following analyses were done using the lavaan package

Omega Hierarchical from a confirmatory model using sem =  0.79
Omega Total from a confirmatory model using sem =  0.93
With loadings of
                     g  F1*  F2*  F3*   h2   u2   p2
Sentences         0.77 0.49           0.82 0.18 0.72
Vocabulary        0.79 0.45           0.83 0.17 0.75
Sent.Completion   0.75 0.40           0.73 0.27 0.77
First.Letters     0.61      0.61      0.74 0.26 0.50
Four.Letter.Words 0.60      0.50      0.61 0.39 0.59
Suffixes          0.57      0.39      0.48 0.52 0.68
Letter.Series     0.57           0.73 0.85 0.15 0.38
Pedigrees         0.66           0.25 0.50 0.50 0.87
Letter.Group      0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g  F1*  F2*  F3*
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g   F1*  F2*  F3*
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object
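The BIC values in the exploratory omega output can be checked by hand. The following is a minimal sketch in base R, assuming that the reported BIC is the model chi square minus the degrees of freedom times the log of the number of observations; the helper name `bic_from_chisq` is illustrative and is not a psych function.

```r
# Hypothetical helper (not part of psych): BIC as chi square penalized by df * log(N).
bic_from_chisq <- function(chisq, df, n_obs) {
  chisq - df * log(n_obs)
}

# Chi square and df values copied from the Thurstone omega output (N = 213).
bic_bifactor <- bic_from_chisq(chisq = 2.98,   df = 12, n_obs = 213)
bic_general  <- bic_from_chisq(chisq = 306.71, df = 27, n_obs = 213)

print(round(c(bifactor = bic_bifactor, general_only = bic_general), 2))
```

Under this assumption the two values agree with the printed BIC of -61.36 for the model with group factors and 161.96 for the general factor alone.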

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to go through the g factor.

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

[Figure: path diagram titled "Omega from SEM", showing the general factor g and the group factors F1, F2, and F3 for the nine Thurstone variables]

Figure 16: Confirmatory Omega structure using omegaSem

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


                                                MR1
Correlation of (regression) scores with factors 0.94
Multiple R square of scores with factors        0.88
Minimum correlation of possible factor scores   0.77

It is important to note that although the principal components function does not exactly reproduce the model parameters, the factor.pa function, implementing principal axes or minimum residual (minres) factor analysis, does.

Consider the case of three underlying factors as seen in the bifact example above. Because the number of observations is not specified, there is no associated χ2 value. The factor.congruence function reports the cosine of the angle between the factors.

> pc3 <- principal(bifact,3)
> pa3 <- fa(bifact,3,fm="pa")
> ml3 <- fa(bifact,3,fm="ml")
> pc3

Principal Components Analysis
Call: principal(r = bifact, nfactors = 3)
Standardized loadings (pattern matrix) based upon correlation matrix
    RC1  RC3  RC2   h2   u2 com
V1 0.75 0.27 0.21 0.69 0.31 1.4
V2 0.76 0.21 0.16 0.64 0.36 1.2
V3 0.78 0.11 0.10 0.63 0.37 1.1
V4 0.29 0.69 0.15 0.59 0.41 1.5
V5 0.20 0.71 0.11 0.56 0.44 1.2
V6 0.07 0.76 0.08 0.59 0.41 1.0
V7 0.26 0.16 0.70 0.58 0.42 1.4
V8 0.20 0.11 0.71 0.55 0.45 1.2
V9 0.00 0.06 0.73 0.53 0.47 1.0

                       RC1  RC3  RC2
SS loadings           1.99 1.73 1.64
Proportion Var        0.22 0.19 0.18
Cumulative Var        0.22 0.41 0.60
Proportion Explained  0.37 0.32 0.31
Cumulative Proportion 0.37 0.69 1.00

Mean item complexity =  1.2
Test of the hypothesis that 3 components are sufficient.

The root mean square of the residuals (RMSR) is  0.1

Fit based upon off diagonal values = 0.88

> pa3

Factor Analysis using method =  pa
Call: fa(r = bifact, nfactors = 3, fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
   PA1 PA3  PA2   h2   u2 com
V1 0.8 0.0 0.00 0.64 0.36   1
V2 0.7 0.0 0.00 0.49 0.51   1
V3 0.6 0.0 0.00 0.36 0.64   1
V4 0.0 0.7 0.00 0.49 0.51   1
V5 0.0 0.6 0.00 0.36 0.64   1
V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                PA1  PA3  PA2
Correlation of (regression) scores with factors 0.9 0.85 0.83
Multiple R square of scores with factors        0.8 0.72 0.69
Minimum correlation of possible factor scores   0.6 0.45 0.38

> ml3

Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 ML1  ML3  ML2
Correlation of (regression) scores with factors 0.90 0.85 0.83
Multiple R square of scores with factors        0.80 0.72 0.69
Minimum correlation of possible factor scores   0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))

     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
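The congruence coefficient is simply the cosine of the angle between two columns of loadings. As a rough check, the sketch below recomputes one entry of the table in base R from the rounded RC1 and PA1 loadings printed above; the function name `congruence` is illustrative, and this is not the code used inside factor.congruence.

```r
# Cosine of the angle between two loading vectors (congruence coefficient).
congruence <- function(x, y) {
  sum(x * y) / sqrt(sum(x^2) * sum(y^2))
}

# Rounded loadings copied from the printed pc3 (RC1) and pa3 (PA1) solutions.
rc1 <- c(0.75, 0.76, 0.78, 0.29, 0.20, 0.07, 0.26, 0.20, 0.00)
pa1 <- c(0.8, 0.7, 0.6, 0, 0, 0, 0, 0, 0)

rc1_pa1 <- congruence(rc1, pa1)
print(round(rc1_pa1, 2))
```

Because the inputs are rounded to two decimals, the result should only be expected to match the tabled RC1 with PA1 value of 0.93 to about two decimal places.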

By default, principal uses the varimax rotation criterion, while fa uses an oblique (oblimin) transformation, which is why factor correlations are reported for the pa and ml solutions above. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p

Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957), with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.
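The relation between a higher order structure and its bifactor (Schmid-Leiman) form can be sketched in a few lines of base R. This is an illustration of the transformation rather than the omega code itself, and the first order loadings and the loadings of the three group factors on g are assumed values, chosen because they are consistent with the nine variable example used in this section.

```r
# Assumed first order loadings: each variable loads on exactly one group factor.
lambda <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
group  <- rep(1:3, each = 3)      # V1-V3 on F1, V4-V6 on F2, V7-V9 on F3
# Assumed loadings of the three group factors on the general factor g.
gamma  <- c(0.9, 0.8, 0.7)

# Schmid-Leiman: the g loading is the product of the two levels of loadings;
# the residualized group loading keeps what g does not explain.
g_loading     <- lambda * gamma[group]
group_loading <- lambda * sqrt(1 - gamma[group]^2)

sl <- cbind(g = g_loading, group = group_loading)
rownames(sl) <- paste0("V", 1:9)
print(round(sl, 2))
```

With these assumed inputs, V1 has a general loading of 0.72 and a residualized group loading of about 0.35; the squared general and group loadings sum to the squared first order loading, so the communalities are unchanged by the transformation.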

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> ombi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> ombi <- omega(bifact)

[Figure: path diagram titled "Omega", showing g and the group factors F1, F2, and F3 for V1 to V9]

Figure 9: An exploratory bifactor solution to the nine variable problem

> omhi <- omega(bifact,sl=FALSE)

[Figure: path diagram titled "Omega", showing g above the factors F1, F2, and F3 for V1 to V9]

Figure 10: An exploratory hierarchical solution to the nine variable problem

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g  F1*  F2*  F3*   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g  F1*  F2*  F3*
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0
The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23
The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g   F1*   F2*   F3*
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
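To see where the two omega coefficients come from, they can be recomputed from the Schmid-Leiman loadings. The sketch below is a hand calculation in base R under the usual definitions (general factor variance, or all common variance, divided by the variance of the total score); it uses the rounded loadings printed for the nine variable example and is not the code that omega itself runs.

```r
# Rounded Schmid-Leiman loadings and uniquenesses copied from the printed output.
g  <- c(0.72, 0.63, 0.54, 0.56, 0.48, 0.40, 0.49, 0.42, 0.28)
gr <- c(0.35, 0.31, 0.26, 0.42, 0.36, 0.30, 0.50, 0.43, 0.29)
u2 <- c(0.36, 0.51, 0.64, 0.51, 0.64, 0.75, 0.51, 0.64, 0.84)
group <- rep(1:3, each = 3)   # V1-V3 on F1*, V4-V6 on F2*, V7-V9 on F3*

# 9 x 4 orthogonal loading matrix: g plus the three group factors.
loadings_sl <- cbind(g, sapply(1:3, function(k) ifelse(group == k, gr, 0)))

# Model implied correlation matrix and the variance of the unit weighted total.
R_implied <- loadings_sl %*% t(loadings_sl)
diag(R_implied) <- 1
v_total <- sum(R_implied)

omega_h <- sum(g)^2 / v_total             # variance due to g alone
omega_t <- (v_total - sum(u2)) / v_total  # variance due to all common factors

print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

Up to rounding of the inputs, these reproduce the reported Omega Hierarchical of 0.7 and Omega Total of 0.82.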

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

[Figure: cluster diagram titled "Hierarchical cluster analysis of bifact data", joining V1 to V9 into clusters C1 to C8, with α reported for each cluster]

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found, and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
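A population matrix of this kind is the product of the loadings and the factor intercorrelations. As a sketch in base R, the values below for the loadings and for the factor correlations are assumptions chosen to reproduce the matrix and reliabilities printed above; the object names (`loadings_xy`, `Phi`) are illustrative and this is not the sim.structural code.

```r
vars <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")

# Assumed loadings: two predictor factors (X1, X2) and one criterion factor (Y1).
loadings_xy <- matrix(0, nrow = 8, ncol = 3,
                      dimnames = list(vars, c("X1", "X2", "Y1")))
loadings_xy[1:5, "X1"] <- c(0.9, 0.8, 0.6, 0, 0)
loadings_xy[1:5, "X2"] <- c(0, 0, 0.6, 0.8, -0.7)
loadings_xy[6:8, "Y1"] <- c(0.6, 0.5, 0.4)

# Assumed factor correlations: X1 and X2 independent, each correlating 0.7 with Y1.
Phi <- matrix(c(1,   0,   0.7,
                0,   1,   0.7,
                0.7, 0.7, 1), nrow = 3, byrow = TRUE)

# Common part of the correlation matrix; its diagonal holds the communalities,
# which play the role of the population reliabilities.
common <- loadings_xy %*% Phi %*% t(loadings_xy)
reliability <- diag(common)

R <- common
diag(R) <- 1
print(round(R, 2))
print(round(reliability, 2))
```

For example, the V with gpa correlation is 0.9 × 0.7 × 0.6 = 0.378, which prints as 0.38.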

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

[Figure: path diagram titled "Exploratory Structural Model", showing V, Q, A, nach, and Anx loading on X1 and X2, and gpa, Pre, and MA loading on Y1]

Figure 12: An example of an Exploratory Structural Equation Model

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function (which requires Rgraphviz), or the structure.diagram or structure.sem functions (which do not), may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau #show it

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau) #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

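The anova comparison of the congeneric versus tau equivalent model shows that, for this sample of 100 cases, the change in χ2 (3.65) is not significant given the change in degrees of freedom (3), with p = 0.30. That is, although the tau equivalent hypothesis is false in the population, these data do not allow it to be rejected in favor of the congeneric model.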
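The likelihood ratio comparison can be reproduced from the two model summaries alone. The following base R sketch counts the degrees of freedom of each model and then refers the difference in χ2 to a chi square distribution; the helper name `n_moments` is illustrative, and the chi square values are copied from the output above.

```r
# Number of distinct variances and covariances among p observed variables.
n_moments <- function(p) p * (p + 1) / 2

# Tau equivalent: one common loading plus four error variances.
df_tau  <- n_moments(4) - 5
# Congeneric: four separate loadings plus four error variances.
df_cong <- n_moments(4) - 8

chisq_tau  <- 6.593496
chisq_cong <- 2.941678

lr_chisq <- chisq_tau - chisq_cong
lr_df    <- df_tau - df_cong
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

print(c(df_tau = df_tau, df_cong = df_cong, lr_df = lr_df))
print(round(c(lr_chisq = lr_chisq, p_value = p_value), 4))
```

The same counting rule gives 45 − 18 = 27 degrees of freedom for a one factor model of nine variables.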

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
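The pattern in these residuals, positive within each block of three variables and slightly negative between blocks, follows directly from subtracting the one factor reproduction from the hierarchical population matrix. A sketch in base R, assuming the population loadings and factor correlations recovered by the three factor solutions earlier (these are assumptions about how bifact was built) and using the estimated loadings a through i printed above:

```r
# Assumed population structure of bifact: simple structure loadings and
# correlated factors (values taken from the three factor fa solutions).
lambda <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
group  <- rep(1:3, each = 3)
Phi <- matrix(c(1,    0.72, 0.63,
                0.72, 1,    0.56,
                0.63, 0.56, 1), nrow = 3, byrow = TRUE)

# Population correlations: product of loadings times the factor correlation.
R_pop <- outer(lambda, lambda) * Phi[group, group]
diag(R_pop) <- 1

# One factor loadings estimated by sem (parameters a to i above).
est <- c(0.7507129, 0.6726412, 0.5811209, 0.5737425, 0.5021915,
         0.4239908, 0.5067957, 0.4450171, 0.3052415)

# Residual = observed correlation minus the one factor reproduction.
res <- R_pop - outer(est, est)
diag(res) <- 0
dimnames(res) <- list(paste0("V", 1:9), paste0("V", 1:9))
print(round(res, 2))
```

For instance, the V7 with V8 residual is 0.42 − 0.507 × 0.445, or about 0.19, which is the largest entry in the table.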

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3) #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3) #create the sem model
> mod.f3 #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

[Figure: path diagram titled "Structural model", showing V1 to V3 loading on MR1, V4 to V6 on MR3, and V7 to V9 on MR2, with correlated factors]

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)   #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

> stdCoef(sem.bi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671     Sentences <--- g
2  Vocabulary        0.7909246     Vocabulary <--- g
3  SentCompletion    0.7536211     SentCompletion <--- g
4  FirstLetters      0.6083814     FirstLetters <--- g
5  FourLetterWords   0.5973349     FourLetterWords <--- g
6  Suffixes          0.5717900     Suffixes <--- g
7  LetterSeries      0.5668950     LetterSeries <--- g
8  Pedigrees         0.6623317     Pedigrees <--- g
9  LetterGroup       0.5299523     LetterGroup <--- g
10 F1Sentences       0.4878697     Sentences <--- F1
11 F1Vocabulary      0.4523233     Vocabulary <--- F1
12 F1SentCompletion  0.4044507     SentCompletion <--- F1
13 F2FirstLetters    0.6140531     FirstLetters <--- F2
14 F2FourLetterWords 0.5058063     FourLetterWords <--- F2
15 F2Suffixes        0.3943206     Suffixes <--- F2
16 F3LetterSeries    0.7272957     LetterSeries <--- F3
17 F3Pedigrees       0.2468418     Pedigrees <--- F3
18 F3LetterGroup     0.4091494     LetterGroup <--- F3
19 e1                0.1723633     Sentences <--> Sentences
20 e2                0.1698418     Vocabulary <--> Vocabulary
21 e3                0.2684748     SentCompletion <--> SentCompletion
22 e4                0.2528108     FirstLetters <--> FirstLetters
23 e5                0.3873510     FourLetterWords <--> FourLetterWords
24 e6                0.5175675     Suffixes <--> Suffixes
25 e7                0.1496710     LetterSeries <--> LetterSeries
26 e8                0.5003859     Pedigrees <--> Pedigrees
27 e9                0.5517473     LetterGroup <--> LetterGroup
28                   1.0000000     F1 <--> F1
29                   1.0000000     F2 <--> F2
30                   1.0000000     F3 <--> F3
31                   1.0000000     g <--> g
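One way to read the standardized bifactor estimates: because g and the group factors are orthogonal in this model, a variable's R-square should equal the sum of its squared standardized loadings, and also one minus its standardized error variance. The following is a minimal base R check using the Sentences values copied from the output above; the variable names are illustrative and are not part of psych or sem.

```r
# Standardized estimates for Sentences, copied from the stdCoef output above
g_loading  <- 0.7678671   # Sentences <--- g
f1_loading <- 0.4878697   # Sentences <--- F1
e1         <- 0.1723633   # Sentences <--> Sentences (error variance)

# With orthogonal factors the communality is the sum of squared loadings
r2_from_loadings <- g_loading^2 + f1_loading^2

# Equivalently, it is what is left after removing the error variance
r2_from_error <- 1 - e1

round(c(loadings = r2_from_loadings, error = r2_from_error), 4)
```

Both quantities should agree with the R-square of 0.8276 that summary reports for Sentences.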

Compare this solution to the one reported below and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

> stdCoef(sem.hi,digits=2)

                     Std. Estimate
1  gF1               0.8220754     F1 <--- g
2  gF2               0.7817998     F2 <--- g
3  gF3               0.8150140     F3 <--- g
4  F1Sentences       0.9047111     Sentences <--- F1
5  F1Vocabulary      0.9138214     Vocabulary <--- F1
6  F1SentCompletion  0.8560764     SentCompletion <--- F1
7  F2FirstLetters    0.8357617     FirstLetters <--- F2
8  F2FourLetterWords 0.7971819     FourLetterWords <--- F2
9  F2Suffixes        0.7025560     Suffixes <--- F2
10 F3LetterSeries    0.7808129     LetterSeries <--- F3
11 F3Pedigrees       0.7201599     Pedigrees <--- F3
12 F3LetterGroup     0.7034902     LetterGroup <--- F3
13 e1                0.1814979     Sentences <--> Sentences
14 e2                0.1649304     Vocabulary <--> Vocabulary
15 e3                0.2671331     SentCompletion <--> SentCompletion
16 e4                0.3015024     FirstLetters <--> FirstLetters
17 e5                0.3645010     FourLetterWords <--> FourLetterWords
18 e6                0.5064151     Suffixes <--> Suffixes
19 e7                0.3903313     LetterSeries <--> LetterSeries
20 e8                0.4813697     Pedigrees <--> Pedigrees
21 e9                0.5051016     LetterGroup <--> LetterGroup
22                   0.3241920     F1 <--> F1
23                   0.3887891     F2 <--> F2
24                   0.3357521     F3 <--> F3
25                   1.0000000     g <--> g
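In the hierarchical model g reaches the observed variables only through the group factors, so its implied loading on a variable is the product of the two standardized paths. A minimal base R sketch, with values copied from the stdCoef output above (the variable names are illustrative only):

```r
# Standardized paths copied from the stdCoef output above
g_to_f1         <- 0.8220754   # F1 <--- g
f1_to_sentences <- 0.9047111   # Sentences <--- F1

# Implied (indirect) loading of Sentences on g: product of the two paths
implied_g_loading <- g_to_f1 * f1_to_sentences

# Squared standardized paths give the variance explained at each level
f1_r2        <- g_to_f1^2          # compare with R-square for F1
sentences_r2 <- f1_to_sentences^2  # compare with R-square for Sentences

# Residual variance of F1 that is not due to g
f1_residual <- 1 - f1_r2

round(c(implied_g = implied_g_loading, F1_R2 = f1_r2,
        Sentences_R2 = sentences_r2, F1_residual = f1_residual), 4)
```

The squared paths should reproduce the R-square values reported by summary (0.6758 for F1 and 0.8185 for Sentences), and the F1 residual should match the F1 <--> F1 entry in the standardized solution.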

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
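The likelihood ratio comparison of the two nested models can be reproduced by hand from the two summary outputs: the difference in chi-square is referred to a chi-square distribution with degrees of freedom equal to the difference in model degrees of freedom. A minimal base R sketch, with the fit statistics copied from the output above and illustrative variable names:

```r
# Fit statistics copied from the two summary() outputs above
chisq_hi <- 38.1963; df_hi <- 24   # hierarchical model
chisq_bi <- 24.2163; df_bi <- 18   # bifactor model (6 more free parameters)

# Likelihood ratio (chi-square difference) test for nested models
lr_chisq <- chisq_hi - chisq_bi
lr_df    <- df_hi - df_bi
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

c(LR_chisq = lr_chisq, df = lr_df, p = round(p_value, 5))
```

This should agree with the anova table: a difference of 13.98 on 6 degrees of freedom, with a p value near 0.03.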

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem combines both steps: it runs an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.


> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


V6 0.0 0.5 0.00 0.25 0.75   1
V7 0.0 0.0 0.69 0.48 0.52   1
V8 0.0 0.0 0.61 0.36 0.64   1
V9 0.0 0.0 0.40 0.16 0.84   1

                       PA1  PA3  PA2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     PA1  PA3  PA2
PA1 1.00 0.72 0.63
PA3 0.72 1.00 0.56
PA2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                  PA1  PA3  PA2
Correlation of (regression) scores with factors   0.9 0.85 0.83
Multiple R square of scores with factors          0.8 0.72 0.69
Minimum correlation of possible factor scores     0.6 0.45 0.38

> ml3

Factor Analysis using method =  ml
Call: fa(r = bifact, nfactors = 3, fm = "ml")
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2 com
V1 0.8 0.0 0.0 0.64 0.36   1
V2 0.7 0.0 0.0 0.49 0.51   1
V3 0.6 0.0 0.0 0.36 0.64   1
V4 0.0 0.7 0.0 0.49 0.51   1
V5 0.0 0.6 0.0 0.36 0.64   1
V6 0.0 0.5 0.0 0.25 0.75   1
V7 0.0 0.0 0.7 0.49 0.51   1
V8 0.0 0.0 0.6 0.36 0.64   1
V9 0.0 0.0 0.4 0.16 0.84   1

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

 With factor correlations of
     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                   ML1  ML3  ML2
Correlation of (regression) scores with factors   0.90 0.85 0.83
Multiple R square of scores with factors          0.80 0.72 0.69
Minimum correlation of possible factor scores     0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))

     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
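Congruence coefficients of this kind are usually defined as the cosine of the angle between two columns of factor loadings. The following base R sketch computes that cosine directly; it is a hand-rolled illustration under that assumption, not the psych implementation, and the function name congruence is hypothetical. The loadings are the simple structure pattern printed for ml3 above.

```r
# Cosine (Tucker-style) congruence between the columns of two loading matrices.
# Illustrative helper; not the source of psych::factor.congruence.
congruence <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  cross <- crossprod(x, y)                        # sums of cross products
  cross / sqrt(outer(colSums(x^2), colSums(y^2))) # scale by vector lengths
}

# Pattern loadings as printed for the ml3 solution above
ml <- cbind(ML1 = c(0.8, 0.7, 0.6, 0, 0, 0, 0, 0, 0),
            ML3 = c(0, 0, 0, 0.7, 0.6, 0.5, 0, 0, 0),
            ML2 = c(0, 0, 0, 0, 0, 0, 0.7, 0.6, 0.4))

round(congruence(ml, ml), 2)
```

Because each variable loads on only one factor in this pattern, the columns are orthogonal and the result is an identity matrix, which matches the ML and PA blocks of the table above.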

By default all three of these procedures use the varimax rotation criterion. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p

Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957), with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.

> om.bi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> om.bi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem.

> om.hi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0
The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23
The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
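The omega coefficients can be recovered, at least approximately, from the printed Schmid-Leiman loadings. The sketch below uses the usual model-based definitions (general factor variance over total variance for omega hierarchical, all common variance over total variance for omega total) and assumes that the loadings suppressed in the printout (below 0.2) are zero. It is an illustration in base R with hypothetical variable names, not the code omega runs internally.

```r
# Schmid-Leiman loadings and uniquenesses copied from the output above.
# Assumption: loadings not printed (below 0.2) are treated as exactly zero.
g   <- c(0.72, 0.63, 0.54, 0.56, 0.48, 0.40, 0.49, 0.42, 0.28)
grp <- cbind(F1 = c(0.35, 0.31, 0.26, 0, 0, 0, 0, 0, 0),
             F2 = c(0, 0, 0, 0.42, 0.36, 0.30, 0, 0, 0),
             F3 = c(0, 0, 0, 0, 0, 0, 0.50, 0.43, 0.29))
u2  <- c(0.36, 0.51, 0.64, 0.51, 0.64, 0.75, 0.51, 0.64, 0.84)

# Variance of the unit-weighted total score implied by the model
general_var <- sum(g)^2               # due to the general factor
group_var   <- sum(colSums(grp)^2)    # due to the group factors
total_var   <- general_var + group_var + sum(u2)

omega_h <- general_var / total_var                 # omega hierarchical
omega_t <- (general_var + group_var) / total_var   # omega total

# Explained common variance: share of common variance due to g
ecv <- sum(g^2) / (sum(g^2) + sum(grp^2))

round(c(omega_h = omega_h, omega_t = omega_t, ECV = ecv), 2)
```

The results should land close to the reported Omega Hierarchical (0.7), Omega Total (0.82), and Explained Common Variance (0.67); small differences would reflect rounding of the printed loadings.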

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

 $model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

 $reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
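A population correlation matrix of this kind follows from the factor model R = F Phi F', with the diagonal set to 1. The base R sketch below rebuilds the matrix under that formula. The loadings and factor correlations used here are not taken from the text: they are assumptions inferred from the printed correlations and reliabilities (for example, the reliability of V, 0.81, suggests a loading of 0.9), and the object names are illustrative.

```r
# Assumed loadings, inferred from the printed reliabilities and correlations
vars <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")
Fmat <- matrix(0, nrow = 8, ncol = 3,
               dimnames = list(vars, c("ability", "motivation", "outcome")))
Fmat[c("V", "Q", "A"), "ability"]         <- c(0.9, 0.8, 0.6)
Fmat[c("A", "nach", "Anx"), "motivation"] <- c(0.6, 0.8, -0.7)
Fmat[c("gpa", "Pre", "MA"), "outcome"]    <- c(0.6, 0.5, 0.4)

# Assumed factor correlations: predictors uncorrelated, each 0.7 with outcome
Phi <- matrix(c(1,   0,   0.7,
                0,   1,   0.7,
                0.7, 0.7, 1), nrow = 3, byrow = TRUE)

# Model-implied matrix; its diagonal holds the communalities (reliabilities)
implied     <- Fmat %*% Phi %*% t(Fmat)
reliability <- diag(implied)

R <- implied
diag(R) <- 1   # a correlation matrix has a unit diagonal

round(R, 2)
round(reliability, 2)
```

With these assumed values the rounded matrix and reliabilities should reproduce the $model and $reliability output shown above.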

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the X set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the Y set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structure Equation Model.

51 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious several of the psych packagefunctions have been designed to provide the appropriate commands That is the functionsstructurelist philist structurediagram structuresem and omegagraph maybe used as a front end to sem Usually with no modification but sometimes with justslight modification the model output from the structurediagram structuresem andomegagraph functions is meant to provide the appropriate commands for sem

52 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings Thetau equivalent model model is one with equal factor loadings Tests for these may be doneby creating the appropriate structures The structuregraph function which requiresRgraphviz or structurediagram or the structuresem functions which do not may beused

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau   #show it

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong   #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)   #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

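The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65) is not significant given the change of 3 degrees of freedom (p = 0.30). With only 100 observations, the tau equivalent model, although known to be false, cannot be rejected.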
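The likelihood ratio test reported by anova can be checked by hand. This is a minimal sketch in base R that uses only the χ2 values and degrees of freedom printed in the two summaries; it is an illustration of the arithmetic, not the code that the sem package runs internally.

```r
# Chi-square difference (likelihood ratio) test for two nested models,
# using the values printed in the two model summaries.
chisq_tau  <- 6.593496; df_tau  <- 5   # tau equivalent model
chisq_cong <- 2.941678; df_cong <- 2   # congeneric model

lr_chisq <- chisq_tau - chisq_cong
lr_df    <- df_tau - df_cong
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

print(round(c(LR = lr_chisq, df = lr_df, p = p_value), 4))
```

The more constrained model always has the larger χ2; the question is whether the increase is larger than expected for the number of constraints added.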

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one   #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
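The pattern in these residuals follows directly from the algebra of the factor model, where the implied correlations are F Φ F'. The sketch below, in base R, uses the loadings estimated for the three factor model and compares an identity Φ with an oblique Φ. The factor intercorrelations 0.72, 0.63 and 0.56 are taken from an oblique solution for these data reported elsewhere in this document; pairing them with MR1, MR3 and MR2 in this order is an assumption made for the illustration.

```r
# Loadings from the three factor solution (three variables per factor).
lambda <- matrix(0, nrow = 9, ncol = 3,
                 dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
lambda[1:3, 1] <- c(0.8, 0.7, 0.6)
lambda[4:6, 2] <- c(0.7, 0.6, 0.5)
lambda[7:9, 3] <- c(0.7, 0.6, 0.4)

# Factor intercorrelations: identity (orthogonal) versus oblique.
# The oblique values are an assumption about which factors they belong to.
phi_orth <- diag(3)
phi_obl  <- matrix(c(1.00, 0.72, 0.63,
                     0.72, 1.00, 0.56,
                     0.63, 0.56, 1.00), nrow = 3)

implied_orth <- lambda %*% phi_orth %*% t(lambda)
implied_obl  <- lambda %*% phi_obl  %*% t(lambda)

# What the orthogonal model cannot reproduce: the between-group blocks.
gap <- implied_obl - implied_orth
print(round(gap[c("V1", "V4", "V7"), c("V1", "V4", "V7")], 2))
```

Within each group the two implied matrices agree, so the gap is zero there; between groups the orthogonal model implies correlations of zero, and the whole between-group correlation is left in the residuals.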

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)   #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)   #create the sem model
> mod.f3   #show it

      Path       Parameter Value
 [1,] "MR1->V1"  "F1V1"    NA
 [2,] "MR1->V2"  "F1V2"    NA
 [3,] "MR1->V3"  "F1V3"    NA
 [4,] "MR3->V4"  "F2V4"    NA
 [5,] "MR3->V5"  "F2V5"    NA
 [6,] "MR3->V6"  "F2V6"    NA
 [7,] "MR2->V7"  "F3V7"    NA
 [8,] "MR2->V8"  "F3V8"    NA
 [9,] "MR2->V9"  "F3V9"    NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"  NA
[20,] "MR2<->MR1" "rF3F1"  NA
[21,] "MR2<->MR3" "rF3F2"  NA
[22,] "MR1<->MR1" NA       "1"
[23,] "MR3<->MR3" NA       "1"
[24,] "MR2<->MR2" NA       "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is “lazy loaded”). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)   #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07
e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)

                      Std. Estimate
1  Sentences          0.7678671     Sentences <--- g
2  Vocabulary         0.7909246     Vocabulary <--- g
3  SentCompletion     0.7536211     SentCompletion <--- g
4  FirstLetters       0.6083814     FirstLetters <--- g
5  FourLetterWords    0.5973349     FourLetterWords <--- g
6  Suffixes           0.5717900     Suffixes <--- g
7  LetterSeries       0.5668950     LetterSeries <--- g
8  Pedigrees          0.6623317     Pedigrees <--- g
9  LetterGroup        0.5299523     LetterGroup <--- g
10 F1Sentences        0.4878697     Sentences <--- F1
11 F1Vocabulary       0.4523233     Vocabulary <--- F1
12 F1SentCompletion   0.4044507     SentCompletion <--- F1
13 F2FirstLetters     0.6140531     FirstLetters <--- F2
14 F2FourLetterWords  0.5058063     FourLetterWords <--- F2
15 F2Suffixes         0.3943206     Suffixes <--- F2
16 F3LetterSeries     0.7272957     LetterSeries <--- F3
17 F3Pedigrees        0.2468418     Pedigrees <--- F3
18 F3LetterGroup      0.4091494     LetterGroup <--- F3
19 e1                 0.1723633     Sentences <--> Sentences
20 e2                 0.1698418     Vocabulary <--> Vocabulary
21 e3                 0.2684748     SentCompletion <--> SentCompletion
22 e4                 0.2528108     FirstLetters <--> FirstLetters
23 e5                 0.3873510     FourLetterWords <--> FourLetterWords
24 e6                 0.5175675     Suffixes <--> Suffixes
25 e7                 0.1496710     LetterSeries <--> LetterSeries
26 e8                 0.5003859     Pedigrees <--> Pedigrees
27 e9                 0.5517473     LetterGroup <--> LetterGroup
28                    1.0000000     F1 <--> F1
29                    1.0000000     F2 <--> F2
30                    1.0000000     F3 <--> F3
31                    1.0000000     g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi,digits=2)

                      Std. Estimate
1  gF1                0.8220754     F1 <--- g
2  gF2                0.7817998     F2 <--- g
3  gF3                0.8150140     F3 <--- g
4  F1Sentences        0.9047111     Sentences <--- F1
5  F1Vocabulary       0.9138214     Vocabulary <--- F1
6  F1SentCompletion   0.8560764     SentCompletion <--- F1
7  F2FirstLetters     0.8357617     FirstLetters <--- F2
8  F2FourLetterWords  0.7971819     FourLetterWords <--- F2
9  F2Suffixes         0.7025560     Suffixes <--- F2
10 F3LetterSeries     0.7808129     LetterSeries <--- F3
11 F3Pedigrees        0.7201599     Pedigrees <--- F3
12 F3LetterGroup      0.7034902     LetterGroup <--- F3
13 e1                 0.1814979     Sentences <--> Sentences
14 e2                 0.1649304     Vocabulary <--> Vocabulary
15 e3                 0.2671331     SentCompletion <--> SentCompletion
16 e4                 0.3015024     FirstLetters <--> FirstLetters
17 e5                 0.3645010     FourLetterWords <--> FourLetterWords
18 e6                 0.5064151     Suffixes <--> Suffixes
19 e7                 0.3903313     LetterSeries <--> LetterSeries
20 e8                 0.4813697     Pedigrees <--> Pedigrees
21 e9                 0.5051016     LetterGroup <--> LetterGroup
22                    0.3241920     F1 <--> F1
23                    0.3887891     F2 <--> F2
24                    0.3357521     F3 <--> F3
25                    1.0000000     g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
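The BIC values printed in the two summaries can be reproduced from the χ2, the degrees of freedom and the sample size. The sketch below is in base R; the formula χ2 − df·log(N) is an assumption inferred from the printed values rather than taken from the sem documentation, and bic_chisq is a hypothetical helper.

```r
# BIC on the chi-square scale: chisq - df * log(N).
# Assumption: this formula is inferred from the printed output,
# not taken from the sem documentation.
bic_chisq <- function(chisq, df, n) chisq - df * log(n)

n <- 213
bic_hi <- bic_chisq(38.1963, df = 24, n = n)   # hierarchical model
bic_bi <- bic_chisq(24.2163, df = 18, n = n)   # bifactor model

print(round(c(hierarchical = bic_hi, bifactor = bic_bi), 2))
```

On this scale lower values are preferred. The likelihood ratio test and BIC weigh the six extra parameters differently, so the two criteria need not point to the same model.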

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
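The confirmatory values of ωh and ωt can be recovered, to rounding, from the printed CFA loadings. This is a minimal sketch in base R that assumes the total test variance equals the variance implied by the model (squared sums of loadings plus the uniquenesses) and uses the two-decimal loadings from the output, so it is an approximation and not the computation done inside omegaSem.

```r
# Rounded CFA loadings on the general factor g, in the printed order.
g  <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
# Group factor loadings, one element per group factor.
grp <- list(F1 = c(0.49, 0.45, 0.40),
            F2 = c(0.61, 0.50, 0.39),
            F3 = c(0.73, 0.25, 0.41))
# Uniquenesses (u2 column).
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)

var_general <- sum(g)^2
var_group   <- sum(sapply(grp, function(l) sum(l)^2))
var_total   <- var_general + var_group + sum(u2)   # model-implied total variance

omega_h <- var_general / var_total                 # general factor saturation
omega_t <- (var_general + var_group) / var_total   # all common variance
print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

Because ωh is the share of total variance carried by the squared sum of the g loadings, any solution that routes more of the common variance through g will raise it.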

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


     ML1  ML3  ML2
ML1 1.00 0.72 0.63
ML3 0.72 1.00 0.56
ML2 0.63 0.56 1.00

Mean item complexity =  1
Test of the hypothesis that 3 factors are sufficient.

The degrees of freedom for the null model are  36  and the objective function was  1.88
The degrees of freedom for the model are 12  and the objective function was  0

The root mean square of the residuals (RMSR) is  0
The df corrected root mean square of the residuals is  0

Fit based upon off diagonal values = 1
Measures of factor score adequacy
                                                 ML1  ML3  ML2
Correlation of (regression) scores with factors 0.90 0.85 0.83
Multiple R square of scores with factors        0.80 0.72 0.69
Minimum correlation of possible factor scores   0.61 0.45 0.38

> factor.congruence(list(pc3,pa3,ml3))

     RC1  RC3  RC2  PA1  PA3  PA2  ML1  ML3  ML2
RC1 1.00 0.49 0.42 0.93 0.24 0.21 0.93 0.24 0.21
RC3 0.49 1.00 0.35 0.27 0.94 0.15 0.27 0.93 0.15
RC2 0.42 0.35 1.00 0.22 0.16 0.94 0.22 0.16 0.94
PA1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
PA3 0.24 0.94 0.16 0.00 1.00 0.00 0.00 1.00 0.00
PA2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00
ML1 0.93 0.27 0.22 1.00 0.00 0.00 1.00 0.00 0.00
ML3 0.24 0.93 0.16 0.00 1.00 0.00 0.00 1.00 0.00
ML2 0.21 0.15 0.94 0.00 0.00 1.00 0.00 0.00 1.00

By default, all three of these procedures use the varimax rotation criterion. Perhaps it is useful to apply an oblique transformation such as Promax or oblimin to the results. The Promax function in psych differs slightly from the standard promax in that it reports the factor intercorrelations.

> ml3p <- Promax(ml3)
> ml3p

Call: NULL
Standardized loadings (pattern matrix) based upon correlation matrix
   ML1 ML3 ML2   h2   u2
V1 0.8 0.0 0.0 0.64 0.36
V2 0.7 0.0 0.0 0.49 0.51
V3 0.6 0.0 0.0 0.36 0.64
V4 0.0 0.7 0.0 0.49 0.51
V5 0.0 0.6 0.0 0.36 0.64
V6 0.0 0.5 0.0 0.25 0.75
V7 0.0 0.0 0.7 0.49 0.51
V8 0.0 0.0 0.6 0.36 0.64
V9 0.0 0.0 0.4 0.16 0.84

                       ML1  ML3  ML2
SS loadings           1.49 1.10 1.01
Proportion Var        0.17 0.12 0.11
Cumulative Var        0.17 0.29 0.40
Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957), with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measurement of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.
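The relation between the hierarchical and the bifactor representation can be illustrated with a few lines of base R. This is only a sketch of the Schmid-Leiman arithmetic, not the code used inside omega: it assumes the first-order loadings (0.8, 0.7, ..., 0.4) and the loadings of the group factors on g (0.9, 0.8, 0.7) shown in the hierarchical figure, and the variable names are chosen here for illustration.

```r
# First-order loadings of V1..V9 on their own group factor
# (assumed values, read off the hierarchical diagram).
first_order <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
# Loadings of the three group factors on the general factor g.
second_order <- c(F1 = 0.9, F2 = 0.8, F3 = 0.7)
# Index of the group factor each variable belongs to.
group <- rep(1:3, each = 3)

# General factor loading: first-order loading times second-order loading.
g <- first_order * second_order[group]
# Residualized group loading: the part of the first-order loading
# that is independent of g.
residual_group <- first_order * sqrt(1 - second_order[group]^2)

sl <- cbind(g = g, group = residual_group)
rownames(sl) <- paste0("V", 1:9)
print(round(sl, 2))
```

Under these assumptions the rounded values agree with the g and group factor columns that omega reports for the bifact data.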

> om.bi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85

> om.bi <- omega(bifact)

[Figure omitted: path diagram titled "Omega" in which V1 to V9 load on the general factor g and on the group factors F1, F2 and F3.]

Figure 9: An exploratory bifactor solution to the nine variable problem

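> om.hi <- omega(bifact,sl=FALSE)

[Figure omitted: path diagram titled "Omega" in which V1 to V9 load on the group factors F1, F2 and F3 (0.8, 0.7, 0.6; 0.7, 0.6, 0.5; 0.7, 0.6, 0.4), which in turn load on g (0.9, 0.8, 0.7).]

Figure 10: An exploratory hierarchical solution to the nine variable problem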

Omega Total            0.82

Schmid Leiman Factor loadings greater than 0.2
      g  F1*  F2*  F3*   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g  F1*  F2*  F3*
2.41 0.28 0.40 0.52

general/max 4.67 max/min = 1.82
mean percent general = 0.65 with sd = 0.14 and cv of 0.21
Explained Common Variance of the general factor = 0.67

The degrees of freedom are 12 and the fit is 0
The root mean square of the residuals is 0
The df corrected root mean square of the residuals is 0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 0.23
The root mean square of the residuals is 0.07
The df corrected root mean square of the residuals is 0.08

Measures of factor score adequacy
                                                 g   F1*   F2*   F3*
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
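The reliability estimates in this output can be approximated by hand from the Schmid-Leiman loadings. The following is a rough sketch in base R, not the computation omega itself performs: it uses the two-decimal loadings printed above, so it only reproduces the reported values to rounding, and it assumes (as the zero residuals suggest) that the model reproduces the correlation matrix, so that the total test variance is the sum of the general, group and unique parts. All variable names are illustrative.

```r
# Schmid-Leiman loadings and uniquenesses as printed (two decimals).
g  <- c(0.72, 0.63, 0.54, 0.56, 0.48, 0.40, 0.49, 0.42, 0.28)
f  <- c(0.35, 0.31, 0.26, 0.42, 0.36, 0.30, 0.50, 0.43, 0.29)
u2 <- c(0.36, 0.51, 0.64, 0.51, 0.64, 0.75, 0.51, 0.64, 0.84)
group <- rep(1:3, each = 3)

# Variance of the total score due to the general factor.
general_var <- sum(g)^2
# Variance due to the three group factors (sum within group, then square).
group_var <- sum(tapply(f, group, sum)^2)
# Total variance of the unit-weighted sum of the nine variables.
total_var <- general_var + group_var + sum(u2)

omega_h <- general_var / total_var                 # omega hierarchical
omega_t <- (general_var + group_var) / total_var   # omega total
ecv <- sum(g^2) / (sum(g^2) + sum(f^2))            # explained common variance

print(round(c(omega_h = omega_h, omega_t = omega_t, ecv = ecv), 2))
```

With these inputs the three values round to 0.70, 0.82 and 0.67, matching Omega Hierarchical, Omega Total and the Explained Common Variance reported above.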

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

[Figure omitted: ICLUST dendrogram titled "Hierarchical cluster analysis of bifact data", joining V1 to V9 into clusters C1 to C8, each labeled with its α value.]

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method = minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are 56 and the empirical chi square function was 6930.29
The degrees of freedom for the model are 7 and the empirical chi square function was 21.83
 with prob < 0.0027

The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.04
 with the empirical chi square 21.83 with prob < 0.0027
The total number of observations was 1000 with fitted Chi Square = 2175.06 with prob < 0

Empirical BIC = -26.53
ESABIC = -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

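[Figure omitted: path diagram titled "Exploratory Structural Model" in which V, Q, A, nach and Anx load on X1 and X2, gpa, Pre and MA load on Y1, and X1 and X2 each relate to Y1 with a coefficient of 0.7.]

Figure 12: An example of an Exploratory Structural Equation Model.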

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. The structure.graph function, which requires Rgraphviz, or the structure.diagram or structure.sem functions, which do not, may be used.
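In equation form, the two models just described can be written as follows. This is a sketch in standardized notation that is not used elsewhere in the text: the symbols $x_i$, $\theta$, $\lambda_i$, $\epsilon_i$ and $a$ are introduced here for illustration, with the latent variable $\theta$ scaled to unit variance.

```latex
\begin{align*}
\text{congeneric:}      \quad & x_i = \lambda_i \theta + \epsilon_i,
                                & r_{ij} &= \lambda_i \lambda_j \quad (i \neq j) \\
\text{tau equivalent:}  \quad & x_i = a\,\theta + \epsilon_i,
                                & r_{ij} &= a^2 \quad (i \neq j)
\end{align*}
```

The tau equivalent model is thus the congeneric model with the added constraint $\lambda_1 = \lambda_2 = \dots = a$, which is why giving all four loadings the same parameter name in structure.sem is enough to specify it.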

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau   #show it

      Path       Parameter Value
 [1,] "X1->V1"   "a"       NA
 [2,] "X1->V2"   "a"       NA
 [3,] "X1->V3"   "a"       NA
 [4,] "X1->V4"   "a"       NA
 [5,] "V1<->V1"  "x1e"     NA
 [6,] "V2<->V2"  "x2e"     NA
 [7,] "V3<->V3"  "x3e"     NA
 [8,] "V4<->V4"  "x4e"     NA
 [9,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11
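The fit indices in this summary can be checked by hand. The sketch below uses only base R and assumes, because these formulas reproduce the printed values, that AIC is computed as the model χ² plus twice the number of free parameters and BIC as the model χ² minus the degrees of freedom times log(N). The variable names are illustrative, and the formulas are inferred from the output rather than taken from the sem documentation.

```r
chisq <- 6.593496   # model chi square from summary(sem.tau)
df    <- 5          # model degrees of freedom
n_obs <- 100        # number of observations passed to sem
n_par <- 5          # free parameters: one loading (a) and four error variances

# Probability of a chi square this large or larger if the model holds.
p_value <- pchisq(chisq, df, lower.tail = FALSE)

# Information criteria in the form that matches the printed output
# (an inferred form, see the note above).
aic <- chisq + 2 * n_par
bic <- chisq - df * log(n_obs)

print(round(c(p = p_value, AIC = aic, BIC = bic), 4))
```

The same arithmetic applied to the other models in this section (for example 8 free parameters and 2 degrees of freedom for the congeneric model) gives the AIC and BIC values printed for them.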

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong   #show the model

      Path       Parameter Value
 [1,] "X1->V1"   "a"       NA
 [2,] "X1->V2"   "b"       NA
 [3,] "X1->V3"   "c"       NA
 [4,] "X1->V4"   "d"       NA
 [5,] "V1<->V1"  "x1e"     NA
 [6,] "V2<->V2"  "x2e"     NA
 [7,] "V3<->V3"  "x3e"     NA
 [8,] "V4<->V4"  "x4e"     NA
 [9,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)   #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

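The anova comparison of the congeneric versus tau equivalent model shows that the change in χ² (3.65) is not significant given the change in degrees of freedom (3), with p = 0.30. That is, with just 100 subjects, the (actually false) tau equivalent model can not be rejected in favor of the congeneric model.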
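The likelihood ratio test reported by anova for nested models can be recomputed from the two model χ² values and their degrees of freedom. The following base R sketch takes its inputs from the anova table for sem.cong and sem.tau; the variable names are illustrative.

```r
# Model chi squares and degrees of freedom from the anova table.
chisq_cong <- 2.9417; df_cong <- 2   # congeneric (less constrained) model
chisq_tau  <- 6.5935; df_tau  <- 5   # tau equivalent (more constrained) model

# The difference in chi square is itself distributed as chi square,
# with degrees of freedom equal to the difference in model df.
lr_chisq <- chisq_tau - chisq_cong
lr_df    <- df_tau - df_cong
p_value  <- pchisq(lr_chisq, lr_df, lower.tail = FALSE)

print(round(c(LR = lr_chisq, df = lr_df, p = p_value), 4))
```

A p value near 0.30 means that constraining the four loadings to be equal does not fit reliably worse than leaving them free at this sample size.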

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly, the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects, this model can not be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one   #show the model

      Path       Parameter Value
 [1,] "X1->V1"   "a"       NA
 [2,] "X1->V2"   "b"       NA
 [3,] "X1->V3"   "c"       NA
 [4,] "X1->V4"   "d"       NA
 [5,] "X1->V5"   "e"       NA
 [6,] "X1->V6"   "f"       NA
 [7,] "X1->V7"   "g"       NA
 [8,] "X1->V8"   "h"       NA
 [9,] "X1->V9"   "i"       NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "X1<->X1"  NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
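The pattern in this residual matrix (positive residuals within each block of three variables, small negative ones between blocks) follows directly from forcing one factor onto a hierarchical structure. The base R sketch below is an illustration under stated assumptions rather than a reproduction of what residuals() does internally: it assumes that the bifact matrix is the population matrix implied by the hierarchical loadings shown in Figure 10, takes the nine one-factor loadings from the sem output above, and uses illustrative variable names.

```r
# Assumed generating structure (loadings as shown in Figure 10).
first_order  <- c(0.8, 0.7, 0.6, 0.7, 0.6, 0.5, 0.7, 0.6, 0.4)
second_order <- c(0.9, 0.8, 0.7)
group <- rep(1:3, each = 3)

# Factor intercorrelations implied by the higher-order factor.
Phi <- second_order %o% second_order
diag(Phi) <- 1
# Pattern matrix: each variable loads on exactly one group factor.
Fx <- matrix(0, nrow = 9, ncol = 3)
Fx[cbind(1:9, group)] <- first_order
# Population correlation matrix implied by the hierarchical model.
R <- Fx %*% Phi %*% t(Fx)
diag(R) <- 1

# One-factor loadings (parameters a to i) as estimated by sem.
lambda <- c(0.7507129, 0.6726412, 0.5811209, 0.5737425, 0.5021915,
            0.4239908, 0.5067957, 0.4450171, 0.3052415)

# Residual = observed correlation minus the one-factor reproduction.
resid_one <- R - lambda %o% lambda
diag(resid_one) <- 0
dimnames(resid_one) <- list(paste0("V", 1:9), paste0("V", 1:9))
print(round(resid_one, 2))
```

Under these assumptions the rounded result agrees with the printed residuals, for example 0.06 for V1 with V2, 0.19 for V7 with V8 and -0.03 for V1 with V4.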

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
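The reason for this pattern can be seen from the usual factor model for the implied correlation matrix. The notation below is introduced here for illustration and is not taken from the text: $\Lambda$ is the matrix of loadings, $\Phi$ the matrix of factor intercorrelations, and $U^2$ the diagonal matrix of uniquenesses.

```latex
\begin{align*}
\hat{R} &= \Lambda \Phi \Lambda' + U^2 \\
\hat{r}_{ij} &= \lambda_i \, \phi_{kl} \, \lambda_j
  \qquad \text{for variable } i \text{ on factor } k
  \text{ and variable } j \text{ on factor } l \\
\Phi = I \;&\Rightarrow\; \hat{r}_{ij} = 0 \text{ when } k \neq l,
  \text{ so that } r_{ij} - \hat{r}_{ij} = r_{ij}
\end{align*}
```

With orthogonal factors, then, every between-group residual is simply the observed correlation itself. For example, the residual of 0.40 for V1 with V4 is the full correlation between those two variables, which the orthogonal model leaves entirely unexplained.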

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)            #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)   #create the sem model
> mod.f3                        #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

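[Figure omitted: path diagram titled "Structural model" in which V1 to V3 load on MR1 (0.8, 0.7, 0.6), V4 to V6 on MR3 (0.7, 0.6, 0.5) and V7 to V9 on MR2 (0.7, 0.6, 0.4), with the three factors intercorrelated.]

Figure 13: A three factor, oblique solution.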

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)   #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

[Figure omitted: path diagram titled "Omega" in which the nine Thurstone variables load on the general factor g and on the group factors F1, F2 and F3.]

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability, the residual factors account for the correlations between items within groups.

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

> stdCoef(sem.bi,digits=2)

                      Std. Estimate
1  Sentences          0.7678671     Sentences <--- g
2  Vocabulary         0.7909246     Vocabulary <--- g
3  SentCompletion     0.7536211     SentCompletion <--- g
4  FirstLetters       0.6083814     FirstLetters <--- g
5  FourLetterWords    0.5973349     FourLetterWords <--- g
6  Suffixes           0.5717900     Suffixes <--- g
7  LetterSeries       0.5668950     LetterSeries <--- g
8  Pedigrees          0.6623317     Pedigrees <--- g
9  LetterGroup        0.5299523     LetterGroup <--- g
10 F1Sentences        0.4878697     Sentences <--- F1
11 F1Vocabulary       0.4523233     Vocabulary <--- F1
12 F1SentCompletion   0.4044507     SentCompletion <--- F1
13 F2FirstLetters     0.6140531     FirstLetters <--- F2
14 F2FourLetterWords  0.5058063     FourLetterWords <--- F2
15 F2Suffixes         0.3943206     Suffixes <--- F2
16 F3LetterSeries     0.7272957     LetterSeries <--- F3
17 F3Pedigrees        0.2468418     Pedigrees <--- F3
18 F3LetterGroup      0.4091494     LetterGroup <--- F3
19 e1                 0.1723633     Sentences <--> Sentences
20 e2                 0.1698418     Vocabulary <--> Vocabulary
21 e3                 0.2684748     SentCompletion <--> SentCompletion
22 e4                 0.2528108     FirstLetters <--> FirstLetters
23 e5                 0.3873510     FourLetterWords <--> FourLetterWords
24 e6                 0.5175675     Suffixes <--> Suffixes
25 e7                 0.1496710     LetterSeries <--> LetterSeries
26 e8                 0.5003859     Pedigrees <--> Pedigrees
27 e9                 0.5517473     LetterGroup <--> LetterGroup
28                    1.0000000     F1 <--> F1
29                    1.0000000     F2 <--> F2
30                    1.0000000     F3 <--> F3
31                    1.0000000     g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

[Figure omitted: path diagram titled "Omega" in which the nine Thurstone variables load on the group factors F1, F2 and F3, which in turn load on the general factor g.]

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

> stdCoef(sem.hi,digits=2)

                      Std. Estimate
1  gF1                0.8220754     F1 <--- g
2  gF2                0.7817998     F2 <--- g
3  gF3                0.8150140     F3 <--- g
4  F1Sentences        0.9047111     Sentences <--- F1
5  F1Vocabulary       0.9138214     Vocabulary <--- F1
6  F1SentCompletion   0.8560764     SentCompletion <--- F1
7  F2FirstLetters     0.8357617     FirstLetters <--- F2
8  F2FourLetterWords  0.7971819     FourLetterWords <--- F2
9  F2Suffixes         0.7025560     Suffixes <--- F2
10 F3LetterSeries     0.7808129     LetterSeries <--- F3
11 F3Pedigrees        0.7201599     Pedigrees <--- F3
12 F3LetterGroup      0.7034902     LetterGroup <--- F3
13 e1                 0.1814979     Sentences <--> Sentences
14 e2                 0.1649304     Vocabulary <--> Vocabulary
15 e3                 0.2671331     SentCompletion <--> SentCompletion
16 e4                 0.3015024     FirstLetters <--> FirstLetters
17 e5                 0.3645010     FourLetterWords <--> FourLetterWords
18 e6                 0.5064151     Suffixes <--> Suffixes
19 e7                 0.3903313     LetterSeries <--> LetterSeries
20 e8                 0.4813697     Pedigrees <--> Pedigrees
21 e9                 0.5051016     LetterGroup <--> LetterGroup
22                    0.3241920     F1 <--> F1
23                    0.3887891     F2 <--> F2
24                    0.3357521     F3 <--> F3
25                    1.0000000     g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
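The degrees of freedom reported for each of the confirmatory models in this section follow from a simple count: the number of distinct elements in the correlation matrix minus the number of free parameters. The helper below is a hypothetical base R sketch of that count (the function name model_df is not part of psych or sem), applied to the models discussed above with factor variances fixed at 1.

```r
# Degrees of freedom = distinct variances and covariances minus free parameters.
model_df <- function(n_vars, n_free) {
  n_vars * (n_vars + 1) / 2 - n_free
}

# Four-variable models fit to the cong data.
df_tau  <- model_df(4, 1 + 4)       # one common loading + 4 error variances
df_cong <- model_df(4, 4 + 4)       # 4 loadings + 4 error variances

# Nine-variable models.
df_one  <- model_df(9, 9 + 9)       # 9 loadings + 9 error variances
df_bi   <- model_df(9, 9 + 9 + 9)   # 9 g loadings + 9 group loadings + 9 errors
df_hi   <- model_df(9, 3 + 9 + 9)   # 3 g-to-factor paths + 9 loadings + 9 errors

print(c(tau = df_tau, cong = df_cong, one = df_one, bi = df_bi, hi = df_hi))
```

The difference between the last two counts (24 versus 18) is the 6 additional parameters of the bifactor model, and it is also the degrees of freedom of the likelihood ratio test comparing the two solutions.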

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

gt omsem lt- omegaSem(Thurstonenobs=213)

44

Call omegaSem(m = Thurstone nobs = 213)

Omega

Call omegah(m = m nfactors = nfactors fm = fm key = key flip = flip

digits = digits title = title sl = sl labels = labels

plot = plot nobs = nobs rotate = rotate Phi = Phi option = option

covar = covar)

Alpha 089

G6 091

Omega Hierarchical 074

Omega H asymptotic 079

Omega Total 093

Schmid Leiman Factor loadings greater than  0.2
                   g   F1*   F2*   F3*   h2   u2   p2
Sentences       0.71  0.56             0.82 0.18 0.61
Vocabulary      0.73  0.55             0.84 0.16 0.63
SentCompletion  0.68  0.52             0.74 0.26 0.63
FirstLetters    0.65        0.56       0.73 0.27 0.57
FourLetterWords 0.62        0.49       0.63 0.37 0.61
Suffixes        0.56        0.41       0.50 0.50 0.63
LetterSeries    0.59              0.62 0.73 0.27 0.48
Pedigrees       0.58  0.24        0.34 0.51 0.49 0.66
LetterGroup     0.54              0.46 0.52 0.48 0.56

With eigenvalues of:
   g  F1*  F2*  F3*
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g  F1*  F2*  F3*
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1*   F2*   F3*   h2   u2   p2
Sentences       0.77  0.49             0.82 0.18 0.72
Vocabulary      0.79  0.45             0.83 0.17 0.75
SentCompletion  0.75  0.40             0.73 0.27 0.77
FirstLetters    0.61        0.61       0.74 0.26 0.50
FourLetterWords 0.60        0.50       0.61 0.39 0.59
Suffixes        0.57        0.39       0.48 0.52 0.68
LetterSeries    0.57              0.73 0.85 0.15 0.38
Pedigrees       0.66              0.25 0.50 0.50 0.87
LetterGroup     0.53              0.41 0.45 0.55 0.62

With eigenvalues of:
   g  F1*  F2*  F3*
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g   F1*  F2*  F3*
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object
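The BIC values in the exploratory part of this output are consistent with the χ² based form BIC = χ² − df · ln(N). That reading is inferred from the printed numbers rather than quoted from the package documentation, so the base R check below should be taken as a sketch under that assumption; chisq_bic is a hypothetical helper name.

```r
# Chi-square based BIC: chisq - df * log(N).
# chisq_bic is a hypothetical helper; the formula is an assumption that
# reproduces the BIC values printed in the exploratory omega output.
chisq_bic <- function(chisq, df, n_obs) chisq - df * log(n_obs)

bic_bifactor <- chisq_bic(2.98, 12, 213)     # exploratory bifactor solution
bic_general  <- chisq_bic(306.71, 27, 213)   # general factor only
round(c(bifactor = bic_bifactor, general = bic_general), 2)
```

A strongly negative value for the bifactor solution and a large positive value for the single general factor point in the same direction as the χ² tests: the group factors are needed.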

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem (plot title: Omega from SEM).

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


Proportion Explained  0.41 0.31 0.28
Cumulative Proportion 0.41 0.72 1.00

    ML1 ML3 ML2
ML1   1   0   0
ML3   0   1   0
ML2   0   0   1

3.2 Exploratory hierarchical models

In addition to the conventional oblique factor model, an alternative model is to consider the correlations between the factors to represent a higher order factor. This can be shown either as a bifactor solution (Holzinger and Swineford, 1937; Schmid and Leiman, 1957) with a general factor for all variables and a set of residualized group factors, or as a hierarchical structure. An exploratory hierarchical model can be applied to this kind of data structure using the omega function. Graphic options include drawing a Schmid-Leiman bifactor solution (Figure 9) or drawing a hierarchical factor solution (Figure 10).

3.2.1 A bifactor solution

The bifactor solution has a general factor loading for each variable as well as a set of residual group factors. This approach has been used extensively in the measurement of ability and has more recently been used in the measure of psychopathology (Reise et al., 2007). Data sets included in the bifactor data include the original (Holzinger and Swineford, 1937) data set (holzinger) as well as a set from Reise et al. (2007) (reise) and a nine variable problem from Thurstone.

3.2.2 A hierarchical solution

Both of these graphical representations are reflected in the output of the omega function. The first was done using a Schmid-Leiman transformation, the second was not. As will be seen later, the objects returned from these two analyses may be used as models for a sem analysis. It is also useful to examine the estimates of reliability reported by omega.
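The Schmid-Leiman transformation behind the bifactor output can be sketched in a few lines of base R. The loadings below are assumed population values for the nine variable example (three variables per group factor, with the group factors loading 0.9, 0.8 and 0.7 on g). They were chosen because they reproduce the g and group loadings that omega prints for this data set (for example 0.72 and 0.35 for V1); the code is an illustration, not the internals of omega.

```r
# Lower-order loadings of nine variables on three correlated group factors
# (assumed values for illustration).
fload <- matrix(0, nrow = 9, ncol = 3,
                dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
fload[1:3, "F1"] <- c(0.8, 0.7, 0.6)
fload[4:6, "F2"] <- c(0.7, 0.6, 0.5)
fload[7:9, "F3"] <- c(0.7, 0.6, 0.4)
# Loadings of the three group factors on the general factor (assumed).
gload <- c(F1 = 0.9, F2 = 0.8, F3 = 0.7)

# Schmid-Leiman: general loadings are the product of the two levels;
# group loadings are residualized by sqrt(1 - gload^2).
g_loading <- drop(fload %*% gload)
group     <- sweep(fload, 2, sqrt(1 - gload^2), "*")
sl        <- cbind(g = g_loading, group)

# Model-implied correlation matrix and omega hierarchical:
# (sum of general loadings)^2 divided by the sum of all correlations.
R <- sl %*% t(sl)
diag(R) <- 1
omega_h <- sum(g_loading)^2 / sum(R)
round(sl, 2)
round(omega_h, 2)
```

Because the general and residualized group factors are orthogonal, the squared entries in each row of sl sum to the communality of that variable.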

> om.bi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem.

> om.hi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem.

> om.bi

Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.78
G.6:                   0.78
Omega Hierarchical:    0.7
Omega H asymptotic:    0.85
Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1*   F2*   F3*   h2   u2   p2
V1 0.72  0.35             0.64 0.36 0.81
V2 0.63  0.31             0.49 0.51 0.81
V3 0.54  0.26             0.36 0.64 0.81
V4 0.56        0.42       0.49 0.51 0.64
V5 0.48        0.36       0.36 0.64 0.64
V6 0.40        0.30       0.25 0.75 0.64
V7 0.49              0.50 0.49 0.51 0.49
V8 0.42              0.43 0.36 0.64 0.49
V9 0.28              0.29 0.16 0.84 0.49

With eigenvalues of:
   g  F1*  F2*  F3*
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0
The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23
The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g   F1*   F2*   F3*
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST (plot title: Hierarchical cluster analysis of bifact data).

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

 $model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
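The population matrix printed above can be reproduced in base R as F Φ F′, where F stacks the X and Y loadings. The values of fx, fy and Phi below are assumptions: they are not shown in this section, and were chosen because they regenerate the printed correlations and reliabilities (for example 0.9 × 0.8 = 0.72 for V and Q).

```r
# Measurement models (assumed loadings that reproduce the matrix above).
fx <- matrix(c(0.9, 0.8, 0.6, 0.0,  0.0,     # first X factor (ability)
               0.0, 0.0, 0.6, 0.8, -0.7),    # second X factor (motivation)
             ncol = 2,
             dimnames = list(c("V", "Q", "A", "nach", "Anx"), c("X1", "X2")))
fy <- matrix(c(0.6, 0.5, 0.4), ncol = 1,
             dimnames = list(c("gpa", "Pre", "MA"), "Y1"))
# Correlations among the latent variables (assumed): X1 and X2 are
# independent and each correlates 0.7 with Y1.
Phi <- matrix(c(1.0, 0.0, 0.7,
                0.0, 1.0, 0.7,
                0.7, 0.7, 1.0), ncol = 3)

# Combined loading matrix: X variables load only on the X factors,
# Y variables only on the Y factor.
f <- rbind(cbind(fx, Y1 = 0), cbind(X1 = 0, X2 = 0, fy))
model <- f %*% Phi %*% t(f)
reliability <- diag(model)   # communalities, taken before the diagonal is reset
diag(model) <- 1
round(model, 2)
round(reliability, 2)
```

Setting the two 0.7 entries of Phi to zero would make every X-Y correlation vanish, which shows that the predictive structure lives entirely in the latent correlations.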

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model (plot title: Exploratory Structural Model).

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "a"       NA
[3,] "X1->V3"  "a"       NA
[4,] "X1->V4"  "a"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "b"       NA
[3,] "X1->V3"  "c"       NA
[4,] "X1->V4"  "d"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)  #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

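The anova comparison of the congeneric versus tau equivalent model shows that the change in χ² (3.65 for a change of 3 degrees of freedom, p = 0.30) is not significant. With only 100 subjects, the tau equivalent model, although actually false, cannot be rejected in favor of the congeneric model.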
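The degrees of freedom and the likelihood ratio test in this comparison can be verified with base R. The sketch below counts free parameters directly (one shared loading versus four, plus four error variances in each model) and reuses the χ² values printed by summary; model_df is a hypothetical helper name.

```r
# Degrees of freedom = unique variances and covariances minus free parameters.
# model_df is a hypothetical helper written for this illustration.
model_df <- function(n_var, n_free) n_var * (n_var + 1) / 2 - n_free

df_tau  <- model_df(4, 1 + 4)   # one common loading + 4 error variances
df_cong <- model_df(4, 4 + 4)   # 4 separate loadings + 4 error variances

# Chi-square difference test using the values reported by summary()
lr   <- 6.593496 - 2.941678
p_lr <- pchisq(lr, df = df_tau - df_cong, lower.tail = FALSE)
c(df_tau = df_tau, df_cong = df_cong, LR = lr, p = p_lr)
```

The resulting p value is about 0.30, matching the Pr(>Chisq) entry in the anova table.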

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
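The pattern in these residuals follows directly from the model: with uncorrelated factors, every correlation between variables in different groups is predicted to be zero, so the whole observed between-group correlation is left over. A base R sketch, assuming the hierarchical population values that the estimated loadings above suggest (0.8, 0.7, 0.6 and so on, with the group factors loading 0.9, 0.8 and 0.7 on g):

```r
# Assumed hierarchical population structure for the nine variable example.
fload <- matrix(0, 9, 3, dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
fload[1:3, 1] <- c(0.8, 0.7, 0.6)
fload[4:6, 2] <- c(0.7, 0.6, 0.5)
fload[7:9, 3] <- c(0.7, 0.6, 0.4)
gload <- c(0.9, 0.8, 0.7)     # loadings of the group factors on g (assumed)

Phi <- outer(gload, gload)    # factor correlations implied by g
diag(Phi) <- 1

R_pop  <- fload %*% Phi %*% t(fload)   # correlated (hierarchical) factors
R_orth <- fload %*% t(fload)           # same loadings, uncorrelated factors
diag(R_pop)  <- 1
diag(R_orth) <- 1

# Residuals left by the orthogonal model
resid_orth <- R_pop - R_orth
round(resid_orth, 2)
```

Within each group the residuals are exactly zero, while between groups they equal the product of the two loadings and the factor correlation, for example 0.8 × 0.72 × 0.7 ≈ 0.40 for V1 and V4.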

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)  #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)  #create the sem model
> mod.f3  #show it

      Path       Parameter Value
 [1,] "MR1->V1"  "F1V1"    NA
 [2,] "MR1->V2"  "F1V2"    NA
 [3,] "MR1->V3"  "F1V3"    NA
 [4,] "MR3->V4"  "F2V4"    NA
 [5,] "MR3->V5"  "F2V5"    NA
 [6,] "MR3->V6"  "F2V6"    NA
 [7,] "MR2->V7"  "F3V7"    NA
 [8,] "MR2->V8"  "F3V8"    NA
 [9,] "MR2->V9"  "F3V9"    NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"  NA
[20,] "MR2<->MR1" "rF3F1"  NA
[21,] "MR2<->MR3" "rF3F2"  NA
[22,] "MR1<->MR1" NA       "1"
[23,] "MR3<->MR3" NA       "1"
[24,] "MR2<->MR2" NA       "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph (Figure 13).

This makes much better sense, and in fact (as hoped) recovers the original structure.
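In matrix terms, the oblique model specified above is the standard common factor decomposition of the correlation matrix. The symbols are introduced here only for illustration and do not appear in the sem output: Λ is the 9 × 3 matrix of loadings, Φ the 3 × 3 matrix of factor correlations, and U² the diagonal matrix of unique variances.

```latex
R = \Lambda \Phi \Lambda' + U^{2},
\qquad
\Phi =
\begin{pmatrix}
1 & \phi_{12} & \phi_{13} \\
\phi_{12} & 1 & \phi_{23} \\
\phi_{13} & \phi_{23} & 1
\end{pmatrix}
```

The orthogonal model of the previous section is the special case Φ = I, which is why it cannot reproduce any correlation between variables that load on different factors. The three free elements of Φ correspond to the parameters rF2F1, rF3F1 and rF3F2 in the model listing.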

Figure 13: A three factor, oblique solution (plot title: Structural model).

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is “lazy loaded”). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07
e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups (plot title: Omega).

gt stdCoef(sembidigits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> semhi <- sem(omhi$model$sem, Thurstone, 213)

> summary(semhi, digits=2)

Model Chisquare = 38.1963   Df = 24   Pr(>Chisq) = 0.03310059
AIC = 80.1963
BIC = -90.47471

Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

R-square for Endogenous Variables
            F1             F2              F3    Sentences   Vocabulary
        0.6758         0.6112          0.6642       0.8185       0.8351
SentCompletion   FirstLetters FourLetterWords     Suffixes LetterSeries
        0.7329         0.6985          0.6355       0.4936       0.6097
     Pedigrees    LetterGroup
        0.5186         0.4949

Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

Iterations = 54

> stdCoef(semhi, digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(semhi, sembi)

LR Test for Difference Between Models
      Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semhi       24      38.196
sembi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model (a χ² difference of 13.98 with 6 degrees of freedom, p = 0.03).

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> omholzbi <- omega(Holzinger, 4)

> semholzbi <- sem(omholzbi$model$sem, Holzinger, 355)

> omholzhi <- omega(Holzinger, 4, sl=FALSE)

> semholzhi <- sem(omholzhi$model$sem, Holzinger, 355)

> anova(semholzbi, semholzhi)

LR Test for Difference Between Models
          Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semholzbi       63      147.66
semholzhi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first runs an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> omsem <- omegaSem(Thurstone, n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

The following analyses were done using the lavaan package

Omega Hierarchical from a confirmatory model using sem =  0.79
Omega Total from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models (0.74 versus 0.79). The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
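As a rough check on the confirmatory values, ωh and ωt can be approximated by hand from the rounded loadings printed above. The sketch below assumes orthogonal general and group factors, so that the total variance is the squared sum of the g loadings, plus the squared sums of the group loadings, plus the uniquenesses. It uses only base R and the two-decimal values from the output, so it will not match omegaSem exactly; the variable names are illustrative.

```r
# Rounded confirmatory loadings copied from the omegaSem output (two decimals only).
g <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
group <- list(F1 = c(0.49, 0.45, 0.40),
              F2 = c(0.61, 0.50, 0.39),
              F3 = c(0.73, 0.25, 0.41))
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)

# Variance of the unit-weighted total score attributable to each source.
var_general <- sum(g)^2
var_group   <- sum(sapply(group, function(x) sum(x)^2))
var_total   <- var_general + var_group + sum(u2)

omega_h <- var_general / var_total                 # general factor saturation
omega_t <- (var_general + var_group) / var_total   # all common variance

print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

Under these assumptions the two values round to the 0.79 and 0.93 reported for the confirmatory model.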

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


> ombi <- omega(bifact)

Figure 9: An exploratory bifactor solution to the nine variable problem

> omhi <- omega(bifact, sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0
The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23
The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found, and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
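The population matrix above has the form F Φ F', with the communalities on the diagonal replaced by 1. The base R sketch below rebuilds it without psych. The loadings and factor correlations used here are not shown in this part of the text; they are assumptions inferred from the printed reliabilities and correlations, and the object names (Fmat, R, and so on) are illustrative only.

```r
# Loadings inferred from the printed output (assumed, not taken from the call).
fx <- matrix(c(0.9, 0.8, 0.6, 0.0,  0.0,    # ability factor
               0.0, 0.0, 0.6, 0.8, -0.7),   # motivation factor
             ncol = 2)
fy <- c(0.6, 0.5, 0.4)                      # outcome factor
vars <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")

# Combined loading matrix: five X variables on two factors, three Y variables on one.
Fmat <- matrix(0, nrow = 8, ncol = 3,
               dimnames = list(vars, c("X1", "X2", "Y1")))
Fmat[1:5, 1:2] <- fx
Fmat[6:8, 3]   <- fy

# Assumed factor correlations: X1 and X2 uncorrelated, each correlating 0.7 with Y1.
Phi <- matrix(c(1.0, 0.0, 0.7,
                0.0, 1.0, 0.7,
                0.7, 0.7, 1.0), ncol = 3)

R <- Fmat %*% Phi %*% t(Fmat)   # model-implied covariances of the common parts
reliability <- diag(R)          # communalities, reported as population reliabilities
diag(R) <- 1                    # correlation matrix

print(round(R, 2))
print(round(reliability, 2))
```

With these assumed values the rounded result agrees with the printed $model and $reliability elements, for example 0.38 for V with gpa and 0.72 for the reliability of A.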

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gregpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027
The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0
Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.
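To make the distinction concrete, the base R sketch below builds the correlation matrix implied by a single factor for each case. The loadings are hypothetical values chosen only for illustration (they are not estimates from the cong data set), and implied_r is an illustrative helper, not a psych function.

```r
# Correlation matrix implied by one factor with standardized loadings:
# off-diagonal element (i, j) is loading_i * loading_j.
implied_r <- function(loadings) {
  r <- tcrossprod(loadings)
  diag(r) <- 1
  r
}

cong_r <- implied_r(c(0.8, 0.7, 0.6, 0.5))  # congeneric: unequal loadings (hypothetical)
tau_r  <- implied_r(rep(0.65, 4))           # tau equivalent: equal loadings (hypothetical)

print(round(cong_r, 2))
print(round(tau_r, 2))
```

In the tau equivalent matrix every off-diagonal correlation is the same, while in the congeneric matrix the correlations vary with the loadings. That equality constraint is what the comparison of the two models tests.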

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)

> modtau <- structure.sem(c("a","a","a","a"), labels=paste("V",1:4,sep=""))

> modtau  #show it

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "a"       NA
[3,] "X1->V3"  "a"       NA
[4,] "X1->V4"  "a"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semtau <- sem(modtau, cong, 100)

> summary(semtau, digits=2)

Model Chisquare = 6.593496   Df = 5   Pr(>Chisq) = 0.2526696
AIC = 16.5935
BIC = -16.43236

Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

Iterations = 11
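The fit indices in this summary can be recomputed by hand from the χ² value. The printed numbers appear consistent with AIC = χ² + 2q, where q is the number of free parameters, and BIC = χ² − df·log(N). The sketch below assumes those two definitions and uses only base R; the variable names are illustrative.

```r
# Values copied from the tau equivalent summary above.
chisq <- 6.593496
df    <- 5
n_obs <- 100
n_par <- 5   # one common loading (a) plus four error variances

p_value <- pchisq(chisq, df, lower.tail = FALSE)  # upper-tail chi-square probability
aic     <- chisq + 2 * n_par                      # assumed definition of AIC
bic     <- chisq - df * log(n_obs)                # assumed definition of BIC

print(c(p_value = p_value, AIC = aic, BIC = bic))
```

The same two formulas also reproduce the AIC and BIC printed for the other sem summaries in this section.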

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> modcong <- structure.sem(c("a","b","c","d"), labels=paste("V",1:4,sep=""))

> modcong  #show the model

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "b"       NA
[3,] "X1->V3"  "c"       NA
[4,] "X1->V4"  "d"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semcong <- sem(modcong, cong, 100)

> summary(semcong, digits=2)

Model Chisquare = 2.941678   Df = 2   Pr(>Chisq) = 0.2297327
AIC = 18.94168
BIC = -6.268663

Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

Iterations = 12

> anova(semcong, semtau)  #test the difference between the two models

LR Test for Difference Between Models
        Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semcong        2      2.9417
semtau         5      6.5935  3   3.6518     0.3016

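The anova comparison of the congeneric versus tau equivalent model shows that the change in χ² (3.65 for a change of 3 degrees of freedom, p = 0.30) is not significant. With only 100 subjects, the tau equivalent model can not be rejected in favor of the congeneric model, even though it is actually false.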
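The likelihood ratio comparison reported by anova can be reproduced from the two model χ² values. A minimal base R sketch, assuming the two models are nested and that the difference in χ² is itself distributed as χ² with degrees of freedom equal to the difference in df:

```r
# Chi-square values and degrees of freedom copied from the two summaries above.
chisq_tau  <- 6.593496; df_tau  <- 5   # tau equivalent (more constrained)
chisq_cong <- 2.941678; df_cong <- 2   # congeneric (less constrained)

lr_chisq <- chisq_tau - chisq_cong     # likelihood ratio chi-square
lr_df    <- df_tau - df_cong           # difference in degrees of freedom
lr_p     <- pchisq(lr_chisq, lr_df, lower.tail = FALSE)

print(round(c(LR_Chisq = lr_chisq, Df = lr_df, p = lr_p), 4))
```

The recomputed values agree with the LR Chisq and Pr(>Chisq) columns of the anova table.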

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. modone is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model can not be rejected. However, an examination of the residuals suggests serious problems with the model.

> modone <- structure.sem(letters[1:9], labels=paste("V",1:9,sep=""))

> modone  #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semone <- sem(modone, bifact, 100)

> summary(semone, digits=2)

Model Chisquare = 21.16848   Df = 27   Pr(>Chisq) = 0.778334
AIC = 57.16848
BIC = -103.1711

Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

Iterations = 11

> round(residuals(semone), 2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)

> modf1 <- structure.sem(f1)

> semf1 <- sem(modf1, bifact, 100)

> semf1

Model Chisquare = 21.16848   Df = 27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

Iterations = 11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact, 3, rotate="varimax")

> modf3 <- structure.sem(f3)

> semf3 <- sem(modf3, bifact, 100)

> summary(semf3, digits=2)

Model Chisquare = 53.86635   Df = 27   Pr(>Chisq) = 0.001579738
AIC = 89.86635
BIC = -70.47325

Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

Iterations = 24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
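The pattern of residuals can be reproduced by hand. What follows is a minimal sketch in base R, not the code used to produce the output above. It assumes (consistent with the loadings reported in this document) group factor loadings of .8, .7, .6; .7, .6, .5; and .7, .6, .4, with the three group factors loading .9, .8 and .7 on a general factor. The object names (fx, gload, Phi, R.pop, R.orth, resid.orth) are illustrative only.

```r
# Assumed loadings of the nine variables on three group factors
fx <- matrix(0, nrow = 9, ncol = 3,
             dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
fx[1:3, 1] <- c(.8, .7, .6)
fx[4:6, 2] <- c(.7, .6, .5)
fx[7:9, 3] <- c(.7, .6, .4)

# Assumed loadings of the group factors on g imply the factor correlations
gload <- c(.9, .8, .7)
Phi <- outer(gload, gload)
diag(Phi) <- 1

# Population correlation matrix under the hierarchical structure
R.pop <- fx %*% Phi %*% t(fx)
diag(R.pop) <- 1

# Correlations implied by the same loadings with orthogonal factors (Phi = I)
R.orth <- fx %*% t(fx)
diag(R.orth) <- 1

# Within-group residuals vanish; between-group residuals equal the
# observed between-group correlations, because the orthogonal model
# predicts zero correlation between variables on different factors
resid.orth <- R.pop - R.orth
round(resid.orth, 2)
```

Under these assumptions the between-group entries are simply the population correlations themselves (for example, .8 × .9 × .8 × .7 ≈ .40 for V1 with V4), which is why an orthogonal model cannot fit.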

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)   #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)   #create the sem model
> mod.f3   #show it

      Path       Parameter Value
 [1,] MR1->V1    F1V1      NA
 [2,] MR1->V2    F1V2      NA
 [3,] MR1->V3    F1V3      NA
 [4,] MR3->V4    F2V4      NA
 [5,] MR3->V5    F2V5      NA
 [6,] MR3->V6    F2V6      NA
 [7,] MR2->V7    F3V7      NA
 [8,] MR2->V8    F3V8      NA
 [9,] MR2->V9    F3V9      NA
[10,] V1<->V1    x1e       NA
[11,] V2<->V2    x2e       NA
[12,] V3<->V3    x3e       NA
[13,] V4<->V4    x4e       NA
[14,] V5<->V5    x5e       NA
[15,] V6<->V6    x6e       NA
[16,] V7<->V7    x7e       NA
[17,] V8<->V8    x8e       NA
[18,] V9<->V9    x9e       NA
[19,] MR3<->MR1  rF2F1     NA
[20,] MR2<->MR1  rF3F1     NA
[21,] MR2<->MR3  rF3F2     NA
[22,] MR1<->MR1  NA        1
[23,] MR3<->MR3  NA        1
[24,] MR2<->MR2  NA        1
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)   #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

> stdCoef(sem.bi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

> stdCoef(sem.hi,digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first performs an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object.

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
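The same shift toward the general factor can be seen in the Explained Common Variance reported in each part of the output. The following is a small sketch in base R that recomputes it from the printed eigenvalues, assuming ECV is the general factor's eigenvalue divided by the sum of all four; the helper name ecv is hypothetical and is not a psych function.

```r
# Hypothetical helper: share of common variance due to the first (general) factor
ecv <- function(eigen) unname(eigen[1] / sum(eigen))

# Eigenvalues as printed in the omegaSem output above
efa <- c(g = 3.58, F1 = 0.96, F2 = 0.74, F3 = 0.72)  # exploratory solution
cfa <- c(g = 3.86, F1 = 0.60, F2 = 0.78, F3 = 0.75)  # confirmatory solution

round(c(EFA = ecv(efa), CFA = ecv(cfa)), 2)
```

With the rounded eigenvalues shown in the output, this gives roughly .60 for the exploratory solution and .64 for the confirmatory one, in line with the values reported above.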

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


> om.hi <- omega(bifact,sl=FALSE)

Figure 10: An exploratory hierarchical solution to the nine variable problem.

Omega Total            0.82

Schmid Leiman Factor loadings greater than  0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max  4.67   max/min =   1.82
mean percent general =  0.65    with sd =  0.14 and cv of  0.21
Explained Common Variance of the general factor =  0.67

The degrees of freedom are 12  and the fit is  0
The root mean square of the residuals is  0
The df corrected root mean square of the residuals is  0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  0.23
The root mean square of the residuals is  0.07
The df corrected root mean square of the residuals is  0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30
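The total-score omega coefficients in this output can be checked by hand from the printed loadings. The following is a minimal sketch in base R, assuming the usual definitions (ωh is the squared sum of the general factor loadings over the total variance of the composite; ωt is one minus the summed uniquenesses over that same total variance) and assuming the model-implied total variance can be rebuilt from the g loadings, the group loadings and the uniquenesses. The variable names are illustrative, and the inputs are the two-decimal values printed above, so only agreement to rounding should be expected.

```r
# Values copied from the Schmid Leiman table above (rounded to two decimals)
g     <- c(.72, .63, .54, .56, .48, .40, .49, .42, .28)  # general factor loadings
grp   <- c(.35, .31, .26, .42, .36, .30, .50, .43, .29)  # group factor loadings
u2    <- c(.36, .51, .64, .51, .64, .75, .51, .64, .84)  # uniquenesses
block <- rep(1:3, each = 3)                              # V1-V3, V4-V6, V7-V9

# Total variance of the unit-weighted composite implied by the model:
# general part + one term per group factor + unique variance
V.total <- sum(g)^2 + sum(tapply(grp, block, sum)^2) + sum(u2)

omega.h <- sum(g)^2 / V.total        # omega hierarchical (general)
omega.t <- 1 - sum(u2) / V.total     # omega total

round(c(omega.h = omega.h, omega.t = omega.t), 2)
```

Within rounding, these reproduce the .70 (omega general) and .82 (omega total) reported for the total scores.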

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

 $model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
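To see where these population values come from, the matrix can be rebuilt with ordinary matrix algebra. The sketch below uses base R only and does not call sim.structural. The loading matrices fx and fy and the factor correlation matrix Phi are not shown in this part of the text; the values used here are assumptions chosen because they reproduce the printed correlations and reliabilities.

```r
# Assumed loadings of the five predictors on two X factors
fx <- matrix(c(.9, .8, .6,  0,   0,
                0,  0, .6, .8, -.7), ncol = 2)
# Assumed loadings of the three outcomes on one Y factor
fy <- c(.6, .5, .4)
# Assumed factor correlations: X1 and X2 uncorrelated, each .7 with Y1
Phi <- matrix(c( 1,  0, .7,
                 0,  1, .7,
                .7, .7,  1), ncol = 3)

vars <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")
f <- matrix(0, nrow = 8, ncol = 3, dimnames = list(vars, c("X1", "X2", "Y1")))
f[1:5, 1:2] <- fx
f[6:8, 3] <- fy

# Model-implied correlations; the diagonal before replacement is the
# communality, reported above as the population reliability
model <- f %*% Phi %*% t(f)
reliability <- diag(model)
diag(model) <- 1

round(model, 2)
round(reliability, 2)
```

For example, the correlation of A with gpa is (.6 × .7 + .6 × .7) × .6 ≈ .50, because A loads on both X factors and each X factor correlates .7 with Y1.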

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau   #show it

     Path     Parameter Value
[1,] X1->V1   a         NA
[2,] X1->V2   a         NA
[3,] X1->V3   a         NA
[4,] X1->V4   a         NA
[5,] V1<->V1  x1e       NA
[6,] V2<->V2  x2e       NA
[7,] V3<->V3  x3e       NA
[8,] V4<->V4  x4e       NA
[9,] X1<->X1  NA        1
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11
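The AIC and BIC lines in these summaries can be recovered from the χ² statistic. The sketch below is an inference from the printed numbers rather than a statement about how the sem package computes them: the reported values are consistent with AIC = χ² + 2q, where q is the number of free parameters, and BIC = χ² − df × log(N). The function name fit.indices is hypothetical.

```r
# Hypothetical helper reproducing the AIC and BIC values printed by summary()
fit.indices <- function(chisq, df, n.par, n.obs) {
  c(AIC = chisq + 2 * n.par,          # penalty of 2 per free parameter
    BIC = chisq - df * log(n.obs))    # reward of log(N) per degree of freedom
}

# Tau equivalent model: chi square 6.593496 on 5 df,
# 5 free parameters (a, x1e, x2e, x3e, x4e), 100 subjects
tau <- fit.indices(chisq = 6.593496, df = 5, n.par = 5, n.obs = 100)
round(tau, 4)
```

The same arithmetic applied to other models in this document (for instance χ² = 24.2163, df = 18, 27 free parameters and 213 subjects for the Thurstone bifactor model) matches the AIC and BIC shown for them as well.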

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong   #show the model

     Path     Parameter Value
[1,] X1->V1   a         NA
[2,] X1->V2   b         NA
[3,] X1->V3   c         NA
[4,] X1->V4   d         NA
[5,] V1<->V1  x1e       NA
[6,] V2<->V2  x2e       NA
[7,] V3<->V3  x3e       NA
[8,] V4<->V4  x4e       NA
[9,] X1<->X1  NA        1
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau) #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

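The anova comparison of the congeneric versus tau equivalent model shows that, for these data, the change in χ2 is not significant given the change in degrees of freedom (LR χ2 = 3.65 with 3 df, p = 0.30): the additional equality constraints of the tau equivalent model do not reliably worsen the fit.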
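The likelihood ratio test reported by anova can be checked directly from the two model chi squares. This is a minimal sketch in base R using the values printed in the two summaries; the variable names are illustrative only.

```r
# Chi square and df for the two nested models (from the summaries above)
chisq_cong <- 2.941678; df_cong <- 2   # congeneric: loadings free
chisq_tau  <- 6.593496; df_tau  <- 5   # tau equivalent: loadings equal

lr_chisq <- chisq_tau - chisq_cong     # change in chi square
lr_df    <- df_tau - df_cong           # change in degrees of freedom

# Upper tail probability of the chi square difference
p_value <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

round(c(LR.Chisq = lr_chisq, Df = lr_df, p = p_value), 4)
```

A p value well above conventional thresholds means the more constrained model cannot be rejected in favor of the less constrained one.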

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although this model is not correct, with 100 subjects it cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))

> mod.one #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)

> summary(sem.one,digits=2)

Model Chisquare = 21.16848 Df = 27 Pr(>Chisq) = 0.778334

AIC = 57.16848

BIC = -103.1711

Normalized Residuals

      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

R-square for Endogenous Variables

    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

Parameter Estimates

    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

Iterations = 11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)

> mod.f1 <- structure.sem(f1)

> sem.f1 <- sem(mod.f1,bifact,100)

> sem.f1

Model Chisquare = 21.16848 Df = 27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

Iterations = 11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")

> mod.f3 <- structure.sem(f3)

> sem.f3 <- sem(mod.f3,bifact,100)

> summary(sem.f3,digits=2)

Model Chisquare = 53.86635 Df = 27 Pr(>Chisq) = 0.001579738

AIC = 89.86635

BIC = -70.47325

Normalized Residuals

     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

R-square for Endogenous Variables

  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

Parameter Estimates

     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

Iterations = 24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
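Why the between group residuals are so large can be seen with a little arithmetic. With orthogonal group factors the model implies a correlation of zero between variables from different groups, so each between group residual is simply the observed correlation, which in the bifact data is carried by the general factor. The sketch below is an illustration only: it assumes the general factor loadings reported for the bifact data in the Schmid Leiman (omega) output of this document, rounded to two decimals, so individual cells may differ from the sem output in the last digit.

```r
# General factor loadings for V1..V9 (rounded values from the omega output)
g <- c(V1 = 0.72, V2 = 0.63, V3 = 0.54,
       V4 = 0.56, V5 = 0.48, V6 = 0.40,
       V7 = 0.49, V8 = 0.42, V9 = 0.28)

group <- rep(1:3, each = 3)            # V1-V3, V4-V6, V7-V9

implied <- outer(g, g)                 # correlations due to g alone
between <- outer(group, group, "!=")   # TRUE for pairs in different groups

# Residuals expected when an orthogonal three factor model is imposed:
# zero within groups, the g-based correlation between groups
expected_resid <- round(implied * between, 2)
expected_resid
```

The pattern matches the printed residual matrix: zeros within each block of three variables and values such as 0.40 (V1 with V4) between blocks.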

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3) #extract three factors and do an oblique rotation

> mod.f3 <- structure.sem(f3) #create the sem model

> mod.f3 #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph (Figure 13).

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone, used in the SAS manual for PROC CALIS, and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is “lazy loaded”). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213) #use the model created by omega

> summary(sem.bi,digits=2)

Model Chisquare = 24.2163 Df = 18 Pr(>Chisq) = 0.1480685

AIC = 78.2163

BIC = -72.28696

Normalized Residuals

      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

R-square for Endogenous Variables

     Sentences     Vocabulary SentCompletion   FirstLetters FourLetterWords
        0.8276         0.8302         0.7315         0.7472          0.6126
      Suffixes   LetterSeries      Pedigrees    LetterGroup
        0.4824         0.8503         0.4996         0.4483

Parameter Estimates

                   Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

Iterations = 72

> stdCoef(sem.bi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)

> summary(sem.hi,digits=2)

Model Chisquare = 38.1963 Df = 24 Pr(>Chisq) = 0.03310059

AIC = 80.1963

BIC = -90.47471

Normalized Residuals

      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

R-square for Endogenous Variables

            F1             F2              F3      Sentences   Vocabulary
        0.6758         0.6112          0.6642         0.8185       0.8351
SentCompletion   FirstLetters FourLetterWords       Suffixes LetterSeries
        0.7329         0.6985          0.6355         0.4936       0.6097
     Pedigrees    LetterGroup
        0.5186         0.4949

Parameter Estimates

                   Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

Iterations = 54

> stdCoef(sem.hi,digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
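The "6 more parameters" can be verified by counting. The sketch below, in base R, counts the free parameters listed in the two summaries (loadings, paths from g, and error variances) against the number of distinct elements in a 9 x 9 correlation matrix, then recomputes the likelihood ratio test; the variable names are illustrative.

```r
p       <- 9
moments <- p * (p + 1) / 2     # 45 distinct variances and covariances

# Bifactor: 9 loadings on g + 9 group factor loadings + 9 error variances
par_bi <- 9 + 9 + 9
# Hierarchical: 3 paths from g to F1-F3 + 9 loadings + 9 error variances
par_hi <- 3 + 9 + 9

df_bi <- moments - par_bi      # degrees of freedom, bifactor
df_hi <- moments - par_hi      # degrees of freedom, hierarchical

# Likelihood ratio test from the two model chi squares
lr_chisq <- 38.1963 - 24.2163
p_value  <- pchisq(lr_chisq, df = df_hi - df_bi, lower.tail = FALSE)

c(df.bifactor = df_bi, df.hierarchical = df_hi,
  LR.Chisq = lr_chisq, p = round(p_value, 5))
```

The counts reproduce the 18 and 24 degrees of freedom shown in the summaries, and the p value agrees with the anova table.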

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)

> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)

> om.holz.hi <- omega(Holzinger,4,sl=FALSE)

> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)

> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first runs an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha: 0.89
G.6: 0.91
Omega Hierarchical: 0.74
Omega H asymptotic: 0.79
Omega Total 0.93

Schmid Leiman Factor loadings greater than 0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max 3.73 max/min = 1.34
mean percent general = 0.6 with sd = 0.05 and cv of 0.09
Explained Common Variance of the general factor = 0.6

The degrees of freedom are 12 and the fit is 0.01
The number of observations was 213 with Chi Square = 2.98 with prob < 1
The root mean square of the residuals is 0.01
The df corrected root mean square of the residuals is 0.01
RMSEA index = 0 and the 10 % confidence intervals are 0 0
BIC = -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 1.48
The number of observations was 213 with Chi Square = 306.71 with prob < 3.3e-49
The root mean square of the residuals is 0.14
The df corrected root mean square of the residuals is 0.16
RMSEA index = 0.22 and the 10 % confidence intervals are 0.199 0.244
BIC = 161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

The following analyses were done using the lavaan package

Omega Hierarchical from a confirmatory model using sem = 0.79
Omega Total from a confirmatory model using sem = 0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are 18 and the fit is 24.33052 with p = 0.1444947
general/max 4.92 max/min = 1.3
mean percent general = 0.65 with sd = 0.15 and cv of 0.23
Explained Common Variance of the general factor = 0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
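The confirmatory values of ωh and ωt can be approximated by hand from the CFA loadings printed above. The sketch below is an illustration under stated assumptions, not the computation psych performs: it uses the two decimal loadings from the output, treats the structure as a pure bifactor model (each item loads on g and on one group factor), and uses the model implied total variance rather than the observed correlation matrix.

```r
# CFA loadings from the omegaSem output (rounded to two decimals)
g  <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
grp <- list(F1 = c(0.49, 0.45, 0.40),
            F2 = c(0.61, 0.50, 0.39),
            F3 = c(0.73, 0.25, 0.41))
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)

general <- sum(g)^2                                    # variance due to g
groups  <- sum(sapply(grp, function(l) sum(l)^2))      # group factor variance
total   <- general + groups + sum(u2)                  # model implied total

omega_h <- general / total               # general factor saturation
omega_t <- (general + groups) / total    # all common variance

round(c(omega_h = omega_h, omega_t = omega_t), 2)
```

With these inputs the ratios round to the 0.79 and 0.93 reported for the confirmatory model. Repeating the exercise with the exploratory loadings is only approximate, since that solution includes a cross loading and omits loadings below 0.2.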

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11 psychTools_2.0.8 psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5 splines_4.0.2 MASS_7.3-51.6 mnormt_2.0.1 statmod_1.4.34
 [6] pbivnorm_0.6.0 arm_1.11-1 lattice_0.20-41 mi_1.0 minqa_1.2.4
[11] tools_4.0.2 matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2 tmvnsim_1.0-2
[16] nlme_3.1-148 coda_0.19-3 abind_1.4-5 lme4_1.1-23 lavaan_0.6-6
[21] Matrix_1.2-18 nloptr_1.2.2.2 compiler_4.0.2 boot_1.3-25 stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


Omega Total 0.82

Schmid Leiman Factor loadings greater than 0.2
      g   F1   F2   F3   h2   u2   p2
V1 0.72 0.35           0.64 0.36 0.81
V2 0.63 0.31           0.49 0.51 0.81
V3 0.54 0.26           0.36 0.64 0.81
V4 0.56      0.42      0.49 0.51 0.64
V5 0.48      0.36      0.36 0.64 0.64
V6 0.40      0.30      0.25 0.75 0.64
V7 0.49           0.50 0.49 0.51 0.49
V8 0.42           0.43 0.36 0.64 0.49
V9 0.28           0.29 0.16 0.84 0.49

With eigenvalues of:
   g   F1   F2   F3
2.41 0.28 0.40 0.52

general/max 4.67 max/min = 1.82
mean percent general = 0.65 with sd = 0.14 and cv of 0.21
Explained Common Variance of the general factor = 0.67

The degrees of freedom are 12 and the fit is 0
The root mean square of the residuals is 0
The df corrected root mean square of the residuals is 0

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 0.23
The root mean square of the residuals is 0.07
The df corrected root mean square of the residuals is 0.08

Measures of factor score adequacy
                                                 g    F1    F2    F3
Correlation of scores with factors            0.86  0.47  0.57  0.64
Multiple R square of scores with factors      0.74  0.22  0.33  0.41
Minimum correlation of factor score estimates 0.47 -0.56 -0.35 -0.18

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST.

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found, and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
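The population matrix printed above is what a three factor model (two X factors, one Y factor) implies. As a minimal base R sketch, the loading matrices `fx` and `fy` and the factor intercorrelation matrix `Phi` below are assumptions: they are not shown in this section, and were chosen only because they reproduce the printed correlations and reliabilities.

```r
# Assumed loadings and factor correlations (hypothetical values chosen to
# match the printed population matrix; not taken from this section).
vars <- c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA")
fx  <- matrix(c(0.9, 0.8, 0.6, rep(0, 4), 0.6, 0.8, -0.7), ncol = 2)
fy  <- matrix(c(0.6, 0.5, 0.4), ncol = 1)
Phi <- matrix(c(1, 0, 0.7,
                0, 1, 0.7,
                0.7, 0.7, 1), ncol = 3)

# Combine the X and Y measurement models into one 8 x 3 loading matrix.
Lambda <- matrix(0, 8, 3, dimnames = list(vars, c("X1", "X2", "Y1")))
Lambda[1:5, 1:2] <- fx
Lambda[6:8, 3]   <- fy

# Model implied matrix: Lambda Phi Lambda'. Its diagonal holds the
# communalities (the "reliability" values); replace it with 1s to obtain
# the population correlation matrix.
R <- Lambda %*% Phi %*% t(Lambda)
reliability <- diag(R)
diag(R) <- 1

round(R, 2)
round(reliability, 2)
```

Under these assumptions the correlation of V with gpa is 0.9 × 0.7 × 0.6 = 0.378, which rounds to the 0.38 shown in the table.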

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method = minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the X set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the Y set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are 56 and the empirical chi square function was 6930.29
The degrees of freedom for the model are 7 and the empirical chi square function was 21.83
 with prob < 0.0027
The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.04
 with the empirical chi square 21.83 with prob < 0.0027
The total number of observations was 1000 with fitted Chi Square = 2175.06 with prob < 0
Empirical BIC = -26.53
ESABIC = -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ² statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model. (Plot title: "Exploratory Structural Model"; V, Q, A, nach and Anx load on X1 and X2; gpa, Pre and MA load on Y1.)

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.
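The distinction between the two models can be written compactly. The notation below (λ for loadings, ξ for the single factor, ε for the unique part) is introduced here only for illustration and assumes standardized variables with a unit variance factor:

```latex
% Congeneric: each variable has its own loading on the one factor
\[
x_i = \lambda_i \xi + \varepsilon_i, \qquad \operatorname{Var}(\xi) = 1,
\qquad r_{ij} = \lambda_i \lambda_j \quad (i \neq j)
\]
% Tau equivalent: the loadings are constrained to be equal
\[
\lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = \lambda
\quad \Longrightarrow \quad r_{ij} = \lambda^2 \quad (i \neq j)
\]
```

With four variables the congeneric model estimates four loadings while the tau equivalent model estimates one, so the two models differ by three degrees of freedom.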

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "a"       NA
[3,] "X1->V3"  "a"       NA
[4,] "X1->V4"  "a"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11
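The Df, AIC and BIC lines of the summary can be checked by hand. The sketch below assumes that the degrees of freedom are the number of distinct variances and covariances minus the number of free parameters, that AIC is the χ² plus twice the number of free parameters, and that BIC is the χ² minus df × log(N). The helper name `sem_fit_indices` is hypothetical; these formulas are offered because they reproduce the printed values, not as a statement of how the sem package computes them internally.

```r
# Hypothetical helper: fit indices for a covariance structure model.
#   chisq  : model chi square
#   n_vars : number of observed variables
#   n_free : number of free parameters
#   n_obs  : number of observations
sem_fit_indices <- function(chisq, n_vars, n_free, n_obs) {
  df <- n_vars * (n_vars + 1) / 2 - n_free   # distinct moments - parameters
  c(df  = df,
    AIC = chisq + 2 * n_free,
    BIC = chisq - df * log(n_obs))
}

# Tau equivalent model: one common loading plus four residual variances.
tau <- sem_fit_indices(6.593496, n_vars = 4, n_free = 5, n_obs = 100)
round(tau, 4)
```

The same counting gives 4 × 5 / 2 − 8 = 2 degrees of freedom for a four variable congeneric model (four loadings and four residual variances).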

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "b"       NA
[3,] "X1->V3"  "c"       NA
[4,] "X1->V4"  "d"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)  #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

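The anova comparison of the congeneric versus tau equivalent model shows that the change in χ² (3.65 for a change of 3 degrees of freedom, p = 0.30) is not significant. That is, although the tau equivalent model is known to be false for these data, with N = 100 it cannot be rejected in favor of the congeneric model.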
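The likelihood ratio test reported by anova can be reproduced from the two model χ² values and their degrees of freedom. This is a minimal base R sketch; the function name `lr_test` is hypothetical and the inputs are the rounded values printed in the anova table.

```r
# Hypothetical helper: chi square difference test for two nested models.
# The restricted model (more df) is compared with the more general one.
lr_test <- function(chisq_restricted, df_restricted,
                    chisq_general, df_general) {
  lr  <- chisq_restricted - chisq_general   # change in chi square
  ddf <- df_restricted - df_general         # change in degrees of freedom
  c(LR = lr, df = ddf, p = pchisq(lr, ddf, lower.tail = FALSE))
}

# Tau equivalent (restricted) versus congeneric (general).
res <- lr_test(6.5935, 5, 2.9417, 2)
round(res, 4)
```

A p value of about 0.30 means that the three extra loadings of the congeneric model do not improve the fit reliably at this sample size.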

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although this model is not correct, with 100 subjects it cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
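The pattern of residuals can be understood with a small base R sketch. The group factor loadings below are the estimates printed above. The loadings of the three factors on a general factor (`gload`) are an assumption: they are not given in this section and were chosen only because they reproduce the printed residuals. An orthogonal model sets every between group correlation to zero, so whatever correlation the higher order structure produces between groups is left over as a residual.

```r
# Group factor loadings, taken from the parameter estimates shown above.
fload <- matrix(0, 9, 3,
                dimnames = list(paste0("V", 1:9), c("F1", "F2", "F3")))
fload[1:3, 1] <- c(0.8, 0.7, 0.6)
fload[4:6, 2] <- c(0.7, 0.6, 0.5)
fload[7:9, 3] <- c(0.7, 0.6, 0.4)

# Assumed (hypothetical) loadings of the three factors on g.
gload <- c(0.9, 0.8, 0.7)
Phi <- gload %o% gload        # factor intercorrelations implied by g
diag(Phi) <- 1

# Correlations implied by the hierarchical structure.
R <- fload %*% Phi %*% t(fload)
diag(R) <- 1

# Correlations implied by the orthogonal three factor model (Phi = I).
implied <- fload %*% t(fload)
diag(implied) <- 1

resid <- R - implied
round(resid, 2)
```

Under these assumptions the residual for V1 and V4 is 0.8 × (0.9 × 0.8) × 0.7 ≈ 0.40, the value in the printed residual matrix, while all within group residuals are zero.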

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)  #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)  #create the sem model
> mod.f3  #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph (Figure 13).

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution. (Plot title: "Structural model"; V1 to V3 load on MR1, V4 to V6 on MR3, and V7 to V9 on MR2, with correlated factors.)

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef function from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups. (Plot title: "Omega".)

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

> stdCoef(sem.bi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g
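Because the general and group factors of a bifactor model are orthogonal, the variance each variable shares with the factors (its R-square, or communality) is simply the sum of its squared standardized loadings, and the residual variance is one minus that sum. The following base R sketch checks this for the first three variables, using the standardized estimates printed above; the object names are illustrative only.

```r
# Standardized loadings on g and on the group factor F1 (from stdCoef).
g_load   <- c(Sentences = 0.7678671, Vocabulary = 0.7909246,
              SentCompletion = 0.7536211)
grp_load <- c(Sentences = 0.4878697, Vocabulary = 0.4523233,
              SentCompletion = 0.4044507)

# Orthogonal factors: communality is the sum of squared loadings.
h2 <- g_load^2 + grp_load^2
u2 <- 1 - h2          # residual (unique) variance, compare with e1, e2, e3

round(cbind(h2, u2), 4)
```

The resulting communalities agree with the R-square values in the summary (0.8276, 0.8302, 0.7315) and the unique variances with e1, e2 and e3.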

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package. (Plot title: "Omega".)

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

> stdCoef(sem.hi,digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
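In the hierarchical model the items have no direct paths from g; their relation to g runs through the group factors. A Schmid-Leiman style decomposition makes the two solutions easier to compare: the implied loading of an item on g is the product of its standardized loading on its factor and that factor's standardized loading on g. The sketch below uses the standardized estimates printed above for the hierarchical model; the object names are illustrative only.

```r
# Standardized loadings of the three factors on g (hierarchical model).
g_to_F <- c(F1 = 0.8220754, F2 = 0.7817998, F3 = 0.8150140)

# Standardized loadings of the items on their own factor.
item_on_F <- c(Sentences = 0.9047111, Vocabulary = 0.9138214,
               SentCompletion = 0.8560764, FirstLetters = 0.8357617,
               FourLetterWords = 0.7971819, Suffixes = 0.7025560,
               LetterSeries = 0.7808129, Pedigrees = 0.7201599,
               LetterGroup = 0.7034902)
group <- rep(c("F1", "F2", "F3"), each = 3)

# Indirect loading on g, and the part of each loading that is
# independent of g (the residualized group factor).
via_g     <- unname(g_to_F[group])
g_loading <- item_on_F * via_g
residual_group_loading <- item_on_F * sqrt(1 - via_g^2)

round(cbind(g = g_loading, group = residual_group_loading), 2)
```

The two columns are proportional within each group (a constraint of the hierarchical model), whereas the bifactor estimates are free to vary, which is one way to see where the six extra bifactor parameters come from.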

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of Holzinger and the 16 variables of Reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object
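The confirmatory values of omega can be approximated by hand from the printed loadings. The sketch below assumes the usual definitions for a pure cluster bifactor model: the total test variance is the sum of all elements of the model implied correlation matrix, ω_h is the share of that variance due to the general factor, and ω_t is the share due to all common factors. Because it works from loadings rounded to two decimals, it is only an approximation to what omegaSem reports.

```r
# Rounded confirmatory loadings from the table above.
g   <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)  # general
grp <- c(0.49, 0.45, 0.40, 0.61, 0.50, 0.39, 0.73, 0.25, 0.41)  # own group factor
u2  <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)  # uniquenesses
factor_of <- rep(1:3, each = 3)   # items 1-3 on F1, 4-6 on F2, 7-9 on F3

# Variance of the total score: each orthogonal factor contributes the
# square of the sum of its loadings, plus the unique variances.
group_sums <- tapply(grp, factor_of, sum)
Vx <- sum(g)^2 + sum(group_sums^2) + sum(u2)

omega_h <- sum(g)^2 / Vx          # general factor saturation
omega_t <- 1 - sum(u2) / Vx       # all common variance
ecv     <- sum(g^2) / (sum(g^2) + sum(grp^2))  # explained common variance

round(c(omega_h = omega_h, omega_t = omega_t, ECV = ecv), 2)
```

These hand computed values round to the 0.79, 0.93 and 0.64 reported for the confirmatory solution.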

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.

Figure 16: Confirmatory Omega structure using omegaSem. (Plot title: "Omega from SEM".)

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application. The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


Omega total for total scores and subscales    0.82 0.74 0.63 0.59
Omega general for total scores and subscales  0.70 0.60 0.40 0.29
Omega group for total scores and subscales    0.12 0.14 0.23 0.30

Yet one more way to treat the hierarchical structure of a data set is to consider hierarchical cluster analysis using the ICLUST algorithm (Figure 11). ICLUST is most appropriate for forming item composites.

Figure 11: A hierarchical cluster analysis of the bifact data set using ICLUST. (Figure not reproduced: a dendrogram titled "Hierarchical cluster analysis of bifact data" joining variables V1 to V9 into clusters C1 to C8.)

4 Exploratory Structural Equation Modeling (ESEM)

Traditional Exploratory Factor Analysis (EFA) examines how latent variables can account for the correlations within a data set. All loadings and cross loadings are found, and rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

 $model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

 $reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
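The population matrix printed above follows directly from the factor model: with a block loading matrix F (X loadings and Y loadings) and factor intercorrelations Φ, the implied correlations are FΦF′ with the diagonal set to 1, and the diagonal of FΦF′ gives the reliabilities. The following base R sketch illustrates that arithmetic. The values of fx, fy, and Phi are not shown in this section; they are assumed here because they reproduce the printed matrix.

```r
# Assumed loadings of the five X variables on two latent predictors
fx <- matrix(c(0.9, 0.8, 0.6, 0.0,  0.0,
               0.0, 0.0, 0.6, 0.8, -0.7), ncol = 2,
             dimnames = list(c("V", "Q", "A", "nach", "Anx"), c("X1", "X2")))
# Assumed loadings of the three Y variables on one latent criterion
fy <- matrix(c(0.6, 0.5, 0.4), ncol = 1,
             dimnames = list(c("gpa", "Pre", "MA"), "Y1"))
# Assumed factor intercorrelations: X1 and X2 uncorrelated, each correlating 0.7 with Y1
Phi <- matrix(c(1.0, 0.0, 0.7,
                0.0, 1.0, 0.7,
                0.7, 0.7, 1.0), ncol = 3)

# Block loading matrix: X variables load only on X1, X2; Y variables only on Y1
f <- matrix(0, nrow = 8, ncol = 3,
            dimnames = list(c(rownames(fx), rownames(fy)), c("X1", "X2", "Y1")))
f[1:5, 1:2] <- fx
f[6:8, 3] <- fy

R <- f %*% Phi %*% t(f)      # model-implied covariances of the common parts
reliability <- diag(R)       # communalities, reported as population reliabilities
diag(R) <- 1                 # a correlation matrix has a unit diagonal

round(R, 2)
round(reliability, 2)
```

Because A loads on both predictors, its correlation with gpa sums two paths (0.6 × 0.7 × 0.6 twice), which is why it is the largest predictor-criterion correlation in the matrix.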

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the 'X' set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the 'Y' set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are 7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

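Figure 12: An example of an Exploratory Structural Equation Model. (Figure not reproduced: a path diagram titled "Exploratory Structural Model" with V, Q, and A loading on X1, A, nach, and Anx loading on X2, gpa, Pre, and MA loading on Y1, and paths of 0.7 from X1 and X2 to Y1.)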

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function, which requires Rgraphviz, or the structure.diagram or structure.sem functions, which do not, may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)  #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

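The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65) is not significant given the change in degrees of freedom (3), with p = 0.30. With only 100 subjects, the tau equivalent model cannot be rejected even though it is known to be false.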
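The likelihood ratio test reported by anova can be checked by hand: the difference between the two model χ2 values is itself distributed as χ2, with degrees of freedom equal to the difference in model degrees of freedom. A minimal base R sketch, using only the values printed in the two summaries above (the variable names are illustrative):

```r
# Chi-square statistics and degrees of freedom copied from the summary() output
chisq_tau  <- 6.593496; df_tau  <- 5   # tau equivalent model (equal loadings)
chisq_cong <- 2.941678; df_cong <- 2   # congeneric model (free loadings)

# The more restricted model can never fit better, so the difference is non-negative
lr_chisq <- chisq_tau - chisq_cong
lr_df    <- df_tau - df_cong

# Upper-tail probability of the chi-square difference
p_value <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

round(c(LR.Chisq = lr_chisq, Df = lr_df, p = p_value), 4)
```

The resulting probability is well above conventional significance levels, which is what the Pr(>Chisq) column of the anova table reports.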

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
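The pattern in these residuals can be reproduced from the implied correlation matrices. With orthogonal factors, the model-implied correlation between variables that load on different factors is exactly zero, so every observed between-group correlation is left in the residuals. The base R sketch below uses the loadings estimated above. The factor intercorrelations are an assumption: they are taken to be products of general factor loadings of 0.9, 0.8, and 0.7, values chosen because they reproduce the printed residuals.

```r
# Loadings from the sem.f3 estimates, rounded (V1-V3 on MR1, V4-V6 on MR3, V7-V9 on MR2)
loadings <- c(0.8, 0.7, 0.6,  0.7, 0.6, 0.5,  0.7, 0.6, 0.4)
group <- rep(1:3, each = 3)
f <- matrix(0, nrow = 9, ncol = 3,
            dimnames = list(paste0("V", 1:9), c("MR1", "MR3", "MR2")))
f[cbind(1:9, group)] <- loadings      # simple structure: one loading per variable

# Assumed higher-order loadings of the three factors on g
g <- c(0.9, 0.8, 0.7)
Phi <- outer(g, g)                    # factor intercorrelations implied by g
diag(Phi) <- 1

# Model-implied correlation matrix for a given factor intercorrelation matrix
implied <- function(f, Phi) {
  R <- f %*% Phi %*% t(f)
  diag(R) <- 1
  R
}

R_oblique    <- implied(f, Phi)       # structure with correlated factors
R_orthogonal <- implied(f, diag(3))   # what the varimax-based model can reproduce

# What an orthogonal model leaves unexplained
resid_orth <- R_oblique - R_orthogonal
round(resid_orth, 2)
```

The within-group blocks of the result are zero and the between-group blocks equal the between-group correlations themselves, which is the pattern seen in the residual matrix above.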

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)  #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)  #create the sem model
> mod.f3  #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor oblique solution. (Figure not reproduced: a path diagram titled "Structural model" with V1 to V3 loading on MR1, V4 to V6 on MR3, V7 to V9 on MR2, and correlations among the three factors.)

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef function from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone, used in the SAS manual for PROC CALIS, and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
        Sentences        Vocabulary   Sent.Completion     First.Letters Four.Letter.Words
           0.8276            0.8302            0.7315            0.7472            0.6126
         Suffixes     Letter.Series         Pedigrees      Letter.Group
           0.4824            0.8503            0.4996            0.4483

 Parameter Estimates
                    Estimate  Std Error  z value    Pr(>|z|)
Sentences           0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary          0.7909248 0.06969232 11.3488087 7.518003e-30
Sent.Completion     0.7536211 0.07113218 10.5946585 3.154903e-26
First.Letters       0.6083814 0.07063841  8.6126138 7.141338e-18
Four.Letter.Words   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes            0.5717903 0.07157752  7.9884057 1.366950e-15
Letter.Series       0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees           0.6623314 0.07003035  9.4577757 3.145633e-21
Letter.Group        0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences         0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary        0.4523234 0.08353995  5.4144562 6.147524e-08
F1Sent.Completion   0.4044507 0.08727334  4.6342988 3.581494e-06
F2First.Letters     0.6140531 0.08471145  7.2487623 4.205973e-13
F2Four.Letter.Words 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes          0.3943208 0.07805383  5.0519075 4.374195e-07
F3Letter.Series     0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees         0.2468417 0.08677536  2.8446053 4.446649e-03
F3Letter.Group      0.4091495 0.11352380  3.6040854 3.132541e-04
e1                  0.1723633 0.03405646  5.0611045 4.168346e-07
e2                  0.1698419 0.03001233  5.6590697 1.521958e-08
e3                  0.2684749 0.03316228  8.0957909 5.689350e-16
e4                  0.2528108 0.07942835  3.1828791 1.458185e-03
e5                  0.3873510 0.06317399  6.1314949 8.705712e-10
e6                  0.5175679 0.05955079  8.6912013 3.586269e-18
e7                  0.1496709 0.21861502  0.6846325 4.935759e-01
e8                  0.5003855 0.05956551  8.4005902 4.442400e-17
e9                  0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences           Sentences <--- g
Vocabulary          Vocabulary <--- g
Sent.Completion     Sent.Completion <--- g
First.Letters       First.Letters <--- g
Four.Letter.Words   Four.Letter.Words <--- g
Suffixes            Suffixes <--- g
Letter.Series       Letter.Series <--- g
Pedigrees           Pedigrees <--- g
Letter.Group        Letter.Group <--- g
F1Sentences         Sentences <--- F1
F1Vocabulary        Vocabulary <--- F1
F1Sent.Completion   Sent.Completion <--- F1
F2First.Letters     First.Letters <--- F2
F2Four.Letter.Words Four.Letter.Words <--- F2
F2Suffixes          Suffixes <--- F2
F3Letter.Series     Letter.Series <--- F3
F3Pedigrees         Pedigrees <--- F3
F3Letter.Group      Letter.Group <--- F3
e1                  Sentences <--> Sentences
e2                  Vocabulary <--> Vocabulary
e3                  Sent.Completion <--> Sent.Completion
e4                  First.Letters <--> First.Letters
e5                  Four.Letter.Words <--> Four.Letter.Words
e6                  Suffixes <--> Suffixes
e7                  Letter.Series <--> Letter.Series
e8                  Pedigrees <--> Pedigrees
e9                  Letter.Group <--> Letter.Group

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups. (Figure not reproduced: an omega path diagram with g and group factors F1, F2, and F3.)

> stdCoef(sem.bi,digits=2)

                       Std. Estimate
1  Sentences           0.7678671 Sentences <--- g
2  Vocabulary          0.7909246 Vocabulary <--- g
3  Sent.Completion     0.7536211 Sent.Completion <--- g
4  First.Letters       0.6083814 First.Letters <--- g
5  Four.Letter.Words   0.5973349 Four.Letter.Words <--- g
6  Suffixes            0.5717900 Suffixes <--- g
7  Letter.Series       0.5668950 Letter.Series <--- g
8  Pedigrees           0.6623317 Pedigrees <--- g
9  Letter.Group        0.5299523 Letter.Group <--- g
10 F1Sentences         0.4878697 Sentences <--- F1
11 F1Vocabulary        0.4523233 Vocabulary <--- F1
12 F1Sent.Completion   0.4044507 Sent.Completion <--- F1
13 F2First.Letters     0.6140531 First.Letters <--- F2
14 F2Four.Letter.Words 0.5058063 Four.Letter.Words <--- F2
15 F2Suffixes          0.3943206 Suffixes <--- F2
16 F3Letter.Series     0.7272957 Letter.Series <--- F3
17 F3Pedigrees         0.2468418 Pedigrees <--- F3
18 F3Letter.Group      0.4091494 Letter.Group <--- F3
19 e1                  0.1723633 Sentences <--> Sentences
20 e2                  0.1698418 Vocabulary <--> Vocabulary
21 e3                  0.2684748 Sent.Completion <--> Sent.Completion
22 e4                  0.2528108 First.Letters <--> First.Letters
23 e5                  0.3873510 Four.Letter.Words <--> Four.Letter.Words
24 e6                  0.5175675 Suffixes <--> Suffixes
25 e7                  0.1496710 Letter.Series <--> Letter.Series
26 e8                  0.5003859 Pedigrees <--> Pedigrees
27 e9                  0.5517473 Letter.Group <--> Letter.Group
28                     1.0000000 F1 <--> F1
29                     1.0000000 F2 <--> F2
30                     1.0000000 F3 <--> F3
31                     1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
               F1                F2                F3         Sentences        Vocabulary
           0.6758            0.6112            0.6642            0.8185            0.8351
  Sent.Completion     First.Letters Four.Letter.Words          Suffixes     Letter.Series
           0.7329            0.6985            0.6355            0.4936            0.6097
        Pedigrees      Letter.Group
           0.5186            0.4949

 Parameter Estimates
                    Estimate  Std Error  z value  Pr(>|z|)
gF1                 1.4438115 0.25653564 5.628113 1.821922e-08
gF2                 1.2538296 0.21136562 5.932041 2.991910e-09
gF3                 1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences         0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary        0.5203104 0.06338431 8.208820 2.233734e-16
F1Sent.Completion   0.4874316 0.06081528 8.014954 1.101786e-15
F2First.Letters     0.5211221 0.06106205 8.534304 1.410015e-17
F2Four.Letter.Words 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes          0.4380644 0.05595794 7.828458 4.938915e-15
F3Letter.Series     0.4524352 0.06596903 6.858297 6.968649e-12
F3Pedigrees         0.4172903 0.06215816 6.713363 1.901887e-11
F3Letter.Group      0.4076312 0.06131399 6.648258 2.965820e-11
e1                  0.1814979 0.02847741 6.373397 1.848862e-10
e2                  0.1649304 0.02776938 5.939292 2.862558e-09
e3                  0.2671331 0.03336340 8.006771 1.177597e-15
e4                  0.3015024 0.05102191 5.909274 3.436179e-09
e5                  0.3645010 0.05263547 6.925008 4.359513e-12
e6                  0.5064150 0.05962608 8.493180 2.010593e-17
e7                  0.3903313 0.05933649 6.578268 4.759607e-11
e8                  0.4813697 0.06224844 7.733041 1.050075e-14
e9                  0.5051017 0.06332869 7.975875 1.513055e-15

gF1                 F1 <--- g
gF2                 F2 <--- g
gF3                 F3 <--- g
F1Sentences         Sentences <--- F1
F1Vocabulary        Vocabulary <--- F1
F1Sent.Completion   Sent.Completion <--- F1
F2First.Letters     First.Letters <--- F2
F2Four.Letter.Words Four.Letter.Words <--- F2
F2Suffixes          Suffixes <--- F2
F3Letter.Series     Letter.Series <--- F3
F3Pedigrees         Pedigrees <--- F3
F3Letter.Group      Letter.Group <--- F3
e1                  Sentences <--> Sentences
e2                  Vocabulary <--> Vocabulary
e3                  Sent.Completion <--> Sent.Completion
e4                  First.Letters <--> First.Letters
e5                  Four.Letter.Words <--> Four.Letter.Words
e6                  Suffixes <--> Suffixes
e7                  Letter.Series <--> Letter.Series
e8                  Pedigrees <--> Pedigrees
e9                  Letter.Group <--> Letter.Group

 Iterations =  54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package. (Figure not reproduced: an omega path diagram with g, the lower level factors F1, F2, and F3, and the nine variables.)

> stdCoef(sem.hi,digits=2)

                       Std. Estimate
1  gF1                 0.8220754 F1 <--- g
2  gF2                 0.7817998 F2 <--- g
3  gF3                 0.8150140 F3 <--- g
4  F1Sentences         0.9047111 Sentences <--- F1
5  F1Vocabulary        0.9138214 Vocabulary <--- F1
6  F1Sent.Completion   0.8560764 Sent.Completion <--- F1
7  F2First.Letters     0.8357617 First.Letters <--- F2
8  F2Four.Letter.Words 0.7971819 Four.Letter.Words <--- F2
9  F2Suffixes          0.7025560 Suffixes <--- F2
10 F3Letter.Series     0.7808129 Letter.Series <--- F3
11 F3Pedigrees         0.7201599 Pedigrees <--- F3
12 F3Letter.Group      0.7034902 Letter.Group <--- F3
13 e1                  0.1814979 Sentences <--> Sentences
14 e2                  0.1649304 Vocabulary <--> Vocabulary
15 e3                  0.2671331 Sent.Completion <--> Sent.Completion
16 e4                  0.3015024 First.Letters <--> First.Letters
17 e5                  0.3645010 Four.Letter.Words <--> Four.Letter.Words
18 e6                  0.5064151 Suffixes <--> Suffixes
19 e7                  0.3903313 Letter.Series <--> Letter.Series
20 e8                  0.4813697 Pedigrees <--> Pedigrees
21 e9                  0.5051016 Letter.Group <--> Letter.Group
22                     0.3241920 F1 <--> F1
23                     0.3887891 F2 <--> F2
24                     0.3357521 F3 <--> F3
25                     1.0000000 g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
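The "6 more parameters" can be verified by counting. Nine variables supply 45 unique variances and covariances; each model's degrees of freedom are that number minus its free parameters. The sketch below does the counting in base R and repeats the likelihood ratio test from the χ2 values printed above. The parameter counts are read off the two Parameter Estimates tables, and the variable names are illustrative.

```r
n_vars <- 9
n_moments <- n_vars * (n_vars + 1) / 2   # unique elements of the covariance matrix

# Bifactor: 9 general loadings + 9 group loadings + 9 residual variances
free_bi <- 9 + 9 + 9
# Hierarchical: 3 loadings of the group factors on g + 9 loadings + 9 residual variances
free_hi <- 3 + 9 + 9

df_bi <- n_moments - free_bi
df_hi <- n_moments - free_hi

# Likelihood ratio test using the model chi-squares from the two summaries
lr_chisq <- 38.1963 - 24.2163
lr_df    <- df_hi - df_bi
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

c(df_bifactor = df_bi, df_hierarchical = df_hi, extra_parameters = free_bi - free_hi)
round(c(LR.Chisq = lr_chisq, Df = lr_df, p = p_value), 4)
```

The hierarchical model is nested within the bifactor model because it constrains each variable's general factor loading to be the product of its group loading and that group's loading on g, which is why the difference test is appropriate here.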

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first performs an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                     g  F1*  F2*  F3*   h2   u2   p2
Sentences         0.71 0.56           0.82 0.18 0.61
Vocabulary        0.73 0.55           0.84 0.16 0.63
Sent.Completion   0.68 0.52           0.74 0.26 0.63
First.Letters     0.65      0.56      0.73 0.27 0.57
Four.Letter.Words 0.62      0.49      0.63 0.37 0.61
Suffixes          0.56      0.41      0.50 0.50 0.63
Letter.Series     0.59           0.62 0.73 0.27 0.48
Pedigrees         0.58 0.24      0.34 0.51 0.49 0.66
Letter.Group      0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g  F1*  F2*  F3*
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g  F1*  F2*  F3*
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                     g  F1*  F2*  F3*   h2   u2   p2
Sentences         0.77 0.49           0.82 0.18 0.72
Vocabulary        0.79 0.45           0.83 0.17 0.75
Sent.Completion   0.75 0.40           0.73 0.27 0.77
First.Letters     0.61      0.61      0.74 0.26 0.50
Four.Letter.Words 0.60      0.50      0.61 0.39 0.59
Suffixes          0.57      0.39      0.48 0.52 0.68
Letter.Series     0.57           0.73 0.85 0.15 0.38
Pedigrees         0.66           0.25 0.50 0.50 0.87
Letter.Group      0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g  F1*  F2*  F3*
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g   F1*  F2*  F3*
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
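The confirmatory values of ωh and ωt can be approximated by hand from the printed loadings. The following is a minimal sketch in base R, assuming that the total variance is taken as the model-implied variance of the sum score (general plus group plus unique variance); that choice is an assumption of this illustration, and because the loadings are rounded to two decimals the results are only approximate.

```r
# Standardized loadings from the confirmatory solution printed above
# (rounded to two decimals).
g <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
group <- list(F1 = c(0.49, 0.45, 0.40),
              F2 = c(0.61, 0.50, 0.39),
              F3 = c(0.73, 0.25, 0.41))
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)

# Variance of the sum score attributable to each source.
general_var <- sum(g)^2
group_var   <- sum(sapply(group, function(l) sum(l)^2))
total_var   <- general_var + group_var + sum(u2)  # assumed model-implied total

omega_h <- general_var / total_var
omega_t <- (general_var + group_var) / total_var
print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

Under these assumptions the two ratios agree, to two decimals, with the Omega Hierarchical and Omega Total values reported for the confirmatory model.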

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago Press, Chicago, Ill.


rotation is done to some approximation of simple structure. Traditional Confirmatory Factor Analysis (CFA) tests such models by fitting just a limited number of loadings and typically does not allow any (or many) cross loadings. Structural Equation Modeling then applies two such measurement models, one to a set of X variables, another to a set of Y variables, and then tries to estimate the correlation between these two sets of latent variables. (Some SEM procedures estimate all the parameters from the same model, thus making the loadings in set Y affect those in set X.) It is possible to do a similar, exploratory modeling (ESEM) by conducting two Exploratory Factor Analyses, one in set X, one in set Y, and then finding the correlations of the X factors with the Y factors, as well as the correlations of the Y variables with the X factors and the X variables with the Y factors.

Consider the simulated data set of three ability variables, two motivational variables, and three outcome variables:

Call: sim.structural(fx = fx, Phi = Phi, fy = fy)

$model (Population correlation matrix)
        V    Q     A  nach   Anx   gpa   Pre    MA
V    1.00 0.72  0.54  0.00  0.00  0.38  0.32  0.25
Q    0.72 1.00  0.48  0.00  0.00  0.34  0.28  0.22
A    0.54 0.48  1.00  0.48 -0.42  0.50  0.42  0.34
nach 0.00 0.00  0.48  1.00 -0.56  0.34  0.28  0.22
Anx  0.00 0.00 -0.42 -0.56  1.00 -0.29 -0.24 -0.20
gpa  0.38 0.34  0.50  0.34 -0.29  1.00  0.30  0.24
Pre  0.32 0.28  0.42  0.28 -0.24  0.30  1.00  0.20
MA   0.25 0.22  0.34  0.22 -0.20  0.24  0.20  1.00

$reliability (population reliability)
   V    Q    A nach  Anx  gpa  Pre   MA
0.81 0.64 0.72 0.64 0.49 0.36 0.25 0.16
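The population matrix above is simply the loading matrix pre- and post-multiplying the factor intercorrelations. The following base R sketch rebuilds it without psych; the values of fx, fy, and Phi are not shown in this part of the text and are assumed here because they reproduce the printed correlations and reliabilities.

```r
# Assumed generating values (inferred from the printed matrix).
fx <- matrix(c(0.9, 0.8, 0.6, 0.0,  0.0,    # first X factor (ability)
               0.0, 0.0, 0.6, 0.8, -0.7),   # second X factor (motivation)
             ncol = 2)
fy  <- c(0.6, 0.5, 0.4)                     # single Y factor
Phi <- matrix(c(1.0, 0.0, 0.7,
                0.0, 1.0, 0.7,
                0.7, 0.7, 1.0), ncol = 3)   # correlations among X1, X2, Y1

# Stack the X and Y loadings into one 8 x 3 pattern matrix.
loadings <- rbind(cbind(fx, 0), cbind(matrix(0, 3, 2), fy))
dimnames(loadings) <- list(c("V", "Q", "A", "nach", "Anx", "gpa", "Pre", "MA"),
                           c("X1", "X2", "Y1"))

R <- loadings %*% Phi %*% t(loadings)   # model-implied covariances
reliability <- diag(R)                  # communalities before setting diag to 1
diag(R) <- 1
print(round(R, 2))
print(round(reliability, 2))
```

Note that A loads on both X factors, which is why it correlates with the motivational variables while V and Q do not.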

We can fit this by using the esem function and then draw the solution (see Figure 12) using the esem.diagram function (which is normally called automatically by esem).

Exploratory Structural Equation Modeling Analysis using method =  minres
Call: esem(r = gre.gpa$model, varsX = 1:5, varsY = 6:8, nfX = 2, nfY = 1,
    n.obs = 1000, plot = FALSE)

For the X set:
        X1    X2
V     0.91 -0.06
Q     0.81 -0.05
A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the Y set:
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets.
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are  56  and the empirical chi square function was  6930.29
The degrees of freedom for the model are  7  and the empirical chi square function was  21.83
 with prob <  0.0027

The root mean square of the residuals (RMSR) is  0.02
The df corrected root mean square of the residuals is  0.04
 with the empirical chi square  21.83  with prob <  0.0027
The total number of observations was  1000  with fitted Chi Square =  2175.06  with prob <  0

Empirical BIC =  -26.53
ESABIC =  -4.29
Fit based upon off diagonal values = 1

To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11
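The degrees of freedom, AIC, and BIC in this summary can be checked by hand. The sketch below is a base R illustration, not the sem package's own code: fit_indices is a hypothetical helper, and the formulas (AIC = χ2 + 2k, BIC = χ2 − df·log(n)) are assumed because they reproduce the printed values for this model.

```r
# Hypothetical helper: fit statistics for a model with chi-square `chisq`,
# `p` observed variables, `k` free parameters, and sample size `n`.
fit_indices <- function(chisq, p, k, n) {
  moments <- p * (p + 1) / 2          # distinct variances and covariances
  df <- moments - k
  c(df      = df,
    p_value = pchisq(chisq, df, lower.tail = FALSE),
    AIC     = chisq + 2 * k,          # assumed form, matches the output above
    BIC     = chisq - df * log(n))    # assumed form, matches the output above
}

# Tau equivalent model: one common loading plus four error variances.
tau <- fit_indices(chisq = 6.593496, p = 4, k = 5, n = 100)
print(round(tau, 4))
```

With four variables there are 10 distinct moments, so the five free parameters of the tau equivalent model leave 5 degrees of freedom.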

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)  #test the difference between the two models
LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

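The anova comparison of the congeneric versus tau equivalent model shows that, in this sample of 100 cases, the change in χ2 (3.65) is not significant given the change in degrees of freedom (3), p = 0.30. Although the tau equivalent hypothesis is false in the population, it cannot be rejected here.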
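The likelihood ratio test reported by anova is just the difference of the two model χ2 values referred to a χ2 distribution on the difference in degrees of freedom. A minimal base R check, using the values printed in the two summaries above:

```r
# Chi-square and df for the nested models, copied from the summaries above.
chisq_tau  <- 6.593496; df_tau  <- 5   # restricted model (equal loadings)
chisq_cong <- 2.941678; df_cong <- 2   # less restricted model (free loadings)

lr    <- chisq_tau - chisq_cong        # likelihood ratio chi-square
lr_df <- df_tau - df_cong
p     <- pchisq(lr, lr_df, lower.tail = FALSE)
print(round(c(LR = lr, df = lr_df, p = p), 4))
```

The three extra degrees of freedom correspond to the three equality constraints placed on the four loadings.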

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
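Why the between-group residuals are so large can be seen directly: an orthogonal model implies zero correlation between variables that load on different factors, so each between-group residual is just the correlation itself. The base R sketch below assumes the population loadings that generated bifact (group loadings of .8, .7, .6; .7, .6, .5; .7, .6, .4 and g loadings of .9, .8, .7). These values are not given in this part of the text; they are assumed because they reproduce the estimates and residuals printed above.

```r
# Assumed population values: group-factor loadings for V1..V9 and the
# loadings of the three group factors on g.
fload <- matrix(0, nrow = 9, ncol = 3,
                dimnames = list(paste0("V", 1:9), paste0("F", 1:3)))
fload[1:3, 1] <- c(0.8, 0.7, 0.6)
fload[4:6, 2] <- c(0.7, 0.6, 0.5)
fload[7:9, 3] <- c(0.7, 0.6, 0.4)
gload <- c(0.9, 0.8, 0.7)

# Correlations among the group factors are produced entirely by g.
Phi <- gload %o% gload
diag(Phi) <- 1

R <- fload %*% Phi %*% t(fload)    # hierarchical population correlations
diag(R) <- 1
implied <- fload %*% t(fload)      # orthogonal three factor model (Phi = I)
diag(implied) <- 1

resid <- R - implied
print(round(resid, 2))
```

Allowing the factors to correlate puts Phi back into the implied matrix, which is what removes these residuals.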

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)  #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)  #create the sem model
> mod.f3  #show it
      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27 Sentences <--- g
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30 Vocabulary <--- g
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26 SentCompletion <--- g
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18 FirstLetters <--- g
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17 FourLetterWords <--- g
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15 Suffixes <--- g
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15 LetterSeries <--- g
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21 Pedigrees <--- g
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13 LetterGroup <--- g
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09 Sentences <--- F1
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08 Vocabulary <--- F1
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06 SentCompletion <--- F1
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13 FirstLetters <--- F2
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10 FourLetterWords <--- F2
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07 Suffixes <--- F2
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06 LetterSeries <--- F3
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03 Pedigrees <--- F3
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04 LetterGroup <--- F3
e1                0.1723633 0.03405646  5.0611045 4.168346e-07 Sentences <--> Sentences
e2                0.1698419 0.03001233  5.6590697 1.521958e-08 Vocabulary <--> Vocabulary
e3                0.2684749 0.03316228  8.0957909 5.689350e-16 SentCompletion <--> SentCompletion
e4                0.2528108 0.07942835  3.1828791 1.458185e-03 FirstLetters <--> FirstLetters
e5                0.3873510 0.06317399  6.1314949 8.705712e-10 FourLetterWords <--> FourLetterWords
e6                0.5175679 0.05955079  8.6912013 3.586269e-18 Suffixes <--> Suffixes
e7                0.1496709 0.21861502  0.6846325 4.935759e-01 LetterSeries <--> LetterSeries
e8                0.5003855 0.05956551  8.4005902 4.442400e-17 Pedigrees <--> Pedigrees
e9                0.5517474 0.08455914  6.5249884 6.800680e-11 LetterGroup <--> LetterGroup

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)
                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

 Iterations =  54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi,digits=2)
                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)
LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
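The "6 more parameters" can be verified by counting. The sketch below, in base R, assumes that in both models all factor variances (or residual factor variances) are fixed at 1, which is consistent with the degrees of freedom reported above; the χ2 values are copied from the two summaries.

```r
p <- 9                          # observed variables
moments <- p * (p + 1) / 2      # 45 distinct variances and covariances

# Free parameters (factor variances assumed fixed at 1 in both models).
k_bifactor     <- 9 + 9 + 9     # g loadings, group loadings, error variances
k_hierarchical <- 3 + 9 + 9     # g -> group paths, group loadings, error variances

df_bi <- moments - k_bifactor
df_hi <- moments - k_hierarchical

# Likelihood ratio test of the hierarchical model against the bifactor model.
lr      <- 38.1963 - 24.2163
p_value <- pchisq(lr, df_hi - df_bi, lower.tail = FALSE)
print(c(df_bi = df_bi, df_hi = df_hi, LR = lr, p = round(p_value, 5)))
```

The hierarchical model is nested within the bifactor model because it constrains each item's g loading to be the product of its group loading and the group factor's loading on g.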

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)
LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first runs an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than 0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max 3.73   max/min = 1.34
mean percent general = 0.6 with sd = 0.05 and cv of 0.09
Explained Common Variance of the general factor = 0.6

The degrees of freedom are 12 and the fit is 0.01
The number of observations was 213 with Chi Square = 2.98 with prob < 1
The root mean square of the residuals is 0.01
The df corrected root mean square of the residuals is 0.01
RMSEA index = 0 and the 10 % confidence intervals are 0 0
BIC = -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 1.48
The number of observations was 213 with Chi Square = 306.71 with prob < 3.3e-49
The root mean square of the residuals is 0.14
The df corrected root mean square of the residuals is 0.16
RMSEA index = 0.22 and the 10 % confidence intervals are 0.199 0.244
BIC = 161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

The following analyses were done using the lavaan package

Omega Hierarchical from a confirmatory model using sem = 0.79
Omega Total from a confirmatory model using sem = 0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are 18 and the fit is 24.33052 with p = 0.1444947
general/max 4.92   max/min = 1.3
mean percent general = 0.65 with sd = 0.15 and cv of 0.23
Explained Common Variance of the general factor = 0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models (0.74 versus 0.79). The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
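To make the confirmatory ωh and ωt values concrete, the following is a minimal sketch in base R that recomputes them from the rounded standardized loadings printed in the confirmatory part of the omegaSem output. It assumes that the total variance is the model-implied variance of the unit-weighted sum of the nine variables; the variable names are illustrative and this is not the code that omegaSem itself runs.

```r
# Rounded standardized loadings from the confirmatory part of the omegaSem output.
g  <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)  # general factor
F1 <- c(0.49, 0.45, 0.40)  # Sentences, Vocabulary, SentCompletion
F2 <- c(0.61, 0.50, 0.39)  # FirstLetters, FourLetterWords, Suffixes
F3 <- c(0.73, 0.25, 0.41)  # LetterSeries, Pedigrees, LetterGroup
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)  # uniquenesses

# Model-implied variance of the unit-weighted total score (assumption, see text).
general <- sum(g)^2                              # variance due to g
group   <- sum(F1)^2 + sum(F2)^2 + sum(F3)^2     # variance due to the group factors
total   <- general + group + sum(u2)

omega_h <- general / total             # proportion due to the general factor
omega_t <- (general + group) / total   # proportion due to all common factors
print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

With these rounded inputs the sketch gives values that agree, to two decimals, with the 0.79 and 0.93 reported for the confirmatory model. Because every item loads on exactly one group factor, none of the common variance is shared across group factors, which is why ωh is larger here than in the exploratory solution.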

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.


> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago Press, Chicago, Ill.

Page 29:

A     0.53  0.57
nach -0.10  0.81
Anx   0.08 -0.71

For the Y set
     Y1
gpa 0.6
Pre 0.5
MA  0.4

Correlations between the X and Y sets
     X1   X2   Y1
X1 1.00 0.19 0.68
X2 0.19 1.00 0.67
Y1 0.68 0.67 1.00

The degrees of freedom for the null model are 56 and the empirical chi square function was 6930.29
The degrees of freedom for the model are 7 and the empirical chi square function was 21.83
with prob < 0.0027
The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.04
with the empirical chi square 21.83 with prob < 0.0027
The total number of observations was 1000 with fitted Chi Square = 2175.06 with prob < 0
Empirical BIC = -26.53
ESABIC = -4.29
Fit based upon off diagonal values = 1
To see the item loadings for the X and Y sets combined, and the associated fa output, print with short=FALSE.

5 Confirmatory models

Although the exploratory models shown above do estimate the goodness of fit of the model and compare the residual matrix to a zero matrix using a χ2 statistic, they estimate more parameters than are necessary if there is indeed a simple structure, and they do not allow for tests of competing models. The sem function in the sem package by John Fox allows for confirmatory tests. The interested reader is referred to the sem manual for more detail (Fox et al., 2013).

Figure 12: An example of an Exploratory Structural Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.
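The difference between the two models can be seen in how many parameters each one frees. The sketch below, in base R, counts degrees of freedom as the number of distinct variances and covariances minus the number of free parameters, assuming four observed variables and a factor variance fixed to 1. The helper model_df is hypothetical and is not a function of psych or sem.

```r
# Degrees of freedom of a covariance structure model:
# distinct variances and covariances minus free parameters.
# model_df is a hypothetical helper, not part of psych or sem.
model_df <- function(n_vars, n_free) {
  n_vars * (n_vars + 1) / 2 - n_free
}

# Four variables, one factor whose variance is fixed to 1.
df_tau  <- model_df(4, n_free = 1 + 4)  # one shared loading + four error variances
df_cong <- model_df(4, n_free = 4 + 4)  # four separate loadings + four error variances
print(c(tau = df_tau, congeneric = df_cong))
```

The tau equivalent model is the more constrained of the two, so it keeps three more degrees of freedom than the congeneric model; those three degrees of freedom are what a comparison of the two models tests.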

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau  #show it
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)
 Model Chisquare = 6.593496   Df = 5   Pr(>Chisq) = 0.2526696
 AIC = 16.5935
 BIC = -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations = 11
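The fit indices in this summary can be recomputed by hand, which helps when reading the later, larger models. The following base R sketch assumes the conventions AIC = χ2 + 2t (t free parameters) and BIC = χ2 - df log(N); these are assumptions about how the indices are defined, checked only against the numbers printed above, and the variable names are illustrative.

```r
chisq  <- 6.593496  # model chi square reported by summary(sem.tau)
df     <- 5         # model degrees of freedom
n_free <- 5         # free parameters: a, x1e, x2e, x3e, x4e
n_obs  <- 100       # number of observations passed to sem

aic <- chisq + 2 * n_free                     # assumed definition of AIC
bic <- chisq - df * log(n_obs)                # assumed definition of BIC
p   <- pchisq(chisq, df, lower.tail = FALSE)  # upper tail probability of the chi square
print(round(c(AIC = aic, BIC = bic, p = p), 4))
```

Under these assumptions the three values agree with the reported AIC, BIC, and Pr(>Chisq) for the tau equivalent model.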

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)
 Model Chisquare = 2.941678   Df = 2   Pr(>Chisq) = 0.2297327
 AIC = 18.94168
 BIC = -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations = 12

> anova(sem.cong,sem.tau)  #test the difference between the two models
LR Test for Difference Between Models
         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

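The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65) is not significant given the change in degrees of freedom (3): p = 0.30. With only 100 subjects, the (actually false) tau equivalent model cannot be rejected in favor of the congeneric model.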
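The likelihood ratio test reported by anova for two nested models is the difference of the two model χ2 values, referred to a χ2 distribution whose degrees of freedom are the difference in model degrees of freedom. A minimal base R sketch, using the values printed in the two summaries (the variable names are illustrative):

```r
chisq_tau  <- 6.5935; df_tau  <- 5  # constrained (tau equivalent) model
chisq_cong <- 2.9417; df_cong <- 2  # less constrained (congeneric) model

lr_chisq <- chisq_tau - chisq_cong  # likelihood ratio chi square
lr_df    <- df_tau - df_cong        # difference in degrees of freedom
p_value  <- pchisq(lr_chisq, lr_df, lower.tail = FALSE)
print(round(c(LR = lr_chisq, df = lr_df, p = p_value), 4))
```

The same three lines, with the χ2 and Df values of any other pair of nested models, reproduce the corresponding anova comparison.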

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one  #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)
 Model Chisquare = 21.16848   Df = 27   Pr(>Chisq) = 0.778334
 AIC = 57.16848
 BIC = -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations = 11

> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1
 Model Chisquare = 21.16848   Df = 27
       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations = 11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, factanal could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)
 Model Chisquare = 53.86635   Df = 27   Pr(>Chisq) = 0.001579738
 AIC = 89.86635
 BIC = -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations = 24

> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
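Why the between-group residuals are so large follows from the model itself: with orthogonal factors and each variable loading on a single factor, the model-implied correlation between variables in different groups is exactly zero. The base R sketch below uses the rounded loadings from the parameter estimates (0.8, 0.7, 0.6; 0.7, 0.6, 0.5; 0.7, 0.6, 0.4); the object names lambda and implied are illustrative and are not part of psych or sem.

```r
# Rounded loadings of the orthogonal three factor solution.
lambda <- matrix(0, nrow = 9, ncol = 3,
                 dimnames = list(paste0("V", 1:9), c("MR1", "MR3", "MR2")))
lambda[1:3, "MR1"] <- c(0.8, 0.7, 0.6)
lambda[4:6, "MR3"] <- c(0.7, 0.6, 0.5)
lambda[7:9, "MR2"] <- c(0.7, 0.6, 0.4)

# With uncorrelated factors (Phi = I), the implied correlations are lambda %*% t(lambda),
# with the diagonal set to 1 once the unique variances are added.
implied <- lambda %*% t(lambda)
diag(implied) <- 1
print(round(implied, 2))
```

Every implied correlation between groups is zero, so whatever correlation is observed between, say, V1 and V4 is left entirely in the residual matrix. Allowing the factors to correlate gives the model a way to account for those correlations.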

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)  #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)  #create the sem model
> mod.f3  #show it
      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor, oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)
 Model Chisquare = 24.2163   Df = 18   Pr(>Chisq) = 0.1480685
 AIC = 78.2163
 BIC = -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
     Sentences     Vocabulary SentCompletion   FirstLetters FourLetterWords
        0.8276         0.8302         0.7315         0.7472          0.6126
      Suffixes   LetterSeries      Pedigrees    LetterGroup
        0.4824         0.8503         0.4996         0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27 Sentences <--- g
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30 Vocabulary <--- g
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26 SentCompletion <--- g
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18 FirstLetters <--- g
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17 FourLetterWords <--- g
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15 Suffixes <--- g
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15 LetterSeries <--- g
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21 Pedigrees <--- g
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13 LetterGroup <--- g
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09 Sentences <--- F1
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08 Vocabulary <--- F1
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06 SentCompletion <--- F1
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13 FirstLetters <--- F2
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10 FourLetterWords <--- F2
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07 Suffixes <--- F2
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06 LetterSeries <--- F3
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03 Pedigrees <--- F3
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04 LetterGroup <--- F3
e1                0.1723633 0.03405646  5.0611045 4.168346e-07 Sentences <--> Sentences
e2                0.1698419 0.03001233  5.6590697 1.521958e-08 Vocabulary <--> Vocabulary
e3                0.2684749 0.03316228  8.0957909 5.689350e-16 SentCompletion <--> SentCompletion
e4                0.2528108 0.07942835  3.1828791 1.458185e-03 FirstLetters <--> FirstLetters
e5                0.3873510 0.06317399  6.1314949 8.705712e-10 FourLetterWords <--> FourLetterWords
e6                0.5175679 0.05955079  8.6912013 3.586269e-18 Suffixes <--> Suffixes
e7                0.1496709 0.21861502  0.6846325 4.935759e-01 LetterSeries <--> LetterSeries
e8                0.5003855 0.05956551  8.4005902 4.442400e-17 Pedigrees <--> Pedigrees
e9                0.5517474 0.08455914  6.5249884 6.800680e-11 LetterGroup <--> LetterGroup

 Iterations = 72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)
                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)
 Model Chisquare = 38.1963   Df = 24   Pr(>Chisq) = 0.03310059
 AIC = 80.1963
 BIC = -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
            F1             F2              F3      Sentences     Vocabulary
        0.6758         0.6112          0.6642         0.8185         0.8351
SentCompletion   FirstLetters FourLetterWords       Suffixes   LetterSeries
        0.7329         0.6985          0.6355         0.4936         0.6097
     Pedigrees    LetterGroup
        0.5186         0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

 Iterations = 54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi,digits=2)
                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)
LR Test for Difference Between Models
       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model (a χ2 difference of 13.98 with 6 degrees of freedom, p = 0.03).
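The "6 more parameters" can be checked by simple counting. The following base R sketch is only an illustration: it assumes the χ2 values and N = 213 as printed by the two summary calls (with decimal points restored), and it assumes that the reported BIC follows the form χ2 − df·log(N), which reproduces the printed values but is not taken from the sem documentation.

```r
# Sketch: parameter counting for the two Thurstone models (base R only).
p <- 9                                  # observed variables
moments <- p * (p + 1) / 2              # 45 distinct variances and covariances

# Bifactor: 9 loadings on g + 9 group factor loadings + 9 error variances
free_bi <- 9 + 9 + 9
# Hierarchical: 3 loadings of F1..F3 on g + 9 loadings + 9 error variances
free_hi <- 3 + 9 + 9

df_bi <- moments - free_bi              # degrees of freedom, bifactor
df_hi <- moments - free_hi              # degrees of freedom, hierarchical

# Chi square values as read from the summary output; N = 213 observations
chisq_bi <- 24.2163
chisq_hi <- 38.1963
N <- 213

# Assumed form of the reported BIC: chisq - df * log(N)
bic_bi <- chisq_bi - df_bi * log(N)
bic_hi <- chisq_hi - df_hi * log(N)

print(c(df_bi = df_bi, df_hi = df_hi, extra_parameters = free_bi - free_hi))
print(round(c(bic_bi = bic_bi, bic_hi = bic_hi), 3))
```

Under these assumptions the counts match the degrees of freedom reported for each model, and the BIC values agree with those printed by summary.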

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are the 14 variables of Holzinger and the 16 variables of Reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> omholzbi <- omega(Holzinger,4)
> semholzbi <- sem(omholzbi$model$sem,Holzinger,355)
> omholzhi <- omega(Holzinger,4,sl=FALSE)
> semholzhi <- sem(omholzhi$model$sem,Holzinger,355)
> anova(semholzbi,semholzhi)

LR Test for Difference Between Models

          Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semholzbi       63      147.66
semholzhi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> omsem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than 0.2
                   g    F1    F2    F3    h2    u2    p2
Sentences       0.71  0.56              0.82  0.18  0.61
Vocabulary      0.73  0.55              0.84  0.16  0.63
SentCompletion  0.68  0.52              0.74  0.26  0.63
FirstLetters    0.65        0.56        0.73  0.27  0.57
FourLetterWords 0.62        0.49        0.63  0.37  0.61
Suffixes        0.56        0.41        0.50  0.50  0.63
LetterSeries    0.59              0.62  0.73  0.27  0.48
Pedigrees       0.58  0.24        0.34  0.51  0.49  0.66
LetterGroup     0.54              0.46  0.52  0.48  0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max 3.73   max/min = 1.34
mean percent general = 0.6 with sd = 0.05 and cv of 0.09
Explained Common Variance of the general factor = 0.6

The degrees of freedom are 12 and the fit is 0.01
The number of observations was 213 with Chi Square = 2.98 with prob < 1
The root mean square of the residuals is 0.01
The df corrected root mean square of the residuals is 0.01
RMSEA index = 0 and the 10 % confidence intervals are 0 0
BIC = -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 1.48
The number of observations was 213 with Chi Square = 306.71 with prob < 3.3e-49
The root mean square of the residuals is 0.14
The df corrected root mean square of the residuals is 0.16
RMSEA index = 0.22 and the 10 % confidence intervals are 0.199 0.244
BIC = 161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

The following analyses were done using the lavaan package

Omega Hierarchical from a confirmatory model using sem = 0.79
Omega Total from a confirmatory model using sem = 0.93
With loadings of
                   g    F1    F2    F3    h2    u2    p2
Sentences       0.77  0.49              0.82  0.18  0.72
Vocabulary      0.79  0.45              0.83  0.17  0.75
SentCompletion  0.75  0.40              0.73  0.27  0.77
FirstLetters    0.61        0.61        0.74  0.26  0.50
FourLetterWords 0.60        0.50        0.61  0.39  0.59
Suffixes        0.57        0.39        0.48  0.52  0.68
LetterSeries    0.57              0.73  0.85  0.15  0.38
Pedigrees       0.66              0.25  0.50  0.50  0.87
LetterGroup     0.53              0.41  0.45  0.55  0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are 18 and the fit is 24.33052 with p = 0.1444947
general/max 4.92   max/min = 1.3
mean percent general = 0.65 with sd = 0.15 and cv of 0.23
Explained Common Variance of the general factor = 0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object.

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models (0.74 versus 0.79). The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
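To see where the confirmatory values of ωh and ωt come from, they can be approximated from the loadings in the confirmatory table. This is a sketch under stated assumptions, not the computation omegaSem itself performs: the loadings are typed in from the printed table (two decimals, decimal points restored), and the total variance is taken to be the model implied variance of the unit weighted sum rather than the sum of the observed correlation matrix.

```r
# Loadings read from the confirmatory (lavaan) table; values are rounded to two decimals.
# Variable order: Sentences, Vocabulary, SentCompletion, FirstLetters, FourLetterWords,
# Suffixes, LetterSeries, Pedigrees, LetterGroup
g   <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)  # general factor
grp <- c(0.49, 0.45, 0.40, 0.61, 0.50, 0.39, 0.73, 0.25, 0.41)  # own group factor
u2  <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)  # uniquenesses
factor_id <- rep(1:3, each = 3)         # pure cluster structure: three items per group

# Variance of the unit weighted total score, split into its sources
general_var <- sum(g)^2
group_var   <- sum(tapply(grp, factor_id, sum)^2)
total_var   <- general_var + group_var + sum(u2)   # assumption: model implied total

omega_h <- general_var / total_var                 # general factor saturation
omega_t <- (general_var + group_var) / total_var   # all common variance

print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

With these rounded loadings the two ratios agree, to two decimals, with the confirmatory values reported above.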

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11 psychTools_2.0.8 psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.

  • The psych package
    • Preface
    • Creating and modeling structural relations
  • Functions for generating correlational matrices with a particular structure
    • sim.congeneric
    • sim.hierarchical
    • sim.item and sim.circ
    • sim.structure
      • fx is a vector implies a congeneric model
      • fx is a matrix implies an independent factors model
      • fx is a matrix and Phi ≠ I is a correlated factors model
      • fx and fy are matrices, and Phi ≠ I represents their correlations
      • A hierarchical structure among the latent predictors
  • Exploratory functions for analyzing structure
    • Exploratory simple structure models
    • Exploratory hierarchical models
      • A bifactor solution
      • A hierarchical solution
  • Exploratory Structural Equation Modeling (ESEM)
  • Confirmatory models
    • Using psych as a front end for the sem package
    • Testing a congeneric model versus a tau equivalent model
    • Testing the dimensionality of a hierarchical data set by creating the model
    • Testing the dimensionality based upon an exploratory analysis
    • Specifying a three factor model
    • Allowing for an oblique solution
    • Extract a bifactor solution using omega and then test that model using sem
      • sem of Thurstone 9 variable problem
    • Examining a hierarchical solution
    • Estimating Omega using EFA followed by CFA
  • Summary and conclusion

Figure 12: An example of an Exploratory Structure Equation Model.

5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.
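The model that these functions hand to sem is a three column table of paths, parameter names, and fixed values. As an illustration of that layout only, the following base R sketch builds such a table for a one factor model. The helper one_factor_model is hypothetical and is not part of psych or sem; in practice structure.sem produces the table (and sets its class), so this is meant to show the shape of the object rather than to replace it.

```r
# Hypothetical helper (not part of psych or sem): build a Path / Parameter / Value
# table for a one factor model, in the layout that structure.sem prints.
one_factor_model <- function(loading_names, labels, factor = "X1") {
  n <- length(labels)
  stopifnot(length(loading_names) == n)
  # factor -> variable paths; parameters that share a name are constrained equal
  loadings <- cbind(paste0(factor, "->", labels), loading_names, NA)
  # error variances, one free parameter per variable
  errors <- cbind(paste0(labels, "<->", labels), paste0("x", seq_len(n), "e"), NA)
  # the factor variance is fixed at 1 rather than estimated
  variance <- cbind(paste0(factor, "<->", factor), NA, "1")
  model <- rbind(loadings, errors, variance)
  colnames(model) <- c("Path", "Parameter", "Value")
  model
}

vars <- paste0("V", 1:4)
tau_model  <- one_factor_model(rep("a", 4), vars)     # equal loadings
cong_model <- one_factor_model(letters[1:4], vars)    # free loadings

print(tau_model)
# number of distinct free parameters in each specification
n_free <- function(m) length(unique(na.omit(m[, "Parameter"])))
print(c(tau = n_free(tau_model), congeneric = n_free(cong_model)))
```

Giving all four loadings the same name is what turns the congeneric specification into a tau equivalent one; the two tables otherwise have the same rows.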

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz), or the structure.diagram or structure.sem functions (which do not), may be used.

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> modtau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> modtau   #show it

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semtau <- sem(modtau,cong,100)
> summary(semtau,digits=2)

 Model Chisquare = 6.593496   Df = 5   Pr(>Chisq) = 0.2526696
 AIC = 16.5935
 BIC = -16.43236

 Normalized Residuals
     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
 -1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
     V1     V2     V3     V4
 0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
     Estimate  Std Error  z value   Pr(>|z|)
 a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
 x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
 x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
 x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
 x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations = 11

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> modcong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> modcong   #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semcong <- sem(modcong,cong,100)
> summary(semcong,digits=2)

 Model Chisquare = 2.941678   Df = 2   Pr(>Chisq) = 0.2297327
 AIC = 18.94168
 BIC = -6.268663

 Normalized Residuals
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 -0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
     V1     V2     V3     V4
 0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
     Estimate  Std Error  z value  Pr(>|z|)
 a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
 b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
 c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
 d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
 x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
 x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
 x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
 x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations = 12

> anova(semcong,semtau)   #test the difference between the two models

LR Test for Difference Between Models

        Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semcong        2      2.9417
semtau         5      6.5935  3   3.6518     0.3016

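The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65 with 3 degrees of freedom, p = 0.30) is not significant. That is, with only 100 subjects the tau equivalent model cannot be rejected in favor of the congeneric model, even though it is known to be false.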
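The likelihood ratio comparison reported by anova can be reproduced by hand from the two model χ2 values. The following base R sketch assumes the χ2 and degrees of freedom as read from the two summary outputs (with decimal points restored); it uses only pchisq from the stats package.

```r
# Chi square and degrees of freedom read from summary() of each fitted model
chisq_cong <- 2.941678; df_cong <- 2    # congeneric: four free loadings
chisq_tau  <- 6.593496; df_tau  <- 5    # tau equivalent: one common loading

# The tau equivalent model is nested within the congeneric model, so the
# difference in chi square is itself distributed as chi square under the null
lr      <- chisq_tau - chisq_cong
df_diff <- df_tau - df_cong
p_value <- pchisq(lr, df = df_diff, lower.tail = FALSE)

print(round(c(LR = lr, df = df_diff, p = p_value), 4))
```

The resulting difference, degrees of freedom, and probability correspond to the last row of the anova table.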

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. modone is a congeneric model of one factor accounting for the relationships between the nine variables. Although this model is not correct, with 100 subjects it cannot be rejected (χ2 = 21.17 with 27 degrees of freedom, p = 0.78). However, an examination of the residuals suggests serious problems with the model.

> modone <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> modone   #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> semone <- sem(modone,bifact,100)
> summary(semone,digits=2)

 Model Chisquare = 21.16848   Df = 27   Pr(>Chisq) = 0.778334
 AIC = 57.16848
 BIC = -103.1711

 Normalized Residuals
       Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
 -0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
     V1     V2     V3     V4     V5     V6     V7     V8     V9
 0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
     Estimate  Std Error  z value  Pr(>|z|)
 a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
 b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
 c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
 d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
 e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
 f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
 g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
 h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
 i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
 x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
 x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
 x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
 x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
 x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
 x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
 x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
 x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
 x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations = 11

> round(residuals(semone),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
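The problem with the one factor model is easier to see when the residuals are summarized by block. The following base R sketch types in the rounded residual matrix printed above (decimal points restored, which is an assumption about the damaged output) and averages the residuals within and between the three groups of variables V1–V3, V4–V6 and V7–V9.

```r
# Rounded residuals from the one factor model, entered by row
vars <- paste0("V", 1:9)
res <- matrix(c(
   0.00,  0.06,  0.04, -0.03, -0.03, -0.03, -0.03, -0.03, -0.03,
   0.06,  0.00,  0.03, -0.03, -0.04, -0.03, -0.03, -0.03, -0.03,
   0.04,  0.03,  0.00, -0.03, -0.03, -0.03, -0.03, -0.03, -0.03,
  -0.03, -0.03, -0.03,  0.00,  0.13,  0.11, -0.02, -0.02, -0.02,
  -0.03, -0.04, -0.03,  0.13,  0.00,  0.09, -0.02, -0.02, -0.02,
  -0.03, -0.03, -0.03,  0.11,  0.09,  0.00, -0.02, -0.02, -0.02,
  -0.03, -0.03, -0.03, -0.02, -0.02, -0.02,  0.00,  0.19,  0.13,
  -0.03, -0.03, -0.03, -0.02, -0.02, -0.02,  0.19,  0.00,  0.10,
  -0.03, -0.03, -0.03, -0.02, -0.02, -0.02,  0.13,  0.10,  0.00),
  nrow = 9, byrow = TRUE, dimnames = list(vars, vars))

group <- rep(1:3, each = 3)                  # three variables per group
same_group <- outer(group, group, "==")

# mean off-diagonal residual inside each group
within_mean <- sapply(1:3, function(k) {
  block <- res[group == k, group == k]
  mean(block[upper.tri(block)])
})
# mean residual for pairs of variables from different groups
between_mean <- mean(res[upper.tri(res) & !same_group])

print(round(c(group1 = within_mean[1], group2 = within_mean[2],
              group3 = within_mean[3], between = between_mean), 3))
```

The residuals are consistently positive within each group and consistently negative between groups, which is the pattern expected when group factors have been left out of the model.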

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> modf1 <- structure.sem(f1)
> semf1 <- sem(modf1,bifact,100)
> semf1

 Model Chisquare = 21.16848   Df = 27

        V1        V2        V3        V4        V5        V6        V7        V8        V9
 0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
       x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
 0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations = 11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> modf3 <- structure.sem(f3)
> semf3 <- sem(modf3,bifact,100)
> summary(semf3,digits=2)

 Model Chisquare = 53.86635   Df = 27   Pr(>Chisq) = 0.001579738
 AIC = 89.86635
 BIC = -70.47325

 Normalized Residuals
      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
 -0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
   V1   V2   V3   V4   V5   V6   V7   V8   V9
 0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
      Estimate  Std Error z value  Pr(>|z|)
 F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
 F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
 F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
 F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
 F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
 F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
 F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
 F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
 F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
 x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
 x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
 x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
 x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
 x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
 x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
 x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
 x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
 x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations = 24

> round(residuals(semf3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
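Why the between group residuals are so large follows directly from the model: with orthogonal factors and no cross loadings, the implied correlation between variables in different groups is exactly zero, so each residual is simply the observed correlation. A minimal base R sketch, assuming the loadings from the estimates above rounded to one decimal, makes this explicit; the factor correlation computed at the end is a back of the envelope illustration, not an estimate from sem.

```r
# Loadings of the orthogonal three factor solution (rounded estimates)
vars <- paste0("V", 1:9)
lambda <- matrix(0, nrow = 9, ncol = 3,
                 dimnames = list(vars, c("MR1", "MR3", "MR2")))
lambda[1:3, "MR1"] <- c(0.8, 0.7, 0.6)
lambda[4:6, "MR3"] <- c(0.7, 0.6, 0.5)
lambda[7:9, "MR2"] <- c(0.7, 0.6, 0.4)

phi <- diag(3)                               # orthogonal factors: Phi = I
implied <- lambda %*% phi %*% t(lambda)      # model implied correlations
diag(implied) <- 1                           # communality plus uniqueness

print(round(implied[c("V1", "V2", "V4"), c("V1", "V2", "V4")], 2))

# The residual for V1 and V4 was 0.40. With correlated factors the implied
# value would be 0.8 * 0.7 * phi, so reproducing it needs a factor correlation of:
needed_phi <- 0.40 / (lambda["V1", "MR1"] * lambda["V4", "MR3"])
print(round(needed_phi, 2))
```

The within group correlations are reproduced by the products of the loadings, while every between group entry is zero, which is why the orthogonal model leaves all of the between group correlation in the residuals.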

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)   #extract three factors and do an oblique rotation
> modf3 <- structure.sem(f3)   #create the sem model
> modf3   #show it

      Path       Parameter Value
 [1,] "MR1->V1"  "F1V1"    NA
 [2,] "MR1->V2"  "F1V2"    NA
 [3,] "MR1->V3"  "F1V3"    NA
 [4,] "MR3->V4"  "F2V4"    NA
 [5,] "MR3->V5"  "F2V5"    NA
 [6,] "MR3->V6"  "F2V6"    NA
 [7,] "MR2->V7"  "F3V7"    NA
 [8,] "MR2->V8"  "F3V8"    NA
 [9,] "MR2->V9"  "F3V9"    NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"  NA
[20,] "MR2<->MR1" "rF3F1"  NA
[21,] "MR2<->MR3" "rF3F2"  NA
[22,] "MR1<->MR1" NA       "1"
[23,] "MR3<->MR3" NA       "1"
[24,] "MR2<->MR2" NA       "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sembi <- sem(omthbi$model$sem,Thurstone,213)   #use the model created by omega
> summary(sembi,digits=2)

 Model Chisquare = 24.2163   Df = 18   Pr(>Chisq) = 0.1480685
 AIC = 78.2163
 BIC = -72.28696

 Normalized Residuals
       Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
 -0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                   Estimate  Std Error  z value    Pr(>|z|)
 Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
 Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
 SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
 FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
 FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
 Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
 LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
 Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
 LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
 F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
 F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
 F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
 F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
 F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
 F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
 F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
 F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
 F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
 e1                0.1723633 0.03405646  5.0611045 4.168346e-07
 e2                0.1698419 0.03001233  5.6590697 1.521958e-08
 e3                0.2684749 0.03316228  8.0957909 5.689350e-16
 e4                0.2528108 0.07942835  3.1828791 1.458185e-03
 e5                0.3873510 0.06317399  6.1314949 8.705712e-10
 e6                0.5175679 0.05955079  8.6912013 3.586269e-18
 e7                0.1496709 0.21861502  0.6846325 4.935759e-01
 e8                0.5003855 0.05956551  8.4005902 4.442400e-17
 e9                0.5517474 0.08455914  6.5249884 6.800680e-11

 Sentences         Sentences <--- g
 Vocabulary        Vocabulary <--- g
 SentCompletion    SentCompletion <--- g
 FirstLetters      FirstLetters <--- g
 FourLetterWords   FourLetterWords <--- g
 Suffixes          Suffixes <--- g
 LetterSeries      LetterSeries <--- g
 Pedigrees         Pedigrees <--- g
 LetterGroup       LetterGroup <--- g
 F1Sentences       Sentences <--- F1
 F1Vocabulary      Vocabulary <--- F1
 F1SentCompletion  SentCompletion <--- F1
 F2FirstLetters    FirstLetters <--- F2
 F2FourLetterWords FourLetterWords <--- F2
 F2Suffixes        Suffixes <--- F2
 F3LetterSeries    LetterSeries <--- F3
 F3Pedigrees       Pedigrees <--- F3
 F3LetterGroup     LetterGroup <--- F3
 e1                Sentences <--> Sentences
 e2                Vocabulary <--> Vocabulary
 e3                SentCompletion <--> SentCompletion
 e4                FirstLetters <--> FirstLetters
 e5                FourLetterWords <--> FourLetterWords
 e6                Suffixes <--> Suffixes
 e7                LetterSeries <--> LetterSeries
 e8                Pedigrees <--> Pedigrees
 e9                LetterGroup <--> LetterGroup

 Iterations = 72

> stdCoef(sembi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> semhi <- sem(omhi$model$sem,Thurstone,213)
> summary(semhi,digits=2)

 Model Chisquare = 38.1963   Df = 24   Pr(>Chisq) = 0.03310059
 AIC = 80.1963
 BIC = -90.47471

 Normalized Residuals
       Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
 -0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                   Estimate  Std Error  z value  Pr(>|z|)
 gF1               1.4438115 0.25653564 5.628113 1.821922e-08
 gF2               1.2538296 0.21136562 5.932041 2.991910e-09
 gF3               1.4065517 0.26890804 5.230605 1.689563e-07
 F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
 F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
 F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
 F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
 F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
 F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
 F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

41

Omega

F1

F2

F3

g

Sentences

Vocabulary

SentCompletion

FirstLetters

FourLetterWords

Suffixes

LetterSeries

LetterGroup

Pedigrees

09

089

084

038

085

075

063

021 084

063

046

078

076

068

Figure 15 Hierarchical analysis of the Thurstone 9 variable problem using an exploratoryalgorithm can provide the appropriate sem code for analysis using the sem package

42

F3Pedigrees 04172903 006215816 6713363 1901887e-11

F3LetterGroup 04076312 006131399 6648258 2965820e-11

e1 01814979 002847741 6373397 1848862e-10

e2 01649304 002776938 5939292 2862558e-09

e3 02671331 003336340 8006771 1177597e-15

e4 03015024 005102191 5909274 3436179e-09

e5 03645010 005263547 6925008 4359513e-12

e6 05064150 005962608 8493180 2010593e-17

e7 03903313 005933649 6578268 4759607e-11

e8 04813697 006224844 7733041 1050075e-14

e9 05051017 006332869 7975875 1513055e-15

gF1 F1 lt--- g

gF2 F2 lt--- g

gF3 F3 lt--- g

F1Sentences Sentences lt--- F1

F1Vocabulary Vocabulary lt--- F1

F1SentCompletion SentCompletion lt--- F1

F2FirstLetters FirstLetters lt--- F2

F2FourLetterWords FourLetterWords lt--- F2

F2Suffixes Suffixes lt--- F2

F3LetterSeries LetterSeries lt--- F3

F3Pedigrees Pedigrees lt--- F3

F3LetterGroup LetterGroup lt--- F3

e1 Sentences lt--gt Sentences

e2 Vocabulary lt--gt Vocabulary

e3 SentCompletion lt--gt SentCompletion

e4 FirstLetters lt--gt FirstLetters

e5 FourLetterWords lt--gt FourLetterWords

e6 Suffixes lt--gt Suffixes

e7 LetterSeries lt--gt LetterSeries

e8 Pedigrees lt--gt Pedigrees

e9 LetterGroup lt--gt LetterGroup

Iterations = 54

> stdCoef(sem.hi,digits=2)
                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)
LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model (a χ² difference of 13.98 on 6 degrees of freedom, p = 0.03).
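The degrees of freedom of the two models follow from counting free parameters against the number of distinct elements of the 9 x 9 correlation matrix. The following base R sketch reproduces the 24 and 18 degrees of freedom and the difference of 6 parameters. The helper name `model_df` is hypothetical (it is not part of psych or sem), and the count assumes, as in the parameter tables, that factor variances are fixed rather than estimated.

```r
# Hypothetical helper: degrees of freedom = distinct moments - free parameters.
model_df <- function(n_vars, n_free) {
  n_moments <- n_vars * (n_vars + 1) / 2   # 45 for 9 variables
  n_moments - n_free
}

n_vars <- 9

# Bifactor: 9 general loadings + 9 group loadings + 9 error variances
free_bifactor <- 9 + 9 + 9

# Hierarchical: 3 loadings of F1..F3 on g + 9 first-order loadings
# + 9 error variances
free_hierarchical <- 3 + 9 + 9

df_bifactor     <- model_df(n_vars, free_bifactor)
df_hierarchical <- model_df(n_vars, free_hierarchical)

print(c(bifactor = df_bifactor, hierarchical = df_hierarchical))
print(free_bifactor - free_hierarchical)   # extra parameters in the bifactor model
```

Because the hierarchical model is nested within the bifactor model, the difference in free parameters is also the degrees of freedom of the likelihood ratio test reported by anova.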

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of Holzinger and the 16 variables of Reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)
LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1*   F2*   F3*   h2   u2   p2
Sentences       0.71  0.56             0.82 0.18 0.61
Vocabulary      0.73  0.55             0.84 0.16 0.63
SentCompletion  0.68  0.52             0.74 0.26 0.63
FirstLetters    0.65        0.56       0.73 0.27 0.57
FourLetterWords 0.62        0.49       0.63 0.37 0.61
Suffixes        0.56        0.41       0.50 0.50 0.63
LetterSeries    0.59              0.62 0.73 0.27 0.48
Pedigrees       0.58  0.24        0.34 0.51 0.49 0.66
LetterGroup     0.54              0.46 0.52 0.48 0.56

With eigenvalues of:
   g  F1*  F2*  F3*
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g  F1*  F2*  F3*
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the lavaan package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1*   F2*   F3*   h2   u2   p2
Sentences       0.77  0.49             0.82 0.18 0.72
Vocabulary      0.79  0.45             0.83 0.17 0.75
SentCompletion  0.75  0.40             0.73 0.27 0.77
FirstLetters    0.61        0.61       0.74 0.26 0.50
FourLetterWords 0.60        0.50       0.61 0.39 0.59
Suffixes        0.57        0.39       0.48 0.52 0.68
LetterSeries    0.57              0.73 0.85 0.15 0.38
Pedigrees       0.66              0.25 0.50 0.50 0.87
LetterGroup     0.53              0.41 0.45 0.55 0.62

With eigenvalues of:
   g  F1*  F2*  F3*
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g   F1*  F2*  F3*
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models (0.74 versus 0.79). The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
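How the confirmatory ωh and ωt arise from the loadings can be sketched in base R. The loadings and uniquenesses are the rounded values from the confirmatory solution reported above. As a simplifying assumption made only for this illustration, the total variance is taken from the model-implied matrix (general + group + unique variance), which is not necessarily how psych computes it internally.

```r
# Rounded standardized loadings from the confirmatory (omegaSem) solution.
g  <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)  # general factor
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)  # uniquenesses

# Group factor loadings, one list element per group factor.
grp <- list(F1 = c(0.49, 0.45, 0.40),
            F2 = c(0.61, 0.50, 0.39),
            F3 = c(0.73, 0.25, 0.41))

# Assumption: total variance of the unit-weighted composite is the sum of
# the model-implied matrix (general + group + unique variance).
general_var <- sum(g)^2
group_var   <- sum(sapply(grp, function(l) sum(l)^2))
total_var   <- general_var + group_var + sum(u2)

omega_h <- general_var / total_var                # variance due to g alone
omega_t <- (general_var + group_var) / total_var  # all common variance
print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

With these rounded inputs the sketch gives values that agree, to two decimals, with the confirmatory ωh and ωt in the output above.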

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago Press, Chicago, Ill.


5.1 Using psych as a front end for the sem package

Because preparation of the sem commands is a bit tedious, several of the psych package functions have been designed to provide the appropriate commands. That is, the functions structure.list, phi.list, structure.diagram, structure.sem, and omega.graph may be used as a front end to sem. Usually with no modification, but sometimes with just slight modification, the model output from the structure.diagram, structure.sem, and omega.graph functions is meant to provide the appropriate commands for sem.

5.2 Testing a congeneric model versus a tau equivalent model

The congeneric model is a one factor model with possibly unequal factor loadings. The tau equivalent model is one with equal factor loadings. Tests for these may be done by creating the appropriate structures. Either the structure.graph function (which requires Rgraphviz) or the structure.diagram or structure.sem functions (which do not) may be used.
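In matrix terms the two models differ only in the constraint placed on the loadings. The notation below (λ for loadings, θ for error variances, Σ for the model-implied covariance matrix) is introduced here purely for illustration and is not taken from the original text.

```latex
% One factor with unit variance, four observed variables.
% Congeneric model: each variable has its own loading \lambda_i.
\Sigma = \lambda\lambda^{\top} + \Theta, \qquad
\lambda = (\lambda_1, \lambda_2, \lambda_3, \lambda_4)^{\top}, \qquad
\Theta = \operatorname{diag}(\theta_1, \theta_2, \theta_3, \theta_4)

% Tau equivalent model: the same structure with the loadings constrained equal.
\lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = \lambda
```

With four variables there are 10 distinct variances and covariances, so the congeneric model (4 loadings and 4 error variances) has 2 degrees of freedom and the tau equivalent model (1 loading and 4 error variances) has 5, which matches the Df values in the sem summaries of this section.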

The following example tests the hypothesis (which is actually false) that the correlations found in the cong data set (see 2.1) are tau equivalent. Because the variable labels in that data set were V1 ... V4, we specify the labels to match those.

> library(sem)
> mod.tau <- structure.sem(c("a","a","a","a"),labels=paste("V",1:4,sep=""))
> mod.tau #show it
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "a"       NA
 [3,] "X1->V3"  "a"       NA
 [4,] "X1->V4"  "a"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.tau <- sem(mod.tau,cong,100)
> summary(sem.tau,digits=2)

 Model Chisquare =  6.593496   Df =  5 Pr(>Chisq) = 0.2526696
 AIC =  16.5935
 BIC =  -16.43236

 Normalized Residuals
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-1.03157 -0.44199 -0.25025 -0.07905  0.52702  0.88767

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.5245 0.4592 0.4500 0.4432

 Parameter Estimates
    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric. That is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "V1<->V1" "x1e"     NA
 [6,] "V2<->V2" "x2e"     NA
 [7,] "V3<->V3" "x3e"     NA
 [8,] "V4<->V4" "x4e"     NA
 [9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12
> anova(sem.cong,sem.tau) #test the difference between the two models
LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

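The anova comparison of the congeneric versus tau equivalent model shows that the change in χ² is not significant given the change in degrees of freedom (a difference of 3.65 on 3 degrees of freedom, p = 0.30). With only 100 subjects, the (false) hypothesis of tau equivalence cannot be rejected.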
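The likelihood ratio test that anova reports can be reproduced by hand from the two model summaries. This is a minimal base R sketch using the chi-square values printed above; it does not call sem.

```r
# Chi-square values and degrees of freedom copied from the two summaries.
chisq_cong <- 2.941678; df_cong <- 2   # congeneric (less restricted)
chisq_tau  <- 6.593496; df_tau  <- 5   # tau equivalent (more restricted)

lr_chisq <- chisq_tau - chisq_cong     # likelihood ratio chi-square
lr_df    <- df_tau - df_cong           # number of constraints tested
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

print(round(c(LR = lr_chisq, df = lr_df, p = p_value), 4))
```

The three degrees of freedom correspond to the three equality constraints needed to make four loadings equal.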

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although the model is not correct, with 100 subjects it cannot be rejected. However, an examination of the residuals suggests serious problems with the model.
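Why the sample size matters here can be sketched in base R. The sketch assumes that the model chi-square is (N - 1) times the minimized discrepancy, and, purely as a hypothetical, that the same discrepancy were observed in a sample of 500. The value 21.16848 is the chi-square reported in the summary that follows; the N = 500 case is not from the original analysis.

```r
chisq_100 <- 21.16848   # model chi-square reported for N = 100
df_model  <- 27

# Assumption: chi-square = (N - 1) * minimized discrepancy
discrepancy <- chisq_100 / (100 - 1)

# Hypothetical: the same discrepancy observed with N = 500
chisq_500 <- (500 - 1) * discrepancy

p_100 <- pchisq(chisq_100, df = df_model, lower.tail = FALSE)
p_500 <- pchisq(chisq_500, df = df_model, lower.tail = FALSE)
print(c(N100 = p_100, N500 = p_500))
```

Under these assumptions the same degree of misfit that passes at N = 100 would be clearly rejected with the larger sample, which is why the residuals, rather than the test alone, deserve attention.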

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11
> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <-fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24
> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
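The pattern in the residual matrix is what one would expect when correlated factors are modeled as orthogonal: every between-cluster correlation is left unexplained. The base R sketch below uses the loadings from the solution above. The factor intercorrelations in `phi` are an assumption, inferred here from the printed residuals rather than taken from the original text.

```r
# Loadings from the orthogonal three factor solution (one cluster per factor).
lambda <- matrix(0, nrow = 9, ncol = 3,
                 dimnames = list(paste0("V", 1:9), c("MR1", "MR3", "MR2")))
lambda[1:3, "MR1"] <- c(0.8, 0.7, 0.6)
lambda[4:6, "MR3"] <- c(0.7, 0.6, 0.5)
lambda[7:9, "MR2"] <- c(0.7, 0.6, 0.4)

# Assumed factor intercorrelations (inferred from the printed residuals).
phi <- matrix(c(1.00, 0.72, 0.63,
                0.72, 1.00, 0.56,
                0.63, 0.56, 1.00), nrow = 3, byrow = TRUE)

implied_oblique    <- lambda %*% phi %*% t(lambda)
implied_orthogonal <- lambda %*% t(lambda)   # phi replaced by the identity

# What is left over when the factor correlations are ignored.
residual_matrix <- implied_oblique - implied_orthogonal
print(round(residual_matrix, 2))
```

Within each cluster the two implied matrices agree, so those residuals are zero; between clusters the residual is the product of the two loadings and the factor correlation, for example 0.8 x 0.7 x 0.72 for V1 and V4.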

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <-fa(bifact,3) #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3) #create the sem model
> mod.f3 #show it
      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213) #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

> stdCoef(sem.bi,digits=2)
                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.
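The raw estimates of the paths from g to the three first-order factors in the summary output of this section (1.44, 1.25, 1.41) and the standardized values returned by stdCoef (0.82, 0.78, 0.82) are related by a simple rescaling. The base R sketch below shows that relation under the assumption, consistent with the reported output but stated here rather than in the original text, that each first-order factor has a disturbance variance fixed at 1 and that g has unit variance.

```r
# Unstandardized loadings of the first-order factors on g (from summary(sem.hi)).
b <- c(gF1 = 1.4438115, gF2 = 1.2538296, gF3 = 1.4065517)

# Assumption: Var(F) = b^2 * Var(g) + disturbance variance = b^2 + 1
factor_var <- b^2 + 1

std_loading  <- b / sqrt(factor_var)   # standardized loading on g
std_residual <- 1 / factor_var         # standardized disturbance variance
print(round(rbind(std_loading, std_residual), 4))
```

The two rows correspond to rows 1 to 3 and rows 22 to 24 of the stdCoef output, and the squared standardized loadings are the R-square values reported for F1, F2, and F3.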

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm, which can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

> stdCoef(sem.hi,digits=2)
                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

gt anova(semhisembi)

LR Test for Difference Between Models

Model Df Model Chisq Df LR Chisq Pr(gtChisq)

semhi 24 38196

sembi 18 24216 6 1398 002986

---

Signif codes 0 lsquorsquo 0001 lsquorsquo 001 lsquorsquo 005 lsquorsquo 01 lsquo rsquo 1
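The standardized coefficients of the hierarchical model can be multiplied through to give each test's loading on g and its residualized loading on the group factor (the Schmid-Leiman form). The following is a minimal base R sketch of that arithmetic for the three F1 tests; it uses only the values printed by stdCoef above, and the variable names (g_on_F1, F1_on_items, and so on) are hypothetical rather than part of psych or sem.

```r
# Standardized coefficients copied from the stdCoef(sem.hi) output
g_on_F1 <- 0.8220754                       # F1 <--- g
F1_on_items <- c(Sentences      = 0.9047111,
                 Vocabulary     = 0.9138214,
                 SentCompletion = 0.8560764)
uniqueness <- c(Sentences      = 0.1814979,  # e1
                Vocabulary     = 0.1649304,  # e2
                SentCompletion = 0.2671331)  # e3

# Loading of each test on g: the product of the two paths
g_loading <- g_on_F1 * F1_on_items

# Loading on the part of F1 that is independent of g
group_loading <- F1_on_items * sqrt(1 - g_on_F1^2)

# General, group and unique variance should sum to about 1 for each test
total <- g_loading^2 + group_loading^2 + uniqueness

round(cbind(g = g_loading, F1resid = group_loading, total = total), 3)
```

Under these assumptions the residualized group loadings agree with the unstandardized F1 estimates in the summary output (for example, about 0.515 for Sentences), because the residual variance of F1 is fixed at 1 in that parameterization.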

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
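The "6 more parameters" can be checked by counting. The sketch below is a base R illustration, not output from sem: it assumes the parameter counts implied by the two path listings (9 g loadings, 9 group loadings and 9 unique variances for the bifactor model; 3 g paths, 9 group loadings and 9 unique variances for the hierarchical model), and the variable names are hypothetical.

```r
p <- 9                               # observed variables
moments <- p * (p + 1) / 2           # unique variances and covariances

params_bi <- 9 + 9 + 9               # bifactor: g, group, unique variances
params_hi <- 3 + 9 + 9               # hierarchical: g -> F, F -> items, unique

df_bi <- moments - params_bi         # degrees of freedom, bifactor
df_hi <- moments - params_hi         # degrees of freedom, hierarchical

delta_df    <- df_hi - df_bi         # extra parameters in the bifactor model
delta_chisq <- 38.196 - 24.216       # model chi squares from the anova table

# Critical value of chi square at alpha = .05 for the difference test
critical <- qchisq(0.95, df = delta_df)

c(df_bi = df_bi, df_hi = df_hi, delta_df = delta_df,
  delta_chisq = delta_chisq, critical = critical)
```

Because the difference in chi square exceeds the critical value, the comparison agrees with the anova result that the bifactor model fits better at the .05 level.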

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of Holzinger and the 16 variables of Reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem combines both steps: it performs an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g  F1*  F2*  F3*   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g  F1*  F2*  F3*
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g  F1*  F2*  F3*
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g  F1*  F2*  F3*   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g  F1*  F2*  F3*
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g   F1*  F2*  F3*
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object.

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
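To see where the confirmatory values of ωh and ωt come from, they can be recomputed by hand from the loadings. The sketch below is a base R illustration under two stated assumptions: the two-decimal loadings are those printed in the confirmatory part of the omegaSem output, and the total variance is taken from the model-implied matrix (common plus unique variance). The variable names are hypothetical; this is not the code used inside omegaSem.

```r
# Loadings read from the confirmatory (sem) part of the omegaSem output
g  <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)  # general factor
F1 <- c(0.49, 0.45, 0.40, 0, 0, 0, 0, 0, 0)                    # group factor 1
F2 <- c(0, 0, 0, 0.61, 0.50, 0.39, 0, 0, 0)                    # group factor 2
F3 <- c(0, 0, 0, 0, 0, 0, 0.73, 0.25, 0.41)                    # group factor 3
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)  # uniquenesses

# Variance of the total score attributable to each source
var_general <- sum(g)^2
var_group   <- sum(F1)^2 + sum(F2)^2 + sum(F3)^2
var_total   <- var_general + var_group + sum(u2)   # model-implied total variance

omega_h <- var_general / var_total                 # general factor saturation
omega_t <- (var_general + var_group) / var_total   # all common variance

# Explained common variance of the general factor
ecv <- sum(g^2) / (sum(g^2) + sum(F1^2) + sum(F2^2) + sum(F3^2))

round(c(omega_h = omega_h, omega_t = omega_t, ECV = ecv), 2)
```

With these rounded loadings the results should agree, to two decimals, with the values reported above for the confirmatory model (ωh = 0.79, ωt = 0.93, explained common variance = 0.64).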

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago Press, Chicago, Ill.

  • The psych package
    • Preface
    • Creating and modeling structural relations
      • Functions for generating correlational matrices with a particular structure
        • sim.congeneric
        • sim.hierarchical
        • sim.item and sim.circ
        • sim.structure
          • fx is a vector implies a congeneric model
          • fx is a matrix implies an independent factors model
          • fx is a matrix and Phi ≠ I is a correlated factors model
          • fx and fy are matrices, and Phi ≠ I represents their correlations
          • A hierarchical structure among the latent predictors
              • Exploratory functions for analyzing structure
                • Exploratory simple structure models
                • Exploratory hierarchical models
                  • A bifactor solution
                  • A hierarchical solution
                      • Exploratory Structural Equation Modeling (ESEM)
                      • Confirmatory models
                        • Using psych as a front end for the sem package
                        • Testing a congeneric model versus a tau equivalent model
                        • Testing the dimensionality of a hierarchical data set by creating the model
                        • Testing the dimensionality based upon an exploratory analysis
                        • Specifying a three factor model
                        • Allowing for an oblique solution
                        • Extract a bifactor solution using omega and then test that model using sem
                          • sem of Thurstone 9 variable problem
                            • Examining a hierarchical solution
                            • Estimating Omega using EFA followed by CFA
                              • Summary and conclusion

    Estimate  Std Error  z value   Pr(>|z|)
a   0.6865481 0.06299180 10.899007 1.165221e-27 V1 <--- X1
x1e 0.4272839 0.08086561  5.283876 1.264786e-07 V1 <--> V1
x2e 0.5551772 0.09751222  5.693411 1.245260e-08 V2 <--> V2
x3e 0.5760999 0.10030974  5.743210 9.289853e-09 V3 <--> V3
x4e 0.5920607 0.10245375  5.778809 7.523134e-09 V4 <--> V4

 Iterations =  11

Test whether the data are congeneric, that is, whether a one factor model fits. Compare this to the prior model using the anova function.

> mod.cong <- structure.sem(c("a","b","c","d"),labels=paste("V",1:4,sep=""))
> mod.cong   #show the model

     Path      Parameter Value
[1,] "X1->V1"  "a"       NA
[2,] "X1->V2"  "b"       NA
[3,] "X1->V3"  "c"       NA
[4,] "X1->V4"  "d"       NA
[5,] "V1<->V1" "x1e"     NA
[6,] "V2<->V2" "x2e"     NA
[7,] "V3<->V3" "x3e"     NA
[8,] "V4<->V4" "x4e"     NA
[9,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.cong <- sem(mod.cong,cong,100)
> summary(sem.cong,digits=2)

 Model Chisquare =  2.941678   Df =  2 Pr(>Chisq) = 0.2297327
 AIC =  18.94168
 BIC =  -6.268663

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.5739 -0.0699  0.0339  0.0113  0.1605  0.5412

 R-square for Endogenous Variables
    V1     V2     V3     V4
0.6880 0.4384 0.3942 0.3524

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.8294562 0.09786772 8.475279 2.345174e-17 V1 <--- X1
b   0.6621164 0.10066777 6.577243 4.792500e-11 V2 <--- X1
c   0.6278767 0.10146860 6.187891 6.097433e-10 V3 <--- X1
d   0.5936695 0.10238816 5.798224 6.702094e-09 V4 <--- X1
x1e 0.3120026 0.10044870 3.106089 1.895798e-03 V1 <--> V1
x2e 0.5616018 0.10154893 5.530356 3.195810e-08 V2 <--> V2
x3e 0.6057707 0.10421285 5.812822 6.142832e-09 V3 <--> V3
x4e 0.6475566 0.10732995 6.033326 1.606191e-09 V4 <--> V4

 Iterations =  12

> anova(sem.cong,sem.tau)  #test the difference between the two models

LR Test for Difference Between Models

         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

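The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 (3.65) is not significant given the change in degrees of freedom (3, p = 0.30). For these data, the more restrictive tau equivalent model cannot be rejected in favor of the congeneric model.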
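The likelihood ratio test reported by anova can be reproduced from the two model chi squares alone. This is a minimal base R sketch, assuming only the values printed in the table above; lr_test is a hypothetical helper written for this illustration and is not a function of the sem package.

```r
# Chi square difference test for two nested models.
# restricted = model with more degrees of freedom (here, tau equivalent)
# full       = model with fewer degrees of freedom (here, congeneric)
lr_test <- function(chisq_restricted, df_restricted, chisq_full, df_full) {
  delta_chisq <- chisq_restricted - chisq_full
  delta_df    <- df_restricted - df_full
  p_value     <- pchisq(delta_chisq, df = delta_df, lower.tail = FALSE)
  c(chisq = delta_chisq, df = delta_df, p = p_value)
}

tau_vs_cong <- lr_test(chisq_restricted = 6.5935, df_restricted = 5,
                       chisq_full = 2.9417, df_full = 2)
round(tau_vs_cong, 4)
```

The p value obtained this way should match the Pr(>Chisq) column of the anova table up to rounding of the inputs.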

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected. However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one   #show the model

      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"

> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)

 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11

> round(residuals(sem.one),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1

 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <-fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
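Why the between-group residuals are so large follows from the algebra of an orthogonal simple structure: the model-implied correlation between variables on different factors is exactly zero, so everything observed between groups is left as residual. The base R sketch below illustrates this with the rounded loadings from the output above; the object names are hypothetical, and the factor correlations at the end are back-of-the-envelope estimates from three residuals rather than output of fa or sem.

```r
# Rounded loadings of the orthogonal three factor solution
lambda <- matrix(0, nrow = 9, ncol = 3,
                 dimnames = list(paste0("V", 1:9), c("MR1", "MR3", "MR2")))
lambda[1:3, "MR1"] <- c(0.8, 0.7, 0.6)
lambda[4:6, "MR3"] <- c(0.7, 0.6, 0.5)
lambda[7:9, "MR2"] <- c(0.7, 0.6, 0.4)

# Model-implied correlations with uncorrelated factors: Lambda %*% t(Lambda)
implied <- lambda %*% t(lambda)
diag(implied) <- 1

implied["V1", "V2"]   # same factor: product of the two loadings
implied["V1", "V4"]   # different factors: zero by construction

# Dividing a between-group residual by the product of the two loadings
# suggests how strongly the factors would need to correlate.
phi <- c(MR1_MR3 = 0.40 / (0.8 * 0.7),   # residual of V1 with V4
         MR1_MR2 = 0.35 / (0.8 * 0.7),   # residual of V1 with V7
         MR3_MR2 = 0.27 / (0.7 * 0.7))   # residual of V4 with V7
round(phi, 2)
```

The implied factor correlations are all well above zero, which is the motivation for allowing the factors to be oblique.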

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <-fa(bifact,3)   #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)   #create the sem model
> mod.f3   #show it

      Path       Parameter Value
 [1,] "MR1->V1"  "F1V1"    NA
 [2,] "MR1->V2"  "F1V2"    NA
 [3,] "MR1->V3"  "F1V3"    NA
 [4,] "MR3->V4"  "F2V4"    NA
 [5,] "MR3->V5"  "F2V5"    NA
 [6,] "MR3->V6"  "F2V6"    NA
 [7,] "MR2->V7"  "F3V7"    NA
 [8,] "MR2->V8"  "F3V8"    NA
 [9,] "MR2->V9"  "F3V9"    NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"  NA
[20,] "MR2<->MR1" "rF3F1"  NA
[21,] "MR2<->MR3" "rF3F2"  NA
[22,] "MR1<->MR1" NA       "1"
[23,] "MR3<->MR3" NA       "1"
[24,] "MR2<->MR2" NA       "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph (Figure 13).

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is “lazy loaded”). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)   #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07
e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671  Sentences <--- g
2  Vocabulary        0.7909246  Vocabulary <--- g
3  SentCompletion    0.7536211  SentCompletion <--- g
4  FirstLetters      0.6083814  FirstLetters <--- g
5  FourLetterWords   0.5973349  FourLetterWords <--- g
6  Suffixes          0.5717900  Suffixes <--- g
7  LetterSeries      0.5668950  LetterSeries <--- g
8  Pedigrees         0.6623317  Pedigrees <--- g
9  LetterGroup       0.5299523  LetterGroup <--- g
10 F1Sentences       0.4878697  Sentences <--- F1
11 F1Vocabulary      0.4523233  Vocabulary <--- F1
12 F1SentCompletion  0.4044507  SentCompletion <--- F1
13 F2FirstLetters    0.6140531  FirstLetters <--- F2
14 F2FourLetterWords 0.5058063  FourLetterWords <--- F2
15 F2Suffixes        0.3943206  Suffixes <--- F2
16 F3LetterSeries    0.7272957  LetterSeries <--- F3
17 F3Pedigrees       0.2468418  Pedigrees <--- F3
18 F3LetterGroup     0.4091494  LetterGroup <--- F3
19 e1                0.1723633  Sentences <--> Sentences
20 e2                0.1698418  Vocabulary <--> Vocabulary
21 e3                0.2684748  SentCompletion <--> SentCompletion
22 e4                0.2528108  FirstLetters <--> FirstLetters
23 e5                0.3873510  FourLetterWords <--> FourLetterWords
24 e6                0.5175675  Suffixes <--> Suffixes
25 e7                0.1496710  LetterSeries <--> LetterSeries
26 e8                0.5003859  Pedigrees <--> Pedigrees
27 e9                0.5517473  LetterGroup <--> LetterGroup
28                   1.0000000  F1 <--> F1
29                   1.0000000  F2 <--> F2
30                   1.0000000  F3 <--> F3
31                   1.0000000  g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

41

Omega

F1

F2

F3

g

Sentences

Vocabulary

SentCompletion

FirstLetters

FourLetterWords

Suffixes

LetterSeries

LetterGroup

Pedigrees

09

089

084

038

085

075

063

021 084

063

046

078

076

068

Figure 15 Hierarchical analysis of the Thurstone 9 variable problem using an exploratoryalgorithm can provide the appropriate sem code for analysis using the sem package

42

F3Pedigrees 04172903 006215816 6713363 1901887e-11

F3LetterGroup 04076312 006131399 6648258 2965820e-11

e1 01814979 002847741 6373397 1848862e-10

e2 01649304 002776938 5939292 2862558e-09

e3 02671331 003336340 8006771 1177597e-15

e4 03015024 005102191 5909274 3436179e-09

e5 03645010 005263547 6925008 4359513e-12

e6 05064150 005962608 8493180 2010593e-17

e7 03903313 005933649 6578268 4759607e-11

e8 04813697 006224844 7733041 1050075e-14

e9 05051017 006332869 7975875 1513055e-15

gF1 F1 lt--- g

gF2 F2 lt--- g

gF3 F3 lt--- g

F1Sentences Sentences lt--- F1

F1Vocabulary Vocabulary lt--- F1

F1SentCompletion SentCompletion lt--- F1

F2FirstLetters FirstLetters lt--- F2

F2FourLetterWords FourLetterWords lt--- F2

F2Suffixes Suffixes lt--- F2

F3LetterSeries LetterSeries lt--- F3

F3Pedigrees Pedigrees lt--- F3

F3LetterGroup LetterGroup lt--- F3

e1 Sentences lt--gt Sentences

e2 Vocabulary lt--gt Vocabulary

e3 SentCompletion lt--gt SentCompletion

e4 FirstLetters lt--gt FirstLetters

e5 FourLetterWords lt--gt FourLetterWords

e6 Suffixes lt--gt Suffixes

e7 LetterSeries lt--gt LetterSeries

e8 Pedigrees lt--gt Pedigrees

e9 LetterGroup lt--gt LetterGroup

Iterations = 54

gt stdCoef(semhidigits=2)

Std Estimate

1 gF1 08220754 F1 lt--- g

2 gF2 07817998 F2 lt--- g

3 gF3 08150140 F3 lt--- g

4 F1Sentences 09047111 Sentences lt--- F1

5 F1Vocabulary 09138214 Vocabulary lt--- F1

6 F1SentCompletion 08560764 SentCompletion lt--- F1

7 F2FirstLetters 08357617 FirstLetters lt--- F2

8 F2FourLetterWords 07971819 FourLetterWords lt--- F2

9 F2Suffixes 07025560 Suffixes lt--- F2

10 F3LetterSeries 07808129 LetterSeries lt--- F3

11 F3Pedigrees 07201599 Pedigrees lt--- F3

12 F3LetterGroup 07034902 LetterGroup lt--- F3

13 e1 01814979 Sentences lt--gt Sentences

14 e2 01649304 Vocabulary lt--gt Vocabulary

15 e3 02671331 SentCompletion lt--gt SentCompletion

16 e4 03015024 FirstLetters lt--gt FirstLetters

17 e5 03645010 FourLetterWords lt--gt FourLetterWords

18 e6 05064151 Suffixes lt--gt Suffixes

19 e7 03903313 LetterSeries lt--gt LetterSeries

43

20 e8 04813697 Pedigrees lt--gt Pedigrees

21 e9 05051016 LetterGroup lt--gt LetterGroup

22 03241920 F1 lt--gt F1

23 03887891 F2 lt--gt F2

24 03357521 F3 lt--gt F3

25 10000000 g lt--gt g

gt anova(semhisembi)

LR Test for Difference Between Models

Model Df Model Chisq Df LR Chisq Pr(gtChisq)

semhi 24 38196

sembi 18 24216 6 1398 002986

---

Signif codes 0 lsquorsquo 0001 lsquorsquo 001 lsquorsquo 005 lsquorsquo 01 lsquo rsquo 1

Using the Thurstone data set we see what happens when a hierarchical model is applied toreal data The exploratory structure derived from the omega function (Figure 15) providesestimates in close approximation to those found using sem The model definition createdby using omega is the same hierarchical model discussed in the sem help page The bifactormodel with 6 more parameters does provide a better fit to the data than the hierarchicalmodel

Similar analyses can be done with other data that are organized hierarchically Examplesof these analyses are analyzing the 14 variables of holzinger and the 16 variables of reiseThe output from the following analyses has been limited to just the comparison betweenthe bifactor and hierarchical solutions

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)
LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g  F1*  F2*  F3*   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g  F1*  F2*  F3*
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g  F1*  F2*  F3*
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g  F1*  F2*  F3*   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g  F1*  F2*  F3*
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g   F1*  F2*  F3*
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g  F1*  F2*  F3*
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models (0.74 versus 0.79). The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
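To make the ωh and ωt values less of a black box, the sketch below recomputes them from the two-decimal confirmatory loadings printed above. It assumes the total variance is the one implied by the bifactor model (general plus group plus unique variance of a unit-weighted sum); omegaSem itself may base the denominator on the observed correlation matrix, so this is an approximation rather than the package's algorithm.

```r
# Loadings and uniquenesses (u2) copied from the confirmatory part of the
# omegaSem output; rows follow the order of the printed table.
g  <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
F1 <- c(0.49, 0.45, 0.40, 0, 0, 0, 0, 0, 0)
F2 <- c(0, 0, 0, 0.61, 0.50, 0.39, 0, 0, 0)
F3 <- c(0, 0, 0, 0, 0, 0, 0.73, 0.25, 0.41)
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)

group <- cbind(F1, F2, F3)

# Variance of the unit-weighted total score implied by the bifactor model
general_var <- sum(g)^2                  # variance due to g
group_var   <- sum(colSums(group)^2)     # variance due to the group factors
total_var   <- general_var + group_var + sum(u2)

omega_h <- general_var / total_var
omega_t <- (general_var + group_var) / total_var
ecv     <- sum(g^2) / (sum(g^2) + sum(group^2))   # explained common variance

print(round(c(omega_h = omega_h, omega_t = omega_t, ECV = ecv), 2))
```

With these rounded inputs the three values round to 0.79, 0.93 and 0.64, in line with the confirmatory Omega Hierarchical, Omega Total and Explained Common Variance shown above.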

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem (path diagram titled “Omega from SEM”, with the nine Thurstone variables loading on g and on the group factors F1, F2 and F3).

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.


> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago Press, Chicago, Ill.


         Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.cong        2      2.9417
sem.tau         5      6.5935  3   3.6518     0.3016

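The anova comparison of the congeneric versus tau equivalent model shows that the change in χ2 is not significant given the change in degrees of freedom (a χ2 difference of 3.65 on 3 degrees of freedom, p = 0.30), so the more constrained tau equivalent model cannot be rejected in favor of the congeneric model.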
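The difference of 3 degrees of freedom between the two models follows from counting parameters. The sketch below is a back-of-the-envelope check in base R; it assumes four observed variables (which is what the reported degrees of freedom of 2 and 5 imply) and a factor variance fixed at 1.

```r
# Degrees of freedom = unique variances/covariances minus free parameters.
# Assumption: p = 4 observed variables, factor variance fixed at 1.
p <- 4
moments <- p * (p + 1) / 2        # unique elements of the covariance matrix

free_congeneric <- p + p          # p separate loadings + p error variances
free_tau        <- 1 + p          # one common loading  + p error variances

df_congeneric <- moments - free_congeneric
df_tau        <- moments - free_tau

print(c(congeneric = df_congeneric, tau = df_tau,
        difference = df_tau - df_congeneric))
```

Forcing the four loadings to be equal removes three free parameters, which is why the tau equivalent model has three more degrees of freedom than the congeneric one.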

5.3 Testing the dimensionality of a hierarchical data set by creating the model

The bifact correlation matrix was created to represent a hierarchical structure. Various confirmatory models can be applied to this matrix.

The first example creates the model directly; the next several create models based upon exploratory factor analyses. mod.one is a congeneric model of one factor accounting for the relationships between the nine variables. Although not correct, with 100 subjects this model cannot be rejected (χ2 = 21.17 on 27 degrees of freedom, p = 0.78). However, an examination of the residuals suggests serious problems with the model.

> mod.one <- structure.sem(letters[1:9],labels=paste("V",1:9,sep=""))
> mod.one     #show the model
      Path      Parameter Value
 [1,] "X1->V1"  "a"       NA
 [2,] "X1->V2"  "b"       NA
 [3,] "X1->V3"  "c"       NA
 [4,] "X1->V4"  "d"       NA
 [5,] "X1->V5"  "e"       NA
 [6,] "X1->V6"  "f"       NA
 [7,] "X1->V7"  "g"       NA
 [8,] "X1->V8"  "h"       NA
 [9,] "X1->V9"  "i"       NA
[10,] "V1<->V1" "x1e"     NA
[11,] "V2<->V2" "x2e"     NA
[12,] "V3<->V3" "x3e"     NA
[13,] "V4<->V4" "x4e"     NA
[14,] "V5<->V5" "x5e"     NA
[15,] "V6<->V6" "x6e"     NA
[16,] "V7<->V7" "x7e"     NA
[17,] "V8<->V8" "x8e"     NA
[18,] "V9<->V9" "x9e"     NA
[19,] "X1<->X1" NA        "1"
attr(,"class")
[1] "mod"
> sem.one <- sem(mod.one,bifact,100)
> summary(sem.one,digits=2)
 Model Chisquare =  21.16848   Df =  27 Pr(>Chisq) = 0.778334
 AIC =  57.16848
 BIC =  -103.1711

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

 R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

 Parameter Estimates
    Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

 Iterations =  11
> round(residuals(sem.one),2)
      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00
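The fit indices in the summary above can be reproduced from the chi-square, its degrees of freedom, and the sample size. The sketch below uses base R only. The AIC and BIC formulas are inferred from the fact that they match the printed values, not taken from the sem source code, so treat them as an assumption.

```r
chisq <- 21.16848   # Model Chisquare reported for the one factor model
df    <- 27         # degrees of freedom
n_obs <- 100        # sample size passed to sem
n_var <- 9          # observed variables V1 ... V9

# 45 unique correlations/variances minus 27 df leaves 18 free parameters
free_par <- n_var * (n_var + 1) / 2 - df

p_value <- pchisq(chisq, df, lower.tail = FALSE)
aic     <- chisq + 2 * free_par      # assumed form: chi-square + 2q
bic     <- chisq - df * log(n_obs)   # assumed form: chi-square - df * ln(N)

print(round(c(p = p_value, AIC = aic, BIC = bic), 4))
```

The same arithmetic applied to the later models (for example χ2 = 53.86635 with 27 degrees of freedom) also matches their printed AIC and BIC, which is some reassurance that the assumed formulas are the ones in use.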

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> mod.f1 <- structure.sem(f1)
> sem.f1 <- sem(mod.f1,bifact,100)
> sem.f1
 Model Chisquare =  21.16848   Df =  27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

 Iterations =  11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)
 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24
> round(residuals(sem.f3),2)
     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
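Why the between-group residuals are so large can be seen from the correlation matrix the orthogonal model implies. The sketch below, in base R, rebuilds that matrix from the rounded loadings reported above. The last step, which backs out the factor correlation needed to absorb a given residual, is our own illustration and not something sem reports.

```r
# Rounded standardized loadings reported for the orthogonal three factor model
lambda <- matrix(0, nrow = 9, ncol = 3,
                 dimnames = list(paste0("V", 1:9), c("MR1", "MR3", "MR2")))
lambda[1:3, "MR1"] <- c(0.8, 0.7, 0.6)
lambda[4:6, "MR3"] <- c(0.7, 0.6, 0.5)
lambda[7:9, "MR2"] <- c(0.7, 0.6, 0.4)

# With orthogonal factors Phi is the identity, so the implied matrix is
# Lambda %*% t(Lambda); its diagonal holds communalities, not 1s.
phi     <- diag(3)
implied <- lambda %*% phi %*% t(lambda)
keep    <- c("V1", "V2", "V4", "V7")
print(round(implied[keep, keep], 2))

# Implied correlations between blocks are exactly zero, so the residuals
# there equal the observed correlations. The factor correlation that would
# absorb a residual r between items i and j is r / (lambda_i * lambda_j).
needed_phi <- c(MR1_MR3 = 0.40 / (0.8 * 0.7),   # residual for V1, V4
                MR1_MR2 = 0.35 / (0.8 * 0.7),   # residual for V1, V7
                MR3_MR2 = 0.27 / (0.7 * 0.7))   # residual for V4, V7
print(round(needed_phi, 2))
```

The implied factor correlations are all well above zero, which is the numerical form of the statement that an orthogonal solution cannot fit these data.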

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)     #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)     #create the sem model
> mod.f3     #show it
      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph (Figure 13).

This makes much better sense, and in fact (as hoped) recovers the original structure.
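The model table printed above has a simple regular structure: one row per path, a parameter name for each free parameter, and a fixed value where the parameter is not estimated. As a rough illustration of that layout, the sketch below rebuilds the same three-column table by hand in base R and counts the free parameters. It only mimics the printed output; whether sem would accept a hand-built matrix like this without further attributes is not something this sketch establishes.

```r
vars    <- paste0("V", 1:9)
factors <- rep(c("MR1", "MR3", "MR2"), each = 3)   # factor for each variable
f_index <- rep(1:3, each = 3)                      # F1, F2, F3 in the labels

# Free loadings, free error variances, free factor correlations
loadings <- cbind(Path = paste0(factors, "->", vars),
                  Parameter = paste0("F", f_index, vars),
                  Value = NA)
errors   <- cbind(Path = paste0(vars, "<->", vars),
                  Parameter = paste0("x", 1:9, "e"),
                  Value = NA)
cors     <- cbind(Path = c("MR3<->MR1", "MR2<->MR1", "MR2<->MR3"),
                  Parameter = c("rF2F1", "rF3F1", "rF3F2"),
                  Value = NA)
# Factor variances are fixed at 1, so they carry a value and no name
fixed    <- cbind(Path = paste0(c("MR1", "MR3", "MR2"), "<->",
                                c("MR1", "MR3", "MR2")),
                  Parameter = NA,
                  Value = "1")

model <- rbind(loadings, errors, cors, fixed)
free  <- sum(is.na(model[, "Value"]))   # rows without a fixed value
df    <- 9 * 10 / 2 - free              # 45 moments minus free parameters
print(c(rows = nrow(model), free = free, df = df))
```

Relative to the orthogonal model, the only change is the three factor correlations, so the oblique model uses three more parameters (21 rather than 18).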

Figure 13: A three factor oblique solution (path diagram titled “Structural model”, with V1 to V9 loading on MR1, MR3 and MR2 and correlations among the three factors).

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone, used in the SAS manual for PROC CALIS, and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is “lazy loaded”). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)     #use the model created by omega
> summary(sem.bi,digits=2)
 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                   Estimate  Std Error  z value    Pr(>|z|)
Sentences          0.7678671 0.07059396 10.8772353 1.479833e-27 Sentences <--- g
Vocabulary         0.7909248 0.06969232 11.3488087 7.518003e-30 Vocabulary <--- g
SentCompletion     0.7536211 0.07113218 10.5946585 3.154903e-26 SentCompletion <--- g
FirstLetters       0.6083814 0.07063841  8.6126138 7.141338e-18 FirstLetters <--- g
FourLetterWords    0.5973349 0.07092937  8.4215455 3.715499e-17 FourLetterWords <--- g
Suffixes           0.5717903 0.07157752  7.9884057 1.366950e-15 Suffixes <--- g
LetterSeries       0.5668949 0.07249339  7.8199523 5.284337e-15 LetterSeries <--- g
Pedigrees          0.6623314 0.07003035  9.4577757 3.145633e-21 Pedigrees <--- g
LetterGroup        0.5299524 0.07332494  7.2274501 4.921470e-13 LetterGroup <--- g
F1*Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09 Sentences <--- F1*
F1*Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08 Vocabulary <--- F1*
F1*SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06 SentCompletion <--- F1*
F2*FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13 FirstLetters <--- F2*
F2*FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10 FourLetterWords <--- F2*
F2*Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07 Suffixes <--- F2*
F3*LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06 LetterSeries <--- F3*
F3*Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03 Pedigrees <--- F3*
F3*LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04 LetterGroup <--- F3*
e1                 0.1723633 0.03405646  5.0611045 4.168346e-07 Sentences <--> Sentences
e2                 0.1698419 0.03001233  5.6590697 1.521958e-08 Vocabulary <--> Vocabulary
e3                 0.2684749 0.03316228  8.0957909 5.689350e-16 SentCompletion <--> SentCompletion
e4                 0.2528108 0.07942835  3.1828791 1.458185e-03 FirstLetters <--> FirstLetters
e5                 0.3873510 0.06317399  6.1314949 8.705712e-10 FourLetterWords <--> FourLetterWords
e6                 0.5175679 0.05955079  8.6912013 3.586269e-18 Suffixes <--> Suffixes
e7                 0.1496709 0.21861502  0.6846325 4.935759e-01 LetterSeries <--> LetterSeries
e8                 0.5003855 0.05956551  8.4005902 4.442400e-17 Pedigrees <--> Pedigrees
e9                 0.5517474 0.08455914  6.5249884 6.800680e-11 LetterGroup <--> LetterGroup

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)
                      Std. Estimate
1  Sentences              0.7678671 Sentences <--- g
2  Vocabulary             0.7909246 Vocabulary <--- g
3  SentCompletion         0.7536211 SentCompletion <--- g
4  FirstLetters           0.6083814 FirstLetters <--- g
5  FourLetterWords        0.5973349 FourLetterWords <--- g
6  Suffixes               0.5717900 Suffixes <--- g
7  LetterSeries           0.5668950 LetterSeries <--- g
8  Pedigrees              0.6623317 Pedigrees <--- g
9  LetterGroup            0.5299523 LetterGroup <--- g
10 F1*Sentences           0.4878697 Sentences <--- F1*
11 F1*Vocabulary          0.4523233 Vocabulary <--- F1*
12 F1*SentCompletion      0.4044507 SentCompletion <--- F1*
13 F2*FirstLetters        0.6140531 FirstLetters <--- F2*
14 F2*FourLetterWords     0.5058063 FourLetterWords <--- F2*
15 F2*Suffixes            0.3943206 Suffixes <--- F2*
16 F3*LetterSeries        0.7272957 LetterSeries <--- F3*
17 F3*Pedigrees           0.2468418 Pedigrees <--- F3*
18 F3*LetterGroup         0.4091494 LetterGroup <--- F3*
19 e1                     0.1723633 Sentences <--> Sentences
20 e2                     0.1698418 Vocabulary <--> Vocabulary
21 e3                     0.2684748 SentCompletion <--> SentCompletion
22 e4                     0.2528108 FirstLetters <--> FirstLetters
23 e5                     0.3873510 FourLetterWords <--> FourLetterWords
24 e6                     0.5175675 Suffixes <--> Suffixes
25 e7                     0.1496710 LetterSeries <--> LetterSeries
26 e8                     0.5003859 Pedigrees <--> Pedigrees
27 e9                     0.5517473 LetterGroup <--> LetterGroup
28                        1.0000000 F1* <--> F1*
29                        1.0000000 F2* <--> F2*
30                        1.0000000 F3* <--> F3*
31                        1.0000000 g <--> g

Compare this solution to the one reported below and to the sem manual.
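One way to read the standardized bifactor solution is that, because g and the group factors are orthogonal, each variable's R-square is the sum of its squared g loading and its squared group loading, and the error variance is what is left over. The sketch below checks this in base R using the standardized estimates printed above; the vector names are ours.

```r
items <- c("Sentences", "Vocabulary", "SentCompletion", "FirstLetters",
           "FourLetterWords", "Suffixes", "LetterSeries", "Pedigrees",
           "LetterGroup")

# Standardized estimates copied from the stdCoef output
g_load   <- c(0.7678671, 0.7909246, 0.7536211, 0.6083814, 0.5973349,
              0.5717900, 0.5668950, 0.6623317, 0.5299523)
grp_load <- c(0.4878697, 0.4523233, 0.4044507, 0.6140531, 0.5058063,
              0.3943206, 0.7272957, 0.2468418, 0.4091494)
err_var  <- c(0.1723633, 0.1698418, 0.2684748, 0.2528108, 0.3873510,
              0.5175675, 0.1496710, 0.5003859, 0.5517473)

# R-square values copied from the summary output (four decimals)
r2_reported <- c(0.8276, 0.8302, 0.7315, 0.7472, 0.6126,
                 0.4824, 0.8503, 0.4996, 0.4483)

r2 <- g_load^2 + grp_load^2     # variance explained by g plus group factor
names(r2) <- items

print(round(cbind(r2 = r2, reported = r2_reported,
                  r2_plus_error = r2 + err_var), 4))
```

The last column should be 1 for every variable up to rounding, since the model is fit to a correlation matrix.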

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)
 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

 Iterations =  54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi,digits=2)
                     Std. Estimate
1  gF1                   0.8220754 F1 <--- g
2  gF2                   0.7817998 F2 <--- g
3  gF3                   0.8150140 F3 <--- g
4  F1Sentences           0.9047111 Sentences <--- F1
5  F1Vocabulary          0.9138214 Vocabulary <--- F1
6  F1SentCompletion      0.8560764 SentCompletion <--- F1
7  F2FirstLetters        0.8357617 FirstLetters <--- F2
8  F2FourLetterWords     0.7971819 FourLetterWords <--- F2
9  F2Suffixes            0.7025560 Suffixes <--- F2
10 F3LetterSeries        0.7808129 LetterSeries <--- F3
11 F3Pedigrees           0.7201599 Pedigrees <--- F3
12 F3LetterGroup         0.7034902 LetterGroup <--- F3
13 e1                    0.1814979 Sentences <--> Sentences
14 e2                    0.1649304 Vocabulary <--> Vocabulary
15 e3                    0.2671331 SentCompletion <--> SentCompletion
16 e4                    0.3015024 FirstLetters <--> FirstLetters
17 e5                    0.3645010 FourLetterWords <--> FourLetterWords
18 e6                    0.5064151 Suffixes <--> Suffixes
19 e7                    0.3903313 LetterSeries <--> LetterSeries
20 e8                    0.4813697 Pedigrees <--> Pedigrees
21 e9                    0.5051016 LetterGroup <--> LetterGroup
22                       0.3241920 F1 <--> F1
23                       0.3887891 F2 <--> F2
24                       0.3357521 F3 <--> F3
25                       1.0000000 g <--> g
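The standardized hierarchical solution can be turned into direct loadings on g and on residualized group factors, in the spirit of the Schmid and Leiman (1957) transformation. The sketch below does this in base R from the standardized estimates printed above. It is our own illustration of the arithmetic rather than the code omega or sem uses, and the variable names are hypothetical.

```r
# Standardized paths from g to the three lower level factors
g_on_factor <- c(F1 = 0.8220754, F2 = 0.7817998, F3 = 0.8150140)

# Standardized loadings of the items on their lower level factor
item_on_factor <- c(Sentences = 0.9047111, Vocabulary = 0.9138214,
                    SentCompletion = 0.8560764, FirstLetters = 0.8357617,
                    FourLetterWords = 0.7971819, Suffixes = 0.7025560,
                    LetterSeries = 0.7808129, Pedigrees = 0.7201599,
                    LetterGroup = 0.7034902)
factor_of_item <- rep(c("F1", "F2", "F3"), each = 3)

# Variance of each lower level factor not explained by g
resid_var <- 1 - g_on_factor^2

# Indirect loading on g, and loading on the residualized group factor
g_item     <- item_on_factor * g_on_factor[factor_of_item]
group_item <- item_on_factor * sqrt(resid_var[factor_of_item])

print(round(resid_var, 7))
print(round(cbind(g = g_item, group = group_item), 2))
```

The residual variances computed in the first step should agree with rows 22 to 24 of the stdCoef table (0.3241920, 0.3887891, 0.3357521), and the squared g and group loadings of each item add up to its squared loading on the lower level factor.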

gt anova(semhisembi)

LR Test for Difference Between Models

Model Df Model Chisq Df LR Chisq Pr(gtChisq)

semhi 24 38196

sembi 18 24216 6 1398 002986

---

Signif codes 0 lsquorsquo 0001 lsquorsquo 001 lsquorsquo 005 lsquorsquo 01 lsquo rsquo 1

Using the Thurstone data set we see what happens when a hierarchical model is applied toreal data The exploratory structure derived from the omega function (Figure 15) providesestimates in close approximation to those found using sem The model definition createdby using omega is the same hierarchical model discussed in the sem help page The bifactormodel with 6 more parameters does provide a better fit to the data than the hierarchicalmodel

Similar analyses can be done with other data that are organized hierarchically Examplesof these analyses are analyzing the 14 variables of holzinger and the 16 variables of reiseThe output from the following analyses has been limited to just the comparison betweenthe bifactor and hierarchical solutions

gt omholzbi lt- omega(Holzinger4)

gt semholzbi lt- sem(omholzbi$model$semHolzinger355)

gt omholzhi lt- omega(Holzinger4sl=FALSE)

gt semholzhi lt- sem(omholzhi$model$semHolzinger355)

gt anova(semholzbisemholzhi)

LR Test for Difference Between Models

Model Df Model Chisq Df LR Chisq Pr(gtChisq)

semholzbi 63 14766

semholzhi 73 17879 10 31129 00005587

---

Signif codes 0 lsquorsquo 0001 lsquorsquo 001 lsquorsquo 005 lsquorsquo 01 lsquo rsquo 1

59 Estimating Omega using EFA followed by CFA

The function omegaSem combines both an exploratory factor analysis using omega thencalls the appropriate sem functions and organizes the results as in a standard omega anal-ysis

An example is found from the Thurstone data set of 9 cognitive variables

gt omsem lt- omegaSem(Thurstonenobs=213)

44

Call omegaSem(m = Thurstone nobs = 213)

Omega

Call omegah(m = m nfactors = nfactors fm = fm key = key flip = flip

digits = digits title = title sl = sl labels = labels

plot = plot nobs = nobs rotate = rotate Phi = Phi option = option

covar = covar)

Alpha 089

G6 091

Omega Hierarchical 074

Omega H asymptotic 079

Omega Total 093

Schmid Leiman Factor loadings greater than 02

g F1 F2 F3 h2 u2 p2

Sentences 071 056 082 018 061

Vocabulary 073 055 084 016 063

SentCompletion 068 052 074 026 063

FirstLetters 065 056 073 027 057

FourLetterWords 062 049 063 037 061

Suffixes 056 041 050 050 063

LetterSeries 059 062 073 027 048

Pedigrees 058 024 034 051 049 066

LetterGroup 054 046 052 048 056

With eigenvalues of

g F1 F2 F3

358 096 074 072

generalmax 373 maxmin = 134

mean percent general = 06 with sd = 005 and cv of 009

Explained Common Variance of the general factor = 06

The degrees of freedom are 12 and the fit is 001

The number of observations was 213 with Chi Square = 298 with prob lt 1

The root mean square of the residuals is 0.01
The df corrected root mean square of the residuals is 0.01
RMSEA index = 0 and the 10 % confidence intervals are 0 0
BIC = -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27 and the fit is 1.48
The number of observations was 213 with Chi Square = 306.71 with prob < 3.3e-49
The root mean square of the residuals is 0.14
The df corrected root mean square of the residuals is 0.16
RMSEA index = 0.22 and the 10 % confidence intervals are 0.199 0.244
BIC = 161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

The following analyses were done using the lavaan package

Omega Hierarchical from a confirmatory model using sem = 0.79
Omega Total from a confirmatory model using sem = 0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are 18 and the fit is 24.33052 with p = 0.1444947
general/max 4.92 max/min = 1.3
mean percent general = 0.65 with sd = 0.15 and cv of 0.23
Explained Common Variance of the general factor = 0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
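As a rough check on the confirmatory values, ωh and ωt can be recomputed by hand from a bifactor solution. The sketch below is an illustration, not the omegaSem implementation. It copies the standardized sem estimates for the Thurstone bifactor model (section 5.7.1) and assumes that the total variance of the unit-weighted composite is the model-implied one (general + group + unique) rather than the sum of the observed correlation matrix. The object names (g, grp, u2, group) are made up for the example.

```r
# Standardized bifactor estimates for the 9 Thurstone variables,
# copied from the sem output of section 5.7.1
g   <- c(0.7678671, 0.7909248, 0.7536211, 0.6083814, 0.5973349,
         0.5717903, 0.5668949, 0.6623314, 0.5299524)   # general factor loadings
grp <- c(0.4878698, 0.4523234, 0.4044507, 0.6140531, 0.5058063,
         0.3943208, 0.7272955, 0.2468417, 0.4091495)   # group factor loadings
u2  <- c(0.1723633, 0.1698419, 0.2684749, 0.2528108, 0.3873510,
         0.5175679, 0.1496709, 0.5003855, 0.5517474)   # unique variances
group <- rep(1:3, each = 3)   # variables 1-3 on F1, 4-6 on F2, 7-9 on F3

# Assumption: total variance is taken from the model, not from the
# observed correlation matrix
var.g     <- sum(g)^2                         # variance due to the general factor
var.group <- sum(tapply(grp, group, sum)^2)   # variance due to the group factors
var.total <- var.g + var.group + sum(u2)

omega.h <- var.g / var.total                  # general factor saturation
omega.t <- (var.g + var.group) / var.total    # all common variance
ecv     <- sum(g^2) / sum(g^2 + grp^2)        # explained common variance of g
round(c(omega.h = omega.h, omega.t = omega.t, ECV = ecv), 2)
```

Under these assumptions the three values round to 0.79, 0.93, and 0.64, in line with the confirmatory output above.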

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem. [Path diagram titled "Omega from SEM".]

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


-0.3336510 -0.2923555 -0.1940195  0.0369476  0.0000019  1.8875159

R-square for Endogenous Variables
    V1     V2     V3     V4     V5     V6     V7     V8     V9
0.5636 0.4524 0.3377 0.3292 0.2522 0.1798 0.2568 0.1980 0.0932

Parameter Estimates
     Estimate  Std Error  z value  Pr(>|z|)
a   0.7507129 0.09512871 7.891549 2.984584e-15 V1 <--- X1
b   0.6726412 0.09807150 6.858682 6.949882e-12 V2 <--- X1
c   0.5811209 0.10137932 5.732145 9.916850e-09 V3 <--- X1
d   0.5737425 0.10163173 5.645309 1.648847e-08 V4 <--- X1
e   0.5021915 0.10392785 4.832116 1.350893e-06 V5 <--- X1
f   0.4239908 0.10609489 3.996335 6.433059e-05 V6 <--- X1
g   0.5067957 0.10378883 4.882950 1.045102e-06 V7 <--- X1
h   0.4450171 0.10554867 4.216227 2.484236e-05 V8 <--- X1
i   0.3052415 0.10867168 2.808841 4.972015e-03 V9 <--- X1
x1e 0.4364302 0.08884720 4.912143 9.008624e-07 V1 <--> V1
x2e 0.5475539 0.09653118 5.672300 1.408927e-08 V2 <--> V2
x3e 0.6622982 0.10678972 6.201891 5.578883e-10 V3 <--> V3
x4e 0.6708197 0.10761127 6.233731 4.554546e-10 V4 <--> V4
x5e 0.7478036 0.11527456 6.487153 8.747374e-11 V5 <--> V5
x6e 0.8202314 0.12278021 6.680485 2.381530e-11 V6 <--> V6
x7e 0.7431581 0.11480162 6.473411 9.581477e-11 V7 <--> V7
x8e 0.8019593 0.12086557 6.635134 3.242077e-11 V8 <--> V8
x9e 0.9068284 0.13200380 6.869714 6.433079e-12 V9 <--> V9

Iterations = 11

> round(residuals(semone),2)

      V1    V2    V3    V4    V5    V6    V7    V8    V9
V1  0.00  0.06  0.04 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V2  0.06  0.00  0.03 -0.03 -0.04 -0.03 -0.03 -0.03 -0.03
V3  0.04  0.03  0.00 -0.03 -0.03 -0.03 -0.03 -0.03 -0.03
V4 -0.03 -0.03 -0.03  0.00  0.13  0.11 -0.02 -0.02 -0.02
V5 -0.03 -0.04 -0.03  0.13  0.00  0.09 -0.02 -0.02 -0.02
V6 -0.03 -0.03 -0.03  0.11  0.09  0.00 -0.02 -0.02 -0.02
V7 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.00  0.19  0.13
V8 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.19  0.00  0.10
V9 -0.03 -0.03 -0.03 -0.02 -0.02 -0.02  0.13  0.10  0.00

5.4 Testing the dimensionality based upon an exploratory analysis

Alternatively, the output from an exploratory factor analysis can be used as input to the structure.sem function.

> f1 <- fa(bifact)
> modf1 <- structure.sem(f1)
> semf1 <- sem(modf1,bifact,100)
> semf1

Model Chisquare = 21.16848 Df = 27

       V1        V2        V3        V4        V5        V6        V7        V8        V9
0.7507129 0.6726412 0.5811209 0.5737425 0.5021915 0.4239908 0.5067957 0.4450171 0.3052415
      x1e       x2e       x3e       x4e       x5e       x6e       x7e       x8e       x9e
0.4364302 0.5475539 0.6622982 0.6708197 0.7478036 0.8202314 0.7431581 0.8019593 0.9068284

Iterations = 11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal function could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> modf3 <- structure.sem(f3)
> semf3 <- sem(modf3,bifact,100)
> summary(semf3,digits=2)

Model Chisquare = 53.86635 Df = 27 Pr(>Chisq) = 0.001579738
AIC = 89.86635
BIC = -70.47325

Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

Parameter Estimates
      Estimate Std Error  z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

Iterations = 24

> round(residuals(semf3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
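Why the between-group residuals are so large can be seen by rebuilding the correlations by hand. The following sketch is not part of the original analysis. It takes the (rounded) loadings estimated above and assumes, since this section does not state it, that the three factors themselves load 0.9, 0.8, and 0.7 on a general factor; those values were picked only because they reproduce the residuals shown. An orthogonal model implies zero correlation between variables from different groups, so every between-group correlation is left over as a residual.

```r
# Group factor loadings, rounded from the sem estimates above
fload <- matrix(0, nrow = 9, ncol = 3,
                dimnames = list(paste0("V", 1:9), c("MR1", "MR3", "MR2")))
fload[1:3, "MR1"] <- c(0.8, 0.7, 0.6)
fload[4:6, "MR3"] <- c(0.7, 0.6, 0.5)
fload[7:9, "MR2"] <- c(0.7, 0.6, 0.4)

# Assumption (not given in this section): loadings of the three factors
# on a higher order general factor, which make the factors correlated
gload <- c(0.9, 0.8, 0.7)
Phi <- outer(gload, gload)
diag(Phi) <- 1

# Correlations generated by the correlated (hierarchical) structure
observed <- fload %*% Phi %*% t(fload)
diag(observed) <- 1

# Correlations implied by the orthogonal three factor model (Phi = I)
implied <- fload %*% t(fload)
diag(implied) <- 1

# Within-group residuals vanish; between-group residuals do not
res <- round(observed - implied, 2)
res
```

Replacing the identity matrix with Phi in the implied correlations removes the residuals, which is what allowing an oblique solution amounts to.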

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)            #extract three factors and do an oblique rotation
> modf3 <- structure.sem(f3)    #create the sem model
> modf3                         #show it

      Path        Parameter Value
 [1,] "MR1->V1"   "F1V1"    NA
 [2,] "MR1->V2"   "F1V2"    NA
 [3,] "MR1->V3"   "F1V3"    NA
 [4,] "MR3->V4"   "F2V4"    NA
 [5,] "MR3->V5"   "F2V5"    NA
 [6,] "MR3->V6"   "F2V6"    NA
 [7,] "MR2->V7"   "F3V7"    NA
 [8,] "MR2->V8"   "F3V8"    NA
 [9,] "MR2->V9"   "F3V9"    NA
[10,] "V1<->V1"   "x1e"     NA
[11,] "V2<->V2"   "x2e"     NA
[12,] "V3<->V3"   "x3e"     NA
[13,] "V4<->V4"   "x4e"     NA
[14,] "V5<->V5"   "x5e"     NA
[15,] "V6<->V6"   "x6e"     NA
[16,] "V7<->V7"   "x7e"     NA
[17,] "V8<->V8"   "x8e"     NA
[18,] "V9<->V9"   "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"   NA
[20,] "MR2<->MR1" "rF3F1"   NA
[21,] "MR2<->MR3" "rF3F2"   NA
[22,] "MR1<->MR1" NA        "1"
[23,] "MR3<->MR3" NA        "1"
[24,] "MR2<->MR2" NA        "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.

Figure 13: A three factor oblique solution. [Path diagram titled "Structural model": V1 to V3 load on MR1, V4 to V6 on MR3, and V7 to V9 on MR2.]

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is “lazy loaded”). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sembi <- sem(omthbi$model$sem,Thurstone,213)   #use the model created by omega
> summary(sembi,digits=2)

Model Chisquare = 24.2163 Df = 18 Pr(>Chisq) = 0.1480685
AIC = 78.2163
BIC = -72.28696

Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

R-square for Endogenous Variables
 Sentences Vocabulary SentCompletion FirstLetters FourLetterWords
    0.8276     0.8302         0.7315       0.7472          0.6126
  Suffixes LetterSeries Pedigrees LetterGroup
    0.4824       0.8503    0.4996      0.4483

Parameter Estimates
                   Estimate  Std Error    z value     Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27 Sentences <--- g
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30 Vocabulary <--- g
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26 SentCompletion <--- g
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18 FirstLetters <--- g
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17 FourLetterWords <--- g
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15 Suffixes <--- g
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15 LetterSeries <--- g
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21 Pedigrees <--- g
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13 LetterGroup <--- g
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09 Sentences <--- F1
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08 Vocabulary <--- F1
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06 SentCompletion <--- F1
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13 FirstLetters <--- F2
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10 FourLetterWords <--- F2
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07 Suffixes <--- F2
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06 LetterSeries <--- F3
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03 Pedigrees <--- F3
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04 LetterGroup <--- F3
e1                0.1723633 0.03405646  5.0611045 4.168346e-07 Sentences <--> Sentences
e2                0.1698419 0.03001233  5.6590697 1.521958e-08 Vocabulary <--> Vocabulary
e3                0.2684749 0.03316228  8.0957909 5.689350e-16 SentCompletion <--> SentCompletion
e4                0.2528108 0.07942835  3.1828791 1.458185e-03 FirstLetters <--> FirstLetters
e5                0.3873510 0.06317399  6.1314949 8.705712e-10 FourLetterWords <--> FourLetterWords
e6                0.5175679 0.05955079  8.6912013 3.586269e-18 Suffixes <--> Suffixes
e7                0.1496709 0.21861502  0.6846325 4.935759e-01 LetterSeries <--> LetterSeries
e8                0.5003855 0.05956551  8.4005902 4.442400e-17 Pedigrees <--> Pedigrees
e9                0.5517474 0.08455914  6.5249884 6.800680e-11 LetterGroup <--> LetterGroup

Iterations = 72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups. [Path diagram titled "Omega".]

> stdCoef(sembi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below, and to the sem manual.

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> semhi <- sem(omhi$model$sem,Thurstone,213)
> summary(semhi,digits=2)

Model Chisquare = 38.1963 Df = 24 Pr(>Chisq) = 0.03310059
AIC = 80.1963
BIC = -90.47471

Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

R-square for Endogenous Variables
            F1           F2              F3 Sentences   Vocabulary
        0.6758       0.6112          0.6642    0.8185       0.8351
SentCompletion FirstLetters FourLetterWords  Suffixes LetterSeries
        0.7329       0.6985          0.6355    0.4936       0.6097
     Pedigrees  LetterGroup
        0.5186       0.4949

Parameter Estimates
                   Estimate  Std Error  z value     Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

Iterations = 54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package. [Path diagram titled "Omega".]

> stdCoef(semhi,digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(semhi,sembi)

LR Test for Difference Between Models

      Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semhi       24      38.196
sembi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
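The likelihood ratio test reported by anova can be checked by hand, because the hierarchical model is nested within the bifactor model. The sketch below only redoes the arithmetic with base R. The chi square and degrees of freedom values are copied from the two summaries above, and the variable names are made up for the example.

```r
# Fit statistics copied from summary(semhi) and summary(sembi)
chisq.hi <- 38.1963; df.hi <- 24   # hierarchical model (more constrained)
chisq.bi <- 24.2163; df.bi <- 18   # bifactor model (6 more parameters)

# Difference in chi square, tested on the difference in degrees of freedom
lr.chisq <- chisq.hi - chisq.bi
lr.df    <- df.hi - df.bi
p.value  <- pchisq(lr.chisq, df = lr.df, lower.tail = FALSE)

round(c(LR.Chisq = lr.chisq, Df = lr.df, p = p.value), 5)
```

The difference of 13.98 on 6 degrees of freedom gives a p value of about 0.03, matching the anova table.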

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> omholzbi <- omega(Holzinger,4)
> semholzbi <- sem(omholzbi$model$sem,Holzinger,355)
> omholzhi <- omega(Holzinger,4,sl=FALSE)
> semholzhi <- sem(omholzhi$model$sem,Holzinger,355)
> anova(semholzbi,semholzhi)

LR Test for Difference Between Models

          Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semholzbi       63      147.66
semholzhi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> omsem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than 0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max 3.73 max/min = 1.34
mean percent general = 0.6 with sd = 0.05 and cv of 0.09
Explained Common Variance of the general factor = 0.6

The degrees of freedom are 12 and the fit is 0.01
The number of observations was 213 with Chi Square = 2.98 with prob < 1



Iterations = 11

The answers are, of course, identical.

5.5 Specifying a three factor model

An alternative model is to extract three factors and try this solution. The fa factor analysis function (using the minimum residual algorithm) is used to detect the structure. Alternatively, the factanal could have been used. Rather than use the default rotation of oblimin, we force an orthogonal solution (even though we know it will be a poor solution).

> f3 <- fa(bifact,3,rotate="varimax")
> mod.f3 <- structure.sem(f3)
> sem.f3 <- sem(mod.f3,bifact,100)
> summary(sem.f3,digits=2)

 Model Chisquare =  53.86635   Df =  27 Pr(>Chisq) = 0.001579738
 AIC =  89.86635
 BIC =  -70.47325

 Normalized Residuals
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
-0.000003  0.000000  1.950175  1.642171  2.632737  4.011789

 R-square for Endogenous Variables
  V1   V2   V3   V4   V5   V6   V7   V8   V9
0.64 0.49 0.36 0.49 0.36 0.25 0.49 0.36 0.16

 Parameter Estimates
     Estimate  Std Error z value  Pr(>|z|)
F1V1 0.8000000 0.1114517 7.177994 7.074151e-13 V1 <--- MR1
F1V2 0.7000001 0.1089845 6.422931 1.336754e-10 V2 <--- MR1
F1V3 0.6000000 0.1068002 5.617968 1.932167e-08 V3 <--- MR1
F2V4 0.6999999 0.1427544 4.903527 9.413091e-07 V4 <--- MR3
F2V5 0.6000001 0.1328610 4.515998 6.301927e-06 V5 <--- MR3
F2V6 0.4999995 0.1238740 4.036354 5.428827e-05 V6 <--- MR3
F3V7 0.7000001 0.1680059 4.166522 3.092827e-05 V7 <--- MR2
F3V8 0.6000000 0.1530271 3.920873 8.822871e-05 V8 <--- MR2
F3V9 0.4000005 0.1265677 3.160368 1.575701e-03 V9 <--- MR2
x1e  0.3600000 0.1297434 2.774707 5.525146e-03 V1 <--> V1
x2e  0.5099999 0.1165643 4.375268 1.212834e-05 V2 <--> V2
x3e  0.6399999 0.1130156 5.662936 1.488043e-08 V3 <--> V3
x4e  0.5100000 0.1739239 2.932316 3.364440e-03 V4 <--> V4
x5e  0.6399998 0.1475345 4.337967 1.438068e-05 V5 <--> V5
x6e  0.7500007 0.1336788 5.610466 2.017821e-08 V6 <--> V6
x7e  0.5100000 0.2136118 2.387509 1.696298e-02 V7 <--> V7
x8e  0.6400001 0.1734024 3.690837 2.235172e-04 V8 <--> V8
x9e  0.8400000 0.1362332 6.165898 7.008420e-10 V9 <--> V9

 Iterations =  24

> round(residuals(sem.f3),2)

     V1   V2   V3   V4   V5   V6   V7   V8   V9
V1 0.00 0.00 0.00 0.40 0.35 0.29 0.35 0.30 0.20
V2 0.00 0.00 0.00 0.35 0.30 0.25 0.31 0.26 0.18
V3 0.00 0.00 0.00 0.30 0.26 0.22 0.26 0.23 0.15
V4 0.40 0.35 0.30 0.00 0.00 0.00 0.27 0.24 0.16
V5 0.35 0.30 0.26 0.00 0.00 0.00 0.24 0.20 0.13
V6 0.29 0.25 0.22 0.00 0.00 0.00 0.20 0.17 0.11
V7 0.35 0.31 0.26 0.27 0.24 0.20 0.00 0.00 0.00
V8 0.30 0.26 0.23 0.24 0.20 0.17 0.00 0.00 0.00
V9 0.20 0.18 0.15 0.16 0.13 0.11 0.00 0.00 0.00

The residuals show serious problems with this model. Although the residuals within each of the three factors are zero, the residuals between groups are much too large.
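The block pattern of these residuals can be reproduced by hand. The following base R sketch assumes that bifact follows a hierarchical structure with the group loadings estimated above and with higher-order loadings of 0.9, 0.8 and 0.7 on a general factor. The higher-order values are an assumption chosen because they reproduce the printed residuals; they are not shown in this section. All object names (fload, gload, Phi, R, implied, res_orth) are illustrative.

```r
# Group-factor loadings as estimated by sem above (three items per factor)
fload <- matrix(0, nrow = 9, ncol = 3,
                dimnames = list(paste0("V", 1:9), c("MR1", "MR3", "MR2")))
fload[1:3, 1] <- c(0.8, 0.7, 0.6)
fload[4:6, 2] <- c(0.7, 0.6, 0.5)
fload[7:9, 3] <- c(0.7, 0.6, 0.4)

# ASSUMPTION: loadings of the three group factors on a general factor
gload <- c(0.9, 0.8, 0.7)
Phi <- gload %o% gload      # factor intercorrelations implied by g
diag(Phi) <- 1

# Correlation matrix implied by the hierarchical structure
R <- fload %*% Phi %*% t(fload)
diag(R) <- 1

# Correlations implied by the orthogonal model (Phi forced to identity)
implied <- fload %*% t(fload)
diag(implied) <- 1

# Residuals: zero within each factor, non-zero between factors
res_orth <- R - implied
print(round(res_orth, 2))
```

Because the orthogonal model fixes every factor correlation at zero, it predicts a correlation of zero for any pair of items from different factors, so the whole between-group correlation is left over as residual.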

5.6 Allowing for an oblique solution

The previous solution is clearly very bad. What would happen if the exploratory solution were allowed to have correlated (oblique) factors?

> f3 <- fa(bifact,3)             #extract three factors and do an oblique rotation
> mod.f3 <- structure.sem(f3)    #create the sem model
> mod.f3                         #show it

      Path       Parameter Value
 [1,] "MR1->V1"  "F1V1"    NA
 [2,] "MR1->V2"  "F1V2"    NA
 [3,] "MR1->V3"  "F1V3"    NA
 [4,] "MR3->V4"  "F2V4"    NA
 [5,] "MR3->V5"  "F2V5"    NA
 [6,] "MR3->V6"  "F2V6"    NA
 [7,] "MR2->V7"  "F3V7"    NA
 [8,] "MR2->V8"  "F3V8"    NA
 [9,] "MR2->V9"  "F3V9"    NA
[10,] "V1<->V1"  "x1e"     NA
[11,] "V2<->V2"  "x2e"     NA
[12,] "V3<->V3"  "x3e"     NA
[13,] "V4<->V4"  "x4e"     NA
[14,] "V5<->V5"  "x5e"     NA
[15,] "V6<->V6"  "x6e"     NA
[16,] "V7<->V7"  "x7e"     NA
[17,] "V8<->V8"  "x8e"     NA
[18,] "V9<->V9"  "x9e"     NA
[19,] "MR3<->MR1" "rF2F1"  NA
[20,] "MR2<->MR1" "rF3F1"  NA
[21,] "MR2<->MR3" "rF3F2"  NA
[22,] "MR1<->MR1" NA       "1"
[23,] "MR3<->MR3" NA       "1"
[24,] "MR2<->MR2" NA       "1"
attr(,"class")
[1] "mod"

The structure being tested may be seen using structure.graph.

This makes much better sense, and in fact (as hoped) recovers the original structure.
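A rough check of why the oblique model helps: if item i loads only on factor a and item j only on factor b, the model implies a correlation of lambda_i * phi_ab * lambda_j. The base R sketch below uses only numbers printed earlier (the loadings of V1, V4 and V7 and the residuals left by the orthogonal solution) to back out the factor correlations needed to absorb those residuals. The object names are illustrative, and the residuals are rounded to two decimals, so the results are approximate.

```r
# Loadings of the first item on each factor, from the orthogonal sem output
lambda <- c(MR1 = 0.8, MR3 = 0.7, MR2 = 0.7)   # V1, V4, V7

# Residual correlations among V1, V4 and V7 left by the orthogonal model
res <- c(V1.V4 = 0.40, V1.V7 = 0.35, V4.V7 = 0.27)

# Factor correlation that reproduces each residual:
# r_ij = lambda_i * phi_ab * lambda_j  =>  phi_ab = r_ij / (lambda_i * lambda_j)
phi <- c(
  MR1.MR3 = res[["V1.V4"]] / (lambda[["MR1"]] * lambda[["MR3"]]),
  MR1.MR2 = res[["V1.V7"]] / (lambda[["MR1"]] * lambda[["MR2"]]),
  MR3.MR2 = res[["V4.V7"]] / (lambda[["MR3"]] * lambda[["MR2"]])
)
print(round(phi, 2))
```

The three values come out near 0.7, 0.6 and 0.6, which agrees with the factor intercorrelations displayed in Figure 13.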

Figure 13: A three factor oblique solution.

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07
e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> stdCoef(sem.bi,digits=2)

                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below and to the sem manual.
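When comparing solutions it helps to know how the AIC and BIC lines relate to the chi-square. The following base R sketch reproduces the two values printed for the bifactor model. The formulas are inferred from the printed numbers (they match them to the displayed precision) and should be read as an assumption about the definitions used, not as a statement taken from the sem documentation. The object names are illustrative.

```r
chisq <- 24.2163   # model chi-square from the bifactor summary above
df    <- 18        # model degrees of freedom
n_obs <- 213       # number of observations
n_var <- 9         # number of observed variables

# Free parameters = distinct variances and covariances minus df
n_par <- n_var * (n_var + 1) / 2 - df    # 45 - 18 = 27

# ASSUMED definitions, chosen because they reproduce the printed values
aic <- chisq + 2 * n_par
bic <- chisq - df * log(n_obs)
print(c(AIC = aic, BIC = bic))
```

The 27 free parameters correspond to the 9 general loadings, 9 group loadings and 9 residual variances listed in the parameter table.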

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(sem.hi,digits=2)

                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g
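The link between the raw estimates and the standardized coefficients can be checked by hand for one path. The base R sketch below assumes that the variance of g and the disturbance variance of F1 are both fixed at 1 in this model, which is how the unlabeled F1 <--> F1 and g <--> g rows are read here; that reading is an assumption. The object names are illustrative.

```r
g_F1    <- 1.4438115   # unstandardized path g -> F1
F1_sent <- 0.5151232   # unstandardized path F1 -> Sentences
e1      <- 0.1814979   # residual variance of Sentences

# ASSUMPTION: var(g) = 1 and the disturbance of F1 is fixed at 1,
# so the total variance of F1 is g_F1^2 + 1
var_F1 <- g_F1^2 + 1

# Standardized path g -> F1 and the R-square of F1
std_g_F1 <- g_F1 / sqrt(var_F1)
r2_F1    <- g_F1^2 / var_F1

# Standardized path F1 -> Sentences
var_sent    <- F1_sent^2 * var_F1 + e1
std_F1_sent <- F1_sent * sqrt(var_F1) / sqrt(var_sent)

print(round(c(std_g_F1 = std_g_F1, r2_F1 = r2_F1,
              std_F1_sent = std_F1_sent), 4))
```

Under these assumptions the results agree with the printed values for gF1 (0.8220754), the R-square of F1 (0.6758) and F1Sentences (0.9047111). The large raw loadings on g therefore reflect the scaling of the lower-order factors, not an improper solution.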

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
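The likelihood ratio comparison reported by anova can be verified from the two model chi-squares alone. The base R sketch below uses the values printed in the two summaries; the object names are illustrative.

```r
# Chi-square and degrees of freedom from the two summaries above
chisq_hi <- 38.1963; df_hi <- 24   # hierarchical model
chisq_bi <- 24.2163; df_bi <- 18   # bifactor model

# The hierarchical model is nested in the bifactor model, so the
# difference in chi-square is itself chi-square distributed
lr      <- chisq_hi - chisq_bi
df_diff <- df_hi - df_bi
p_value <- pchisq(lr, df = df_diff, lower.tail = FALSE)

print(c(LR = lr, df = df_diff, p = round(p_value, 5)))
```

The difference of 13.98 on 6 degrees of freedom gives a p value of about 0.03, matching the anova output.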

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of Holzinger and the 16 variables of Reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first runs an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object.

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
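As a rough check on the confirmatory values, ωh and ωt can be recomputed from the rounded CFA loadings printed above. The base R sketch below uses the model-implied variance of the unit-weighted total score rather than the observed correlation matrix, which is a simplifying assumption, so it should be read as an approximation and not as the computation performed by omegaSem. The object names are illustrative.

```r
# Rounded loadings and uniquenesses from the confirmatory table above
g   <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
grp <- c(0.49, 0.45, 0.40, 0.61, 0.50, 0.39, 0.73, 0.25, 0.41)
u2  <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)
factor_id <- rep(1:3, each = 3)    # three items per group factor

# ASSUMPTION: total-score variance taken from the model, i.e.
# general part + group parts + unique variances
general_var <- sum(g)^2
group_var   <- sum(tapply(grp, factor_id, sum)^2)
v_total     <- general_var + group_var + sum(u2)

omega_h <- general_var / v_total                  # general factor only
omega_t <- (general_var + group_var) / v_total    # all common variance

print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

The two ratios land close to the 0.79 and 0.93 reported for the confirmatory model. Because no cross loadings are allowed, more of the common variance is routed through g than in the exploratory solution.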

Figure 16: Confirmatory Omega structure using omegaSem.

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application. The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.

  • The psych package
    • Preface
    • Creating and modeling structural relations
      • Functions for generating correlational matrices with a particular structure
        • simcongeneric
        • simhierarchical
        • simitem and simcirc
        • simstructure
          • f f f fx is a vector implies a congeneric model
          • f f f fx is a matrix implies an independent factors model
          • f f f fx is a matrix and Phi =I is a correlated factors model
          • f f f fx and f f f fy are matrices and Phi =I represents their correlations
          • A hierarchical structure among the latent predictors
              • Exploratory functions for analyzing structure
                • Exploratory simple structure models
                • Exploratory hierarchical models
                  • A bifactor solution
                  • A hierarchical solution
                      • Exploratory Structural Equation Modeling (ESEM)
                      • Confirmatory models
                        • Using psych as a front end for the sem package
                        • Testing a congeneric model versus a tau equivalent model
                        • Testing the dimensionality of a hierarchical data set by creating the model
                        • Testing the dimensionality based upon an exploratory analysis
                        • Specifying a three factor model
                        • Allowing for an oblique solution
                        • Extract a bifactor solution using omega and then test that model using sem
                          • sem of Thurstone 9 variable problem
                            • Examining a hierarchical solution
                            • Estimating Omega using EFA followed by CFA
                              • Summary and conclusion
Page 36: Using the psych package to generate and test structural models · 2020. 8. 22. · 1 The psych package 1.1 Preface The psych package (Revelle,2020) has been developed to include those

V1 V2 V3 V4 V5 V6 V7 V8 V9

V1 000 000 000 040 035 029 035 030 020

V2 000 000 000 035 030 025 031 026 018

V3 000 000 000 030 026 022 026 023 015

V4 040 035 030 000 000 000 027 024 016

V5 035 030 026 000 000 000 024 020 013

V6 029 025 022 000 000 000 020 017 011

V7 035 031 026 027 024 020 000 000 000

V8 030 026 023 024 020 017 000 000 000

V9 020 018 015 016 013 011 000 000 000

The residuals show serious problems with this model Although the residuals within eachof the three factors are zero the residuals between groups are much too large

56 Allowing for an oblique solution

The previous solution is clearly very bad What would happen if the exploratory solutionwere allowed to have correlated (oblique) factors

gt f3 lt-fa(bifact3) extract three factors and do an oblique rotation

gt modf3 lt- structuresem(f3) create the sem model

gt modf3 show it

Path Parameter Value

[1] MR1-gtV1 F1V1 NA

[2] MR1-gtV2 F1V2 NA

[3] MR1-gtV3 F1V3 NA

[4] MR3-gtV4 F2V4 NA

[5] MR3-gtV5 F2V5 NA

[6] MR3-gtV6 F2V6 NA

[7] MR2-gtV7 F3V7 NA

[8] MR2-gtV8 F3V8 NA

[9] MR2-gtV9 F3V9 NA

[10] V1lt-gtV1 x1e NA

[11] V2lt-gtV2 x2e NA

[12] V3lt-gtV3 x3e NA

[13] V4lt-gtV4 x4e NA

[14] V5lt-gtV5 x5e NA

[15] V6lt-gtV6 x6e NA

[16] V7lt-gtV7 x7e NA

[17] V8lt-gtV8 x8e NA

[18] V9lt-gtV9 x9e NA

[19] MR3lt-gtMR1 rF2F1 NA

[20] MR2lt-gtMR1 rF3F1 NA

[21] MR2lt-gtMR3 rF3F2 NA

[22] MR1lt-gtMR1 NA 1

[23] MR3lt-gtMR3 NA 1

[24] MR2lt-gtMR2 NA 1

attr(class)

[1] mod

The structure being tested may be seen using structuregraph

This makes much better sense and in fact (as hoped) recovers the original structure

36

Structural model

V1

V2

V3

V4

V5

V6

V7

V8

V9

MR1

08

0706

MR307

0605

MR207

06

04

07

06

06

Figure 13 A three factor oblique solution

37

57 Extract a bifactor solution using omega and then test that modelusing sem

A bifactor solution has previously been shown (Figure 9) The output from the omega

function includes the sem commands for the analysis As an example of doing this withreal rather than simulated data consider 9 variables from Thurstone For completenessthe stdCoef from sem is used as well as the summary function

571 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities origi-nally reported by Thurstone and used in the SAS manual for PROC CALIS and discussedin detail by McDonald (1999) The data matrix as reported by Fox may be found in theThurstone data set (which is ldquolazy loadedrdquo) Using the commands just shown it is possibleto analyze this data set using a bifactor solution (Figure 14)

gt sembi lt- sem(omthbi$model$semThurstone213) use the model created by omega

gt summary(sembidigits=2)

Model Chisquare = 242163 Df = 18 Pr(gtChisq) = 01480685

AIC = 782163

BIC = -7228696

Normalized Residuals

Min 1st Qu Median Mean 3rd Qu Max

-08211748 -03341044 -00000009 00281693 01561983 17968088

R-square for Endogenous Variables

Sentences Vocabulary SentCompletion FirstLetters FourLetterWords

08276 08302 07315 07472 06126

Suffixes LetterSeries Pedigrees LetterGroup

04824 08503 04996 04483

Parameter Estimates

Estimate Std Error z value Pr(gt|z|)

Sentences 07678671 007059396 108772353 1479833e-27

Vocabulary 07909248 006969232 113488087 7518003e-30

SentCompletion 07536211 007113218 105946585 3154903e-26

FirstLetters 06083814 007063841 86126138 7141338e-18

FourLetterWords 05973349 007092937 84215455 3715499e-17

Suffixes 05717903 007157752 79884057 1366950e-15

LetterSeries 05668949 007249339 78199523 5284337e-15

Pedigrees 06623314 007003035 94577757 3145633e-21

LetterGroup 05299524 007332494 72274501 4921470e-13

F1Sentences 04878698 008141095 59926801 2064107e-09

F1Vocabulary 04523234 008353995 54144562 6147524e-08

F1SentCompletion 04044507 008727334 46342988 3581494e-06

F2FirstLetters 06140531 008471145 72487623 4205973e-13

F2FourLetterWords 05058063 008145488 62096500 5310276e-10

F2Suffixes 03943208 007805383 50519075 4374195e-07

F3LetterSeries 07272955 015844866 45901015 4430304e-06

F3Pedigrees 02468417 008677536 28446053 4446649e-03

F3LetterGroup 04091495 011352380 36040854 3132541e-04

e1 01723633 003405646 50611045 4168346e-07

38

Omega

F1

F2

F3

g

Sentences

Vocabulary

SentCompletion

FirstLetters

FourLetterWords

Suffixes

LetterSeries

LetterGroup

Pedigrees

056

055

052

024

056

049

041

062

046

034

071

073

068

065

062

056

059

054

058

Figure 14 A bifactor solution to the Thurstone 9 variable problem All items load ona general factor of ability the residual factors account for the correlations between itemswithin groups

39

e2 01698419 003001233 56590697 1521958e-08

e3 02684749 003316228 80957909 5689350e-16

e4 02528108 007942835 31828791 1458185e-03

e5 03873510 006317399 61314949 8705712e-10

e6 05175679 005955079 86912013 3586269e-18

e7 01496709 021861502 06846325 4935759e-01

e8 05003855 005956551 84005902 4442400e-17

e9 05517474 008455914 65249884 6800680e-11

Sentences Sentences lt--- g

Vocabulary Vocabulary lt--- g

SentCompletion SentCompletion lt--- g

FirstLetters FirstLetters lt--- g

FourLetterWords FourLetterWords lt--- g

Suffixes Suffixes lt--- g

LetterSeries LetterSeries lt--- g

Pedigrees Pedigrees lt--- g

LetterGroup LetterGroup lt--- g

F1Sentences Sentences lt--- F1

F1Vocabulary Vocabulary lt--- F1

F1SentCompletion SentCompletion lt--- F1

F2FirstLetters FirstLetters lt--- F2

F2FourLetterWords FourLetterWords lt--- F2

F2Suffixes Suffixes lt--- F2

F3LetterSeries LetterSeries lt--- F3

F3Pedigrees Pedigrees lt--- F3

F3LetterGroup LetterGroup lt--- F3

e1 Sentences lt--gt Sentences

e2 Vocabulary lt--gt Vocabulary

e3 SentCompletion lt--gt SentCompletion

e4 FirstLetters lt--gt FirstLetters

e5 FourLetterWords lt--gt FourLetterWords

e6 Suffixes lt--gt Suffixes

e7 LetterSeries lt--gt LetterSeries

e8 Pedigrees lt--gt Pedigrees

e9 LetterGroup lt--gt LetterGroup

Iterations = 72

> stdCoef(sem.bi,digits=2)
                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below and to the sem manual.
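One way to read the standardized bifactor solution is to check that, for each variable, the squared loadings and the residual variance add to one. The sketch below does this in base R for Sentences, using numbers copied from the standardized output above. It assumes orthogonal general and group factors, as in this bifactor model, and the variable names are illustrative.

```r
# Standardized values for Sentences, copied from the bifactor solution
loading_g  <- 0.7678671   # Sentences <--- g
loading_F1 <- 0.4878697   # Sentences <--- F1
resid_e1   <- 0.1723633   # Sentences <--> Sentences

# With orthogonal factors the communality is the sum of the squared loadings
communality <- loading_g^2 + loading_F1^2

print(round(c(h2 = communality, one_minus_e1 = 1 - resid_e1), 4))
```

The communality agrees with the R-square reported for Sentences in the summary output (0.8276).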

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> sem.hi <- sem(om.hi$model$sem,Thurstone,213)
> summary(sem.hi,digits=2)

Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
AIC =  80.1963
BIC =  -90.47471

Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

Iterations =  54
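The AIC and BIC lines of the hierarchical summary can be recovered from the model chi square. The following base R sketch assumes AIC = chi square + 2 x (number of free parameters) and BIC = chi square - df x log(N). These formulas are inferred from the printed values rather than taken from the sem documentation, and the variable names are illustrative.

```r
chisq  <- 38.1963   # Model Chisquare of the hierarchical model
df     <- 24        # its degrees of freedom
n_obs  <- 213       # number of observations
n_vars <- 9         # number of observed variables

# Free parameters = distinct elements of the covariance matrix minus df
n_par <- n_vars * (n_vars + 1) / 2 - df   # 45 - 24 = 21

aic <- chisq + 2 * n_par
bic <- chisq - df * log(n_obs)

print(c(AIC = aic, BIC = bic))
```

The 21 free parameters correspond to the 3 g paths, 9 factor loadings and 9 residual variances listed in the parameter table.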

> stdCoef(sem.hi,digits=2)
                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g

> anova(sem.hi,sem.bi)

LR Test for Difference Between Models

       Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.hi       24      38.196
sem.bi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model (LR chi square = 13.98 with 6 df, p = 0.03).
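To see how the hierarchical solution relates to a bifactor pattern, the standardized hierarchical estimates can be converted to general and residualized group loadings with the Schmid-Leiman transformation. The sketch below does this by hand in base R from the standardized estimates reported above. It is an illustration of the transformation, not the code used by omega, and the object names are made up for the example.

```r
# Standardized hierarchical estimates copied from the output above
g_to_factor <- c(F1 = 0.8220754, F2 = 0.7817998, F3 = 0.8150140)
item_on_factor <- c(Sentences = 0.9047111, Vocabulary = 0.9138214,
                    SentCompletion = 0.8560764, FirstLetters = 0.8357617,
                    FourLetterWords = 0.7971819, Suffixes = 0.7025560,
                    LetterSeries = 0.7808129, Pedigrees = 0.7201599,
                    LetterGroup = 0.7034902)
factor_of_item <- rep(c("F1", "F2", "F3"), each = 3)

# Schmid-Leiman: general loading = item loading x higher order loading;
# group loading = item loading x sqrt(1 - higher order loading^2)
higher          <- g_to_factor[factor_of_item]
general_loading <- item_on_factor * higher
group_loading   <- item_on_factor * sqrt(1 - higher^2)

print(round(cbind(g = general_loading, group = group_loading), 2))
```

Within each group the general and group loadings are proportional, so the hierarchical model is a constrained version of the bifactor model. This is one way to see why it has 6 fewer free parameters (21 rather than 27).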

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are the 14 variables of the Holzinger data set and the 16 variables of the Reise data set. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> om.holz.bi <- omega(Holzinger,4)
> sem.holz.bi <- sem(om.holz.bi$model$sem,Holzinger,355)
> om.holz.hi <- omega(Holzinger,4,sl=FALSE)
> sem.holz.hi <- sem(om.holz.hi$model$sem,Holzinger,355)
> anova(sem.holz.bi,sem.holz.hi)

LR Test for Difference Between Models

            Model Df Model Chisq Df LR Chisq Pr(>Chisq)
sem.holz.bi       63      147.66
sem.holz.hi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
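Both comparisons are likelihood ratio tests of nested models: the difference between the two model chi squares is referred to a chi square distribution with the difference in degrees of freedom. A minimal base R sketch, using only the chi squares and degrees of freedom printed in the two tables (the helper lr_test is a name made up for this example):

```r
# Likelihood ratio test for two nested models, from their chi squares and df
lr_test <- function(chisq_restricted, df_restricted, chisq_general, df_general) {
  lr_chisq <- chisq_restricted - chisq_general
  lr_df    <- df_restricted - df_general
  c(LR = lr_chisq, df = lr_df,
    p = pchisq(lr_chisq, lr_df, lower.tail = FALSE))
}

# Hierarchical (restricted) versus bifactor (general) models
thurstone <- lr_test(38.1963, 24, 24.2163, 18)
holzinger <- lr_test(178.79, 73, 147.66, 63)

print(signif(rbind(thurstone, holzinger), 4))
```

Because the Holzinger chi squares are entered as rounded in the table, the recomputed statistic can differ from the printed 31.129 in the last digit.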

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions, and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

The following analyses were done using the lavaan package

Omega Hierarchical from a confirmatory model using sem =  0.79
Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models (0.74 and 0.79, respectively). The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
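The confirmatory values of ωh and ωt can be approximated by hand from the rounded CFA loadings printed above. The sketch below assumes orthogonal general and group factors, so that the variance of the total score is the sum of the squared column sums of the loadings plus the sum of the uniquenesses. It is an illustration in base R with made-up object names, not the code used by omegaSem.

```r
# Rounded confirmatory loadings, in the order of the printed table
g_load <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
group_load <- matrix(0, nrow = 9, ncol = 3,
                     dimnames = list(NULL, c("F1", "F2", "F3")))
group_load[1:3, "F1"] <- c(0.49, 0.45, 0.40)
group_load[4:6, "F2"] <- c(0.61, 0.50, 0.39)
group_load[7:9, "F3"] <- c(0.73, 0.25, 0.41)
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)

# Variance of the unit weighted total score attributable to each source
general_var <- sum(g_load)^2
group_var   <- sum(colSums(group_load)^2)
total_var   <- general_var + group_var + sum(u2)

omega_h <- general_var / total_var                 # general factor only
omega_t <- (general_var + group_var) / total_var   # all common factors

print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

Applying the same arithmetic to the exploratory loadings gives only an approximation to the reported 0.74, since loadings below 0.2 are suppressed in that printout.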

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Z., and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1-36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


Figure 13: A three factor, oblique solution. (Path diagram titled "Structural model": the variables V1 to V9 load on the factors MR1, MR2 and MR3.)

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

> sem.bi <- sem(om.th.bi$model$sem,Thurstone,213)  #use the model created by omega
> summary(sem.bi,digits=2)

Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
AIC =  78.2163
BIC =  -72.28696

Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

Parameter Estimates
                  Estimate  Std Error  z value    Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07


Holzinger K and Swineford F (1937) The bi-factor method Psychometrika 2(1)41ndash54

Jensen A R and Weng L-J (1994) What is a good g Intelligence 18(3)231ndash258

49

McDonald R P (1999) Test theory A unified treatment L Erlbaum Associates MahwahNJ

Rafaeli E and Revelle W (2006) A premature consensus Are happiness and sadnesstruly opposite affects Motivation and Emotion 30(1)1ndash12

Reise S Morizot J and Hays R (2007) The role of the bifactor model in resolvingdimensionality issues in health outcomes measures Quality of Life Research 16(0)19ndash31

Revelle W (2020) psych Procedures for Personality and Psychological Research North-western University Evanston R package version 208

Rosseel Y (2012) llavaan An R Package for Structural Equation Modeling Journal ofStatistical Software 48(2)1-36 R package version 05-14

Schmid J J and Leiman J M (1957) The development of hierarchical factor solutionsPsychometrika 22(1)83ndash90

Thurstone L L and Thurstone T G (1941) Factorial studies of intelligence TheUniversity of Chicago press Chicago Ill

50

5.7 Extract a bifactor solution using omega and then test that model using sem

A bifactor solution has previously been shown (Figure 9). The output from the omega function includes the sem commands for the analysis. As an example of doing this with real rather than simulated data, consider 9 variables from Thurstone. For completeness, the stdCoef from sem is used as well as the summary function.

5.7.1 sem of Thurstone 9 variable problem

The sem manual includes an example of a hierarchical solution to 9 mental abilities originally reported by Thurstone and used in the SAS manual for PROC CALIS, and discussed in detail by McDonald (1999). The data matrix, as reported by Fox, may be found in the Thurstone data set (which is "lazy loaded"). Using the commands just shown, it is possible to analyze this data set using a bifactor solution (Figure 14).

Figure 14: A bifactor solution to the Thurstone 9 variable problem. All items load on a general factor of ability; the residual factors account for the correlations between items within groups.

> sembi <- sem(omthbi$model$sem, Thurstone, 213)  # use the model created by omega
> summary(sembi, digits=2)

 Model Chisquare =  24.2163   Df =  18 Pr(>Chisq) = 0.1480685
 AIC =  78.2163
 BIC =  -72.28696

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.8211748 -0.3341044 -0.0000009  0.0281693  0.1561983  1.7968088

 R-square for Endogenous Variables
      Sentences      Vocabulary  SentCompletion    FirstLetters FourLetterWords
         0.8276          0.8302          0.7315          0.7472          0.6126
       Suffixes    LetterSeries       Pedigrees     LetterGroup
         0.4824          0.8503          0.4996          0.4483

 Parameter Estimates
                   Estimate  Std Error    z value     Pr(>|z|)
Sentences         0.7678671 0.07059396 10.8772353 1.479833e-27
Vocabulary        0.7909248 0.06969232 11.3488087 7.518003e-30
SentCompletion    0.7536211 0.07113218 10.5946585 3.154903e-26
FirstLetters      0.6083814 0.07063841  8.6126138 7.141338e-18
FourLetterWords   0.5973349 0.07092937  8.4215455 3.715499e-17
Suffixes          0.5717903 0.07157752  7.9884057 1.366950e-15
LetterSeries      0.5668949 0.07249339  7.8199523 5.284337e-15
Pedigrees         0.6623314 0.07003035  9.4577757 3.145633e-21
LetterGroup       0.5299524 0.07332494  7.2274501 4.921470e-13
F1Sentences       0.4878698 0.08141095  5.9926801 2.064107e-09
F1Vocabulary      0.4523234 0.08353995  5.4144562 6.147524e-08
F1SentCompletion  0.4044507 0.08727334  4.6342988 3.581494e-06
F2FirstLetters    0.6140531 0.08471145  7.2487623 4.205973e-13
F2FourLetterWords 0.5058063 0.08145488  6.2096500 5.310276e-10
F2Suffixes        0.3943208 0.07805383  5.0519075 4.374195e-07
F3LetterSeries    0.7272955 0.15844866  4.5901015 4.430304e-06
F3Pedigrees       0.2468417 0.08677536  2.8446053 4.446649e-03
F3LetterGroup     0.4091495 0.11352380  3.6040854 3.132541e-04
e1                0.1723633 0.03405646  5.0611045 4.168346e-07
e2                0.1698419 0.03001233  5.6590697 1.521958e-08
e3                0.2684749 0.03316228  8.0957909 5.689350e-16
e4                0.2528108 0.07942835  3.1828791 1.458185e-03
e5                0.3873510 0.06317399  6.1314949 8.705712e-10
e6                0.5175679 0.05955079  8.6912013 3.586269e-18
e7                0.1496709 0.21861502  0.6846325 4.935759e-01
e8                0.5003855 0.05956551  8.4005902 4.442400e-17
e9                0.5517474 0.08455914  6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  72

> stdCoef(sembi, digits=2)

                     Std. Estimate
1  Sentences             0.7678671 Sentences <--- g
2  Vocabulary            0.7909246 Vocabulary <--- g
3  SentCompletion        0.7536211 SentCompletion <--- g
4  FirstLetters          0.6083814 FirstLetters <--- g
5  FourLetterWords       0.5973349 FourLetterWords <--- g
6  Suffixes              0.5717900 Suffixes <--- g
7  LetterSeries          0.5668950 LetterSeries <--- g
8  Pedigrees             0.6623317 Pedigrees <--- g
9  LetterGroup           0.5299523 LetterGroup <--- g
10 F1Sentences           0.4878697 Sentences <--- F1
11 F1Vocabulary          0.4523233 Vocabulary <--- F1
12 F1SentCompletion      0.4044507 SentCompletion <--- F1
13 F2FirstLetters        0.6140531 FirstLetters <--- F2
14 F2FourLetterWords     0.5058063 FourLetterWords <--- F2
15 F2Suffixes            0.3943206 Suffixes <--- F2
16 F3LetterSeries        0.7272957 LetterSeries <--- F3
17 F3Pedigrees           0.2468418 Pedigrees <--- F3
18 F3LetterGroup         0.4091494 LetterGroup <--- F3
19 e1                    0.1723633 Sentences <--> Sentences
20 e2                    0.1698418 Vocabulary <--> Vocabulary
21 e3                    0.2684748 SentCompletion <--> SentCompletion
22 e4                    0.2528108 FirstLetters <--> FirstLetters
23 e5                    0.3873510 FourLetterWords <--> FourLetterWords
24 e6                    0.5175675 Suffixes <--> Suffixes
25 e7                    0.1496710 LetterSeries <--> LetterSeries
26 e8                    0.5003859 Pedigrees <--> Pedigrees
27 e9                    0.5517473 LetterGroup <--> LetterGroup
28                       1.0000000 F1 <--> F1
29                       1.0000000 F2 <--> F2
30                       1.0000000 F3 <--> F3
31                       1.0000000 g <--> g

Compare this solution to the one reported below and to the sem manual.
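The fit indices reported for the bifactor model can be checked by hand, which is a useful habit when reading sem output. The following is a minimal sketch in base R, not code taken from the sem package. It assumes that AIC is the model chi square plus twice the number of free parameters and that BIC is the chi square minus df times log(N); these definitions are assumptions, chosen because they agree with the reported values. The variable names are illustrative.

```r
# Values copied from the summary of the bifactor model
chisq <- 24.2163   # model chi square
df    <- 18        # degrees of freedom
n_obs <- 213       # number of observations
n_var <- 9         # number of observed variables

# Free parameters: 9 g loadings + 9 group loadings + 9 error variances
n_par <- 9 + 9 + 9

# 45 unique variances and covariances minus 27 parameters leaves 18 df
stopifnot(n_var * (n_var + 1) / 2 - n_par == df)

p_value <- pchisq(chisq, df, lower.tail = FALSE)
aic     <- chisq + 2 * n_par         # assumed definition of AIC
bic     <- chisq - df * log(n_obs)   # assumed definition of BIC

round(c(p = p_value, AIC = aic, BIC = bic), 4)
```

The same arithmetic applies to any model fitted this way: only the chi square, the degrees of freedom and the parameter count change.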

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> semhi <- sem(omhi$model$sem, Thurstone, 213)
> summary(semhi, digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                   Estimate  Std Error  z value     Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08
gF2               1.2538296 0.21136562 5.932041 2.991910e-09
gF3               1.4065517 0.26890804 5.230605 1.689563e-07
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11
e1                0.1814979 0.02847741 6.373397 1.848862e-10
e2                0.1649304 0.02776938 5.939292 2.862558e-09
e3                0.2671331 0.03336340 8.006771 1.177597e-15
e4                0.3015024 0.05102191 5.909274 3.436179e-09
e5                0.3645010 0.05263547 6.925008 4.359513e-12
e6                0.5064150 0.05962608 8.493180 2.010593e-17
e7                0.3903313 0.05933649 6.578268 4.759607e-11
e8                0.4813697 0.06224844 7.733041 1.050075e-14
e9                0.5051017 0.06332869 7.975875 1.513055e-15

gF1               F1 <--- g
gF2               F2 <--- g
gF3               F3 <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

 Iterations =  54

> stdCoef(semhi, digits=2)

                     Std. Estimate
1  gF1                   0.8220754 F1 <--- g
2  gF2                   0.7817998 F2 <--- g
3  gF3                   0.8150140 F3 <--- g
4  F1Sentences           0.9047111 Sentences <--- F1
5  F1Vocabulary          0.9138214 Vocabulary <--- F1
6  F1SentCompletion      0.8560764 SentCompletion <--- F1
7  F2FirstLetters        0.8357617 FirstLetters <--- F2
8  F2FourLetterWords     0.7971819 FourLetterWords <--- F2
9  F2Suffixes            0.7025560 Suffixes <--- F2
10 F3LetterSeries        0.7808129 LetterSeries <--- F3
11 F3Pedigrees           0.7201599 Pedigrees <--- F3
12 F3LetterGroup         0.7034902 LetterGroup <--- F3
13 e1                    0.1814979 Sentences <--> Sentences
14 e2                    0.1649304 Vocabulary <--> Vocabulary
15 e3                    0.2671331 SentCompletion <--> SentCompletion
16 e4                    0.3015024 FirstLetters <--> FirstLetters
17 e5                    0.3645010 FourLetterWords <--> FourLetterWords
18 e6                    0.5064151 Suffixes <--> Suffixes
19 e7                    0.3903313 LetterSeries <--> LetterSeries
20 e8                    0.4813697 Pedigrees <--> Pedigrees
21 e9                    0.5051016 LetterGroup <--> LetterGroup
22                       0.3241920 F1 <--> F1
23                       0.3887891 F2 <--> F2
24                       0.3357521 F3 <--> F3
25                       1.0000000 g <--> g

> anova(semhi, sembi)

LR Test for Difference Between Models

      Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semhi       24      38.196
sembi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
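To compare the confirmatory hierarchical solution with the exploratory Schmid-Leiman loadings, the standardized higher order paths can be multiplied through to the items. The sketch below uses only base R and the standardized coefficients reported by stdCoef for the hierarchical model. It assumes the usual Schmid and Leiman (1957) decomposition: the general factor loading of an item is its first order loading times the g path of its factor, and the residualized group loading is the first order loading times the square root of the factor's residual variance. The object names are illustrative.

```r
# Standardized paths copied from stdCoef() for the hierarchical model
g_path <- c(F1 = 0.8220754, F2 = 0.7817998, F3 = 0.8150140)   # g -> group factor
lambda <- c(Sentences = 0.9047111, Vocabulary = 0.9138214,
            SentCompletion = 0.8560764, FirstLetters = 0.8357617,
            FourLetterWords = 0.7971819, Suffixes = 0.7025560,
            LetterSeries = 0.7808129, Pedigrees = 0.7201599,
            LetterGroup = 0.7034902)                           # factor -> item
factor_of <- rep(c("F1", "F2", "F3"), each = 3)                # item's group factor

# Schmid-Leiman style decomposition of each item's loading
g_loading     <- lambda * g_path[factor_of]
group_loading <- lambda * sqrt(1 - g_path[factor_of]^2)

round(cbind(g = g_loading, group = group_loading, h2 = lambda^2), 2)
```

For Sentences this gives a general loading of about 0.74 and a group loading of about 0.52, against 0.71 and 0.56 in the exploratory omega solution. Because the two parts are orthogonal, their squares sum to the item's R-square (0.8185 for Sentences).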

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are the 14 variables of Holzinger and the 16 variables of Reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> omholzbi <- omega(Holzinger, 4)
> semholzbi <- sem(omholzbi$model$sem, Holzinger, 355)
> omholzhi <- omega(Holzinger, 4, sl=FALSE)
> semholzhi <- sem(omholzhi$model$sem, Holzinger, 355)
> anova(semholzbi, semholzhi)

LR Test for Difference Between Models

          Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semholzbi       63      147.66
semholzhi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
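The likelihood ratio tests reported by anova are chi square difference tests between nested models, and they can be reproduced from the two model chi squares alone. The following base R sketch is illustrative rather than part of sem; the helper name lr_test is hypothetical, and the inputs are the rounded chi squares printed in the anova tables for the Thurstone and Holzinger comparisons.

```r
# Chi square difference test for two nested models.
# The restricted model (here the hierarchical one) has more df.
lr_test <- function(chisq_restricted, df_restricted, chisq_general, df_general) {
  lr  <- chisq_restricted - chisq_general   # difference in chi square
  ddf <- df_restricted - df_general         # difference in df
  c(LR = lr, df = ddf, p = pchisq(lr, ddf, lower.tail = FALSE))
}

# Thurstone: hierarchical (24 df) versus bifactor (18 df)
thurstone <- lr_test(38.196, 24, 24.216, 18)

# Holzinger: hierarchical (73 df) versus bifactor (63 df)
holzinger <- lr_test(178.79, 73, 147.66, 63)

signif(rbind(thurstone, holzinger), 4)
```

In both data sets the difference is significant at the 0.05 level, so the additional parameters of the bifactor model improve the fit; the evidence is much stronger for the Holzinger variables than for the Thurstone variables.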

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> omsem <- omegaSem(Thurstone, n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
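The confirmatory values of ωh and ωt can be approximated directly from the table of confirmatory loadings. The base R sketch below is an illustration under stated assumptions and not the code used by omegaSem: it takes the two-decimal loadings and uniquenesses from the confirmatory table, and it uses the model-implied variance of the total score (general part, group parts and uniquenesses) as the denominator, whereas psych may work from the observed correlation matrix. The variable names are illustrative.

```r
# Confirmatory bifactor loadings (items in the order of the table)
g   <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)   # general factor
grp <- c(0.49, 0.45, 0.40, 0.61, 0.50, 0.39, 0.73, 0.25, 0.41)   # own group factor
u2  <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)   # uniquenesses
group <- rep(c("F1", "F2", "F3"), each = 3)

# Variance of the unit weighted total score implied by the model
general_var <- sum(g)^2                        # due to the general factor
group_var   <- sum(tapply(grp, group, sum)^2)  # due to the three group factors
total_var   <- general_var + group_var + sum(u2)

omega_h <- general_var / total_var                  # general factor saturation
omega_t <- (general_var + group_var) / total_var    # all common variance

# Explained common variance of the general factor
ecv <- sum(g^2) / (sum(g^2) + sum(grp^2))

round(c(omega_h = omega_h, omega_t = omega_t, ECV = ecv), 2)
```

With these inputs the sketch returns 0.79, 0.93 and 0.64, in agreement with the confirmatory output to two decimals. Replacing the loadings with those of the exploratory solution shows how the smaller g loadings there lower ωh.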

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.

Omega

F1

F2

F3

g

Sentences

Vocabulary

SentCompletion

FirstLetters

FourLetterWords

Suffixes

LetterSeries

LetterGroup

Pedigrees

056

055

052

024

056

049

041

062

046

034

071

073

068

065

062

056

059

054

058

Figure 14 A bifactor solution to the Thurstone 9 variable problem All items load ona general factor of ability the residual factors account for the correlations between itemswithin groups

39

e2 01698419 003001233 56590697 1521958e-08

e3 02684749 003316228 80957909 5689350e-16

e4 02528108 007942835 31828791 1458185e-03

e5 03873510 006317399 61314949 8705712e-10

e6 05175679 005955079 86912013 3586269e-18

e7 01496709 021861502 06846325 4935759e-01

e8 05003855 005956551 84005902 4442400e-17

e9 05517474 008455914 65249884 6800680e-11

Sentences Sentences lt--- g

Vocabulary Vocabulary lt--- g

SentCompletion SentCompletion lt--- g

FirstLetters FirstLetters lt--- g

FourLetterWords FourLetterWords lt--- g

Suffixes Suffixes lt--- g

LetterSeries LetterSeries lt--- g

Pedigrees Pedigrees lt--- g

LetterGroup LetterGroup lt--- g

F1Sentences Sentences lt--- F1

F1Vocabulary Vocabulary lt--- F1

F1SentCompletion SentCompletion lt--- F1

F2FirstLetters FirstLetters lt--- F2

F2FourLetterWords FourLetterWords lt--- F2

F2Suffixes Suffixes lt--- F2

F3LetterSeries LetterSeries lt--- F3

F3Pedigrees Pedigrees lt--- F3

F3LetterGroup LetterGroup lt--- F3

e1 Sentences lt--gt Sentences

e2 Vocabulary lt--gt Vocabulary

e3 SentCompletion lt--gt SentCompletion

e4 FirstLetters lt--gt FirstLetters

e5 FourLetterWords lt--gt FourLetterWords

e6 Suffixes lt--gt Suffixes

e7 LetterSeries lt--gt LetterSeries

e8 Pedigrees lt--gt Pedigrees

e9 LetterGroup lt--gt LetterGroup

Iterations = 72

gt stdCoef(sembidigits=2)

Std Estimate

1 Sentences 07678671 Sentences lt--- g

2 Vocabulary 07909246 Vocabulary lt--- g

3 SentCompletion 07536211 SentCompletion lt--- g

4 FirstLetters 06083814 FirstLetters lt--- g

5 FourLetterWords 05973349 FourLetterWords lt--- g

6 Suffixes 05717900 Suffixes lt--- g

7 LetterSeries 05668950 LetterSeries lt--- g

8 Pedigrees 06623317 Pedigrees lt--- g

9 LetterGroup 05299523 LetterGroup lt--- g

10 F1Sentences 04878697 Sentences lt--- F1

11 F1Vocabulary 04523233 Vocabulary lt--- F1

12 F1SentCompletion 04044507 SentCompletion lt--- F1

13 F2FirstLetters 06140531 FirstLetters lt--- F2

14 F2FourLetterWords 05058063 FourLetterWords lt--- F2

15 F2Suffixes 03943206 Suffixes lt--- F2

16 F3LetterSeries 07272957 LetterSeries lt--- F3

40

17 F3Pedigrees 02468418 Pedigrees lt--- F3

18 F3LetterGroup 04091494 LetterGroup lt--- F3

19 e1 01723633 Sentences lt--gt Sentences

20 e2 01698418 Vocabulary lt--gt Vocabulary

21 e3 02684748 SentCompletion lt--gt SentCompletion

22 e4 02528108 FirstLetters lt--gt FirstLetters

23 e5 03873510 FourLetterWords lt--gt FourLetterWords

24 e6 05175675 Suffixes lt--gt Suffixes

25 e7 01496710 LetterSeries lt--gt LetterSeries

26 e8 05003859 Pedigrees lt--gt Pedigrees

27 e9 05517473 LetterGroup lt--gt LetterGroup

28 10000000 F1 lt--gt F1

29 10000000 F2 lt--gt F2

30 10000000 F3 lt--gt F3

31 10000000 g lt--gt g

Compare this solution to the one reported below and to the sem manual

58 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Fig-ure 10) The output of that analysis can be used as a model for a sem analysis Once againthe stdCoef function helps see the structure Alternatively using the omega function onthe Thurstone data will create the model for this particular data set

gt semhi lt- sem(omhi$model$semThurstone213)

gt summary(semhidigits=2)

Model Chisquare = 381963 Df = 24 Pr(gtChisq) = 003310059

AIC = 801963

BIC = -9047471

Normalized Residuals

Min 1st Qu Median Mean 3rd Qu Max

-09724643 -04164673 -00000001 00401007 00938588 16274666

R-square for Endogenous Variables

F1 F2 F3 Sentences Vocabulary

06758 06112 06642 08185 08351

SentCompletion FirstLetters FourLetterWords Suffixes LetterSeries

07329 06985 06355 04936 06097

   Pedigrees LetterGroup
      0.5186      0.4949

Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

Iterations = 54

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package.

> stdCoef(semhi, digits=2)
                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g
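As a quick consistency check on the standardized hierarchical solution, the squared standardized loading of each lower order factor on g should reproduce the R-square values that summary reports for F1, F2 and F3 (0.6758, 0.6112, 0.6642), and adding the standardized residual variance of each factor should give 1. The sketch below simply retypes the printed values; the variable names are illustrative and are not part of psych or sem.

```r
# Standardized loadings of the group factors on g (stdCoef rows 1-3).
g_loading <- c(F1 = 0.8220754, F2 = 0.7817998, F3 = 0.8150140)

# Standardized residual variances of the group factors (stdCoef rows 22-24).
resid_var <- c(F1 = 0.3241920, F2 = 0.3887891, F3 = 0.3357521)

# Proportion of variance in each group factor that is explained by g.
r_square <- g_loading^2

# Explained plus residual variance should be 1 for every factor.
check <- r_square + resid_var

print(round(r_square, 4))
print(round(check, 4))
```

The same arithmetic applies to the observed variables: for example, 0.9047111 squared plus the residual e1 recovers the unit variance of Sentences.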

> anova(semhi, sembi)
LR Test for Difference Between Models

      Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semhi       24      38.196
sembi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
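The likelihood ratio comparison of the hierarchical and bifactor models can be reproduced by hand from the two model chi-squares. The following is a minimal sketch in base R that uses only the rounded values printed by anova; the variable names are illustrative.

```r
# Model chi-squares and degrees of freedom, copied from the anova output.
chisq_hi <- 38.196; df_hi <- 24   # hierarchical model
chisq_bi <- 24.216; df_bi <- 18   # bifactor model

# The bifactor model frees 6 more parameters, so the difference in
# chi-square is itself distributed as chi-square with 6 df.
lr_chisq <- chisq_hi - chisq_bi
lr_df    <- df_hi - df_bi
p_value  <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

print(c(LR_chisq = lr_chisq, df = lr_df, p = round(p_value, 5)))
```

Because the inputs are rounded, the p value agrees with the reported 0.02986 only to about four decimal places.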

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> omholzbi <- omega(Holzinger, 4)
> semholzbi <- sem(omholzbi$model$sem, Holzinger, 355)
> omholzhi <- omega(Holzinger, 4, sl=FALSE)
> semholzhi <- sem(omholzhi$model$sem, Holzinger, 355)
> anova(semholzbi, semholzhi)
LR Test for Difference Between Models

          Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semholzbi       63      147.66
semholzhi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> omsem <- omegaSem(Thurstone, n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object
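The BIC values in the exploratory part of this output are consistent with the usual definition of chi-square minus the degrees of freedom times the log of the sample size. A small base R sketch, using the rounded chi-square values printed above (variable names are illustrative):

```r
n_obs <- 213  # number of observations passed to omegaSem

# Three group factors plus g: chi-square = 2.98 on 12 df.
bic_omega <- 2.98 - 12 * log(n_obs)

# General factor only: chi-square = 306.71 on 27 df.
bic_general <- 306.71 - 27 * log(n_obs)

print(round(c(omega = bic_omega, general_only = bic_general), 2))
```

A negative BIC for the full solution and a large positive BIC for the single factor model point the same way as the RMSEA values: the group factors are needed to account for these data.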

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
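The difference in ωh can be approximated directly from the two loading tables. The sketch below is not the psych implementation: it uses the model-implied variance of the total score, V = (Σg)² + Σ_k(Σ group loadings on factor k)² + Σu2, with ωh = (Σg)²/V and ωt = 1 − Σu2/V. It relies on the rounded two-decimal loadings, ignores loadings below the 0.2 print threshold, and assumes that the 0.24 cross loading of Pedigrees in the exploratory solution belongs to F1. The helper `omega_from_loadings` is a hypothetical name introduced for this illustration.

```r
# Approximate omega_h and omega_t from a bifactor-style loading table.
omega_from_loadings <- function(g, group_sums, u2) {
  # Model-implied variance of the unit-weighted total score.
  v_total <- sum(g)^2 + sum(group_sums^2) + sum(u2)
  c(omega_h = sum(g)^2 / v_total,
    omega_t = 1 - sum(u2) / v_total)
}

# Exploratory (Schmid Leiman) loadings; Pedigrees cross loads on F1 (assumed).
efa <- omega_from_loadings(
  g = c(0.71, 0.73, 0.68, 0.65, 0.62, 0.56, 0.59, 0.58, 0.54),
  group_sums = c(F1 = 0.56 + 0.55 + 0.52 + 0.24,
                 F2 = 0.56 + 0.49 + 0.41,
                 F3 = 0.62 + 0.34 + 0.46),
  u2 = c(0.18, 0.16, 0.26, 0.27, 0.37, 0.50, 0.27, 0.49, 0.48))

# Confirmatory loadings; a pure cluster structure with no cross loadings.
cfa <- omega_from_loadings(
  g = c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53),
  group_sums = c(F1 = 0.49 + 0.45 + 0.40,
                 F2 = 0.61 + 0.50 + 0.39,
                 F3 = 0.73 + 0.25 + 0.41),
  u2 = c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55))

print(round(rbind(EFA = efa, CFA = cfa), 2))
```

With these inputs the confirmatory loadings give ωh of about 0.79 and ωt of about 0.93, as reported. The exploratory loadings give roughly 0.75 rather than the reported 0.74, a gap that is plausibly due to rounding and to the use of the model-implied rather than the observed total variance. Either way the g loadings are larger in the confirmatory solution, which is what drives the higher ωh.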

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches, it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application. The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

Figure 16: Confirmatory Omega structure using omegaSem.

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.


e2 0.1698419 0.03001233 5.6590697 1.521958e-08
e3 0.2684749 0.03316228 8.0957909 5.689350e-16
e4 0.2528108 0.07942835 3.1828791 1.458185e-03
e5 0.3873510 0.06317399 6.1314949 8.705712e-10
e6 0.5175679 0.05955079 8.6912013 3.586269e-18
e7 0.1496709 0.21861502 0.6846325 4.935759e-01
e8 0.5003855 0.05956551 8.4005902 4.442400e-17
e9 0.5517474 0.08455914 6.5249884 6.800680e-11

Sentences         Sentences <--- g
Vocabulary        Vocabulary <--- g
SentCompletion    SentCompletion <--- g
FirstLetters      FirstLetters <--- g
FourLetterWords   FourLetterWords <--- g
Suffixes          Suffixes <--- g
LetterSeries      LetterSeries <--- g
Pedigrees         Pedigrees <--- g
LetterGroup       LetterGroup <--- g
F1Sentences       Sentences <--- F1
F1Vocabulary      Vocabulary <--- F1
F1SentCompletion  SentCompletion <--- F1
F2FirstLetters    FirstLetters <--- F2
F2FourLetterWords FourLetterWords <--- F2
F2Suffixes        Suffixes <--- F2
F3LetterSeries    LetterSeries <--- F3
F3Pedigrees       Pedigrees <--- F3
F3LetterGroup     LetterGroup <--- F3
e1                Sentences <--> Sentences
e2                Vocabulary <--> Vocabulary
e3                SentCompletion <--> SentCompletion
e4                FirstLetters <--> FirstLetters
e5                FourLetterWords <--> FourLetterWords
e6                Suffixes <--> Suffixes
e7                LetterSeries <--> LetterSeries
e8                Pedigrees <--> Pedigrees
e9                LetterGroup <--> LetterGroup

Iterations = 72

> stdCoef(sembi, digits=2)
                     Std. Estimate
1  Sentences         0.7678671 Sentences <--- g
2  Vocabulary        0.7909246 Vocabulary <--- g
3  SentCompletion    0.7536211 SentCompletion <--- g
4  FirstLetters      0.6083814 FirstLetters <--- g
5  FourLetterWords   0.5973349 FourLetterWords <--- g
6  Suffixes          0.5717900 Suffixes <--- g
7  LetterSeries      0.5668950 LetterSeries <--- g
8  Pedigrees         0.6623317 Pedigrees <--- g
9  LetterGroup       0.5299523 LetterGroup <--- g
10 F1Sentences       0.4878697 Sentences <--- F1
11 F1Vocabulary      0.4523233 Vocabulary <--- F1
12 F1SentCompletion  0.4044507 SentCompletion <--- F1
13 F2FirstLetters    0.6140531 FirstLetters <--- F2
14 F2FourLetterWords 0.5058063 FourLetterWords <--- F2
15 F2Suffixes        0.3943206 Suffixes <--- F2
16 F3LetterSeries    0.7272957 LetterSeries <--- F3
17 F3Pedigrees       0.2468418 Pedigrees <--- F3
18 F3LetterGroup     0.4091494 LetterGroup <--- F3
19 e1                0.1723633 Sentences <--> Sentences
20 e2                0.1698418 Vocabulary <--> Vocabulary
21 e3                0.2684748 SentCompletion <--> SentCompletion
22 e4                0.2528108 FirstLetters <--> FirstLetters
23 e5                0.3873510 FourLetterWords <--> FourLetterWords
24 e6                0.5175675 Suffixes <--> Suffixes
25 e7                0.1496710 LetterSeries <--> LetterSeries
26 e8                0.5003859 Pedigrees <--> Pedigrees
27 e9                0.5517473 LetterGroup <--> LetterGroup
28                   1.0000000 F1 <--> F1
29                   1.0000000 F2 <--> F2
30                   1.0000000 F3 <--> F3
31                   1.0000000 g <--> g

Compare this solution to the one reported below and to the sem manual.
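One way to read the standardized bifactor solution is as a variance decomposition: because g and the group factors are orthogonal with unit variance, the squared g loading, the squared group factor loading and the error variance of each variable should sum to 1. A minimal base R sketch for three of the nine variables, with the values retyped from the stdCoef output (the data frame and column names are illustrative):

```r
# Standardized estimates copied from the stdCoef output for the bifactor model.
std <- data.frame(
  variable = c("Sentences", "Vocabulary", "LetterSeries"),
  g        = c(0.7678671, 0.7909246, 0.5668950),   # loading on g
  group    = c(0.4878697, 0.4523233, 0.7272957),   # loading on F1, F1, F3
  error    = c(0.1723633, 0.1698418, 0.1496710))   # e1, e2, e7

# Orthogonal factors: total variance = g^2 + group^2 + error.
std$total <- std$g^2 + std$group^2 + std$error

print(std)
```

The total column should come out at 1 for every row, up to rounding in the printed estimates.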

5.8 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Figure 10). The output of that analysis can be used as a model for a sem analysis. Once again, the stdCoef function helps see the structure. Alternatively, using the omega function on the Thurstone data will create the model for this particular data set.

> semhi <- sem(omhi$model$sem, Thurstone, 213)
> summary(semhi, digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

Page 41: Using the psych package to generate and test structural models · 2020. 8. 22. · 1 The psych package 1.1 Preface The psych package (Revelle,2020) has been developed to include those

17 F3Pedigrees 02468418 Pedigrees lt--- F3

18 F3LetterGroup 04091494 LetterGroup lt--- F3

19 e1 01723633 Sentences lt--gt Sentences

20 e2 01698418 Vocabulary lt--gt Vocabulary

21 e3 02684748 SentCompletion lt--gt SentCompletion

22 e4 02528108 FirstLetters lt--gt FirstLetters

23 e5 03873510 FourLetterWords lt--gt FourLetterWords

24 e6 05175675 Suffixes lt--gt Suffixes

25 e7 01496710 LetterSeries lt--gt LetterSeries

26 e8 05003859 Pedigrees lt--gt Pedigrees

27 e9 05517473 LetterGroup lt--gt LetterGroup

28 10000000 F1 lt--gt F1

29 10000000 F2 lt--gt F2

30 10000000 F3 lt--gt F3

31 10000000 g lt--gt g

Compare this solution to the one reported below and to the sem manual

58 Examining a hierarchical solution

A hierarchical solution to this data set was previously found by the omega function (Fig-ure 10) The output of that analysis can be used as a model for a sem analysis Once againthe stdCoef function helps see the structure Alternatively using the omega function onthe Thurstone data will create the model for this particular data set

Figure 15: Hierarchical analysis of the Thurstone 9 variable problem using an exploratory algorithm can provide the appropriate sem code for analysis using the sem package. (Path diagram titled "Omega": the general factor g, the group factors F1, F2, and F3, and the nine Thurstone variables.)

> semhi <- sem(omhi$model$sem, Thurstone, 213)
> summary(semhi, digits=2)

 Model Chisquare =  38.1963   Df =  24 Pr(>Chisq) = 0.03310059
 AIC =  80.1963
 BIC =  -90.47471

 Normalized Residuals
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-0.9724643 -0.4164673 -0.0000001  0.0401007  0.0938588  1.6274666

 R-square for Endogenous Variables
             F1              F2              F3       Sentences      Vocabulary
         0.6758          0.6112          0.6642          0.8185          0.8351
 SentCompletion    FirstLetters FourLetterWords        Suffixes    LetterSeries
         0.7329          0.6985          0.6355          0.4936          0.6097
      Pedigrees     LetterGroup
         0.5186          0.4949

 Parameter Estimates
                  Estimate  Std Error  z value  Pr(>|z|)
gF1               1.4438115 0.25653564 5.628113 1.821922e-08 F1 <--- g
gF2               1.2538296 0.21136562 5.932041 2.991910e-09 F2 <--- g
gF3               1.4065517 0.26890804 5.230605 1.689563e-07 F3 <--- g
F1Sentences       0.5151232 0.06292248 8.186632 2.686376e-16 Sentences <--- F1
F1Vocabulary      0.5203104 0.06338431 8.208820 2.233734e-16 Vocabulary <--- F1
F1SentCompletion  0.4874316 0.06081528 8.014954 1.101786e-15 SentCompletion <--- F1
F2FirstLetters    0.5211221 0.06106205 8.534304 1.410015e-17 FirstLetters <--- F2
F2FourLetterWords 0.4970664 0.05902388 8.421446 3.718664e-17 FourLetterWords <--- F2
F2Suffixes        0.4380644 0.05595794 7.828458 4.938915e-15 Suffixes <--- F2
F3LetterSeries    0.4524352 0.06596903 6.858297 6.968649e-12 LetterSeries <--- F3
F3Pedigrees       0.4172903 0.06215816 6.713363 1.901887e-11 Pedigrees <--- F3
F3LetterGroup     0.4076312 0.06131399 6.648258 2.965820e-11 LetterGroup <--- F3
e1                0.1814979 0.02847741 6.373397 1.848862e-10 Sentences <--> Sentences
e2                0.1649304 0.02776938 5.939292 2.862558e-09 Vocabulary <--> Vocabulary
e3                0.2671331 0.03336340 8.006771 1.177597e-15 SentCompletion <--> SentCompletion
e4                0.3015024 0.05102191 5.909274 3.436179e-09 FirstLetters <--> FirstLetters
e5                0.3645010 0.05263547 6.925008 4.359513e-12 FourLetterWords <--> FourLetterWords
e6                0.5064150 0.05962608 8.493180 2.010593e-17 Suffixes <--> Suffixes
e7                0.3903313 0.05933649 6.578268 4.759607e-11 LetterSeries <--> LetterSeries
e8                0.4813697 0.06224844 7.733041 1.050075e-14 Pedigrees <--> Pedigrees
e9                0.5051017 0.06332869 7.975875 1.513055e-15 LetterGroup <--> LetterGroup

 Iterations =  54
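The AIC and BIC reported in the summary above can be recovered from the model chi square. The following is a minimal check in base R, assuming the definitions AIC = chi square + 2q and BIC = chi square - df * ln(N), where q is the number of free parameters. These definitions reproduce the printed values, but they are an assumption here rather than a statement of what the sem package computes internally, and the variable names are illustrative only.

```r
# Values copied from the summary output for the hierarchical model
chisq <- 38.1963   # model chi square
df    <- 24        # model degrees of freedom
n_obs <- 213       # number of observations
p     <- 9         # number of observed variables

# Free parameters = distinct variances and covariances minus df:
# 45 - 24 = 21 (3 + 9 path coefficients and 9 error variances)
q <- p * (p + 1) / 2 - df

aic <- chisq + 2 * q             # assumed definition of AIC
bic <- chisq - df * log(n_obs)   # assumed definition of BIC

print(c(AIC = aic, BIC = bic))
```

The count q = 21 agrees with the 21 rows of the parameter estimates table.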

> stdCoef(semhi, digits=2)
                     Std. Estimate
1  gF1               0.8220754 F1 <--- g
2  gF2               0.7817998 F2 <--- g
3  gF3               0.8150140 F3 <--- g
4  F1Sentences       0.9047111 Sentences <--- F1
5  F1Vocabulary      0.9138214 Vocabulary <--- F1
6  F1SentCompletion  0.8560764 SentCompletion <--- F1
7  F2FirstLetters    0.8357617 FirstLetters <--- F2
8  F2FourLetterWords 0.7971819 FourLetterWords <--- F2
9  F2Suffixes        0.7025560 Suffixes <--- F2
10 F3LetterSeries    0.7808129 LetterSeries <--- F3
11 F3Pedigrees       0.7201599 Pedigrees <--- F3
12 F3LetterGroup     0.7034902 LetterGroup <--- F3
13 e1                0.1814979 Sentences <--> Sentences
14 e2                0.1649304 Vocabulary <--> Vocabulary
15 e3                0.2671331 SentCompletion <--> SentCompletion
16 e4                0.3015024 FirstLetters <--> FirstLetters
17 e5                0.3645010 FourLetterWords <--> FourLetterWords
18 e6                0.5064151 Suffixes <--> Suffixes
19 e7                0.3903313 LetterSeries <--> LetterSeries
20 e8                0.4813697 Pedigrees <--> Pedigrees
21 e9                0.5051016 LetterGroup <--> LetterGroup
22                   0.3241920 F1 <--> F1
23                   0.3887891 F2 <--> F2
24                   0.3357521 F3 <--> F3
25                   1.0000000 g <--> g
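The standardized estimates above are tied together by simple arithmetic, which may help in reading the table. A minimal sketch in base R, using only numbers copied from the stdCoef output (the variable names are illustrative): the squared standardized path is the R-square of the endogenous variable, one minus that value is its standardized residual variance, and the product of the two paths g -> F1 -> Sentences is the correlation of Sentences with g implied by the hierarchical model.

```r
# Standardized paths copied from the stdCoef output
g_to_F1         <- 0.8220754   # F1 <--- g
F1_to_Sentences <- 0.9047111   # Sentences <--- F1

# R-square of F1 and its standardized residual variance (row 22)
r2_F1    <- g_to_F1^2
resid_F1 <- 1 - r2_F1

# R-square of Sentences and its standardized error variance (row 13, e1)
r2_sent    <- F1_to_Sentences^2
resid_sent <- 1 - r2_sent

# Correlation of Sentences with g implied by the two-step path
g_sent <- g_to_F1 * F1_to_Sentences

print(round(c(r2_F1 = r2_F1, resid_F1 = resid_F1,
              r2_sent = r2_sent, resid_sent = resid_sent,
              g_sent = g_sent), 4))
```

The first four values match the R-square entries for F1 and Sentences and the standardized variances in rows 22 and 13.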

> anova(semhi, sembi)
LR Test for Difference Between Models

      Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semhi       24      38.196
sembi       18      24.216  6    13.98    0.02986 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the Thurstone data set, we see what happens when a hierarchical model is applied to real data. The exploratory structure derived from the omega function (Figure 15) provides estimates in close approximation to those found using sem. The model definition created by using omega is the same hierarchical model discussed in the sem help page. The bifactor model, with 6 more parameters, does provide a better fit to the data than the hierarchical model.
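The likelihood ratio test behind this comparison can be reproduced by hand. The sketch below uses only base R and the chi square values printed by anova; it assumes the usual rule for nested models that the difference in chi squares is itself distributed as chi square, with degrees of freedom equal to the difference in model degrees of freedom. Variable names are illustrative.

```r
# Chi square and df of the two nested models, copied from the anova output
chisq_hi <- 38.196; df_hi <- 24   # hierarchical model (more constrained)
chisq_bi <- 24.216; df_bi <- 18   # bifactor model (6 more free parameters)

lr_chisq <- chisq_hi - chisq_bi   # 13.98
lr_df    <- df_hi - df_bi         # 6

# Upper tail probability of the chi square difference
p_value <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

print(c(LR_Chisq = lr_chisq, Df = lr_df, p = p_value))
```

The resulting p value is about 0.03, so the additional six parameters of the bifactor model improve the fit at the conventional 0.05 level.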

Similar analyses can be done with other data that are organized hierarchically. Examples of these analyses are analyzing the 14 variables of holzinger and the 16 variables of reise. The output from the following analyses has been limited to just the comparison between the bifactor and hierarchical solutions.

> omholzbi <- omega(Holzinger, 4)
> semholzbi <- sem(omholzbi$model$sem, Holzinger, 355)
> omholzhi <- omega(Holzinger, 4, sl=FALSE)
> semholzhi <- sem(omholzhi$model$sem, Holzinger, 355)
> anova(semholzbi, semholzhi)
LR Test for Difference Between Models

          Model Df Model Chisq Df LR Chisq Pr(>Chisq)
semholzbi       63      147.66
semholzhi       73      178.79 10   31.129  0.0005587 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first does an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> omsem <- omegaSem(Thurstone, n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the  lavaan  package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14) shows that while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
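To see where the confirmatory values of ωh and ωt come from, the following base R sketch recomputes them from the rounded loadings printed under "With loadings of". It assumes orthogonal general and group factors, so that the variance of the total score is the sum of the squared column sums of the loadings plus the sum of the uniquenesses. This is the textbook model-based formula, not necessarily the exact computation done inside omegaSem, and all variable names are illustrative.

```r
# Rounded CFA loadings copied from the omegaSem output (u2 = uniqueness)
g  <- c(0.77, 0.79, 0.75, 0.61, 0.60, 0.57, 0.57, 0.66, 0.53)
u2 <- c(0.18, 0.17, 0.27, 0.26, 0.39, 0.52, 0.15, 0.50, 0.55)
group <- list(
  F1 = c(0.49, 0.45, 0.40),   # Sentences, Vocabulary, SentCompletion
  F2 = c(0.61, 0.50, 0.39),   # FirstLetters, FourLetterWords, Suffixes
  F3 = c(0.73, 0.25, 0.41)    # LetterSeries, Pedigrees, LetterGroup
)

# Variance of the unit-weighted total score due to each source
var_general <- sum(g)^2
var_group   <- sum(sapply(group, function(l) sum(l)^2))
var_unique  <- sum(u2)
var_total   <- var_general + var_group + var_unique

omega_h <- var_general / var_total                  # general factor only
omega_t <- (var_general + var_group) / var_total    # all common factors

print(round(c(omega_h = omega_h, omega_t = omega_t), 2))
```

Rounded to two decimals, these agree with the values of Omega Hierarchical and Omega Total reported for the confirmatory model. The same formula also makes the comparison with the exploratory solution concrete: the larger the summed g loadings relative to the group loadings, the larger ωh.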

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on "baby" data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

Figure 16: Confirmatory Omega structure using omegaSem. (Path diagram titled "Omega from SEM": the general factor g, the group factors F1, F2, and F3, and the nine Thurstone variables.)

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5      splines_4.0.2   MASS_7.3-51.6   mnormt_2.0.1    statmod_1.4.34
 [6] pbivnorm_0.6.0  arm_1.11-1      lattice_0.20-41 mi_1.0          minqa_1.2.4
[11] tools_4.0.2     matrixcalc_1.0-3 parallel_4.0.2 grid_4.0.2      tmvnsim_1.0-2
[16] nlme_3.1-148    coda_0.19-3     abind_1.4-5     lme4_1.1-23     lavaan_0.6-6
[21] Matrix_1.2-18   nloptr_1.2.2.2  compiler_4.0.2  boot_1.3-25     stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.

Page 43: Using the psych package to generate and test structural models · 2020. 8. 22. · 1 The psych package 1.1 Preface The psych package (Revelle,2020) has been developed to include those

F3Pedigrees 04172903 006215816 6713363 1901887e-11

F3LetterGroup 04076312 006131399 6648258 2965820e-11

e1 01814979 002847741 6373397 1848862e-10

e2 01649304 002776938 5939292 2862558e-09

e3 02671331 003336340 8006771 1177597e-15

e4 03015024 005102191 5909274 3436179e-09

e5 03645010 005263547 6925008 4359513e-12

e6 05064150 005962608 8493180 2010593e-17

e7 03903313 005933649 6578268 4759607e-11

e8 04813697 006224844 7733041 1050075e-14

e9 05051017 006332869 7975875 1513055e-15

gF1 F1 lt--- g

gF2 F2 lt--- g

gF3 F3 lt--- g

F1Sentences Sentences lt--- F1

F1Vocabulary Vocabulary lt--- F1

F1SentCompletion SentCompletion lt--- F1

F2FirstLetters FirstLetters lt--- F2

F2FourLetterWords FourLetterWords lt--- F2

F2Suffixes Suffixes lt--- F2

F3LetterSeries LetterSeries lt--- F3

F3Pedigrees Pedigrees lt--- F3

F3LetterGroup LetterGroup lt--- F3

e1 Sentences lt--gt Sentences

e2 Vocabulary lt--gt Vocabulary

e3 SentCompletion lt--gt SentCompletion

e4 FirstLetters lt--gt FirstLetters

e5 FourLetterWords lt--gt FourLetterWords

e6 Suffixes lt--gt Suffixes

e7 LetterSeries lt--gt LetterSeries

e8 Pedigrees lt--gt Pedigrees

e9 LetterGroup lt--gt LetterGroup

Iterations = 54

gt stdCoef(semhidigits=2)

Std Estimate

1 gF1 08220754 F1 lt--- g

2 gF2 07817998 F2 lt--- g

3 gF3 08150140 F3 lt--- g

4 F1Sentences 09047111 Sentences lt--- F1

5 F1Vocabulary 09138214 Vocabulary lt--- F1

6 F1SentCompletion 08560764 SentCompletion lt--- F1

7 F2FirstLetters 08357617 FirstLetters lt--- F2

8 F2FourLetterWords 07971819 FourLetterWords lt--- F2

9 F2Suffixes 07025560 Suffixes lt--- F2

10 F3LetterSeries 07808129 LetterSeries lt--- F3

11 F3Pedigrees 07201599 Pedigrees lt--- F3

12 F3LetterGroup 07034902 LetterGroup lt--- F3

13 e1 01814979 Sentences lt--gt Sentences

14 e2 01649304 Vocabulary lt--gt Vocabulary

15 e3 02671331 SentCompletion lt--gt SentCompletion

16 e4 03015024 FirstLetters lt--gt FirstLetters

17 e5 03645010 FourLetterWords lt--gt FourLetterWords

18 e6 05064151 Suffixes lt--gt Suffixes

19 e7 03903313 LetterSeries lt--gt LetterSeries

43

20 e8 04813697 Pedigrees lt--gt Pedigrees

21 e9 05051016 LetterGroup lt--gt LetterGroup

22 03241920 F1 lt--gt F1

23 03887891 F2 lt--gt F2

24 03357521 F3 lt--gt F3

25 10000000 g lt--gt g

gt anova(semhisembi)

LR Test for Difference Between Models

Model Df Model Chisq Df LR Chisq Pr(gtChisq)

semhi 24 38196

sembi 18 24216 6 1398 002986

---

Signif codes 0 lsquorsquo 0001 lsquorsquo 001 lsquorsquo 005 lsquorsquo 01 lsquo rsquo 1

Using the Thurstone data set we see what happens when a hierarchical model is applied toreal data The exploratory structure derived from the omega function (Figure 15) providesestimates in close approximation to those found using sem The model definition createdby using omega is the same hierarchical model discussed in the sem help page The bifactormodel with 6 more parameters does provide a better fit to the data than the hierarchicalmodel

Similar analyses can be done with other data that are organized hierarchically Examplesof these analyses are analyzing the 14 variables of holzinger and the 16 variables of reiseThe output from the following analyses has been limited to just the comparison betweenthe bifactor and hierarchical solutions

gt omholzbi lt- omega(Holzinger4)

gt semholzbi lt- sem(omholzbi$model$semHolzinger355)

gt omholzhi lt- omega(Holzinger4sl=FALSE)

gt semholzhi lt- sem(omholzhi$model$semHolzinger355)

gt anova(semholzbisemholzhi)

LR Test for Difference Between Models

Model Df Model Chisq Df LR Chisq Pr(gtChisq)

semholzbi 63 14766

semholzhi 73 17879 10 31129 00005587

---

Signif codes 0 lsquorsquo 0001 lsquorsquo 001 lsquorsquo 005 lsquorsquo 01 lsquo rsquo 1

5.9 Estimating Omega using EFA followed by CFA

The function omegaSem first performs an exploratory factor analysis using omega, then calls the appropriate sem functions and organizes the results as in a standard omega analysis.

An example is found from the Thurstone data set of 9 cognitive variables:

> om.sem <- omegaSem(Thurstone,n.obs=213)

Call: omegaSem(m = Thurstone, n.obs = 213)
Omega
Call: omegah(m = m, nfactors = nfactors, fm = fm, key = key, flip = flip,
    digits = digits, title = title, sl = sl, labels = labels,
    plot = plot, n.obs = n.obs, rotate = rotate, Phi = Phi, option = option,
    covar = covar)
Alpha:                 0.89
G.6:                   0.91
Omega Hierarchical:    0.74
Omega H asymptotic:    0.79
Omega Total            0.93

Schmid Leiman Factor loadings greater than  0.2
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.71 0.56           0.82 0.18 0.61
Vocabulary      0.73 0.55           0.84 0.16 0.63
SentCompletion  0.68 0.52           0.74 0.26 0.63
FirstLetters    0.65      0.56      0.73 0.27 0.57
FourLetterWords 0.62      0.49      0.63 0.37 0.61
Suffixes        0.56      0.41      0.50 0.50 0.63
LetterSeries    0.59           0.62 0.73 0.27 0.48
Pedigrees       0.58 0.24      0.34 0.51 0.49 0.66
LetterGroup     0.54           0.46 0.52 0.48 0.56

With eigenvalues of:
   g   F1   F2   F3
3.58 0.96 0.74 0.72

general/max  3.73   max/min =   1.34
mean percent general =  0.6    with sd =  0.05 and cv of  0.09
Explained Common Variance of the general factor =  0.6

The degrees of freedom are 12  and the fit is  0.01
The number of observations was  213  with Chi Square =  2.98  with prob <  1
The root mean square of the residuals is  0.01
The df corrected root mean square of the residuals is  0.01
RMSEA index =  0  and the 10 % confidence intervals are  0 0
BIC =  -61.36

Compare this with the adequacy of just a general factor and no group factors
The degrees of freedom for just the general factor are 27  and the fit is  1.48
The number of observations was  213  with Chi Square =  306.71  with prob <  3.3e-49
The root mean square of the residuals is  0.14
The df corrected root mean square of the residuals is  0.16
RMSEA index =  0.22  and the 10 % confidence intervals are  0.199 0.244
BIC =  161.96

Measures of factor score adequacy
                                                 g   F1   F2   F3
Correlation of scores with factors            0.86 0.73 0.72 0.75
Multiple R square of scores with factors      0.74 0.54 0.51 0.57
Minimum correlation of factor score estimates 0.49 0.07 0.03 0.13

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.83 0.79
Omega general for total scores and subscales  0.74 0.58 0.50 0.47
Omega group for total scores and subscales    0.16 0.34 0.32 0.32

 The following analyses were done using the lavaan package

 Omega Hierarchical from a confirmatory model using sem =  0.79
 Omega Total  from a confirmatory model using sem =  0.93
With loadings of
                   g   F1   F2   F3   h2   u2   p2
Sentences       0.77 0.49           0.82 0.18 0.72
Vocabulary      0.79 0.45           0.83 0.17 0.75
SentCompletion  0.75 0.40           0.73 0.27 0.77
FirstLetters    0.61      0.61      0.74 0.26 0.50
FourLetterWords 0.60      0.50      0.61 0.39 0.59
Suffixes        0.57      0.39      0.48 0.52 0.68
LetterSeries    0.57           0.73 0.85 0.15 0.38
Pedigrees       0.66           0.25 0.50 0.50 0.87
LetterGroup     0.53           0.41 0.45 0.55 0.62

With eigenvalues of:
   g   F1   F2   F3
3.86 0.60 0.78 0.75

The degrees of freedom of the confirmatory model are  18  and the fit is  24.33052  with p =  0.1444947
general/max  4.92   max/min =   1.3
mean percent general =  0.65    with sd =  0.15 and cv of  0.23
Explained Common Variance of the general factor =  0.64

Measures of factor score adequacy
                                                 g    F1   F2   F3
Correlation of scores with factors            0.90  0.68 0.80 0.85
Multiple R square of scores with factors      0.81  0.46 0.63 0.73
Minimum correlation of factor score estimates 0.61 -0.08 0.27 0.45

 Total, General and Subset omega for each subset
                                                 g   F1   F2   F3
Omega total for total scores and subscales    0.93 0.92 0.82 0.80
Omega general for total scores and subscales  0.79 0.69 0.48 0.50
Omega group for total scores and subscales    0.14 0.23 0.35 0.31

To get the standard sem fit statistics, ask for summary on the fitted object
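Several of the fit indices in this output follow directly from the chi-square, its degrees of freedom, and the sample size. The sketch below, in base R, recomputes them from the printed values. The helper names bic and rmsea are hypothetical and written only for this illustration, and the RMSEA expression is a common textbook form that is assumed here; the formula used internally by psych may differ slightly in its treatment of the sample size.

```r
n.obs <- 213

# Chi-square and degrees of freedom copied from the omegaSem output
chisq.3f <- 2.98;   df.3f <- 12   # general plus three group factors (EFA)
chisq.g  <- 306.71; df.g  <- 27   # general factor only

# BIC in the form chi-square minus df * log(n)
bic <- function(chisq, df, n) chisq - df * log(n)

# RMSEA point estimate (assumed textbook form, floored at zero)
rmsea <- function(chisq, df, n) sqrt(max(chisq / df - 1, 0) / (n - 1))

round(bic(chisq.3f, df.3f, n.obs), 2)    # -61.36
round(bic(chisq.g,  df.g,  n.obs), 2)    # 161.96
round(rmsea(chisq.3f, df.3f, n.obs), 2)  # 0
round(rmsea(chisq.g,  df.g,  n.obs), 2)  # 0.22

# Probability for the confirmatory model: chi-square 24.33052 on 18 df
p.cfa <- pchisq(24.33052, df = 18, lower.tail = FALSE)
round(p.cfa, 4)
```

The negative BIC for the model with group factors, against the large positive BIC for the general factor alone, is what indicates that a single general factor is not adequate for these data.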

Comparing the two models graphically (Figure 16 with Figure 14) shows that, while not identical, they are very similar. The sem version is basically a forced simple structure. Notice that the values of ωh are not identical from the EFA and CFA models. The CFA solution yields higher values of ωh because, by forcing a pure cluster solution (no cross loadings), the correlations between the factors are forced to be through the g factor.
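The greater weight that the confirmatory solution places on g can also be seen in the summary ratios printed with each solution. As a rough check, the sketch below recomputes the Explained Common Variance of the general factor from the printed eigenvalues, and the p2 value for one item from its general loading and communality. It uses base R only; the helper names ecv and p2 are hypothetical and written only for this illustration, and the inputs are the rounded values from the output, so results match only to two decimals.

```r
# Eigenvalues (sums of squared loadings) copied from the two solutions
ev.efa <- c(g = 3.58, F1 = 0.96, F2 = 0.74, F3 = 0.72)
ev.cfa <- c(g = 3.86, F1 = 0.60, F2 = 0.78, F3 = 0.75)

# Explained Common Variance: general factor variance over all common variance
ecv <- function(ev) unname(ev["g"] / sum(ev))
round(ecv(ev.efa), 2)   # 0.6
round(ecv(ev.cfa), 2)   # 0.64

# p2 for one item: squared general loading divided by the communality h2
p2 <- function(g, h2) g^2 / h2
round(p2(0.71, 0.82), 2)   # Sentences, EFA solution: 0.61
round(p2(0.77, 0.82), 2)   # Sentences, CFA solution: 0.72
```

In both comparisons the confirmatory values are larger, which is consistent with the higher ωh reported for the CFA model.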

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structures is an important advance in psychological research. To understand these approaches it is helpful to try them first on “baby” data sets. To the extent that the models we use can be tested on simple, artificial examples, it is perhaps easier to practice their application.

Figure 16: Confirmatory Omega structure using omegaSem. (Path diagram titled "Omega from SEM": the general factor g and the group factors F1, F2, and F3, with paths to the nine Thurstone variables.)

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.

> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker’s guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago press, Chicago, Ill.



Omega group for total scores and subscales 016 034 032 032

The following analyses were done using the lavaan package

Omega Hierarchical from a confirmatory model using sem = 079

Omega Total from a confirmatory model using sem = 093

With loadings of

g F1 F2 F3 h2 u2 p2

Sentences 077 049 082 018 072

Vocabulary 079 045 083 017 075

SentCompletion 075 040 073 027 077

FirstLetters 061 061 074 026 050

FourLetterWords 060 050 061 039 059

Suffixes 057 039 048 052 068

LetterSeries 057 073 085 015 038

Pedigrees 066 025 050 050 087

LetterGroup 053 041 045 055 062

With eigenvalues of

g F1 F2 F3

386 060 078 075

The degrees of freedom of the confirmatory model are 18 and the fit is 2433052 with p = 01444947

generalmax 492 maxmin = 13

mean percent general = 065 with sd = 015 and cv of 023

Explained Common Variance of the general factor = 064

Measures of factor score adequacy

g F1 F2 F3

Correlation of scores with factors 090 068 080 085

Multiple R square of scores with factors 081 046 063 073

Minimum correlation of factor score estimates 061 -008 027 045

Total General and Subset omega for each subset

g F1 F2 F3

Omega total for total scores and subscales 093 092 082 080

Omega general for total scores and subscales 079 069 048 050

Omega group for total scores and subscales 014 023 035 031

To get the standard sem fit statistics ask for summary on the fitted object

Comparing the two models graphically (Figure 16 with Figure 14 shows that while notidentical they are very similar The sem version is basically a forced simple structureNotice that the values of ωh are not identical from the EFA and CFA models The CFAsolution yields higher values of ωh because by forcing a pure cluster solution (no crossloadings) the correlations between the factors is forced to be through the g factor

6 Summary and conclusion

The use of exploratory and confirmatory models for understanding real data structuresis an important advance in psychological research To understand these approaches it ishelpful to try them first on ldquobabyrdquo data sets To the extent that the models we use canbe tested on simple artificial examples it is perhaps easier to practice their application

Figure 16: Confirmatory Omega structure using omegaSem. [Path diagram titled "Omega from SEM": a general factor g and group factors F1, F2, and F3, with loadings on the nine variables Sentences through LetterGroup.]

The psych tools for simulating structural models and for specifying models are a useful supplement to the power of packages such as sem. The techniques that can be used on simulated data sets can also be applied to real data sets.
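As a small illustration of practicing on artificial data, the sketch below simulates a one-factor (congeneric) data set and checks that a factor analysis recovers the generating loadings. It deliberately uses only base R (rnorm and factanal) as a stand-in for the psych simulation functions discussed earlier; the loadings, sample size, and seed are assumptions chosen for the example.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
n <- 5000                          # simulated sample size (assumption)
true_loadings <- c(0.8, 0.7, 0.6, 0.5)

# One latent variable plus independent unit-variance noise per item.
theta <- rnorm(n)
noise <- matrix(rnorm(n * 4), nrow = n, ncol = 4)

# item_j = loading_j * theta + sqrt(1 - loading_j^2) * noise_j,
# so each item has unit variance in the population.
items <- outer(theta, true_loadings) +
  sweep(noise, 2, sqrt(1 - true_loadings^2), "*")
colnames(items) <- paste0("V", 1:4)

# Maximum likelihood factor analysis with a single factor.
fit <- factanal(items, factors = 1)

# The sign of a factor is arbitrary, so compare absolute values.
estimated <- abs(as.vector(fit$loadings))
round(cbind(true = true_loadings, estimated = estimated), 2)
```

Once the analysis behaves as expected on data with a known structure, the same factanal call (or the corresponding psych function) can be applied to a real data matrix.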


> sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GPArotation_2014.11-1 sem_3.1-11            psychTools_2.0.8      psych_2.0.8

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       splines_4.0.2    MASS_7.3-51.6    mnormt_2.0.1     statmod_1.4.34
 [6] pbivnorm_0.6.0   arm_1.11-1       lattice_0.20-41  mi_1.0           minqa_1.2.4
[11] tools_4.0.2      matrixcalc_1.0-3 parallel_4.0.2   grid_4.0.2       tmvnsim_1.0-2
[16] nlme_3.1-148     coda_0.19-3      abind_1.4-5      lme4_1.1-23      lavaan_0.6-6
[21] Matrix_1.2-18    nloptr_1.2.2.2   compiler_4.0.2   boot_1.3-25      stats4_4.0.2
[26] foreign_0.8-80

References

Adams, D. (1980). The hitchhiker's guide to the galaxy. Harmony Books, New York, 1st American edition.

Bechtoldt, H. (1961). An empirical study of the factor analysis stability hypothesis. Psychometrika, 26(4):405–432.

Fox, J. (2006). Structural equation modeling with the sem package in R. Structural Equation Modeling, 13:465–486.

Fox, J. (2009). sem: Structural Equation Models. R package version 0.9-15.

Fox, J., Nie, Zhenghua, and Byrnes, J. (2013). sem: Structural Equation Models. R package version 3.1-3.

Holzinger, K. and Swineford, F. (1937). The bi-factor method. Psychometrika, 2(1):41–54.

Jensen, A. R. and Weng, L.-J. (1994). What is a good g? Intelligence, 18(3):231–258.

McDonald, R. P. (1999). Test theory: A unified treatment. L. Erlbaum Associates, Mahwah, N.J.

Rafaeli, E. and Revelle, W. (2006). A premature consensus: Are happiness and sadness truly opposite affects? Motivation and Emotion, 30(1):1–12.

Reise, S., Morizot, J., and Hays, R. (2007). The role of the bifactor model in resolving dimensionality issues in health outcomes measures. Quality of Life Research, 16(0):19–31.

Revelle, W. (2020). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston. R package version 2.0.8.

Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2):1–36. R package version 0.5-14.

Schmid, J. J. and Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1):83–90.

Thurstone, L. L. and Thurstone, T. G. (1941). Factorial studies of intelligence. The University of Chicago Press, Chicago, Ill.
