Missouri State University


Structural Equation Modeling

Theory:
1. Confirmatory
2. Theory-driven
3. Complex multiple regression
4. Not causal; the model is based on correlation
5. Data = Model + Error

Definitions:
1. Measurement and Structural Models

a. A measurement model is a latent variable and the observed variables that measure the latent.

b. A structural model includes the relationships among latent variables.

2. Latent and Manifest Variables
a. Latent variables are the constructs being measured. Examples of latent variables include IQ, depression, political orientation, short-term memory, and consumerism. Each latent variable must be measured by at least two manifest (or observed) variables, and more variables are better. Latent variables are represented by ovals in the model.

Kayla N. Jordan – R Stats Workshop Spring 2014

[Figure: path diagram with the measurement model and the structural model labeled]


b. Manifest or observed variables are the actual measurements on a construct. These could include scores on an IQ test, clinical ratings of depression symptoms, support of political candidates, and violent behaviors. Observed variables are represented by squares in the model.

c. SEM works best if observed variables are measured on a continuous scale.
d. Measures should be reliable and valid. If the variables are not measured well, then the model will not work.

3. Endogenous and Exogenous Variables
a. Exogenous variables are latent variables and are the independent variables in the model. These variables do not have any arrows going into them.
b. Endogenous variables are latent variables and are the dependent variables in the model. These are any variables which have arrows going into them.


[Figure: path diagram labeling an exogenous latent variable, an endogenous latent variable, manifest (observed) variables, errors, and a residual]


4. Knowns and Unknowns
a. The knowns in the model are the variances and covariances (correlations) derived from the data. The number of knowns can be calculated with the formula n(n+1)/2, where n is the number of observed variables in the model.
b. The unknowns (or parameters) in the model are counted as the number of arrows in the model, which represent all the pathways being estimated.
c. Degrees of freedom = knowns - unknowns
5. Underidentified and Overidentified Models

a. Underidentified models are ones in which there are more unknowns than knowns. In order to fix these, certain parameters should be constrained, or forced to equal a certain value. Typically, this is accomplished by setting one of the factor loadings for each latent equal to one; this also assigns a scale for the latent variable.

b. Overidentified models are ones in which there are more knowns than unknowns. This is good.

6. Example of Calculating Degrees of Freedom:
a. Knowns: n(n+1)/2 = 8(9)/2 = 36
b. Unknowns: 5 factor loadings, 2 path coefficients, 8 error variances, 2 residuals = 17 total unknowns
c. Degrees of Freedom: Knowns – Unknowns = 36 – 17 = 19
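The counting rule in items 4 through 6 can be sketched in a few lines of Python. The function names and the dictionary of parameter counts are illustrative choices, not part of the workshop materials; the numbers are the ones from the worked example.

```python
def count_knowns(n_observed):
    """Number of unique variances and covariances: n(n+1)/2."""
    return n_observed * (n_observed + 1) // 2


def degrees_of_freedom(n_observed, unknowns):
    """Degrees of freedom = knowns - unknowns."""
    return count_knowns(n_observed) - sum(unknowns.values())


def identification(df):
    """Label a model from its degrees of freedom."""
    if df < 0:
        return "underidentified"
    if df == 0:
        return "just-identified (saturated)"
    return "overidentified"


# Parameter counts from the worked example (8 observed variables).
unknowns = {
    "factor loadings": 5,
    "path coefficients": 2,
    "error variances": 8,
    "residuals": 2,
}

df = degrees_of_freedom(8, unknowns)
print(count_knowns(8), sum(unknowns.values()), df, identification(df))
```

With these counts the model has 36 knowns, 17 unknowns, and 19 degrees of freedom, so it is overidentified.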

7. Arrows
a. A straight arrow represents a path in the model.
b. A curved arrow represents a correlation between two variables.

8. Estimates


a. Factor loadings are used to determine if an observed variable is measuring the latent variable. If factor loadings are not significant, then the observed variable is not measuring the latent variable well.

b. Path coefficients or beta weights are used to determine how well exogenous (predictor) variables are predicting endogenous (outcome) variables. If path coefficients are not significant, then the exogenous variable is not predicting the endogenous variable.

c. Error variances are associated with each observed variable and are a measure of how much error there is in measuring each observed variable. If an error variance is high, then there are problems with measuring that variable.

d. Residuals are associated with each endogenous variable and represent the difference between the hypothesized model and the data. These can also be used to calculate an effect size for the relationship between two variables in terms of the variance accounted for by squaring the residual and subtracting that value from one.


Regression Weights: (Group number 1 - Default model)

               Estimate   S.E.    C.R.     P    Label
H1 <--- Harm      1.000
H2 <--- Harm      1.020   .112   9.105   ***
H3 <--- Harm      1.096   .111   9.843   ***
H4 <--- Harm       .883   .112   7.900   ***
H5 <--- Harm       .692   .153   4.508   ***
H6 <--- Harm       .643   .171   3.754   ***

Standardized Regression Weights: (Group number 1 - Default model)

               Estimate
H1 <--- Harm       .716
H2 <--- Harm       .705
H3 <--- Harm       .768
H4 <--- Harm       .608
H5 <--- Harm       .344
H6 <--- Harm       .286

The significant estimates (*** in the P column) indicate that all observed variables are measuring the latent variable.

Standardized values closer to one indicate that the observed variable is measuring the latent variable better (e.g., H3 is a better item than H6).
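One common way to read the standardized weights is to square them: for an item that loads on a single latent variable, the squared standardized loading is the proportion of that item's variance accounted for by the latent variable. The short Python sketch below applies this to the standardized output; the variable names are illustrative and not part of the handout.

```python
# Standardized loadings copied from the Standardized Regression Weights output.
standardized = {
    "H1": 0.716,
    "H2": 0.705,
    "H3": 0.768,
    "H4": 0.608,
    "H5": 0.344,
    "H6": 0.286,
}

# Squared standardized loading = share of item variance explained by Harm.
variance_explained = {item: loading ** 2 for item, loading in standardized.items()}

# List items from best to worst indicator of the latent variable.
ranked = sorted(variance_explained.items(), key=lambda pair: pair[1], reverse=True)
for item, r_squared in ranked:
    print(f"{item}: {r_squared:.3f}")
```

Read this way, H3 shares the most variance with the latent variable and H6 the least, which is the same ordering the raw standardized weights give.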


Types of Models
1. Confirmatory Factor Analysis (CFA)
a. Tests the validity of a factor structure
b. Should be based on theory or exploratory factor analysis (EFA)
c. Often used to determine if items on a questionnaire group together
d. A group of items which are similar (based on how well they are correlated) is called a factor.
2. Path Analysis
a. Tests relationships between observed variables


3. Full Structural Models
a. Tests the fit of both measurement and structural models
b. Mediation and moderation
4. Multi-trait, multi-method (MTMM)
a. Tests whether or not multiple measures measure the same set of traits


5. Multi-group CFA
a. Tests whether a factor structure holds for two groups
b. For example, whether the same factors work for both men and women
6. Latent Growth Curves
a. Tests how a measure changes over time

Fit Indices
1. Chi-square
a. Test of the goodness-of-fit of the model
b. Smaller chi-square is better
c. Non-significant = good
d. Highly influenced by sample size
2. Factor Loadings/Path Coefficients
a. These should be significant
b. If non-significant, that observed variable or path is not contributing to the model.
3. Comparative Fit Index (CFI)
a. Test of goodness-of-fit
b. Greater than .90 is good; greater than .95 is better
4. Normed Fit Index (NFI)
a. Interpreted the same way as CFI
5. Tucker-Lewis Index (TLI)
a. Interpreted the same way as CFI
6. Root Mean Square Error of Approximation (RMSEA)
a. Measure of difference between observed correlations and model correlations
b. Less than .10 is good; less than .06 is better


c. Influenced by small df and sample size
7. Standardized Root Mean Square Residual (SRMR)
a. Interpreted the same way as RMSEA
8. Example Output:
a. Default Model is the hypothesized model and produces the fit indices for the model that was created.
b. Saturated Model is a model with no degrees of freedom, and Independence Model is the model with the fewest estimated parameters and therefore the most degrees of freedom. These can be ignored.
c. CMIN is the chi-square value and DF is the associated degrees of freedom. The P value should be greater than .05 in a good fitting model. This output indicates a poor fitting model.
d. CMIN/DF is the chi-square value divided by its degrees of freedom, used because chi-square alone is highly influenced by sample size. This should be less than 3 in a good fitting model.
e. NFI, TLI, and CFI indicate a poor fitting model as they are less than .90.
f. RMSEA indicates an adequate model as it is below .10.

Model Fit Summary

CMIN

Model                NPAR       CMIN    DF      P   CMIN/DF
Default model          70   1021.692   395   .000     2.587
Saturated model       465       .000     0
Independence model     30   2715.382   435   .000     6.242

Baseline Comparisons

Model                   NFI     RFI      IFI     TLI     CFI
                     Delta1    rho1   Delta2    rho2
Default model          .624    .586     .730    .697    .725
Saturated model       1.000            1.000           1.000
Independence model     .000    .000     .000    .000    .000

RMSEA

Model                RMSEA   LO 90   HI 90   PCLOSE
Default model         .089    .082    .096     .000
Independence model    .162    .156    .168     .000
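As a check on how the comparison indices relate to the chi-square values, the Python sketch below recomputes CMIN/DF, NFI, TLI, and CFI from the CMIN table. The function and variable names are illustrative, and the formulas are the usual textbook definitions rather than anything stated in the handout. RMSEA is left out because it also depends on the sample size, which the output does not show.

```python
def fit_indices(chi_model, df_model, chi_null, df_null):
    """Fit indices comparing a model to the independence (null) model."""
    ratio_model = chi_model / df_model
    ratio_null = chi_null / df_null

    # NFI: proportional drop in chi-square relative to the null model.
    nfi = 1 - chi_model / chi_null

    # TLI: same idea, but based on the chi-square/df ratios.
    tli = (ratio_null - ratio_model) / (ratio_null - 1)

    # CFI: based on chi-square minus df, floored at zero.
    excess_model = max(chi_model - df_model, 0.0)
    excess_null = max(chi_null - df_null, excess_model)
    cfi = 1 - excess_model / excess_null if excess_null > 0 else 1.0

    return {"CMIN/DF": ratio_model, "NFI": nfi, "TLI": tli, "CFI": cfi}


# Default model and Independence model rows from the CMIN table.
indices = fit_indices(1021.692, 395, 2715.382, 435)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

Rounded to three decimals, the results agree with the Default model row of the output (2.587, .624, .697, .725).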

Model Comparisons
1. The initial, hypothesized model often does not work out or there could be competing models; therefore, it is important to be able to compare models.
2. Modification Indices
a. These are suggested additions to the model to improve the fit.


3. Chi-square difference test
a. Subtract the chi-square values of the two models.
b. Subtract the degrees of freedom of the two models.
c. Determine the critical chi-square value.
d. If the difference in the chi-square values is greater than the critical chi-square value, then the models are significantly different.
e. Smaller chi-square values are better, so whichever model has the smaller chi-square value is the better model.
4. CFI difference
a. If the difference in the CFI values of two models is greater than .01, then the models are significantly different.
b. The model with the greater CFI is the better model.
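The two comparison procedures can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `chi2_sf` is a hypothetical helper that computes the chi-square upper-tail probability for whole-number degrees of freedom using only the standard library, and the sketch compares a p-value to .05 instead of looking up a critical value, which leads to the same decision. The two models compared are the Default and Independence models from the example output, used here only because their values are available.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    term = math.sqrt(x)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(math.sqrt(half)) + 2.0 * density * total


def chi_square_difference(chi_a, df_a, chi_b, df_b, alpha=0.05):
    """Steps a-d: chi-square difference, df difference, then the test."""
    delta_chi = abs(chi_a - chi_b)
    delta_df = abs(df_a - df_b)
    p_value = chi2_sf(delta_chi, delta_df)
    return delta_chi, delta_df, p_value, p_value < alpha


# Independence model versus Default model, values from the example output.
delta_chi, delta_df, p_value, differ = chi_square_difference(2715.382, 435, 1021.692, 395)
print(f"chi-square difference = {delta_chi:.3f}, df difference = {delta_df}")
print("significantly different" if differ else "not significantly different")

# CFI difference rule: a gap larger than .01 counts as a difference.
cfi_default, cfi_independence = 0.725, 0.000
print("CFI difference exceeds .01:", abs(cfi_default - cfi_independence) > 0.01)
```

Here the Default model has both the smaller chi-square and the larger CFI, so both rules prefer it over the Independence model.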

Assumptions
1. Sample Size
a. Good sample size estimate is 100-200
b. Larger sample may be needed if there are a large number of variables
c. Some SEM programs do not allow there to be missing data
2. Normality
a. Data should be normally distributed
b. Bootstrapping can help with non-normal data
3. Outliers
a. Outliers can affect the fit of the model
b. Can be checked using Mahalanobis distance
4. Linearity
5. Multicollinearity
a. It is important to test for multicollinearity.
b. If two variables are highly correlated (r > .9), it is basically like putting the same variable in the model twice.
c. If this is a problem, one of the variables should be deleted.
6. Homoscedasticity
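The multicollinearity screen described in item 5 can be sketched in Python as follows. The scores and variable names below are hypothetical and exist only to illustrate the r > .9 rule; the helper functions are likewise illustrative rather than part of any SEM program.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_multicollinearity(data, cutoff=0.9):
    """Return (name_a, name_b, r) for every pair with |r| above the cutoff."""
    flagged = []
    for name_a, name_b in combinations(data, 2):
        r = pearson_r(data[name_a], data[name_b])
        if abs(r) > cutoff:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical scores for illustration only.
scores = {
    "iq_test_a": [98, 105, 110, 92, 120, 101],
    "iq_test_b": [97, 106, 111, 90, 121, 100],
    "depression": [14, 9, 12, 11, 10, 15],
}

flagged = flag_multicollinearity(scores)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b}: r = {r:.3f} (consider dropping one)")
```

In this made-up data only the two IQ measures are flagged, which is the situation where one of the pair would be deleted.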

Programs
1. AMOS
a. Good for beginners
b. Visual representation of models
2. EQS
3. LISREL
4. MPlus

Recommended Texts
Byrne, B. (2006). Structural Equation Modeling with EQS. 2nd Edition. (New York: Routledge).


Byrne, B. (2009). Structural Equation Modeling with AMOS. (New York: Routledge).
Brown, T. A. (2006). Confirmatory Factor Analysis for Applied Research. (New York: Guilford Press).
Kline, R. B. (2010). Principles and Practice of Structural Equation Modeling. (New York: Guilford Press).
