Structural Equation Modeling (SEM)
Niina Kotamäki
SEM
• Covariance structure analysis
• Causal modeling
• Simultaneous equations modeling
• Path analysis
• Confirmatory factor analysis
• Latent variable modeling
• LISREL modeling
Highly flexible “modeling toolbox”
Extension of the general linear model (GLM)
SEM
A fairly recent innovation (late 1960s to early 1970s)
Extensively applied in social sciences, psychology, economics, chemistry and biology
• Applications in ecology and environmental sciences are limited
• Even less common in aquatic ecosystems
Tests theoretical hypotheses about causal relationships
Tests relationships between observed and unobserved variables
Combines regression analysis (path analysis) and factor analysis
Researchers use SEM to determine whether a certain model is valid
[Figure: path diagram in which X1 and X2 point to Y with coefficients a and b, and ε is the error term]
Regression model:
Y = aX1 + bX2 + ε
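As a minimal sketch of the regression model above, the snippet below estimates a and b by ordinary least squares from the normal equations. The data values and the helper name `fit_two_predictors` are hypothetical (they are not from the slides), and the model has no intercept, matching Y = aX1 + bX2 + ε as written.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares estimates (a, b) for y = a*x1 + b*x2 (no intercept)."""
    # Sums of squares and cross-products for the normal equations
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s1y = sum(u * w for u, w in zip(x1, y))
    s2y = sum(v * w for v, w in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("x1 and x2 are perfectly collinear")
    # Solve the 2x2 system by Cramer's rule
    a = (s1y * s22 - s12 * s2y) / det
    b = (s11 * s2y - s12 * s1y) / det
    return a, b


# Hypothetical, noise-free data generated with a = 2 and b = 3
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 1.0, 4.0, 3.0]
y = [2.0 * u + 3.0 * v for u, v in zip(x1, x2)]

a, b = fit_two_predictors(x1, x2, y)
print(a, b)
```

Note that there is exactly one dependent variable and that X1 and X2 enter as if measured without error, which is where the limitations listed next come from.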
LIMITATIONS
• Multiple dependent (Y) variables are not permitted
• Each independent variable (X) is assumed to be measured without error
  • controlled experiments: measurement errors are negligible and uncontrolled variation is at a minimum
  • observational studies: all variables are subject to measurement error and uncontrolled variation
• Strong correlation (multicollinearity) may cause biased parameter estimates and inflated standard errors
• Indirect effects (mediating variables) cannot be included
• The error or residual variable is the only unobserved variable
[Figure: path diagram with the labels "corr", "INDEPENDENT", "DEPENDENT" and "M"]
SEM deals with these limitations
• Works with multiple, related equations simultaneously
• Allows reciprocal relationships
• Ability to model constructs as latent variables
• Allows the modeller to explicitly capture unreliability of measurement in the model
• Indirect effects / mediating variables
• Compares the performance of a model across multiple populations
Steps of SEM analysis
1. Development of hypothesis / theory
2. Construction of path diagram
3. Model specification
4. Model identification
5. Parameter estimation
6. Model evaluation
7. Model modification
1. Development of hypothesis
• SEM is a confirmatory technique: the researcher needs an established theory about the relationships
• Suited for theory testing rather than theory development
2. Construction of path diagram
[Figure: path diagram showing an exogenous latent variable (ξ) and endogenous latent variables (η), connected by paths with path coefficients; a correlation and error terms are also marked]
3. Model Specification
• Creating a hypothesized model that you think explains the relationships among multiple variables
• Converting the model to multiple equations
4. Model Identification
(Just) identified
• a unique estimate for each parameter
• number of equations = number of parameters to be estimated
• e.g. a + b = 5, a − b = 2
Under-identified (not identified)
• number of equations < number of parameters
• infinite number of solutions
• e.g. a + b = 7
• the model cannot be estimated
Over-identified
• number of equations > number of parameters
• the model can be wrong (i.e., it can be tested against the data)
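The counting rule above can be sketched in a few lines. This is an illustrative sketch, not a full identification check: it assumes that the "equations" are the unique variances and covariances of the observed variables, p(p + 1)/2 for p observed variables, and the function name `classify_identification` is hypothetical. The counting rule is a necessary condition only; a model that passes it can still fail to be identified.

```python
def classify_identification(n_observed, n_free_params):
    """Compare the number of unique (co)variances with the number of free parameters."""
    n_moments = n_observed * (n_observed + 1) // 2  # unique variances and covariances
    df = n_moments - n_free_params                  # model degrees of freedom
    if df < 0:
        return "under-identified", df
    if df == 0:
        return "just identified", df
    return "over-identified", df


# The just-identified example from the slide: a + b = 5 and a - b = 2
a = (5 + 2) / 2  # adding the two equations gives 2a = 7
b = 5 - a        # substitute a back into a + b = 5

print(a, b)
print(classify_identification(3, 6))  # 6 moments, 6 parameters
print(classify_identification(3, 7))  # 6 moments, 7 parameters
print(classify_identification(4, 8))  # 10 moments, 8 parameters
```

With a single equation such as a + b = 7, every pair (a, 7 − a) is a solution, which is why an under-identified model cannot be estimated.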
[Figure: just identified model with latent variables ξ1, ξ2, ξ3, η1, η2]
[Figure: over-identified model (the usual case in SEM) with latent variables ξ1, ξ2, ξ3, η1, η2]
5. Parameter estimation
Technique used to calculate the parameters
Testing how well a model fits the data
The expected covariance structure is tested against the covariance matrix of the observed data, H0: Σ = Σ(θ)
Estimation methods: e.g. maximum likelihood (ML), ordinary least squares (OLS), etc.
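To make the comparison of Σ(θ) with the observed covariance matrix concrete, here is a minimal sketch of the standard maximum likelihood discrepancy function, F = ln|Σ(θ)| + tr(S Σ(θ)⁻¹) − ln|S| − p, for p = 2 observed variables. The matrices and helper names are hypothetical, and real software minimizes F over θ numerically; this sketch only evaluates F for a given Σ(θ).

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def trace_of_product(a, b):
    """Trace of the matrix product a @ b for 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


def f_ml(sample_cov, implied_cov):
    """ML discrepancy between the sample and the model-implied covariance matrix."""
    p = 2  # number of observed variables
    return (math.log(det2(implied_cov))
            + trace_of_product(sample_cov, inv2(implied_cov))
            - math.log(det2(sample_cov))
            - p)


# Hypothetical sample covariance matrix S
S = [[1.0, 0.5],
     [0.5, 1.0]]
# Hypothetical model that fixes the covariance between the two variables to zero
Sigma_restricted = [[1.0, 0.0],
                    [0.0, 1.0]]

perfect_fit = f_ml(S, S)             # zero (up to rounding) when Sigma(theta) equals S
misfit = f_ml(S, Sigma_restricted)   # positive: the restriction does not hold in S
print(perfect_fit, misfit)
```

The χ² statistic used in model evaluation is commonly computed as (N − 1) times the minimized F, where N is the sample size.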
Measurement model
• The part of the model that relates indicators to latent factors
• The measurement model is the factor analytic part of SEM
• The respective regression coefficient is called lambda (λ) or loading
Structural model
• The part of the model that includes the relationships between the latent variables
• The relation between an exogenous and an endogenous construct is called gamma (γ), and the relation between two endogenous constructs is called beta (β)
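[Figure: full SEM path diagram]
Measurement model
• Exogenous latent variables ξ1, ξ2, ξ3 with indicators X1 to X6, loadings λx11, λx21 (ξ1), λx32, λx42 (ξ2), λx53, λx63 (ξ3), and measurement errors δ
• Endogenous latent variables η1, η2 with indicators y1 to y4, loadings λy11, λy21 (η1), λy32, λy42 (η2), and measurement errors ε1 to ε4
Structural model
• Paths from exogenous to endogenous latent variables: γ11, γ12, γ22, γ23
• Path between the endogenous latent variables: β21
• Correlations among the exogenous latent variables: ϕ21, ϕ31, ϕ32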
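The path labels in the diagram (γ11, γ12, γ22, γ23, β21) suggest the structural equations η1 = γ11ξ1 + γ12ξ2 + ζ1 and η2 = β21η1 + γ22ξ2 + γ23ξ3 + ζ2. This reading assumes the usual LISREL subscript convention (first index is the target, second the source) and is not spelled out on the slide. Under that assumption, the sketch below shows how direct, indirect and total effects on η2 would be obtained; the coefficient values and the function name are hypothetical.

```python
def effects_on_eta2(gamma11, gamma12, gamma22, gamma23, beta21):
    """Direct, indirect (via eta1) and total effects of xi1..xi3 on eta2."""
    # Direct paths into eta2; xi1 has no direct path in the assumed model
    direct = {"xi1": 0.0, "xi2": gamma22, "xi3": gamma23}
    # Indirect effect = product of the coefficients along the path xi -> eta1 -> eta2
    indirect = {"xi1": gamma11 * beta21, "xi2": gamma12 * beta21, "xi3": 0.0}
    total = {name: direct[name] + indirect[name] for name in direct}
    return direct, indirect, total


# Hypothetical coefficient values, for illustration only
direct, indirect, total = effects_on_eta2(
    gamma11=0.5, gamma12=0.3, gamma22=0.2, gamma23=0.4, beta21=0.6
)
print(indirect)
print(total)
```

Here ξ1 influences η2 only through the mediating variable η1, which is the kind of indirect effect a single regression equation cannot represent.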
6. Model evaluation
Total model
• Chi-square (χ²) test
  • the theoretically expected values vs. the empirical data
  • because we are dealing with a measure of misfit, the p-value for χ² should be larger than .05 to decide that the theoretical model fits the data
• fit indices, e.g. RMSEA, CFI, NNFI, etc.
Model parts
• t-value for the estimated parameters, showing whether they are different from 0 (or any other value that we want to fix!); |t| > 1.96, p < .05
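The parameter-level check can be sketched as follows: the t-value is the distance between the estimate and the fixed value, divided by the standard error, and it is compared with 1.96 in absolute value. The estimates and standard errors below are hypothetical, as is the function name `is_significant`.

```python
def is_significant(estimate, std_error, fixed_value=0.0, critical=1.96):
    """Return (t, flag): the t-value and whether |t| exceeds the critical value."""
    t = (estimate - fixed_value) / std_error
    return t, abs(t) > critical


# Hypothetical path coefficient tested against 0
t_zero, differs_from_zero = is_significant(estimate=0.42, std_error=0.15)

# Hypothetical loading tested against a fixed value of 1.0
t_one, differs_from_one = is_significant(estimate=0.90, std_error=0.10, fixed_value=1.0)

print(t_zero, differs_from_zero)
print(t_one, differs_from_one)
```

The 1.96 cutoff corresponds to a two-sided test at the 5% level under a normal approximation, which is the threshold given on the slide.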
7. Model modification
Simplify the model (i.e., delete non-significant parameters or parameters with large standard error)
Expand the model (i.e., include new paths)
Confirmatory vs. exploratory
• Don’t go too far with model modification!
Advantages of SEM
• use of confirmatory factor analysis to reduce measurement error by having multiple indicators per latent variable
• graphical modeling interface
• testing models overall rather than coefficients individually
• testing models with multiple dependents
• modeling indirect effects (mediating variables)
• testing coefficients across multiple between-subjects groups
• handling difficult data (time series with autocorrelated error, non-normal data, incomplete data)
SEM in ecology, example
[Figure: structural model with the boxes Phytoplankton dynamics, Nutrients, Herbivore, Physical environment and Water clarity]
Example from: G.B. Arhonditsis, C.A. Stow, L.J. Steinberg, M.A. Kenney, R.C. Lathrop, S.J. McBride, K.H. Reckhow. Exploring ecological patterns with structural equation modeling and Bayesian analysis. Ecological Modelling 192 (2006) 385-409
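[Figure: the same model with measured variables added. Latent variables: Phytoplankton dynamics, Nutrients, Herbivore. Measured variables shown: Phosphorus (SRP), Nitrogen (DIN), Chlorophyll a, Biovolume, Zooplankton, Daphnia, Epilimnion depth, water clarity]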
[Figure: the same model with parameter labels. Epilimnion depth is marked as the physical environment. Labels shown: loadings λ2 to λ7; paths γ1, γ2, β1, β2; correlation φ12; variances ψ11, ψ22, ψ33; error terms δ2, δ3, ε1, ε2, ε4, ε5]
[Figure: the same model with estimated values. Estimates shown in the figure, in transcript order (their assignment to paths is not recoverable from the transcript): 0.67, 0.79, 0.83, 0.93, -0.66, 0.82, -0.92, 0.91, 0.89, 0.99, 0.96, 0.84, 0.42, 0.43, 0.84, 0.76, 0.71, 0.98, -0.07, -0.84]
χ² = 22.473; df = 19
p = 0.261 > 0.05, OK!
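The reported p-value can be checked from the χ² value and the degrees of freedom alone. The sketch below computes the upper-tail probability of the chi-square distribution for integer degrees of freedom using only the standard library, via the recurrence Q(a + 1, z) = Q(a, z) + zᵃe⁻ᶻ/Γ(a + 1) for the regularized upper incomplete gamma function. The function name `chi2_sf` is my own; a statistics package would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-z)                 # Q(1, z)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(z))      # Q(1/2, z)
    # Step a up by 1 until it reaches df / 2
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q


# Values reported for the phytoplankton example
p_value = chi2_sf(22.473, 19)
print(round(p_value, 3))
```

The result agrees with the p = 0.261 on the slide. Since it is above 0.05, the hypothesis Σ = Σ(θ) is not rejected and the model is taken to fit the data.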
SEM Software packages
LISREL, AMOS, function sem in R, Mplus, EQS, Mx, SEPATH
References: http://www.upa.pdx.edu/IOA/newsom/semrefs.htm