Copyright 2007, The Johns Hopkins University and Qian-Li Xue. All rights reserved. Use of these materials permitted only in accordance with license rights granted. Materials provided “AS IS”; no representations or warranties provided. User assumes all responsibility for use, and all liability related thereto, and must independently review all materials for accuracy and efficacy. May contain materials owned by others. User is responsible for obtaining permissions for use from third parties as needed.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site.
Advanced Structural Equations Models I
Statistics for Psychosocial Research II: Structural Models
Qian-Li Xue
[Figure: decision flowchart for choosing a model. Decision nodes (each with Yes/No branches): Test of causal hypotheses?; Continuous endogenous var. and continuous LV?; Categorical indicators and categorical LV?; Longitudinal data?; Multilevel data? Model nodes: Ordinary Regression; SEM (Origin: Path Models); Classic SEM; Latent Class Reg.; Latent Trait; Latent Profile; Adv. SEM I: latent growth curves; Adv. SEM II: Multilevel Models.]
Outline
1. Estimating means of observed and latent variables
2. Modeling repeated measures of outcome over time
   The Simplex-Growth Over Time
3. Non-Recursive Models
4. Modeling repeated measures of outcome and covariate over time
   Cross-Lag Panel Analysis
   Latent Growth Curve Models (Next Lecture)
1. Estimating Means of Observed and Latent Variables
Estimating Means of Observed and Latent Variables
So far, we have largely ignored intercept terms in our analyses
What has happened to the alpha coefficient?
Estimating Means of Observed and Latent Variables
Up to now, information on means and intercepts has not been of interest
It is possible to estimate levels of association without information on these parameters
If of interest, these parameters can be estimated using a “mean model.”
In addition to covariances, these models also require information on mean of variables
These parameters are of key interest in group comparisons and growth curve models
Estimating Means of Observed and Latent Variables
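Does the mean score on the latent variable ξ (e.g. depression) differ between men and women?
[Path diagram: two-group model with one latent variable per group (ξ11 for men, ξ12 for women), each measured by three indicators. The intercepts a, b, c are paths from the constant 1 and are shared across groups; the latent mean is d for men and e for women.]
Indicator        X1     X2     X3
Loading          0.6    0.8    0.7
Resid. Var.      .64    .36    .51
Means, men       4.0    5.0    6.0
Means, women     4.3    5.4    6.35
(Loehlin p.139)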
Estimating Means of Observed and Latent Variables
d=0 (reference)
a=4.0,b=5.0,c=6.0 (baseline values, same across groups)
e – difference between the means of the latent variable
e*0.6+a=4.3 ⇒ e=0.5
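The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: it assumes the loadings 0.6, 0.8, 0.7 belong to X1, X2, X3 in that order (the order that reproduces the residual variances .64, .36, .51), and the variable names are illustrative.

```python
# Values read off the two-group example (Loehlin p.139).
loadings = [0.6, 0.8, 0.7]        # assumed order X1, X2, X3; equal across groups
intercepts = [4.0, 5.0, 6.0]      # a, b, c: baseline values shared by both groups
means_women = [4.3, 5.4, 6.35]    # observed indicator means for women

d = 0.0  # latent mean for men, fixed at 0 as the reference

# Each indicator gives its own solution for e: (observed mean - intercept) / loading
e_estimates = [(m - a) / l for m, a, l in zip(means_women, intercepts, loadings)]
e = sum(e_estimates) / len(e_estimates)

# Means implied for women by the shared intercepts and loadings plus e
implied_women = [a + l * e for a, l in zip(intercepts, loadings)]

print([round(x, 3) for x in e_estimates])
print(round(e, 3), [round(x, 2) for x in implied_women])
```

All three indicators lead to the same value of e, which is why a single latent mean difference is enough to account for the three observed mean differences in this example.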
Example: Stress, Resources, and Depression (Holahan & Moos, 1991)
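How do the high-stressor and the low-stressor groups compare on the two latent variables, depression (D) and resources (R)?
[Path diagram, high-stressor group: latent D with indicators DM and DF; latent R with indicators SC, EG, and FS. Parameter labels: a, b, c, d, e = path coefficients; m, n, o, p, q = residual variances; h, i, j, k, l = baseline means (paths from the constant 1); f, g = latent means; r = correlation between D and R.]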
Example: Stress, Resources, and Depression (Holahan & Moos, 1991)
                         DM     DF     SC     EG     FS  |    SD       M
Depressed Mood (DM)       1    .84   -.36   -.45   -.51  |   5.97    8.82
Depressive Features (DF) .71     1   -.32   -.41   -.50  |   7.98   13.87
Self-confidence (SC)    -.35   -.16     1    .26    .47  |   3.97   15.24
Easygoingness (EG)      -.35   -.21    .11     1    .34  |   2.27    7.92
Family support (FS)     -.38   -.26    .30    .28     1  |   4.91   19.03
Standard Deviation      4.84   6.33   3.84   2.14   4.43 |   N      128
Mean                    6.15   9.96  15.14   8.80  20.43 |          126
High-stressor group: correlations above the diagonal, SD and M in the right-hand columns
Low-stressor group: correlations below the diagonal, SD and Mean in the bottom rows
Example: Stress, Resources, and Depression (Holahan & Moos, 1991)
[Path diagram, low-stressor group: latent D with indicators DM and DF; latent R with indicators SC, EG, and FS; parameter labels a through r.]
MPLUS code
TITLE:    Stress, resources, and depression (Loehlin, p.142)
DATA:     FILE is c:/teaching/140.658.2007/depression.dat;
          TYPE IS CORRELATION MEANS STDEVIATIONS;
          NOBSERVATIONS ARE 126 128;
          NGROUPS=2;
VARIABLE: NAMES ARE DM DF SC EG FS;
          USEVARIABLES ARE DM-FS;
MODEL:    ! Equate the measurement models across the groups
          D BY DM* DF;
          R BY SC* EG FS;
          DM (1);
          DF (2);
          SC (3);
          EG (4);
          FS (5);
MODEL g1: ! Set reference group (i.e. low-stressor)
          [D@0 R@0];
          D@1 R@1;
OUTPUT:   TECH1;
Example: Stress, Resources, and Depression (Holahan & Moos, 1991)
Latent variables
                         Low-Stressor    High-Stressor
Depression: Mean   f        [0]*             0.63
Resources: Mean    g        [0]             -0.50
Depression: SD              [1]              1.30
Resources: SD               [1]              1.29
Correlation        r       -0.72            -0.78

Measurement model (same in both groups)
Indicator    Path Coeff.    Residual Var.    Baseline means
DM           a   4.42       m   2.91         h   6.09
DF           b   5.22       n  16.04         i  10.27
SC           c   1.56       o  11.76         j  15.59
EG           d   1.01       p   3.61         k   8.61
FS           e   2.67       q  12.25         l  20.40

* Numbers in [ ] are fixed in advance in order to make the model identified
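One way to read these estimates is to rebuild the indicator means they imply for the high-stressor group: baseline mean plus path coefficient times the latent mean. The sketch below assumes the letters map onto indicators in the order DM, DF (on D) and SC, EG, FS (on R), and that the right-hand M column of the data table holds the high-stressor means; the dictionary names are illustrative.

```python
# Estimates from the results table (measurement model shared by both groups).
indicators = ["DM", "DF", "SC", "EG", "FS"]
factor_of = {"DM": "D", "DF": "D", "SC": "R", "EG": "R", "FS": "R"}
loading = {"DM": 4.42, "DF": 5.22, "SC": 1.56, "EG": 1.01, "FS": 2.67}       # a-e
baseline = {"DM": 6.09, "DF": 10.27, "SC": 15.59, "EG": 8.61, "FS": 20.40}   # h-l
latent_mean_high = {"D": 0.63, "R": -0.50}                                   # f, g

# Observed high-stressor means (assumed to be the M column of the data table).
observed_high = {"DM": 8.82, "DF": 13.87, "SC": 15.24, "EG": 7.92, "FS": 19.03}

# Implied mean = baseline mean + path coefficient * latent mean of its factor
implied_high = {
    x: baseline[x] + loading[x] * latent_mean_high[factor_of[x]]
    for x in indicators
}

for x in indicators:
    print(x, round(implied_high[x], 2), observed_high[x])
```

In the low-stressor group both latent means are fixed at 0, so the implied means there are simply the baseline means h through l.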
Example: Stress, Resources, and Depression (Holahan & Moos, 1991)
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
  Value 27.245
  Degrees of Freedom 19
  P-Value 0.0991
CFI/TLI
  CFI 0.979
  TLI 0.978
RMSEA (Root Mean Square Error Of Approximation)
  Estimate 0.058
  90 Percent C.I. 0.000 0.104
SRMR (Standardized Root Mean Square Residual)
  Value 0.055
The model fits the data reasonably well!
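The 19 degrees of freedom can be recovered by counting sample moments and free parameters, and the RMSEA can be approximated from the chi-square. The sketch below is a rough check, not the software's own computation: the parameter breakdown follows the results table, and the multi-group RMSEA formula with a square-root-of-groups factor is an assumption about how the estimate was obtained.

```python
import math

n_groups = 2
n_indicators = 5
n_total = 126 + 128

# Sample moments per group: 5 variances + 10 covariances + 5 means.
moments = n_groups * (n_indicators * (n_indicators + 1) // 2 + n_indicators)

# Free parameters, with the measurement model equated across groups.
free_params = (
    5    # path coefficients a-e
    + 5  # residual variances m-q
    + 5  # baseline means h-l
    + 1  # correlation r in the reference group
    + 2  # latent means f, g in the other group
    + 2  # latent SDs in the other group
    + 1  # correlation r in the other group
)
df = moments - free_params

chi_square = 27.245
# Assumed formula: sqrt(G) * sqrt(max(chi2/df - 1, 0) / N)
rmsea = math.sqrt(n_groups) * math.sqrt(max(chi_square / df - 1, 0) / n_total)

print(moments, free_params, df, round(rmsea, 3))
```

Under these assumptions the count gives 40 moments and 21 free parameters, and the approximate RMSEA lands close to the reported estimate.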
2. Modeling Repeated Measures of Outcome Over Time
The Simplex-Growth Over Time
Modeling growth over time (e.g. height)
Measurements taken repeatedly over time
In general, measurements made closer together in time would be more highly correlated (called “simplex” by Guttman, 1954)
E.g.
       1     2     3     4
1      1   0.73  0.72  0.68
2            1   0.79  0.76
3                  1   0.84
4                        1
Correlations become smaller as the measurements are farther apart in time
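A simple way to check for the simplex pattern is to confirm that, for any three occasions i < j < k, the correlation between the two outer occasions is no larger than either correlation involving the middle one. The function name below is illustrative, and this is a descriptive check only, not a test of model fit.

```python
# Correlation matrix from the slide, filled in symmetrically.
corr = [
    [1.00, 0.73, 0.72, 0.68],
    [0.73, 1.00, 0.79, 0.76],
    [0.72, 0.79, 1.00, 0.84],
    [0.68, 0.76, 0.84, 1.00],
]

def has_simplex_pattern(r):
    """True if correlations never increase as occasions get farther apart."""
    n = len(r)
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                # The outer pair (i, k) spans both closer pairs (i, j) and (j, k).
                if r[i][k] > r[i][j] or r[i][k] > r[j][k]:
                    return False
    return True

print(has_simplex_pattern(corr))
```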
The Simplex-Growth Over Time
Example: Scores on standardized tests of academic achievement at grades 1-7 (Bracht & Hopkins, 1972)
Test score (Y) is a measure of the latent academic achievement (η)
Achievement at grade t is a function of achievement at t-1 via β, and other factors ζ
[Path diagram: latent η1 through η7 in a chain, each ηt predicted by η(t-1) with coefficients β21, β32, β43, β54, β65, β76 and disturbances ζ2 through ζ7; each ηt has a single indicator Yt with loading 1 and error εt.]
Loehlin p.125
The Simplex-Growth Over Time
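[Path diagram: the simplex model, η1 through η7 in a chain with coefficients β21 through β76, disturbances ζ2 through ζ7, and single indicators Y1 through Y7 (loadings 1, errors ε1 through ε7).]
$Y_i = \eta_i + \varepsilon_i$
$\eta_i = \beta_i \eta_{i-1} + \zeta_i$
(here $\beta_i$ is the path into $\eta_i$ from $\eta_{i-1}$, labelled β21, β32, ... in the diagram)
The $\varepsilon_i$ are uncorrelated, $\varepsilon_i \perp \eta_i$, and $\zeta_i \perp \eta_{i-1}$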
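These two equations, together with the independence assumptions, determine the covariance matrix of the Ys. The sketch below builds it for hypothetical parameter values (chosen only for illustration, not taken from the Bracht & Hopkins data): Var(η) is propagated forward, and the covariance between two occasions is the earlier Var(η) times the product of the β paths linking them.

```python
def simplex_implied_cov(var_eta1, betas, var_zetas, var_eps):
    """Model-implied covariance matrix of Y1..Yp under the simplex model.

    betas[k] is the path from eta_(k+1) to eta_(k+2) (0-based k);
    var_zetas[k] is the disturbance variance of eta_(k+2).
    """
    p = len(var_eps)
    # Var(eta_t) = beta^2 * Var(eta_(t-1)) + Var(zeta_t)
    var_eta = [var_eta1]
    for b, vz in zip(betas, var_zetas):
        var_eta.append(b * b * var_eta[-1] + vz)

    cov = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            path = 1.0
            for k in range(i, j):
                path *= betas[k]          # product of the betas between i and j
            value = var_eta[i] * path
            if i == j:
                value += var_eps[i]       # measurement error only on the diagonal
            cov[i][j] = cov[j][i] = value
    return cov

# Hypothetical values for a four-wave model.
betas = [0.9, 0.9, 0.9]
var_zetas = [0.19, 0.19, 0.19]    # keeps Var(eta) at 1.0 at every wave
var_eps = [0.2, 0.2, 0.2, 0.2]

implied = simplex_implied_cov(1.0, betas, var_zetas, var_eps)
for row in implied:
    print([round(v, 3) for v in row])
```

Because each additional step multiplies in another β below 1, the implied covariances shrink as occasions get farther apart, which is the simplex pattern.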
The Simplex-Growth Over Time
Var(η1), Var(ζ7), Var(ε1), Var(ε2), β21 are unidentified
To achieve identification, set Var(ε1)=Var(ε2) AND Var(ε6)=Var(ε7), reasonable if Ys are on the same scale
# free parameters = 3p-3, where p = # of Ys
For testing a simplex model, p>3 !!!
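The requirement p > 3 follows from counting. With the two equality constraints, the free parameters are Var(η1), p-1 β paths, p-1 disturbance variances, and p-2 distinct error variances, which sums to 3p-3. The sketch below compares that with the p(p+1)/2 variances and covariances; the function name is illustrative.

```python
def simplex_degrees_of_freedom(p):
    """Degrees of freedom for a simplex model with p occasions."""
    moments = p * (p + 1) // 2                   # variances and covariances of the Ys
    free = 1 + (p - 1) + (p - 1) + (p - 2)       # Var(eta1), betas, Var(zeta)s, Var(eps)s
    assert free == 3 * p - 3
    return moments - free

for p in (3, 4, 7):
    print(p, simplex_degrees_of_freedom(p))
```

With p = 3 there are no degrees of freedom left, so the model cannot be tested; the seven-grade example has ten.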
3. Non-Recursive Models
Non-Recursive Models
So far, there has been little discussion of models with feedback loops
Non-recursive models deal with reciprocal causal relationships
Cannot be analyzed by ordinary regression analysis due to correlated errors
Non-recursive models may not be identified even if the T-rule is met
Non-Recursive Models
What do you mean by “reciprocal causation”?
Alternative: Lagged model
Assumption: the principle of “finite causal lag”
Roles of the variables in the bidirectional relationship change over time (e.g. A is a cause at Time 1, but effect at Time 2)
The reciprocal causation model becomes the only choice if only cross-sectional data are available
[Diagrams: “Reciprocal” model with A and B at a single time point; “Lagged” model with A and B at Time 1 and at Time 2.]
Non-Recursive Models: Model Identification
Recall: recursive path models without measurement error are always identified
Not true for non-recursive models
Definition: Instrumental variable – a predictor is an instrument for an endogenous variable if it has a direct path to other endogenous variables but not the endogenous variable of interest
[Path diagram with exogenous X1, X2, X3 and endogenous Y1, Y2]
X3 is an instrument for Y1
Maruyama, 1998; p.106
Non-Recursive Models: Model Identification
Order condition (necessary but not sufficient) – For any system of N endogenous variables, a particular equation is identified only if at least N-1 variables are left out of that equation
Rank condition (necessary AND sufficient) – is met for a particular equation if there is at least one non-zero determinant of rank N-1 from the coefficients of the variables omitted from that equation
[Path diagram with exogenous X1, X2, X3 and endogenous Y1, Y2]
Maruyama, 1998; p.106
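The order condition is a counting rule, so it is easy to sketch in code. The path structure below is hypothetical (the arrows of the slide's diagram are not reproduced in this transcript); it is chosen so that X3 is left out of the Y1 equation, matching the statement that X3 is an instrument for Y1. The rank condition is not checked here.

```python
def order_condition(equations, all_variables, n_endogenous):
    """For each equation, check that at least N-1 variables are left out of it."""
    result = {}
    for outcome, predictors in equations.items():
        excluded = set(all_variables) - set(predictors) - {outcome}
        result[outcome] = len(excluded) >= n_endogenous - 1
    return result

all_variables = ["X1", "X2", "X3", "Y1", "Y2"]

# Hypothetical non-recursive model: Y1 and Y2 cause each other.
equations = {
    "Y1": ["Y2", "X1", "X2"],   # X3 is left out of the Y1 equation
    "Y2": ["Y1", "X2", "X3"],   # X1 is left out of the Y2 equation
}
print(order_condition(equations, all_variables, n_endogenous=2))

# If X3 also pointed at Y1, nothing would be left out of the Y1 equation.
no_instrument = {
    "Y1": ["Y2", "X1", "X2", "X3"],
    "Y2": ["Y1", "X2", "X3"],
}
print(order_condition(no_instrument, all_variables, n_endogenous=2))
```

Passing this check is necessary but not sufficient, which is why the rank condition is stated alongside it.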
4. Modeling Repeated Measures of Outcome and Covariate Over Time
Cross-Lagged Panel Analysis: Terminology
[Path diagram: X1 and Y1 at Time 1, X2 and Y2 at Time 2, with residuals eX1, eX2, eY1, eY2]
Synchronous correlations: Corr(X1,Y1) and Corr(X2,Y2)
Autocorrelations (i.e. stability): Corr(X1,X2) and Corr(Y1,Y2)
Cross-lagged: Corr(X1,Y2) and Corr(Y1,X2)
Residual correlations (due to measure-specific variance): Corr(eX1,eX2) and Corr(eY1,eY2)
Here Corr. denotes total correlation!
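The first three kinds of correlation are plain total correlations between observed variables, so they can be computed directly from two-wave data. The scores below are made up purely for illustration, and the helper name is hypothetical. Residual correlations are not included because they require a fitted model.

```python
def pearson(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / (s_uu * s_vv) ** 0.5

# Hypothetical two-wave scores for six people (illustration only).
panel = {
    "X1": [2, 4, 5, 7, 8, 10],
    "Y1": [1, 3, 2, 6, 5, 9],
    "X2": [3, 4, 6, 7, 9, 10],
    "Y2": [2, 3, 4, 6, 7, 9],
}

pairs = {
    "synchronous": [("X1", "Y1"), ("X2", "Y2")],
    "autocorrelation": [("X1", "X2"), ("Y1", "Y2")],
    "cross-lagged": [("X1", "Y2"), ("Y1", "X2")],
}

correlations = {
    kind: {f"Corr({a},{b})": pearson(panel[a], panel[b]) for a, b in ab}
    for kind, ab in pairs.items()
}

for kind, values in correlations.items():
    print(kind, {name: round(r, 2) for name, r in values.items()})
```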
Cross-Lagged Panel Analysis: Identification
[Path diagram: X1 and Y1 at Time 1, X2 and Y2 at Time 2, with residuals eX1, eX2, eY1, eY2]
Is this model identified?
# equations = 4*5/2 = 10
# unknowns = 11
Not identified!
What is the problem?
The repeated assessment of the same measure leads to two sources of common variance
Construct variance
Measure-specific variance
Model would be identified if we delete the residual correlations, or
Build multiple-indicator models
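The counting on this slide can be written out as a small helper. The sketch assumes that "deleting the residual correlations" removes exactly two unknowns, Corr(eX1,eX2) and Corr(eY1,eY2); the function name is illustrative. A non-negative count is necessary for identification but does not guarantee it.

```python
def counting_rule(n_observed, n_unknowns):
    """Return (# equations, # equations - # unknowns) for a covariance model."""
    n_equations = n_observed * (n_observed + 1) // 2
    return n_equations, n_equations - n_unknowns

# Model as drawn: 4 observed variables, 11 unknowns.
equations, surplus_full = counting_rule(4, 11)

# Assumed effect of deleting the two residual correlations: 9 unknowns.
_, surplus_trimmed = counting_rule(4, 11 - 2)

print(equations, surplus_full, surplus_trimmed)
```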
Cross-Lagged Panel Analysis: Key Issues (Maruyama, pp.112-120)
1. Stability of a variable
For example, if Y is perfectly stable, Y2 is perfectly determined by Y1
If data are only available at Time 2, then Y1 is not available
Any variable correlated with Y or caused by Y could be included as predictors, leading to a misspecified model!
Low stability over time may result from poor reliability (if so, we’re in trouble!) or
Real change in the measure
[Path diagram: Y1 at Time 1; Y2 and X2 at Time 2, with residuals eX2 and eY2]
Cross-Lagged Panel Analysis: Key Issues
2. Temporal Lags
How long is the causal lag?
If the sampling interval > causal lag ⇒ attenuated effect
If the sampling interval < causal lag ⇒ no effect or underestimated effect
What if the causal lag from X1 to Y2 is different from Y1 to X2?
Solution: three-wave data with different intervals
[Path diagram: X1 and Y1 at Time 1, X2 and Y2 at Time 2, with residual terms]
Cross-Lagged Panel Analysis: Key Issues
3. Growth Across Time
When to use covariance vs. correlation data in SEM
Covariance allows for “growth” by focusing on raw scores
Correlation focuses on standardized relationships
If no change in variability of any of the variables over time, the results are identical
Using covariance is highly recommended!
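A small numeric example shows what standardization hides. The scores below are hypothetical: the Time 2 scores keep the same ordering as Time 1 but are twice as spread out. The covariance matrix records that change in variability, while the correlation does not.

```python
def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Sample covariance (n - 1 in the denominator)."""
    mean_u, mean_v = mean(u), mean(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (len(u) - 1)

def corr(u, v):
    return cov(u, v) / (cov(u, u) * cov(v, v)) ** 0.5

# Hypothetical scores: same rank order at both waves, doubled spread at Time 2.
y1 = [1.0, 2.0, 3.0, 4.0, 5.0]
y2 = [2.0 * y for y in y1]

var_y1 = cov(y1, y1)
var_y2 = cov(y2, y2)
r_y1_y2 = corr(y1, y2)

print(var_y1, var_y2, cov(y1, y2), r_y1_y2)
```

The variance grows from wave to wave, which a covariance-based analysis can model; the correlation stays the same however the spread changes.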
Cross-Lagged Panel Analysis: Key Issues
4. Stability of Causal Process
Causal dynamics between variables remain stable across time intervals of the same length
If not true, the relationships would differ depending on the particular interval sampled
On the other hand, modeling unstable processes may be warranted when studying
Developmental processes
Time-varying interventions
Cross-Lagged Panel Analysis with Latent Variables: Example
(Ma & Xu, Journal of Adolescence 27 (2): 165-179 APR 2004 )
[Path diagram: latent Anxiety at Grade 7, Grade 8, Grade 9, ..., each measured by the indicators “Nervous or upset” and “Often get scared”, with numeric estimates on the paths.]
Cross-Lagged Panel Analysis with Latent Variables: Example
(Ma & Xu, Journal of Adolescence 27 (2): 165-179 APR 2004 )
[Path diagram: latent Achievement at Grade 7, Grade 8, ..., each measured by the indicators Basic skills, Algebra, Geometry, and Literacy, with numeric estimates on the paths.]
[Figure: cross-lagged panel model linking latent Achievement and latent Anxiety at Grades 7 through 12. Path values, in the order extracted from the figure:
0.39  0.55  0.57  0.59  0.57
0.91  0.95  0.97  0.97  0.98
-0.05  -0.05  -0.01  -0.02  -0.02
-0.20  -0.12  -0.14  -0.15  -0.11]
Example of cross-lagged panel analysis with latent variables. Structural equation model estimating the causal relationship between mathematics anxiety & mathematics achievement across Grades 7–12. Large ovals represent latent factors & unidirectional arrows represent causal links. All parameter estimates for unidirectional paths are standardized. Pink boxes indicate P < 0.001. Adapted from Ma & Xu, Journal of Adolescence 2004;27:165-179