Four Innovative Applications of Hierarchical Linear Modeling (HLM)

Shenyang Guo, Ph.D.

University of North Carolina at Chapel Hill

Acknowledgment

Support for this research is provided by the Discretionary Grants Program of the Children’s Bureau to Shenyang Guo (PI). The project aims to develop innovative quantitative methods for child welfare research.

Overview of HLM

Why HLM? The need to study multilevel influences on an outcome variable, and to run growth curve analysis.

Central problem: intraclass correlation.

Conceptually, one may view HLM as running a regression model several times, at 2 or 3 levels.

Other names: multilevel analysis, mixed-effects model, random-effects model, growth curve analysis, random-coefficient regression model, covariance components model.

The key idea is to estimate random effects. In addition to traditional regression coefficients, HLM estimates a set of random effects associated with each higher-level unit, which can be used to control for autocorrelation.
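The intraclass correlation named above as the central problem can be illustrated with a small sketch. It uses the one-way ANOVA estimator for a balanced design, which is an assumption of this example and not necessarily the estimator HLM software reports; the function name `icc_oneway` and the toy data are hypothetical.

```python
from statistics import mean


def icc_oneway(groups):
    """One-way ANOVA estimate of the intraclass correlation.

    groups: list of equally sized lists, one list of outcomes per
    higher-level unit (e.g., one list per neighborhood).
    """
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")

    grand = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Between-group and within-group mean squares.
    ms_between = n * sum((m - grand) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (k * (n - 1))

    # Share of total variance attributable to group membership.
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)


# Hypothetical data: three units with three observations each.
toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(round(icc_oneway(toy), 3))
```

A value near 1 means observations within the same unit are very similar, which is the situation where ignoring the grouping in an ordinary regression is most misleading.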

Four innovative models

1. Latent-variable analysis of HLM

2. Omnibus score of CBCL & TRF using latent-variable HLM

3. Meta analysis using HLM

4. Modeling multivariate change

Latent-variable analysis (1)

Latent variables: variables that are not directly observed. Under this framework, any observed variable is an indicator, and can be viewed as a latent true-score plus measurement error.

Statistical models for analyzing latent variables: structural equation modeling: (1) measurement model – relations between indicator and latent variable; (2) structural model – relations among latent variables.

In HLM, a latent-variable analysis consists of two parts: measurement model, and structural model involving explanatory variables.

Latent-variable analysis (2) Example

Sampson, Raudenbush, & Earls (1997, Science 277(15): 918-924) applied this approach to analyzing multilevel influences of collective efficacy, in which they view collective efficacy as a latent variable. Their three-level HLM treats ten items collected from all survey respondents as level 1, and conceptualizes that these items are commonly determined by a latent true score “collective efficacy” plus measurement errors. Their model then explores how informants within neighborhoods (i.e., level 2) vary randomly around the neighborhood mean of “collective efficacy”, and how neighborhoods across the whole study area (i.e., level 3) vary randomly about the grand mean of “collective efficacy”.

Omnibus score of CBCL & TRF (1)

Disentangle multiple raters’ measurement error from clients’ true change (Guo & Hussey, 1999, Social Work Research 23(4): 258-269).

Ratings are likely to be collected by multiple raters (e.g., Achenbach instruments: CBCL, TRF, & YSR).

Attrition can also occur among raters.

None of the prior studies (before 1999) controlled for raters’ impact on ratings, though many used multiple raters to collect ratings.

A theoretical framework to investigate multiple sources of measurement error: Cronbach’s Generalizability Theory.

The Need: Hypothetical Data

Two raters’ ratings on a single subject

[Figure: four panels (1a, 1b, 1c, 1d), each plotting Y (0 to 70) against Time (0 to 10) for Rater A and Rater B]

Omnibus score of CBCL & TRF (2)

Omnibus score of CBCL & TRF (3) Problems & solutions

Problem 1: Divergent ratings made by caregiver, teacher, and youth self

Type of data: Both cross-sectional & longitudinal

Causes of the problem: Three versions of rating forms: CBCL, TRF, & YSR

Solution: A three-level HLM with latent-variable analysis; or Guo & Hussey’s three-level HLM

Major results: Both models estimate a “true score” (an omnibus score based on multiple ratings) for each study child. Both facilitate a multivariate analysis identifying significant predictors of true-score differences in the study sample.

Problem 2: Attrition of raters or missing ratings

Type of data: Longitudinal

Causes of the problem: A teacher may change jobs during a longitudinal study and no longer be a member of the TRF collection team. A caregiver may miss one or more CBCL collections. A youth may miss one or more YSR collections.

Solution: Guo & Hussey’s three-level HLM

Major results: The model estimates a “true change trajectory” (an omnibus trajectory based on all available ratings) for each study child. The model facilitates an inter-individual analysis that identifies significant predictors of the overall trajectory of the study sample.


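Level 1: $R_{ijk} = \pi_{jk} + \alpha D_{ijk} + e_{ijk}$

Level 2: $\pi_{jk} = \beta_{00k} + \beta_{01k}(RATER\_X1)_{jk} + \beta_{02k}(RATER\_X2)_{jk} + r_{0jk}$

Level 3: $\beta_{00k} = \gamma_{000} + \gamma_{001}(CHILD\_W1)_{k} + \gamma_{002}(CHILD\_W2)_{k} + u_{00k}$

$\beta_{00k}$ is the omnibus score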

Omnibus score of CBCL & TRF (5) Model 1

Level 1: $R_{ijk} = \pi_{jk} + \alpha D_{ijk} + e_{ijk}$

Level 2: $\pi_{jk} = \beta_{00k} + r_{0jk}$

Level 3: $\beta_{00k} = \gamma_{000} + u_{00k}$


Omnibus score of CBCL & TRF (6) Model 2

Level 1: $R_{ijk} = \pi_{jk} + \alpha D_{ijk} + e_{ijk}$

Level 2: $\pi_{jk} = \beta_{00k} + r_{0jk}$

Level 3: $\beta_{00k} = \gamma_{000} + \gamma_{001}(CHILD\_W1)_{k} + \dots + \gamma_{00q}(CHILD\_Wq)_{k} + u_{00k}$


Omnibus score of CBCL & TRF (6) Illustrating example

Acknowledgment to Dr. Richard Barth and Ms. Ariana Wall at UNC for their help.

Data: National Survey of Child and Adolescent Well-being (NSCAW). We focus on externalizing and internalizing scores.

Each child has four such scores: two from the caregiver (CBCL), and two from the teacher (TRF). The task: how to create one score?

Variables employed in level 3 of Model 2: age, gender, race, social behavior, MBA reading score, MBA math score, count of risky behaviors of delinquency, count of risky behaviors of substance abuse, and count of risky behaviors of suicidal attempt.


Correlation coefficients and descriptive statistics on disagreement between caregiver’s and teacher’s scores (N=448)

_____________________________________________________________________________

                                          Ec             Et            Ic             It
_____________________________________________________________________________

Externalizing rated by caregiver (Ec)     1.000

Externalizing rated by teacher (Et)       .405**         1.000

Internalizing rated by caregiver (Ic)     .639**         .142**        1.000

Internalizing rated by teacher (It)       .174**         .476**        .196**         1.000

Mean (S.D.)                               60.8 (11.52)   58.7 (9.60)   57.3 (11.91)   56.0 (9.69)

_____________________________________________________________________________

** p < .01

Omnibus score of CBCL & TRF (8) Evaluation Schemes:

C1 Caregiver's scores only: .5Ec + .5Ic

C2 Teacher's scores only: .5Et + .5It

C3 All 4 scores from both versions with equal weights: .25Ec + .25Ic + .25Et + .25It

C4 Similar to C3 but heavier weights given to caregiver's scores: .35Ec + .35Ic + .15Et + .15It

C5 Similar to C3 but heavier weights given to teacher's scores: .15Ec + .15Ic + .35Et + .35It

C6 Similar to C3, a 50/50 split between caregiver's and teacher's scores but heavier weights given to externalizing scores: .35Ec + .15Ic + .35Et + .15It

Omnibus score of CBCL & TRF (9) Evaluation Schemes (continued):

C7 Similar to C3, a 50/50 split between caregiver's and teacher's scores but heavier weights given to internalizing scores: .15Ec + .35Ic + .15Et + .35It

C8 Extreme value, low end: .5 [Min (Ec,Et)] + .5 [Min (Ic,It)]

C9 Extreme value, high end: .5 [Max (Ec,Et)] + .5 [Max (Ic,It)]

C10 Arbitrary, half high externalizing and half low internalizing: .5 [Max (Ec,Et)] + .5 [Min (Ic,It)]

C11 Arbitrary, half low externalizing and half high internalizing: .5 [Min (Ec,Et)] + .5 [Max (Ic,It)]
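The eleven comparison schemes are simple weighted combinations of the four scores, so they can be sketched in a few lines. This is only an illustration: the function name `composite_schemes` and the example scores are hypothetical, and C10 is coded with a weight of .5 on the externalizing maximum, following its "half high externalizing" description.

```python
def composite_schemes(ec, et, ic, it):
    """Return the composite scores C1 to C11 for one child.

    ec, et: externalizing scores rated by caregiver and teacher.
    ic, it: internalizing scores rated by caregiver and teacher.
    """
    return {
        "C1": 0.5 * ec + 0.5 * ic,                           # caregiver only
        "C2": 0.5 * et + 0.5 * it,                           # teacher only
        "C3": 0.25 * ec + 0.25 * ic + 0.25 * et + 0.25 * it,  # equal weights
        "C4": 0.35 * ec + 0.35 * ic + 0.15 * et + 0.15 * it,  # favor caregiver
        "C5": 0.15 * ec + 0.15 * ic + 0.35 * et + 0.35 * it,  # favor teacher
        "C6": 0.35 * ec + 0.15 * ic + 0.35 * et + 0.15 * it,  # favor externalizing
        "C7": 0.15 * ec + 0.35 * ic + 0.15 * et + 0.35 * it,  # favor internalizing
        "C8": 0.5 * min(ec, et) + 0.5 * min(ic, it),         # low extreme
        "C9": 0.5 * max(ec, et) + 0.5 * max(ic, it),         # high extreme
        # Assumption: weight .5 (not 1.5) on the externalizing maximum.
        "C10": 0.5 * max(ec, et) + 0.5 * min(ic, it),
        "C11": 0.5 * min(ec, et) + 0.5 * max(ic, it),
    }


# Hypothetical scores for a single child.
scores = composite_schemes(ec=60, et=50, ic=40, it=30)
for name, value in scores.items():
    print(name, round(value, 2))
```

By construction C10 + C11 equals C8 + C9 for every child, which gives a quick consistency check on any table of scheme means.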

Omnibus score of CBCL & TRF (10) Evaluations

_____________________________________________________________________________

                                                    Correlation Coefficient
Scheme       Mean     S.D.    Minimum   Maximum   with Omnibus 1   with Omnibus 2
_____________________________________________________________________________

Omnibus 1    60.56    3.29    52.30     69.21
Omnibus 2    60.56    4.78    49.34     72.81     .707
C1           59.03    10.61   32.00     82.50     .855             .700
C2           57.32    8.29    39.50     80.00     .747             .406
C3           58.18    7.63    39.00     78.25     1.000            .707
C4           58.52    8.49    37.00     79.35     .966             .731
C5           57.84    7.39    39.40     77.15     .955             .620
C6           58.79    7.90    39.30     77.60     .979             .702
C7           57.57    7.69    37.20     79.35     .978             .681
C8           53.00    8.22    32.00     74.00     .929             .656
C9           63.36    8.21    41.50     82.50     .929             .657
C10          57.74    8.09    37.50     76.50     .961             .698
C11          58.62    7.81    39.00     81.00     .958             .658

_____________________________________________________________________________

All correlation coefficients are statistically significant (p<.01)

Omnibus score of CBCL & TRF (11) Use the score as a dependent variable

_____________________________________________________________________________

Scheme       R2 when the score created by the scheme is employed as an outcome variable
_____________________________________________________________________________

Omnibus 1    .388
Omnibus 2    .987
C1           .431
C2           .147
C3           .388
C4           .434
C5           .293
C6           .388
C7           .361
C8           .337
C9           .337
C10          .383
C11          .332

_____________________________________________________________________________

Omnibus score of CBCL & TRF (12) Use the score as an independent variable:

_____________________________________________________________________________

             DV=Substance Abuse Risk     DV=Delinquency Risk         DV=Suicidal Risk
Scheme       P-value    Incremental      P-value    Incremental      P-value    Incremental
             for B      R2               for B      R2               for B      R2
_____________________________________________________________________________

Omnibus 1    .005       0.012            .894       .000             .000       0.080
Omnibus 2    .000       0.043            .797       .000             .000       0.244
C1           .061       0.005            .695       .000             .000       0.076
C2           .008       0.010            .793       .000             .000       0.026
C3           .005       0.012            .894       .000             .000       0.080
C4           .014       0.009            .787       .000             .000       0.085
C5           .003       0.013            .969       .000             .000       0.062
C6           .001       0.017            .635       .000             .000       0.067
C7           .033       0.007            .825       .000             .000       0.087
C8           .015       0.009            .505       .000             .000       0.067
C9           .007       0.011            .678       .000             .000       0.070
C10          .004       0.013            .883       .000             .000       0.078
C11          .015       0.009            .916       .000             .000       0.069

_____________________________________________________________________________

Meta analysis using HLM (1)

Raudenbush & Bryk (2002): Chapter 7

Meta analysis: research synthesis, or a “study of the studies”. Objective: summarize results from a series of related studies.

Collect the following data from literature review, where j indicates the jth study:

$\bar{Y}_{Ej}$: the mean outcome for the experimental group;

$\bar{Y}_{Cj}$: the mean outcome for the control group;

$S_j$: the pooled, within-group standard deviation;

$n_{Ej}$: the sample size of the experimental group;

$n_{Cj}$: the sample size of the control group.

Meta analysis using HLM (2)

Based on these data, calculate effect size:

$d_j = (\bar{Y}_{Ej} - \bar{Y}_{Cj}) / S_j$

And variance of the effect size:

$V_j = (n_{Ej} + n_{Cj}) / (n_{Ej} n_{Cj}) + d_j^2 / [2(n_{Ej} + n_{Cj})]$

Square root of $V_j$ is called “standard error of $d_j$”
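As a worked illustration of the effect size and its variance, the sketch below computes both from the five quantities collected for one study. The function name `effect_size` and the input numbers are hypothetical and do not come from any study in the review.

```python
import math


def effect_size(mean_e, mean_c, pooled_sd, n_e, n_c):
    """Return (d_j, V_j, standard error of d_j) for one study.

    mean_e, mean_c: mean outcomes of the experimental and control groups.
    pooled_sd: pooled, within-group standard deviation S_j.
    n_e, n_c: sample sizes of the experimental and control groups.
    """
    # Standardized mean difference.
    d = (mean_e - mean_c) / pooled_sd
    # Sampling variance of d, treated as known in the V-known model.
    v = (n_e + n_c) / (n_e * n_c) + d ** 2 / (2 * (n_e + n_c))
    return d, v, math.sqrt(v)


# Hypothetical study: means 105 vs. 100, pooled S.D. 10, 50 subjects per group.
d, v, se = effect_size(105, 100, 10, 50, 50)
print(round(d, 3), round(v, 5), round(se, 3))
```

The variance shrinks as either group grows, so large studies receive more weight when the effect sizes are later combined.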

Meta analysis using HLM (3) General model

Level 1: $d_j = \delta_j + e_j$

Level 2: $\delta_j = \gamma_0 + u_j$

or combined model: $d_j = \gamma_0 + u_j + e_j$

where $d_j \sim N(\gamma_0, \Delta_j)$ with $\Delta_j = \tau + V_j$

In this model, we only have one subscript j to indicate study. This is a special case of the two-level model, in which subscript i is omitted, because we don’t have original data at the study subject level.

V-known model: unlike the previous HLMs, this model has known variance $V_j$, or S.E.$(d_j) = \sqrt{V_j}$.

Meta analysis using HLM (4)

Use HLM DOS version to run the v-known model. Data look like this (columns: study ID, effect size $d_j$, variance $V_j$, weeks of prior contact):

1 .030 .016 2.000

2 .120 .022 3.000

3 -.140 .028 3.000

……

19 -.070 .030 3.000

Format of the raw data file: (a11,3f11.3)

See HLM 5 manual pp. 221-226.

Meta analysis using HLM (5) Experimental Studies of Teacher Expectancy Effects on Pupil IQ

_____________________________________________________________________________

Study                               Effect Size    Standard Error   Weeks of Prior
                                    Estimate dj    of dj            Contact
_____________________________________________________________________________

1. Rosenthal et al. (1974)          0.030          0.126            2.000
2. Conn et al. (1968)               0.120          0.148            3.000
3. Jose & Cody (1971)               -0.140         0.167            3.000
4. Pellegrini & Hicks (1972)        1.180          0.373            0.000
5. Pellegrini & Hicks (1972)        0.260          0.369            0.000
6. Evans & Rosenthal (1969)         -0.060         0.105            3.000
7. Fielder et al. (1971)            -0.020         0.105            3.000
8. Claiborn (1969)                  -0.320         0.219            3.000
9. Kester & Letchworth (1972)       0.270          0.164            0.000
10. Maxwell (1970)                  0.800          0.251            1.000
11. Carter (1970)                   0.540          0.302            0.000
12. Flowers (1966)                  0.180          0.224            0.000
13. Keshock (1970)                  -0.020         0.290            1.000
14. Henrickson (1970)               0.230          0.290            2.000
15. Fine (1972)                     -0.180         0.158            3.000
16. Greiger (1970)                  -0.060         0.167            3.000
17. Rosenthal & Jacobson (1968)     0.300          0.138            1.000
18. Fleming & Anttonen (1971)       0.070          0.095            2.000
19. Ginsburg (1970)                 -0.070         0.173            3.000

_____________________________________________________________________________

Meta analysis using HLM (6)

Running HLM, we obtain the following findings:

The estimated grand-mean effect size is 0.084, implying that, on average, experimental students scored about .084 standard deviation units above the controls. However, the estimated variance of the effect parameter is $\hat{\tau} = .019$. This corresponds to a standard deviation of .138 (i.e., $\sqrt{.019} = .138$), which implies that important variability exists in the true-effect sizes. For example, an effect one standard deviation above the average would be .084+.138=.222, which is of nontrivial magnitude.
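For readers without access to HLM, the unconditional V-known model can be approximated with a short sketch. It assumes restricted maximum likelihood for the between-study variance, maximized by a plain grid search; the HLM program's own algorithm may differ, and the names `fit_v_known`, `restricted_loglik`, and `weighted_mean` are hypothetical. The data are the 19 effect sizes and standard errors from the teacher expectancy table.

```python
import math

# (d_j, standard error of d_j) for the 19 teacher expectancy studies.
STUDIES = [
    (0.030, 0.126), (0.120, 0.148), (-0.140, 0.167), (1.180, 0.373),
    (0.260, 0.369), (-0.060, 0.105), (-0.020, 0.105), (-0.320, 0.219),
    (0.270, 0.164), (0.800, 0.251), (0.540, 0.302), (0.180, 0.224),
    (-0.020, 0.290), (0.230, 0.290), (-0.180, 0.158), (-0.060, 0.167),
    (0.300, 0.138), (0.070, 0.095), (-0.070, 0.173),
]


def weighted_mean(d, v, tau):
    """Precision-weighted grand mean for a given between-study variance."""
    w = [1.0 / (vj + tau) for vj in v]
    return sum(wj * dj for wj, dj in zip(w, d)) / sum(w)


def restricted_loglik(d, v, tau):
    """Restricted log-likelihood of tau, up to an additive constant."""
    w = [1.0 / (vj + tau) for vj in v]
    mu = weighted_mean(d, v, tau)
    return -0.5 * (
        sum(math.log(vj + tau) for vj in v)
        + math.log(sum(w))
        + sum(wj * (dj - mu) ** 2 for wj, dj in zip(w, d))
    )


def fit_v_known(studies, upper=1.0, n_steps=10000):
    """Grid-search estimate of (grand mean, tau) with known V_j = SE_j^2."""
    d = [dj for dj, _ in studies]
    v = [se ** 2 for _, se in studies]
    grid = [upper * i / n_steps for i in range(n_steps + 1)]
    tau = max(grid, key=lambda t: restricted_loglik(d, v, t))
    return weighted_mean(d, v, tau), tau


gamma0, tau = fit_v_known(STUDIES)
print(round(gamma0, 3), round(tau, 3), round(math.sqrt(tau), 3))
```

The printed grand mean, variance, and standard deviation can be compared with the values reported from HLM; small differences are to be expected because the standard errors in the table are rounded to three decimals and the estimation routine is not identical.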

Multivariate change (1)

In a cross-sectional study, we use correlation coefficients to see the level of association of an outcome variable with other variables. In a longitudinal study, we have a similar task: we need to model multivariate change, that is, whether two change trajectories (outcome measures) correlate over time. For details of this method, see MacCallum, R.C., & Kim, C. (2000). “Modeling multivariate change”, in Little, Schnabel, & Baumert (Eds.), Modeling Longitudinal and Multilevel Data. Lawrence Erlbaum Associates, pp. 51-68.

Multivariate change (2)

What kind of questions can be answered?

Do the benefits clients gained from an intervention over time correlate negatively with the intervention’s side effects? Does clients’ change in physical health correlate with their change in mental health? Does a program’s designed change in outcome (e.g., abstinence from alcohol or substance abuse) correlate with clients’ level of depression?

Use software MLn/MLwiN to estimate the model. It’s possible to use SAS Proc Mixed.
