1
MIXTURE & MULTILEVEL MODELING
Shaunna Clark & Ryne Estabrook
NIDA Workshop – October 19, 2010
2
OUTLINE
Mixture Models
  What is mixture modeling?
  Growth Mixture Model
  OpenMx
  Genetic Mixture Models
  Other Longitudinal Mixture Models
Multilevel Models
  What is multilevel data?
  Multilevel regression model
  OpenMx
3
HOMOGENEITY VS. HETEROGENEITY
Most models assume homogeneity
  i.e., individuals in a sample all follow the same model
  What we have seen so far today
But this is not always the case
  Ex: Sex, Age, Alcohol Use Trajectories
[Figure: alcohol use trajectories. x-axis: Age (12 to 24); y-axis: Number of Drinks Per Week (0 to 25)]
4
WHAT IS MIXTURE MODELING
Used to model unobserved heterogeneity by identifying different subgroups of individuals
Ex: IQ, Religiosity
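The idea of assigning individuals to unobserved subgroups can be sketched with a two-class normal mixture. This is only an illustration: the class weights, means, and standard deviations below are hypothetical values, not estimates from any data, and the function names are made up for this sketch.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_class_probs(x, weights, means, sds):
    """Posterior probability of each latent class given one observation x."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical two-class mixture (e.g. light vs. heavy drinkers); assumed values.
weights = [0.7, 0.3]   # class proportions
means = [2.0, 12.0]    # class means
sds = [2.0, 4.0]       # class standard deviations

probs_low = posterior_class_probs(1.0, weights, means, sds)
probs_high = posterior_class_probs(15.0, weights, means, sds)
```

Here `probs_low` favors the first class and `probs_high` the second; class membership is never observed, only inferred from the data.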
5
GROWTH MIXTURE MODELING
6
GROWTH MIXTURE MODELING (GMM)
Muthén & Shedden, 1999; Muthén, 2001
Setting
  A single item measured repeatedly
  Hypothesized trajectory classes
  Individual trajectory variation within class
Aims
  Estimate trajectory shapes
  Estimate trajectory class probabilities
    Proportion of sample in each trajectory class
  Estimate variation within class
7
LINEAR GROWTH MODEL DIAGRAM
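[Path diagram: observed variables x1 to x5; latent intercept I with loadings 1, 1, 1, 1, 1 and latent slope S with loadings 0, 1, 2, 3, 4; factor means mInt and mSlope; factor variances σ²Int and σ²Slope with covariance σ²Int,Slope; residual variances σ²ε1 to σ²ε5]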
8
LINEAR GMM MODEL DIAGRAM
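[Path diagram: the linear growth model (observed variables x1 to x5; latent intercept I with loadings 1, 1, 1, 1, 1 and latent slope S with loadings 0, 1, 2, 3, 4; factor means mInt and mSlope; factor variances σ²Int and σ²Slope with covariance σ²Int,Slope; residual variances σ²ε1 to σ²ε5) with an added latent class variable C]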
9
GMM EXAMPLE PROFILE PLOT
10
GMM EXAMPLE PROFILE PLOT
11
GROWTH MIXTURE MODEL EQUATIONS
$x_{itk} = \text{Intercept}_{ik} + \lambda_{tk}\,\text{Slope}_{ik} + \varepsilon_{itk}$
for individual i at time t in class k, with $\varepsilon_{itk} \sim N(0, \sigma)$
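To make the equation concrete, here is a minimal simulation sketch. It assumes several things that are not on the slide: intercepts and slopes are drawn independently (the intercept-slope covariance is ignored), σ is treated as a residual standard deviation, the loadings λ are shared across classes, and all numeric values are hypothetical. The function name `simulate_gmm` is made up for this illustration.

```python
import random

def simulate_gmm(n, class_probs, class_means, factor_sds, resid_sd, loadings, seed=0):
    """Simulate x_itk = Intercept_ik + lambda_t * Slope_ik + e_itk.

    class_means[k] = (mean intercept, mean slope) for class k
    factor_sds[k]  = (SD of intercept, SD of slope) within class k
    Assumption: intercept and slope are independent within class.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # Draw the latent class k for this individual.
        k = rng.choices(range(len(class_probs)), weights=class_probs)[0]
        mean_int, mean_slope = class_means[k]
        sd_int, sd_slope = factor_sds[k]
        # Individual growth factors vary around the class means.
        intercept = rng.gauss(mean_int, sd_int)
        slope = rng.gauss(mean_slope, sd_slope)
        # One observation per time point, with residual noise.
        x = [intercept + lam * slope + rng.gauss(0.0, resid_sd) for lam in loadings]
        data.append((k, x))
    return data

# Hypothetical two-class example: a low/flat class and a rising class.
data = simulate_gmm(
    n=200,
    class_probs=[0.6, 0.4],
    class_means=[(1.0, 0.2), (3.0, 2.5)],
    factor_sds=[(0.5, 0.1), (1.0, 0.5)],
    resid_sd=1.0,
    loadings=[0, 1, 2, 3, 4],
)
```

Each element of `data` pairs the (normally unobserved) class label with five repeated measures; a fitted GMM would have to recover the classes from the measures alone.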
12
LCGA VS. GMM
LCGA – Latent Class Growth Analysis
  Nagin, 1999; Nagin & Tremblay, 1999
Same as GMM except no residual variance on growth factors
  No individual variation within class (i.e., everyone in a class has the same trajectory)
LCGA is a special case of GMM
13
CLASS ENUMERATION
Determining the number of classes
Can’t use the LRT χ²
  Not distributed as χ² due to boundary conditions (McLachlan & Peel, 2000)
Information Criteria: AIC (Akaike, 1974), BIC (Schwarz, 1978)
  Penalize for number of parameters (AIC, BIC) and sample size (BIC)
  Prefer the model with the lowest value
Interpretation and usefulness
  Profile plot
  Substantive theory
  Predictive validity
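The information criteria above can be computed directly from a fitted model's log-likelihood. The sketch below uses the standard definitions AIC = -2 log L + 2p and BIC = -2 log L + p log(n); the log-likelihoods, parameter counts, and sample size are hypothetical numbers invented for illustration, not results from any analysis.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: penalizes number of parameters."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: penalty also grows with sample size."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical fits: number of classes -> (log-likelihood, number of parameters).
fits = {1: (-2150.0, 10), 2: (-2080.0, 13), 3: (-2074.0, 16)}
n_obs = 500  # assumed sample size

table = {k: (aic(ll, p), bic(ll, p, n_obs)) for k, (ll, p) in fits.items()}
best_by_aic = min(table, key=lambda k: table[k][0])
best_by_bic = min(table, key=lambda k: table[k][1])
```

With these made-up values the two criteria disagree (AIC picks three classes, BIC picks two), which is one reason interpretability and substantive theory also enter the decision.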
14
GLOBAL VS LOCAL MAXIMUM
[Figure: two plots of log likelihood (y-axis) against a parameter (x-axis), each marking a global and a local maximum]
15
OPENMX EXAMPLE
Take it away, Ryne!
16
SELECTION OF MIXTURE GENETIC ANALYSIS WRITINGS
Growth Mixture Model
  Wu et al., 2002; Kerner & Muthén, 2009; Gillespie et al. (submitted)
Latent Class Analysis
  Eaves, 1993; Muthén et al., 2006; Clark, 2010
Additional References
  McLachlan, Do, & Ambroise, 2004
17
OTHER LONGITUDINAL MIXTURE MODELS
Survival Mixture
  Multiple latent classes of individuals with different survival functions
  Kaplan, 2004; Masyn, 2003; Muthén & Masyn, 2005
Longitudinal Latent Class Analysis
  Models patterns of change over time, rather than functional growth form
  Lanza & Collins, 2006; Feldman et al., 2009
Latent Transition Analysis
  Models transition from one state to another over time
  Ex: Drinking alcohol or not over time
  Graham et al., 1991; Nylund et al., 2006
18
MULTILEVEL MODELS
19
WHAT IS MULTILEVEL DATA . . .
Most methods assume individuals are independent
  Responses for one individual do not influence another individual’s responses
Multilevel, or nested, data arise when individuals are not independent
  Ex: Twins in a family, students in a classroom
  Share common experiences
20
. . . AND WHY WE SHOULD CARE
When the nested structure is ignored, standard errors are underestimated
  Can lead to misinterpretation of the significance of model parameters
Large body of literature about how to handle nested data
  Today, focus on multilevel techniques
  General multilevel texts: Raudenbush & Bryk, 2002; Snijders & Bosker, 1999
21
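MULTILEVEL MODEL EQUATION
For individual i in cluster j:
Level One (Individual)
  $y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij}$
Level Two (Twin Pair/Family)
  $\beta_{0j} = \gamma_{00} + \gamma_{01} w_j + \mu_{0j}$
  $\beta_{1j} = \gamma_{10} + \gamma_{11} w_j + \mu_{1j}$
Where $\varepsilon_{ij} \sim N(0, \sigma)$, $\mu \sim N(0, \Psi)$, $\mathrm{Cov}(\varepsilon, \mu) = 0$
  $x_{ij}$ is an individual-level covariate (age, weight)
  $w_j$ is a cluster-level covariate (maternal smoking)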
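A short simulation sketch may help show how the two levels fit together. It makes assumptions beyond the slide: the random intercept and random slope are drawn independently (a diagonal Ψ), σ is treated as a standard deviation, w is a binary cluster covariate, x is drawn uniformly between 12 and 24, and all parameter values are hypothetical. The function name `simulate_two_level` is made up for this illustration.

```python
import random

def simulate_two_level(n_clusters, cluster_size, gammas, psi_sds, sigma, seed=0):
    """Simulate y_ij = b0_j + b1_j * x_ij + e_ij with cluster-specific b0_j, b1_j.

    gammas  = (g00, g01, g10, g11), the level-two fixed coefficients
    psi_sds = (SD of mu_0j, SD of mu_1j); assumed independent here
    """
    rng = random.Random(seed)
    g00, g01, g10, g11 = gammas
    rows = []
    for j in range(n_clusters):
        w_j = rng.choice([0, 1])             # cluster-level covariate
        mu0 = rng.gauss(0.0, psi_sds[0])     # random intercept deviation
        mu1 = rng.gauss(0.0, psi_sds[1])     # random slope deviation
        b0_j = g00 + g01 * w_j + mu0         # level-two equation for intercept
        b1_j = g10 + g11 * w_j + mu1         # level-two equation for slope
        for i in range(cluster_size):
            x_ij = rng.uniform(12.0, 24.0)   # individual-level covariate
            y_ij = b0_j + b1_j * x_ij + rng.gauss(0.0, sigma)  # level one
            rows.append((j, i, w_j, x_ij, y_ij))
    return rows

# Hypothetical example: 50 twin pairs (clusters of size 2).
rows = simulate_two_level(50, 2, gammas=(1.0, 2.0, 0.5, 0.1),
                          psi_sds=(1.0, 0.2), sigma=1.5)
```

Members of the same cluster share `b0_j` and `b1_j`, which is what makes their responses correlated.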
22
MULTILEVEL MODEL EQUATION EXTENSIONS
Can have additional levels
  Ex: Individuals within nuclear families with family
Can be longitudinal
  Ex: Observations within individuals within families
23
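MIXED EFFECTS VS. MULTILEVEL MODELING
They are the same thing!!
Multilevel Model Equation:
  Level One (L1): $y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij}$
  Level Two (L2): $\beta_{0j} = \gamma_{00} + \mu_{0j}$
                  $\beta_{1j} = \gamma_{10} + \mu_{1j}$
Mixed Model Equation:
  Plug L2 into L1, some rearranging
  $y_{ij} = (\gamma_{00} + \mu_{0j}) + (\gamma_{10} + \mu_{1j}) x_{ij} + \varepsilon_{ij}$
  $y_{ij} = \gamma_{00} + \gamma_{10} x_{ij} + \mu_{0j} + \mu_{1j} x_{ij} + \varepsilon_{ij}$
  Fixed Effects: $\gamma_{00} + \gamma_{10} x_{ij}$
  Random Effects: $\mu_{0j} + \mu_{1j} x_{ij}$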
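The substitution can be checked numerically. The sketch below evaluates the multilevel form and the mixed form for the same inputs; the function names and the numeric values are hypothetical, chosen only for illustration.

```python
def y_multilevel(x, g00, g10, mu0, mu1, e):
    """Two-step form: level-two equations first, then level one."""
    b0 = g00 + mu0          # cluster-specific intercept
    b1 = g10 + mu1          # cluster-specific slope
    return b0 + b1 * x + e

def y_mixed(x, g00, g10, mu0, mu1, e):
    """Single-equation form: fixed part plus random part plus residual."""
    fixed = g00 + g10 * x
    random_part = mu0 + mu1 * x
    return fixed + random_part + e

# Hypothetical values for one individual in one cluster.
args = dict(x=16.0, g00=2.0, g10=0.5, mu0=-1.0, mu1=0.25, e=0.3)
diff = abs(y_multilevel(**args) - y_mixed(**args))
```

`diff` is zero up to floating-point rounding, since the two forms are algebraically the same model.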
24
MULTILEVEL VS. MULTIVARIATE MODELING OF FAMILIES
Today we have dealt with multivariate analyses
Multivariate
  Model for all variables for each family member
  Family members can have different parameter values
    Ex: different growth trajectories for parents vs. children
  Only feasible when there is a small number of family members
    Ex: twins, spouses
[Path diagram: phenotypes PA and PB, each with latent A, C, and E components]
25
MULTILEVEL MODELING OF FAMILIES
Model for variation within individual and between family members
Members of a cluster are assumed statistically equivalent
  i.e., same model for each family member
Can handle various family structures
  Ex: Large pedigrees, families with differing numbers of siblings
Do not have to make an arbitrary assignment of family members (and check whether the assignment impacted estimates)
  Ex: Assigning twins to A and B
26
IMPLEMENTATION OF MULTILEVEL MODELS IN OPENMX
OpenMx Discussion
  http://openmx.psyc.virginia.edu/thread/125
Discuss more tomorrow in Dynamical Systems talk
27
MULTILEVEL GENETIC ARTICLES
General
  Discuss how to extend ACDE model to twins and larger family pedigrees
  Guo & Wang, 2002; McArdle & Prescott, 2005; Rabe-Hesketh, Skrondal, & Gjessing, 2008
Longitudinal
  McArdle, 2006
Other
  Inclusion of measured genotypes: Van den Oord, 2001
28
29
DATA CONSIDERATIONS
Multivariate – Wide
  Multiple family members per row of data
Multilevel – Long
  One individual per row of data

Wide:
FAMID  ZYG  ALC_T1  ALC_T2
1      1    20      10
2      6    15      6
3      4    0       5

Long:
FAMID  ZYG  ALC
1      1    20
1      1    10
2      6    15
2      6    6
3      4    0
3      4    5
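Converting between the two layouts is mechanical. The sketch below reshapes the wide rows shown on the slide into the long layout using plain Python; the helper name `wide_to_long` is made up for this illustration, and a data-frame library would normally do the same job.

```python
# Wide layout from the slide: one row per family, one column per twin.
wide = [
    {"FAMID": 1, "ZYG": 1, "ALC_T1": 20, "ALC_T2": 10},
    {"FAMID": 2, "ZYG": 6, "ALC_T1": 15, "ALC_T2": 6},
    {"FAMID": 3, "ZYG": 4, "ALC_T1": 0, "ALC_T2": 5},
]

def wide_to_long(rows, member_cols, value_name):
    """Emit one output row per family member, repeating family-level columns."""
    long_rows = []
    for row in rows:
        # Columns shared by all members of the family (e.g. FAMID, ZYG).
        shared = {k: v for k, v in row.items() if k not in member_cols}
        for col in member_cols:
            long_rows.append({**shared, value_name: row[col]})
    return long_rows

long_data = wide_to_long(wide, ["ALC_T1", "ALC_T2"], "ALC")
```

`long_data` has six rows, one per individual, matching the long layout on the slide.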