![Page 1: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/1.jpg)
Mixture Modeling
Chongming Yang
Research Support Center
FHSS College
![Page 2: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/2.jpg)
Mixture of Distributions
![Page 3: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/3.jpg)
Mixture of Distributions
![Page 4: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/4.jpg)
Classification Techniques
• Latent Class Analysis (categorical indicators)
• Latent Profile Analysis (continuous indicators)
• Finite Mixture Modeling (multivariate normal variables)
• …
![Page 5: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/5.jpg)
Integrate Classification Models into Other Models
• Mixture Factor Analysis
• Mixture Regressions
• Mixture Structural Equation Modeling
• Growth Mixture Modeling
• Multilevel Mixture Modeling
![Page 6: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/6.jpg)
Disadvantages of Multistep Practice
• Multistep practice
– Run classification model
– Save membership variable
– Model membership variable and other variables
• Disadvantages
– Biases in parameter estimates
– Biases in standard errors
  • Significance
  • Confidence intervals
![Page 7: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/7.jpg)
Latent Class Analysis (LCA)
• Setting
– Latent trait assumed to be categorical
– Trait measured with multiple categorical indicators
– Example: drug addiction, schizophrenia
• Aim
– Identify heterogeneous classes/groups
– Estimate class probabilities
– Identify good indicators of classes
– Relate covariates to classes
![Page 8: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/8.jpg)
Graphic LCA Model
• Categorical indicators u: u1, u2, u3, …, ur
• Categorical latent variable C: C = 1, 2, …, or K
![Page 9: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/9.jpg)
Probabilistic Model
• Assumption: conditional independence of the indicators u given C, so that their interdependence is explained by C, as in a factor analysis model
• An item probability
$$P(u_j = 1) = \sum_{k=1}^{K} P(c = k)\, P(u_j = 1 \mid c = k)$$
• Joint probability of all indicators
$$P(u_1, u_2, u_3, \ldots, u_r) = \sum_{k=1}^{K} P(c = k)\, P(u_1 \mid c = k)\, P(u_2 \mid c = k) \cdots P(u_r \mid c = k)$$
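The two probabilities on this slide can be sketched in a few lines of Python. This is only an illustration for binary indicators: the function names and the numeric values of `class_probs` and `item_probs` are made up for the example and do not come from the slides.

```python
from math import prod

def joint_probability(pattern, class_probs, item_probs):
    """P(u_1, ..., u_r) = sum_k P(c=k) * prod_j P(u_j | c=k) for binary items.

    pattern        : list of 0/1 responses, one per indicator
    class_probs[k] : P(c = k)
    item_probs[k][j] : P(u_j = 1 | c = k)
    """
    total = 0.0
    for pk, probs in zip(class_probs, item_probs):
        # Conditional independence: multiply the item probabilities within a class.
        within_class = prod(p if u == 1 else 1.0 - p for u, p in zip(pattern, probs))
        total += pk * within_class
    return total

def item_marginal(j, class_probs, item_probs):
    """P(u_j = 1) = sum_k P(c=k) * P(u_j = 1 | c=k)."""
    return sum(pk * probs[j] for pk, probs in zip(class_probs, item_probs))

# Hypothetical two-class model with three binary indicators.
class_probs = [0.6, 0.4]
item_probs = [[0.9, 0.8, 0.7], [0.2, 0.1, 0.3]]

p_all_yes = joint_probability([1, 1, 1], class_probs, item_probs)
p_item_1 = item_marginal(0, class_probs, item_probs)
```

Summing `joint_probability` over all possible response patterns gives 1, which is a quick sanity check on any set of parameters.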
![Page 10: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/10.jpg)
LCA Parameters
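• Class probabilities: number of classes - 1 free parameters
• Item probabilities: number of response categories - 1 free parameters per item in each class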
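A small sketch of how the parameter count adds up, assuming an unrestricted LCA in which every class has its own item probabilities. The function name and the example numbers are hypothetical.

```python
def lca_free_parameters(n_classes, categories_per_item):
    """Count free parameters of an unrestricted latent class model.

    n_classes           : number of latent classes K
    categories_per_item : number of response categories of each indicator
    """
    # Class probabilities sum to 1, so only K - 1 of them are free.
    class_params = n_classes - 1
    # Within each class, an item with m categories has m - 1 free probabilities.
    item_params = n_classes * sum(m - 1 for m in categories_per_item)
    return class_params + item_params

# Example: 3 classes, 4 binary indicators -> 2 + 3 * 4 = 14 parameters.
n_params = lca_free_parameters(3, [2, 2, 2, 2])
```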
![Page 11: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/11.jpg)
Class Means (Logit)
• Probability Scale
(logistic Regression without any Covariates x)
• Logit Scale
• Mean of the last (highest-numbered) class = 0
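The slide's equations did not survive extraction. The relationship between the two scales can be written as below, assuming the usual multinomial logit parameterization with the last class as the reference; the symbol $\alpha_{ck}$ for the class mean on the logit scale is an assumption, not taken from the slide.

```latex
% Probability scale: class probability as a function of the logit means
P(c = k) = \frac{e^{\alpha_{ck}}}{\sum_{j=1}^{K} e^{\alpha_{cj}}}

% Logit scale: each mean is a log odds relative to the last class
\alpha_{ck} = \log \frac{P(c = k)}{P(c = K)}, \qquad \alpha_{cK} = 0
```

For example, with two classes and $\alpha_{c1} = 0.405$, the class probabilities are about 0.6 and 0.4.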
![Page 12: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/12.jpg)
Latent Class Analysis with Covariates
• Covariates are related to Class Probability with multinomial logistic regression
$$P(c_{ik} = 1 \mid x_i) = \frac{e^{\alpha_{ck} + \gamma_{ck} x_i}}{\sum_{j=1}^{K} e^{\alpha_{cj} + \gamma_{cj} x_i}}$$
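A minimal sketch of the multinomial logistic step, assuming the last class is the reference class with intercept and slopes fixed at zero. The function name and all coefficient values are hypothetical.

```python
from math import exp

def class_probabilities(x, intercepts, slopes):
    """P(c = k | x) under a multinomial logistic regression.

    x          : list of covariate values for one person
    intercepts : one intercept per class (reference class set to 0)
    slopes     : one list of slopes per class (reference class set to 0)
    """
    logits = [a + sum(g * xi for g, xi in zip(gs, x))
              for a, gs in zip(intercepts, slopes)]
    # Subtract the largest logit before exponentiating for numerical stability.
    largest = max(logits)
    weights = [exp(l - largest) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical three-class model with one covariate; class 3 is the reference.
intercepts = [0.5, -0.2, 0.0]
slopes = [[1.0], [-0.5], [0.0]]
probs = class_probabilities([1.0], intercepts, slopes)
```

With all intercepts and slopes at zero, every class gets probability 1/K, which matches a model without covariate effects and equal class sizes.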
![Page 13: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/13.jpg)
Posterior Probability (membership/classification of cases)
$$P(c = k \mid u_1, u_2, \ldots, u_r) = \frac{P(c = k)\, P(u_1 \mid c = k)\, P(u_2 \mid c = k) \cdots P(u_r \mid c = k)}{P(u_1, u_2, \ldots, u_r)}$$
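The posterior probability is Bayes' rule applied to the class-specific likelihoods. A short sketch for binary indicators follows; the function name and parameter values are illustrative only.

```python
from math import prod

def posterior_class_probabilities(pattern, class_probs, item_probs):
    """P(c = k | u_1, ..., u_r) for one response pattern of binary items."""
    # Numerator for each class: P(c=k) * prod_j P(u_j | c=k).
    joint = [pk * prod(p if u == 1 else 1.0 - p for u, p in zip(pattern, probs))
             for pk, probs in zip(class_probs, item_probs)]
    # Denominator: P(u_1, ..., u_r), the sum of the numerators over classes.
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical two-class model with three binary indicators.
class_probs = [0.6, 0.4]
item_probs = [[0.9, 0.8, 0.7], [0.2, 0.1, 0.3]]

posterior = posterior_class_probabilities([1, 1, 1], class_probs, item_probs)
most_likely_class = posterior.index(max(posterior)) + 1  # classes numbered from 1
```

Assigning each case to the class with the highest posterior probability gives the membership variable used for classification.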
![Page 14: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/14.jpg)
Estimation
• Maximum likelihood estimation via the Expectation-Maximization (EM) algorithm
– E (expectation) step: compute average posterior probabilities for each class and item
– M (maximization) step: estimate class and item parameters
– Iterate the E and M steps to maximize the likelihood
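The EM steps above can be sketched for binary indicators as follows. This is a bare-bones illustration, not the algorithm as implemented in any particular package: it uses a single random start, a fixed number of iterations, and simulated data, and all names (`fit_lca`, `true_items`) are hypothetical.

```python
import random
from math import log, prod

def fit_lca(data, n_classes, n_iter=100, seed=0):
    """Fit a latent class model to 0/1 data with a plain EM loop."""
    rng = random.Random(seed)
    n, r = len(data), len(data[0])
    class_probs = [1.0 / n_classes] * n_classes
    item_probs = [[rng.uniform(0.25, 0.75) for _ in range(r)]
                  for _ in range(n_classes)]
    loglik_trace = []
    for _ in range(n_iter):
        # E step: posterior class probabilities for every person.
        posteriors, loglik = [], 0.0
        for row in data:
            joint = [pk * prod(p if u else 1.0 - p for u, p in zip(row, probs))
                     for pk, probs in zip(class_probs, item_probs)]
            total = sum(joint)
            loglik += log(total)
            posteriors.append([j / total for j in joint])
        loglik_trace.append(loglik)
        # M step: update class and item probabilities from posterior weights.
        for k in range(n_classes):
            weight = sum(post[k] for post in posteriors)
            class_probs[k] = weight / n
            for j in range(r):
                endorsed = sum(post[k] * row[j] for post, row in zip(posteriors, data))
                # Keep estimates strictly inside (0, 1) to avoid log(0).
                item_probs[k][j] = min(max(endorsed / weight, 1e-6), 1.0 - 1e-6)
    return class_probs, item_probs, loglik_trace

# Simulated data: two equally sized classes, four binary items.
rng = random.Random(1)
true_items = [[0.9, 0.9, 0.8, 0.8], [0.1, 0.2, 0.1, 0.2]]
data = []
for _ in range(300):
    k = 0 if rng.random() < 0.5 else 1
    data.append([1 if rng.random() < p else 0 for p in true_items[k]])

class_probs, item_probs, loglik_trace = fit_lca(data, n_classes=2)
```

In practice, mixture software runs many random starts because EM can stop at a local maximum; this sketch leaves that out for brevity.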
![Page 15: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/15.jpg)
Test against Data
• O = observed frequency of each response pattern
• E = model-estimated frequency of each response pattern
• Pearson chi-square
$$\chi^2 = \sum \frac{(o - e)^2}{e}$$
• Chi-square based on likelihood ratio
$$\chi^2_{LR} = 2 \sum o \log(o / e)$$
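Both statistics are sums over response patterns and are easy to compute once observed and model-estimated frequencies are available. In the sketch below the frequencies are made up for illustration, and patterns with an observed count of zero are skipped in the likelihood-ratio sum because they contribute nothing.

```python
from math import log

def fit_statistics(observed, expected):
    """Return (Pearson chi-square, likelihood-ratio chi-square)."""
    pearson = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # o * log(o / e) is taken as 0 when o == 0.
    lr = 2.0 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)
    return pearson, lr

# Hypothetical frequencies for four response patterns (both sum to 100).
observed = [40, 10, 15, 35]
expected = [38.0, 12.0, 12.0, 38.0]
pearson, lr = fit_statistics(observed, expected)
```

When many response patterns have small expected frequencies, the two statistics can disagree substantially, which is usually read as a sign that neither follows a chi-square distribution well.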
![Page 16: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/16.jpg)
Determine Number of Classes
• Substantive theory (parsimonious, interpretable)
• Predictive validity
• Auxiliary variables / covariates
• Statistical information and tests
– Bayesian Information Criterion (BIC)
– Entropy
– Testing K against K-1 classes
  • Vuong-Lo-Mendell-Rubin likelihood-ratio test
  • Bootstrapped likelihood ratio test
![Page 17: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/17.jpg)
Bayesian Information Criterion (BIC)
$$BIC = -2 \log(L) + h \ln(N)$$
L = likelihood
h = number of parameters
N = sample size
Choose the model with the smallest BIC
A BIC difference > 4 is appreciable
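A short sketch of comparing models by BIC. The log-likelihoods, parameter counts, and sample size below are hypothetical numbers chosen only to show the comparison.

```python
from math import log

def bic(loglik, n_params, n_obs):
    """BIC = -2 * log-likelihood + (number of parameters) * ln(sample size)."""
    return -2.0 * loglik + n_params * log(n_obs)

# Hypothetical 2-class and 3-class solutions fitted to N = 500 cases.
bic_2 = bic(loglik=-1200.0, n_params=9, n_obs=500)
bic_3 = bic(loglik=-1190.0, n_params=14, n_obs=500)
preferred = 2 if bic_2 < bic_3 else 3
```

Here the third class improves the log-likelihood by 10, but the penalty for five extra parameters is larger, so the 2-class model has the smaller BIC.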
![Page 18: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/18.jpg)
Quality of Classification
• Entropy
– Reflects the average of the highest class probability of individuals
– A value close to 1 indicates good classification
– No clear cutoff point for acceptance or rejection
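Two related summaries of classification quality can be computed from the posterior class probabilities. The relative entropy formula used below is the one commonly reported by mixture modeling software; it is an assumption here, since the slide's own formula did not survive extraction. The second function is the plain average of each individual's highest class probability.

```python
from math import log

def relative_entropy(posteriors):
    """1 - sum_i sum_k (-p_ik * ln p_ik) / (N * ln K); 1 means perfect separation."""
    n, k = len(posteriors), len(posteriors[0])
    uncertainty = -sum(p * log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - uncertainty / (n * log(k))

def mean_highest_probability(posteriors):
    """Average over individuals of the largest posterior class probability."""
    return sum(max(row) for row in posteriors) / len(posteriors)

# Hypothetical posterior probabilities for four individuals and two classes.
posteriors = [[0.95, 0.05], [0.10, 0.90], [0.60, 0.40], [0.99, 0.01]]
entropy = relative_entropy(posteriors)
avg_max = mean_highest_probability(posteriors)
```

Both values approach 1 when every individual is assigned to one class with near certainty; relative entropy drops to 0 when all posterior probabilities equal 1/K.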
![Page 19: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/19.jpg)
Testing K against K-1 Classes
• Bootstrapped likelihood ratio test
LRT = 2[logL(model 1) - logL(model 2)], where model 2 (K-1 classes) is nested in model 1 (K classes).
Bootstrap steps:
1. Estimate both models and compute the LRT
2. Use bootstrapped samples to obtain the distribution of the LRT
3. Compare the observed LRT with the bootstrap distribution to get the p value
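The final comparison step can be sketched as below, assuming the bootstrap LRT values have already been obtained by refitting both models to each bootstrap sample. The function names, the "+1" correction in the p value, and every number are assumptions made for illustration.

```python
def lrt(loglik_k, loglik_k_minus_1):
    """LRT = 2 * [logL(K classes) - logL(K-1 classes)]."""
    return 2.0 * (loglik_k - loglik_k_minus_1)

def bootstrap_p_value(observed_lrt, bootstrap_lrts):
    """Share of bootstrap LRT values at least as large as the observed one.

    Adding 1 to numerator and denominator keeps the p value above zero.
    """
    exceed = sum(1 for b in bootstrap_lrts if b >= observed_lrt)
    return (exceed + 1) / (len(bootstrap_lrts) + 1)

# Hypothetical values: observed log-likelihoods and four bootstrap LRTs.
observed = lrt(-1190.0, -1200.0)
p_value = bootstrap_p_value(observed, [1.0, 2.0, 3.0, 25.0])
```

A small p value suggests that the K-class model fits better than the K-1 class model. Real applications use far more than four bootstrap draws.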
![Page 20: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/20.jpg)
Testing K against K-1 Classes
• Vuong-Lo-Mendell-Rubin likelihood-ratio test
![Page 21: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/21.jpg)
Determine Quality of Indicators
• Good indicators
– Item response probability is close to 0 or 1 in each class
• Bad indicators
– Item response probability is high in more than one class, like cross-loading in factor analysis
– Item response probability is low in all classes, like low loading in factor analysis
![Page 22: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/22.jpg)
LCA Examples
• LCA
• LCA with covariates
• Class predicts a categorical outcome
![Page 23: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/23.jpg)
Save Membership Variable
Variable: idvar = id;
Output:
Savedata: File = cmmber.txt; Save = cprob;
![Page 24: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/24.jpg)
Latent Profile Analysis
• Covariances among the continuous variables are explained by class membership and fixed at zero within each class
• Variances of continuous variables are constrained to be equal across classes and minimized
• Mean differences are maximized across classes
![Page 25: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/25.jpg)
Finite Mixture Modeling(multivariate normal variables)
• Finite = finite number of subgroups/classes
• Variables are normally distributed in each class
• Means differ across classes
• Variances are the same across classes
• Covariances can differ across classes without restrictions or be constrained equal across classes
• Latent profile analysis can be a special case with covariances fixed at zero
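The special case mentioned in the last bullet can be sketched as a mixture density in which each class has its own means, the variances are shared across classes, and all within-class covariances are zero. The function name and parameter values are hypothetical.

```python
from math import exp, log, pi

def lpa_density(y, class_probs, means, variances):
    """Mixture density f(y) = sum_k P(c=k) * prod_j Normal(y_j; mean_kj, var_j).

    y         : observed values of the continuous indicators for one person
    means[k]  : class-specific means, one per indicator
    variances : variances shared by all classes, one per indicator
    """
    density = 0.0
    for pk, mu in zip(class_probs, means):
        # Zero covariances: the class density is a product of univariate normals.
        log_pdf = sum(-0.5 * (log(2.0 * pi * v) + (yj - m) ** 2 / v)
                      for yj, m, v in zip(y, mu, variances))
        density += pk * exp(log_pdf)
    return density

# Hypothetical two-class profile model with two continuous indicators.
class_probs = [0.7, 0.3]
means = [[0.0, 0.0], [3.0, 2.0]]
variances = [1.0, 1.0]
near_class_1 = lpa_density([0.1, -0.2], class_probs, means, variances)
between = lpa_density([1.5, 1.0], class_probs, means, variances)
```

Allowing nonzero within-class covariances would replace the product of univariate normals with a multivariate normal density per class, which is the general finite mixture model described on this slide.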
![Page 26: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/26.jpg)
Mixture Factor Analysis
• Allows one to examine measurement properties of items in heterogeneous subgroups / classes
• Measurement invariance is not required, because heterogeneity is assumed
• Factor structure can change
• See Mplus outputs
![Page 27: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/27.jpg)
Factor Mixture Analysis
• Parental Acceptance
– Feel people in your family understand you
– Feel you want to leave home
– Feel you and your family have fun together
– Feel that your family pay attention to you
– Feel your parents care about you
– Feel close to your mother
– Feel close to your father
• Parental Control
– Parents let you make your own decisions about the time you must be home on weekend nights
– Parents let you make your own decisions about the people you hang around with
– Parents let you make your own decisions about what you wear
– Parents let you make your own decisions about which television programs you watch
– Parents let you make your own decisions about what time you go to bed on week nights
– Parents let you make your own decisions about what you eat
![Page 28: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/28.jpg)
Two dimensions of Parenting
![Page 29: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/29.jpg)
Mixture SEM
• See mixture growth modeling
![Page 30: Mixture Modeling](https://reader035.vdocuments.us/reader035/viewer/2022062305/568160bf550346895dcfe4bd/html5/thumbnails/30.jpg)
Mixture Modeling with Known Classes
• Identify hidden classes within known groups
• Under nonrandomized experiments
– Impose equality constraints on covariates to identify similar classes from known groups
– Compare classes that differ in covariates