an introduction to loglinear modelingcyfs.unl.edu/cyfsprojects/videoppt/53fdf8e3fc67742fc50b... ·...

71
Introduction to Loglinear Models Ann M. Arthur

Upload: others

Post on 14-Jul-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Introduction to Loglinear Models

Ann M. Arthur

Page 2: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Overview

• Considerations• Parameters of contingency tables• Loglinear model

• Hypotheses to be tested• Interpretation of estimates• Model selection

• Useful parameterization for some categorical models

2

Page 3: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Considerations

• Categorical and discrete data• Poisson (count data)• Binomial (dichotomous data)• Multinomial (polytomous data)

• Research questions• All variables are categorical• Want to describe and understand associations between variables

3

Page 4: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Parameters

4

Page 5: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Categorical Data

• Frequencies or cell counts • Compute probabilities

• 𝑝𝑝𝑖𝑖 = 𝑓𝑓𝑖𝑖/𝑛𝑛• E.g., 𝑝𝑝𝑏𝑏𝑏𝑏𝑏𝑏𝑏𝑏 = 24

93= 0.258

5

Page 6: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Contingency Tables

• Assume two categorical variables, A with i=1,…,I categories and B with j=1,…,J categories

• Frequencies/cell counts can be arranged into an 𝐼𝐼 × 𝐽𝐽contingency table

6

Page 7: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

2 x 2 Contingency Table• Data from Sewell and Shah

(1968) on 10,319 Wisconsin high school seniors

• See also Fienberg (1977)

• Fundamental parameters• Probabilities• Odds• Odds Ratios

7

Page 8: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Probabilities• Joint probabilities

• Describe co-occurrence• 𝑝𝑝𝑖𝑖𝑖𝑖 = 𝑓𝑓𝑖𝑖𝑖𝑖

𝑛𝑛• 𝑝𝑝𝐿𝐿𝐿𝐿𝐿𝐿,𝑁𝑁𝐿𝐿 = 4653

10319= 0.45

• Marginal probability• 𝑝𝑝𝑖𝑖+ = 𝑓𝑓𝑖𝑖+

𝑛𝑛, also

𝑓𝑓+𝑖𝑖𝑛𝑛

• 𝑝𝑝𝐿𝐿𝐿𝐿𝐿𝐿 = 496510319

= 0.48

• 𝑝𝑝𝑌𝑌𝑏𝑏𝑌𝑌 = 337610319

= 0.33

• Conditional probability• Implies causal structure (DV|IV)• 𝑝𝑝𝑖𝑖|𝑖𝑖 =

𝑝𝑝𝑖𝑖𝑖𝑖𝑝𝑝𝑖𝑖+

• 𝑝𝑝𝑁𝑁𝐿𝐿|𝐿𝐿𝐿𝐿𝐿𝐿 = 0.450.48

= 0.94

8

Page 9: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Probabilities• Joint probabilities

• Describe co-occurrence• 𝑝𝑝𝑖𝑖𝑖𝑖 = 𝑓𝑓𝑖𝑖𝑖𝑖

𝑛𝑛• 𝑝𝑝𝐿𝐿𝐿𝐿𝐿𝐿,𝑁𝑁𝐿𝐿 = 4653

10319= 0.45

• Marginal probability• 𝑝𝑝𝐿𝐿𝐿𝐿𝐿𝐿 = 4965

10319= 0.48

• 𝑝𝑝𝑌𝑌𝑏𝑏𝑌𝑌 = 337610319

= 0.33• Conditional probability

• Implies causal structure (DV|IV)• 𝑝𝑝𝑖𝑖|𝑖𝑖 = 𝑝𝑝𝑖𝑖𝑖𝑖

𝑝𝑝𝑖𝑖+

• 𝑝𝑝𝑁𝑁𝐿𝐿|𝐿𝐿𝐿𝐿𝐿𝐿 = 0.450.48

= 0.94

9

Page 10: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Probabilities (2)• Joint probabilities

• Describe co-occurrence• 𝑝𝑝𝑖𝑖𝑖𝑖 = 𝑓𝑓𝑖𝑖𝑖𝑖

𝑛𝑛• 𝑝𝑝𝐿𝐿𝐿𝐿𝐿𝐿,𝑁𝑁𝐿𝐿 = 4653

10319= 0.45

• Marginal probability• Marginal probability

• 𝑝𝑝𝑖𝑖+ = 𝑓𝑓𝑖𝑖+𝑛𝑛

, also 𝑓𝑓+𝑖𝑖𝑛𝑛

• 𝑝𝑝𝐿𝐿𝐿𝐿𝐿𝐿 = 496510319

= 0.48

• 𝑝𝑝𝑌𝑌𝑏𝑏𝑌𝑌 = 337610319

= 0.33

• Conditional probability• Implies causal structure (DV|IV)• 𝑝𝑝𝑖𝑖|𝑖𝑖 = 𝑝𝑝𝑖𝑖𝑖𝑖

𝑝𝑝𝑖𝑖+

• 𝑝𝑝𝑁𝑁𝐿𝐿|𝐿𝐿𝐿𝐿𝐿𝐿 = 0.450.48

= 0.94

10

Page 11: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Probabilities (3)• Marginal probabilities

• 𝑝𝑝𝑖𝑖+ = 𝑓𝑓𝑖𝑖+𝑛𝑛

, also 𝑓𝑓+𝑖𝑖𝑛𝑛

• 𝑝𝑝𝐿𝐿𝐿𝐿𝐿𝐿 = 496510319

= 0.48

• Conditional probability• Implies causal structure (DV|IV)• 𝑝𝑝𝑖𝑖|𝑖𝑖 = 𝑝𝑝𝑖𝑖𝑖𝑖

𝑝𝑝𝑖𝑖+

• 𝑝𝑝𝑁𝑁𝐿𝐿|𝐿𝐿𝐿𝐿𝐿𝐿 = 0.450.48

= 0.94

11

Page 12: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Probabilities (3)• Marginal probabilities

• 𝑝𝑝𝑖𝑖+ = 𝑓𝑓𝑖𝑖+𝑛𝑛

, also 𝑓𝑓+𝑖𝑖𝑛𝑛

• 𝑝𝑝𝐿𝐿𝐿𝐿𝐿𝐿 = 496510319

= 0.48

• Conditional probability• structure (DV|IV)• 𝑝𝑝𝑖𝑖|𝑖𝑖 = 𝑝𝑝𝑖𝑖𝑖𝑖

𝑝𝑝𝑖𝑖+

• 𝑝𝑝𝑁𝑁𝐿𝐿|𝐿𝐿𝐿𝐿𝐿𝐿 = 0.450.48

= 0.94

12

Page 13: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Probabilities (4)• Marginal probabilities

• 𝑝𝑝𝑖𝑖+ = 𝑓𝑓𝑖𝑖+𝑛𝑛

, also 𝑓𝑓+𝑖𝑖𝑛𝑛

• 𝑝𝑝𝐿𝐿𝐿𝐿𝐿𝐿 = 496510319

= 0.48

• Conditional probability• (DV|IV)• 𝑝𝑝𝑖𝑖|𝑖𝑖 = 𝑝𝑝𝑖𝑖𝑖𝑖

𝑝𝑝𝑖𝑖+

• 𝑝𝑝𝑁𝑁𝐿𝐿|𝐿𝐿𝐿𝐿𝐿𝐿 = 0.450.48

= 0.94

13

Page 14: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Probabilities (5)• Conditional probabilities

• Implies causal structure (DV|IV)• 𝑝𝑝𝑖𝑖|𝑖𝑖 = 𝑝𝑝𝑖𝑖𝑖𝑖

𝑝𝑝𝑖𝑖+• E.g., What is the probability that they

are not planning to attend college, given low parental encouragement?

• 𝑝𝑝𝑁𝑁𝐿𝐿|𝐿𝐿𝐿𝐿𝐿𝐿 = 0.450.48

= 0.94

14

Page 15: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Probabilities (5)• Conditional probabilities

• Implies causal structure (DV|IV)• 𝑝𝑝𝑖𝑖|𝑖𝑖 = 𝑝𝑝𝑖𝑖𝑖𝑖

𝑝𝑝𝑖𝑖+• E.g., Given low parental

encouragement, what is the probability that they do not plan to attend college?

• 𝑝𝑝𝑁𝑁𝐿𝐿|𝐿𝐿𝐿𝐿𝐿𝐿 = 0.450.48

= 0.94

15

Page 16: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Odds• 𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜 = 𝑝𝑝/(1 − 𝑝𝑝)

• E.g., Odds of seniors not planning to attend college relative to those planning to attend

• Ω1+ = 𝑝𝑝1+/𝑝𝑝2+ = 69433376

= 2.05• Seniors are twice as likely to not plan

to attend college, compared to those planning to attend

• E.g., Odds of seniors planning to attend college compared to those planning to not attend

• 𝛺𝛺2+ = 33766943

= 0.48• Seniors are half as likely to plan to

attend college, compared to those that are not planning to attend

16

Page 17: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Odds• 𝑜𝑜𝑜𝑜𝑜𝑜𝑜𝑜 = 𝑝𝑝/(1 − 𝑝𝑝)

• E.g., Odds of seniors not planning to attend college relative to those planning to attend

• Ω1+ = 𝑝𝑝1+/𝑝𝑝2+ = 69433376

= 2.05• Seniors are twice as likely to not plan

to attend college, compared to those planning to attend

• E.g., Odds of seniors planning to attend college compared to those planning to not attend

• 𝛺𝛺2+ = 33766943

= 0.48• Seniors are half as likely to plan to

attend college, compared to those that are not planning to attend

17

Page 18: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Odds Ratio• Odds ratios

• Compares two odds• 𝜃𝜃 = 𝐿𝐿𝑜𝑜𝑜𝑜𝑌𝑌1

𝐿𝐿𝑜𝑜𝑜𝑜𝑌𝑌2= 𝑝𝑝1/(1−𝑝𝑝1)

𝑝𝑝2/(1−𝑝𝑝2)

• Ratio of cross-products• 𝜃𝜃11 = (𝑝𝑝11/𝑝𝑝12)

(𝑝𝑝21/𝑝𝑝22)= 𝑝𝑝11𝑝𝑝22

𝑝𝑝12𝑝𝑝21= 𝑓𝑓11𝑓𝑓22

𝑓𝑓12𝑓𝑓21

• 𝜃𝜃11 = (4653/312)(2290/3064)

= 4653∗30642290∗312

= 19.95

• Students with low parental encouragement have estimated odds of planning to not attend college that are 20 times the estimated odds of someone with high encouragement

18

Page 19: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Odds Ratio (2)• Local odds ratios

• Adjacent cells• Local odds ratios perfectly define all

associations within the table!

19

Page 20: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Independence

• No association between two variables• Equal odds ratios• Joint probability is a product of the marginals

• Not significantly different from expected values

• Can you collapse the table across a dimension?• To do this, the variable must have no significant interaction with the other

variable

20

Page 21: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Multiway Tables

• Simpson’s Paradox• When marginal tables leads to

highly misleading inference• Specification problem

• Correct functional form • All necessary variables• No unnecessary variables

21

Page 22: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Independence (2)

• Tests of independence• Pearson chi-square statistic, 𝑋𝑋2

• Likelihood ratio chi-square statistic, 𝐺𝐺2

• Degrees of freedom• (I x J) - # of estimated

parameters

22

Page 23: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Independence (2)

data college;input encouragement$ attend $ count @@;datalines;low no 4653 low yes 312 high no 2290 high yes 3064;

proc freq data=college order=data; weight count;tables encouragement*attend/chisq expected;

run;

23

Page 24: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Test of Independence (3)

• H0: Independence • How plausible is it that the local odds

ratio is 1?

• Larger values are the result of greater differences between expected and observed values

• Reject H0

24

Page 25: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Loglinear Models

25

Page 26: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

The Loglinear Model

• A type of generalized linear models (GLM), the family of models that extend ordinary least squares regression to non-normal distributions

• Models to describe the joint distributions• The dependent variable is a cell size (no distinction between dependent and

independent variables)• Used to analyze cell counts in a more formal and complete manner

(Hagenaars, 1993)

• The canonical link is the log link

26

Page 27: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Fundamental Parameters• Odds and odds ratios

• Range of 0,∞• Not symmetrical around 0• Value of 1 indicates equal odds and independence

• Logits• 𝐿𝐿𝑜𝑜𝐿𝐿(𝜃𝜃)• Range of −∞,∞• Symmetric around 0• Value of 0 indicates equal odds and no difference

27

Page 28: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Multiway Tables

• Higher-order odds ratio• 𝜃𝜃111 = 𝑓𝑓111𝑓𝑓221

𝑓𝑓121𝑓𝑓212/𝑓𝑓112𝑓𝑓222𝑓𝑓122𝑓𝑓212

• Partial odds ratio • Average conditional odds

• Can’t add and divide odds ratios• Geometric mean (multiply and take nth root)

• 𝜃𝜃11𝑝𝑝 = (∏𝑘𝑘𝐾𝐾 𝜃𝜃11𝑘𝑘)1/𝐾𝐾= 𝐾𝐾 𝜃𝜃111𝜃𝜃112 …𝜃𝜃11𝐾𝐾

28

Page 29: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

The Independence Model

• In probability form• Joint probabilities can determined by the marginals• 𝑝𝑝𝑖𝑖𝑖𝑖 = 𝑝𝑝𝑖𝑖+𝑝𝑝+𝑖𝑖

• In expected frequency form• 𝜇𝜇𝑖𝑖𝑖𝑖 = 𝑛𝑛𝑝𝑝𝑖𝑖+𝑝𝑝+𝑖𝑖• This form is multiplicative

• Take the natural log of expected frequencies• Yields the loglinear model• Additive

29

Page 30: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Multiplicative and Additive Models

• Taking the natural log yields the loglinear model of independence • 𝜇𝜇𝑖𝑖𝑖𝑖 = 𝑛𝑛𝑝𝑝𝑖𝑖+𝑝𝑝+𝑖𝑖 (Multiplicative, expected frequencies)• log(𝜇𝜇𝑖𝑖𝑖𝑖) = log 𝑛𝑛 + log 𝑝𝑝𝑖𝑖+ + log(𝑝𝑝+𝑖𝑖) (Additive)• log(𝜇𝜇𝑖𝑖𝑖𝑖) = 𝜇𝜇 + 𝜆𝜆𝑖𝑖𝐴𝐴 + 𝜆𝜆𝑖𝑖𝐵𝐵 (Additive, loglinear notation)

• Where 𝐴𝐴 and 𝐵𝐵 denote parental encouragement and college plans, respectively

30

Page 31: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Analogous to ANOVA

• log(𝜇𝜇𝑖𝑖𝑖𝑖) = 𝜇𝜇 + 𝜆𝜆𝑖𝑖𝐴𝐴 + 𝜆𝜆𝑖𝑖𝐵𝐵𝜇𝜇 is the average cell size, or “grand mean”𝜆𝜆𝑖𝑖𝐴𝐴 is the row effect for variable A, or deviation from the average cell size due to level 𝑖𝑖𝜆𝜆𝑖𝑖𝐵𝐵 is the column effect of variable B, or deviation from the average cell size due to level 𝑗𝑗

• The equation for the expected values• Sparseness

• Can’t take the natural log of 0• If there are cell counts of 0, they need to be adjusted

• Create a new count variable with a very small amount added (e.g., 0.0001)

31

Page 32: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Loglinear Model of Independence

• The loglinear model of independence for three variables is:log(𝑓𝑓𝑖𝑖𝑖𝑖𝑘𝑘𝐴𝐴𝐵𝐵𝐴𝐴) = 𝜇𝜇 + 𝜆𝜆𝑖𝑖𝐴𝐴 + 𝜆𝜆𝑖𝑖𝐵𝐵 + 𝜆𝜆𝑘𝑘𝐴𝐴

• This model omits all higher-order terms• Assumes there are no interactions between variables (e.g., 𝜆𝜆𝑖𝑖𝑖𝑖𝐴𝐴𝐵𝐵 = 0)

32

Page 33: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Estimation

• SAS• PROC GENMOD• PROC CATMOD

• Lem• R

• loglin()• glm()

33

Page 34: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Example – Independence Loglinear Model

data college;

input sex $ encouragement $ attend $ count;

datalines;

male low no 1949

male low yes 136

male high no 1203

male high yes 1703

female low no 2704

female low yes 176

female high no 1087

female high yes 1361

;34

Page 35: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Example – Independence Loglinear Model

proc genmod data=college;

class sex encouragement attend;

model count = sex encouragement attend /

dist=poi link=log lrci type3 obstats;

run;

35

Page 36: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Output – Independence Loglinear Model• H0: Independence holds

• Overall fit• Large 𝑋𝑋2 and 𝐺𝐺2

36

Page 37: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Output – Independence Loglinear Model

𝜆𝜆1𝐴𝐴

𝜆𝜆1𝐵𝐵

𝜆𝜆1𝐴𝐴

37

𝜇𝜇

𝜆𝜆2𝐴𝐴

𝜆𝜆2𝐵𝐵

𝜆𝜆2𝐴𝐴

• Convert estimates back into cell counts (dummy-coding approach)• Males with low encouragement not planning to attend college• 𝜇𝜇111 = exp 𝜆𝜆 + 𝜆𝜆2𝐴𝐴 + 𝜆𝜆2𝐵𝐵 + 𝜆𝜆1𝐴𝐴 = exp 6.67 + 0 + 0 + 0.72 = 1615.662

Page 38: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Output – Independence Loglinear Model

38

Page 39: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

The Saturated Loglinear Model

• Models all possible associations between cell counts

log(𝑓𝑓𝑖𝑖𝑖𝑖𝑘𝑘𝐴𝐴𝐵𝐵𝐴𝐴) = 𝜇𝜇 + 𝜆𝜆𝑖𝑖𝐴𝐴 + 𝜆𝜆𝑖𝑖𝐵𝐵 + 𝜆𝜆𝑘𝑘𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝐴𝐴𝐵𝐵 + 𝜆𝜆𝑖𝑖𝑘𝑘𝐴𝐴𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑘𝑘𝐵𝐵𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝑘𝑘𝐴𝐴𝐵𝐵𝐴𝐴

𝜇𝜇 is the average cell size 𝜆𝜆𝑖𝑖𝐴𝐴, 𝜆𝜆𝑖𝑖𝐵𝐵 , and 𝜆𝜆𝑘𝑘𝐴𝐴 are the main effects of variables A, B, and C𝜆𝜆𝑖𝑖𝑖𝑖𝐴𝐴𝐵𝐵 , 𝜆𝜆𝑖𝑖𝑘𝑘𝐴𝐴𝐴𝐴 , 𝜆𝜆𝑖𝑖𝑘𝑘𝐵𝐵𝐴𝐴 , and 𝜆𝜆𝑖𝑖𝑖𝑖𝑘𝑘𝐴𝐴𝐵𝐵𝐴𝐴 are higher-order terms

39

Page 40: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

The Saturated Loglinear Model

• “Saturated” means the number of cells is equal to the number of parameters estimated

• Just-identified model

• This model is often not of interest • No degrees of freedom available to test hypotheses • Does not simplify interpretation of the data

40

Page 41: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Example – Saturated Loglinear Model

proc genmod data=college;

class sex encouragement attend;

model count = sex|encouragement|attend /

dist=poi link=log lrci type3 obstats;

run;

41

Page 42: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Example – Saturated Loglinear Model• No degrees of freedom to test

model fit

42

Page 43: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Example – Saturated Loglinear Model• All parameters estimated• Some non-significant interactions

43

Page 44: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Example – Saturated Loglinear Model

• Predicted values perfectly represent the observed data

44

Page 45: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Reduced Loglinear Models

• Do you need higher-order terms, or can they be eliminated?• Reduced models with good fit greatly simplifies the interpretation• Parsimony

• Possible models• Model of all possible associations• Models with main effects and two-ways interactions• Models with main effects only• Model of independence

45

Page 46: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Reduced Loglinear Models

• A three-variable model that permits some two-way associations log(𝑓𝑓𝑖𝑖𝑖𝑖𝑘𝑘𝐴𝐴𝐵𝐵𝐴𝐴) = 𝜇𝜇 + 𝜆𝜆𝑖𝑖𝐴𝐴 + 𝜆𝜆𝑖𝑖𝐵𝐵 + 𝜆𝜆𝑘𝑘𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝐴𝐴𝐵𝐵 + 𝜆𝜆𝑖𝑖𝑘𝑘𝐵𝐵𝐴𝐴

• Two factor terms describe conditional odds ratios• 𝜆𝜆𝑖𝑖𝑖𝑖𝐴𝐴𝐵𝐵 association between A and B, controlling for C• 𝜆𝜆𝑖𝑖𝑘𝑘𝐵𝐵𝐴𝐴 association between B and C, controlling for A

• This model is referred to as (AB, BC)

46

Page 47: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Reduced Loglinear Models

• Test whether there is conditional independence within the multiway table

• Compare the fit of various reduced loglinear models to the saturated model

47

Page 48: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Example – Reduced Loglinear Models

48

Page 49: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Example – Reduced Loglinear Models

49

Page 50: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Example – Reduced Loglinear Models

50

Page 51: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Example – Reduced Loglinear Models

• Cell counts – how do they compared to the saturated model?• Model (SE, SA, EA) comes the closest to the observed data

51

Page 52: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Model Selection

52

Page 53: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Loglinear Models

• Hypotheses to be tested• Independence • Reduced models

• Interpretation• For dummy-coding approach, ANOVA-style decomposition• Convert estimates into expected cell counts

• Males with low encouragement not planning to attend college• 𝜇𝜇111 = exp 𝜆𝜆 + 𝜆𝜆2𝐴𝐴 + 𝜆𝜆2𝐵𝐵 + 𝜆𝜆1𝐴𝐴 = exp 6.67 + 0 + 0 + 0.72 = 1615.662

• Model selection• Retain the model that fits well and represents the observed data well

53

Page 54: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Loglinear Parameterization of Common Categorical Models

54

Page 55: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

The Logistic Model

• Special case of the generalized linear model • Regresses a binary dependent variable on 1+ independent variables• Models the log of the odds of the dependent variable• The canonical link function is the logit• Does not describe relationships among independent variables

• When one variable is binary, the logistic models for that response are equal to certain loglinear models

• Construct logits for one variable to help interpret loglinear models (Bishop, 1969)

55

Page 56: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Using Logistic Models to Interpret

• Two types of coding yield identical estimates• Dummy coding (0, 1)• Effect coding (-1, 1)

• Identifying constraints and power rules• ∑𝜆𝜆𝑖𝑖𝐴𝐴 = 0• Lambdas sum to zero in effect coding approach• When you change an odd number, change the sign

• 𝜆𝜆1𝐴𝐴 = −𝜆𝜆2𝐴𝐴

• When you change an even number, same sign• 𝜆𝜆11𝐴𝐴𝐵𝐵 = −𝜆𝜆12𝐴𝐴𝐵𝐵= −𝜆𝜆21𝐴𝐴𝐵𝐵= 𝜆𝜆22𝐴𝐴𝐵𝐵

56

Page 57: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Using Logistic Models to Interpret

• Odds ratios relate to two-factor loglinear parameters and main effects• The log odds ratio for the effect of A on C

Logit Loglinear𝛽𝛽1𝐴𝐴 − 𝛽𝛽2𝐴𝐴 𝜆𝜆11𝐴𝐴𝐴𝐴 + 𝜆𝜆22𝐴𝐴𝐴𝐴 − 𝜆𝜆12𝐴𝐴𝐴𝐴 − 𝜆𝜆21𝐴𝐴𝐴𝐴

1) Specify a logit for one variable2) Substitute the loglinear parameterization for the odds3) Use power rules to substitute and solve

57

Page 58: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Using Logistic Models to Interpret

1) Form a logit for the loglinear model• log 𝜇𝜇𝑖𝑖𝑖𝑖𝑘𝑘 = 𝜇𝜇 + 𝜆𝜆𝑖𝑖𝑆𝑆 + 𝜆𝜆𝑖𝑖𝐸𝐸 + 𝜆𝜆𝑘𝑘𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝑆𝑆𝐸𝐸 + 𝜆𝜆𝑖𝑖𝑘𝑘𝑆𝑆𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑘𝑘𝐸𝐸𝐴𝐴

• Suppose A is the dependent variable and E and A are explanatory variables

• 𝑙𝑙𝑜𝑜𝐿𝐿𝑖𝑖𝑙𝑙[𝑃𝑃 𝐴𝐴 = 1 = 𝑙𝑙𝑜𝑜𝐿𝐿 𝑃𝑃 𝐴𝐴=11−𝑃𝑃(𝐴𝐴=1

= log 𝑃𝑃 𝐴𝐴=1|𝑆𝑆=𝑖𝑖,𝐸𝐸=𝑖𝑖𝑃𝑃 𝐴𝐴=2|𝑆𝑆=𝑖𝑖,𝐸𝐸=𝑖𝑖

= 𝑙𝑙𝑜𝑜𝐿𝐿𝑓𝑓𝑖𝑖𝑖𝑖1𝑓𝑓𝑖𝑖𝑖𝑖2

= log 𝑓𝑓𝑖𝑖𝑖𝑖1 − log 𝑓𝑓𝑖𝑖𝑖𝑖2

2) Substitute the loglinear parameterization for the odds= 𝜇𝜇 + 𝜆𝜆𝑖𝑖𝑆𝑆 + 𝜆𝜆𝑖𝑖𝐸𝐸 + 𝜆𝜆1𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝑆𝑆𝐸𝐸 + 𝜆𝜆𝑖𝑖1𝑆𝑆𝐴𝐴 + 𝜆𝜆𝑖𝑖1𝐸𝐸𝐴𝐴

−(𝜇𝜇 + 𝜆𝜆𝑖𝑖𝑆𝑆 + 𝜆𝜆𝑖𝑖𝐸𝐸 + 𝜆𝜆2𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝑆𝑆𝐸𝐸 + 𝜆𝜆𝑖𝑖2𝑆𝑆𝐴𝐴 + 𝜆𝜆𝑖𝑖2𝐸𝐸𝐴𝐴)= (𝜆𝜆1𝐴𝐴−𝜆𝜆2𝐴𝐴) + (𝜆𝜆𝑖𝑖1𝑆𝑆𝐴𝐴 − 𝜆𝜆𝑖𝑖2𝑆𝑆𝐴𝐴) + (𝜆𝜆𝑖𝑖1𝐸𝐸𝐴𝐴−𝜆𝜆𝑖𝑖2𝐸𝐸𝐴𝐴)

58

Page 59: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Using Logistic Models to Interpret

1) Form a logit for the loglinear model• log 𝜇𝜇𝑖𝑖𝑖𝑖𝑘𝑘 = 𝜇𝜇 + 𝜆𝜆𝑖𝑖𝑆𝑆 + 𝜆𝜆𝑖𝑖𝐸𝐸 + 𝜆𝜆𝑘𝑘𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝑆𝑆𝐸𝐸 + 𝜆𝜆𝑖𝑖𝑘𝑘𝑆𝑆𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑘𝑘𝐸𝐸𝐴𝐴

• Suppose A is the dependent variable and S and E are explanatory variables

• 𝑙𝑙𝑜𝑜𝐿𝐿𝑖𝑖𝑙𝑙[𝑃𝑃 𝐴𝐴 = 1 = 𝑙𝑙𝑜𝑜𝐿𝐿 𝑃𝑃 𝐴𝐴=11−𝑃𝑃(𝐴𝐴=1

= log 𝑃𝑃 𝐴𝐴=1|𝑆𝑆=𝑖𝑖,𝐸𝐸=𝑖𝑖𝑃𝑃 𝐴𝐴=2|𝑆𝑆=𝑖𝑖,𝐸𝐸=𝑖𝑖

= 𝑙𝑙𝑜𝑜𝐿𝐿𝑓𝑓𝑖𝑖𝑖𝑖1𝑓𝑓𝑖𝑖𝑖𝑖2

= log 𝑓𝑓𝑖𝑖𝑖𝑖1 − log 𝑓𝑓𝑖𝑖𝑖𝑖2

2) Substitute the loglinear parameterization for the odds= 𝜇𝜇 + 𝜆𝜆𝑖𝑖𝑆𝑆 + 𝜆𝜆𝑖𝑖𝐸𝐸 + 𝜆𝜆1𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝑆𝑆𝐸𝐸 + 𝜆𝜆𝑖𝑖1𝑆𝑆𝐴𝐴 + 𝜆𝜆𝑖𝑖1𝐸𝐸𝐴𝐴

−(𝜇𝜇 + 𝜆𝜆𝑖𝑖𝑆𝑆 + 𝜆𝜆𝑖𝑖𝐸𝐸 + 𝜆𝜆2𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝑆𝑆𝐸𝐸 + 𝜆𝜆𝑖𝑖2𝑆𝑆𝐴𝐴 + 𝜆𝜆𝑖𝑖2𝐸𝐸𝐴𝐴)= (𝜆𝜆1𝐴𝐴−𝜆𝜆2𝐴𝐴) + (𝜆𝜆𝑖𝑖1𝑆𝑆𝐴𝐴 − 𝜆𝜆𝑖𝑖2𝑆𝑆𝐴𝐴) + (𝜆𝜆𝑖𝑖1𝐸𝐸𝐴𝐴−𝜆𝜆𝑖𝑖2𝐸𝐸𝐴𝐴)

59

Page 60: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Using Logistic Models to Interpret

1) Form a logit for the loglinear model• log 𝜇𝜇𝑖𝑖𝑖𝑖𝑘𝑘 = 𝜇𝜇 + 𝜆𝜆𝑖𝑖𝑆𝑆 + 𝜆𝜆𝑖𝑖𝐸𝐸 + 𝜆𝜆𝑘𝑘𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝑆𝑆𝐸𝐸 + 𝜆𝜆𝑖𝑖𝑘𝑘𝑆𝑆𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑘𝑘𝐸𝐸𝐴𝐴

• Suppose A is the dependent variable and E and A are explanatory variables

• 𝑙𝑙𝑜𝑜𝐿𝐿𝑖𝑖𝑙𝑙[𝑃𝑃 𝐴𝐴 = 1 = 𝑙𝑙𝑜𝑜𝐿𝐿 𝑃𝑃 𝐴𝐴=11−𝑃𝑃(𝐴𝐴=1

= log 𝑃𝑃 𝐴𝐴=1|𝑆𝑆=𝑖𝑖,𝐸𝐸=𝑖𝑖𝑃𝑃 𝐴𝐴=2|𝑆𝑆=𝑖𝑖,𝐸𝐸=𝑖𝑖

= 𝑙𝑙𝑜𝑜𝐿𝐿𝑓𝑓𝑖𝑖𝑖𝑖1𝑓𝑓𝑖𝑖𝑖𝑖2

= log 𝑓𝑓𝑖𝑖𝑖𝑖1 − log 𝑓𝑓𝑖𝑖𝑖𝑖2

2) Substitute the loglinear parameterization for the odds= 𝜇𝜇 + 𝜆𝜆𝑖𝑖𝑆𝑆 + 𝜆𝜆𝑖𝑖𝐸𝐸 + 𝜆𝜆1𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝑆𝑆𝐸𝐸 + 𝜆𝜆𝑖𝑖1𝑆𝑆𝐴𝐴 + 𝜆𝜆𝑖𝑖1𝐸𝐸𝐴𝐴

−(𝜇𝜇 + 𝜆𝜆𝑖𝑖𝑆𝑆 + 𝜆𝜆𝑖𝑖𝐸𝐸 + 𝜆𝜆2𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑖𝑖𝑆𝑆𝐸𝐸 + 𝜆𝜆𝑖𝑖2𝑆𝑆𝐴𝐴 + 𝜆𝜆𝑖𝑖2𝐸𝐸𝐴𝐴)= (𝜆𝜆1𝐴𝐴−𝜆𝜆2𝐴𝐴) + (𝜆𝜆𝑖𝑖1𝑆𝑆𝐴𝐴 − 𝜆𝜆𝑖𝑖2𝑆𝑆𝐴𝐴) + (𝜆𝜆𝑖𝑖1𝐸𝐸𝐴𝐴−𝜆𝜆𝑖𝑖2𝐸𝐸𝐴𝐴)

60

Page 61: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Using Logistic Models to Interpret

= (𝜆𝜆1𝐴𝐴−𝜆𝜆2𝐴𝐴) + (𝜆𝜆𝑖𝑖1𝑆𝑆𝐴𝐴 − 𝜆𝜆𝑖𝑖2𝑆𝑆𝐴𝐴) + (𝜆𝜆𝑖𝑖1𝐸𝐸𝐴𝐴−𝜆𝜆𝑖𝑖2𝐸𝐸𝐴𝐴)

61

Page 62: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Using Logistic Models to Interpret

= (𝜆𝜆1𝐴𝐴−𝜆𝜆2𝐴𝐴) + (𝜆𝜆𝑖𝑖1𝑆𝑆𝐴𝐴 − 𝜆𝜆𝑖𝑖2𝑆𝑆𝐴𝐴) + (𝜆𝜆𝑖𝑖1𝐸𝐸𝐴𝐴−𝜆𝜆𝑖𝑖2𝐸𝐸𝐴𝐴)3) Use power rules to substitute again

= 2𝜆𝜆1𝐴𝐴 + 2𝜆𝜆𝑖𝑖1𝑆𝑆𝐴𝐴 + 2𝜆𝜆𝑖𝑖1𝐸𝐸𝐴𝐴

Loglinear parameters have corresponding logit parameters𝑙𝑙𝑜𝑜𝐿𝐿𝑖𝑖𝑙𝑙 𝑃𝑃 𝐴𝐴 = 1 = 𝛼𝛼 + 𝛽𝛽1𝑆𝑆+𝛽𝛽1𝐸𝐸

62

Page 63: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

The Logistic Model

• The equivalent parameterizations enhance interpretation• Historical breakthrough

• Logistic models could be solved using iterative proportional fitting, which was previously used to solve loglinear models (Goodman, 1964; 1968)

63

Page 64: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

The Latent Class Model

• Latent class analysis (LCA) is a special case of discrete (finite) mixture models (McLachlan & Peel, 2000; Newcomb, 2000)

• Used to identify unobserved or latent groups• Assumes conditional independence

• Controlling for the latent variable, all manifest variables are independent

• Two equivalent parameterizations (Goodman, 1974; Haberman, 1979)• Probabilistic• Loglinear

64

Page 65: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

The Latent Class Model (3)

65

𝐺𝐺𝑔𝑔

𝐴𝐴𝑖𝑖

𝐵𝐵𝑖𝑖

𝐶𝐶𝑘𝑘

𝐷𝐷𝑏𝑏

𝑋𝑋𝑡𝑡

Page 66: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

LCA Parameterizations

• The basic LCA model, assuming 3 manifest variables and 1 latent variable

• Probabilistic parameterization𝜋𝜋𝑖𝑖𝑖𝑖𝑘𝑘𝑡𝑡𝐴𝐴𝐵𝐵𝐴𝐴𝐴𝐴=𝜋𝜋𝑡𝑡𝐴𝐴𝜋𝜋𝑖𝑖𝑡𝑡

𝐴𝐴|𝐴𝐴𝜋𝜋𝑖𝑖𝑡𝑡𝐵𝐵|𝐴𝐴𝜋𝜋𝑘𝑘𝑡𝑡

𝐴𝐴|𝐴𝐴

• 𝜋𝜋𝑡𝑡𝐴𝐴 is the latent class probability or “mixing” probability that a given member of the sample is in latent class t

• 𝜋𝜋𝑖𝑖𝑡𝑡𝐴𝐴|𝐴𝐴,𝜋𝜋𝑖𝑖𝑡𝑡

𝐵𝐵|𝐴𝐴and 𝜋𝜋𝑘𝑘𝑡𝑡𝐴𝐴|𝐴𝐴 are conditional probabilities that the respondent in latent

class t responds with 0 or 1 for each manifest indicator variable

66

Page 67: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

LCA Parameterizations

• To obtain the loglinear form of the model, take the natural log of the probabilistic model

ln 𝑓𝑓𝑖𝑖𝑖𝑖𝑘𝑘𝑡𝑡𝐴𝐴𝐵𝐵𝐴𝐴𝐴𝐴 = 𝜆𝜆 + 𝜆𝜆𝑡𝑡𝐴𝐴 + 𝜆𝜆𝑖𝑖𝐴𝐴 +𝜆𝜆𝑖𝑖𝐵𝐵 + 𝜆𝜆𝑘𝑘𝐴𝐴 +𝜆𝜆𝑖𝑖𝑡𝑡𝐴𝐴𝐴𝐴 + 𝜆𝜆𝑖𝑖𝑡𝑡𝐵𝐵𝐴𝐴 +𝜆𝜆𝑘𝑘𝑡𝑡𝐴𝐴𝐴𝐴

• Includes only the higher-order terms that include the latent variable• No interaction terms (e.g., 𝜆𝜆𝑖𝑖𝑖𝑖𝑡𝑡𝐴𝐴𝐵𝐵𝐴𝐴 = 0) because the model specifies

conditional independence (McCutcheon, 1987; 2002)

67

Page 68: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

LCA Parameterizations

• The two parameterizations are equivalent (Haberman, 1979)• Same number of parameters• Same expected values

• Some restrictions can only be imposed in one parameterization• “Reduced” latent class models

𝜋𝜋𝑖𝑖𝑖𝑖𝑘𝑘𝑏𝑏𝑌𝑌𝐴𝐴𝐵𝐵𝐴𝐴𝐷𝐷𝐷𝐷=𝜋𝜋𝑡𝑡𝑌𝑌𝐴𝐴|𝐷𝐷𝜋𝜋𝑖𝑖𝑡𝑡𝑌𝑌

𝐴𝐴|𝐴𝐴𝐷𝐷𝜋𝜋𝑖𝑖𝑡𝑡𝑌𝑌𝐵𝐵|𝐴𝐴𝐷𝐷𝜋𝜋𝑘𝑘𝑡𝑡𝑌𝑌

𝐴𝐴|𝐴𝐴𝐷𝐷𝜋𝜋𝑏𝑏𝑡𝑡𝑌𝑌𝐷𝐷|𝐴𝐴𝐷𝐷𝜋𝜋𝑌𝑌𝐷𝐷

• Test hypotheses using loglinear parameterization• Use power rules to obtain “reduced” model in loglinear form• Does conditional independence hold across groups?

68

Page 69: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Summary

• Loglinear models are an essential method for understanding categorical data

• Taking the natural log of cell counts yields an ANOVA decomposition • Log odds of cell sizes• Convert lambda parameters back into odds and odds ratios

• The two parameterizations permit ANOVA-style decomposition to contingency tables

• Aid interpretability • Can estimate equivalent logistic models

• Added flexibility in the types of restrictions that can be imposed• Conditional independence in latent class analysis models

69

Page 70: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

ReferencesAgresti, A. (2007). An introduction to categorical data analysis. (2nd ed.). Hoboken, NJ: Wiley. Bishop, Y. M. (1969). Full contingency tables, logits, and split contingency tables. Biometrics, 383-399.Fienberg, S. (1977) The analysis of cross-classified categorical data. Cambridge, MA: MIT PressGoodman, L. A. (1964). A short computer program for the analysis of transaction flows. Behavioral Science, 9, 176-

186.Goodman, L. A. (1968). The analysis of cross-classified data: Independence, quasi-independence, and interactions in

contingency tables with or without missing entries. Journal of the American Statistical Associate, 63(324), 1091-1131.

Hagenaars, J. (1993). Loglinear models with latent variables (Sage University Paper series on quantitative applications in the social sciences, series no. 07-094). Newbury Park, CA: Sage.

McCutcheon, A. L. (1987). Latent class analysis. Newbury Park, CA: Sage.McLachlan, G. and Peel, D. (2000). Finite Mixture Model. Hoboken, NJ: John Wiley & Sons. doi: 10.1002/0471721182Newcomb, S. (1886). A generalized theory of the combination of observations so as to obtain the best result.

American Journal of Mathematics, 8(4), 343-366.Sewell, W. H., & Shah, V. P. (1968). Social class, parental encouragement, and educational aspirations. American

Journal of Sociology, 73, 559-572.

70

Page 71: An Introduction to Loglinear Modelingcyfs.unl.edu/cyfsprojects/videoPPT/53fdf8e3fc67742fc50b... · 2018-05-10 · Probabilities • Joint probabilities • Describe co-occurrence

Questions?

[email protected]