VII. Ordinal & Multinomial
Logit Models
To what degree do the dietary & exercise habits of a sample of adults predict whether they are in the low, medium, or high-risk categories for cardiovascular disease?
How well do the social traits of a sample of high school students predict whether their achievement test scores are low, medium-low, medium-high, or high?
To what extent do the institutional characteristics of a sample of political regimes predict whether their responsiveness to citizen demands is low, medium, or high?
How helpful are the institutional characteristics of a sample of industrial firms in predicting whether the amount of pollution they emit is low, medium, or high?
These are examples of ordinal outcome variables.
The categories of an ordinal variable can be ranked, but the distances between the categories are unknown & cannot be assumed equal.
Because the distances between the categories cannot be assumed equal, analyzing ordinal outcome variables via OLS regression violates its assumptions & can lead to erroneous conclusions.
What statistical model avoids the assumption of equal intervals between ordinal categories?
Logit & probit versions of the ordinal regression model do not require the equal-interval assumption that OLS imposes on a variable’s categories.
But as Long & Freese (pages 137-38) observe, “Simply because the values of a variable can be ordered does not imply that the variable should be analyzed as ordinal.”
A categorical, multi-level variable could conceivably be ordered for one purpose but unordered for another.
As Long & Freese conclude, “Overall, when the proper ordering is ambiguous, the models for nominal outcomes [multinomial regression] …should be considered.”
Multinomial models treat categories as nominal rather than ordinal: Which do you prefer—apple pie, hot fudge sundae, cheese cake, or cannoli?
Which is your racial-ethnic identity: Black, White, Asian, Hispanic, or other?
Let’s use ordinal logistic regression to analyze respondent answers to this statement: “A working mother can establish just as warm & secure of a relationship with her child as a mother who does not work.”
The responses are coded as: 1=strongly disagree (SD), 2=disagree (D), 3=agree (A), & 4=strongly agree (SA).
These data are examined in Long/Freese, chapter 5.
. use ordwarm2, clear
Let’s assume we’ve done the preparatory data analysis & transformations.
. ologit warm yr89 male white age ed prst, or nolog table
. ologit warm yr89 male white age ed prst, or nolog table
Ordered logit estimates                 Number of obs = 2293
                                        LR chi2(6)    = 301.72
                                        Prob > chi2   = 0.0000
Log likelihood = -2844.9123             Pseudo R2     = 0.0504
-------------------------------------------------------------------------
warm | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+-----------------------------------------------------------
yr89 | 1.688605 .1349175 6.56 0.000 1.443836 1.974867
male | .4803214 .0376969 -9.34 0.000 .4118389 .5601915
white | .6762723 .0800576 -3.30 0.001 .5362357 .8528791
age | .9785675 .0024154 -8.78 0.000 .9738449 .983313
ed | 1.06948 .0170849 4.20 0.000 1.036513 1.103496
prst | 1.006091 .003313 1.84 0.065 .9996188 1.012605
-------------------------------------------------------------------------
What’s the interpretation?
Let’s see the coefficients as percentage change in odds:
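. listcoef, percent
ologit (N=2293): Percentage Change in Odds
Odds of: >m vs <=m
-------------------------------------------------------------
   warm |        b        z    P>|z|       %    %StdX    SDofX
--------+----------------------------------------------------
   yr89 |  0.52390    6.557   0.000     68.9     29.2   0.4897
   male | -0.73330   -9.343   0.000    -52.0    -30.6   0.4989
  white | -0.39116   -3.304   0.001    -32.4    -12.1   0.3290
    age | -0.02167   -8.778   0.000     -2.1    -30.5  16.7790
     ed |  0.06717    4.205   0.000      6.9     23.7   3.1608
   prst |  0.00607    1.844   0.065      0.6      9.2  14.4923
-------------------------------------------------------------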
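The percentage columns follow directly from the logit coefficients. As a minimal sketch (plain arithmetic on the values printed in the listcoef output, not part of the original slides): the percentage change in odds for a one-unit increase is 100*(exp(b) - 1), and for a one-standard-deviation increase it is 100*(exp(b*SD) - 1).

```stata
* Hand check of the listcoef columns for yr89 (values copied from the output)
* Odds ratio: about 1.69
display exp(0.52390)
* Percentage change in odds per unit increase: about 68.9
display 100*(exp(0.52390) - 1)
* Percentage change in odds per standard-deviation increase: about 29.2
display 100*(exp(0.52390*0.4897) - 1)
* Same per-unit check for male: about -52.0
display 100*(exp(-0.73330) - 1)
```

Read this way, and holding the other variables constant, the odds of a response above any given category (>m versus <=m) are about 69% higher when yr89=1 than when yr89=0, and about 52% lower for men than for women.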
Try fitting the model by means of ordinal probit:
. oprobit warm yr89 male white age ed prst, nolog table
Of course we can’t obtain odds ratios via ordinal probit.
Otherwise the only notable difference is that the logit coefficients are roughly 1.7 times as large as the probit coefficients; the substantive conclusions are basically the same.
We could have used the robust &/or cluster options:
. ologit warm yr89 male white age ed prst, or robust nolog table
. ologit warm yr89 male white age ed prst, or cluster(district) nolog table
. oprobit warm yr89 male white age ed prst, robust nolog table
. oprobit warm yr89 male white age ed prst, cluster(district) nolog table
Recall that cluster invokes robust standard errors.
One possible problem with ologit or oprobit is perfect prediction: if the outcome variable does not vary within one of the categories of the explanatory variable, Stata will tell you. E.g.:
note: 40 observations completely determined. Standard errors questionable.
We may receive the same message with binary outcome variables, but in that case Stata tells us which is the variable at fault & automatically drops the offending observations.
In the case of logistic regression (i.e. a binary outcome variable), we may decide it wise to drop the offending variable from the model & re-estimate the model.
In ordinal categorical regression, we cross-tab the explanatory variables with the outcome variable to identify the culprit.
Then we re-categorize or drop the offending variable or—if we deem it wise—drop only the observations at fault (see Long/Freese, page 145).
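The cross-tab step can be sketched as follows. This is only an illustration: it assumes the ordwarm2 data are in memory, and the choice of the three binary predictors is ours, not the slides'. An empty (or nearly empty) cell in one of these tables points to the variable behind the perfect-prediction note.

```stata
* Cross-tabulate the outcome against each categorical predictor
* and look for cells with zero observations
foreach v of varlist yr89 male white {
    tabulate warm `v'
}
```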
Let’s return to our ologit model & test nested models:
. ologit warm yr89 male white age ed prst, or nolog table
. estimates store full
. ologit warm yr89 male white age, nolog
. lrtest full .
likelihood-ratio test                   LR chi2(2)  = 44.57
(Assumption: . nested in full)          Prob > chi2 = 0.0000
We can also use the Wald test to do the same thing (although, as mentioned previously, the likelihood ratio test is the preferred alternative):
. test ed prst
( 1) ed = 0
( 2) prst = 0
chi2( 2) = 44.17
Prob > chi2 = 0.0000
The Wald test & likelihood ratio test yield the same conclusion (as they usually do).
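As a quick check on the reported p-values, the upper-tail chi-squared probabilities can be computed directly from the two test statistics. This is a sketch using only the numbers printed in the output; chi2tail() is Stata's built-in upper-tail chi-squared function.

```stata
* LR test statistic, chi2(2) = 44.57
display chi2tail(2, 44.57)
* Wald test statistic, chi2(2) = 44.17
display chi2tail(2, 44.17)
```

Both probabilities are far below 0.0001, which is why Stata prints 0.0000 in each case.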
Next step: test the model specification:
. linktest, nolog
Ordered logit estimates                 Number of obs = 2293
                                        LR chi2(2)    = 302.75
                                        Prob > chi2   = 0.0000
Log likelihood = -2844.3934             Pseudo R2     = 0.0505
-------------------------------------------------------------------------
        warm |     Coef.   Std. Err.      z    P>|z|   [95% Conf. Interval]
-------------+-----------------------------------------------------------
        _hat |   1.05767   .0821499   12.87   0.000    .8966591   1.218681
      _hatsq |  .0652007   .0640337    1.02   0.309   -.0603031   .1907045
-------------+-----------------------------------------------------------
       _cut1 | -2.444759   .0763629   (Ancillary parameters)
       _cut2 | -.6149168    .052821
       _cut3 |  1.282015   .0601348
-------------------------------------------------------------------------
No problems here: _hatsq is not statistically significant (p = 0.309), so the test gives no evidence of misspecification.
An aspect of model specification testing for ologit & oprobit models concerns the proportional odds (or parallel regression) assumption: as in OLS, a single set of slope coefficients is assumed to apply across the levels of the outcome variable, so that each cumulative probability curve differs only in being shifted to the left or right (see Long/Freese, pages 150-52).
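In symbols, the assumption can be written as below. The notation is ours, not the slides': \tau_m stands for the cut points that Stata reports as _cut1, _cut2 & _cut3, and \beta is the single vector of slopes shared by all three cumulative comparisons (the ">m vs <=m" odds in the listcoef output).

```latex
\ln\frac{\Pr(y > m \mid x)}{\Pr(y \le m \mid x)} = x\beta - \tau_m ,
\qquad m = 1, 2, 3
```

Only \tau_m changes with m. A violation of the assumption means that the data would be better described by a separate \beta_m for each comparison.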
There are two ways of testing this assumption. The first is omodel:
. omodel logit warm yr89 male white age ed prst
The second is ologit followed by brant:
. ologit warm yr89 male white age ed prst
. brant
. omodel logit warm yr89 male white age ed prst
Ordered logit estimates                 Number of obs = 2293
                                        LR chi2(6)    = 301.72
                                        Prob > chi2   = 0.0000
Log likelihood = -2844.9123             Pseudo R2     = 0.0504
-------------------------------------------------------------------------
        warm |     Coef.   Std. Err.      z    P>|z|   [95% Conf. Interval]
-------------+-----------------------------------------------------------
        yr89 |  .5239025   .0798988    6.56   0.000    .3673037   .6805013
        male | -.7332997   .0784827   -9.34   0.000   -.8871229  -.5794766
       white | -.3911595   .1183808   -3.30   0.001   -.6231815  -.1591374
         age | -.0216655   .0024683   -8.78   0.000   -.0265032  -.0168278
          ed |  .0671728    .015975    4.20   0.000    .0358624   .0984831
        prst |  .0060727   .0032929    1.84   0.065   -.0003813   .0125267
-------------+-----------------------------------------------------------
       _cut1 | -2.465362   .2389126   (Ancillary parameters)
       _cut2 |  -.630904   .2333155
       _cut3 |  1.261854   .2340179
-------------------------------------------------------------------------
Approximate likelihood-ratio test of proportionality of odds
across response categories: chi2(12) = 48.91 Prob > chi2 = 0.0000
. brant
Brant Test of Parallel Regression Assumption
    Variable |     chi2   p>chi2    df
-------------+------------------------
         All |    49.18    0.000    12
-------------+------------------------
        yr89 |    13.01    0.001     2
        male |    22.24    0.000     2
       white |     1.27    0.531     2
         age |     7.38    0.025     2
          ed |     4.31    0.116     2
        prst |     4.33    0.115     2
--------------------------------------
A significant test statistic provides evidence that the parallel regression assumption has been violated.
Both tests indicate that the model violates the proportional odds (parallel regression) assumption.
What should we do in response to this model violation?
Most basically, recall the difference between statistical & practical significance.
We need to explore changes to the explanatory variables or alternatives that safely ignore the parallel odds assumption: e.g., generalized ordered logit (gologit2, which is downloadable); or else multinomial logit (mlogit).
Then compare the results across the kinds of models: Are there practically significant differences? If not, perhaps stick with the ologit model.
One more thing: the Brant test is more likely to yield significant results as samples get larger.
gologit2 (generalized ordered logit):
. findit gologit2
. view help gologit2
. gologit2 warm yr89 male white age ed prst, auto
. linktest
Conclusion: gologit2 works fine, but the relative ease & clarity of interpreting ologit vs. gologit2 must be considered.
On gologit2 & its various options, see: http://www.nd.edu/~rwilliam/gologit2/
There are no diagnostics for gologit2 beyond linktest.
So let’s return to the ologit model to find out how its diagnostics work.
. ologit warm yr89 male white age ed prst, or nolog table
. predict pSD pD pA pSA if e(sample)
. su pSD-pSA
    Variable |   Obs        Mean   Std. Dev.        Min        Max
-------------+----------------------------------------------------
         pSD |  2293    .1293539   .0793024   .0153572   .4657959
          pD |  2293    .3152335   .0832117    .073616   .4289543
          pA |  2293    .3738817    .070512   .1279493   .4407727
         pSA |  2293    .1815308   .0961532   .0268523   .6067042
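To see where these predicted probabilities come from, they can be rebuilt by hand from the linear prediction and the three cut points. This is a sketch under stated assumptions: the ologit model has just been fit, pSD-pSA already exist, the cut points are the values printed earlier as _cut1, _cut2 & _cut3, and the names xbhat & hSD-hSA are ours.

```stata
* Linear prediction (no intercept in an ordered logit)
predict double xbhat if e(sample), xb
* Pr(y <= m) = invlogit(cut_m - xb); category probabilities are differences
generate double hSD = invlogit(-2.465362 - xbhat)
generate double hD  = invlogit(-0.630904 - xbhat) - invlogit(-2.465362 - xbhat)
generate double hA  = invlogit(1.261854 - xbhat) - invlogit(-0.630904 - xbhat)
generate double hSA = 1 - invlogit(1.261854 - xbhat)
* Each hand-built variable should match its predict counterpart,
* up to rounding of the cut points
summarize pSD hSD pD hD pA hA pSA hSA
```

By construction the four probabilities sum to 1 for every observation.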
. dotplot pSD-pSA
[Figure: dot plots of the predicted probabilities Pr(warm==1), Pr(warm==2), Pr(warm==3) & Pr(warm==4); vertical axis from 0 to .6]
The remaining diagnostics for ologit or oprobit only approximate diagnostics for the overall model: they are based on sequentially re-estimating the model in higher-versus-lower binary segments. In this example: category D(2) versus SD(1), category A(3) versus D(2), & category SA(4) versus A(3).
This is extremely tedious & time consuming.
Here, nonetheless, is how to do it.
We must first recode the outcome variable so that the base value=0.
We must use not ologit but rather logit regression.
And using logit regression, we estimate an ordinal series of binary outcome variables: e.g., 0/1, 1/2, & 2/3.
. recode warm 1=0 2=1 3=2 4=3
. logit warm yr89 male white age ed prst if warm~=2 & warm~=3, nolog
. predict p1 if e(sample)
Note: p1 is the predicted probability of D (versus SD) in the original coding.
. predict db, db
. predict dd, dd
. predict dx2, dx2
. predict n, n
. su p1-n
Then plot the graphs & proceed as discussed under logistic regression.
Note: logit treats every nonzero value of the outcome as 1, so the 1-versus-2 comparison needs its own 0/1 indicator (warm12 here).
. drop db-n
. generate byte warm12 = (warm==2) if warm==1 | warm==2
. logit warm12 yr89 male white age ed prst, nolog
. predict p2 if e(sample)
. predict db, db
. predict dd, dd
. predict dx2, dx2
. predict n, n
. su p2-n
Again, plot the graphs & proceed as discussed under logistic regression.
Note: the third comparison is 2 versus 3, again estimated with logit (not ologit) on a 0/1 indicator (warm23 here).
. drop db-n
. generate byte warm23 = (warm==3) if warm==2 | warm==3
. logit warm23 yr89 male white age ed prst, nolog
. predict p3 if e(sample)
. predict db, db
. predict dd, dd
. predict dx2, dx2
. predict n, n
. su p3-n
Continue plotting graphs & proceed as discussed under logistic regression.
Such diagnostics can alert us to problems of model fit, outliers & influence.
At best, though, they represent approximations.
At this point we typically would explore predicted probabilities (postgr3, prchange, prtab, prvalue, & prgen-graphs).
We’ll look only at postgr3, specifying a separate graph for outcomes 2, 3 & 4.
. xi3: ologit warm yr89 male white age ed prst, or nolog table
. postgr3 age, by(male) outcome(2) table
Female: top line.
[Figure: predicted probability of outcome 2 (yhat) by age in years (20 to 100), with separate lines for women & men]
. postgr3 age, by(male) outcome(3) table
Female: top line.
[Figure: predicted probability of outcome 3 (yhat) by age in years (20 to 100), with separate lines for women & men]
. postgr3 age, by(male) outcome(4) table
Female: top line.
[Figure: predicted probability of outcome 4 (yhat) by age in years (20 to 100), with separate lines for women & men]
There are other ordered logit models, such as the ordered continuation-ratio model (ocratio), which suits outcomes in which a higher category can be reached only by passing through the lower ones: it models the likelihood of advancing to a higher category versus stopping at a lower one.
E.g., earning a Ph.D. versus an M.A. versus a B.A. versus a high school degree versus less than a high school degree.
But let’s turn our attention to the multinomial logit model.
An outcome is nominal when the categories are assumed to be unordered.
Marital status: divorced, never married, married, or widowed.
Occupation: professional, white collar, blue collar, craft, or menial.
Race-ethnicity, religion, political affiliation, and citizenship are among the other examples of nominal variables.
We use a multinomial logit model to compare nominal outcomes, or when the assumption of parallel regressions (i.e. parallel odds) is violated.
Among the other versions is the conditional logit model (see Long/Freese, chapter 6), which uses the characteristics of the outcomes to predict which choice is made (e.g., your voting options are George Bush Sr., George Bush Jr., or Jeb Bush: which would you choose given the array of choices?)
Multinomial logit models include a lot of parameters, & interpreting the results can be overwhelming.
Advice: keep the categories of the outcome variable to the fewest number possible.
The Stata-based approaches developed by Long & Freese (chapter 6) are helpful in grappling with the complexities.
Here’s an example:
. u nomocc2, clear
We’ll pretend that we’ve done the background exploratory analysis & transformations.
The research question: how effective are the variables white, ed & exper in predicting whether a sample of respondents work in menial jobs, blue collar jobs, craft jobs, white collar jobs, or professional jobs?
. mlogit occ white ed exper, rrr base(5) nolog
. mlogit occ white ed exper, rrr base(5) nolog
Multinomial logistic regression Number of obs = 1685
LR chi2(12) = 830.44
Prob > chi2 = 0.0000
Log likelihood = -2134.0024 Pseudo R2 = 0.1629
------------------------------------------------------------------------------
occ | RRR Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Menial |
white | .169601 .0572693 -5.25 0.000 .0874989 .3287412
ed | .4589326 .0235266 -15.19 0.000 .4150621 .50744
exper | .9649771 .0077839 -4.42 0.000 .949841 .9803545
-------------+----------------------------------------------------------------
BlueCol |
white | .5840301 .2088454 -1.50 0.133 .2897685 1.177116
ed | .4154983 .0186829 -19.53 0.000 .3804478 .4537781
exper | .9695438 .0062475 -4.80 0.000 .957376 .9818662
-------------+----------------------------------------------------------------
Craft |
white | .2719974 .0787523 -4.50 0.000 .1542104 .4797509
ed | .5040718 .0201306 -17.15 0.000 .4661212 .5451123
exper | .9920646 .005637 -1.40 0.161 .9810776 1.003175
-------------+----------------------------------------------------------------
WhiteCol |
white | .8163426 .3173662 -0.52 0.602 .3810256 1.749003
ed | .653316 .0269439 -10.32 0.000 .602585 .7083181
exper | .9989455 .0064144 -0.16 0.869 .9864523 1.011597
------------------------------------------------------------------------------
(Outcome occ==Prof is the comparison group)
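rrr (relative risk ratio): mlogit’s relative risk ratios are exponentiated coefficients, i.e. the factor by which the risk of an outcome relative to the comparison group changes for a one-unit increase in an explanatory variable. They are not true relative risks (ratios of probabilities) but an approximation of the real thing; see Statalist on this.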
The default is to display logit coefficients; ‘outreg,’ ‘estimates,’ & other table commands can display odds ratios.
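To connect the RRR column to the logit coefficients and to percentage changes, here is a small hand calculation for ed in the Menial equation. It is only a sketch, using the RRR value printed in the output.

```stata
* RRR for ed, Menial vs Prof (copied from the output)
* Underlying logit coefficient: about -0.78
display ln(0.4589326)
* Percentage change in the odds of Menial vs Prof per unit of ed: about -54.1
display 100*(0.4589326 - 1)
```

That is, holding white & exper constant, a one-unit increase in ed multiplies the odds of a menial rather than a professional job by about 0.46, a decrease of roughly 54%.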
If you feel more comfortable using odds ratios or percentage change in odds:
. quietly mlogit occ white ed exper, base(5)
. listcoef, factor help
Or:
. qui mlogit occ white ed exper, base(5)
. listcoef, percent help
We can change the comparison group via base( ), or not specify base( ) & let Stata choose the comparison group.
We can display the results in terms of odds ratios or percentage change in odds either for all the explanatory variables together or for them individually:
. listcoef, factor help
. listcoef white, factor help
. listcoef, percent help
. listcoef white, percent help
And to simplify the output, we can display only those explanatory variables that attain a specified level of statistical significance:
. listcoef white, pvalue(.10)
And we could have specified the robust &/or cluster options in the multinomial equation.
Recall that cluster invokes robust standard errors.
The problem of perfect prediction: mlogit does not give us a warning message, but rather lists the culprit variables as z=0 (and p>|z|=1).
What to do: re-estimate the model, excluding the problem variable & deleting the observations that imply perfect prediction.
Identify the problem observations by doing a cross-tab of the problem variable with the outcome variable.
Model specification & related tests: linktest is not an option for mlogit, but Long & Freese have developed mlogtest to greatly facilitate the battery of tests for a multinomial logit model.
. mlogtest, lr wald lrcomb smhsiao
These & other options can be specified either individually or collectively.
**** Likelihood-ratio tests for independent variables
Ho: All coefficients associated with given variable(s) are 0.
occ chi2 df P>chi2
white 40.477 4 0.000
ed 784.686 4 0.000
exper 42.805 4 0.000
**** Wald tests for independent variables
Ho: All coefficients associated with given variable(s) are 0.
occ chi2 df P>chi2
white 40.746 4 0.000
ed 424.841 4 0.000
exper 39.975 4 0.000
**** Small-Hsiao tests of IIA assumption
Ho: Odds(Outcome-J vs Outcome-K) are independent of other alternatives.
Omitted lnL(full) lnL(omit) chi2 df P>chi2 evidence
Menial -867.293 -860.532 13.522 4 0.009 against Ho
BlueCol -732.720 -727.573 10.295 4 0.036 against Ho
Craft -666.973 -660.184 13.578 4 0.009 against Ho
WhiteCol -800.319 -791.873 16.892 4 0.002 against Ho
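The chi2 column in the Small-Hsiao table can be reproduced from the two log likelihoods in each row. The following is a sketch for the Menial row, using only the printed values: the statistic is -2*(lnL(full) - lnL(omit)), referred to a chi-squared distribution with 4 degrees of freedom.

```stata
* Small-Hsiao statistic for the Menial row: 13.522
display -2*(-867.293 - (-860.532))
* Its upper-tail probability with 4 df: about 0.009
display chi2tail(4, 13.522)
```

Because the Small-Hsiao test splits the sample at random, rerunning it without setting the random-number seed can give different results.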
**** LR tests for combining outcome categories
Ho: All coefficients except intercepts associated with given pair of outcomes are 0 (i.e., categories can be collapsed).

Categories tested  |     chi2   df   P>chi2
-------------------+-----------------------
Menial-BlueCol     |   20.474    3    0.000
Menial-Craft       |   16.882    3    0.001
Menial-WhiteCol    |   66.113    3    0.000
Menial-Prof        |  323.036    3    0.000
BlueCol-Craft      |   45.881    3    0.000
BlueCol-WhiteCol   |  114.015    3    0.000
BlueCol-Prof       |  628.497    3    0.000
Craft-WhiteCol     |   49.958    3    0.000
Craft-Prof         |  479.447    3    0.000
WhiteCol-Prof      |  133.678    3    0.000
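
A Wald counterpart to these LR tests can be sketched with Stata's official test command. This is a Wald test, not the LR test reported by mlogtest, so the statistics will differ somewhat. The sketch assumes that the equation names follow the value labels shown in the table (Menial, BlueCol, and so on) and that Prof is the base outcome.

```stata
mlogit occ white ed exper, base(5) nolog

* Can Menial & BlueCol be combined? Tests equal slopes across the two equations
test [Menial=BlueCol]

* Can Menial be combined with the base category (Prof)? Tests all Menial slopes = 0
test [Menial]
```

By default both tests leave out the intercepts, so each has 3 degrees of freedom (one per explanatory variable), like the LR version.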
How to test for the joint significance of, say, ‘ethnicity’ (with ‘white’ as the base category)?
We can do so using either a likelihood ratio test (which is preferred) or a Wald test:
. mlogtest, lr set(black hispanic asian)
. mlogtest, wald set(black hispanic asian)
These tests of joint significance can be specified together with the other mlogtest options.
Other Fit & Influence Diagnostics
. mlogit occ white ed exper, base(5) nolog
. predict prM prB prC prW prP if e(sample)
. su prM-prP
Note: M, B, C, W & P are labels attached to the coded responses 1, 2, 3, 4 & 5.
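
Two simple sanity checks on the predicted probabilities can be sketched as follows. The variable name prsum is a hypothetical name introduced for illustration; the rest follows the predict command given here.

```stata
* The five predicted probabilities should sum to 1 for every case
egen double prsum = rowtotal(prM prB prC prW prP)
summarize prsum if e(sample)

* In a model with intercepts, the mean predicted probabilities
* reproduce the observed share of each occupation
summarize prM prB prC prW prP if e(sample)
tabulate occ if e(sample)
```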
Influence-diagnostic graphs are available only as approximations for a multinomial model. To obtain them:
Re-code the outcome variable so that the reference group = 0.
Use logit, not mlogit, to estimate the model for each binary outcome.
Estimate the logit model for (at least) each outcome level versus the base level (e.g., menial vs. professional, craft vs. professional, white collar vs. professional).
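
These steps might look as follows for one comparison, menial versus professional. This is a sketch under assumptions: occ is coded 1 = menial & 5 = professional (as in the mlogit commands on these slides), and the names men_prof, p_mp, dx2_mp & db_mp are hypothetical.

```stata
* Binary outcome: menial = 1, professional (reference) = 0; other occupations missing
generate byte men_prof = (occ == 1) if inlist(occ, 1, 5)

logit men_prof white ed exper, nolog

* Case-level diagnostics available after logit
predict p_mp if e(sample)             // predicted probability
predict dx2_mp if e(sample), dx2      // Hosmer-Lemeshow delta chi-squared
predict db_mp if e(sample), dbeta     // Pregibon delta-beta influence statistic

* Lack of fit against predicted probability, markers sized by influence
scatter dx2_mp p_mp [aweight = db_mp], msymbol(oh)
```

Repeating the same steps for the other outcome levels (craft, white collar, and so on) gives one set of graphs per comparison with the base level.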
How to explore predicted probabilities? Returning to the original coding of ‘occ’:
. mlogit occ white ed exper, base(5) nolog
Then use postgr3, prchange, prtab, prvalue & prgen to predict & graph probabilities.
We could also use prchange & mlogview to graph the predictions in another format.
. xi3: mlogit occ white ed exper, base(5) nolog
. postgr3 ed, by(white) outcome(1) table
Nonwhite: top line.
[Graph: predicted probability of outcome 1 (y-axis, 0 to .3) against years of education (x-axis, 0 to 20), with one line for white == 0 and one for white == 1.]
. postgr3 ed, by(white) outcome(2) table
Nonwhite: top line.
[Graph: predicted probability of outcome 2 (y-axis, 0 to .8) against years of education (x-axis, 0 to 20), with one line for white == 0 and one for white == 1.]
. postgr3 ed, by(white) outcome(3)
Nonwhite: bottom line.
[Graph: predicted probability of outcome 3 (y-axis, 0 to .5) against years of education (x-axis, 0 to 20), with one line for white == 0 and one for white == 1.]
. postgr3 ed, by(white) outcome(4) table
Nonwhite: bottom line.
[Graph: predicted probability of outcome 4 (y-axis, 0 to .2) against years of education (x-axis, 0 to 20), with one line for white == 0 and one for white == 1.]
Predicted probabilities: Long & Freese’s suite of commands (e.g., prvalue, x(…) delta save; prvalue, x(…) delta diff) is relevant.
See their particular commands for mlogit.
Remember to see Long & Freese’s final chapter to learn how to predict probabilities based on curvilinear explanatory variables.
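
In Stata 12 or later, predicted probabilities of this kind can also be obtained with the official margins & marginsplot commands. The sketch below is an alternative to the Long & Freese commands, not a replacement for them. It assumes the occupation data are in memory, enters white as a factor variable (i.white) so that margins can vary it, and holds exper at its mean.

```stata
mlogit occ i.white ed exper, base(5) nolog

* Predicted probability of outcome 1 across years of education, by white
margins white, at(ed = (0(5)20)) atmeans predict(outcome(1))

* Graph the results: education on the x-axis, one line per level of white
marginsplot
```

Changing outcome(1) to outcome(2), outcome(3) & so on gives the other outcome categories.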
Finally, consider regression models for another common form of categorical outcome variable: counts (see Long & Freese, chapter 7).
E.g., the number of homicides, suicides, hospitalizations, accidents, alcoholic drinks consumed, academic publications, or wars.
OLS regression is commonly but inappropriately applied to such problems.
Instead use count models such as poisson & negative binomial regression.
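
A minimal sketch of the two count models in Stata. It assumes the rod93 example dataset from Stata's documentation, with the variables deaths, cohort & exposure; these data are not part of this lecture, and webuse needs internet access.

```stata
* Example data from Stata's documentation (an assumption; any count outcome would do)
webuse rod93, clear

* Poisson regression with an exposure term
poisson deaths i.cohort, exposure(exposure)
estat gof      // goodness-of-fit tests; small p-values signal poor fit

* Negative binomial regression allows for overdispersion (variance > mean)
nbreg deaths i.cohort, exposure(exposure)
```

The nbreg output ends with a likelihood-ratio test of alpha = 0. Rejecting it favors the negative binomial model over poisson.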
That’s all!