![Page 1: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/1.jpg)
7. Models for Count Data, Inflation Models
![Page 2: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/2.jpg)
Models for Count Data
![Page 3: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/3.jpg)
![Page 4: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/4.jpg)
Doctor Visits
![Page 5: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/5.jpg)
Basic Model for Counts of Events
• E.g., Visits to site, number of purchases, number of doctor visits
• Regression approach
  • Quantitative outcome measured
  • Discrete variable, model probabilities
  • Nonnegative random variable
• Poisson probabilities – “loglinear model”
![Page 6: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/6.jpg)
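$$\text{Prob}[Y_i = j \mid \mathbf{x}_i] = \frac{\exp(-\lambda_i)\,\lambda_i^{j}}{j!}, \qquad \lambda_i = \exp(\boldsymbol{\beta}'\mathbf{x}_i) = E[y_i \mid \mathbf{x}_i]$$

Estimation:

Nonlinear Least Squares:

$$\min_{\boldsymbol{\beta}} \sum_{i=1}^{N} (y_i - \lambda_i)^2$$

Moment Equations:

$$\sum_{i=1}^{N} (y_i - \lambda_i)\,\lambda_i\,\mathbf{x}_i = \mathbf{0}$$

Inefficient but robust if nonPoisson

Maximum Likelihood:

$$\max_{\boldsymbol{\beta}} \sum_{i=1}^{N} \left[ -\lambda_i + y_i \log \lambda_i - \log(y_i!) \right]$$

Moment Equations:

$$\sum_{i=1}^{N} (y_i - \lambda_i)\,\mathbf{x}_i = \mathbf{0}$$

Efficient, also robust to some kinds of NonPoissonness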
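The maximum likelihood moment equations above can be solved by Newton's method. The following is a minimal sketch for a model with a constant and one regressor; the data, the function name `poisson_mle`, and the fixed iteration count are illustrative assumptions, not part of the slides.

```python
import math

def poisson_mle(y, x, iters=25):
    """Newton-Raphson for Poisson regression: lambda_i = exp(b0 + b1 * x_i)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the constant-only fit
    for _ in range(iters):
        lam = [math.exp(b0 + b1 * xi) for xi in x]
        # Score = ML moment equations: sum (y_i - lambda_i) * (1, x_i)
        g0 = sum(yi - li for yi, li in zip(y, lam))
        g1 = sum((yi - li) * xi for yi, li, xi in zip(y, lam, x))
        # Negative Hessian: sum lambda_i * (1, x_i)(1, x_i)'
        h00 = sum(lam)
        h01 = sum(li * xi for li, xi in zip(lam, x))
        h11 = sum(li * xi * xi for li, xi in zip(lam, x))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + (negative Hessian)^{-1} * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up illustrative data (not from the doctor visits application)
y = [0, 1, 1, 2, 4, 3, 6, 5]
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]

b0, b1 = poisson_mle(y, x)
lam_hat = [math.exp(b0 + b1 * xi) for xi in x]
# At the solution both moment equations are (numerically) zero
moment_const = sum(yi - li for yi, li in zip(y, lam_hat))
moment_slope = sum((yi - li) * xi for yi, li, xi in zip(y, lam_hat, x))
```

Because the Poisson log likelihood is globally concave in the coefficients, Newton's method from this starting point is usually well behaved; a production routine would add a convergence check rather than a fixed number of iterations.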
![Page 7: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/7.jpg)
Efficiency and Robustness
• Nonlinear Least Squares
  • Robust – uses only the conditional mean
  • Inefficient – does not use distribution information
• Maximum Likelihood
  • Less robust – specific to loglinear model forms
  • Efficient – uses distributional information
• Pseudo-ML
  • Same as Poisson
  • Robust to some kinds of nonPoissonness
![Page 8: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/8.jpg)
Poisson Model for Doctor Visits
![Page 9: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/9.jpg)
Alternative Covariance Matrices
![Page 10: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/10.jpg)
Partial Effects
$$\frac{\partial E[y_i \mid \mathbf{x}_i]}{\partial \mathbf{x}_i} = \lambda_i\,\boldsymbol{\beta}$$
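Since the partial effect λ_i β depends on x_i through λ_i, it is usually reported at a specific point or averaged over the sample. A small sketch, with hypothetical coefficients and data (the names `cond_mean`, `partial_effects`, `beta`, and `X` are assumptions made for illustration):

```python
import math

def cond_mean(beta, x_row):
    """E[y|x] = lambda = exp(beta'x); x_row includes the constant term."""
    return math.exp(sum(b * v for b, v in zip(beta, x_row)))

def partial_effects(beta, x_row):
    """d E[y|x] / d x_k = lambda * beta_k for each non-constant regressor."""
    lam = cond_mean(beta, x_row)
    return [lam * b for b in beta[1:]]

beta = [0.2, 0.5, -0.3]  # hypothetical: constant, then two slopes
X = [[1.0, 0.0, 1.0],    # hypothetical observations (constant first)
     [1.0, 1.0, 0.5],
     [1.0, 2.0, 0.0]]

per_obs = [partial_effects(beta, row) for row in X]
# Average partial effects: mean of lambda_i * beta_k over the sample
ape = [sum(col) / len(X) for col in zip(*per_obs)]
```

The sign of each partial effect is the sign of the coefficient, but the magnitude scales with the conditional mean.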
![Page 11: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/11.jpg)
Poisson Model Specification Issues
• Equi-dispersion: Var[yi|xi] = E[yi|xi].
• Overdispersion: If λi = exp[β’xi + εi],
  • E[yi|xi] = γ exp[β’xi], where γ = E[exp(εi)]
  • Var[yi] > E[yi] (overdispersed)
  • εi ~ log-Gamma → Negative binomial model
  • εi ~ Normal[0,σ²] → Normal-mixture model
  • εi is viewed as unobserved heterogeneity (“frailty”). Normal model may be more natural. Estimation is a bit more complicated.
![Page 12: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/12.jpg)
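Overdispersion
• In the Poisson model, Var[y|x] = E[y|x]
• Equidispersion is a strong assumption
• Negbin II: Var[y|x] = E[y|x] + α E[y|x]²
• How does overdispersion arise:
  • NonPoissonness
  • Omitted Heterogeneity

$$\text{Prob}[y = j \mid \mathbf{x}, u] = \frac{\exp(-\lambda)\,\lambda^{j}}{j!}, \qquad \lambda = \exp(\boldsymbol{\beta}'\mathbf{x} + u)$$

$$\text{Prob}[y = j \mid \mathbf{x}] = \int_u \text{Prob}[y = j \mid \mathbf{x}, u]\, f(u)\, du$$

$$\text{If } h = \exp(u) \text{ has density } f(h) = \frac{\theta^{\theta} \exp(-\theta h)\, h^{\theta - 1}}{\Gamma(\theta)} \quad \text{(Gamma with mean 1)}$$

Then Prob[y=j|x] is negative binomial.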
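The mixture result can be checked numerically: integrating the Poisson probability against a mean-one Gamma density for exp(u) should reproduce the negative binomial probability. The sketch below uses a simple midpoint rule; the parameter values, the integration limits, and all function names are illustrative assumptions.

```python
import math

def poisson_pmf(j, lam):
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))

def gamma_mean_one_pdf(h, theta):
    """Gamma density with shape theta and rate theta, so E[h] = 1."""
    return math.exp(theta * math.log(theta) - theta * h
                    + (theta - 1.0) * math.log(h) - math.lgamma(theta))

def mixture_prob(j, lam, theta, upper=40.0, steps=40000):
    """Midpoint-rule approximation of the integral over h = exp(u)."""
    width = upper / steps
    total = 0.0
    for k in range(steps):
        h = (k + 0.5) * width
        total += poisson_pmf(j, lam * h) * gamma_mean_one_pdf(h, theta)
    return total * width

def negbin_pmf(j, lam, theta):
    """Closed-form negative binomial probability with r = lam / (theta + lam)."""
    r = lam / (theta + lam)
    return math.exp(math.lgamma(theta + j) - math.lgamma(j + 1)
                    - math.lgamma(theta)
                    + j * math.log(r) + theta * math.log(1.0 - r))

lam, theta, j = 1.5, 2.0, 3  # hypothetical values
numeric = mixture_prob(j, lam, theta)
closed_form = negbin_pmf(j, lam, theta)
```

Here `lam` plays the role of exp(β'x), so the Poisson mean conditional on the heterogeneity is `lam * h`.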
![Page 13: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/13.jpg)
Negative Binomial Regression
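$$P(y_i \mid \mathbf{x}_i) = \frac{\Gamma(\theta + y_i)}{\Gamma(y_i + 1)\,\Gamma(\theta)}\; r_i^{\,y_i} (1 - r_i)^{\theta}, \qquad r_i = \frac{\lambda_i}{\theta + \lambda_i}$$

$$\lambda_i = \exp(\boldsymbol{\beta}'\mathbf{x}_i)$$

$$E[y_i \mid \mathbf{x}_i] = \lambda_i \quad \text{(Same as Poisson)}$$

$$\text{Var}[y_i \mid \mathbf{x}_i] = \lambda_i \left[ 1 + (1/\theta)\,\lambda_i \right]; \qquad \alpha = 1/\theta = \text{Var}[\exp(u_i)]$$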
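As a quick check on the mean and variance formulas, the moments can be computed directly from the probabilities. This is a sketch with hypothetical parameter values; the truncation of the sum at 400 terms is an assumption that is harmless here because the tail is negligible.

```python
import math

def negbin_pmf(y, lam, theta):
    """NB probability with r = lam / (theta + lam), computed in logs for stability."""
    r = lam / (theta + lam)
    log_p = (math.lgamma(theta + y) - math.lgamma(y + 1) - math.lgamma(theta)
             + y * math.log(r) + theta * math.log(1.0 - r))
    return math.exp(log_p)

lam, theta = 2.0, 1.5  # hypothetical values
probs = [negbin_pmf(y, lam, theta) for y in range(400)]

total = sum(probs)
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
implied_var = lam * (1.0 + lam / theta)  # lambda * [1 + (1/theta) * lambda]
```

With these values the mean equals `lam` and the variance exceeds it by `lam**2 / theta`, which is the overdispersion term.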
![Page 14: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/14.jpg)
NegBin Model for Doctor Visits
![Page 15: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/15.jpg)
Poisson (log)Normal Mixture
![Page 16: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/16.jpg)
![Page 17: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/17.jpg)
Negative Binomial Specification
• Prob(Yi=j|xi) has greater mass to the right and left of the mean
• Conditional mean function is the same as the Poisson: E[yi|xi] = λi = exp(β’xi), so marginal effects have the same form.
• Variance is Var[yi|xi] = λi(1 + α λi), α is the overdispersion parameter; α = 0 reverts to the Poisson.
• Poisson is consistent when NegBin is appropriate. Therefore, this is a case for the ROBUST covariance matrix estimator. (Neglected heterogeneity that is uncorrelated with xi.)
![Page 18: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/18.jpg)
Testing for Overdispersion
Regression based test: Regress (y − mean)² on mean: Slope should = 1.
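One way to read the regression-based test is as a regression through the origin of the squared residuals on the fitted means. The sketch below assumes that no-constant form (the slide does not say whether a constant is included), and the data are contrived so that the squared residual is exactly twice the mean.

```python
def dispersion_slope(y, fitted_mean):
    """OLS slope (no constant) from regressing (y - mean)^2 on mean."""
    num = sum(m * (yi - m) ** 2 for yi, m in zip(y, fitted_mean))
    den = sum(m * m for m in fitted_mean)
    return num / den

# Contrived example: (y - mean)^2 = 2 * mean for every observation
fitted_mean = [2.0, 8.0]
y = [4.0, 12.0]

slope = dispersion_slope(y, fitted_mean)
```

Under equidispersion the slope should be close to 1; a slope well above 1, as in this contrived case, points toward overdispersion. In practice the fitted means would come from the estimated Poisson model and the slope would be judged against its standard error.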
![Page 19: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/19.jpg)
Wald Test for Overdispersion
![Page 20: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/20.jpg)
Partial Effects Should Be the Same
![Page 21: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/21.jpg)
![Page 22: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/22.jpg)
![Page 23: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/23.jpg)
Model Formulations for Negative Binomial
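Poisson

$$\text{Prob}[Y_i = y_i \mid \mathbf{x}_i] = \frac{\exp(-\lambda_i)\,\lambda_i^{\,y_i}}{\Gamma(1 + y_i)}, \qquad \lambda_i = \exp(\boldsymbol{\beta}'\mathbf{x}_i), \quad y_i = 0, 1, \ldots, \quad i = 1, \ldots, N$$

$$E[y_i \mid \mathbf{x}_i] = \text{Var}[y_i \mid \mathbf{x}_i] = \lambda_i$$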
![Page 24: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/24.jpg)
NegBin-1 Model
![Page 25: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/25.jpg)
NegBin-P Model
NB-2 NB-1 Poisson
![Page 26: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/26.jpg)
Censoring and Truncation in Count Models
• Observations > 10 seem to come from a different process. What to do with them?
• Censored Poisson: Treat any observation > 10 as 10.
• Truncated Poisson: Examine the distribution only with observations less than or equal to 10.
  • Intensity equation in hurdle models
  • On site counts for recreation usage.

Censoring and truncation both change the model. Adjust the distribution (log likelihood) to account for the censoring or truncation.
![Page 27: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/27.jpg)
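Log Likelihoods

$$\text{Ignore Large Values: } \text{Prob}(y) = \frac{\exp(-\lambda)\,\lambda^{y}}{\Gamma(y + 1)}$$

$$\text{Discard Large Values: } \text{Prob} = \mathbf{1}[y \le C]\, \frac{\exp(-\lambda)\,\lambda^{y}}{\Gamma(y + 1)}$$

$$\text{Censor Large Values: } \text{Prob} = \mathbf{1}[y \le C]\, \frac{\exp(-\lambda)\,\lambda^{y}}{\Gamma(y + 1)} + \mathbf{1}[y > C] \left[ 1 - \sum_{j=0}^{C} \frac{\exp(-\lambda)\,\lambda^{j}}{\Gamma(j + 1)} \right]$$

$$\text{Truncate Large Values: } \text{Prob} = \mathbf{1}[y \le C]\, \frac{\exp(-\lambda)\,\lambda^{y} / \Gamma(y + 1)}{\sum_{j=0}^{C} \exp(-\lambda)\,\lambda^{j} / \Gamma(j + 1)}$$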
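The censored and truncated probabilities can be sketched as follows. The convention assumed here is that counts up to C are observed exactly and anything above C is only known to exceed C; the function names and parameter values are hypothetical.

```python
import math

def poisson_pmf(y, lam):
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def poisson_cdf(c, lam):
    return sum(poisson_pmf(j, lam) for j in range(c + 1))

def censored_prob(y, lam, c):
    """Exact probability for y <= c; upper-tail probability for any y > c."""
    if y <= c:
        return poisson_pmf(y, lam)
    return 1.0 - poisson_cdf(c, lam)

def truncated_prob(y, lam, c):
    """Probability conditional on y <= c; values above c are not in the sample."""
    if y > c:
        raise ValueError("y exceeds the truncation point")
    return poisson_pmf(y, lam) / poisson_cdf(c, lam)

lam, c = 4.0, 10  # hypothetical mean and limit
# Censored: exact cells 0..c plus one 'above c' cell exhaust the probability
censored_total = (sum(censored_prob(y, lam, c) for y in range(c + 1))
                  + censored_prob(c + 1, lam, c))
# Truncated: the rescaled cells 0..c sum to one by construction
truncated_total = sum(truncated_prob(y, lam, c) for y in range(c + 1))
```

The log likelihood for either model is the sum of the logs of these probabilities over the sample, with `lam` replaced by exp(β'x_i) for each observation.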
![Page 28: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/28.jpg)
![Page 29: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/29.jpg)
Effect of Specification on Partial Effects
![Page 30: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/30.jpg)
Two Part Models
![Page 31: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/31.jpg)
Zero Inflation?
![Page 32: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/32.jpg)
Zero Inflation – ZIP Models
• Two regimes: (Recreation site visits)
  • Zero (with probability 1). (Never visit site)
  • Poisson with Pr(0) = exp[-λi], λi = exp[β’xi]. (Number of visits, including zero visits this season.)
• Unconditional:
  • Pr[0] = P(regime 0) + P(regime 1)*Pr[0|regime 1]
  • Pr[j] = P(regime 1)*Pr[j|regime 1] for j > 0
• This is a “latent class model”
![Page 33: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/33.jpg)
Zero Inflation Models
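Zero Inflation = ZIP

$$\text{Prob}(y_i = j \mid \mathbf{x}_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{j}}{j!}, \qquad \lambda_i = \exp(\boldsymbol{\beta}'\mathbf{x}_i)$$

$$\text{Prob}(0 \text{ regime}) = F(\boldsymbol{\gamma}'\mathbf{z}_i)$$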
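Combining the two regimes gives the unconditional ZIP probabilities. The sketch below assumes a logistic form for F, which is one common choice and not something the slide fixes; the index values and function names are hypothetical.

```python
import math

def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))

def zip_pmf(j, lam, q):
    """q = P(zero regime); the Poisson regime has probability 1 - q."""
    pois = math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))
    return (q if j == 0 else 0.0) + (1.0 - q) * pois

gamma_z = -0.8  # hypothetical value of gamma'z
beta_x = 0.9    # hypothetical value of beta'x
q = logistic(gamma_z)
lam = math.exp(beta_x)

probs = [zip_pmf(j, lam, q) for j in range(60)]
total = sum(probs)
mean = sum(j * p for j, p in enumerate(probs))  # equals (1 - q) * lam
q_at_zero_index = logistic(0.0)                 # gamma'z = 0 gives 1/2
```

The last line illustrates why setting the inflation index to zero does not collapse the model to the Poisson: with a symmetric F such as the logistic, the zero regime still has probability one half.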
![Page 34: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/34.jpg)
Notes on Zero Inflation Models
• Poisson is not nested in ZIP. γ = 0 in ZIP does not produce Poisson; it produces ZIP with P(regime 0) = ½.
  • Standard tests are not appropriate
  • Use Vuong statistic. ZIP model almost always wins.
• Zero Inflation models extend to NB models – ZINB(tau) and ZINB are standard models
  • Creates two sources of overdispersion
  • Generally difficult to estimate
![Page 35: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/35.jpg)
An Unidentified ZINB Model
![Page 36: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/36.jpg)
![Page 37: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/37.jpg)
![Page 38: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/38.jpg)
Partial Effects for Different Models
![Page 39: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/39.jpg)
The Vuong Statistic for Nonnested Models
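$$\text{Model 0: } \log L_{i,0} = \log f_0(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}_0) = m_{i,0}$$

Model 0 is the Zero Inflation Model

$$\text{Model 1: } \log L_{i,1} = \log f_1(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}_1) = m_{i,1}$$

Model 1 is the Poisson model

(Not nested. γ = 0 implies the splitting probability is 1/2, not 1)

$$\text{Define } a_i = m_{i,0} - m_{i,1} = \log \frac{f_0(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}_0)}{f_1(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}_1)}$$

$$V = \frac{\bar{a}}{s_a / \sqrt{n}} = \frac{\sqrt{n} \left[ \dfrac{1}{n} \sum_{i=1}^{n} \log \dfrac{f_0(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}_0)}{f_1(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}_1)} \right]}{\sqrt{\dfrac{1}{n-1} \sum_{i=1}^{n} \left[ \log \dfrac{f_0(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}_0)}{f_1(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}_1)} - \bar{a} \right]^2}}$$

Limiting distribution is standard normal. Large + favors model 0, large − favors model 1, −1.96 < V < 1.96 is inconclusive.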
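Given the per-observation log likelihood contributions from the two fitted models, the statistic is a few lines of arithmetic. A sketch, with made-up contributions purely to show the calculation (the function name `vuong_statistic` is an assumption):

```python
import math

def vuong_statistic(m0, m1):
    """V = sqrt(n) * mean(a) / sd(a), where a_i = m_{i,0} - m_{i,1}."""
    a = [u - v for u, v in zip(m0, m1)]
    n = len(a)
    a_bar = sum(a) / n
    s_a = math.sqrt(sum((ai - a_bar) ** 2 for ai in a) / (n - 1))
    return math.sqrt(n) * a_bar / s_a

# Made-up log-density contributions for three observations
m0 = [-1.0, -1.0, -1.0]  # model 0 (e.g., zero inflated model)
m1 = [-2.0, -3.0, -4.0]  # model 1 (e.g., Poisson)

V = vuong_statistic(m0, m1)
```

Here the differences are 1, 2 and 3, so the mean is 2, the standard deviation is 1, and V is 2√3, which would favor model 0 at the usual 1.96 threshold. A real application would use many more observations, since the normal limit is an asymptotic result.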
![Page 40: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/40.jpg)
![Page 41: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/41.jpg)
A Hurdle Model
• Two part model:
  • Model 1: Probability model for more than zero occurrences
  • Model 2: Model for number of occurrences given that the number is greater than zero.
• Applications common in health economics
  • Usage of health care facilities
  • Use of drugs, alcohol, etc.
![Page 42: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/42.jpg)
![Page 43: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/43.jpg)
Hurdle Model
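Two Part Model

$$\text{Prob}[y > 0] = F(\boldsymbol{\gamma}'\mathbf{x})$$

$$\text{Prob}[y = j \mid y > 0] = \frac{\text{Prob}[y = j]}{\text{Prob}[y > 0]} = \frac{\text{Prob}[y = j]}{1 - \text{Prob}[y = 0 \mid \mathbf{x}]}$$

A Poisson Hurdle Model with Logit Hurdle

$$\text{Prob}[y > 0] = \frac{\exp(\boldsymbol{\gamma}'\mathbf{x})}{1 + \exp(\boldsymbol{\gamma}'\mathbf{x})}$$

$$\text{Prob}[y = j \mid y > 0, \mathbf{x}] = \frac{\exp(-\lambda)\,\lambda^{j}}{j!\,[1 - \exp(-\lambda)]}, \qquad \lambda = \exp(\boldsymbol{\beta}'\mathbf{x})$$

$$E[y \mid \mathbf{x}] = 0 \times \text{Prob}[y = 0] + \text{Prob}[y > 0] \times E[y \mid y > 0] = \frac{F(\boldsymbol{\gamma}'\mathbf{x}) \exp(\boldsymbol{\beta}'\mathbf{x})}{1 - \exp[-\exp(\boldsymbol{\beta}'\mathbf{x})]}$$

Marginal effects involve both parts of the model.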
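The hurdle probabilities and the conditional mean can be sketched directly from the two parts of the model. The index values below are hypothetical, and the function names are assumptions introduced for illustration.

```python
import math

def hurdle_pmf(j, gamma_x, beta_x):
    """Logit hurdle for y > 0, zero-truncated Poisson for the positive counts."""
    p_pos = math.exp(gamma_x) / (1.0 + math.exp(gamma_x))
    if j == 0:
        return 1.0 - p_pos
    lam = math.exp(beta_x)
    log_pois = -lam + j * math.log(lam) - math.lgamma(j + 1)
    return p_pos * math.exp(log_pois) / (1.0 - math.exp(-lam))

def hurdle_mean(gamma_x, beta_x):
    """E[y|x] = F(gamma'x) * lambda / (1 - exp(-lambda))."""
    p_pos = math.exp(gamma_x) / (1.0 + math.exp(gamma_x))
    lam = math.exp(beta_x)
    return p_pos * lam / (1.0 - math.exp(-lam))

gamma_x, beta_x = 0.4, 0.7  # hypothetical values of gamma'x and beta'x
probs = [hurdle_pmf(j, gamma_x, beta_x) for j in range(80)]
total = sum(probs)
mean_by_sum = sum(j * p for j, p in enumerate(probs))
mean_by_formula = hurdle_mean(gamma_x, beta_x)
```

Because both `gamma_x` and `beta_x` enter the mean, a change in a regressor that appears in both equations moves E[y|x] through the participation probability and through the truncated Poisson mean.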
![Page 44: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/44.jpg)
Hurdle Model for Doctor Visits
![Page 45: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/45.jpg)
Partial Effects
![Page 46: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/46.jpg)
![Page 47: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/47.jpg)
![Page 48: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/48.jpg)
![Page 49: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/49.jpg)
Application of Several of the Models Discussed in this Section
![Page 50: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/50.jpg)
![Page 51: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/51.jpg)
Winkelmann finds that there is no correlation between the decisions… A significant correlation is expected … [T]he correlation comes from the way the relation between the decisions is modeled.
See also: van Ophem H. 2000. Modeling selectivity in count data models. Journal of Business and Economic Statistics 18: 503–511.
![Page 52: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/52.jpg)
Probit Participation Equation
Poisson-Normal Intensity Equation
![Page 53: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/53.jpg)
Bivariate-Normal Heterogeneity in Participation and Intensity Equations
Gaussian Copula for Participation and Intensity Equations
![Page 54: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/54.jpg)
Correlation between Heterogeneity Terms
Correlation between Counts
![Page 55: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/55.jpg)
Panel Data Models for Counts
![Page 56: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/56.jpg)
Panel Data Models
Heterogeneity: λit = exp(β’xit + ci)
• Fixed Effects
  • Poisson: Standard, no incidental parameters issue
  • NB
    • Hausman, Hall, Griliches (1984) put FE in variance, not the mean
    • Use “brute force” to get a conventional FE model
• Random Effects
  • Poisson
    • Log-gamma heterogeneity becomes an NB model
    • Contemporary treatments are using normal heterogeneity with simulation or quadrature based estimators
  • NB with random effects is equivalent to two “effects,” one time varying and one time invariant. The model is probably overspecified
• Random parameters: Mixed models, latent class models, hierarchical – all extended to Poisson and NB
![Page 57: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/57.jpg)
Random Effects
![Page 58: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/58.jpg)
A Peculiarity of the FENB Model
• ‘True’ FE model has λit = exp(αi + xit’β). Cannot be fit if there are time invariant variables.
• Hausman, Hall and Griliches (Econometrica, 1984) has αi appearing in θ.
  • Produces different results
  • Implies that the FEM can contain time invariant variables.
![Page 59: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/59.jpg)
![Page 60: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/60.jpg)
See: Allison and Waterman (2002), Guimaraes (2007)
Greene, Econometric Analysis (2011)
![Page 61: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/61.jpg)
![Page 62: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/62.jpg)
Bivariate Random Effects
![Page 63: 7. Models for Count Data, Inflation Models. Models for Count Data](https://reader034.vdocuments.us/reader034/viewer/2022051110/551af904550346f70d8b51a3/html5/thumbnails/63.jpg)