Post on 17-Dec-2015
Error Component models
Ric Scarpa
Prepared for the Choice Modelling Workshop
1st and 2nd of May
Brisbane Powerhouse, New Farm, Brisbane
Presentation structure
• The basic MNL model
• Types of Heteroskedasticity in logit models
• Structure of error components
• Estimation
• Applications in env. economics
– Flexible substitution patterns
– Choice modeling
• Future perspectives (debate)
ML – RUM Specification
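• The utility from individual i choosing alternative j is given by:
$$U_{ij} = V(x_{ij}, \beta) + \varepsilon_{ij}$$
Assume the scale parameter $\lambda = 1$ and linearity, so that
$$U_{ij} = \beta' x_{ij} + \varepsilon_{ij}, \qquad j = 1, \dots, J$$
Assume error is Gumbel
$$\varepsilon_{ij} \sim \text{iid extreme value}$$
i.e., $\varepsilon_{ij}$ has pdf and cdf, respectively, of
$$f(\varepsilon_{ij}) = \exp(-\varepsilon_{ij}) \exp[-\exp(-\varepsilon_{ij})]$$
$$F(\varepsilon_{ij}) = \exp[-\exp(-\varepsilon_{ij})]$$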
ML Choice Probabilities
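• Given the distributional assumptions and representative agent specification, then defining
$$y_{ij} = \begin{cases} 1 & \text{if } U_{ij} > U_{ik} \ \forall k \neq j \\ 0 & \text{otherwise} \end{cases}$$
we have that:
$$\begin{aligned}
P_{ij} &= \Pr(y_{ij} = 1 \mid x_i, \beta) \\
&= \Pr(U_{ij} > U_{ik} \ \forall k \neq j \mid x_i, \beta) \\
&= \Pr(\varepsilon_{ik} < \varepsilon_{ij} + V_{ij} - V_{ik} \ \forall k \neq j \mid x_i, \beta)
\end{aligned}$$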
ML Choice Probabilities (cont’d)
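$$P_{ij} \mid \varepsilon_{ij} = \Pr(\varepsilon_{ik} < \varepsilon_{ij} + V_{ij} - V_{ik} \ \forall k \neq j \mid x_i, \beta, \varepsilon_{ij})$$
Thus, we have the conditional choice probability:
$$P_{ij} \mid \varepsilon_{ij} = \prod_{k \neq j} \exp\{-\exp[-(\varepsilon_{ij} + V_{ij} - V_{ik})]\}$$
Taking the expectation of this with respect to $\varepsilon_{ij}$ yields the unconditional choice probability:
$$P_{ij} = \int_{-\infty}^{\infty} (P_{ij} \mid s)\, f(s)\, ds = \int_{-\infty}^{\infty} \prod_{k \neq j} \exp\{-\exp[-(s + V_{ij} - V_{ik})]\} \exp(-s) \exp[-\exp(-s)]\, ds$$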
ML Choice Probabilities (cont’d)
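$$\begin{aligned}
P_{ij} &= \int_{-\infty}^{\infty} \prod_{k \neq j} \exp\{-\exp[-(s + V_{ij} - V_{ik})]\} \exp(-s) \exp[-\exp(-s)]\, ds \\
&= \int_{-\infty}^{\infty} \exp\Big\{-\sum_{k} \exp[-(s + V_{ij} - V_{ik})]\Big\} \exp(-s)\, ds \\
&= \int_{-\infty}^{\infty} \exp\Big\{-\exp(-s) \sum_{k} \exp[-(V_{ij} - V_{ik})]\Big\} \exp(-s)\, ds
\end{aligned}$$
(The second line uses the fact that the $k = j$ term of the product equals $\exp[-\exp(-s)]$, so the product can run over all k.)
Consider a change of variables
$$t = \exp(-s) \quad \Rightarrow \quad dt = -\exp(-s)\, ds$$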
ML Choice Probabilities (cont’d)
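$$\begin{aligned}
P_{ij} &= \int_{-\infty}^{\infty} \exp\Big\{-\exp(-s) \sum_{k} \exp[-(V_{ij} - V_{ik})]\Big\} \exp(-s)\, ds \\
&= \int_{0}^{\infty} \exp\Big\{-t \sum_{k} \exp[-(V_{ij} - V_{ik})]\Big\}\, dt \\
&= \left. \frac{\exp\{-t \sum_{k} \exp[-(V_{ij} - V_{ik})]\}}{-\sum_{k} \exp[-(V_{ij} - V_{ik})]} \right|_{0}^{\infty} \\
&= \frac{1}{\sum_{k} \exp[-(V_{ij} - V_{ik})]} \\
&= \frac{\exp(V_{ij})}{\sum_{k} \exp(V_{ik})}
\end{aligned}$$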
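The closed form at the end of this derivation can be checked numerically. The sketch below is illustrative only: it draws standard Gumbel errors by inverting the cdf $F(\varepsilon) = \exp[-\exp(-\varepsilon)]$, counts how often each alternative has the highest utility, and compares those shares with $\exp(V_{ij}) / \sum_k \exp(V_{ik})$. The utilities in `v_example` and all function names are hypothetical and not taken from the slides.

```python
import math
import random

def mnl_probabilities(v):
    """Closed-form logit probabilities exp(V_j) / sum_k exp(V_k)."""
    v_max = max(v)  # subtract the maximum for numerical stability
    exp_v = [math.exp(vj - v_max) for vj in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]

def gumbel_draw(rng):
    """Standard Gumbel draw obtained by inverting F(e) = exp(-exp(-e))."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def simulated_shares(v, n_draws=100_000, seed=0):
    """Share of draws in which each alternative has the highest utility."""
    rng = random.Random(seed)
    counts = [0] * len(v)
    for _ in range(n_draws):
        utilities = [vj + gumbel_draw(rng) for vj in v]
        counts[utilities.index(max(utilities))] += 1
    return [c / n_draws for c in counts]

v_example = [0.5, 0.0, -1.0]  # hypothetical representative utilities
closed_form = mnl_probabilities(v_example)
simulated = simulated_shares(v_example)
```

With enough draws the two lists should agree to about two decimal places, which is what the derivation predicts.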
Merits of ML Specification
• The log-likelihood function is globally concave in its parameters (McFadden, 1973)
• Choice probabilities lie strictly within the unit interval and sum to one
• The log-likelihood function has a relatively simple form
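$$\begin{aligned}
L(\beta; y, x) &= \sum_{i=1}^{n} \sum_{j=1}^{J} y_{ij} \ln P_{ij} \\
&= -\sum_{i=1}^{n} \sum_{j=1}^{J} y_{ij} \ln \Big\{ \sum_{k=1}^{J} \exp[-(V_{ij} - V_{ik})] \Big\}
\end{aligned}$$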
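As a minimal sketch of how this log-likelihood could be evaluated, the function below assumes linear utilities $V_{ij} = \beta' x_{ij}$ and stores each individual's choice as the index of the chosen alternative rather than as the indicator $y_{ij}$. The data in `x_demo` and `y_demo` are made up for illustration.

```python
import math

def mnl_log_likelihood(beta, x, y):
    """Sum over individuals of ln P_ij for the chosen alternative j.

    x[i][j] is the attribute list of alternative j for individual i,
    y[i] is the index of the alternative chosen by individual i.
    """
    ll = 0.0
    for x_i, chosen in zip(x, y):
        v = [sum(b * a for b, a in zip(beta, x_ij)) for x_ij in x_i]
        v_max = max(v)  # log-sum-exp with the maximum factored out
        log_denominator = v_max + math.log(sum(math.exp(vk - v_max) for vk in v))
        ll += v[chosen] - log_denominator
    return ll

# Hypothetical data: 2 individuals, 3 alternatives, 2 attributes
x_demo = [
    [[1.0, 0.0], [0.5, 1.0], [0.0, 2.0]],
    [[0.2, 1.0], [1.5, 0.5], [1.0, 1.0]],
]
y_demo = [0, 2]
ll_at_zero = mnl_log_likelihood([0.0, 0.0], x_demo, y_demo)
```

At $\beta = 0$ every alternative has probability $1/J$, so `ll_at_zero` equals $n \ln(1/J)$; this is a convenient sanity check before handing the function to an optimizer.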
Utility Variance in ML Specifications
• Assumes that the unobserved sources of heterogeneity are independently and identically distributed across individuals and alternatives; i.e.,
$$\operatorname{Var}(U_i \mid X_i, \beta, \lambda) = \operatorname{Var}(\varepsilon_i \mid X_i, \beta, \lambda) = \sigma^2 I_J$$
where $\sigma^2 = \dfrac{\pi^2}{6 \lambda^2}$ and $U_i = (U_{i1}, \dots, U_{iJ})'$
• Dependent on the scale parameter $\lambda$, but basically homoskedastic in most applications
• This is a problem because it leads to biased estimates if the variance of utilities actually varies in real life, which is likely
• Because the effect is multiplicative, the bias is likely to be big
Scale heteroskedasticity
…or Gumbel error heteroskedasticity
• SP/RP joint response analysis allowed for minimal heteroskedasticity (variance switch from SP to RP): $\lambda_i = \exp(\gamma \times 1_i(\text{RP}))$
• Choice complexity work introduced $\lambda_i = \exp(\gamma' z_i)$, where $z_i$ is a measure of the complexity of choice context i
• Respondent cognitive effort: $\lambda_n = \exp(\gamma' s_n)$, where $s_n$ is a measure of the cognitive ability of respondent n
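A small sketch may help show what a covariate-dependent scale does to choice probabilities. It assumes the usual scaled logit form $\exp(\lambda V_j) / \sum_k \exp(\lambda V_k)$ with Gumbel error variance $\pi^2 / (6 \lambda^2)$; the symbols $\lambda$ (scale) and $\gamma$ (scale coefficients), the function names, and the numbers are assumptions made for illustration.

```python
import math

def scale_from_covariates(gamma, z):
    """Scale lambda = exp(gamma' z); always positive by construction."""
    return math.exp(sum(g * zk for g, zk in zip(gamma, z)))

def scaled_logit(v, scale):
    """Logit probabilities when utilities are multiplied by the scale."""
    scaled = [scale * vj for vj in v]
    s_max = max(scaled)
    exp_v = [math.exp(s - s_max) for s in scaled]
    total = sum(exp_v)
    return [e / total for e in exp_v]

def gumbel_variance(scale):
    """Error variance implied by a given scale: pi^2 / (6 scale^2)."""
    return math.pi ** 2 / (6.0 * scale ** 2)

v_demo = [1.0, 0.0, -0.5]          # hypothetical representative utilities
gamma_demo = [0.7]                 # hypothetical scale coefficient
scale_sp = scale_from_covariates(gamma_demo, [0.0])  # indicator off: scale 1
scale_rp = scale_from_covariates(gamma_demo, [1.0])  # indicator on: scale exp(0.7)
probs_sp = scaled_logit(v_demo, scale_sp)
probs_rp = scaled_logit(v_demo, scale_rp)
```

A larger scale means a smaller error variance, so the probabilities in `probs_rp` are more concentrated on the best alternative than those in `probs_sp`.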
Scale Het. limitations
• While scale heteroskedasticity allows the treatment of heteroskedasticity in the choice-respondent context, it does not allow heteroskedasticity across utilities in the same choice context
• People may inherently associate more utility variance with less familiar alternatives (e.g. unknown destinations, hypothetical alternatives) than with better known ones (e.g. frequently attended sites, status quo option)
Mixed logit
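• The mixed logit model is defined as any model whose choice probabilities can be expressed as
$$P_{ij}(\theta) = \int L_{ij}(\beta)\, f(\beta \mid \theta)\, d\beta$$
where $L_{ij}(\beta)$ is a logit choice probability; i.e.,
$$L_{ij}(\beta) = \frac{\exp[V_{ij}(\beta)]}{\sum_{k=1}^{J} \exp[V_{ik}(\beta)]}$$
and $f(\beta \mid \theta)$ is the density function for $\beta$, with underlying parameters $\theta$
$V_{ij}(\beta)$ denotes the representative utility function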
Special Cases
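• Case #1: MNL results if the density function $f(\beta \mid \theta)$ is degenerate; i.e.,
$$f(\beta \mid \theta) = \begin{cases} 1 & \beta = b \\ 0 & \beta \neq b \end{cases}$$
$$P_{ij}(b) = L_{ij}(b) = \frac{\exp[V_{ij}(b)]}{\sum_{k=1}^{J} \exp[V_{ik}(b)]}$$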
Special Cases
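• Case #2: Finite mixture logit model results if the density function $f(\beta \mid \theta)$ is discrete; i.e.,
$$f(\beta \mid \theta) = \begin{cases} s_m & \beta = b_m;\ m = 1, \dots, M \\ 0 & \text{otherwise} \end{cases}$$
$$P_{ij} = \sum_{m=1}^{M} s_m L_{ij}(b_m) = \sum_{m=1}^{M} s_m \frac{\exp[V_{ij}(b_m)]}{\sum_{k=1}^{J} \exp[V_{ik}(b_m)]}$$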
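The two special cases can be sketched in a few lines. The example below assumes linear utilities $V_{ij}(b_m) = b_m' x_{ij}$; the class shares, class coefficients, and attribute values are hypothetical. With a single class of share 1 the function reduces to the MNL of Case #1.

```python
import math

def logit(v):
    """Logit probabilities for a list of representative utilities."""
    v_max = max(v)
    exp_v = [math.exp(vj - v_max) for vj in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]

def finite_mixture_logit(shares, betas, x):
    """P_j = sum_m s_m * L_j(b_m) for class shares s_m and coefficients b_m."""
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("class shares must sum to one")
    probs = [0.0] * len(x)
    for s_m, b_m in zip(shares, betas):
        v = [sum(b * a for b, a in zip(b_m, x_j)) for x_j in x]
        for j, l_j in enumerate(logit(v)):
            probs[j] += s_m * l_j
    return probs

x_demo = [[1.0, 2.0], [0.5, 1.0], [0.0, 0.0]]   # hypothetical attributes
two_class = finite_mixture_logit([0.6, 0.4], [[1.0, -0.5], [-1.0, 0.8]], x_demo)
one_class = finite_mixture_logit([1.0], [[1.0, -0.5]], x_demo)  # Case #1
```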
Notes on Mixed Logit (MXL)
• Train emphasizes two interpretations of the MXL model
– Random parameters (variation of taste intensities)
– Error components (heteroskedastic utilities)
• Mixed logit probabilities are simply weighted averages of logit probabilities, with weights given by $f(\beta \mid \theta)$
• The goal of the research is to estimate the underlying parameter vector $\theta$
Simulation Estimation
• Simulation methods are typically used to estimate mixed logit models
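• Recall that the choice probabilities are given by
$$P_{ij}(\theta) = \int L_{ij}(\beta)\, f(\beta \mid \theta)\, d\beta$$
where
$$L_{ij}(\beta) = \frac{\exp[V_{ij}(\beta)]}{\sum_{k=1}^{J} \exp[V_{ik}(\beta)]}$$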
Simulation Estimation(cont’d)
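• For any given value of $\theta$, one can generate draws $\beta_i^r$, $r = 1, \dots, R$, from $f(\beta \mid \theta)$,
which can then be used to compute
$$P_{ij}^{R} = \frac{1}{R} \sum_{r=1}^{R} L_{ij}(\beta_i^r) = \frac{1}{R} \sum_{r=1}^{R} \frac{\exp[V_{ij}(\beta_i^r)]}{\sum_{k=1}^{J} \exp[V_{ik}(\beta_i^r)]}$$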
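The simulator described on this slide amounts to averaging logit probabilities over draws. The sketch below assumes, purely for illustration, that $f(\beta \mid \theta)$ is a set of independent normals with means `mean` and standard deviations `sd`, and that utilities are linear in attributes; neither assumption comes from the slides.

```python
import math
import random

def logit(v):
    """Logit probabilities for a list of representative utilities."""
    v_max = max(v)
    exp_v = [math.exp(vj - v_max) for vj in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]

def simulated_probabilities(x, mean, sd, n_draws=2000, seed=0):
    """Average L_j(beta^r) over R draws of beta from independent normals."""
    rng = random.Random(seed)
    acc = [0.0] * len(x)
    for _ in range(n_draws):
        beta = [rng.gauss(m, s) for m, s in zip(mean, sd)]
        v = [sum(b * a for b, a in zip(beta, x_j)) for x_j in x]
        for j, l_j in enumerate(logit(v)):
            acc[j] += l_j
    return [a / n_draws for a in acc]

x_demo = [[1.0, 2.0], [0.5, 1.0], [0.0, 0.0]]       # hypothetical attributes
p_mixed = simulated_probabilities(x_demo, [1.0, -0.5], [0.8, 0.3])
p_degenerate = simulated_probabilities(x_demo, [1.0, -0.5], [0.0, 0.0])
```

Setting every standard deviation to zero collapses the mixing distribution to a point, so `p_degenerate` coincides with the plain logit probabilities at the mean coefficients.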
Simulation Estimation
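$$L^{R}(\theta) = \sum_{i=1}^{N} \ln P_{ij}^{R}$$
where j denotes the alternative chosen by individual i
• The simulated log-likelihood for the panel of T choices becomes:
$$L^{R}(\theta) = \sum_{i=1}^{N} \ln \left\{ \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \frac{\exp[V_{ijt}(\beta_i^r)]}{\sum_{k=1}^{J} \exp[V_{ikt}(\beta_i^r)]} \right\}$$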
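A sketch of the panel version follows. The key detail is that the product over an individual's T choices is taken inside the average over draws, because the same draw of $\beta_i$ applies to the whole sequence. As before, independent normal coefficients and linear utilities are illustrative assumptions, and the data layout (`panels` as a list of per-individual lists of `(attributes, chosen_index)` pairs) is hypothetical.

```python
import math
import random

def logit(v):
    """Logit probabilities for a list of representative utilities."""
    v_max = max(v)
    exp_v = [math.exp(vj - v_max) for vj in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]

def panel_simulated_log_likelihood(mean, sd, panels, n_draws=500, seed=0):
    """Sum over individuals of ln[(1/R) sum_r prod_t L_ijt(beta_i^r)]."""
    rng = random.Random(seed)  # fixed seed keeps draws identical across calls
    total = 0.0
    for situations in panels:
        average = 0.0
        for _ in range(n_draws):
            beta = [rng.gauss(m, s) for m, s in zip(mean, sd)]
            sequence_prob = 1.0
            for x_t, chosen in situations:
                v = [sum(b * a for b, a in zip(beta, x_j)) for x_j in x_t]
                sequence_prob *= logit(v)[chosen]
            average += sequence_prob
        total += math.log(average / n_draws)
    return total

# Hypothetical panel: 2 individuals, 2 choice situations each, 2 alternatives
panels_demo = [
    [([[1.0, 0.0], [0.0, 1.0]], 0), ([[0.5, 0.5], [1.0, 0.2]], 1)],
    [([[0.2, 1.0], [0.8, 0.1]], 1), ([[1.0, 1.0], [0.0, 0.0]], 0)],
]
sll = panel_simulated_log_likelihood([0.5, -0.2], [0.4, 0.6], panels_demo)
sll_no_mixing = panel_simulated_log_likelihood([0.5, -0.2], [0.0, 0.0], panels_demo)
```

Reusing the same seed on every call keeps the underlying random numbers fixed while $\theta$ changes, which is the usual way to keep a simulated objective smooth for an optimizer.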
Error Components Interpretation
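• The mixed logit model is generated in the RUM model by assuming that
$$U_{ij} = V_{ij}(x_{ij}, \beta) + \eta_{ij}$$
where
$$\eta_{ij} = \mu_i' z_{ij} + \varepsilon_{ij}$$
with $x_{ij}$ and $z_{ij}$ both observed,
$$\varepsilon_{ij} \sim \text{iid EV}$$
and $E(\mu_i) = 0$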
Error Components Interpretation(cont’d)
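• The error components perspective views the additional random terms as tools for inducing specific patterns of correlation across alternatives.
$$\operatorname{Cov}(U_{ij}, U_{ik}) = E[(\mu_i' z_{ij} + \varepsilon_{ij})(\mu_i' z_{ik} + \varepsilon_{ik})] = z_{ij}' \Omega z_{ik}, \qquad j \neq k$$
where
$$\Omega = \operatorname{Var}(\mu_i)$$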
Example – Mimicking NL
• Consider a nesting structure
Stay at home (j=0)
Take a trip
– Nest A: alternatives 1, 2
– Nest B: alternatives 3, 4
Example (cont’d)
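The corresponding correlation structure among error components (and utilities) is given by
$$\begin{pmatrix}
a & 0 & 0 & 0 & 0 \\
0 & b & c & f & f \\
0 & c & b & f & f \\
0 & f & f & d & e \\
0 & f & f & e & d
\end{pmatrix}$$
where $c \geq f$ and $e \geq f$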
Example (cont’d)
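• We can build up this covariance structure using error components
$$U_{ij} = \begin{cases}
\beta' x_{ij} + \varepsilon_{ij} & j = 0 \\
\beta' x_{ij} + \tau_i + \mu_i^{12} + \varepsilon_{ij} & j = 1, 2 \\
\beta' x_{ij} + \tau_i + \mu_i^{34} + \varepsilon_{ij} & j = 3, 4
\end{cases}$$
with
$$\varepsilon_{ij} \sim \text{iid EV}, \qquad \tau_i \sim N(0, \sigma_\tau^2), \qquad \mu_i^{12} \sim N(0, \sigma_{1,2}^2), \qquad \mu_i^{34} \sim N(0, \sigma_{3,4}^2)$$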
Example (cont’d)
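• The resulting covariance structure becomes
$$\operatorname{Var}(U_i) = \begin{pmatrix}
\frac{\pi^2}{6} & 0 & 0 & 0 & 0 \\
0 & \sigma_\tau^2 + \sigma_{1,2}^2 + \frac{\pi^2}{6} & \sigma_\tau^2 + \sigma_{1,2}^2 & \sigma_\tau^2 & \sigma_\tau^2 \\
0 & \sigma_\tau^2 + \sigma_{1,2}^2 & \sigma_\tau^2 + \sigma_{1,2}^2 + \frac{\pi^2}{6} & \sigma_\tau^2 & \sigma_\tau^2 \\
0 & \sigma_\tau^2 & \sigma_\tau^2 & \sigma_\tau^2 + \sigma_{3,4}^2 + \frac{\pi^2}{6} & \sigma_\tau^2 + \sigma_{3,4}^2 \\
0 & \sigma_\tau^2 & \sigma_\tau^2 & \sigma_\tau^2 + \sigma_{3,4}^2 & \sigma_\tau^2 + \sigma_{3,4}^2 + \frac{\pi^2}{6}
\end{pmatrix}$$
where $\sigma_\tau^2$ is the variance of the component shared by all trip alternatives, $\sigma_{1,2}^2$ and $\sigma_{3,4}^2$ are the variances of the nest-specific components, and $\pi^2/6$ is the Gumbel error variance at unit scale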
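One way to see where such a covariance matrix comes from is to write each alternative's loadings on the error components as a row of a matrix Z and compute $Z \Omega Z'$ plus the iid Gumbel variance on the diagonal. The sketch below does this for the nesting example, assuming three independent components (one shared by all trip alternatives, one per nest). The symbol names and the numerical variances are hypothetical.

```python
import math

def utility_covariance(loadings, component_variances, scale=1.0):
    """Return Z diag(variances) Z' + (pi^2 / (6 scale^2)) I as nested lists."""
    n_alt = len(loadings)
    gumbel_var = math.pi ** 2 / (6.0 * scale ** 2)
    cov = [[0.0] * n_alt for _ in range(n_alt)]
    for j in range(n_alt):
        for k in range(n_alt):
            cov[j][k] = sum(
                z_j * var * z_k
                for z_j, var, z_k in zip(loadings[j], component_variances, loadings[k])
            )
        cov[j][j] += gumbel_var  # iid extreme value error on the diagonal
    return cov

# Columns: trip component, nest (1,2) component, nest (3,4) component
nest_loadings = [
    [0, 0, 0],  # j = 0, stay at home
    [1, 1, 0],  # j = 1
    [1, 1, 0],  # j = 2
    [1, 0, 1],  # j = 3
    [1, 0, 1],  # j = 4
]
variances_demo = [1.0, 0.5, 0.25]  # hypothetical component variances
cov_demo = utility_covariance(nest_loadings, variances_demo)
```

In `cov_demo`, alternatives in the same nest share two components and alternatives in different nests share only the trip component, which reproduces the block pattern of the nested structure.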
Example (cont’d)
• One limitation of the NL model is that one has to fix the nesting structure
• MXL can be used to create overlapping nests
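$$U_{ij} = \begin{cases}
\beta' x_{i0} + \varepsilon_{i0} & j = 0 \\
\beta' x_{i1} + \tau_i + \mu_i^{12} + \mu_i^{13} + \mu_i^{14} + \varepsilon_{i1} & j = 1 \\
\beta' x_{i2} + \tau_i + \mu_i^{12} + \mu_i^{23} + \mu_i^{24} + \varepsilon_{i2} & j = 2 \\
\beta' x_{i3} + \tau_i + \mu_i^{34} + \mu_i^{13} + \mu_i^{23} + \varepsilon_{i3} & j = 3 \\
\beta' x_{i4} + \tau_i + \mu_i^{34} + \mu_i^{14} + \mu_i^{24} + \varepsilon_{i4} & j = 4
\end{cases}$$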
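With overlapping nests, every pair of trip alternatives gets its own error component. A short sketch, under the assumption that all components are independent and that one further component is shared by all trip alternatives, shows that the covariance between two distinct alternatives is then the shared variance plus the variance of their pair-specific component. All variance values and names below are hypothetical.

```python
from itertools import combinations

trip_alternatives = [1, 2, 3, 4]
pairs = list(combinations(trip_alternatives, 2))  # (1,2), (1,3), ..., (3,4)

def component_covariance(j, k, pair_variances, trip_variance):
    """Covariance of the error-component part of U_j and U_k (Gumbel part excluded)."""
    shared = sum(var for pair, var in pair_variances.items() if j in pair and k in pair)
    return trip_variance + shared

# Hypothetical variances, one per pair of alternatives
pair_variances_demo = {pair: 0.1 * (n + 1) for n, pair in enumerate(pairs)}
cov_13 = component_covariance(1, 3, pair_variances_demo, trip_variance=1.0)
var_1 = component_covariance(1, 1, pair_variances_demo, trip_variance=1.0)
```

Because each pair has a separate variance parameter, every off-diagonal covariance can differ, which is the flexibility a fixed nesting structure lacks.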
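Herriges and Phaneuf (2002) Covariance Pattern

Alternatives sharing component | Column 1 | Column 2
(not labeled) | 1.88 | 2.14
(1,2) | -- | 1.30
(1,3) | -- | 0.64
(1,4) | 1.61 | 1.09
(1,5) | -- | 0.93
(2,3) | -- | 0.58
(2,3,5) | 1.61 |
(2,4) | -- | 1.72
(2,5) | -- | 0.08
(3,4) | -- | 0.46
(3,5) | -- | 0.35
(4,5) | -- | 0.56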
Implications for Elasticity Patterns
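• In general, elasticities given by
$$\begin{aligned}
\eta_{jk}^{s}(x_i) &= \frac{\partial P_{ij}(x_i)}{\partial x_{ik}^{s}} \frac{x_{ik}^{s}}{P_{ij}(x_i)} \\
&= \int \frac{\partial L_{ij}(x_i, \beta)}{\partial x_{ik}^{s}} f(\beta \mid \theta)\, d\beta \; \frac{x_{ik}^{s}}{P_{ij}(x_i)} \\
&= \int \frac{L_{ij}(x_i, \beta)}{P_{ij}(x_i)} \left[ \frac{\partial L_{ij}(x_i, \beta)}{\partial x_{ik}^{s}} \frac{x_{ik}^{s}}{L_{ij}(x_i, \beta)} \right] f(\beta \mid \theta)\, d\beta \\
&= \int w_j(x_i, \beta)\, \eta_{jk}^{L,s}(x_i, \beta)\, f(\beta \mid \theta)\, d\beta
\end{aligned}$$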
Implications for Elasticity Patterns(cont’d)
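where
$$\eta_{jk}^{L,s}(x_i, \beta) = [\delta_{jk} - L_{ik}(x_i, \beta)]\, \beta_s x_{ik}^{s}$$
denotes the standard logit response elasticity (i.e., without nesting) conditional on a specific draw of the vector $\beta$, with $\delta_{jk} = 1$ if $j = k$ and 0 otherwise,
and
$$w_j(x_i, \beta) = \frac{L_{ij}(x_i, \beta)}{P_{ij}(x_i)}$$
denotes the relative odds that alternative j is selected (i.e., conditional versus unconditional odds)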
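The weighted-average form of the elasticity can be checked against a direct numerical derivative of the simulated probability. The sketch below assumes linear utilities, so that the conditional logit elasticity is $(\delta_{jk} - L_{ik}) \beta_s x_{ik}^s$ with $\delta_{jk} = 1$ when $j = k$, and uses a fixed set of hypothetical normal draws in place of the integral over $\beta$.

```python
import math
import random

def logit(v):
    """Logit probabilities for a list of representative utilities."""
    v_max = max(v)
    exp_v = [math.exp(vj - v_max) for vj in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]

def utilities(beta, x):
    """Linear utilities beta' x_j for every alternative j."""
    return [sum(b * a for b, a in zip(beta, x_j)) for x_j in x]

def simulated_probability(x, j, draws):
    """Average of L_j over the supplied draws of beta."""
    return sum(logit(utilities(beta, x))[j] for beta in draws) / len(draws)

def mixed_logit_elasticity(x, j, k, s, draws):
    """Average of w_j * (logit elasticity) over the draws."""
    p_j = simulated_probability(x, j, draws)
    total = 0.0
    for beta in draws:
        l = logit(utilities(beta, x))
        delta = 1.0 if j == k else 0.0
        logit_elasticity = (delta - l[k]) * beta[s] * x[k][s]
        weight = l[j] / p_j  # conditional versus unconditional probability
        total += weight * logit_elasticity
    return total / len(draws)

rng = random.Random(1)
draws_demo = [[rng.gauss(-1.0, 0.5), rng.gauss(0.8, 0.3)] for _ in range(500)]
x_demo = [[1.0, 0.5], [2.0, 1.5], [1.5, 1.0]]   # hypothetical attributes
cross_elasticity = mixed_logit_elasticity(x_demo, 0, 1, 0, draws_demo)
own_elasticity = mixed_logit_elasticity(x_demo, 1, 1, 0, draws_demo)
```

Because the weights $w_j$ differ across draws, cross-elasticities are no longer proportional across alternatives as they are in the plain logit; that is the source of the flexible substitution patterns.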
Illustration – Choice Probabilities
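Conditional logit probabilities $L(\tau_i)$ at three values of the error component, and unconditional probabilities $P_j = \int L_j\, f\, d\tau$ for two values of its standard deviation:

 | $\tau_i = -2$ | $\tau_i = 0$ | $\tau_i = 2$ | $P_j$, $\sigma_\tau = 0.1$ | $P_j$, $\sigma_\tau = 10$
$L_{i0}(\tau_i)$ | 0.65 | 0.20 | 0.03 | 0.20 | 0.45
$L_{ij}(\tau_i);\ j \neq 0$ | 0.09 | 0.20 | 0.24 | 0.20 | 0.14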
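The figures in this illustration are consistent with a simple setup that can be reproduced numerically: five alternatives with all representative utilities equal to zero, and a single normal error component (called tau here) added to the four alternatives other than j = 0. That setup is inferred from the numbers rather than stated on the slide, so the sketch below should be read as one reading of the example. The unconditional probabilities are obtained with a plain trapezoid rule.

```python
import math

N_TRIP = 4  # number of alternatives carrying the error component (assumed)

def conditional_probs(tau):
    """Return (L_i0, L_ij for j != 0) given the shared component tau."""
    l0 = 1.0 / (1.0 + N_TRIP * math.exp(tau))
    return l0, (1.0 - l0) / N_TRIP

def unconditional_probs(sigma, n_points=4001, z_max=8.0):
    """Integrate over tau ~ N(0, sigma^2) using the trapezoid rule on z = tau / sigma."""
    step = 2.0 * z_max / (n_points - 1)
    p0 = 0.0
    for i in range(n_points):
        z = -z_max + i * step
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        p0 += weight * step * density * conditional_probs(sigma * z)[0]
    return p0, (1.0 - p0) / N_TRIP

conditional_table = {tau: conditional_probs(tau) for tau in (-2.0, 0.0, 2.0)}
p_small_sigma = unconditional_probs(0.1)
p_large_sigma = unconditional_probs(10.0)
```

With a small standard deviation the unconditional probabilities stay close to the equal-share logit values, while a large one pulls probability toward the alternative without the error component.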
Choice modeling
• Error component in hypothetical alternatives, yet absent in the SQ or no alternative
The induced variance structure across utilities is:
Effect
• Fairly general result that it improves fit while requiring few additional parameters (only st. dev. of err. comp.)
• It can be decomposed by socio-economic covariates (e.g. spread of error varies across segments of respondents)