
1

Shall We Mixed Logit? Estimation stability and prediction reliability of error component mixed logit models

Shusaku NAKAI, Ryuichi KITAMURA (Kyoto University)

Toshiyuki YAMAMOTO (Nagoya University)

2

Outline

• Introduction

• Error component MXL models

– Identification issue

– Variability of parameter estimates

– Estimation of choice probabilities

• Usefulness of MNL models

• Conclusions and future research

3

Introduction

MXL models

• considered the most promising discrete choice model

• widespread applications in recent years

However

• properties of parameter estimates are not well understood

Objective

• Estimation stability and prediction reliability of error component MXL models are examined with simulated data

4

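Error component MXL models

• Examined is a trinomial MXL model

$$
\begin{aligned}
u_{1n} &= \beta_1 X_{11n} + \beta_2 X_{12n} + \mu_{1n} + \varepsilon_{1n} \\
u_{2n} &= \beta_1 X_{21n} + \beta_2 X_{22n} + \mu_{2n} + \varepsilon_{2n} \\
u_{3n} &= \beta_1 X_{31n} + \beta_2 X_{32n} + \mu_{2n} + \varepsilon_{3n}
\end{aligned}
$$

• $X$: 2 explanatory variables

• $\varepsilon$: standard iid Gumbel

• $\mu$: 2 error components, $\mu_{1n} \sim N(0, s_1^2)$, $\mu_{2n} \sim N(0, s_2^2)$

$$
\Sigma = \begin{pmatrix}
s_1^2 + \frac{\pi^2}{6} & 0 & 0 \\
0 & s_2^2 + \frac{\pi^2}{6} & s_2^2 \\
0 & s_2^2 & s_2^2 + \frac{\pi^2}{6}
\end{pmatrix}
$$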

5

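Simulated discrete choice data

Error component MXL models

Generated by a probit model

$$
\begin{aligned}
u_{1n} &= \beta_1 X_{11n} + \beta_2 X_{12n} + \xi_{1n} \\
u_{2n} &= \beta_1 X_{21n} + \beta_2 X_{22n} + \xi_{2n} \\
u_{3n} &= \beta_1 X_{31n} + \beta_2 X_{32n} + \xi_{3n}
\end{aligned}
$$

$$
\Sigma_\xi = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \rho \\ 0 & \rho & 1 \end{pmatrix},
\qquad \beta_1 = 1.0,\ \beta_2 = 0.5,
\qquad X_{jin} \sim N(0, 1)
$$

• ρ = 0.00, 0.10, 0.30, 0.50, 0.70, 0.90, 0.95, 0.99

• Each data set contains 1,000 cases

• 25 data sets are generated for each value of ρ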
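The data-generating step can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the authors' code: the function name `simulate_probit_choices`, the seeding scheme, and the use of NumPy are all choices made here, while the values of β, ρ, the 1,000 cases and the 25 replications come from the slide.

```python
import numpy as np

def simulate_probit_choices(rho, n_cases=1000, beta=(1.0, 0.5), seed=0):
    """Simulate trinomial choices from a probit model with corr(xi_2, xi_3) = rho."""
    rng = np.random.default_rng(seed)
    # X[n, j, i]: explanatory variable i of alternative j for case n, drawn from N(0, 1)
    X = rng.standard_normal((n_cases, 3, 2))
    # Error covariance: unit variances, alternatives 2 and 3 correlated by rho
    cov = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, rho],
                    [0.0, rho, 1.0]])
    xi = rng.multivariate_normal(np.zeros(3), cov, size=n_cases)
    utility = X @ np.asarray(beta) + xi   # shape (n_cases, 3)
    choice = utility.argmax(axis=1)       # 0, 1, 2 stand for alternatives 1, 2, 3
    return X, choice

# 25 data sets for each value of rho (the seeds are an arbitrary assumption)
rhos = [0.00, 0.10, 0.30, 0.50, 0.70, 0.90, 0.95, 0.99]
datasets = {rho: [simulate_probit_choices(rho, seed=1000 * k + r) for r in range(25)]
            for k, rho in enumerate(rhos)}
```

Each entry of `datasets[rho]` is an `(X, choice)` pair that could then be passed to whatever MXL or MNL estimator is in use.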

6

Identification issue

For trinomial probit models, Dansie (1985) suggests

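• 3 matrices are equivalent, and produce the same likelihood value, thus $\Sigma_A$ is not estimable

$$
\Sigma_A = \begin{pmatrix} \sigma_{11} & 0 & 0 \\ 0 & 1 & \sigma_{23} \\ 0 & \sigma_{23} & 1 \end{pmatrix},
\quad
\Sigma_B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \sigma'_{23} \\ 0 & \sigma'_{23} & 1 \end{pmatrix},
\quad
\Sigma_C = \begin{pmatrix} \sigma''_{11} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
$$

• Model estimation would not be able to indicate which is most likely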
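One way to see why the three structures cannot be told apart is to compare the covariance of utility differences after fixing the scale. The sketch below assumes that equivalence is judged this way (the usual identification argument for probit models, not necessarily Dansie's own derivation). The numeric values of σ11 and σ23 are arbitrary illustrations, and the helper `normalized_difference_cov` is a name introduced here.

```python
import numpy as np

def normalized_difference_cov(sigma):
    """Covariance of (u2 - u1, u3 - u1), scaled so its first element is 1."""
    D = np.array([[-1.0, 1.0, 0.0],
                  [-1.0, 0.0, 1.0]])
    C = D @ sigma @ D.T
    return C / C[0, 0]

# Illustrative values (assumptions): Sigma_A with sigma_11 = 2.0, sigma_23 = 0.5
s11, s23 = 2.0, 0.5
r = (s11 + s23) / (1.0 + s11)   # correlation of the two utility differences

sigma_A = np.array([[s11, 0, 0], [0, 1, s23], [0, s23, 1]], dtype=float)
s23_b = 2.0 * r - 1.0           # sigma'_23 giving the same r under Sigma_B
sigma_B = np.array([[1, 0, 0], [0, 1, s23_b], [0, s23_b, 1]], dtype=float)
s11_c = r / (1.0 - r)           # sigma''_11 giving the same r under Sigma_C
sigma_C = np.array([[s11_c, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)

for name, sigma in [("A", sigma_A), ("B", sigma_B), ("C", sigma_C)]:
    print(name, normalized_difference_cov(sigma).round(4))
```

All three matrices yield the same normalized difference covariance, so the two free parameters of $\Sigma_A$ collapse into a single identifiable quantity.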

7

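Identification issue (cont.)

For GEV models, Börsch-Supan (1990) and Munizaga et al. (2000)

• estimated models in the case of 4 alternatives

• and found that nested logit models have some capacity to accommodate heteroscedasticity

$$
\Sigma_A = \begin{pmatrix} \sigma_{11} & 0 & 0 \\ 0 & 1 & \sigma_{23} \\ 0 & \sigma_{23} & 1 \end{pmatrix},
\quad
\Sigma_B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \sigma'_{23} \\ 0 & \sigma'_{23} & 1 \end{pmatrix},
\quad
\Sigma_C = \begin{pmatrix} \sigma''_{11} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
$$

$\Sigma_B$: Nested logit model; $\Sigma_C$: HEV model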

8

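Identification issue (cont.)

• In this study, data sets are simulated by $\Sigma_B$

$$
\Sigma_A = \begin{pmatrix} \sigma_{11} & 0 & 0 \\ 0 & 1 & \sigma_{23} \\ 0 & \sigma_{23} & 1 \end{pmatrix},
\quad
\Sigma_B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \sigma'_{23} \\ 0 & \sigma'_{23} & 1 \end{pmatrix},
\quad
\Sigma_C = \begin{pmatrix} \sigma''_{11} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
$$

• Error component MXL model examined in this study is consistent with $\Sigma_A$

$$
\Sigma = \begin{pmatrix}
s_1^2 + \frac{\pi^2}{6} & 0 & 0 \\
0 & s_2^2 + \frac{\pi^2}{6} & s_2^2 \\
0 & s_2^2 & s_2^2 + \frac{\pi^2}{6}
\end{pmatrix}
$$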

9

Identification issue (cont.)

• Standard deviation becomes extremely large, implying that the covariance structure is unidentified

• MXL model is subject to the same identification problem as the probit model (consistent with Walker et al. (2007))


[Figures: parameter estimates of $s_1^2$ and $s_2^2$ (log scale, 0.1 to 1000) against ρ (0.00 to 1.00)]

10

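Identification issue (cont.)

$$
\Sigma_A = \begin{pmatrix} \sigma_{11} & 0 & 0 \\ 0 & 1 & \sigma_{23} \\ 0 & \sigma_{23} & 1 \end{pmatrix},
\quad
\Sigma_B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \sigma'_{23} \\ 0 & \sigma'_{23} & 1 \end{pmatrix},
\quad
\Sigma_C = \begin{pmatrix} \sigma''_{11} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
$$

$$
\Sigma = \begin{pmatrix}
s_1^2 + \frac{\pi^2}{6} & 0 & 0 \\
0 & s_2^2 + \frac{\pi^2}{6} & s_2^2 \\
0 & s_2^2 & s_2^2 + \frac{\pi^2}{6}
\end{pmatrix}
$$

• Hereafter, we constrain $s^2 = s_1^2 = s_2^2$

$$
\Sigma = \left(s^2 + \frac{\pi^2}{6}\right)
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \rho \\ 0 & \rho & 1 \end{pmatrix},
\qquad
\rho = \frac{s^2}{s^2 + \frac{\pi^2}{6}}
$$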

Variability of parameter estimates

Error component MXL models

• Parameter estimates are quite unstable, especially for the case with higher ρ

[Figure: estimates of $\beta_1$ (0 to 10) against error correlation coefficient ρ (0.00 to 1.00); true value $\beta_1 = 1.0$]

12

Variability of parameter estimates (cont.)

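• This instability is caused by the dependence of coefficient estimates on error variance

• Error variance is not standardized in MXL model

• Parameter estimates need to be normalized

Probit model

$$
\Sigma_\xi = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \rho \\ 0 & \rho & 1 \end{pmatrix}
$$

Error component MXL model

$$
\Sigma = \left(s^2 + \frac{\pi^2}{6}\right)
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & \rho \\ 0 & \rho & 1 \end{pmatrix},
\qquad
\rho = \frac{s^2}{s^2 + \frac{\pi^2}{6}}
$$

Normalization

$$
\tilde{\beta}_j = \frac{1}{\sqrt{\hat{s}^2 + \frac{\pi^2}{6}}}\, \hat{\beta}_j
$$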
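The relation between ρ and s² and the normalization of the coefficients can be written as small helper functions. This is only a sketch: the function names are introduced here, and the numeric estimates passed in at the end are hypothetical values for illustration, not results from the study.

```python
import math

GUMBEL_VAR = math.pi ** 2 / 6   # variance of a standard Gumbel error

def rho_from_s2(s2):
    """Error correlation implied by the error-component variance s^2."""
    return s2 / (s2 + GUMBEL_VAR)

def s2_from_rho(rho):
    """Inverse relation: s^2 that yields correlation rho (requires rho < 1)."""
    return rho * GUMBEL_VAR / (1.0 - rho)

def normalize_beta(beta_hat, s2_hat):
    """Rescale an MXL coefficient to the unit-variance scale of the probit model."""
    return beta_hat / math.sqrt(s2_hat + GUMBEL_VAR)

# Hypothetical estimates, for illustration only
print(normalize_beta(beta_hat=1.9, s2_hat=1.5))
print(rho_from_s2(1.5))
```

Because the divisor grows with the estimated variance, a large raw coefficient paired with a large estimated variance can still map to a normalized value near the true one.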

13

• After normalization, utility coefficients are unbiased and stable

[Figure: normalized estimates $\tilde{\beta}_1$ (0 to 1.4) against ρ (0.00 to 1.00); true value $\beta_1 = 1.0$]

14

• Estimated variances of the error components tend to be biased upward

[Figure: estimates of $s^2$ (log scale, 0.1 to 1000) against ρ (0.00 to 1.00), with the true value marked]

15



• Biases in estimated variances might be related to the difference in shape between the Normal and Gumbel distributions

• Amemiya (1981) suggests that, in the binary case, $N(0, 1.6^2)$ fits $L(0, \pi^2/3)$ better than $N(0, \pi^2/3)$ does, though the latter has the same variance as $L(0, \pi^2/3)$ ($1.6 < \pi/\sqrt{3} \approx 1.8$)

[Figure: cumulative distribution functions, plotted over −3 to 3]
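The comparison attributed to Amemiya (1981) can be checked numerically. The sketch below measures fit by the largest absolute gap between the CDFs on a grid over −3 to 3. That criterion, the grid, and the function names are assumptions made here, and other fit measures could rank the two approximations differently.

```python
import math

def logistic_cdf(x):
    """CDF of the standard logistic distribution (variance pi^2 / 3)."""
    return 1.0 / (1.0 + math.exp(-x))

def normal_cdf(x, sd):
    """CDF of N(0, sd^2)."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))

def max_abs_gap(sd, lo=-3.0, hi=3.0, steps=601):
    """Largest |logistic - normal| CDF gap on an evenly spaced grid."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(abs(logistic_cdf(x) - normal_cdf(x, sd)) for x in grid)

gap_16 = max_abs_gap(1.6)                               # N(0, 1.6^2)
gap_equal_var = max_abs_gap(math.pi / math.sqrt(3.0))   # N(0, pi^2 / 3)
print(f"sd = 1.6: {gap_16:.4f}, sd = pi/sqrt(3): {gap_equal_var:.4f}")
```

Under this criterion the normal with standard deviation 1.6 tracks the logistic more closely than the equal-variance normal does.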

16

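Estimation of choice probabilities

Error component MXL models

• Choice probabilities are calculated by

$$
\hat{P}_n(i) = \iint
\frac{\exp\left(\hat{\beta}_1 X_{i1n} + \hat{\beta}_2 X_{i2n} + \hat{s}\,\eta_i\right)}
{\sum_j \exp\left(\hat{\beta}_1 X_{j1n} + \hat{\beta}_2 X_{j2n} + \hat{s}\,\eta_j\right)}
\, df(\eta_1)\, df(\eta_2),
\qquad
\eta_i, \eta_j =
\begin{cases}
\eta_1 & \text{if } i, j = 1 \\
\eta_2 & \text{if } i, j = 2 \text{ or } 3
\end{cases}
$$

for the case $X_{11} = X_{12} = X_{21} = X_{22} = X_{31} = X_{32} = 1.0$

• The effect of a biased estimate of $s$ is examined by introducing $q$ and calculating $P(i \mid q)$, an integral of the same form whose exponents involve $q$, $\beta_1 X_{i1}$, $\beta_2 X_{i2}$, $s$, $\pi^2/6$ and $\eta_i$ (the exact expression is not legible in the extracted slide text)

• True probability is obtained when q ≈ 1.29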
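The double integral for the estimated choice probability has no closed form, so it is usually approximated by simulation. The sketch below averages logit probabilities over standard normal draws of η1 and η2. The function name, the number of draws, and the values of β and s are assumptions for illustration, and setting every X to 1.0 follows the case stated on the slide.

```python
import numpy as np

def mxl_choice_probs(beta, s, X, n_draws=20000, seed=0):
    """Monte Carlo approximation of P(i), i = 1, 2, 3, for one case.

    X has shape (3, 2); eta_1 enters alternative 1, eta_2 enters alternatives 2 and 3.
    """
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal((n_draws, 2))      # draws of (eta_1, eta_2)
    eta_by_alt = eta[:, [0, 1, 1]]               # map components to alternatives
    v = X @ np.asarray(beta) + s * eta_by_alt    # shape (n_draws, 3)
    v -= v.max(axis=1, keepdims=True)            # stabilise the exponentials
    expv = np.exp(v)
    logit = expv / expv.sum(axis=1, keepdims=True)
    return logit.mean(axis=0)                    # average over the draws

X_ones = np.ones((3, 2))                         # all explanatory variables set to 1.0
probs = mxl_choice_probs(beta=(1.0, 0.5), s=1.0, X=X_ones)
print(probs)
```

With identical X values, alternatives 2 and 3 share the same error component and therefore receive identical probabilities, while alternative 1 differs only through its independent component.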

17

Estimation of choice probabilities (cont.)

ρ = 0.1

• True probabilities are contained in the range of the estimated probability

[Figure: choice probabilities P(1), P(2), P(3) (0 to 0.6) against q (log scale, 0.1 to 10); markers at q = 1.15 and 1.73; legend: Range of estimated probability, True value]

18

ρ = 0.5

• True probabilities are NOT contained in the range of the estimated probability

[Figure: choice probabilities P(1), P(2), P(3) (0 to 0.6) against q (log scale, 0.1 to 10); markers at q = 1.57 and 2.29; legend: True value]

19

ρ = 0.9

• True probabilities are NOT contained in the range of the estimated probability

[Figure: choice probabilities P(1), P(2), P(3) (0 to 0.6) against q (log scale, 0.1 to 10); markers at q = 3.45 and 5.33; legend: True value]

20

Usefulness of MNL models

• MNL models are estimated using the same data sets

• Utility coefficient estimates are biased upward, but only by up to about 30%, which is smaller than for the MXL model

[Figure: estimates of $\beta_1$ (0 to 1.6) against ρ (0.00 to 1.00); true value $\beta_1 = 1.0$]

21

Conclusions and future research

For the error component MXL model

1. Variance structure cannot be uniquely identified through model estimation

2. Parameter estimates are quite unstable, especially for the case with a high error correlation

3. After proper normalization, utility coefficients are unbiased and stable

4. Estimated variances of the error components tend to be biased upwards

5. Choice probabilities are biased unless the error correlation is very small

MNL model can produce relatively unbiased utility coefficient estimates

22

Conclusions and future research (cont.)

• One would adopt the MXL model in search of a covariance specification

-> The model is incapable of identifying the true structure, and parameter estimates are unstable

• One may opt to develop an adequately specified MNL model through careful selection of explanatory variables, utility formulation or definition of alternatives (consistent with the suggestion by Pinjari & Bhat (2006))

• Further research is needed on the properties of parameter estimates of MXL models with taste heterogeneity as well as error components
