specification tests for the multinomial logit model · october1981 commentswelcome...

department

of economics

SPECIFICATION TESTS FOR THE MULTINOMIAL LOGIT MODEL

Jerry A. HausmanDaniel McFadden

Number 292 October 1981

massachusetts

institute of

technology

50 memorial drive

Cambridge, mass. 02139

Comments welcome


Jerry A. HausmanDaniel McFadden

Number 292 October 1981

Jeff Dubin, Whitney Newey, and John Rust providedresearch assistance. The NSF and DOE provided researchsupport

.

October 1981

Comments Welcome


by

JERRY HAUSMAN

AND

DANIEL MCFADDEN

Jeff Dubin, Whitney Newey, and John Rust provided research

assistance. The NSF and DOE provided research support.

Digitized by the Internet Archive

in 2011 with funding from

Boston Library Consortium IVIember Libraries

http://www.archive.org/details/specificationtes00haus2


by

JERRY HAUSMAN AND DANIEL MCFADDEN

Discrete choice models are now used in a wide variety of situations in

applied econometrics.^ By far the model specification which is used most

often is the multinomial logit model, McFadden (1974). The multinomial logit

model provides a convenient closed form for the underlying choice

probabilities without any requirement of multivariate integration.

Therefore, choice situations characterized by many alternatives can be

treated in a computationally convenient manner. Furthermore, the likelihood

function for the multinomial logit specification is globally concave which

also eases the computational burden. The ease of computation and the

existence of a number of computer programs has led to the many applications

of the logit model. Yet it is widely known that a potentially important

drawback of the multinomial logit model is the independence from irrelevant

alternatives property. This property states that the ratio of the

probabilities of choosing any two alternatives is independent of the

attributes of any other alternative in the choice set .^ Debreu (1960) was

among the first economists to discuss the implausibility of the independence

from irrelevant alternatives assumption. Basically, no provision is made for

^McFadden (1981) provides references to many of their uses.

^A "universal" logit model avoids the independence from irrelevantalternatives property while maintaining the multinomial logit form by making

each ratio of probabiities a function of attributes of all alternatives,

McFadden (1981). It is difficult however to give an economic interpretation

of this model other than as a flexible approximation to a general functional

form.

different degrees of substitutability or corapLimentarity among the choices.

While most analysts recognize the implications of the independence of

irrelevant alternatives property, it has remained basically a maintained

assumption in applications.

The multinomial probit model does provide an alternative specification

for discrete choice models without any need for the independence of

irrelevant alternatives assumption, Hausnian and Wise (1978). Furthermore, a

test of the 'covariance' probit specification versus the 'independent' probit

specification which is very similar to the logit specification does provide a

test for the independence from irrelevant alternatives assumption. But use

of the multinomial probit model has been limited due to the requirement that

multivariate normal integrals must be evaluated to estimate the unknown

parameters. Thus, the multinomial probit model does not provide a convenient

specification test for the multinomial logit model because of its

complexity.

In this paper we provide two sets of computationally convenient

specification tests for the multinomial logit model. The first test is an

application of the Hausman (1978) specification test procedure. The basic

idea for the test here is to test the reverse implication of the independence

from irrelevant alternatives property. The usual implication is to note that

3

if two choices exist, say car and bus in a transportation choice application,

that addition of a third choice, subway, will not change the ratio of

probabilities of the initial two choices. Our test here is based on

eliminating one or more alternatives from the choice set to see if underlying

choice behavior from the restricted choice set obeys the independence from

irrelevant alternatives property. We estimate the unknown parameters from

both the unrestricted and restricted choice sets. If the parameter estimates

are here approximately the same, then we do not reject the multinomial logit

specification. The test statistic is easy to compute since it only requires

computation of a quadratic form which involves the difference of the

parameter estimates and the differences of the estimated covariance matrices.

Thus, existing logit computer programs provide all the necessary input to the

test.

The second set of specification tests that we propose are based on more

classical test procedures. We consider a generalization of the multinomial

logit model which is called the nested logit model, McFadden (1981). Since

the multinomial logit model is a special case of the more general model when

a given parameter equals one, classical test procedures such as the Wald,

likelihood ratio, and Lagrange multiplier tests can be used. Of course, we

have added the requirement of the specification of an alternative model to

test the original model specification. Maximum likelihood estimation of the

nested logit model is considerably more difficult than for the multinomial

logit model. Thus we base our test procedures on one-step asymptotically

equivalent estimators to maximum likelihood. Still, new programming is

4

required since existing multinomial logit programs do not provide asymptotic

covariance matrix estimates for the one-step estimators.

We then proceed to compare the two sets of specification test procedures

for an example. We find rather unexpected results. First despite a sample

size of 1000, the asymptotically equivalent classical tests differ markedly

in their operating characteristics. The exact size of the Wald test is 15-

25% larger than the nominal size while the exact size of the Lagrange

multiplier test is approximately 15-30% smaller than the nominal size. The

exact size of the likelihood ratio test is quite close to its nominal size.

Furthermore, the power of the Wald test is significantly greater than the

other two classical tests. This result holds also when the tests are

corrected for size. Thus, the Lagrange multiplier test which is based in

this application on inconsistent estimates is distinctly inferior to the Wald

test which is based on asymptotically efficient estimates. Perhaps, more

surprising, we find the power of one Hausman test to be comparable to that of

the Wald test, even though in our example this Wald test is based on the

correct alternative model. The exact size of this Hausman test is within .1%

of the nominal size. When it is compared to the size corrected Wald test, it

proves to be superior. Thus the often quoted asymptotic power results for

local departures from the null hypothesis do not provide a reliable guide to

the exact performance of our specification tests in the example despite the

relatively large sample size and small departures from the multinomial logit

model

.

The plan of the paper is as follows. In the next section we derive the

Hausman-type specification test for the multinomial logit model. The

5

distribution theory as well as computational considerations are discussed.

In Section 3 we apply the specification test to an actual choice situation.

In the next section we derive the classical tests from the nested logit

model. We apply the tests to the same data as we did for the Hausraan test.

All four tests lead to a decisive rejection of the multinomial logit model in

this application. In Section 4 we calculate the exact size and power for

the two sets of specification tests for an example. Lastly, in the

conclusion we report some additional empirical results and discuss some

further considerations for the test procedures.

T . A Hausman-Type Test of the IIA Property

A widely used functional form for discrete choice probabilities is the

multinomial logit (MNL) model

z.g z.g

P(i|z, C, e) = e ^ / .Z e J(1.1)

where

C = {l,...,j} is a finite choice set

i,j = alternatives in C

z. = a K-vector of explanatory variables describing the attributes

of alternatives j and/or the characteristics of the decision-

maker which affect the desirability of alternative j.

z = (zjL,...,z ) the attributes of C

B = a K-vector of taste parameters

P(i|z, C, 3) = the probability that a randomly selected decision-

maker when faced with choice set C with attributes z

will choose i.

The MNL model has a necessary and sufficient characterization, termed

independence from irrelevant alternatives (IIA), that the ratio of the

probabilities of choosing any two alternatives is independent of the

attributes or the availability of a third alternative, or

P(i|z, C, 3) = P(i|z, A, 3) P(A|z, C, 3) (1.2)

where i e A c_ C and

P(A|z, C, 3) = .2 P(j Iz, C, 3) (1.3)

7

This property greatly facilitates estimation and forecasting because it

implies the model can be estimated from data on binomial choices, or by

restricting attention to choice within a limited subset of the full choice

set. On the other hand, this property severely restricts the flexibility of

the functional form, forcing equal cross-elasticities of the probabilities of

choosing various alternatives with respect to an attribute of one

alternative. Further discussion of the IIA property and conditions under

which it is likely to be true or false is given in Doraencich and McFadden

(1975) McFadden, Tye , and Train (1976), and Hausraan-Wise (1978). The

McFadden, Tye and Train paper suggests that the MNL specification be tested

by comparing parameter estimates obtained from choice data from the full

choice set with estimates obtained from conditional choice data from a

restricted choice set. Here we develop an asymptotic test statistic for this

comparison, using the approach to specification tests introduced by Hausman

(1978).

Consider a random sample with observations n=l,...,N. Let z be the

attributes of C for case n, and define S. = 1 if case n chooses i andmS. =0 otherwise. The log Likelihood of the sample ism

N

L^(e) =-J: L, -^r S. In P(i|z", C, 3) (1.4)

C N n=l leC m

We first review the asymptotic properties of maximum likelihood

estimates of g from (4). We make the following regularity assumptions:

a. The vector of attributes z has a distribution p in the

population which has a bounded support.b. The MNL specification (1) with a parameter vector g* is the

true model

.

c. The parameter vector 3* is asymptotically identified, i.e., if

3* 3*, there exists a set Z of z values and an alternative i

such that

2 P(i|z, C, 3*) du (z) * I ^ P(i|z, C, 3) du (z). Under these

assumptions, E S. = P(i|z , C, 3*)) the log likelihood converges uniformly

in 3 to

plim L (3) = / .2_ P(i|z, C, 3*) In P(i|z, C, 3) du (z), (1.5)

and (5) has a unique maximum at 3 = 3*. Then the maximum likelihood

estimator 3_ is consistent, and /n (3-3*) converges in distribution to a

normal random vector with zero mean and covariance matrix

plim (-3^L ( 3*) / 9693'). Discussion and proofs of these properties can be

found in Manski and McFadden (1981); see also McFadden (1973).

Let A = {1,...,m} be a subset of the choice set C. Consider the

conditional log likelihood of the subsample who make choices from A. If the

MNL specification (1) is true, then the IIA property states that the

probability of choosing i from C, given that the choice is contained in A,

equals the probability of choosing i from A. The conditional log likelihood

is then

L,(3) = -^ "L. .E^ S. In (P(i|z", C, 3)/P(A|z", C, 3)) (1.6)A N n=l leA m

= ^ Z, .E^ S. In P(i|z'^, A, 3).N n=l leA in '

Some components of 3*, such as the coefficients of alternative-specific

9

variables for excluded alternatives, are not identified by choice from A.

Let z = (y ,x ) be a partition of the explanatory variables into a vector y

which only varies outside A and a vector x which varies within A, and let

6 = (y>9) be a conmensurate partition of the parameter vector. The

conditional choice probability is then

n„ n„X.9 X. 8

f.

P(i|x^ A, 8) = e ^ / .E . J ^^-'^

jeA

and y. = y. for i, j e A. We add to the regularity assumptions the

asymptotic identification condition

d. If e?' 9*, there exists a set Z of z values and analternative ieA such that

f P(i|x, A, e*) du(z) t (^ P(i|x, A, 6) d M (z).

Then, as in the unconditional case, the conditional log likelihood converges

uniformly in 9 to

plim L^( 9) = /^E^ P(i|z, C, 3*) In P(i|x, A, g) du (z)^^ g^

= / P(A|z, C, e*) .E P(i|x, A, 9*) In P(i|x, A, 9) du (z)

with a unique maximum at 9 = 9*, The maximum likelihood estimator 9. is^ A

consistent and Tn" (9 - 9*) is asymptotically normal with mean zero and

covariance matrix plim (- 3^1 ( 9*) / 8939' )

.

The specification test statistic is based on the parameter difference

6=9-9, where 3 = (y > 9 )• When the regularity assumptions hold and

the MNL model is true, plim 6=0. Conversely, when the MNL specification

10

(1) is false, then the IIA property fails, and (7) becomes

plim L^(9) = / P(aU, C, 3*).E^ [ p(^|^; g; gj ] in P(i|x, A 0) dy (z)

with P(i|z, C, 3*) ^ P(A|z, C, B*) * P(i|x, A, 6*). In general, equation

(1.8) is not maximized at 9 = 6*, implying plim 6^0. Thus, a test of 6 =

is a test of the MNL specification. Rejection of 5 = indicates a failure

of the restrictive structure of the MNL form embodied in the IIA property, or

a misspecification of the explanatory variables z in (1), or both.

Acceptance of 6=0 implies that for the given specification of explanatory

variables and distribution of these variables, the IIA property holds. Thus

the test is consistent against this family of alternatives. However it is

not necessarily consistent against all members of the family of alternatives

defined by a given specification of explanatory variables and any

distribution of these variables.

To derive an asymptotic test statistic for 6=0, note that under the

regularity assumptions, vN (6-3*, 9. - 9*) is asymptotically normal with

mean zero and a covariance matrix V calculated below; the argument is a

standard application of a central limit theorem, as used in Manski and

McFadden, (1981).

We require the gradients and hessians of the likelihood functions and

their moments. First, the moments of S- aremE(S^^) = P(i|z", C, 3*) (1.10)

11

(p(i|z",C, 3*)(1-P(i|z", C, e*)) if i=j

"''^în'^jn^ "|-P(i|z",C, 3*) P(jlz", C 3*) if i^jj (l.Il)

The gradients are

N

8L^(3)/83 = - I, -^^ S. 31n P(i|z", C, 3)/93

i nil iic(în-^^^l^"'^'^)^(î-^C^ (l-^^^

where

and

z"" = .E^ z" PCilz'', C, 3) (1.13)C ^ "

, N3L,(e)/8e = 4 ^, -Â s. 8inp(i|x", A, e)/3eA N n=l leA m

= i I, ^, is. - S, PCilx"", A, e)) (x"- x") f. ...N n=l leA in An i A (1.14)

where

n „ n , . , n „.x, = .E^ X. P(i X , A, 6)

,

A leA 1

S. = .E, S. (1.15)An leA m

The hessian for the unconditional log likelihood is

H E - 32 L (6)/33 33' = - E 32l (3)/3333'

N

=-J:

E, .E^ P(i|z", C, 3)(z" - z")(z" - z")'. (1.16)Nn=l].cC lAiAFor the conditional log likelihood,

1N

-32l fe)/3e39' = i- E, -E^ S, P(i|x",A.e)(x" - x;!)(x" - x")' (1.17)A N n=l leC An i C i C

and hence

12

H, E - E 92 L,(e)/3e3e'^

, N ^

= 4 ^1 -^A P(A|z", C, 3*)P(i|x", A, e)(x'? - x")(x" - x")' (1.18)N n=l leA i A i A

Evaluated at 6 = 6*. H. is thenA

Ni v^ r. /-iH „,\/n n.,n n.,

(1.19)

We now turn to calculation of the asymptotic covariance matrix V.

Taylor's expansions of the gradients imply

H„ >rN(B^-3*)

/N(e^-9*)

y/F 3L^(3*)/3B

/n 3L,(e*)/3BA

+ A (1.20)

where A is a vector satisfying plim A = 0. Using the expectations of the

outer products of the gradients of the two likelihood functions we calculate

iN( 3L ( e*)/33' , 3L (6*)/38') E g' is asymptotically normal with mean zero

and covariance matrix

lim E g g' = lim

H,

0'

-Jc

(1.21)

where H is the H from equation (1.16) evaluated at 3=i

*H. is H. evaluated at 0=6*.A A

and likewise

From (20) , the asymptotic covariance matrix is

- i

V = lim

H,

H,

(limEgg') lim

-1

13

= limH H

-1C "C

Define V. = lira H. ^ andA „ A

Cyy CyGVCOy

VC89

where the partition is commensurate with (Yj9). Then

V,

V =

V VCyy CYe Cy9

C9yV Vcqy cee

C06 C(

V.

The asymptotic covariance matrix of N (

Q = ^ - ^cee'

,) is then

the difference of the asymptotic covariance matrices of 9 and 9 .

test statistic

T = N(9^ - 0^)' Q (9^ - 9^),

(1.22)

(1.23)

(1.24)

(1.25)

Thus the

(1.26)

where Q is a generalized inverse of Q, is asymptotically distributed chi-

square with degrees of freedom equal to the rank of Q. This statistic then

coincides with the general specification test statistic developed by Hausman

(1978) and generalized to use of singular covariance matrices by Hausman-

Taylor (1981) when an efficient estimator is available under the null

hypothesis.

The estimated covariance matrices for the maximum likelihood estimator

3 and 9 satisfyt» A

14

GOV (Q ) =C

/ii „ , s , n n.,n n.2,, .L P(i z , C,3„)(z. - z„)(z. - z)n=lLEC Ci Ci C

-JL

ue

plim N cov O ) = V

cov ( e )

A

Nn n, , n

I. S, .Z, P(ilx", A, e,)(x'.' - x'')(x'.' - x';)'n=l An ieA ' A i A i A

-1

plim N cov ( 9 ) = VA A

(1.27)

(1.28)

(1.29)

(1.30)

Therefore, an asymptotically equivalent computational formula for T is

itT = (6^ - 0^)' [cov (e^) - cov (9^)] (e^ - 6^) (1.31)

which should be quite easy to calculate using existing logit programs.

The appropriate degrees of freedom can be computed using

df = tr [cov (6^) - cov (9^)]*^ [cov (6^) - cov (0^)]. (1.32)

The regularity assumptions do not exclude the possibility that Q is less than

full rank; however, deficiencies will occur only for exceptional

configurations of the x. variables. In particular, a sufficient condition

for Q to be non-singular is that

Hp«o - h" = ^ Z, .E^ P(i|z'',C,3*)(x'? - x")(x'' - x")'C90 A Nn=lieD i Di D

n n,, n n.+ ^ J^ P(A|z",C,B*) P(D|z';c,B-")(x;;-xp(x^-x^)' (1.33)

have a non-singular limit, where D = C|A and

This is generally the case if the x variables either vary within D, or

15

take on values within D different from their average within A.

The estimated matrix [cov (9.) - cov ( 9p) ] may fail to be definite in

finite samples even when Q is non-singular. This does not impede calculation

of the statistic (3A) or carrying out the asymptotic test. However, one can

form an asymptotically equivalent estimate of cov (9.) such that

[cov (6 ) - cov (9 )] is always positive semidefinite by evaluating

P(i|x , A, 9) in (32) at 9 and replacing S by P(A|z ,C,e ). We have\j An L»

occasionally found the test statistic of equation (1.32) to be negative due

to lack of positive seraidef initeness in finite sample applications.

Replacement by the alternative covariance matrix always leads to a small

positive number. However, in no case have we found this alternative

statistic to be so large as to come close to any reasonable critical value

for a X test.

16

II. An Application

As an application of the test, we consider consumer choice of clothes

dryers. The alternatives are an electric dryer, a gas dryer, or no dryer.

We consider a three-alternative MNL model of this choice. A plausible

alternative is that the two types of dryers are viewed as having many common

unobserved characteristics in the decision whether to have a dryer. Hence,

we apply the specification test by excluding the no-dryer alternative. The

sample is 1408 households from the 1975 WCMS survey of residential appliance

holdings and energy use patterns; detailed discussions of the sample can be

found in Newman and Day (1975) and Dubin and McFadden (1980).

The variables used in the model are: (1) clothes dryer operating cost

in 1975, (2) clothes dryer capital cost, (3) homeowner dummy variable,

(4) number of persons in the household, (5) gas availability index. The

operating costs were calculated on the assumption that each person in the

household dries two loads of clothes per week. The capital costs we

calculated as the mean price plus a connect charge. The connect charge was

not included for the gas dryer if other gas appliances were already owned.

Similarly, a connect charge for the electric dryer was not included for the

electric dryer if 220 volts using appliances were already owned. It should

be noted that the choice variable reflects holdings rather than purchases.

Our analysis of holdings as a function of 1975 costs requires the strong

assumption that these reflect expected cost at date of purchase. We assume

in addition that the choice is made by the resident as an independent

17

decision, and not by a landlord or as part of an overall dwelling purchase

decision. Finally, while the GASAV variable and capital costs reflect the

probabilities that all alternatives are in fact available to the decision-

makers, we are unable to identify those individuals in the sample who are

captive to one of the alternatives and unable to exercise choice. These

limitations of the data and the specification of the explanatory variables

make it important to emphasize that our test is a joint test of the IIA

structure of the MNL model and of the variable specifications, and a

rejection may be due to either or both.

In the first column of Table 2.1 we present estimates of the MNL model

for the three choice (unrestricted) situation. We have included choice

specific dummy variales for the electric dryer and no dryer options with a

normalization that the gas dryer dummy variable equals zero. Therefore for

instance, homeowners are less likely to buy electric dryers than gas dryers

and are even less likely still to have no dryer. Also the availability of

gas raises the probability quite strongly that a gas dryer will be owned.

Both operating cost and capital cost coefficients have the expected negative

sign, and they are estimated quite precisely. Their estimated ratio is .81

which leads to a calculated discount rate of about 120% per year assuming a

lifetime for dryers of 8 years. This discount rate seems decidedly too high.

We now re-estimate the model after deletion of the no dryer alternative.

Thus we compare the choice between an electric dryer and gas dryer among the

subset of households that have a dryer. If the IIA property holds true then

the coefficient estimates should remain approximately the same. Under the

alternative hypothesis, we might expect the coefficients to change because

18

gas dryers and electric dryers are closer substitutes than the no dryer

choice. Estimates from the restricted choice set are reported in the second

column of Table 2. Note that while the capital cost coefficient remains the

same that the operating cost coefficient has almost tripled. The estimated

discount rate is now about 40% which is high but in line with other estimates

for appliance purchase decisions, e.g., Dubin and McFadden (1981) and Hausman

(1979). The other coefficients which change markedly are the effects of the

number of persons in the household. To calculate the specification test we

use equation (1.31) where the dimension of 6 is 6 since the 4 no dryer

specific variables are eliminated. We find T = 20.9. Since under the null

hypothesis T is distributed as central x with 6 degrees of freedom, we

reject the MNL specification at beyond the 99 percent critical level.

In the last column of Table 2.1 we present an alternative Hausraan-type

specification test where we have eliminated the electric dryer option to form

the restricted choice set. Since the unrestricted logit estimates are

calculated from the weighted average of the binary choice set estimates we

might expect the estimates from the gas dryer-no dryer choice set to also

provide the basis for a test. Examination of the results in Table 2.1

indicates that the estimates differ greatly from the unrestricted estimates.

In fact, the coefficient of operating cost now has the incorrect sign. We

2find T = 94.1. Again under the null hypothesis T is distributed as x with 6

degrees of freedom. Therefore, we reject the MNL specification at beyond the

99 percent critical level. Note, however, that the two specification tests

of this section are not independent as the combined size of the two

2tests is not l-( .99) . We return to this subject subsequently. Since both

19

specification tests lead to such a decisive rejection of the MNL

specification for the dryer choice problem, we may safely conclude that

misspecification is a serious problem with our model. We now consider an

alternative model specification to the MNL, the nested logit model which

allows for more flexible patterns of similarities. We also develop an

alternative specification test based on classical testing procedures. In

cases where the nested logit model is the correct specification, it might be

expected to provide a more powerful test from the omnibus-type Hausraan test

which in this application is not based on a specific parametric alternative

hypothesis

.

20

TABLE 2.1

Unrestricted and Restricted MNL Choice Models

Variable

1. Operating cost

2. Capital Cost

3. House owner x

Elec. dryer dummy

4. Persons x elec.Dryer dummy

5. Gas availabilityX Elec . dryer dummy

6. Elec. dryer dummy

7. House owner xNo dryer dummy

8. Persons x

No dryer dummy

9. Gas availability x

No dryer dummy

10. No. dryer dummy

Log Likelihood

Number of observations

AlternativeUnrestricted Restricted Restricted

Estimate Estimate Estimate(A.S.E.) (A.S.E.)

-.036

(A.S.E.)

-.013 .195

(.005) (.007) (.038)

-.016 -.016 -.073

(.001) (.001) (.012)

-.631 -.695(.280) (.298) —

.068 .227

(.056) (.076) —

-1.229 -1.696(.476) (.507) —

2.096 2.616(.458) (.494) —

-1.691 -1.88(.263) — (.282)

-.376 .117

(.049 — (.090)

-1.512 -.536(.497) — (.609)

.093 -14.4(.558) (2.97)

-1344.9 -498.8 -407.6

1408 961 809

21

III An Alternative Nested Logit Specification and Classical Tests

The use of the Hausraan-type specification test in Sections I-II requires

no specific alternative model. ^ In this section we consider a specific

alternative model, the nested logit model, to base test procedures on. Given

a parametric alternative hypothesis, we can apply classical test procedures

such as the Wald test, likelihood ratio (LR) test, and Lagrange multiplier

(LM) test. It is well known that for local deviations from the null

hypothesis, these tests have certain optimal large sample power properties,

c.f. Silvey (1970) and Cox and Hinckley (1974). Of course, the optimum

properties of these classical tests depend on three factors which may not be

satisfied in a given application: (1) the alternative specification on which

these tests are based is correct, (2) the sample is large enough so that the

asymptotic theory provides a good approximation, (3) deviations from the null

hypothesis model are of order l/VN. For our dryer choice application of the

previous section, it appears that the nested logit model provides an

attractive model for the alternative specification. In many other

applications the choice of the alternative model may not be nearly so

clearcut. The question of sufficiently large samples and local deviations we

postpone to the next section.

^It is interesting to note that the previous test is not equivalent to anANCOVA-like procedure in which 3 would be allowed to vary across each

alternative or some subset of alternatives after a normalization. The

specification test here would involve a test of the equality of the 3's. But

it is straightforward to check that the IIA property still holds under this

specification so that the most likely failing of the MNL model would not be

tested

.

22

The nested logit model for the three choice case has the simple

hierarchial nature shown in Figure 3.1. Alternatives 1 and 2 are assumed to

have more common characteristics than either alternative has with alternative

3.

Fig. 3.1

The idea behind the choice process is that the individual forms a weighted

average of the attributes of alternatives 1 and 2, sometimes called the

inclusive value which is closely related to his consumer's surplus from these

two choices considered above.

= log (e"-'^'^

+e ^2^/^)(3.1)

where X is a scalar parameter of the model. The choice probabilities of the

model are

P(l|z,C,|3,A) = p =

z,3/X Xye ^ e

I e y (e"3e^ e^y)

p(2|z,c,e,x) = P2 =ZpS/X Xy

e ^ e__

y . Z3B Xy.e (e ^ + e ^)

(3. 2. a)

(3.2.b)

23

P(3lz,C,3,X) = P3 = —-r —

-

(3.3.b)e

For A = 1, the nested logit model reduces to 3 MNL model. For 0<A<1 the

model fails to satisfy the IIA property but it does satisfy the properties

required for a random utility model. This proposition is proven together

with a discussion of other features of the general model specification in

McFadden (1981). One property of the model which deserves mention is that on

any subbranch of the tree, the ua assumption is made. However, in our

application since only a binary comparison involving alternatives 1 and 2 is

required, the IIA property does not lead to a testable restriction.

Given the alternative the classical test procedures may be applied by

basing tests on the likelihood function

N

1n=l Aec

IN

L.,(3,X)=- I Is. log P(i|z",C,3,X) (3.4)N N ^, , in

The assumptions for the MNL logit model following equation (1.4) are

sufficient to prove consistency and asymptotic normality of the estimates

(3,X) = 6 where we replace 3 by 6 in the assumptions. The covariance matrix

of the asymptotic normal distribution equals plim (- B^L ( 6*)/ 3686' )

.

The classical tests proceed so that each test leads to a test of A=l, so that

24

under the null hypothesis the test statistic is a central x^ random variable

with 1 degree of freedom. ^ But a potential problem arises in that the

likelihood function of equation (3.4) is not inexpensive to maximize as

application of both the Wald test and LR test require. A ridge in the

likelihood space makes for slow progress for most algorithms. The likelihood

function is not globally concave as is the MNL likelihood function. An

alternative procedure would be to maximize equation (3.4) conditional on X

and to choose the maximized value of L with respect to both g and X.N

However, each of these conditional maximizations requires an optimization

routine so that the entire process is not inexpensive. Since we want to find

an inexpensive test procedure for the MNL model, we instead based our tests on

one Berndt-Hall-Hall-Hausraan (1974) step beginning from consistent parameter

estimates (3,X). At least since R.A. Fisher's 1925 article, it has been

known that the resulting estimates are asymptotically equivalent to the ML

estimates .^

We derive our intial consistent estimates from the so-called "sequential

estimator", McFadden (1981). The sequential estimator first estimates the

vector g/A and an estimate of the inclusive value y from estimation based on

the restricted choice set, A which contains alternatives 1 and 2. The

estimate of A is then derived by use of a MNL program which considers the

^Here we take the alternative hypothesis to be A*l. One might consider themore restricted alternative that A<1 given the random utility modelrequirements. Since only 1 parameter is under test, the critical values for

the test could be easily adjusted.

^Fisher considered the i.i.d. case while here the presence of the z^'s leadto a i.n.i.d. case. However, the proof of asymptotic equivalence is

straightforward

.

25

choice between alternative 1 or 2 and alternative 3, along with the estimate

B Now in theory the BHHH step leads to asymptotically efficient estimates;

in practice, we found it was a common occurence for the likelihood function

to decrease relative to its value at the consistent, but inefficient,

sequential estimates.

Therefore, we took two or more steps until an increase in the value of

the LF occurred. The asymptotic theory remains the same although the

results are somewhat worrisome in light of our sample size of nearly 1500.

The results of these procedures are given in Table 3.1. In the first

column we chose the 'correct' nested model where the two dryer choices are

combined on one branch of the tree. For our nested logit specification

alternatives 1 and 2 are electric and gas dryers with alternative 3 being the

no dryer option. The estimates of g change significantly from Table 2.1.

The implied discount rate is now about 72%, in between the restricted and

unrestricted logit estimates. Gas availability is now estimated to be less

important, but the no dryer dummy variable has a much larger effect. In

terms of our test procedures we find the following:

(1) The estimated extra parameter is X = .364 which is 5.17 asymptotic

standard deviations from the MNL value of 1.0. The Wald test is therefore

calculated to be 26.7 which gives a very strong rejection of the null

hypothesis. (2) The LR test takes twice the difference of the log likelihood

functions and equals 6.8. The MNL specification is again strongly rejected.

26

TABLE 3.1

Nested Logit Estimates and Classical Test Results

Variable

Gas-ElectricNested Logit

Estimate(A.S.E.)

Gas-No DryerNested Logit

Estimate(A.S.E.)

1. Operating Cost

2. Capital Cost

-.0089(.0045)

-.0065(.0020)

-.134(.004)

-.146(.007)

3. House owner xElec. dryer dummy

4. Persons xElec . dryer duirnny

5. Gas availabilityX Elec . dryer dummy

6. Elec. dryer dummy

-.284(.154)

.047

(.035)

-.556(.292)

.944(.381)

.383(.148)

.387

(.041)

-.730(.271)

-2.35(.253)

7. House owner xNo dryer dummy

8. Persons xNo dryer dummy

9. Gas availabilityX No dryer dummy

10. No dryer dummy

-1.50(.197)

-.327(.037)

-.919(.441)

1.01

(.397)

-.763(.061)

.053(.013)

-.256(.119)

-5.62(.319)

11. X

Log Likelihood

Number of obstructions

.364

(.123)

-1341.5

1408

.405

(.025)

-3943.8

1408

27

(3) The LM test calculates ( 9L/ 95) 'Q"

H

3L/ 96) where Q is a consistent

estimate of the covariance matrix of the asymptotic distribution of N (5-6)

under the null hypothesis. Here it is calculated to be 5.65. Thus, all

three classical tests lead to a rejection of the MNL specification. They

also lead to a rejection at a higher level of significance than the Hausman

tests although those tests did reject at beyond the 1% level.

To further investigate the performance of the classical tests, in column

2 of Table 3.1 we present estimates of an 'incorrect' nested logit

specification. Here we made the gas dryer choice correspond to choice 1 in

Figure 3.1 and the no dryer option correspond to choice 2. On the other

branch, the electric dryer option is choice 3. Note that the coefficient

estimates change markedly, especially for those coefficients which interact

with choice specific dummy variables. The calculated discount rate is now over

100 percent since the estimated coefficient of capital cost exceeds the

estimated coefficient of operating cost. The results of the classical tests

are: (I) The estimated parameter is X = .405 which is 23.8 asymptotic

standard deviations from the MNL value of 1.0. The Wald test is therefore

528.9 which is a stronger rejection of the MNL model than the 'correct'

nested logit model gave. (2) The LR test turns out to be unuseable. A

negative LR statistic is calculated even after three iterations at the

initial "consistent" estimates. (3) The LM test is calculated to be 30.9.

Thus both the Wald test and LM tests based on the 'misspecif ied' nested logit

model lead to stronger rejection of the original MNL specification than do

the corresponding tests based on the "correct" nested logit model. As with

the Hausman specification tests, the two sets of tests are not independent.

We next investigate whether the tests can be distinguished on the basis of

28

size and power characteristics. For instance, if the classical tests are

found to have a large power advantage over the Hausman test, they might be

the tests of choice despite their increased computational requirements.

29

IV. Exact and Approximate Comparisons of theHausman Test and the Classical Tests

The results of this section may prove surprising to devotees of

classical test procedures, especially recent advocates of LM tests.

^

We consider a three choice example when the Hausman test and classical tests

have closed form solutions. Exact comparisons of the size and power of the

tests can then be made. To make the comparisons, we choose a nested logit

model as the correct model under the alternative hypothesis. Therefore, the

Wald, LR, and LM tests are based on the correct alternative specification.

The Hausman specification test is, of course, not based on a specific

alternative model. Just these type of conditions lead to theorems of optimal

asymptotic power among the Wald, LR, and LM - the so-called holy trinity - of

asymptotic tests. For our particular example we choose the sample size equal

to 1000 so that asymptotic theory should provide a reasonable guide. Yet we

find two rather unexpected results: (1) The three tests which comprise the

trinity differ markedly in both of their operating characteristics even when

their nominal size is the same.^

^Two classes of LM tests seem worthwhile to distinguish here. One class is

based on initial consistent estimates so that the LM test is equivalent to

taking one Newton type step which under general regularity conditions willproduce asymptotically efficient estimates. The other class of LM tests is

based on initially inconsistent estimates under the alternative hypothesis,and except in the linear case or for local alternatives their finite sampleproperties have not proven to be generally attractive. It is the secondclass of LM tests that we consider in our application.

^That the actual size differs in a predictable way for a class of linearmultivariate regression models was proven by Berndt-Savin [1978].

30

(2) Uncorrected for size, the Wald test and Hausman test have approximately

equal power. Next comes the LR test with the LM test distinctly in the rear.

Both results may well arise because we are considering a non-local

alternative hypothesis.^ That is, the expansions required for the optimal

power theorems hold for parameter vectors 9. = 6 +6 where 6 must beAcsufficiently small, e.g., 6 = d/jN where d is a constant vector. But,

examples have been given such as Peers (1971) where significant differences

can arise. No generally accepted theory exists for the non-local case. Or,

the samples may not be large enough for the reliable application of

asymptotic theory. We see our results as a particular example and as a

caution against relying too heavily on the local asymptotic theory.

For our example we consider a 3 choice MNL specification of equation

(1.1) with only a single explanatory variable. Furthermore, we assume only a

single data configuration occurs, Z|=l, Z2=0, Z3=0. We assume N=1000

repetitions of the choice and cell counts n ,n ,n . We assume that the true

parameter 3=log 2 generates the observations under the null hypothesis.

For these data the MNL choice probabilities take the form:

p^ = e^/(2 + e^) P2 = P3 = ^/^^ + e^) (4.1)

which for 3=log 2 are p = .5 and p = p = .25. The log likelihood function

for the unrestricted choice set is

L^(6) = n^ log pj^ + n^ log p^ + n^ log p^= n R-N log (2 + e^). (4.2)

Maximization of the likelihood function yields the estimates

A discussion of local and non-local alternatives for an application of theHausman test is given in Hausman [1978].

31

^C" ^^g ^ n2 ^ ng ^

V^ = N/n,(n2 + nj) (4.3)

where V is the large sample estimator of the variance of 3 given in equation

(1.16).

For the restricted choice set we eliminate choice 3. Maximization of

the restricted log likelihood function L ( 3) of equation (1.6) yields

estimates

\ = log (—J V^ = (n, ^ n2)/n,n2^^^^^

where V is calculated from equation (1.17). The Hausman test statisticA

based on the deletion of alternative 3 is

2n2

H. = (B -3,)2/(y V ) = do ( ))2 n2(n2+ n3)/n3. (4.5)3 CA CA n2+n3 ^. ^ o o

By symmetry the Hausman statistic based on the deletion of alternative 2 is

2n3Ht = (log ( ))2 n2(n2 + n3)/n3. : (4.6)^ n2+n3 ^^ -^ ^

Note that the last possible test Hjl is not defined under our data

configuration since B is not identified when alternative 1 is deleted.

As the model specification for the alternative hypothesis we use the

nested logit model. For our uses it can be most convenientally written as

Bz3 Bz3 Bzjl/X BZ2/A^

Pi = II|P P2 = n2P P3 = 1-p = e /(e . (e e ) ) (4.7)

Bz ./A Bzi/A BZ2/Xwhere 11. = e /(e + e ) for i = 1,2. The log likelihood has the

form

L(a,X) = n^a - (n|+n2)(l-X) log (e^+l) - N log (^(e^+l)^ (4.8)

32

where we use the parameterization a = 3/ A. The maximum likelihood estimates

"1 "3 n2

a = log X = log (-—-_)/log (-—:—

)

n2 n i+n2 n i+n(4.9)

\.^^2

Denote the partitioned large sample estimate of the information matrix as

-lim E— — —

N-»-oo L \x- A

aa \x_\x hx_ Aa hx_ (4.10)

Then the Wald statistic for the hypothesis H : A=l is

W = ( X - 1) 2 V"

^

n 2n 3

where V~ ^ = A,,-A^jA = —:— t?( t ?'^t|n3/N)/( t ?n2/(n ,+n2) + t§n3/N)AAaAaa N ii^o i^ i^ <io

for ti = log (n2/(n|+n2)) and t2 = log (n 3/(n i+n 2) ) .^

(4.11)

Next the LR statistic is calculated from the unrestricted log

likelihood function of equation (4.2) for the MNL model which sets A=l and

the nested log likelihood of equation (4.8):

2n. 2n3LR = 2(L(a,A)-L (3)) = 2n2 log ( ) + 2n3 log ( ) (4.12)

c "^ n 2+n 3 -^ n 2+n 3

Finally we derive the LM statistic. Under the null hypothesis with A=l, we

^Details of these derivations will be provided by the authors uponrequest

.

33

use the MNL estimate 3 = log (2n|/(n2+n3)) so that in equation (4.7)

II2 = (n2+n3)/(N+ni ) and p = (N+nx)/2N. We then evaluate the gradient of the

likelihood function (4.8) at this point to find

n 3~n2

L. = ( —7 )log (n2+n3)/(N+njt) (4.13)A Z

together with its large sample variance

V"-^ = - (EL,,- (EL, )2/EL ) (/ M\\\ Aa aa (4.14;

3N-ni (n2+n3)(N+nj,)

InTF: ^N(log((n2+n3)/(N+n,)))2.

Therefore we calculate the LM statistic as

LM = l2/V~^ = N(n3-n2)2(3N+nx)/(3N-ni)(n2+n3)(N+ni) (4.15)A

Asjmiptot ically, each of the test statistics H3, H2, W, LR, LM is under

the null hypothesis distributed x?- Note that (nx,n2,n3) has the trinomial

distribution,

N! n| n2 n3

Pr (njL,n2,n3) = —-j—^—y- pj^ p2 P3 ,(4.16)

where Px,P2>P3 ^re given by equation (4.1) with A=l under the null

hypothesis, and for 0<A<1 under alternatives. For the example, we calculate

numerically the exact distribution of the statistics for X=l and for

alternative values A = ( .95 , . 9, .85 , .8 , . 75 , . 7 , .6) . These calculations permit

determination of the exact sizes of each test for various nominal sizes, and

of the power functions, either size-corrected or uncorrected. The means and

34

variances of the parameter estimates and test statistics, conditioned on

positive cell counts, are also computed.

In general when closed form test statistics cannot be found, numerical

calculation of exact power requires expensive Monte Carlo simulation. A

"rough-snd-ready" approximation to power can however be calculated as

follows: Let S(ni,n2,n3) be the statistic in question, and let

U = S(Npi,Np2,Np3), (4.17)

where (Pi»P2>P3) ^^^ the choice probabilities under an alternative, be a

measure of the "location" of the statistic. If S is asymptotically x^ under

the null hypothesis, then under alternative S is roughly a non-central chi-square with noi

centrality parameter ij , and power can be calculated from the distribution

kT + m

P (x'2 =< X) = e-'^' y^-^^"(^-/^^

y(-^/2)-

m=0 m!(4+ra) n=o"-^"'"'- r(-^+n). (4.18)

Table 4.1 gives the exact distribution of the five alternative test

statistics under the null hypothesis: HAUS 3 is the Hausman statistic based

on deletion of alternative 3, HAUS 2 the corresponding statistic for deletion

of alternative 2, and WAl.D, LM, and LR are the Wald, Lagrange multiplier, and

Likelihood ratio statistics, respectively. Table 4.2 gives the means and

variances of these statistics, exact sizes of various tests, and critical

levels.

Table '+.1 shov,s that the exact distributions of the test statistics are

relatively close to their asymptotic limit. All the statistics have

thicker lower tails than the chi square. The Wald test also has a thicker

35

upper tail, and the LM test a thinner upper tail, than the chi square where

we use the upper tail to define the critical point for the tests. The

conditional means and variances given in Table 4.2 also indicate that the

Hausraan statistics and likelihood ratio statistic have distributions close to

the asymptotic limit, while the Wald statistic is distributed with a higher

mean and variance and the LM statistic with a lower mean and variance. The

exact sizes of the LR, HAUS 3, and HAUS 2 tests are close to their nominal

sizes when the asymptotic critical level is used. However, the exact size

36

TABLE 4.1

Exact Cumulat ive Distribuition Function of the Test Statistics

iARGUMENTCHISQUARE HAUS 3 ILAUS 2 WALD LM LR

.0001562 .0100000 .0178485 .0178485 .0178485 .0178485 .0178485

.0009766 .0250000 .0178485 .0178485 .0178485 .0178485 .0178485

.0039119 .0500000 .0535097 .0535097 .0535097 .0535097 .0535097

.0157204 .1000000 .0890643 .0890643 .0890642 .0898139 .0890643

.0356288 .1500000 .1595404 .1595404 .1551770 .1595720 .1595490

.0639752 .2000000 .1943906 .1943906 .194 3880 .1986097 .1943891

.1012512 .2500000 .2571302 .2571302 .2333807 .2625366 .2574044

.1481302 .3000000 .2964287 .2964287 .2956866 .3088109 .2963552

.2055111 .3500000 .3533769 .3533769 .3334747 .3615748 .3547693

.2745780 .4000000 .3964012 .3964012 .3910138 .4110970 .3948666

.3568926 .4500000 .4514412 .4514412 .4352908 .4667838 .4527005

.4545310 .5000000 .5000545 .5000545 .4849164 .5174321 .5005644

.5702946 .5500000 .5501913 .5501913 .5342529 .5663164 .5490352

.7080494 .6000000 .6011664 .6011664 .5828786 .6169712 .5987006

.8732966 .6500000 .6506719 .6506719 .6327198 .6672093 .64904551 .0741902 .7000000 .6995021 .6995021 .6832754 .7179841 .70083061 .3235021 .7500000 .7501350 .7501350 .7341925 .7682552 .75001171 .6428286 .8000000 .8004872 .8004782 .7847090 .8171284 .80030992 .0730254 .8500000 .8504647 .8504645 .8360386 .8652888 .8501056.7067207 .9000000 ,9001828 .9001823 .8881028 .9120306 .9004682

3 .8431482 .9500000 .9499366 .9499344 .9419375 .9570080 .95032315 .0258684 .9750000 .9748535 .9748486 .9696143 .9782958 .97541366 .6369923 .9900000 .9898452 .9898323 .9870385 .9907046 .9904561

10 .8291046 .9990000 .9989388 .9989182 .9984343 .9985983 .999210315 .1371548 .9999000 .9998936 .9998833 .9998088 .9997743 .999960019 .5106637 .9999900 .9999924 .9999901 .9999792 .9999684 .999999923 .9262119 .9999990 1 .0000000 .9999998 .9999989 .9999975 1 .000000028 .3710278 .9999999 1 .0000000 1.0000000 1 .0000000 ]L. 0000000 1 .0000000

37

TABLE 4.2

Characteristics of the Test Statistics

CHISQUARE HAUS 3 HAUS 2 WALD LM LR

Mean^ 1.0 0.99990 1.00009 1.07102 0.92832 0.99554

Variance^ 2.0 2.01212 2.01608 2.30900 1.69288 1.95064

Exact sizeof a nominal10% test^

0.10 0.09982 0.09982 0.11190 0.08797 0.09953

Exact sizeof a nominal

5% test

0.05 0.05006 0.05007 0.05806 0.04299 0.04968

Exact size

of a nominal1% test^

0.01 0.01015 0.01017 0.01296 0.00929 0.00954

Critical level 3.84315 3.84528 3.84535 4.11628 3.57253 3.8231J

for an exact

5% test.

^ Mean and variance are conditioned on the event that all cell counts

exceed 50, which occurs with probability 1-10"^^.

2 Critical level 2.70672.

3 Critical level 5.02587.

38

of the wald test exceeds its nominal size substantially, while the exact size

of the LM test is substantially lower .^ For instance, for a nominal 5% test

size, the Hausman tests are within 1% of the nominal size and the LR test is

within .6%. But the Wald test size is off by 15.8% and the LM test by 15.1%

in the other direction. For a nominal 1% test the discrepancy of the Wald

test is 26%. Table 4.2 gives the adjustments in critical levels required to

achieve an exact 5 percent test with each statistic.

Table 4.3 gives the power of each of the tests against the nested logit

model with values of X less than one. Power curves are given for both the

asymptotic tests and the size corrected 5 percent tests. Also given are the

rough estimations to power based on equations (4.17) and (4.18).

The exact powers of the asymptotic tests, uncorrected for size, are

ranked

WALD > HAUS 2 > LR > LM > HAUS 3

over most values of X. The performance of the WALD and LM tests is explained

in part by the deviations between the exact and nominal sizes of these tests.

The differences in power are of sufficient size and uniformly to suggest that

the WALD and HAUS 2 tests are clearly preferable to the remainder. The LR

and LM tests do not do nearly as well by comparison.

Table 4.3, III gives the powers of the size corrected tests. Here the

rankings are

^These sizes results for the WALD, LR, and LM tests are in accord with theBerndt-Savin [1978] results for multivariate linear models.

39

TABLE 4.3

Exact Power of the Tests

I. Asymptotic test, nominal size 0.10

Alternative X

1.0 0.95 0.90 0.85 0.80 0.75 0.70

HAUS 3

exact .0998 .1447 .3437 .6078 .8434 .9604 .9950

approx. .0999 .1554 .3534 .5993 .8190 .9402 .9871

HAUS 2

exact .0998 .1685 .3893 .6544 .8714 .9700 .9966

approx. .0999 .1583 .3817 .6659 .8937 .9824 .9989

WALDexact .1119 .1829 .4091 .6728 .8817 .9733 .9971

approx. .0999 .1624 .3992 .6911 .9103 .9870 .9994

LMexact .0879 .1416 .3445 .6094 .8445 .9608 .9950approx. .0999 .1531 .3507 .6056 .8349 .9541 .9929

LRexact .0995 .1564 .3669 .6320 .8583 .9656 .9959approx. .0999 .1568 .3670 .6316 .8579 .9648 .9955

40

Table 4.3 (continued)

II. Asjrmptotic test, nominal size = 0.05

Alternative X

1.0 .95 ,90 .85 .75

HAUS 3

HAUS 2

WALD

LM

LR

0501 .0778 .2278 .4730 .7475 .9210 .9869

0500 .0885 .2438 .4745 .7245 .8928 .9722

0501 .1015 .2826 .5419 .7998 .9438 .99190500 .0906 .2682 .5451 .8242 .9633 .9971

0581 .1130 .3029 .5650 .8158 .9501 .9931

0500 .0936 .2835 .5728 .8479 .9719 .9981

0429 .0779 .2332 .4807 .7540 .9240 .9876

0500 .0869 .2415 .4810 .7449 .9148 .9837

0497 .0894 .2557 .5090 .7757 .9336 .9897

0500 .0895 .2555 .5082 .7744 .9324 .9891

41


HAUS 3

HAUS 2

WALD

LM

LR

irrected Test, exact size = 0.05

^

Alternative X

1.0 .95 .90 .85 .80 .75 .70

0500 .0775 .2272 .4722 .7471 .9208 .9868

0499 .0884 .2436 .4723 .7243 .8927 .9722

0500 .1015 .2826 .5418 .7997 .9437 .9918

0499 .0905 .2697 .5449 .8241 .9632 .9971

0500 .1014 .2823 .5413 .7992 .9435 .99180425 .0821 .2607 .5458 .8312 .9672 .9977

0498 .0896 .2562 .5098 .7764 .9340 .9898

0588 .0992 .2641 .5091 .7669 .9252 .9864

0502 .0901 .2570 .5107 .7770 .9342 .9899

0506 .0904 .2571 .5103 .7760 .9931 .9892

42


IV. Asymptotic test, nominal size 0.01

Alternative X

1.0 ,95 .90 .85 .80 .75 .70

HAUS 3

HAUS 2

WALD

LM

LR

.0102 .0166 .0755 .2248 .4911 .7646 .9366

.0134 .0208 .0967 .2475 .4924 .7343 .9028

.0102 .0321 .1283 .3251 .6111 .8479 .9669

.0134 .0215 .1104 .3064 .6243 .8797 .9838

.0130 .0381 .1450 .3529 .6396 .8648 .9720

.0134 .0225 .1193 .3316 .6600 .9022 .9888

.0072 .0188 .0856 .2458 .5187 .7855 .9450

.0134 .0203 .0954 .2526 .5174 .7747 .9359

.0095 .0235 .1007 .2753 .5548 .8112 .9545

.0134 .0212 .1032 ..2 748 .5552 .8100 .9533

43

HAUS 2 > WALD > LR > LM > HAUS 3.

In these rankings, HAUS 2 and WALD have comparable power, and are

substantially better than the LR and LM tests, which have comparable power.

The high exact power of the HAUS 2 test is surprising in view of the

fact that the Wald , LM and LR tests are specific against the alternative

model generating the observations. This outcome may be special to this

example where the Hausraan tests have the same degrees of freedom as the

trinity. However, the result also indicates that for this problem the sample

size of 1000 is not sufficient for the asymptotic equivalency of the trinity

and second order efficiency of the LR test to take effect in the sense that

the largest deviation A. = (1.7)/v^= .009 is not sufficiently local for the

asymptotic power rankings to apply.

The rough approximation to power is within .003 of the exact power of

all these tests, and the ranking of the tests by approxirate power usually

coincides with the exact power ranking. We conclude that the rough power

calculation is potentially a very useful tool for quick comparisons of tests

or judgments on small sample power, without the great expense of Monte Carlo

calculations

.

For comparison purposes, we then recalculated the exact classical tests

based on a misspecified nested logit model. That is, in eqution (4.7) we

interchanged p and p so that choices 1 and 3 now lie on the same branch.

However, we continue to conduct the test procedures based on the original

model specification where choices 1 and 2 were on the same branch of Figure

3.1. In Table 4.4 we present both approximate and exact size corrected test

43a

TABLE 4.4

Approximate and Exact Test Results at Test Size .05

Wrong Model Correct Model

X = .95 X = .9 X = .85 X = .95 X = .90 X =

Nominal size 0,05

HAUS 3 .102 .283 .542 .078 .228 .473

HAUS 2 .078 .228 .473 .102 .283 .542

WALD .089 .250 .501 .113 .303 .565

LM .078 .233 .481 .078 .233 .481

LR .089 .256 .509 .089 .256 .509

Exact size .05

HAUS 3 .101 .283 .542 .077 .227 .472

HAUS 2 .077 .227 .472 .101 .283 .542

WALD .078 .228 .473 .101 .282 .541

LM .090 .256 .510 .090 .256 .510

LR .090 .257 .511 .090 .257 .511

MLE estimate 1.050 1.112 1,171of X

44

results for the Hausraan specification test and classical test procedures.

The two Hausman tests interchange operating characteristics since our example

is symmetric with respect to them. Therefore, HAUS 3 is now the more

powerful test. The Wald test which had excellent operating characteristics

when based on the correct model now loses substantial power. Note that the

Wald test is now based on inconsistent estimates. The power characteristics

of the LR and LM tests are symmetric with respect to the models. Thus, the

HAUS 2 is now significantly more powerful than the classical tests. The

ranking based on exact sizes is

HAUS 3 > LR > LM > WALD > HAUS 2

The HAUS 3 has approximately 10% more power than the LR and LM tests and 15%

more power from the Wald test. Thus, it seems useful to investigate

different tree structures when using the classical tests because of their

sensitivity to model specifications. In the last line of Table 4.4 we also

present the maximum likelihood estimates of X which all exceed the theoretical

maximum of 1.0 which again indicates misspecification , Likewise, it seems

useful to base the Hausman specification test on different restricted choice

sets since its power also depends on the model specification.

45

V. Conclusions

The finding that the Hausraan test has better power than the LR or LM

test for our example may make further evidence on the test statistics of

interest. We also engaged in very limited Monte Carlo examples on the

original data for the application of Sections 2 and 3. We took the

unrestricted MNL estimates of Table 2.1 to be the true values for the g's and

then generated choice outcomes using the nested logit model of equation (3.2)

to be the true model. For a range of X's we simulated choice behavior via

the generalized extreme value distribution, McFadden (1981),

F(.e^,e^,e^] = exp [-exp ([(-e^ + (-e^) ] - e^)] (5.1)

and then assigned the choice for each individual to the alternative with

the highest utility calculated by the random utility model, u.=z.6 for j=l,3. We

then compared the Hausman test of Section 2 with 6 degrees of freedom to the

trinity of tests from Section 3 which all have 1 degree of freedom. The

values of X that we chose were X = ( .25 , .5 , .75 , .9 , .95 , 1 .0) . At the 5% level

all tests rejected for X = ( .25 , . 5 , . 75 ) . For X = (.9, .95,1.0) none of the

tests rejected. As in our example of the previous section the Wald test

always achieved the highest value (after we normalized the Hausman test) when

it was based on the correctly specified model. But we cannot correct for

size here since we do not have exact values. Additionally, we generated

samples using the multinomial probit model of Hausman-Wise (1978).

46

A somewhat disturbing result here is that while the Hausraan test does not

reject the independent probit specification, which is quite similar to MNL

with IIA except for differences in the extreme tails of the underlying

distributions, the trinity of test rejects resoundingly with the statistics

above 5.0. We do not yet have a satisfactory explanation for this outcome.

In terms of applying the Hausraan test or the trinity of tests, the non-

uniqueness of application can arise. For instance in the 3 choice case any of

the 3 alternatives can be dropped or 3 different tree structures can be

defined. As the number of choices grows, the different possible combinations

grow factorially. If more than one test is performed, the problem of

controlling for the size arises because the tests will not be independent.

In the case of the Hausman test the different estimates could be combined to

form a test after their joint asymptotic covariance matrix is calculated

which would not be difficult^. Alternatively, using the IIA property of the

null hypothesis, the restricted choice set A could be successively decreased

when more than 3 choices exist and the size controlled for as we do when

independent F tests in linear models are used. However, we have no

prescription about optimal procedures in any given application.

We conclude that the Hausman test and Wald test are the best choices to

test the IIA assumption in MNL models. The Wald test does require maximum

likelihood, or at least asymptotically efficient estimates, of the nested

logit model and correct specification of the tree structure in the nested

^We attempted such a combination for our example in Section 4. But for that

example we found the tests to have perfect negative correlation. Therefore,it appears that it is possible to increase the power indefinitely for anynominal size test by choosing the appropriate weighted average of the test

statistics. We leave to further investigation the problem of optimalcombination of the tests.

47

logit model. The Wald test has higher power in all our examples than the LR

or LM test for the correct specification. However, for an incorrect

specification the LR and LM tests are superior to the Wald test. But the

Wald test has significantly greater computational requirements than does the

Hausman test. An alternative test based on a consistent estimate of X from

the sequential logit estimator is possible by the use of Neyman's [1959] C

procedure. However, our experience with the sequential logit estimator is

that it gave quite unreliable estimates of X method for testing.^ Since the

Hausman test gave results in general close to that of the Wald test across

the range of our examples, we recommend it as a general purpose specification

test for the MNL model. The Wald test should also be considered when the

analyst feels that the nested logit model provides the correct specification

for the choice problem under consideration. The Wald test requires more

sophisticated computer software; furthermore, once it is size corrected it

no longer is superior in our example to the Hausman test which is very

accurate with respect to its size. Furthermore, for the case of a

misspecified nested logit model, the alternative Hausman test performed much

better than the Wald test. But certainly some test of the IIA property

should be made when the MNL model is used. Our experience is that the

Hausman test has rejected the MNL specification in a number of applications.

^An alternative indication of the unreliability is that in the majority ofthe cases, one BHHH step beginning at the sequential logit estimates led to a

decrease in the value of the likelihood function. Since the C test is baseda

on the one step methodology, we decided against its use. Furthermore, in the

misspecified model case, we were often unable to find an increase in thelikelihood function even when up to three BHHH steps were made.

48

REFERENCES

Berndt, E., B. Hall, R. Hall, and J. Hausman (1974), "Estimation andInference in Nonlinear Structural Models", Annals of Economic and SocialMeasurement, 4.

Berndt, E. and G. Savin (1977), "Conflict Among Criteria for TestingHypotheses in the Multivariate Linear Regression Model", Econometrica,45.

Cox, D.R. and D. Hinckley (1974), Theoretical Statistics , London: Allen andUnwin.

Debreu, G. (1960), "Review of R. Luce, Individual Choice Behavior", AmericanEconomic Review, 50.

Domencich, T. and D. McFadden (1975), Urban Travel Demand, Amsterdam: NorthHolland.

Dubin, J. and D. McFadden (1980), "An Econometric Analysis of ResidentialElectrical Appliance Holdings and Usage", working paper, Department of

Economics, M.I.T.

Hausman, J. (1978), "Specification Tests in Econometrics", Econometrica 46 .

Hausman, J. (1979), "Individual Discount Rates and the Purchase and

Utilization of Energy Using Durables", Bell Journal of Economics , 10.

Hausman, J. and W. Taylor (1981), "The Relationship between Classical Testsand Specification Tests", MIT mimeo

.

Hausman, J. and D. Wise (1978), "A Conditional Probit Model for QualitativeChoice", Econometrica , 46.

Manski, C. and D. McFadden (1981), "Alternative Estimators and Sample Designsfor Discrete Choice Analysis" in C. Manski and D. McFadden (eds)

Structural Analysis of Discrete Data , Cambridge: MIT Press.

McFadden, D. (1973), "Conditional Logit Analysis of Qualitative ChoiceBehavior" in P. Zarembka (ed.) Frontiers in Econometrics, New York:

Academic Press.

McFadden, D., W. Tye, and K. Train (1976), "An Application of DiagnosticTests for the Independence from Irrelevant Alternatives Property of the

Multinomial Logit Model", Transportation Research Record 637.

49

McFadden, D. (1981), "Econometric Models of Probabilistic Choice", in C.

Manski and D. McFadden (eds), Structural Analysis of Discrete Data,

Cambridge, MIT Press.

Newman, D. , and D. Day (1975), The American Energy Consumer , Cambridge:Ballinger

.

Neyman, J. (1959), "Optimal Asymptotic Tests of Composite StatisticalHypotheses", in U. Grenander, ed

., Probability and Statistics, Stockholm :

Alraquist and Wiksell.

Peers, H. (1971), "Likelihood Rates and Associated Test Criteria",Biometrika, 58 .

Silvey, S. (1970), Statistical Inference, London: Penguin.

Date Due

JAN 18 19B9

MAR 3 V ]Q(.

JULoSlfeo

ACMEBOOKBINDING CO., INC.

SEP 1 5 i933

100 CAMBRIDGE STREET

CHARLESTOWN, MASS.

ARIES DUPL

[j'^iDflD DDM ms Tbl

2^(

3 TDflD DDM M77 3fl3

•

MIT LIBRARIES _DUPL __' -^Q

J<^3

[_3 TDfiD DD4^ MIS 17^

3 TDflo DD^ ms Ta7MIT LIBRARIES

3 TDflD DDM 415 TTS

MIT LIBRARIES .^93

3 =^060 DD4 41b DPIMIT LIBRARIE5__ ,.,,, 1111 nil II =i'^V

3 TDfiD DDM M Mb "133

I

^-f

3 TDflD DDM M7fl DSDmiT LIBRARIES '^f'^

3 TDSD DDM MMb "IMl

specification tests for the multinomial logit model · october1981 commentswelcome...

Documents