
Nonparametric testing for exogeneity with discrete regressors and instruments

Katarzyna Bech Grant Hillier

The Institute for Fiscal Studies Department of Economics, UCL

cemmap working paper CWP11/15


Nonparametric testing for exogeneity with discrete regressors and instruments

Katarzyna Bech and Grant Hillier
University of Southampton

March, 2015

Abstract

This paper presents new approaches to testing for exogeneity in non-parametric models with discrete regressors and instruments. Our interest is in learning about an unknown structural (conditional mean) function. An interesting feature of these models is that under endogeneity the identifying power of a discrete instrument depends on the number of support points of the instruments relative to that of the regressors, a result driven by the discreteness of the variables. Observing that the simple nonparametric additive error model can be interpreted as a linear regression, we present two test-statistics. For the point identifying model, the test is an adapted version of the standard Wu-Hausman approach. This extends the work of Blundell and Horowitz (2007) to the case of discrete regressors and instruments. For the set identifying model, the Wu-Hausman approach is not available. In this case the test-statistic is derived from a constrained minimization problem. The asymptotic distributions of the test-statistics are derived under the null and fixed and local alternatives. The tests are shown to be consistent, and a simulation study reveals that the proposed tests have satisfactory finite-sample properties.

1 Introduction

The possible presence of endogeneity is one of the common problems in econometric models. It occurs when the regressor is correlated with the model error term. Typically it is a result of omitting a relevant explanatory variable, of simultaneity in the model, or measurement error in the regressor. The presence of endogenous regressors in the nonparametric model produces bias in the identified case, and non-existence of any consistent estimator in the set identified case. Because of the potentially severe consequences of endogeneity, applied researchers need to check whether the explanatory variables used are exogenous, before providing an inference on the parameters of interest. Following the work of Hausman (1978), a vast literature on testing for exogeneity of the regressors has emerged.


Recently, with the expansion of interest in nonparametric models, new testing procedures have been developed. The problem of testing the correct specification of a nonparametric model of the form

$$Y = h(X) + \varepsilon \tag{1}$$

has been discussed by many authors including Fan and Li (1996), Zheng (1996), Lavergne and Vuong (2000), Lavergne and Patilea (2008) and Blundell and Horowitz (2007). These tests fit in a conditional moment restriction testing framework, and are based on the earlier work of Newey (1985) and Bierens (1990), among others.

All the nonparametric tests of this type assume that the regressors are continuously distributed. The aim of this paper is to provide a test for exogeneity in a nonparametric model with discrete explanatory variables. A model with discrete regressors arises in many economic problems. Variables such as gender, marital status or education levels typically take discrete values. When $X$ is binary it may indicate the occurrence of an event. In empirical applications, such regressors are called "dummy variables" taking values 0 or 1, for example, an individual is either male or female, working or unemployed. The discrete regressor with multiple categories might measure e.g. the number of children in a household, or give the position on an attitudinal scale. The nonparametric model with discrete regressors has been applied by Hu and Lewbel (2008) to identify and estimate the difference in average wages between individuals who falsely claim college experience and those who tell the truth about not completing college education. More recently, Iori, Kapar and Olmo (2014) use nonparametric methods to explain variation in the continuous variable (bank funding spreads) given a set of discrete regressors (bank characteristics, nationality, size and operating currency) in the European interbank money market.

The most popular method of dealing with endogeneity in econometric models is by instrumental variable (IV) estimation. Although IV methods are traditionally parametric in nature, the extension of the approach to a more flexible, non-parametric framework was introduced by Newey and Powell (2003). The method suggests that researchers should find a set of variables satisfying instrument relevance and exogeneity conditions and use them to consistently estimate the causal relationship between the dependent variable and endogenous regressors. However, the IV method involves some identification issues. The problem with identification is particularly noticeable in nonparametric models with additive errors when the regressors are discrete. Florens and Malavolti (2003) and Das (2005) show that the identification of the unknown function of interest depends on the support of instruments relative to the support of the endogenous regressor. If the identification condition is violated and point identification is not feasible, the model still has some partial identifying power. Partial identification can be achieved in models which cannot provide the exact value of the parameter or structure of interest, but contain enough information to bound these values to informative sets. Chesher (2004) discusses the estimation of the regression function $h(\cdot)$ in equation (1) within this framework.


One of the advantages of nonparametric models with discrete endogenous regressors is that they do not suffer from the ill-posed inverse problem that arises in nonparametric models with continuous endogenous regressors. The problem derives from the discontinuity of the mapping from the structural to the reduced form, when estimating an infinite dimensional function $h(\cdot)$ in continuous specifications (Newey and Powell (2003)). This means that $h(\cdot)$ cannot be estimated consistently by replacing the unknown population quantities with consistent estimators. In order to obtain a consistent estimator, it is necessary to regularize the mapping that identifies the unknown function of interest. Restricting the endogenous regressors to be discrete eliminates the ill-posed inverse problem. The discrete specification is well-posed, and no regularization of the problem is required.

The plan of this paper is as follows. Section 2 introduces the nonparametric model of interest and presents the notation that enables us to interpret equation (1) as a linear model. This section also explains the identification problems in the presence of endogenous regressors and shows some basic estimation results. Section 3 presents the test for models that point identify the unknown function of interest and establishes the asymptotic properties under the null and alternative hypothesis. Section 4 introduces the test for models that are set identified. The asymptotic distribution of the test statistic under the null and alternative hypothesis is also derived in this section. In Section 5, we present the results of the Monte Carlo investigation of the finite-sample properties of the proposed tests. Section 6 concludes. All proofs are in the appendix.

2 Model and assumptions

2.1 Notation

The upper case letters $X, Y, Z$ will denote observed random variables, and $x_i^s, y_i^s, z_i^s$ will denote sample (data) points. Symbols $x_k$ for $k = 1, \dots, K$ denote the points of support of a discrete random variable $X$. $I(A)$ stands for an indicator function, which takes value 1 if the event $A$ occurs, and is 0 otherwise. The probability density function of a continuous random variable $W$ is denoted by $f_W(w)$, and the probability mass function of a discrete random variable $X$ is $p_X(x)$. The cumulative distribution function is denoted by $F_X(x)$. For a matrix $A$ of full column rank we define $P_A = A(A'A)^{-1}A'$ and $M_A = I - P_A$, both of which depend only on the space spanned by the columns of $A$. For any $r$, $l_r$ denotes an $r$-vector of ones and $C_r$ denotes an $r \times (r-1)$ matrix with the properties $C_r' l_r = 0$ and $C_r' C_r = I_{r-1}$.
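The notation above can be made concrete with a small NumPy sketch. The function names (`proj`, `annihilator`, `contrast_matrix`) are illustrative and not from the paper, and the QR-based construction is only one of many valid choices of $C_r$, since the two stated properties do not pin the matrix down uniquely.

```python
import numpy as np

def proj(A):
    """P_A = A (A'A)^{-1} A', the projection onto the column space of A."""
    return A @ np.linalg.solve(A.T @ A, A.T)

def annihilator(A):
    """M_A = I - P_A, the projection onto the orthogonal complement."""
    return np.eye(A.shape[0]) - proj(A)

def contrast_matrix(r):
    """One valid C_r: an r x (r-1) matrix with C_r' l_r = 0 and C_r' C_r = I."""
    ones = np.ones((r, 1))
    # Complete QR of l_r: the first column of Q is proportional to l_r, so the
    # remaining r-1 columns are an orthonormal basis of its orthogonal complement.
    Q, _ = np.linalg.qr(ones, mode="complete")
    return Q[:, 1:]

# Small illustrative inputs (hypothetical values).
C4 = contrast_matrix(4)
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
P_A, M_A = proj(A), annihilator(A)
```

Any other orthonormal basis of the complement of $l_r$ would serve equally well, because the quantities built from $C_r$ later depend only on the space it spans.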

2.2 Model

We consider the simple additive error model in which a continuous outcome $Y$ is determined by equation (1), with $X$ a single discrete regressor, and $\varepsilon$ denotes a continuously distributed error term. The interest of econometricians typically lies in estimating the unknown structural function $h(\cdot)$. Consistent nonparametric estimation of $h(\cdot)$ is feasible under the assumption that the regressors are exogenous. Numerous definitions of exogeneity have been provided in the literature, see Deaton (2010). The standard exogeneity condition is that of an absence of correlation between the regressor and the model error term. Here we employ the definition proposed by Blundell and Horowitz (2007) for nonparametric regressions: the explanatory variable $X$ is exogenous if the conditional moment restriction $E[\varepsilon \mid X = x_k] = 0$ holds for all $k = 1, \dots, K$. In that case $E[Y \mid X] = h(X)$, i.e. the conditional mean of the dependent variable given $X$ coincides with $h(X)$. This definition has the advantage that the standard nonparametric regression of $Y$ on $X$ is then appropriate for the consistent estimation of the unknown function of interest $h(\cdot)$.

In the presence of endogeneity of regressors, further analysis needs to be conducted. The common strategy to deal with the endogeneity problem is to use instrumental variables. However, the choice of a consistent estimation method depends on a characteristic of the available instruments. The identifying power of the model varies with the number of the points of support of the instrumental variable (see Section 2.5). The complete model is characterized by the following set of assumptions:

Assumption 1. $X$ is a discrete (scalar) random variable with support $\{x_1, \dots, x_K\}$ and associated probabilities $p_k > 0$.

Assumption 2. There exists a discrete instrumental variable $Z$ with support $\{z_1, \dots, z_J\}$ and associated probabilities $q_j > 0$, with the property

$$E[\varepsilon \mid Z = z_j] = 0, \quad j = 1, \dots, J \tag{2}$$

which defines the instrument exogeneity condition.$^{1}$ The matrix of joint probabilities $P$ with elements

$$p_{jk} = \Pr[X = x_k \cap Z = z_j], \quad j = 1, \dots, J; \; k = 1, \dots, K$$

is of full rank $K$ when $J \geq K$ and of full rank $J$ when $J < K$.

Assumption 3. $E[X \mid Z = z_j]$ and $E[h(X) \mid Z = z_j]$ vary with $z_j$. The first condition (the instrument relevance condition) together with (2) ensures that $Z$ is a valid instrument.

Assumptions 2 and 3 are analogous to the standard assumptions for the validity of instruments in single equation IV estimation (see, for example, Greene (1993), Section 20.4.3).

Assumption 4. The data consists of $n$ iid observations on $(Y, X, Z)$, denoted by $(y_i^s, x_i^s, z_i^s)$ for $i = 1, \dots, n$. Under exogeneity, for all $j$ and $k$,

$$E[\varepsilon \mid X = x_k, Z = z_j] = 0 \quad \text{and} \quad \operatorname{Var}[\varepsilon \mid X = x_k, Z = z_j] = \sigma^2.$$

$^{1}$ Notice that we include in the support of $X$ and $Z$ only points for which $p_k$ and $q_j$ are strictly positive.


The complete model consists of equations (1) and (2). We are interested in testing the null hypothesis of exogeneity of the regressor, i.e. $E[\varepsilon \mid X = x_k] = 0$ for all $k$. Equivalently, in terms of observables,

$$H_0 : E[Y \mid X = x_k] = h(x_k), \quad k = 1, \dots, K.$$

If this condition is satisfied the unknown function $h(\cdot)$ can be consistently estimated nonparametrically.

In equation (1) the function $h(\cdot)$ is unknown and if $h(x_k)$ is completely arbitrary, the null hypothesis would not constrain the conditional density function of $Y$ given $X$, $f_{Y|X}(y|x)$, and would therefore be untestable. Thus, more information than just equation (1) is required for $H_0$ to become a testable hypothesis. This additional information is acquired by using the fact that there exists a valid instrument $Z$ satisfying (2) for any admissible $z_j$.

Let $n_k^X = \sum_{i=1}^{n} I(x_i^s = x_k)$ and $n_j^Z = \sum_{i=1}^{n} I(z_i^s = z_j)$ denote the multiplicities of $x_k$ and $z_j$ in the sample, and also $n_{jk} = \sum_{i=1}^{n} I(x_i^s = x_k) I(z_i^s = z_j)$. Under Assumption 2, the unknown function $h(\cdot)$ satisfies the set of $J$ linear equations

$$E[Y \mid Z = z_j] = \sum_{k=1}^{K} \Pr[X = x_k \mid Z = z_j] \, h(x_k), \quad j = 1, \dots, J. \tag{3}$$

Let $\beta$ denote the $K$-vector with $\beta_k = h(x_k)$, $k = 1, \dots, K$, $\pi$ be the $J$-vector with the elements $E[Y \mid Z = z_j]$, $j = 1, \dots, J$, and $\Pi$ be the $J \times K$ matrix of conditional probabilities $\Pr[X = x_k \mid Z = z_j]$, $j = 1, \dots, J$, $k = 1, \dots, K$. Then, (3) can be written as the system$^{2}$:

$$\pi = \Pi \beta. \tag{4}$$

The nonparametric nature of the model is reflected in the fact that $\beta$, the vector of values of $h(\cdot)$ at the support points of $X$, is completely unknown.

It is worth noting that equation (4) always has a solution (for $\beta$), since for each $j = 1, \dots, J$, by definition

$$E[Y \mid Z = z_j] = \sum_{k=1}^{K} \Pr[X = x_k \mid Z = z_j] \, E[Y \mid X = x_k, Z = z_j]$$

so that $\pi$ is certainly in the space spanned by the columns of $\Pi$. That $\Pi$ has full rank $\min\{J, K\}$ is part of Assumption 2.

The hypothesis $H_0$ imposes the constraint that the vector of conditional means $E[Y \mid X = x_k]$ is a solution to the linear equations $\pi = \Pi\beta$, so in this case the null hypothesis imposes a restriction on the conditional density function $f_{Y|X}(y|x)$ and is therefore testable.

$^{2}$ In the continuous case, equation (4) corresponds to the integral equation for the structural function (2.2) in Blundell and Horowitz (2007).


Remark 1 There might be other restrictions that can be imposed on $h(\cdot)$ to make the null hypothesis testable. In order to make sure that $h(\cdot)$ is not entirely arbitrary, one could impose some shape restrictions dictated by economic theory. Such restrictions are already in use in the literature of nonparametric estimation, for example by Hall and Huang (2001) who estimate the conditional mean function subject to a monotonicity constraint. Monotone estimates are required in many empirical applications, when the theory suggests that the outcome should be monotonic in explanatory variables, e.g. wage increasing in the years of schooling. Blundell, Horowitz and Parey (2012) use a different shape restriction and provide a nonparametric estimator of the demand function assuming that the unknown function $h(\cdot)$ satisfies the Slutsky condition of consumer theory. The literature suggests that imposing shape restrictions improves the precision of nonparametric estimates, but in our case it might also act as a tool to ensure that the hypothesis of exogeneity of regressors is testable.

The elements of the vector $\pi$ can be consistently estimated from the data, by averaging those $y_i$ that correspond to the observations with $z_i^s = z_j$, i.e. by

$$\hat{\pi}_j = \frac{\frac{1}{n}\sum_{i=1}^{n} y_i I(z_i^s = z_j)}{\frac{1}{n}\sum_{i=1}^{n} I(z_i^s = z_j)} = \frac{1}{n_j^Z} \sum_{i=1}^{n} y_i I(z_i^s = z_j). \tag{5}$$

The elements of the matrix of conditional probabilities $\Pi$ can be written as

$$\Pr[X = x_k \mid Z = z_j] = \frac{\Pr[X = x_k \cap Z = z_j]}{\Pr[Z = z_j]}$$

and can be consistently estimated by

$$\hat{\Pi}_{jk} = \frac{\frac{1}{n}\sum_{i=1}^{n} I(x_i^s = x_k) I(z_i^s = z_j)}{\frac{1}{n}\sum_{i=1}^{n} I(z_i^s = z_j)} = \frac{n_{jk}}{n_j^Z}. \tag{6}$$

Thus, $\pi$ and $\Pi$ can (ultimately) be learned from the data, and the problem is to use this information to make inference on $h(\cdot)$.
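As an illustration of (5) and (6), the following sketch computes both estimators directly from sample counts. It assumes the data have already been recoded so that support points are integer labels $0, \dots, K-1$ and $0, \dots, J-1$; the function name `estimate_pi_Pi` and the four-observation data set are hypothetical.

```python
import numpy as np

def estimate_pi_Pi(y, x_idx, z_idx, K, J):
    """Estimate pi (eq. 5) and Pi (eq. 6) from integer-coded data.

    x_idx, z_idx hold support-point labels in 0..K-1 and 0..J-1 (an assumed
    encoding); every instrument value must occur at least once.
    """
    n_Z = np.bincount(z_idx, minlength=J)                  # multiplicities n_j^Z
    sum_y = np.bincount(z_idx, weights=y, minlength=J)     # sum of y_i with z_i = z_j
    n_jk = np.zeros((J, K))
    np.add.at(n_jk, (z_idx, x_idx), 1.0)                   # joint multiplicities n_jk
    pi_hat = sum_y / n_Z
    Pi_hat = n_jk / n_Z[:, None]
    return pi_hat, Pi_hat

# Tiny hypothetical sample with K = 2 and J = 2.
y = np.array([1.0, 2.0, 3.0, 4.0])
x_idx = np.array([0, 1, 1, 1])
z_idx = np.array([0, 0, 1, 1])
pi_hat, Pi_hat = estimate_pi_Pi(y, x_idx, z_idx, K=2, J=2)
```

Each row of `Pi_hat` sums to one, as expected for a matrix of estimated conditional probabilities.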

Remark 2 In the discussion here, and also in what follows, it is implicitly assumed that all $K$ support points of $X$, and all $J$ of $Z$, occur in the sample. That is, that both $n_k^X$ and $n_j^Z$ are non-zero for all $k = 1, \dots, K$ and $j = 1, \dots, J$. This will ultimately (for large enough $n$) be the case with probability one. The alternative would be to define estimates for the $\pi_j$ and $\Pi_{jk}$ only for those points $x_k$ and $z_j$ that occur in the sample, say $K_s \leq K$ and $J_s \leq J$ points, and allow these to increase to $K$ and $J$ respectively, as $n$ increases. This would make the arguments and derivations to follow considerably more cumbersome, without materially affecting the results, so instead we will tacitly assume throughout that $n$ is large enough to ensure that $K_s = K$ and $J_s = J$.

There is no difficulty in extending the results by allowing for additional discrete exogenous regressors in the model as long as there is only one possibly endogenous explanatory variable. An unresolved issue is how to deal with multiple discrete endogenous regressors. Assuming that more than one regressor is endogenous is likely to affect the identification conditions and existing estimation and testing procedures. The model with multiple discrete endogenous regressors will be addressed in future research.

2.3 Linear Model Representation

The above setup can be represented compactly in terms of a linear model. To do so, define the $n \times K$ matrix $L_X$ with $(i, k)$ element

$$(L_X)_{ik} = I(x_i^s = x_k),$$

so that $(L_X)_{ik} = 1$ if observation $i$ corresponds to a value $x_k$ for $X$, and is 0 otherwise. Likewise, define the $n \times J$ matrix $L_Z$ with elements

$$(L_Z)_{ij} = I(z_i^s = z_j).$$

Note that the row sums of both $L_X$ and $L_Z$ are 1, because each row of both contains exactly one element that is equal to 1. Both $L_X$ and $L_Z$ are random matrices, because the positions of the non-zero elements, and the multiplicities of each $x_k$ and $z_j$, are determined randomly in the sample. Let $x$ denote the $K$-vector with elements $x_k$, $k = 1, \dots, K$, the support points of the regressor, and let $x^s = L_X x$ denote the $n$-vector of sample observations $x_i^s$, $i = 1, \dots, n$. Finally, let $y$ denote the $n$-vector of sample observations on $Y$, a realization of the random $n$-vector $\mathbf{Y}$.

Using the notation just introduced, (5) can be written as

$$\hat{\pi} = (L_Z' L_Z)^{-1} L_Z' y \tag{7}$$

and (6) becomes

$$\hat{\Pi} = (L_Z' L_Z)^{-1} L_Z' L_X. \tag{8}$$

The inverse in (7) and (8) exists almost surely for large enough sample size$^{3}$, since $\Pr[Z = z_j] = q_j > 0$. Note that

$$n^{-1} L_Z' L_Z \to_p \operatorname{diag}(q_j) := D_Z$$

because $\frac{1}{n}\sum_{i=1}^{n} I(z_i^s = z_j) \to_p E[I(z_i^s = z_j)] = \Pr[Z = z_j]$. Hence, by the Slutsky Theorem

$$\left(n^{-1} L_Z' L_Z\right)^{-1} \to_p D_Z^{-1}.$$

Similarly, the elements of $n^{-1} L_Z' L_X$ are consistent estimates of the joint probability matrix $P$. Therefore, $\hat{\pi} \to_p \pi$ and $\hat{\Pi} \to_p \Pi := D_Z^{-1} P$.

$^{3}$ Of course, for existence we require $n > K$ and $n > J$ here. And, as discussed in Remark 2, we are tacitly assuming that $n$ is large enough to ensure that $K_s = K$ and $J_s = J$.


Letting $\mathbf{X}$ denote the random $n$-vector of observations on $X$, the model can be written in the familiar form $E[\mathbf{Y} \mid \mathbf{X} = L_X x] = L_X \beta + E[\varepsilon \mid \mathbf{X} = L_X x]$, a linear model for the vector $\mathbf{Y}$ with random regressor matrix $L_X$ and unknown parameters $\beta_k = h(x_k)$, $k = 1, \dots, K$. The null hypothesis then takes the form:

$$H_0 : E[\mathbf{Y} \mid \mathbf{X} = L_X x] = L_X \beta.$$

Thus, although the model is purely nonparametric, it can be interpreted as a linear model. Note that even though in the nonparametric specification there is only one discrete regressor $X$, $L_X$ is $n \times K$ in the linear model specification. Also, observe that the support points $x_k$ of $X$ determine the points at which we can learn $h(\cdot)$, i.e., $\beta$, but do not appear elsewhere in the linear model. This familiar linear model specification allows us to connect the nonparametric estimators with well known regression estimators, particularly OLS and 2SLS.
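The linear representation can be sketched in a few lines of NumPy. The helper `indicator_matrix`, the integer coding of support points, and the simulated data are assumptions made for illustration only. The sketch builds $L_X$ and $L_Z$ and evaluates the matrix forms (7) and (8).

```python
import numpy as np

def indicator_matrix(idx, m):
    """n x m matrix whose (i, k) entry is I(idx[i] == k)."""
    L = np.zeros((len(idx), m))
    L[np.arange(len(idx)), idx] = 1.0
    return L

# Hypothetical integer-coded sample with K = 3 and J = 4.
rng = np.random.default_rng(0)
n, K, J = 500, 3, 4
z_idx = rng.integers(0, J, size=n)
x_idx = (z_idx + rng.integers(0, 2, size=n)) % K
y = rng.normal(size=n)

L_X = indicator_matrix(x_idx, K)
L_Z = indicator_matrix(z_idx, J)

pi_hat = np.linalg.solve(L_Z.T @ L_Z, L_Z.T @ y)      # eq. (7)
Pi_hat = np.linalg.solve(L_Z.T @ L_Z, L_Z.T @ L_X)    # eq. (8)
```

Because $L_Z' L_Z$ is diagonal with the multiplicities $n_j^Z$ on the diagonal, these expressions reduce to the cell averages and relative frequencies in (5) and (6).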

2.4 A complication

There is a relationship between $L_X$ and $L_Z$, which has an important implication for the further analysis. This is that every sample point is associated with exactly one support point of both $X$ and $Z$. It follows that, for any regressor $X$ and any instrument $Z$, the row sums of both $L_X$ and $L_Z$ are all equal to one. That is,

$$L_X l_K = L_Z l_J = l_n.$$

Algebraically, this says that the column spaces of $L_X$ and $L_Z$ always have the vector $l_n$ in common, and this needs to be taken into account in adapting existing procedures to the present problem. Let us, for brevity, call this Property C.

Note that Property C implies, in particular,

$$M_{L_X} L_Z l_J = M_{L_X} l_n = 0.$$

As a consequence of Property C, some matrices involving both $L_X$ and $L_Z$ have reduced rank. Hence, special attention has to be paid when dealing with these matrices.
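Property C and the resulting rank reduction can be checked numerically. The sketch below uses simulated integer-coded data (an assumption for illustration) and explicit rank tolerances, since numerical rank depends on the threshold chosen.

```python
import numpy as np

def indicator_matrix(idx, m):
    """n x m matrix whose (i, k) entry is I(idx[i] == k)."""
    L = np.zeros((len(idx), m))
    L[np.arange(len(idx)), idx] = 1.0
    return L

rng = np.random.default_rng(1)
n, K, J = 300, 3, 4
z_idx = rng.integers(0, J, size=n)
x_idx = (z_idx + rng.integers(0, 2, size=n)) % K
L_X, L_Z = indicator_matrix(x_idx, K), indicator_matrix(z_idx, J)

# M_LX = I - L_X (L_X'L_X)^{-1} L_X'
M_LX = np.eye(n) - L_X @ np.linalg.solve(L_X.T @ L_X, L_X.T)

row_sums_X = L_X @ np.ones(K)                  # equals l_n
row_sums_Z = L_Z @ np.ones(J)                  # equals l_n
shared = M_LX @ L_Z @ np.ones(J)               # equals 0 by Property C

# The stacked matrix [L_X, L_Z] cannot have full column rank K + J,
# and M_LX L_Z has rank at most J - 1.
rank_stacked = np.linalg.matrix_rank(np.hstack([L_X, L_Z]), tol=1e-8)
rank_MLZ = np.linalg.matrix_rank(M_LX @ L_Z, tol=1e-8)
```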

2.5 Identification

Newey and Powell (2003) and Das (2005) study identification of the unknown structural function $h(\cdot)$ in the presence of endogeneity of discrete regressors $X$. Florens and Malavolti (2003) and Das (2005) consider estimation in this framework. They show that nonparametric identification is achieved if the vector of instruments $Z$ has at least as many points of support as the endogenous regressor $X$ under a marginal covariation condition, i.e. $E[\varepsilon \mid Z = z] = c$, where $c$ is a constant that is invariant with respect to $Z$.

Using this marginal covariation restriction, one can normalize $c = 0$, producing the system of linear equations (4). Since the conditional expectations on the left hand side and probabilities on the right hand side are observables, (4) forms a set of linear equations in the unknown $h(x_k)$, i.e., in $\beta$. Hence, the value of the vector $\beta$ is identified if and only if the solution to these linear equations is unique. Assuming that equations in (4) represent the only information about $h(\cdot)$ that the data contains, point identification requires that the matrix $\Pi$ has rank $K$.

Proposition 1 (Newey and Powell (2003)) The necessary and sufficient condition for identification in the model $Y = h(X) + \varepsilon$, with discrete endogenous $X$ and a discrete instrument $Z$ satisfying $E[\varepsilon \mid Z] = 0$, both with finite support, is that the number of points of support of the instrument $Z$ is at least as large as the number of points of support of endogenous $X$.$^{4}$

Hence, if $J \geq K$, $\beta$ is point-identified for known $(\pi, \Pi)$ and $\beta = (\Pi'\Pi)^{-1}\Pi'\pi$. Even if the identification condition fails, the model still has partial identifying power. Partial identification arises in models which cannot provide the exact value of the parameter or structure of interest, but contain enough information to bound these values to informative sets. The literature on partial identification has been growing rapidly since the late 1980s. See Tamer (2010) for a detailed review.

Chesher (2004) presents the conditions under which the nonparametric model with discrete endogenous regressors partially identifies the conditional mean of the outcome by bounding its value in informative ways when the support of instruments is sparse relative to the support of endogenous regressor. If $J < K$, even though the exact value of the vector $\beta$ in (4) remains unknown, we are able to bound its value by quantities which are easily estimated from the data.
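The rank condition can be illustrated at the population level. In the sketch below the joint probability matrix `P`, the vector `beta`, and the function name `solve_beta` are hypothetical. It recovers $\beta$ from $(\pi, \Pi)$ when $J \geq K$ and reports failure of point identification when the rank of $\Pi$ falls short of $K$.

```python
import numpy as np

def solve_beta(Pi, pi):
    """Return beta = (Pi'Pi)^{-1} Pi' pi when rank(Pi) = K, else raise."""
    K = Pi.shape[1]
    if np.linalg.matrix_rank(Pi) < K:
        raise ValueError("beta is not point identified: rank(Pi) < K")
    return np.linalg.solve(Pi.T @ Pi, Pi.T @ pi)

# Hypothetical joint probabilities p_jk (rows: z_j, columns: x_k); J = 3, K = 2.
P = np.array([[0.20, 0.10],
              [0.10, 0.25],
              [0.05, 0.30]])
q = P.sum(axis=1)                 # marginal probabilities of Z
Pi = P / q[:, None]               # Pi = D_Z^{-1} P
beta = np.array([1.0, 2.0])       # hypothetical values h(x_1), h(x_2)
pi = Pi @ beta                    # system (4)

beta_recovered = solve_beta(Pi, pi)

# With a single instrument value (J = 1 < K = 2) the system is underdetermined.
try:
    solve_beta(Pi[:1], pi[:1])
    underidentified = False
except ValueError:
    underidentified = True
```

In the underidentified case any $\beta$ satisfying the single remaining equation is observationally equivalent, which is the situation addressed by the partial identification results cited above.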

2.6 Estimation

This section presents some basic estimation results under point identification. That is, we assume that $J \geq K$. Since $E[Y \mid X] = h(X) + E[\varepsilon \mid X]$, under exogeneity $\beta_k = h(x_k)$ can be nonparametrically estimated from the data by averaging the $y_i$ corresponding to all $x_i^s$ that equal $x_k$. Given the linear interpretation of the model, the standard OLS estimator for $\beta$ is

$$\hat{\beta} = (L_X' L_X)^{-1} L_X' y, \tag{9}$$

which coincides with the standard nonparametric estimator. The important observation is that the value of the conditional mean of $Y$ given $X$ does not depend on the values $x_k$ of $X$, and the configuration of $x_k$ in the sample (the position of non-zero elements in the matrix $L_X$) does not matter. The only thing that matters is the multiplicity of each $x_k$ in the sample. Since $n^{-1} n_k^X$ is a sample proportion, it converges in probability to $p_k$, i.e. the probability mass on the support point $x_k$.

Substituting the linear model $y = L_X \beta + \varepsilon$ in (9) gives

$$\hat{\beta} = (L_X' L_X)^{-1} L_X' y = \beta + (L_X' L_X)^{-1} L_X' \varepsilon$$

and since $(n^{-1} L_X' L_X)^{-1} \to_p D_X^{-1}$, where $D_X$ is $\operatorname{diag}(p_k)$, the matrix with the probability masses on each point of support of $X$ on the main diagonal and zero entries elsewhere, and $n^{-1} L_X' \varepsilon \to_p E_X[L_X' E[\varepsilon \mid X]] = 0$ under the null hypothesis, we have $\hat{\beta} \to_p \beta$; i.e., if $X$ is exogenous the OLS estimator $\hat{\beta}$ is a consistent estimator of $\beta$. We can also readily establish the asymptotic distribution of the OLS estimator.

$^{4}$ The result can also be found in Matzkin (2007), Chapter 73 in "Handbook of Econometrics".
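The equivalence between the OLS estimator (9) and cell averages can be checked with a short simulation. The data-generating process (exogenous $X$ with three support points, standard normal errors, the listed values of `beta`) is hypothetical and chosen only for illustration.

```python
import numpy as np

def indicator_matrix(idx, m):
    """n x m matrix whose (i, k) entry is I(idx[i] == k)."""
    L = np.zeros((len(idx), m))
    L[np.arange(len(idx)), idx] = 1.0
    return L

def ols_discrete(y, x_idx, K):
    """OLS estimator (9): (L_X'L_X)^{-1} L_X'y."""
    L_X = indicator_matrix(x_idx, K)
    return np.linalg.solve(L_X.T @ L_X, L_X.T @ y)

# Hypothetical exogenous design: the error is drawn independently of X.
rng = np.random.default_rng(1)
n, K = 2000, 3
beta = np.array([0.5, -1.0, 2.0])
x_idx = rng.choice(K, size=n, p=[0.2, 0.3, 0.5])
y = beta[x_idx] + rng.normal(size=n)

beta_hat = ols_discrete(y, x_idx, K)
cell_means = np.array([y[x_idx == k].mean() for k in range(K)])
proportions = np.bincount(x_idx, minlength=K) / n    # n_k^X / n, estimates p_k
```

Only the multiplicities $n_k^X$ and the within-cell sums of $y$ enter the estimator, in line with the observation made above.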

Theorem 1 Under the assumptions above, if $X$ is exogenous then the OLS estimator $\hat{\beta}$ is consistent and

$$\sqrt{n}\left(\hat{\beta} - \beta\right) \to_d N\left(0, \sigma^2 D_X^{-1}\right).$$

Remark 3 The primitive components of the elements of $\hat{\beta}$ are sums of random numbers of i.i.d. random variables, since the multiplicities and positions of the $x_k$ in the sample are random. At first sight, therefore, one might expect to need a central limit theorem adapted to this situation, such as those of, for example, Robbins (1948), or Anscombe (1952), both of which deal with this case. However, the problem turns out to be more straightforward, and Theorem 1 can be proved by using a multivariate version of the Lindeberg-Feller central limit theorem (see Appendix).

It can be shown that the covariance matrix $\sigma^2 D_X^{-1}$, under exogeneity, achieves the asymptotic Cramer-Rao bound and hence $\hat{\beta}$ is asymptotically efficient. And, the unknown parameter $\sigma^2$ can be consistently estimated by the usual estimator in a linear regression model: $n^{-1} y' M_{L_X} y \to_p \sigma^2$.

Of course, if $E[\varepsilon \mid X = x_k] \neq 0$, i.e. $X$ is endogenous, then

$$n^{-1} L_X' \varepsilon \to_p E_X[L_X' E[\varepsilon \mid X]] \neq 0$$

and $\hat{\beta}$ is an inconsistent estimator for $\beta$. However, if $X$ is endogenous the unknown function $h(\cdot)$ (or vector $\beta$) can be estimated using familiar IV methods. When the model point-identifies the structure of interest, the problem can be treated as a standard IV problem and the IV estimator for $\beta$ is

$$\begin{aligned}
\hat{\beta}_{IV} &= \left(\hat{\Pi}' L_Z' L_Z \hat{\Pi}\right)^{-1} \hat{\Pi}' L_Z' L_Z \hat{\pi} \\
&= \left(L_X' L_Z (L_Z' L_Z)^{-1} L_Z' L_X\right)^{-1} L_X' L_Z (L_Z' L_Z)^{-1} L_Z' y \\
&= (L_X' P_{L_Z} L_X)^{-1} L_X' P_{L_Z} y.
\end{aligned}$$

This is the IV estimator for $\beta$ in the null model $y = L_X \beta + \varepsilon$, in the presence of the instrument matrix $L_Z$. Even though in the nonparametric specification there is only one discrete instrument $Z$, we have $J$ instrumental values ($I(Z = z_j)$, $j = 1, \dots, J$) in the linear regression specification. The matrix of instruments corresponding to this interpretation of the model is $L_Z$, so the familiar requirements for the validity of the instruments are that $n^{-1} L_Z' L_X \to_p P$, a finite matrix of full rank; $n^{-1} L_Z' \varepsilon \to_p 0$ and $n^{-1} L_Z' L_Z \to_p D_Z$, a positive definite matrix (Greene (1993), p.601). All these conditions are covered by Assumption 2. Note that the matrix $P$ is the matrix of joint probabilities, and the full rank assumption for $P$ requires: in the case $J \geq K$, that there is no non-zero $K \times 1$ vector $x$ for which $Px = 0$, and in the case $J < K$, there is no non-zero $J \times 1$ vector $z$ such that $P'z = 0$. The last condition follows from the fact that $D_Z = \operatorname{diag}(q_1, \dots, q_J)$ with $q_j > 0$.

The IV estimator is consistent in both scenarios: when $X$ is exogenous and when it is endogenous, since $n^{-1} L_Z' L_X \to_p P$, $n^{-1} L_Z' L_Z \to_p D_Z$ and $n^{-1} L_Z' \varepsilon \to_p E_Z[L_Z' E[\varepsilon \mid Z]] = 0$. The last expression follows because of the instrument exogeneity condition (2). The asymptotic normality of the IV estimator is established through:

Theorem 2 Under the assumptions above, the IV estimator $\hat{\beta}_{IV}$ is consistent and

$$\sqrt{n}\left(\hat{\beta}_{IV} - \beta\right) \to_d N\left(0, \sigma^2 \left(P' D_Z^{-1} P\right)^{-1}\right).$$
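The contrast between OLS and IV under endogeneity can be illustrated with a simulation. The data-generating process below is entirely hypothetical: a three-valued instrument, a binary regressor driven by the instrument and a latent normal shock $v$, and an error correlated with $v$ through `rho` but independent of $Z$. All function names are illustrative.

```python
import numpy as np

def indicator_matrix(idx, m):
    """n x m matrix whose (i, k) entry is I(idx[i] == k)."""
    L = np.zeros((len(idx), m))
    L[np.arange(len(idx)), idx] = 1.0
    return L

def simulate(n, rho, rng, beta=(1.0, 2.0)):
    """Hypothetical DGP with J = 3 instrument values and K = 2 regressor values."""
    z_idx = rng.integers(0, 3, size=n)
    v = rng.normal(size=n)
    w = rng.normal(size=n)
    x_idx = (0.8 * (z_idx - 1) + v > 0).astype(int)    # X depends on Z and v
    eps = rho * v + np.sqrt(1.0 - rho**2) * w          # E[eps | Z] = 0 for any rho
    y = np.asarray(beta)[x_idx] + eps
    return y, x_idx, z_idx

def ols(y, L_X):
    return np.linalg.solve(L_X.T @ L_X, L_X.T @ y)

def iv(y, L_X, L_Z):
    """(L_X' P_LZ L_X)^{-1} L_X' P_LZ y without forming the n x n projection."""
    Pi_hat = np.linalg.solve(L_Z.T @ L_Z, L_Z.T @ L_X)   # eq. (8)
    pi_hat = np.linalg.solve(L_Z.T @ L_Z, L_Z.T @ y)     # eq. (7)
    XPX = L_X.T @ L_Z @ Pi_hat                           # L_X' P_LZ L_X
    XPy = L_X.T @ L_Z @ pi_hat                           # L_X' P_LZ y
    return np.linalg.solve(XPX, XPy)

rng = np.random.default_rng(3)
beta = np.array([1.0, 2.0])
y, x_idx, z_idx = simulate(20000, rho=0.8, rng=rng)
L_X, L_Z = indicator_matrix(x_idx, 2), indicator_matrix(z_idx, 3)
beta_ols = ols(y, L_X)
beta_iv = iv(y, L_X, L_Z)
```

In this design the OLS estimates should drift away from `beta` while the IV estimates stay close to it, which is the behaviour the Wu-Hausman comparison in the next section exploits.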

It can be shown that the IV estimator defined for the linear representation of the nonparametric model is equivalent to the standard nonparametric estimator (see, for example Das (2005)). The advantage of our approach is that the estimator can be written in a compact matrix notation, which is easier to work with.

It is crucial to understand that because $K$ and $J$ are fixed, we cannot estimate the entire unknown function $h(\cdot)$, but can only learn about specific values of $h(\cdot)$ at the support points. Additional information about $h(\cdot)$ could possibly be acquired if the support of the regressor (and instrument) were assumed to be increasing with the sample size. Allowing for growing dimensions could be considered as an abstract way of generating asymptotic approximations to the distributions of estimators and might result in different limiting behaviour instead of Theorems 1 and 2. Additionally, letting both $J$ and $K$ grow at a rate that is proportional to $n$ would have an impact on the identification analysis. It is possible that a model that is only set identified ($J < K$) in small samples, point identifies $h(\cdot)$ in large samples if $K$ is fixed and $J$ increases with $n$, or if $J$ grows faster than $K$. Therefore, considering such increasing dimensions might be an interesting extension of our work, but this topic is left for further research.

3 Testing for exogeneity under point identification

Assume that $J \geq K$ and the model point identifies the unknown function of interest $h(\cdot)$ by Proposition 1. The OLS estimator $\hat{\beta}$ is consistent and efficient if $X$ is exogenous, but inconsistent otherwise. The IV estimator is consistent in both cases, but inefficient if $X$ is exogenous. For this situation, then, the test is really just to decide which estimator to use (OLS or IV).

The standard Wu-Hausman-type statistic for testing exogeneity in this context is based on a quadratic form in the difference between the two estimators $\hat{\beta}_{IV}$ and $\hat{\beta}$, namely

$$\hat{\beta}_{IV} - \hat{\beta} = (L_X' P_{L_Z} L_X)^{-1} L_X' P_{L_Z} M_{L_X} y, \tag{10}$$

with the matrix of the quadratic form equal to the inverse of $\operatorname{Cov}(\hat{\beta}_{IV} - \hat{\beta})$, in order to produce a $\chi^2$ variable asymptotically (Hausman (1978)). The covariance matrix of the difference is, up to the scalar factor $\sigma^2$, given by

$$\operatorname{Cov}(\hat{\beta}_{IV} - \hat{\beta}) = (L_X' P_{L_Z} L_X)^{-1} L_X' P_{L_Z} M_{L_X} P_{L_Z} L_X (L_X' P_{L_Z} L_X)^{-1}. \tag{11}$$

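However, in this case, Property C implies that this covariance matrix is singular. To see this, observe that

$$\begin{aligned}
l_K' (L_X' P_{L_Z} L_X)(\hat{\beta}_{IV} - \hat{\beta}) &= l_K' L_X' P_{L_Z} M_{L_X} y \\
&= l_n' P_{L_Z} M_{L_X} y && (L_X l_K = l_n) \\
&= l_n' M_{L_X} y && (P_{L_Z} l_n = l_n) \\
&= 0 && (M_{L_X} l_n = 0).
\end{aligned}$$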

That is, for all $L_X$ and $L_Z$ there is an exact linear relation between the elements of $\hat{\beta}_{IV} - \hat{\beta}$, so its covariance matrix will always be singular.

We therefore need to adapt the Wu-Hausman test statistic to this situation. To do so we simply replace the inverse of the covariance matrix - the matrix that would normally be used in the quadratic form to produce an asymptotically $\chi^2$ test statistic - by a generalized inverse of that matrix. The covariance matrix in (11) can be written as

$$S = (L_X' P_{L_Z} L_X)^{-1} C_K \left[C_K' L_X' P_{L_Z} M_{L_X} P_{L_Z} L_X C_K\right] C_K' (L_X' P_{L_Z} L_X)^{-1},$$

since $M_{L_X} P_{L_Z} L_X [l_K, C_K] = [0, M_{L_X} P_{L_Z} L_X C_K]$ and $[l_K, C_K]^{-1} = [K^{-1} l_K, C_K]'$ (see Section 2.1 for notation). The middle matrix $C_K' L_X' P_{L_Z} M_{L_X} P_{L_Z} L_X C_K$ is a $(K-1)$-square matrix of full rank. Thus, the covariance matrix can be expressed as a matrix of the form $S = A^{-1} C B C' A^{-1}$, where $C$ is $m \times p$, $C'C = I_p$, $B$ is $p \times p$ nonsingular and symmetric, and $A$ is $m \times m$ nonsingular and symmetric. The generalized inverse of a matrix with this form is $S^+ = A C B^{-1} C' A$. To verify this it is sufficient to check that the two conditions that define a generalized inverse, i.e. $S S^+ S = S$ and $S^+ S S^+ = S^+$, both hold.

Therefore, the generalized inverse of the covariance matrix is

$$S^+ = (L_X' P_{L_Z} L_X) C_K [C_K' L_X' P_{L_Z} M_{L_X} P_{L_Z} L_X C_K]^{-1} C_K' (L_X' P_{L_Z} L_X).$$
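The two defining conditions can also be confirmed numerically for matrices of the stated form. The sketch below assumes NumPy and uses randomly generated $A$, $B$ and $C$ (purely illustrative, not the matrices of the model) that satisfy the stated properties.

```python
import numpy as np

rng = np.random.default_rng(0)
m, p = 4, 3                                   # in the paper's application m = K, p = K - 1

# A (m x m) and B (p x p): symmetric and nonsingular by construction.
G = rng.normal(size=(m, m))
A = G @ G.T + m * np.eye(m)
H = rng.normal(size=(p, p))
B = H @ H.T + p * np.eye(p)

# C (m x p) with orthonormal columns, C'C = I_p, taken from a reduced QR factorization.
C, _ = np.linalg.qr(rng.normal(size=(m, p)))

A_inv = np.linalg.inv(A)
S = A_inv @ C @ B @ C.T @ A_inv               # singular: rank p < m
S_plus = A @ C @ np.linalg.inv(B) @ C.T @ A   # candidate generalized inverse

cond1 = np.allclose(S @ S_plus @ S, S)
cond2 = np.allclose(S_plus @ S @ S_plus, S_plus)
print(np.linalg.matrix_rank(S), cond1, cond2)
```

Note that $S S^+ = A^{-1} C C' A$ need not be symmetric, so this generalized inverse does not in general coincide with the Moore-Penrose inverse.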

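Using this matrix to define the test statistic, we have

$$\begin{aligned}
T_n^* &= y' M_{L_X} P_{L_Z} L_X C_K [C_K' L_X' P_{L_Z} M_{L_X} P_{L_Z} L_X C_K]^{-1} C_K' L_X' P_{L_Z} M_{L_X} y \\
&= y' W_{XZ} (W_{XZ}' W_{XZ})^{-1} W_{XZ}' y,
\end{aligned}$$

where $W_{XZ} = M_{L_X} P_{L_Z} L_X C_K$ is $n \times (K-1)$.

Scaling to eliminate $\sigma^2$, we propose the test-statistic

$$T_n = \frac{y' W_{XZ} (W_{XZ}' W_{XZ})^{-1} W_{XZ}' y}{n^{-1} y' M_{L_X} y}. \tag{12}$$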

Observe that the values $x_k$ of $X$ and $z_j$ of $Z$ do not appear in the test statistic, nor does their configuration in the sample matter. The only things that appear are the multiplicities of each value in the sample, the $n_k^X$ and $n_j^Z$, and the multiplicity of the joint event $(X = x_k, Z = z_j)$, $n_{jk}$. Note that the numerator of the modified version of $T_n$ is easily computed from a linear regression of $y$ on $W_{XZ}$. Since $W_{XZ}$ is easy to construct in practice, the value of the test-statistic can be calculated efficiently in any statistical software package.
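As an illustration of how (12) might be computed, the following sketch assumes NumPy and integer category codes for $X$ and $Z$. The function names, the QR-based construction of one admissible $C_K$ (any matrix with orthonormal columns orthogonal to $l_K$ would serve) and the simulated data are assumptions made for the example only.

```python
import numpy as np

def contrast_matrix(k):
    """One admissible C_k: k x (k-1), orthonormal columns, orthogonal to the ones vector."""
    basis = np.hstack([np.ones((k, 1)), np.eye(k)[:, : k - 1]])
    q, _ = np.linalg.qr(basis)
    return q[:, 1:]                       # drop the column proportional to the ones vector

def projector(L):
    """Orthogonal projector onto the column space of L (full column rank assumed)."""
    return L @ np.linalg.solve(L.T @ L, L.T)

def exogeneity_stat_point(y, x_idx, z_idx, K, J):
    """T_n as in (12); x_idx, z_idx are category codes in 0..K-1 and 0..J-1."""
    n = y.shape[0]
    L_X = np.eye(K)[x_idx]
    L_Z = np.eye(J)[z_idx]
    M_X = np.eye(n) - projector(L_X)
    W = M_X @ projector(L_Z) @ L_X @ contrast_matrix(K)   # n x (K-1)
    coef = np.linalg.lstsq(W, y, rcond=None)[0]           # regression of y on W
    numerator = y @ (W @ coef)                            # y'W(W'W)^{-1}W'y
    return numerator / (y @ M_X @ y / n)

# Hypothetical data generated under exogeneity (so H0 holds).
rng = np.random.default_rng(0)
n, K, J = 500, 3, 4
z_idx = rng.integers(0, J, size=n)
x_idx = (z_idx + rng.integers(0, 2, size=n)) % K
h = np.array([0.0, 0.5, 1.0])                             # illustrative values h(x_k)
y = h[x_idx] + 0.2 * rng.normal(size=n)
T_n = exogeneity_stat_point(y, x_idx, z_idx, K, J)
print(T_n)
```

Because only the column space of $W_{XZ}$ matters, the statistic does not change when the categories of $X$ are relabelled, in line with the observation that the values $x_k$ do not enter.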

Remark 4 Using the generalized inverse is not the only way to deal with singularity of the covariance matrix. The naive approach would be to reduce the dimension of the test-statistic by eliminating, for example, the first element in the difference (10) and picking the lower-right corner of the covariance matrix in (11). Then the Wu-Hausman test-statistic of reduced dimension would follow standard results. An alternative approach would be to use the Moore-Penrose inverse of the covariance matrix (available in standard econometric software). All three approaches give similar values of the test-statistic; thus, in applications, the researcher could choose the method that is most convenient.

3.1 Asymptotic distribution under the null hypothesis

To discuss the asymptotic distribution of $T_n$, define the $(J-1) \times 1$ vector

$$z_n = C_J' L_Z' M_{L_X} y = C_J' L_Z' M_{L_X} \varepsilon. \tag{13}$$

The primitive components of $z_n$ are the two vectors $u_n = L_Z' \varepsilon$ and $v_n = L_X' \varepsilon$. Thus, we first consider the asymptotic behaviour of these two vectors, i.e. the joint asymptotic distribution of

$$\frac{1}{\sqrt{n}} w_n = \frac{1}{\sqrt{n}} \begin{pmatrix} u_n \\ v_n \end{pmatrix}.$$

This is given in:

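Lemma 1 Under $H_0$ and the given assumptions,

$$\frac{1}{\sqrt{n}} w_n \to_d N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \; \sigma^2 \begin{pmatrix} D_Z & P \\ P' & D_X \end{pmatrix} \right).$$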

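This result will also be useful in the set-identified model later. Now, $z_n$ is a linear function of $u_n$ and $v_n$,

$$\frac{1}{\sqrt{n}} z_n = \frac{1}{\sqrt{n}} C_J' \left( u_n - L_Z' L_X (L_X' L_X)^{-1} v_n \right)$$

with

$$\operatorname*{plim}_{n \to \infty} \frac{L_Z' L_X}{n} \left( \frac{L_X' L_X}{n} \right)^{-1} = P D_X^{-1}.$$

We therefore immediately obtain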

Lemma 2 Under $H_0$ and the given assumptions,

$$\frac{1}{\sqrt{n}} z_n \to_d N(0, \sigma^2 \Sigma)$$

where $\Sigma = C_J' (D_Z - P D_X^{-1} P') C_J$ is positive definite.

The numerator of the proposed test-statistic in (12) is a quadratic form in $z_n$:

$$T_n^* = z_n' A_n B_n^{-1} A_n' z_n$$

where

$$A_n = C_J' (L_Z' L_Z)^{-1} L_Z' L_X C_K$$

is a $(J-1) \times (K-1)$ matrix with probability limit equal to

$$A = C_J' D_Z^{-1} P C_K,$$

and

$$B_n = A_n' (C_J' L_Z' M_{L_X} L_Z C_J) A_n$$

is a $(K-1)$-square matrix.

Using these results we obtain the asymptotic distribution of $T_n$ under the null hypothesis:

Theorem 3 Under $H_0$, and the assumptions above,

$$T_n \to_d \chi^2_{K-1}.$$

The asymptotic behaviour of the test-statistic under the null hypothesis is fully characterized by the $\chi^2$ distribution. Therefore, for practical applications, the critical values can be easily obtained from statistical tables. The accuracy of this asymptotic result is examined in Section 5.

3.2 Asymptotics under the alternative hypothesis

In this section, we establish the asymptotic distribution of the test-statistic under a sequence of local alternatives. To show that the proposed test is consistent, i.e. that the power of the test approaches 1 as $n \to \infty$, we also discuss the asymptotic behaviour of $T_n$ under a fixed alternative hypothesis.

3.2.1 Local alternatives

Let $m(X, V)$ be a bounded function, depending on $X$ and another variable $V$, which does not appear in the model and is independent of $X$. Assume now that the conditional expectation of the error term is given by:

$$E[\varepsilon \mid X = x, V = v] = E[\varepsilon \mid X] = m(x, v),$$

and define the $K$ vector

$$m = \begin{pmatrix} m(x_1, v) \\ \vdots \\ m(x_K, v) \end{pmatrix}.$$

In the linear representation of the model, we have

$$E[\varepsilon \mid X = L_X x] = L_X m. \tag{14}$$

To derive the asymptotic distribution of the test statistic under the alternative hypothesis, consider the sequence of local alternatives in which $E[\varepsilon \mid X = L_X x] = n^{-1/2} L_X m$.

Theorem 4 Under the sequence of local alternatives to (14) and the assumptions above, the test statistic $T_n$ converges to a non-central $\chi^2_{K-1}(\lambda_L)$ distribution, with the noncentrality parameter

$$\lambda_L = \frac{\delta' A (A' \Sigma A)^{-1} A' \delta}{\sigma^2}$$

where $\delta = -C_J' P m$, $A = C_J' D_Z^{-1} P C_K$ and $\Sigma = C_J' [D_Z - P D_X^{-1} P'] C_J$.

The proof of Theorem 4 is based on familiar results for quadratic forms in normal variables with non-zero mean. The asymptotic behaviour of the test-statistic is captured by the non-central $\chi^2$ distribution. For a given size of test, the power increases with the noncentrality parameter $\lambda_L$. The value of this parameter depends on the distance between the inconsistent OLS estimator and the consistent IV estimator. Hence, the test is more powerful if the probability limit of the OLS estimator is far from the true value of the parameter of interest.

3.2.2 Fixed alternatives

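Let us next consider fixed alternatives of the form $H_1 : E(\varepsilon_i \mid X = x_k) = m(x_k, v_i)$. Building on the results used in the previous section, by a simple generalization of Lemma 1, we obtain

$$\frac{1}{\sqrt{n}} \begin{pmatrix} L_Z' \varepsilon \\ L_X' \varepsilon \end{pmatrix} \to_d N\left( \begin{pmatrix} 0 \\ \sqrt{n} D_X m \end{pmatrix}, \; \sigma^2 \begin{pmatrix} D_Z & P \\ P' & D_X \end{pmatrix} \right).$$

Additionally

$$\frac{1}{\sqrt{n}} z_n \to_d N(\delta_S, \sigma^2 \Sigma)$$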

with $\delta_S = -\sqrt{n} C_J' P m := \sqrt{n} \delta$, i.e. the mean is proportional to the square root of the sample size.

Therefore, under fixed alternatives the test statistic in (12) converges to a non-central chi-square distribution with $(K-1)$ degrees of freedom and noncentrality parameter $\lambda_F$ equal to

$$\lambda_F = \frac{\delta_S' A (A' \Sigma A)^{-1} A' \delta_S}{\sigma^2} = n \lambda_L.$$

The following proposition establishes the consistency of the test against a fixed alternative hypothesis.

Proposition 2 Under fixed alternatives and the earlier assumptions, the proposed test is consistent, i.e., for any fixed constant $c_\alpha$,

$$\Pr(T_n > c_\alpha) \to 1 \text{ as } n \to \infty.$$

Since $\delta_S$ is a multiple of $\sqrt{n}$, the noncentrality parameter is proportional to the sample size. This implies that if the alternative hypothesis holds, as $n \to \infty$, the chi-square distribution moves to the right and the probability of rejecting a false null hypothesis increases, i.e. $\lim_{n \to \infty} \Pr(\chi^2_{K-1}(\lambda_F) > c_\alpha) = 1$. Hence, as $n \to \infty$, the power of the test converges to 1 and the test is said to be consistent.
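The effect of a noncentrality parameter that grows linearly in $n$ can be illustrated by a small Monte Carlo exercise. The sketch below uses only the Python standard library and assumes $K = 3$, so that the limiting distribution has 2 degrees of freedom and the 5% critical value is available in closed form as $-2\log(0.05)$. The value of $\lambda_L$ is hypothetical and chosen only for illustration.

```python
import math
import random

def rejection_rate(noncentrality, crit, draws=100_000, seed=0):
    """Monte Carlo estimate of Pr(chi2_2(noncentrality) > crit)."""
    rng = random.Random(seed)
    shift = math.sqrt(noncentrality)
    hits = 0
    for _ in range(draws):
        # Noncentral chi-square with 2 df: squared norm of a N((shift, 0), I_2) vector.
        q = (rng.gauss(0.0, 1.0) + shift) ** 2 + rng.gauss(0.0, 1.0) ** 2
        hits += q > crit
    return hits / draws

alpha = 0.05
crit = -2.0 * math.log(alpha)      # exact upper 5% point of a central chi-square with 2 df
lambda_L = 0.02                    # hypothetical local noncentrality parameter

# lambda_F = n * lambda_L; n = 0 reproduces the null distribution.
rates = {n: rejection_rate(n * lambda_L, crit) for n in (0, 50, 200, 1000)}
for n, rate in rates.items():
    print(n, rate)
```

With zero noncentrality the rejection frequency should be close to the nominal 5%, and it rises towards 1 as $n$ increases.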

Remark 5 Under the alternative hypothesis we will have $E[\varepsilon \mid X = x_k] \neq 0$ for at least one value of $k$. For some specifications of how these values are determined the tests proposed above will have no power. This occurs if, when the null hypothesis fails,

$$E[Y \mid X = x_k] = h(x_k) + \xi(x_k)$$

where $\xi(x_k) = E[\varepsilon \mid X = x_k]$ depends only on $x_k$. In this case we will have the model

$$y = L_X(\beta + \xi) + \tilde\varepsilon$$

where $\tilde\varepsilon = \varepsilon - L_X \xi$, which is identical to the original model with the unknown $h$ replaced by the also-unknown $h + \xi$. Thus, it is not surprising that the test should have power equal to size in this circumstance.

4 Testing for exogeneity under set identification

In this situation ($J < K$) there is no consistent estimator (in the conventional sense) for $\beta$ if $X$ is endogenous, so in this case the test is to decide whether point estimation of $\beta$ is even possible. When $J < K$ the Wu-Hausman approach to testing $H_0$ is not available. However, assuming the existence of an instrument $Z$ with the properties given above, $\beta$ is constrained to satisfy the linear equations $\pi = \Pi\beta$, but is not point identified by them. That is, there is a set of vectors $\beta$, a subset of $R^K$, that satisfy these equations, of dimension $K - J$. The model maintains that $\beta$ belongs to this set, and $H_0$ says that $E[Y \mid X = L_X x] = L_X \beta$.

Now, consider the empirical counterpart of the system $\pi = \Pi\beta$, namely $\hat\pi = \hat\Pi\beta$, and the vector $\beta$ that, among all solutions to this system, minimizes $(y - L_X\beta)'(y - L_X\beta)$. That is, define

$$\hat\beta_Z = \arg\min_{\beta : \hat\pi = \hat\Pi\beta} (y - L_X\beta)'(y - L_X\beta).$$

Straightforward algebra gives

$$\begin{aligned}
\hat\beta_Z &= \hat\beta + (L_X' L_X)^{-1} \hat\Pi' \left[ \hat\Pi (L_X' L_X)^{-1} \hat\Pi' \right]^{-1} (\hat\pi - \hat\Pi\hat\beta) \\
&= \hat\beta + (L_X' L_X)^{-1} L_X' L_Z (L_Z' P_{L_X} L_Z)^{-1} L_Z' M_{L_X} y,
\end{aligned}$$

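where $\hat\beta$ is the OLS estimator defined earlier. The minimum achieved by this choice for $\beta$ is therefore

$$\begin{aligned}
Q_n &= (y - L_X\hat\beta_Z)'(y - L_X\hat\beta_Z) \\
&= y' M_{L_X} y + y' M_{L_X} L_Z (L_Z' P_{L_X} L_Z)^{-1} L_Z' M_{L_X} y.
\end{aligned}$$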

Intuitively, a large value for this minimum sum of squares is evidence against $H_0$, because it means that, among all solutions to $\hat\pi = \hat\Pi\beta$, none produces a small value of $(y - L_X\beta)'(y - L_X\beta)$. This suggests, not that $\pi \neq \Pi\beta$, because this is ruled out, but rather that $E[Y \mid X = L_X x] \neq L_X\beta$, i.e. that the null hypothesis is false. Normalizing $Q_n$ by dividing by $n^{-1} y' M_{L_X} y$, this argument suggests rejecting $H_0$ when the statistic

$$R_n = \frac{y' M_{L_X} L_Z (L_Z' P_{L_X} L_Z)^{-1} L_Z' M_{L_X} y}{n^{-1} y' M_{L_X} y}$$

is large.

Now, in view of Property C,

$$M_{L_X} L_Z [l_J, C_J] = [M_{L_X} l_n, M_{L_X} L_Z C_J] = [0, M_{L_X} L_Z C_J]$$

and the $(2,2)$ block of

$$\left[ [l_J, C_J]' (L_Z' P_{L_X} L_Z) [l_J, C_J] \right]^{-1} = \begin{pmatrix} n & l_n' L_Z C_J \\ C_J' L_Z' l_n & C_J' L_Z' P_{L_X} L_Z C_J \end{pmatrix}^{-1}$$

is given by

$$(C_J' L_Z' [P_{L_X} - P_{l_n}] L_Z C_J)^{-1}.$$

Thus, after taking account of Property C, $R_n$ reduces to

$$R_n = \frac{y' M_{L_X} L_Z C_J (C_J' L_Z' [P_{L_X} - P_{l_n}] L_Z C_J)^{-1} C_J' L_Z' M_{L_X} y}{n^{-1} y' M_{L_X} y} \tag{15}$$

with the middle matrix being $(J-1)$ square. Thus, although at first sight a quadratic form involving $J$ variables, the numerator of $R_n$ in fact involves only $J - 1$ terms.
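The reduction to (15) can be illustrated numerically. The sketch below assumes NumPy and integer category codes for $X$ and $Z$. The function names, the QR-based choice of $C_J$ and the simulated design with $J < K$ are assumptions made for the example. It computes $R_n$ both from the reduced form (15) and from the unreduced $J$-dimensional quadratic form, which should agree up to rounding.

```python
import numpy as np

def contrast_matrix(k):
    """One admissible C_k: k x (k-1), orthonormal columns, orthogonal to the ones vector."""
    basis = np.hstack([np.ones((k, 1)), np.eye(k)[:, : k - 1]])
    q, _ = np.linalg.qr(basis)
    return q[:, 1:]

def projector(L):
    """Orthogonal projector onto the column space of L (full column rank assumed)."""
    return L @ np.linalg.solve(L.T @ L, L.T)

def exogeneity_stat_set(y, x_idx, z_idx, K, J):
    """R_n in the reduced form (15)."""
    n = y.shape[0]
    L_X, L_Z = np.eye(K)[x_idx], np.eye(J)[z_idx]
    P_X = projector(L_X)
    M_X = np.eye(n) - P_X
    P_1 = np.full((n, n), 1.0 / n)                 # projector onto the constant vector l_n
    C_J = contrast_matrix(J)
    z_n = C_J.T @ L_Z.T @ M_X @ y                  # the (J-1)-vector in (13)
    middle = C_J.T @ L_Z.T @ (P_X - P_1) @ L_Z @ C_J
    return (z_n @ np.linalg.solve(middle, z_n)) / (y @ M_X @ y / n)

def exogeneity_stat_set_unreduced(y, x_idx, z_idx, K, J):
    """R_n before Property C is used to remove the redundant dimension."""
    n = y.shape[0]
    L_X, L_Z = np.eye(K)[x_idx], np.eye(J)[z_idx]
    P_X = projector(L_X)
    M_X = np.eye(n) - P_X
    a = L_Z.T @ M_X @ y
    return (a @ np.linalg.solve(L_Z.T @ P_X @ L_Z, a)) / (y @ M_X @ y / n)

# Hypothetical design with J < K, generated under exogeneity.
rng = np.random.default_rng(0)
n, K, J = 600, 5, 3
z_idx = rng.integers(0, J, size=n)
x_idx = z_idx + rng.integers(0, 3, size=n)         # values in 0..4
h = np.linspace(0.0, 2.0, K)                       # illustrative values h(x_k)
y = h[x_idx] + 0.2 * rng.normal(size=n)

R_reduced = exogeneity_stat_set(y, x_idx, z_idx, K, J)
R_full = exogeneity_stat_set_unreduced(y, x_idx, z_idx, K, J)
print(R_reduced, R_full)
```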

4.1 Asymptotic distribution under the null hypothesis

The following theorem gives the asymptotic distribution of the test statistic under the null hypothesis.

Theorem 5 Under $H_0$ and the assumptions above,

$$R_n \to_d \sum_{j=1}^{J-1} \omega_j \chi^2_j(1)$$

where the $\omega_j$ are positive eigenvalues satisfying

$$\det[\Sigma - \omega\Omega] = 0$$

with

$$\Omega = \operatorname*{plim}_{n \to \infty} \frac{1}{n} [C_J' L_Z' (P_{L_X} - P_{l_n}) L_Z C_J]$$

and

$$\Sigma = \operatorname*{plim}_{n \to \infty} \frac{1}{n} [C_J' L_Z' M_{L_X} L_Z C_J]$$

and the $\chi^2_j(1)$ variables are independent copies of a $\chi^2_1$ random variable.

The proposed test-statistic converges to a quadratic form in a normal vector $z$, and the distribution of that quadratic form is given by the distribution of a weighted sum of chi-square (1) random variables.

The asymptotic distribution of the proposed test with discrete regressors and instruments is similar to the distribution obtained by Blundell and Horowitz (2007) for the continuous case. Their test-statistic follows asymptotically the distribution of an infinite sum of weighted chi-square variables with 1 degree of freedom. When calculating the critical values, they face the additional problem of approximating an infinite sum by a finite number of terms. In the discrete case, the asymptotic distribution is more straightforward, since it is based on a finite sum of terms due to the discrete nature of variables. Nonetheless, the distribution theory for such variables is complicated, so there is an incentive to use approximations, several of which have been discussed extensively in the literature. In Section 4.3 we discuss the approximation proposed by Hall (1983) and further explored by Buckley and Eagleson (1988), which allows us to compute the critical values in practical applications.

4.2 Asymptotics under the alternative hypothesis

This section obtains the asymptotic distribution of $R_n$ under a sequence of local alternatives. The test is also shown to be consistent against fixed alternatives.

4.2.1 Local alternatives

Consider the sequence of local alternatives to (14). Using the vector $z_n$ defined in (13), the numerator of the test statistic in (15) can be written as

$$R_n^* = z_n' (C_J' L_Z' (P_{L_X} - P_{l_n}) L_Z C_J)^{-1} z_n$$

with $n^{-1/2} z_n \to_d N(\delta, \sigma^2 \Sigma)$, where $\delta = -C_J' P m$ and $\Sigma = C_J' [D_Z - P D_X^{-1} P'] C_J$ as before.

The following theorem establishes the asymptotic distribution of the test-statistic under local alternatives.

Theorem 6 Under the sequence of local alternatives to (14) and the assumptions above, the test statistic $R_n$ converges to a distribution of a weighted sum of non-central chi-square random variables:

$$R_n \to_d \sum_{j=1}^{J-1} \omega_j \chi^2_1(\lambda_j^2)$$

with the noncentrality parameters

$$(\lambda_1, \ldots, \lambda_{J-1})' = S' \Sigma^{-1/2} \delta = -S' \Sigma^{-1/2} C_J' P m$$

where $S$ denotes the orthogonal matrix of the eigenvectors of $\Sigma^{-1/2} \Omega^{-1} \Sigma^{-1/2}$.

Under local alternatives, the test-statistic asymptotically follows the distribution of a weighted sum of non-central chi-square (1) variables. This result again corresponds to the distribution obtained by Blundell and Horowitz (2007) for the continuous case.

4.2.2 Fixed alternatives

Under fixed alternatives (14) the test statistic in (15) converges to a weighted sum of noncentral $\chi^2(1)$ random variables, $\sum_{j=1}^{J-1} \omega_j \chi^2_1(\lambda_j^2)$, with

$$(\lambda_1, \ldots, \lambda_{J-1})' = -\sqrt{n} S' \Sigma^{-1/2} C_J' P m.$$

Since the noncentrality parameter is again proportional to the sample size for each term, the power of the test goes to 1 as $n \to \infty$.

Proposition 3 Under fixed alternatives, the proposed test is consistent, i.e., for any fixed constant $c_\alpha$,

$$\Pr(R_n > c_\alpha) \to 1 \text{ as } n \to \infty.$$

4.3 Computation of critical values

The asymptotic distribution of the test-statistic is non-standard and depends on the weights $\omega_j$, which, in practice, need to be estimated from the data. Since we cannot provide statistical tables with the appropriate tail probabilities and cut-off points, it is essential to find a quick technique for calculating the critical values of the proposed test.

Although the distribution of the weighted sum of chi-square variables has been studied in the literature since the 1960s and explicit formulas for the probability density function and the cumulative distribution function have been derived, they are rather complicated and difficult to handle in empirical applications. From the practical point of view, in order to calculate the critical values for the proposed test, it is crucial to be able to approximate the process of interest by a well known structure. Alternatively, one could use the inverse interpolation procedure for finding the critical values proposed by Sheil and Muircheartaigh (1977). However, this method is computationally intensive and requires specifying the upper and lower bounds on the weights, which we would like to avoid.

There are numerous ways of computing the critical values in this case. Letting $\hat\omega_j$ be consistent estimators of the weights $\omega_j$ under $H_0$, the distribution of $\sum_{j=1}^{J-1} \hat\omega_j \chi^2_j(1)$ can be simulated and appropriate $1 - \alpha$ quantiles can be used as critical values in the standard rejection rule. However, our experiments show that this approach is computationally intensive and time consuming. The second method involves simulating the quadratic form $z' \hat\Omega^{-1} z$ with $z \sim N(0, \hat\Sigma)$ and computing the quantiles. This method delivers satisfactory results and reduces the simulation time significantly. The third method is based on using an approximation to the distribution of a weighted sum of chi-square variables.

Even though a linear combination of independent chi-squared variables is, under regularity conditions, known to be asymptotically normally distributed when the sample size tends to $\infty$ (Johnson, Kotz and Balakrishnan (1994), p.444), the simulations reveal the unsatisfactory performance of the normal approximation. Hence, we suggest applying the approximation proposed by Hall (1983) and further explored by Buckley and Eagleson (1988), where the distribution of a weighted sum of $\chi^2_1$ random variables is approximated by the distribution of a variate $\tilde R = a\chi^2_v + b$ by choosing $(a, b, v)$ so that the first three cumulants of $R$ and $\tilde R$ agree.

The cumulants $\kappa_l$ of a random variable are defined via the cumulant-generating function $K(t)$, which is the logarithm of the characteristic function $\phi(t)$ with the following expansion (Muirhead (1982), p.40)

$$K(t) = \log(\phi(t)) = \sum_{l=1}^{\infty} \kappa_l \frac{(it)^l}{l!}.$$

Since the characteristic function $\phi(t)$ of a chi-square random variable with $r$ degrees of freedom is

$$\phi(t) = (1 - 2it)^{-r/2}$$

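the cumulant generating function $K(t)$ of a $\chi^2(r)$ variable is

$$K(t) = -\frac{r}{2} \log(1 - 2it) = \frac{1}{2} r \sum_{l=1}^{\infty} \frac{(2it)^l}{l}$$

and the cumulants $\kappa_l$ solve

$$\sum_{l=1}^{\infty} \kappa_l \frac{(it)^l}{l!} = \frac{1}{2} r \sum_{l=1}^{\infty} \frac{(2it)^l}{l}.$$

Let $R = \sum_{j=1}^{J-1} \omega_j \chi^2_j(1)$. The cumulants of this chi-squared-type mixture are given by$^5$

$$\kappa_l(R) = 2^{l-1} (l-1)! \sum_{j=1}^{J-1} \omega_j^l.$$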

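Therefore, the first three cumulants of $R$ are

$$\begin{aligned}
\kappa_1(R) &= E(R) = \sum_{j=1}^{J-1} \omega_j = \operatorname{trace}(\Sigma\Omega^{-1}) \\
\kappa_2(R) &= Var(R) = 2 \sum_{j=1}^{J-1} \omega_j^2 = 2 \operatorname{trace}\left( (\Sigma\Omega^{-1})^2 \right) \\
\kappa_3(R) &= E\left[ (R - E(R))^3 \right] = 8 \sum_{j=1}^{J-1} \omega_j^3 = 8 \operatorname{trace}\left( (\Sigma\Omega^{-1})^3 \right).
\end{aligned}$$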

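The cumulants of $\tilde R = a\chi^2_v + b$ are:

$$\kappa_1(\tilde R) = av + b, \quad \kappa_2(\tilde R) = 2a^2 v, \quad \kappa_3(\tilde R) = 8a^3 v.$$

To determine the parameters $a$, $b$ and $v$ we set $\kappa_m(\tilde R) = \kappa_m(R)$ for $m = 1, 2, 3$ which leads to

$$a = \frac{\kappa_3(R)}{4\kappa_2(R)}, \qquad b = \kappa_1(R) - \frac{2\kappa_2^2(R)}{\kappa_3(R)}, \qquad v = \frac{8\kappa_2^3(R)}{\kappa_3^2(R)}. \tag{16}$$

Hence the approximate cumulative distribution of $R$ is

$$F_R(t) = \Pr(R \leq t) \approx \Pr(\tilde R \leq t) = \Pr\left( \chi^2_v \leq \frac{t - b}{a} \right).$$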

5 See Severini (2005), Theorem 8.5, p. 245.

The critical value $c_\alpha$ solves

$$1 - \Pr\left( \chi^2_v \leq \frac{c_\alpha - b}{a} \right) = \alpha$$

for $\alpha$ = 1%, 5% or 10%.

Note that the parameter $v$ is typically not an integer and the $\chi^2_v$ distribution here is in fact a gamma distribution with parameters $\frac{1}{2}$ and $\frac{v}{2}$. In practice, the matrix $\Sigma\Omega^{-1}$ is unknown and, in order to calculate the values of the parameters in (16), it has to be replaced by its consistent estimate:

$$C_J' L_Z' M_{L_X} L_Z C_J [C_J' L_Z' (P_{L_X} - P_{l_n}) L_Z C_J]^{-1}.$$

An alternative (and popular) procedure for obtaining the critical values, based on the numerical inversion of the characteristic function, was proposed by Imhof (1961). This procedure is much more computationally intensive, since it requires the knowledge of all eigenvalues of $\Sigma\Omega^{-1}$, while for the three-cumulants approximation only the traces of powers of this matrix are needed.
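To make the three-cumulant approximation of this section concrete, the following sketch uses only the Python standard library. It takes a list of weights as input (sums of their powers equal the traces used in (16)), computes $(a, b, v)$, and obtains the critical value by bisection on a series evaluation of the chi-square distribution function with non-integer degrees of freedom. A direct simulation of the weighted sum is included for comparison. The function names, the numerical method for the distribution function and the example weights are assumptions made for illustration.

```python
import math
import random

def hbe_parameters(weights):
    """Match the first three cumulants of sum_j w_j * chi2_1 to those of a * chi2_v + b."""
    k1 = sum(weights)
    k2 = 2.0 * sum(w ** 2 for w in weights)
    k3 = 8.0 * sum(w ** 3 for w in weights)
    return k3 / (4.0 * k2), k1 - 2.0 * k2 ** 2 / k3, 8.0 * k2 ** 3 / k3 ** 2

def chi2_cdf(x, v):
    """Chi-square CDF for real v > 0 via the series for the regularized incomplete gamma."""
    if x <= 0.0:
        return 0.0
    s, t = v / 2.0, x / 2.0
    term = total = 1.0 / s
    k = 0
    while abs(term) > 1e-15 * abs(total):
        k += 1
        term *= t / (s + k)
        total += term
    return total * math.exp(-t + s * math.log(t) - math.lgamma(s))

def hbe_critical_value(weights, alpha):
    """Solve Pr(chi2_v <= (c - b) / a) = 1 - alpha for c by bisection."""
    a, b, v = hbe_parameters(weights)
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, v) < 1.0 - alpha:      # expand until the quantile is bracketed
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, v) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return a * hi + b

def simulated_critical_value(weights, alpha, draws=100_000, seed=0):
    """Empirical (1 - alpha) quantile of simulated draws of sum_j w_j * chi2_1."""
    rng = random.Random(seed)
    sims = sorted(sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights) for _ in range(draws))
    return sims[int((1.0 - alpha) * draws)]

weights = [1.4, 0.9, 0.3]                     # hypothetical estimated weights
print(hbe_critical_value(weights, 0.05), simulated_critical_value(weights, 0.05))
```

When all weights equal one the approximation is exact, since then $a = 1$, $b = 0$ and $v = J - 1$, which gives a simple check of the implementation.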

5 Monte Carlo simulations

In this section, we discuss the results of Monte Carlo simulations designed to examine the finite sample size and power properties of the proposed tests. We modify Blundell and Horowitz's (2007) setup by generating $X$ and $Z$ as discrete random variables.

5.1 Simulation design

In the experiments, realizations of $(X, Z)$ are generated as $Z \sim Binomial(J - 1, p_Z)$ with $p_Z = 0.5$ and $X$ is a function of $Z$ such that

$$X = x_k \text{ if } a < X^* \leq b$$

where $a$ and $b$ are constants, and $X^* = \rho Z + (1 - \rho^2)^{1/2} v$ with $v \sim N(0, 1)$ and $\rho \in \{0.35, 0.7\}$. Note that $\rho$ measures the strength of the relationship between $X$ and $Z$. Weak instruments are characterized by $\rho = 0.35$ and $\rho = 0.7$ characterizes strong instruments. The realizations of a continuous outcome $Y$ are generated from

$$Y = \theta_0 + \theta_1 X + \sigma_\varepsilon \varepsilon$$

where $\varepsilon = \eta v + (1 - \eta^2)^{1/2} u$ with $u \sim N(0, 1)$ and $\theta_0 = 0$, $\theta_1 = 0.5$ and $\sigma_\varepsilon = 0.2$. The parameter $\eta$ measures the strength of the relationship between $X$ and $\varepsilon$, and its value varies across experiments. The null hypothesis is true if $\eta = 0$ and false otherwise. The experiments use sample sizes of $n$ = 50, 100, 200, 400 and 1000 observations and there are 2000 Monte Carlo replications in each experiment.
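A design of this kind could be simulated along the following lines. The sketch assumes NumPy. Since the constants $a$ and $b$ that define the cells of $X$ are not reported here, the cut points are taken, purely as an assumption, to be interior empirical quantiles of $X^*$, and the support points are assumed to be $x_k = k - 1$. The function name and return format are likewise hypothetical.

```python
import numpy as np

def simulate_design(n, J, K, rho, eta, rng):
    """Draw (Y, X, Z, eps) from a discretized design; cut points and x_k are assumptions."""
    z = rng.binomial(J - 1, 0.5, size=n)              # Z ~ Binomial(J - 1, 0.5)
    v = rng.normal(size=n)
    u = rng.normal(size=n)
    x_star = rho * z + np.sqrt(1.0 - rho ** 2) * v    # latent index X*
    # ASSUMPTION: cells are delimited by the k/K empirical quantiles of X*.
    cuts = np.quantile(x_star, np.arange(1, K) / K)
    x = np.searchsorted(cuts, x_star, side="left")    # cell k if cuts[k-1] < X* <= cuts[k]
    eps = eta * v + np.sqrt(1.0 - eta ** 2) * u       # eta = 0 gives exogeneity
    y = 0.0 + 0.5 * x + 0.2 * eps                     # theta_0 = 0, theta_1 = 0.5, sigma = 0.2
    return y, x, z, eps

rng = np.random.default_rng(0)
y0, x0, z0, eps0 = simulate_design(20_000, J=3, K=3, rho=0.7, eta=0.0, rng=rng)
y1, x1, z1, eps1 = simulate_design(20_000, J=3, K=3, rho=0.7, eta=0.8, rng=rng)
print(np.corrcoef(x0, eps0)[0, 1], np.corrcoef(x1, eps1)[0, 1])
```

Under $\eta = 0$ the sample correlation between $X$ and $\varepsilon$ should be close to zero, while a positive $\eta$ induces endogeneity through the shared component $v$.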

5.2 Size analysis $J \geq K$

Recall that under the null hypothesis $T_n \to_d \chi^2_{K-1}$, so the critical values are easily obtained from statistical tables. For the size analysis, $\eta = 0$ and the errors are generated as $N(0, 1)$. The empirical size of the proposed test for different combinations of $J$ and $K$ (satisfying $J \geq K$) is presented in Tables 1 and 2.

K=2                  J=2                      J=3                      J=4
ρ     sample size    1%     5%    10%        1%     5%    10%        1%     5%    10%
0.35       50       1.20   5.25  10.25      0.85   5.85  10.80      0.95   5.05  10.20
0.35      100       0.90   4.95  10.05      1.00   4.95   9.35      0.95   5.35  10.50
0.35      200       1.30   5.10  10.20      1.10   5.05  10.45      1.10   5.10  11.20
0.35      400       0.85   5.20  10.20      1.10   5.05  10.45      1.10   5.10  10.25
0.7        50       1.40   5.70  10.95      1.25   5.05  10.05      1.05   5.60  10.50
0.7       100       0.80   4.50   9.55      0.85   4.95  10.10      1.25   4.95   9.65
0.7       200       1.10   4.55  10.80      1.25   4.85  10.10      0.95   5.40   9.70
0.7       400       1.25   5.20  10.10      0.95   4.95  10.50      1.10   5.15  10.40

Table 1: Proportion of rejections under the null hypothesis; K=2

K=3                  J=3                      J=4                      J=5
ρ     sample size    1%     5%    10%        1%     5%    10%        1%     5%    10%
0.35       50       0.85   5.50  11.50      0.75   4.75   9.35      1.10   5.40  10.20
0.35      100       0.95   5.35  10.75      0.85   5.35  11.20      1.05   5.45  10.30
0.35      200       0.80   4.75  10.10      1.25   5.25  10.95      1.25   5.50  10.45
0.35      400       0.85   4.85  10.25      1.10   5.05   9.85      1.20   5.10   9.55
0.7        50       0.85   5.70  11.40      0.95   5.60  10.15      0.90   5.10  10.65
0.7       100       1.05   5.90  11.20      1.15   5.15  10.70      0.95   5.15  10.30
0.7       200       1.00   5.40  10.65      1.25   5.05  10.35      1.10   5.75  10.70
0.7       400       1.45   5.80  10.65      0.95   5.45  10.55      1.05   5.10   9.65

Table 2: Proportion of rejections under the null hypothesis; K=3

The empirical size is reasonably close to the nominal values of 1%, 5% and 10%, even in small samples of 50 observations. The size does not seem very sensitive to changes in the number of points of support of the endogenous regressor and instrument, and does not vary with the strength of the instrument.

5.3 Power analysis $J \geq K$

For the power analysis, the errors are generated as $\varepsilon = \eta v + (1 - \eta^2)^{1/2} u$, $u \sim N(0, 1)$. Recall that this specification excludes the alternatives with $E[\varepsilon \mid X = x_k] = \xi(x_k)$ in which the power is equal to the size of the test. The results of the power analysis at the 5% significance level for different sample sizes are summarized in Figures 1 and 2.

Figure 1: Empirical power for K=2 and J=2 with weak (a) and strong (b) instruments

Figure 2: Empirical power for K=3 and J=3 with weak (a) and strong (b) instruments

The proposed test exhibits satisfactory power properties. The empirical power increases with the sample size and converges to 1 quickly. For a fixed number of support points of the endogenous regressor and instrument, the empirical power is higher if the instruments used in the experiment are strong. The test also has higher power if the support of the endogenous regressor is larger.

Figures 3 and 4 show how the empirical power changes with the number of points of support of the instrument.

Figure 3: Empirical power for K=2 and n=400 with weak (a) and strong (b) instruments

Figure 4: Empirical power for K=3 and n=400 with weak (a) and strong (b) instruments

If the instrument is weak, for fixed $K$, the empirical power of the test increases when an additional point of support is added. Therefore, for weak instruments, the larger the support of $Z$, the more powerful the test is. This suggests that in practice the researcher should look for an instrument with many support points to increase the probability of detecting the endogeneity of the regressor.

On the other hand, if the instrument is strong, the empirical power remains roughly the same if the difference between the support of $X$ and $Z$ is small, but decreases with the gap between $J$ and $K$.

5.4 Size analysis J < K

We have experimented with different methods of computing the critical values for the proposed test. The three methods proposed in Section 4.3 produce very similar results for the empirical size and power of the test. In this section, we present the results based on the chi-square approximation, which minimizes the computational time. The empirical size of the proposed test is presented in Tables 3 and 4.

K=5                  J=2                      J=3                      J=4
ρ     sample size    1%     5%    10%        1%     5%    10%        1%     5%    10%
0.35       50       0.85   5.80  11.10      1.25   5.30   9.65      1.45   5.65  10.80
0.35      100       1.10   5.00  10.45      1.20   5.65  10.85      0.85   5.25  10.80
0.35      200       0.90   4.85  10.00      0.80   4.55   9.65      0.95   4.55   9.65
0.35      400       0.85   5.50  10.50      1.15   6.15  10.55      1.10   5.50   9.75
0.7        50       1.30   5.95  11.20      1.05   5.65  11.30      1.35   5.60  10.80
0.7       100       1.20   6.10  11.50      1.35   5.85  11.35      1.55   5.80  11.20
0.7       200       1.15   5.80  10.90      1.05   4.75   9.70      1.10   5.35  10.35
0.7       400       1.20   5.65   9.80      1.05   5.30  10.10      0.95   4.95  10.50

Table 3: Proportion of rejections under the null hypothesis; K=5

K=6                  J=3                      J=4                      J=5
ρ     sample size    1%     5%    10%        1%     5%    10%        1%     5%    10%
0.35       50       1.20   5.80  11.10      1.25   6.10  12.05      1.60   5.90  11.20
0.35      100       1.20   5.70  10.50      0.85   5.25  10.95      1.10   6.15  11.50
0.35      200       1.05   4.80  10.15      1.00   5.60  11.10      1.50   5.55  10.40
0.35      400       0.95   4.65   9.80      0.90   5.05   9.85      0.90   5.25  10.50
0.7        50       1.20   6.15  11.30      1.10   5.05   9.25      1.25   5.85   9.75
0.7       100       1.05   5.35  10.55      0.95   4.75  10.30      1.60   5.70  10.15
0.7       200       1.15   5.10   9.85      0.85   5.05   9.90      1.65   5.90  11.20
0.7       400       0.90   5.40  10.75      1.05   5.65  11.10      0.95   5.20  10.40

Table 4: Proportion of rejections under the null hypothesis; K=6

The test has adequate size in all cases, even in the small samples of 50 observations. The size is not sensitive to changes in the number of points of support and the strength of the relationship between the endogenous regressor and the instrument.

5.5 Power analysis J < K

The results of a power analysis at the 5% significance level are presented in Figures 5 and 6.

Figure 5: Empirical power for K=5 and J=2 with weak (a) and strong (b) instruments

Figure 6: Empirical power for K=6 and J=3 with weak (a) and strong (b) instruments

The empirical power increases with the sample size and in some cases (strong instruments and large $\eta$) converges quickly to 1. The proposed test performs particularly well if the instruments used in the experiment are strong. In general, the results are more than satisfactory given the fact that the model is only partially identified under the alternative hypothesis. The few testing procedures for partially identified models developed recently are typically complicated and allow testing of only a limited range of hypotheses. We provide a simple exogeneity test based on standard results that can be applied in this conventionally untestable context.

Figures 7 and 8 show how the empirical power changes with the number of points of support of the instrument.

Figure 7: Empirical power for K=5 and n=400 with weak (a) and strong (b) instruments

Figure 8: Empirical power for K=6 and n=400 with weak (a) and strong (b) instruments

For a fixed number of points of support of the regressor, the proposed test detects endogeneity of the regressor better when the support of the instrument is smaller. Hence, for both weak and strong instruments, the power of the test is decreasing with the number of points of support of $Z$. Therefore, in applications, in order to obtain higher power in detecting endogeneity, among all the instruments available, the one with the smallest number of support points should be chosen. Note that if the gap between $K$ and $J$ is small, the test tends to be more powerful with weak instruments. This counter-intuitive behaviour of the power function might be due to the fact that the chi-squared approximation is more accurate with smaller $J$. Simulations reveal that the approximation error is small up to 5 terms in the weighted sum. Therefore, in experiments with a large support of the instrumental variable, the critical values should be computed using another method discussed above.

6 Conclusion

The consistency of standard nonparametric estimation procedures fails in the presence of endogeneity in the model. Therefore, in order to choose a consistent estimation technique, the applied researcher should test whether the explanatory variable(s) used in the model are exogenous. This paper has provided two consistent tests for exogeneity in nonparametric models, when the single explanatory variable is discrete. To the best of our knowledge, there exist no such tests for nonparametric models with discrete regressors. In models that point identify the unknown function of interest, the test is built on a quadratic form of a difference between two estimators, one of which is consistent only under exogeneity and the other is consistent under both scenarios. This testing framework follows closely the Wu-Hausman type of test. It has been shown that under the null hypothesis of exogeneity, the test statistic follows a chi-square distribution asymptotically and that the test is consistent against fixed alternatives.

In models that set identify the structure of interest, the test-statistic is based on a constrained minimized sum of squares. We have shown that under the null hypothesis, the proposed test-statistic converges to a weighted sum of chi-square (1) random variables. Under the alternative hypothesis the test-statistic converges to a weighted sum of noncentral chi-square (1) random variables. The proposed test is thus shown to be consistent with asymptotic power approaching 1 as the sample size increases.

The results of Monte Carlo simulations have shown satisfactory finite-sample properties of the proposed tests. Based on our experiment, we can conclude that:

• both tests have correct size even in small samples,

• empirical power increases with the sample size and converges to 1,

• using a strong instrument leads to better power properties,

• empirical power changes with the number of support points of both endogenous regressor and instrument.

Particularly interesting is the fact that the power increases with the gap between the number of points of support of the variables. Therefore, assuming that there is a choice between valid instruments for the applied researcher, when $J \geq K$, they should choose the one with the most points of support (when the instruments are weak), and when $J < K$ choose the one with the smallest number of support points in order to increase the probability of detecting endogeneity of the regressor.


Appendix: Proofs

The crucial result underlying the analysis is the asymptotic distribution of the vector $(\varepsilon' L_X, \varepsilon' L_Z)'$. Therefore, we first prove Lemma 1 and use it to discuss the other results.

Proof of Lemmas 1 and 2

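Clearly $E[w_n] = 0$, since for each $j = 1, \dots, J$,

$$E[u_{nj}] = E_Z\Big[\sum_i I(z_i^s = z_j)\, E[\varepsilon_i \mid z_i^s]\Big] = 0,$$

and for each $k = 1, \dots, K$,

$$E[v_{nk}] = \sum_i E_X\big[I(x_i^s = x_k)\, E[\varepsilon_i \mid x_i^s]\big] = 0.$$

Therefore,

$$\mathrm{Var}(u_{nj}) = E\Big(\sum_{i=1}^n \varepsilon_i I(z_i^s = z_j)\Big)^2 = n\sigma^2 q_j$$

and

$$\mathrm{Var}(v_{nk}) = E\Big(\sum_{i=1}^n \varepsilon_i I(x_i^s = x_k)\Big)^2 = n\sigma^2 p_k.$$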

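The covariance between $u_{nj}$ and $v_{nk}$ is

$$
\begin{aligned}
\mathrm{cov}(u_{nj}, v_{nk}) &= E\Big[\Big(\sum_{i=1}^n \varepsilon_i I(z_i^s = z_j)\Big)\Big(\sum_{i=1}^n \varepsilon_i I(x_i^s = x_k)\Big)\Big] \\
&= E\Big(\sum_{i=1}^n \varepsilon_i^2 I(z_i^s = z_j) I(x_i^s = x_k)\Big) \\
&= \sigma^2 E\Big(\sum_{i=1}^n I(z_i^s = z_j) I(x_i^s = x_k)\Big) = n\sigma^2 p_{jk}.
\end{aligned}
$$

The covariance between two different elements of $u_n$ is zero, because for $j \neq l$

$$\mathrm{cov}(u_{nj}, u_{nl}) = E\Big(\sum_{i=1}^n \varepsilon_i^2 I(z_i^s = z_j) I(z_i^s = z_l)\Big) = 0,$$

since $I(z_i^s = z_j) I(z_i^s = z_l) = 0$, as the indicated events cannot occur simultaneously.

Similarly, for $k \neq s$

$$\mathrm{cov}(v_{nk}, v_{ns}) = E\Big(\sum_{i=1}^n \varepsilon_i^2 I(x_i^s = x_k) I(x_i^s = x_s)\Big) = 0.$$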
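The moments derived above can be checked numerically. The following is a minimal simulation sketch, not part of the original analysis: it assumes errors drawn independently of the regressor and instrument (the exogenous, homoskedastic case), and the function names and the example probability matrix are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_w_covariance(P, sigma=1.0, n=200, reps=4000, seed=0):
    """Monte Carlo estimate of Cov(w_n) / n, where w_n stacks u_n and v_n.

    P is the J x K matrix of joint probabilities p_jk = Pr(Z = z_j, X = x_k).
    Errors are independent of (Z, X): an assumption of this sketch.
    """
    P = np.asarray(P, dtype=float)
    J, K = P.shape
    rng = np.random.default_rng(seed)
    # Draw the (z, x) cell of every observation from the joint distribution.
    cells = rng.choice(J * K, size=(reps, n), p=P.ravel() / P.sum())
    z_idx, x_idx = np.divmod(cells, K)  # row-major cell index = j * K + k
    eps = rng.normal(0.0, sigma, size=(reps, n))
    # u_nj = sum_i eps_i I(z_i = z_j) and v_nk = sum_i eps_i I(x_i = x_k)
    u = np.stack([(eps * (z_idx == j)).sum(axis=1) for j in range(J)], axis=1)
    v = np.stack([(eps * (x_idx == k)).sum(axis=1) for k in range(K)], axis=1)
    w = np.hstack([u, v])
    return np.cov(w, rowvar=False) / n

def theoretical_V(P, sigma=1.0):
    """sigma^2 * [[D_Z, P], [P', D_X]] built from the joint probabilities."""
    P = np.asarray(P, dtype=float)
    D_Z = np.diag(P.sum(axis=1))  # q_j on the diagonal
    D_X = np.diag(P.sum(axis=0))  # p_k on the diagonal
    return sigma**2 * np.block([[D_Z, P], [P.T, D_X]])

# Hypothetical joint distribution with J = 3 and K = 2.
P = np.array([[0.20, 0.10], [0.10, 0.20], [0.10, 0.30]])
V_hat = simulate_w_covariance(P)
V = theoretical_V(P)
print(np.round(V_hat, 2))
print(np.round(V, 2))
```

The off-diagonal blocks of the simulated matrix should be close to $\sigma^2 P$, and the zero covariances within $u_n$ and within $v_n$ show up as near-zero off-diagonal entries in the diagonal blocks.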


The covariance matrix of the vector $w_n = \begin{pmatrix} u_n \\ v_n \end{pmatrix}$ is therefore

$$nV = n\sigma^2 \begin{pmatrix} D_Z & P \\ P' & D_X \end{pmatrix},$$

with $V$ finite. Because the components of $w_n$ are correlated, we need a multivariate version of the Lindeberg-Feller central limit theorem to establish the asymptotic normality of $n^{-1/2} w_n$ (see, for example, van der Vaart (1998), Section 2.8). The stability condition (finite $V$) is clear, so to establish the result we need to confirm the Lindeberg condition

$$\frac{1}{n} E\Big[\sum_{i=1}^n \|w_i\|^2 I\{\|w_i\| > \sqrt{n}\,\delta\}\Big] \to 0 \quad \text{for all } \delta > 0.$$

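Firstly, observe that

$$
\begin{aligned}
\|w_i\|^2 &= \sum_{k=1}^K \varepsilon_i^2 I(x_i^s = x_k) + \sum_{j=1}^J \varepsilon_i^2 I(z_i^s = z_j) \\
&= \varepsilon_i^2 \Big(\sum_{k=1}^K I(x_i^s = x_k) + \sum_{j=1}^J I(z_i^s = z_j)\Big) = 2\varepsilon_i^2,
\end{aligned}
$$

since $\sum_{k=1}^K I(x_i^s = x_k) = \sum_{j=1}^J I(z_i^s = z_j) = 1$. These results give

$$\|w_i\|^2 I\{\|w_i\| > \sqrt{n}\,\delta\} \le \|w_i\|^2 = 2\varepsilon_i^2,$$

with $E[2\varepsilon_i^2] = 2\sigma^2 < \infty$, and

$$
\begin{aligned}
\lim_{n \to \infty} \|w_i\|^2 I\{\|w_i\| > \sqrt{n}\,\delta\} &= \lim_{n \to \infty} 2\varepsilon_i^2 I\{|\sqrt{2}\,\varepsilon_i| > \sqrt{n}\,\delta\} \\
&= \lim_{n \to \infty} 2\varepsilon_i^2 I\{2\varepsilon_i^2 > n\delta^2\} = 0.
\end{aligned}
$$

Therefore, by the dominated convergence theorem (see, for example, Severini (2005), Theorem 1.10 (vi), p. 31), we have the Lindeberg condition:

$$\lim_{n \to \infty} E\big[\|w_i\|^2 I\{\|w_i\| > \sqrt{n}\,\delta\}\big] = 0.$$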

Thus,

$$\frac{1}{\sqrt{n}} w_n \to_d N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix},\; \sigma^2 \begin{pmatrix} D_Z & P \\ P' & D_X \end{pmatrix}\right), \tag{17}$$

as claimed. Since $z_n$ is a linear combination of $u_n$ and $v_n$, by Slutsky's Theorem, Lemma 2 follows immediately.

Since $\Sigma$ represents the covariance matrix, we have to show that it is positive definite. To do so, first observe that neither the support of $Z$, nor that of $X$, can affect the


properties of $w_n$. That is to say, such properties must be invariant to the support of $Z$ (or $X$), and hence hold for arbitrary support vectors $z$ (or $x$). Now, the key matrix in $\Sigma$ is $D_Z - P D_X^{-1} P' = D_Z - P D_X^{-1} D_X D_X^{-1} P'$. Let $a$ denote a $J$-vector of hypothetical support points of $Z$, and consider the quadratic form in the matrix $D_Z - P D_X^{-1} D_X D_X^{-1} P'$:

$$a' D_Z a - \big(a' P D_X^{-1}\big) D_X \big(D_X^{-1} P' a\big). \tag{18}$$

The first term is $E_Z[Z^2] = E_X[E_{Z|X}[Z^2 \mid X]]$, the second moment of $Z$ when its support is $a$. The term $D_X^{-1} P' a$ is the vector of conditional means $E[Z \mid X = x_k]$, $k = 1, \dots, K$, so the whole second term is $E_X[E_{Z|X}[Z \mid X]^2]$. Hence, the complete expression in (18) can be interpreted as

$$E_X\big[E_{Z|X}\big[\big(Z^2 - E_{Z|X}[Z \mid X]^2\big) \mid X\big]\big] = E_X[\mathrm{Var}(Z \mid X)] > 0,$$

i.e. the expectation of the conditional variance of $Z$ given $X$ when the support of $Z$ is $a$. Since this must hold for all $a$, it follows that the matrix $D_Z - P D_X^{-1} P'$ is positive definite as required. The only exception would be if the conditional variance of $Z$ given $X$ vanished for each value of $X$, which we rule out.
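The interpretation of the quadratic form as an expected conditional variance can be confirmed numerically. The sketch below is illustrative only: the function name, the joint probability matrix and the support vector are hypothetical, and the code evaluates both sides of the identity for one choice of inputs rather than proving it.

```python
import numpy as np

def quadratic_form_vs_conditional_variance(P, a):
    """Return a'(D_Z - P D_X^{-1} P')a and E_X[Var(Z | X)] for support vector a.

    P is the J x K matrix of joint probabilities Pr(Z = z_j, X = x_k).
    """
    P = np.asarray(P, dtype=float)
    a = np.asarray(a, dtype=float)
    q = P.sum(axis=1)  # marginal distribution of Z (diagonal of D_Z)
    p = P.sum(axis=0)  # marginal distribution of X (diagonal of D_X)
    M = np.diag(q) - P @ np.diag(1.0 / p) @ P.T
    quad = float(a @ M @ a)
    cond_mean = (P.T @ a) / p         # E[Z | X = x_k], i.e. D_X^{-1} P' a
    cond_second = (P.T @ a**2) / p    # E[Z^2 | X = x_k]
    expected_cond_var = float(np.sum(p * (cond_second - cond_mean**2)))
    return quad, expected_cond_var

# Hypothetical joint distribution (J = 3, K = 2) and support points of Z.
P = np.array([[0.20, 0.10], [0.10, 0.20], [0.10, 0.30]])
a = np.array([1.0, 2.0, 4.0])
quad, expected_cond_var = quadratic_form_vs_conditional_variance(P, a)
print(quad, expected_cond_var)
```

The two returned numbers agree up to floating-point error. A constant vector $a$ corresponds to a $Z$ with no variation at all, and in that degenerate case both quantities are zero.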

Proof of Theorem 1

To determine the asymptotic distribution of the OLS estimator $\hat{\beta}$, we need to study the asymptotic behaviour of $n^{-1/2} L_X' \varepsilon$, which could be derived by using a standard Lindeberg CLT. However, in the proof of Lemma 1, we have already derived the joint distribution of $L_Z' \varepsilon$ and $L_X' \varepsilon$. Given (17), we immediately get

$$n^{-1/2} L_X' \varepsilon \to_d N(0, \sigma^2 D_X).$$

It follows that, under exogeneity,

$$\sqrt{n}\big(\hat{\beta} - \beta\big) = \Big(\frac{L_X' L_X}{n}\Big)^{-1} \frac{L_X' \varepsilon}{\sqrt{n}} \to_d N\big(0, \sigma^2 D_X^{-1}\big).$$
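Because $L_X$ is a matrix of indicators, the OLS estimator is just the vector of cell means of $y$, and the limiting variance $\sigma^2 D_X^{-1}$ has diagonal entries $\sigma^2 / p_k$. The following sketch illustrates both facts by simulation under exogeneity; the function names, the parameter values and the normal errors are assumptions made for this example only.

```python
import numpy as np

def ols_on_indicators(x_idx, y, K):
    """OLS of y on the n x K indicator matrix L_X (row i flags the value of x_i)."""
    L_X = np.eye(K)[x_idx]
    return np.linalg.solve(L_X.T @ L_X, L_X.T @ y)

def scaled_error_variance(beta, p, sigma=1.0, n=500, reps=3000, seed=0):
    """Monte Carlo variance of sqrt(n) * (beta_hat - beta) with exogenous x."""
    rng = np.random.default_rng(seed)
    K = len(beta)
    draws = np.empty((reps, K))
    for r in range(reps):
        x_idx = rng.choice(K, size=n, p=p)
        y = beta[x_idx] + rng.normal(0.0, sigma, size=n)
        draws[r] = np.sqrt(n) * (ols_on_indicators(x_idx, y, K) - beta)
    return draws.var(axis=0)

# Hypothetical structural values and regressor distribution (K = 3).
beta = np.array([1.0, -0.5, 2.0])
p = np.array([0.2, 0.3, 0.5])
var_hat = scaled_error_variance(beta, p)
print(np.round(var_hat, 2))   # simulated variances
print(np.round(1.0 / p, 2))   # sigma^2 / p_k with sigma = 1
```

The simulated variances should be close to $\sigma^2 / p_k$, so rarely observed support points are estimated less precisely.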

Proof of Theorem 2

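To determine the asymptotic distribution of the IV estimator $\hat{\beta}_{IV}$, we need to study the asymptotic behaviour of $n^{-1/2} L_Z' \varepsilon$. Given (17), we have

$$n^{-1/2} L_Z' \varepsilon \to_d N(0, \sigma^2 D_Z).$$

Since

$$\hat{\beta}_{IV} - \beta = \big(L_X' P_{L_Z} L_X\big)^{-1} L_X' P_{L_Z} \varepsilon = \big[L_X' L_Z (L_Z' L_Z)^{-1} L_Z' L_X\big]^{-1} L_X' L_Z (L_Z' L_Z)^{-1} L_Z' \varepsilon,$$

it follows that

$$\sqrt{n}\big(\hat{\beta}_{IV} - \beta\big) = \left(\frac{L_X' L_Z}{n} \Big(\frac{L_Z' L_Z}{n}\Big)^{-1} \frac{L_Z' L_X}{n}\right)^{-1} \frac{L_X' L_Z}{n} \Big(\frac{L_Z' L_Z}{n}\Big)^{-1} \frac{L_Z' \varepsilon}{\sqrt{n}} \to_d N\Big(0, \sigma^2 \big(P' D_Z^{-1} P\big)^{-1}\Big).$$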
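The final step can be spelled out as follows. This is a worked sketch: the shorthand $G$ is introduced here for convenience and is not notation from the paper, and it assumes that $P' D_Z^{-1} P$ is invertible, which requires $P$ to have full column rank $K$.

```latex
\begin{aligned}
\frac{L_Z' L_Z}{n} &\to_p D_Z, \qquad \frac{L_Z' L_X}{n} \to_p P, \\
G &:= \big(P' D_Z^{-1} P\big)^{-1} P' D_Z^{-1}, \\
\sqrt{n}\big(\hat{\beta}_{IV} - \beta\big) &= G \, n^{-1/2} L_Z' \varepsilon + o_p(1), \\
G \big(\sigma^2 D_Z\big) G'
  &= \sigma^2 \big(P' D_Z^{-1} P\big)^{-1} P' D_Z^{-1} D_Z D_Z^{-1} P \big(P' D_Z^{-1} P\big)^{-1}
   = \sigma^2 \big(P' D_Z^{-1} P\big)^{-1}.
\end{aligned}
```

The sandwich collapses because the middle factor $P' D_Z^{-1} D_Z D_Z^{-1} P$ equals $P' D_Z^{-1} P$, which cancels one of the outer inverses.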


Proof of Theorem 3

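Note that under $H_0$ we have $E[y \mid X = L_X x] = L_X \beta$ and the standard arguments show easily that

$$n^{-1} y' M_{L_X} y \to_p \sigma^2.$$

The representation of $T_n^*$ as a quadratic form in $z_n$ follows from the fact that

$$
\begin{aligned}
W_{XZ} &= M_{L_X} L_Z [l_J, C_J][K^{-1} l_J, C_J]' (L_Z' L_Z)^{-1} L_Z' L_X C_K \\
&= [0, M_{L_X} L_Z C_J][K^{-1} l_J, C_J]' (L_Z' L_Z)^{-1} L_Z' L_X C_K \\
&= [M_{L_X} L_Z C_J][C_J' (L_Z' L_Z)^{-1} L_Z' L_X C_K].
\end{aligned}
$$

Since the matrices $C_J$ and $C_K$ are non-random, we have

$$A_n = C_J' \Big(\frac{L_Z' L_Z}{n}\Big)^{-1} \frac{L_Z' L_X}{n} C_K \to_p C_J' D_Z^{-1} P C_K = A$$

and

$$\frac{1}{n} B_n \to_p A' \Sigma A.$$

By Lemma 2,

$$\frac{1}{\sqrt{n}} A_n' z_n \to_d N(0, \sigma^2 A' \Sigma A).$$

It follows that

$$T_n^* \to_d \sigma^2 \chi^2_{K-1}$$

and therefore

$$T_n = \frac{n T_n^*}{y' M_{L_X} y} \to_d \chi^2_{K-1}.$$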

Proof of Theorem 4

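Recall that we are interested in obtaining the asymptotic distribution of

$$\frac{1}{\sqrt{n}} \begin{pmatrix} L_Z' \varepsilon \\ L_X' \varepsilon \end{pmatrix} = \frac{1}{\sqrt{n}} \begin{pmatrix} u_n \\ v_n \end{pmatrix}$$

under the alternative hypothesis. Clearly $E[u_n] = 0$, since for each $j = 1, \dots, J$, $E[u_{nj}] = E_Z\big[\sum_i I(z_i^s = z_j) E[\varepsilon_i \mid z_i^s]\big] = 0$ by Assumption 2. For each $k = 1, \dots, K$, $E[v_{nk}] = \sum_i E_X\big[I(x_i^s = x_k) E[\varepsilon_i \mid x_i^s]\big] = n^{-1/2} p_k \sum_i m(x_k, v_i)$. Since we consider local alternatives of order $n^{-1/2}$, it follows that $E[v_n] = n^{-1/2} L_X' L_X m$ and under the alternative

$$E\left[\frac{1}{\sqrt{n}} \begin{pmatrix} u_n \\ v_n \end{pmatrix}\right] \to \begin{pmatrix} 0 \\ D_X m \end{pmatrix}.$$

The covariance between $u_{nj}$ and $v_{nk}$ and the variances of $u_{nj}$ and $v_{nk}$ remain the same as under the null hypothesis. The stability condition in the Lindeberg-Feller CLT is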


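satisfied and the Lindeberg condition is exactly the same as under $H_0$. Therefore, under $H_1$,

$$\frac{1}{\sqrt{n}} \begin{pmatrix} u_n \\ v_n \end{pmatrix} \to_d N\left(\begin{pmatrix} 0 \\ D_X m \end{pmatrix},\; \sigma^2 \begin{pmatrix} D_Z & P \\ P' & D_X \end{pmatrix}\right).$$

Since we are interested in $z_n$, a linear function of $u_n$ and $v_n$, it is clear that

$$\frac{1}{\sqrt{n}} z_n \to_d N(\mu, \sigma^2 \Sigma)$$

with $\mu = -C_J' P m$ and $\Sigma = C_J' [D_Z - P D_X^{-1} P'] C_J$, and that

$$\frac{1}{\sqrt{n}} A_n' z_n \to_d N(A' \mu, \sigma^2 A' \Sigma A)$$

with $A' \mu = -C_K' P' D_Z^{-1} C_J C_J' P m$. Note that the only difference in distribution between the null and alternative hypotheses is a non-zero mean in the asymptotic distribution of $z_n$.

Therefore, the test statistic is a quadratic form in normal variables (with non-zero mean). This implies that

$$T_n = \frac{n T_n^*}{y' M_{L_X} y} \to_d \chi^2_{K-1}(\delta_L),$$

i.e. a non-central chi-square distribution with non-centrality parameter

$$\delta_L = \frac{\mu' A (A' \Sigma A)^{-1} A' \mu}{\sigma^2}.$$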
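Given the limiting mean $\mu$, the matrices $A$ and $\Sigma$, and $\sigma^2$, the non-centrality parameter $\mu' A (A' \Sigma A)^{-1} A' \mu / \sigma^2$ and the implied asymptotic local power can be approximated numerically. The sketch below uses Monte Carlo draws from NumPy's chi-square generators; the function names and the numerical inputs are hypothetical and do not come from the paper's simulation design.

```python
import numpy as np

def noncentrality(mu, A, Sigma, sigma2):
    """Evaluate mu' A (A' Sigma A)^{-1} A' mu / sigma^2."""
    Amu = A.T @ mu
    return float(Amu @ np.linalg.solve(A.T @ Sigma @ A, Amu) / sigma2)

def local_power(nonc, df, alpha=0.05, draws=400_000, seed=0):
    """Approximate P(chi2_df(nonc) > c), with c the central (1 - alpha) quantile."""
    rng = np.random.default_rng(seed)
    crit = np.quantile(rng.chisquare(df, size=draws), 1.0 - alpha)
    stat = rng.noncentral_chisquare(df, nonc, size=draws)
    return float(np.mean(stat > crit))

# Hypothetical inputs with J - 1 = 2 and K - 1 = 1.
mu = np.array([0.5, -1.0])
A = np.array([[1.0], [-0.5]])
Sigma = np.eye(2)
nonc = noncentrality(mu, A, Sigma, sigma2=1.0)
power_small = local_power(nonc, df=A.shape[1])
power_large = local_power(5.0, df=A.shape[1])
print(nonc, power_small, power_large)
```

Power is increasing in the non-centrality parameter, so local alternatives whose mean shift $\mu$ is nearly orthogonal to the columns of $A$ are the hardest to detect.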

Proof of Theorem 5

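Using $z_n$ defined in (13), the numerator of the test-statistic can be written as

$$R_n^* = \big(n^{-1/2} z_n\big)' \left(\frac{C_J' L_Z' [P_{L_X} - P_{l_n}] L_Z C_J}{n}\right)^{-1} \big(n^{-1/2} z_n\big).$$

The matrix inside the inverse converges in probability to $\Omega = C_J' \big(P D_X^{-1} P' - q_Z q_Z'\big) C_J$, where $q_Z$ is a $J$-vector with elements $q_j$. Therefore,

$$R_n^* \to_d \sigma^2 z' \Omega^{-1} z$$

with $z \sim N(0, \Sigma)$. It follows that

$$R_n = \frac{n R_n^*}{y' M_{L_X} y} \to_d z' \Omega^{-1} z \sim \sum_{j=1}^{J-1} \omega_j \chi^2_j(1),$$

where the $\omega_j$ are the eigenvalues of $\Sigma \Omega^{-1}$, i.e., those of $\Sigma^{1/2} \Omega^{-1} \Sigma^{1/2}$.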


We have to show that all weights $\omega_j$ are positive. The argument is similar to that used to prove that $\Sigma$ is positive definite, which we have already shown in the Proof of Lemma 1. Since the inverse of a positive definite matrix is itself positive definite, we only have to prove that $\Omega$ is positive definite. This will be so if

$$P D_X^{-1} P' - q_Z q_Z' = P D_X^{-1} D_X D_X^{-1} P' - q_Z q_Z',$$

where $q_Z = (q_1, \dots, q_J)'$, is itself positive definite. Let $a$ denote a $J$-vector of hypothetical support points of $Z$. Then the first term in

$$a' P D_X^{-1} D_X D_X^{-1} P' a - a' q_Z q_Z' a$$

is familiar (it also appears in the proof of positive definiteness of $\Sigma$) and equals

$$E_X\big[E_{Z|X}[Z \mid X]^2\big].$$

The second term is simply $(E_Z[Z])^2 = \big(E_X\big[E_{Z|X}[Z \mid X]\big]\big)^2$. Hence, the complete expression is

$$E_X\big[E_{Z|X}[Z \mid X]^2\big] - \big(E_X\big[E_{Z|X}[Z \mid X]\big]\big)^2 = \mathrm{Var}\big(E_{Z|X}[Z \mid X]\big) > 0,$$

the variance of the conditional expectation of $Z$ given $X$. Since this must again be true for every support vector $a$, it follows that the matrix $P D_X^{-1} P' - q_Z q_Z'$ is positive definite as required. The only case in which this term would be zero is when $E_{Z|X}[Z \mid X]$ is a constant, i.e. the expectation of $Z$ does not vary with $X$.
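Since the limiting null distribution is a weighted sum of independent chi-square (1) variables, critical values depend on the weights. The following sketch computes the weights from two positive definite matrices and approximates a critical value by simulation. It is an illustration under stated assumptions: `Sigma` stands for the limiting covariance matrix of $z$, `Omega` is a label chosen here for the limit of the matrix inverted in the numerator of the statistic, both are taken as already computed, and plain Monte Carlo is used rather than any particular approximation from the literature.

```python
import numpy as np

def mixture_weights(Sigma, Omega):
    """Eigenvalues of Sigma Omega^{-1}, computed from a symmetric form.

    With Sigma = L L' (Cholesky), L' Omega^{-1} L is symmetric and has the
    same eigenvalues as Sigma Omega^{-1}.
    """
    L = np.linalg.cholesky(Sigma)
    M = L.T @ np.linalg.solve(Omega, L)
    return np.linalg.eigvalsh((M + M.T) / 2.0)

def mixture_critical_value(weights, alpha=0.05, draws=400_000, seed=0):
    """Simulated (1 - alpha) quantile of sum_j w_j * chi2_1 (independent terms)."""
    rng = np.random.default_rng(seed)
    chi2_1 = rng.chisquare(1, size=(draws, len(weights)))
    return float(np.quantile(chi2_1 @ weights, 1.0 - alpha))

# Hypothetical matrices with J - 1 = 2.
Sigma = np.array([[2.0, 0.0], [0.0, 3.0]])
Omega = np.eye(2)
weights = mixture_weights(Sigma, Omega)
crit = mixture_critical_value(weights)
print(weights, crit)
```

When all weights equal one the mixture reduces to an ordinary chi-square distribution with $J - 1$ degrees of freedom, which gives a convenient sanity check on the simulation.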

Proof of Theorem 6

Recall that for a quadratic form $T = Y' A Y$ with $A$ being a symmetric matrix and $Y \sim N_r(\mu, \Sigma)$ the standard results deliver $T \sim \sum_{i=1}^r \lambda_i \chi^2_{(1)}(\delta_i^2)$, with $\lambda_i$ denoting the eigenvalues of $\Sigma A$ and $(\delta_1, \dots, \delta_r)' = S' L^{-1} \mu$, with $L$ coming from the decomposition $\Sigma = L L'$. $S$ is an orthogonal matrix of the eigenvectors of $L' A L$.

Since $n^{-1} C_J' L_Z' (P_{L_X} - P_{l_n}) L_Z C_J \to_p \Omega$, it follows that

$$R_n = \frac{n R_n^*}{y' M_{L_X} y} \to_d \sum_{j=1}^{J-1} \omega_j \chi^2_1(\delta_j^2)$$

with $\omega_j$ denoting the eigenvalues of $\Sigma \Omega^{-1}$ and non-centrality parameters

$$(\delta_1, \dots, \delta_{J-1})' = S' \Sigma^{-1/2} \mu = -S' \Sigma^{-1/2} C_J' P m.$$
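The representation of a quadratic form in normal variables as a weighted sum of non-central chi-square (1) variables can be checked through its mean, since $E[Y'AY] = \mathrm{tr}(A\Sigma) + \mu' A \mu$ while a non-central chi-square (1) variable with parameter $\delta_i^2$ has mean $1 + \delta_i^2$. The sketch below computes the weights and non-centrality parameters for hypothetical inputs and compares the two expressions; the function name and the numbers are illustrative assumptions.

```python
import numpy as np

def mixture_representation(A, mu, Sigma):
    """Weights lambda_i and parameters delta_i for T = Y'AY, Y ~ N(mu, Sigma)."""
    L = np.linalg.cholesky(Sigma)           # Sigma = L L'
    lam, S = np.linalg.eigh(L.T @ A @ L)    # L' A L = S diag(lam) S'
    delta = S.T @ np.linalg.solve(L, mu)    # (delta_1, ..., delta_r)' = S' L^{-1} mu
    return lam, delta

# Hypothetical symmetric A, mean vector and covariance matrix (r = 2).
A = np.array([[2.0, 0.5], [0.5, 1.0]])
mu = np.array([1.0, -2.0])
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])

lam, delta = mixture_representation(A, mu, Sigma)
mean_from_mixture = float(np.sum(lam * (1.0 + delta**2)))
mean_direct = float(np.trace(A @ Sigma) + mu @ A @ mu)
print(mean_from_mixture, mean_direct)
```

The weights returned by the function coincide with the eigenvalues of $\Sigma A$, because $L' A L$ and $L L' A$ are similar matrices.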


References

[1] Anscombe, F.J. (1952) "Large-sample theory of sequential estimation" Proceedings of the Cambridge Philosophical Society, 48, 600-607

[2] Bierens, H.J. (1990) "A consistent conditional moment test of functional form" Econometrica, 58, 1443-1458

[3] Blundell, R.W., Horowitz, J.L. (2007) "A nonparametric test of exogeneity" Review of Economic Studies, 74, 1035-1058

[4] Blundell, R.W., Horowitz, J.L., Parey, M. (2012) "Measuring the price responsiveness of gasoline demand: economic shape restrictions and nonparametric demand estimation" Quantitative Economics, 3, 29-51

[5] Buckley, M.J., Eagleson, G.K. (1988) "An approximation to the distribution of quadratic forms in normal random variables" Australian Journal of Statistics, 30A, 150-159

[6] Chesher, A. (2004) "Identification in additive error models with discrete endogenous variables" CeMMap working paper, CWP11/04

[7] Das, M. (2005) "Instrumental variables estimators of nonparametric models with discrete endogenous regressors" Journal of Econometrics, 124, 335-361

[8] Deaton, A. (2010) "Instruments, randomization and learning about development" Journal of Economic Literature, 48, 424-455

[9] Fan, Y., Li, Q. (1996) "Consistent model specification tests: omitted variables, parametric and semiparametric functional forms" Econometrica, 64, 865-890

[10] Florens, J.P., Malavolti, L. (2003) "Instrumental regression with discrete endogenous variables" Working paper, GREMAQ, Universite des Sciences Sociales, Toulouse

[11] Greene, W.H. (1993) "Econometric Analysis", Second Edition, Macmillan Publishing Company, New York

[12] Hall, P. (1983) "Chi squared approximations to the distribution of a sum of independent random variables" The Annals of Probability, 11, 1028-1036

[13] Hall, P., Huang, L.S. (2001) "Nonparametric kernel regression subject to monotonicity constraints" Annals of Statistics, 29, 624-647

[14] Hausman, J.A. (1978) "Specification tests in econometrics" Econometrica, 46, 1251-1272


[15] Hu, Y., Lewbel, A. (2008) "Identifying the returns to lying when the truth is unobserved" CeMMap working paper, CWP06/08

[16] Imhof, J.P. (1961) "Computing the distribution of quadratic forms in normal variables" Biometrika, 48, 419-426

[17] Iori, G., Kapar, B., Olmo, J. (2014) "Bank characteristics and the interbank money market: a distributional approach" accepted in Studies in Nonlinear Dynamics and Econometrics

[18] Johnson, N.L., Kotz, S., Balakrishnan, N. (1994) "Continuous univariate distributions. Volume 1", Second Edition, John Wiley & Sons Inc., New Jersey

[19] Lavergne, P., Patilea, V. (2008) "Breaking the curse of dimensionality in nonparametric testing" Journal of Econometrics, 143, 103-122

[20] Lavergne, P., Vuong, Q. (2000) "Nonparametric significance testing" Econometric Theory, 16, 576-601

[21] Matzkin, R.L. (2007) "Nonparametric identification" in Handbook of Econometrics, Volume 6, Part B, 5307-5368

[22] Muirhead, R.J. (1982) "Aspects of multivariate statistical theory" John Wiley & Sons Inc., Hoboken, New Jersey

[23] Newey, W.K. (1985) "Maximum likelihood specification testing and conditional moment tests" Econometrica, 53, 1047-1070

[24] Newey, W.K., Powell, J.L. (2003) "Instrumental variable estimation of nonparametric models" Econometrica, 71, 1565-1578

[25] Robbins, H. (1948) "The asymptotic distribution of the sum of a random number of random variables" Bulletin of the American Mathematical Society, 54, 1151-1161

[26] Severini, T.A. (2005) "Elements of distribution theory" Cambridge University Press

[27] Sheil, J., Muircheartaigh, I. (1977) "Algorithm AS106: The distribution of non-negative quadratic forms in normal variables" Journal of the Royal Statistical Society. Series C (Applied Statistics), 26, 92-98

[28] Tamer, E. (2010) "Partial identification in econometrics" Annual Review of Economics, 2, 167-195

[29] van der Vaart, A.W. (1998) "Asymptotic statistics" Cambridge University Press


[30] Zhang, J.T. (2005) "Approximate and asymptotic distributions of chi-squared-type mixtures with applications" Journal of the American Statistical Association, Vol. 100, No. 469, 273-285

[31] Zheng, J.X. (1996) "A consistent test of functional form via nonparametric estimation techniques" Journal of Econometrics, 75, 263-289
