
Some Practical Problems of Recent Nonparametric Procedures: Testing, Estimation and Application. Jorge Barrientos-Marín

Doctoral thesis of the Universidad de Alicante (Universitat d'Alacant), 2007.


Some Practical Problems of Recent Nonparametric Procedures: Testing, Estimation, and Application.

Jorge Barrientos-Marín

Advisor: Stefan Sperlich

Quantitative Economics Doctorate, Departamento de Fundamentos del Análisis Económico

Universidad de Alicante

January 2007




To my wife and my family.




Acknowledgements

The articles that make up this thesis are the result of five years of continuous work. Without doubt, this would not have been possible without the collaboration of many people, and I want to express my gratitude to all of them. If anyone goes unmentioned, it is most likely that my memory, as usual, is playing tricks on me. I therefore wish to express my appreciation to the members of the Departamento de Fundamentos del Análisis Económico, to my professors and especially to my fellow students, who made these five years far from home bearable. Among the professors, special recognition goes to Antonio Villar, who always trusted me and was a counsellor in difficult moments; to Juan Mora, for his encouragement and for pleasant hours spent discussing results and theorems; and to Javier Alvarez and Lola, whose excellent courses encouraged me to follow the path of econometrics.

Among my fellow students, I thank Alicia, who has always been a friend. I thank Ricardo for his help and countless favours (many of them pecuniary), and Paco, Silvio and Szaby for their easy company and sincere friendship. To Rafael López: his good humour and intelligence were a challenge for me. I thank José Maria for his great regard for me, which is mutual, and for his generosity; these years would have been less fun, and some Christmases sad, without his friendship.

I cannot fail to mention the administrative staff (Mercedes, Mariló, Julio, Carlos and Lourdes), who were always ready to help me and were patient with my countless requests.

I also thank Frédéric Ferraty and Philippe Vieu for their dedication; they provided me with the best atmosphere in which to write one of the chapters of this thesis. Juan and Mónica deserve mention here: they welcomed me into their home and were always good company, besides introducing me, in no superficial way, to aspects of French life.

I thank my family, especially Patricia, my wife, who has supported me through all these years of semi-solitude while waiting for this to end, always with patience and optimism. I thank my mother, who I know was always saddened by my absence; my aunts, for whom I am a source of pride; and José and Leticia Restrepo, for helping Patricia to bear the burden of solitude.

I thank my friends in Colombia, Mauricio Alviar and Pedro, who believed from the beginning that this could be achieved. I also mention Alejandro Gaviria, who continues to teach me to think like an economist and who gave me sound advice at just the right moment.

Finally, special recognition goes to Stefan Sperlich, who has taught me a great deal about semi- and nonparametric econometrics. I will always be grateful to him, because he took care that this ended well and has been an exceptional supervisor even from a distance.




Contents

Acknowledgements
Introduction and Summary
Introduction and Summary in Spanish

1 The Size Problem of Kernel Based Bootstrap Tests when the Null is Nonparametric
1.1 Introduction
1.2 Statistical Methods: Estimators and Test Statistics
1.2.1 Estimators
1.2.2 Test Statistics
1.3 Resampling and Choice of Parameters
1.3.1 Bootstrap Tests
1.3.2 The Choice of Bandwidths h
1.3.3 The Choice of Bandwidths k
1.3.4 The Choice of Bootstrap Residuals
1.3.5 An Alternative: Subsampling
1.3.6 The Choice of Bootstrap Bandwidth $h_b$
1.4 Simulation Results
1.5 Conclusions
References

2 Estimating and Testing An Additive Partially Linear Model in a System of Engel Curves
2.1 Introduction
2.2 Additive Partially Linear Model and Testing Hypothesis
2.3 The Shape of Engel Curves and Specification Testing
2.3.1 Data Used in this Application
2.3.2 Some Pictures of the Expenditure-Log Total Expenditure Relationship
2.3.3 Specification Testing
2.4 Conclusions and Future Research
References

3 Locally Modelled Regression and Functional Data
3.1 Introduction
3.2 Position of the Problem
3.3 Functional locally modeled regression
3.3.1 The p-dimensional case
3.3.2 The infinite-dimensional case: the functional setting
3.4 FFLM kernel-type estimator: asymptotic behavior
3.5 FFLM regression in action
3.6 Conclusions
References



Introduction and Summary

This thesis is composed of three chapters, in which we focus on three related but different issues regarding testing, estimation and theoretical developments¹. More precisely, in Chapter 1, "The Size Problem of Kernel Based Bootstrap Tests when the Null is Nonparametric", we are interested in choosing an appropriate smoothing parameter, a problem that is fundamental for the reasonable use of non- and semiparametric methods. In particular, we note that for testing this problem is not equivalent to the one in regression. At least from a theoretical point of view, the optimal smoothing parameter for testing has different rates from those which are optimal for estimation.

While there exists an increasing literature on how to find a proper smoothing parameter for the nonparametric alternative, almost nothing is known about how to choose a smoothing parameter in practice for the null hypothesis if it is also semi- or nonparametric. We do know that, at least asymptotically, oversmoothing is necessary in the pre-estimation of the null model for generating the bootstrap samples; see Härdle and Marron (1990, 1991). However, in practice this knowledge is of little help. The same can be said about various parameters and procedures to be chosen in practice when performing such tests. In this chapter we discuss all these choice questions. In particular, we study the problem of bandwidth choice for the pre-estimation used to generate bootstrap samples. As an alternative, we also briefly discuss the possibility of subsampling².

In Chapter 2, "Estimating and Testing An Additive Partially Linear Model in a System of Engel Curves", we focus on an application of the additive partially linear model and on some ideas drawn from Chapter 1. Our main goal is an application to consumer theory, more exactly to systems of Engel curves. The form of the Engel curve has long been a subject of discussion in applied econometrics, and until now there has been no definitive conclusion about its form. In this chapter an additive partially linear model (PLM) is used to estimate semiparametrically the effect of total expenditure in this context. Additionally, we consider the nonparametric inclusion of some regressors which traditionally have a nonlinear effect, such as age and schooling. To that end we compare an additive partially linear model with the fully nonparametric one using recent popular test statistics. Although inference in nonparametric regression can take place in a number of ways, the most natural is to use nonparametric regression as an alternative against a fully parametric or semiparametric null hypothesis. We therefore check whether an additive PLM provides a reasonable fit to our data, using different resampling schemes (bootstrap and subsampling) to obtain critical values (p-values) for the proposed test statistics.

¹ Chapter 1 is joint work with Stefan Sperlich, and Chapter 3 is joint work with Frédéric Ferraty and Philippe Vieu.

² The authors gratefully acknowledge financial support from the Spanish DG de Investigación del Ministerio de Ciencia y Tecnología, SEJ2004-04583/ECON.

Additionally, in this chapter we deal with a well-known problem that is very common in the context of Engel curves: total expenditure may well be jointly determined with expenditure on different goods, so an endogeneity problem may arise. To address this problem we apply nonparametrically constructed regressors as instrumental variables. In particular, we use the nonparametric two-step estimator with generated regressors and constructed variables (NP2SCV) due to Sperlich (2005). Our feeling is that a generated variables approach in combination with an additive PLM can help us to overcome, to some extent, any possible endogeneity problem, and that is exactly the procedure implemented in this chapter.

In Chapter 3, "Locally Modelled Regression and Functional Data"³, we are interested in extending nonparametric methods to the case where the regressors are functions (i.e. one observation could be a curve, a surface or any other object lying in an infinite-dimensional space). From a statistical point of view, this corresponds to a functional regression setting because one wishes to predict a response Y from an explanatory functional variable X. In addition, only regularity conditions on the regression operator are assumed, which leads us to the nonparametric context. The subject of this work is thus nonparametric functional regression. Several recent works deal with nonparametric functional regression (see for instance Ferraty and Vieu (2002, 2005)). This nonparametric functional regression method is essentially based on an extension of the well-known Nadaraya (1964)-Watson (1964) kernel regression estimator to the case of a functional explanatory variable. On the other hand, local linear ideas have been developed in the regression context for univariate and multivariate explanatory variables; see Wand and Jones (1995) for an overview of this topic. Our work can therefore be considered an extension that combines the nonparametric local constant method with the ideas of functional variables. However, the functional setting makes both the asymptotic study and the implementation of a natural generalization of the multivariate local linear method difficult. Therefore, we focus on a simpler and faster local approach. Asymptotic properties are stated, and a functional dataset illustrates the good behavior of this fast functional locally modelled regression method.

³ Acknowledgement. The authors gratefully thank the members of the working group STAPH (http://www.lsp.ups-tlse.fr/staph) for their helpful comments and discussions. In addition, the first author acknowledges financial support from the Spanish Ministry of Education and Science, under project BEC2001-0535.




Introduction and Summary (translated from the Spanish)

This thesis is composed of three chapters, which focus on three different but related problems, ranging from estimation and hypothesis testing to theoretical developments. More precisely, in Chapter 1, "The Size Problem of Kernel Based Bootstrap Tests when the Null is Nonparametric", we are interested in the appropriate selection of a smoothing parameter, a problem that is fundamental for a reasonable use of semi- and nonparametric methods. In particular, for hypothesis testing we note that this problem is not equivalent to the one that arises in regression analysis, that is, in simple estimation. At least from a theoretical point of view, the parameter chosen for hypothesis testing has (convergence) rates different from those that the parameters optimal for estimation are supposed to have.

While there is a growing literature on how to find an appropriate parameter for the alternative hypothesis, almost nothing is known about how to choose a smoothing parameter in practice for the null hypothesis if it is also semiparametric or even nonparametric. We only know that, asymptotically, an oversmoothed parameter is necessary in the pre-estimation of the model under the null in order to generate the bootstrap samples; see Härdle and Marron (1990, 1991). In practice, however, this knowledge is of little help. The same can be said about various parameters and procedures to be chosen in practice when carrying out a testing procedure. In this chapter we therefore discuss these selection questions. In particular, we study the problem of selecting the smoothing parameter in the pre-estimation used to generate the bootstrap samples. As an alternative, we also briefly discuss the possibility of subsampling.

In Chapter 2, "Estimating and Testing An Additive Partially Linear Model in a System of Engel Curves", we focus on the application of additive partially linear models based on some ideas from Chapter 1. Our goal is an application in consumer theory, specifically to systems of Engel curves. The form of the Engel curve has long been an object of research in applied econometrics, and so far there are no definitive conclusions about its form. In this chapter an additive partially linear model is used to estimate semiparametrically the effect of total expenditure. Additionally, we consider the nonparametric inclusion of some regressors that traditionally have a nonlinear effect, such as age and schooling. To carry out this work, we compare an additive partially linear model with a fully nonparametric model using some recently developed test statistics. Since inference in nonparametric regression can be carried out in several ways, the most natural is to use nonparametric regression as the alternative hypothesis against a semiparametric null hypothesis. We then check whether a PLM provides a reasonable fit to the data, using different resampling methods to obtain critical values, computed by bootstrap and subsampling, for the test statistics mentioned.

In this chapter we also address a problem that is common in the context of Engel curves: total expenditure is jointly determined with expenditure on the different goods, so there is potential endogeneity. To solve this problem we use constructed regressors as instrumental variables, in addition to variables from other databases. In particular, we use the method developed by Sperlich (2005), called nonparametrically generated or constructed regressors in two steps (NP2SCV). Our feeling is that NP2SCV in combination with additive partially linear models does help to eliminate endogeneity in the estimation of Engel curves.

In Chapter 3, "Locally Modelled Regression and Functional Data", we are interested in extending nonparametric methods to the case where the regressors are functions (i.e. one observation could be a curve, a surface or any other object belonging to an infinite-dimensional space). From a statistical point of view this corresponds to functional regression, because we wish to predict a response variable Y from a functional explanatory variable X. Moreover, only regularity conditions on the regression operator are assumed, which leads to a nonparametric context. The subject of this work is thus nonparametric functional regression. Recently, several articles have dealt with nonparametric functional regression (see for example Ferraty and Vieu (2002, 2005)). These essentially consist of extending the Nadaraya (1964)-Watson (1964) kernel estimator to the case of a functional explanatory variable. On the other hand, local regression ideas have been developed in the context of univariate and multivariate regression; see Wand and Jones (1995). Our method is therefore an extension that combines local regression methods with current ideas on functional variables. The task is not easy as regards the asymptotic study and the implementation of a more than natural generalization of the multivariate local linear method. We therefore focus on a simpler and faster local approximation. The asymptotic properties are established, and functional data illustrate the good behaviour of this fast local regression method.




REFERENCES

Ferraty, F. and P. Vieu (2004). Nonparametric Models for Functional Data, with Applications in Regression, Time Series Prediction and Curve Discrimination. Nonparametric Statistics, 16(1-2), 111-125.

Ferraty, F. and P. Vieu (2006). Nonparametric Modelling for Functional Data Analysis. Theory and Practice. Springer, New York (in print).

Härdle, W. and J. S. Marron (1990). Semiparametric Comparison of Regression Curves. Annals of Statistics, 18, 63-89.

Härdle, W. and J. S. Marron (1991). Bootstrap Simultaneous Error Bars for Nonparametric Regression. Annals of Statistics, 19, 778-796.

Sperlich, S. (2005). A Note on Nonparametric Estimation with Constructed Variables and Generated Regressors. Working Paper, Universidad Carlos III.

Wand, M. P. and M. C. Jones (1995). Kernel Smoothing. Monographs on Statistics and Applied Probability, 60. Chapman & Hall.

Watson, G. S. (1964). Smooth Regression Analysis. Sankhya Ser. A, 26.




Chapter 1

The Size Problem of Kernel Based Bootstrap Tests when the Null is Nonparametric

1.1 Introduction

In both applied and mathematical statistics, non- and semiparametric specification testing is still quite a popular research field. Any internet search engine can find several hundred papers dealing with this topic even when looking at the last five years only. It is therefore surprising that so few of them study the problem of choosing an appropriate smoothing parameter, a problem that is fundamental for the reasonable use of these methods. Unfortunately, for testing this problem is not equivalent to the one in regression. It is well known that, at least from a theoretical point of view, the optimal smoothing parameter for testing has different rates from those which are optimal for estimation.

In the last couple of years there has been a growing amount of literature on adaptive testing. In most cases, the adaptiveness refers to the smoothness of the alternative and deals with the choice of the smoothing parameter for the alternative, or the test statistic; see e.g. Ledwina (1994), Spokoiny (1996, 1998), Kallenberg & Ledwina (1995), Härdle et al. (2001), Horowitz & Spokoiny (2001), Guerre & Lavergne (2005). Even though these methods have so far had little direct impact, in the sense that we could not find published papers using them (in practice or in theory), they have been useful in reaching a better understanding of the problem. However, to our knowledge, all these papers concentrate on testing problems where the null hypothesis is fully parametric. It is not clear to what extent these methods help if the null hypothesis is semi- or nonparametric. This is not such a rare situation, since additivity tests already belong to this family. When the bootstrap is used to determine the critical value, these tests entail at least one more parameter choice problem: pre-estimating the model under the null hypothesis in order to generate the bootstrap samples later. This is necessary because in most cases the bandwidths for the estimation and the bootstrap should have different rates; see e.g. Härdle & Marron (1990, 1991). Although these authors already mentioned the problem of choosing an appropriate bandwidth, in practical applications this problem has hardly been addressed. As a consequence, in most published procedures for testing or constructing confidence bands with a semi- or nonparametric null hypothesis, there is no guarantee that the test holds the level, or that the bands attain the nominal coverage probability. This has recently been confirmed in the work of Dette et al. (2005) and Rodríguez-Póo et al. (2004). However, in the former it is not referred to as a bandwidth problem but rather as a problem of correlated designs and dimensionality, because the size distortion is much smaller for uncorrelated designs. In the latter paper the problem is avoided by using subsampling instead of the bootstrap. It should also be mentioned that in that simulation study the authors basically face a parametric bootstrap, drawing the bootstrap errors from a distribution known up to a certain parameter. Although that unknown parameter depends on nonparametric nuisance parameters, knowledge of the distribution greatly mitigates the impact of the bandwidth on the critical value.

To study the problem outlined in more detail we concentrate on the problem of testing additivity. We limit ourselves to test statistics proposed in Dette et al. (2005) and Rodríguez-Póo et al. (2004), but we try different modifications, methods of bandwidth choice, and subsampling. The aim is not to find the most efficient additivity test or to propose new ones. Our focus is only on finding a method that guarantees that the level is held, with non-trivial power, when the null hypothesis and the resampling method are non- or semiparametric. So, after a review of the additivity tests considered here, we study different procedures for bandwidth choice. Unfortunately, we have not found a generally valid method. Our conclusion is basically that further research is necessary.

The rest of the paper is organized as follows. In the next section we review the estimation and testing procedures considered in this work. In Section 1.3 we discuss the different scenarios from which the practitioner has to make a choice, including modifications of test statistics and resampling methods. Section 1.4 summarizes the main findings from our simulation results, and Section 1.5 concludes.

1.2 Statistical Methods: Estimators and Test Statistics

1.2.1 Estimators

We consider the following model:

\[ Y_i = m(X_i) + u_i, \qquad i = 1, 2, \ldots, n, \tag{1.1} \]

with $\{(X_i, Y_i)\}_{i=1}^{n} \in \mathbb{R}^d \times \mathbb{R}$ i.i.d., $m: \mathbb{R}^d \to \mathbb{R}$ the unknown function of interest, $m(x) = E(Y \mid X = x)$, and $u_i$ i.i.d. random errors with $E[u_i] = 0$ and finite variance. The internalized Nadaraya-Watson estimator is defined as

\[ \hat m_k(x) = \sum_{i=1}^{n} v_k(x, X_i)\, Y_i, \quad \text{with} \quad v_k(x, X_i) = \frac{1}{n} \big(\hat f_k(X_i)\big)^{-1} K_k(x - X_i), \tag{1.2} \]

where $\hat f_k(X_j) = \frac{1}{n} \sum_{r=1}^{n} K_k(X_j - X_r)$ is a kernel density estimator (unlike standard Nadaraya-Watson, here $\hat f_k(X_i)$ appears internally to the summation, see Jones et al. (1994)), and $K_k(u) = \prod_{\alpha=1}^{d} K_k(u_\alpha)$ is a product kernel with $K_k(u_\alpha) = k^{-1} K(u_\alpha k^{-1})$. Commonly, the kernel is assumed to be Lipschitz continuous with compact support and $\int |K(x)|\,dx < \infty$, $\int K(x)\,dx = 1$. Furthermore, k is the bandwidth, assumed to go to zero for sample size n going to infinity, but with $nk^d$ going to infinity. Let $V_k$ be the $n \times n$ matrix whose $(j,i)$ element is $v_k(X_j, X_i)$; then $\hat m_k = V_k Y$, where $\hat m_k$ and $Y$ are $n \times 1$ vectors whose jth entries are $\hat m_k(X_j)$ and $Y_j$, respectively.
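As an illustration of (1.2), the following is a minimal sketch of the internalized Nadaraya-Watson estimator in plain Python. The Epanechnikov kernel and all function names are assumptions made for this example only; the text does not prescribe a particular kernel, only that it be Lipschitz continuous with compact support.

```python
def epanechnikov(u):
    """Compactly supported, Lipschitz continuous kernel integrating to one (assumed choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def product_kernel(u, k):
    """K_k(u) = prod_a k^{-1} K(u_a / k) for a d-dimensional difference vector u."""
    out = 1.0
    for ua in u:
        out *= epanechnikov(ua / k) / k
    return out

def diff(x, z):
    """Componentwise difference x - z."""
    return [a - b for a, b in zip(x, z)]

def density_at_sample(X, k):
    """f_k(X_j) = n^{-1} sum_r K_k(X_j - X_r), evaluated at every sample point."""
    n = len(X)
    return [sum(product_kernel(diff(xj, xr), k) for xr in X) / n for xj in X]

def internalized_nw(x, X, Y, k, f_hat=None):
    """m_k(x) = sum_i v_k(x, X_i) Y_i with v_k(x, X_i) = K_k(x - X_i) / (n f_k(X_i))."""
    n = len(X)
    if f_hat is None:
        f_hat = density_at_sample(X, k)
    # f_k(X_i) is strictly positive because the term r = i contributes K_k(0) / n.
    return sum(product_kernel(diff(x, xi), k) / (n * fi) * yi
               for xi, fi, yi in zip(X, f_hat, Y))
```

Because the density estimate sits inside the sum, the weights depend on x only through the kernel term, which is what makes the matrix form $\hat m_k = V_k Y$ cheap to evaluate at many points.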


We are interested in the additive model, which we write in terms of

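\[ E(Y \mid X = x) = m_S(x) = \psi + \sum_{\alpha=1}^{d} m_\alpha(x_\alpha), \tag{1.3} \]

where we set $E_{X_\alpha}\{m_\alpha(X_\alpha)\} = \int m_\alpha(x) f_\alpha(x)\,dx = 0 \ \forall \alpha$ for identification. Here, $m_\alpha$, $\alpha = 1, \ldots, d$, are the marginal impact functions for each regressor. Therefore, $\psi$ is a constant equal to the unconditional expectation of Y. Writing $m(X) = m_\alpha(X_\alpha) + m_{-\alpha}(X_{-\alpha})$, where $X_{-\alpha}$ is the vector X of all explanatory variables without $X_\alpha$, i.e. $X_{i,-\alpha} = (X_{i1}, \ldots, X_{i(\alpha-1)}, X_{i(\alpha+1)}, \ldots, X_{id})$, we can use the identification condition directly to estimate $m_\alpha$. The so called marginal integration idea is based on the fact that for fixed $x_\alpha$ we have

\[ E_{X_{-\alpha}}\big[m(x_\alpha, X_{-\alpha})\big] = \int m(x_\alpha, x_{-\alpha})\, f_{-\alpha}(x_{-\alpha})\,dx_{-\alpha} = \psi + m_\alpha(x_\alpha). \]

Substituting for $m(\cdot)$ a nonparametric pre-estimator such as the one given in (1.2), a sample average for the expectation, and for $\psi$ simply $\hat\psi = \frac{1}{n} \sum_{i=1}^{n} Y_i$ gives (neglecting the constant for a moment for the sake of simplicity):

\[ \hat m_\alpha(x_\alpha) = \sum_{i=1}^{n} w_{\alpha h}(x_\alpha, X_i)\, Y_i, \]

where

\[ w_{\alpha h}(x_\alpha, X_i) = \frac{1}{n}\, K_h(x_\alpha - X_{i\alpha})\, \frac{\hat f_{-\alpha}(X_{i,-\alpha})}{\hat f(X_i)}. \tag{1.4} \]

Finally, we set $\hat m_S(X_j) = \hat\psi + \sum_{\alpha=1}^{d} \hat m_\alpha(X_{j\alpha})$ for each $j = 1, 2, \ldots, n$. Note that defining $W_h := \sum_{\alpha=1}^{d} W_{\alpha h}$, with $W_{\alpha h}$ being the $n \times n$ matrices with $w_{\alpha h}(X_{j\alpha}, X_i)$ as elements, one has $\hat m_S = \hat\psi + W_h Y$. For more details see Dette et al. (2005).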

Some of the test statistics we will consider here are also introduced and discussed

there.
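The marginal integration fit under the additive null can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the Epanechnikov kernel, the use of one common bandwidth h for all density estimates, and the empirical centring of each component (one way to impose the identification condition and to remove the constant neglected in the derivation) are choices made for this example, as are all function names.

```python
def kern(u, h):
    """Scaled Epanechnikov kernel K_h(u) = K(u / h) / h (assumed kernel choice)."""
    t = u / h
    return 0.75 * (1.0 - t * t) / h if abs(t) <= 1.0 else 0.0

def kde_at_sample(X, h):
    """Product-kernel density estimate at every sample point (equals 1.0 if X has no columns)."""
    n = len(X)
    dens = []
    for xj in X:
        total = 0.0
        for xr in X:
            prod = 1.0
            for a, b in zip(xj, xr):
                prod *= kern(a - b, h)
            total += prod
        dens.append(total / n)
    return dens

def additive_fit(X, Y, h):
    """Return (psi_hat, components, fitted) for the additive model fitted at the sample points."""
    n, d = len(X), len(X[0])
    psi_hat = sum(Y) / n
    f_joint = kde_at_sample(X, h)
    components = []
    for alpha in range(d):
        # Density of all regressors except the alpha-th one.
        X_minus = [[v for c, v in enumerate(row) if c != alpha] for row in X]
        f_minus = kde_at_sample(X_minus, h)
        # Weighted sum with the weights w_{alpha h}(X_{j alpha}, X_i) of (1.4).
        raw = [sum(kern(X[j][alpha] - X[i][alpha], h) * f_minus[i] / (n * f_joint[i]) * Y[i]
                   for i in range(n))
               for j in range(n)]
        mean_raw = sum(raw) / n
        components.append([r - mean_raw for r in raw])  # centring: assumption of this sketch
    fitted = [psi_hat + sum(comp[j] for comp in components) for j in range(n)]
    return psi_hat, components, fitted
```

With centred components the fitted values average to $\hat\psi$, mirroring the identification condition $E\{m_\alpha(X_\alpha)\} = 0$.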

1.2.2 Test Statistics

As mentioned above, we do not introduce new testing procedures but rather study two statistics which have already been studied in Dette et al. (2005), together with other additivity tests, and which turned out to perform best. We add a new test statistic motivated by one that was introduced recently by Rodríguez-Póo et al. (2005), and which performed excellently in the study by Roca-Pardiñas & Sperlich (2006). For more details on the test statistics readers are referred to these papers.

The null hypothesis of interest is $H_0: m(\cdot) = m_S(\cdot)$ versus $H_1: m(\cdot) \neq m_S(\cdot)$.

We consider the following two test statistics from Dette et al. (2005):

\[ \tau_1 = \frac{1}{n} \sum_{i=1}^{n} \big(\hat m(X_i) - \hat m_S(X_i)\big)^2\, w(X_i), \]

\[ \tau_2 = \frac{1}{n} \sum_{i=1}^{n} \hat e_i \big(\hat m(X_i) - \hat m_S(X_i)\big)\, w(X_i), \]

where $w(\cdot)$ is a weight function, $\hat e_i = Y_i - \hat m_S(X_i)$ are the residuals under the null hypothesis, and $\hat u_i = Y_i - \hat m(X_i)$ are the residuals without restrictions. Obviously, $\tau_1$ calculates directly the integrated squared difference between the null and alternative models. Alternatively, $\tau_2$ seeks to mitigate the bias problem inherited from the estimate $\hat m$, which suffers from the curse of dimensionality. In Dette et al. (2005) it is proved that, for $j = 1, 2$, $nk^{d/2}(\tau_j - \mu_j)$ converges under the null to a normal variable with mean zero and variance $v_j^2$, with

\[ \mu_1 = E_{H_0}\{\tau_1\} = \frac{1}{nk^d} \int \sigma^2(x) w(x)\,dx \int K^2(x)\,dx + \ldots \]

\[ \mu_2 = E_{H_0}\{\tau_2\} = \ldots \]

[…]

where, for ease of presentation and implementation, $K(\cdot)$ is the same kernel function as in the last subsection, and k again its bandwidth. It is straightforward to derive from the above mentioned paper that $nk^{d/2}(\tau_3 - \mu_3)$ converges under the null to a normal variable with mean zero and variance $v_3^2$ for

\[ \mu_3 = E_{H_0}\{\tau_3\} = \ldots \int (K * K)(x)\,dx \int \sigma^2(x) f^2(x) w(x)\,dx. \]

All tests have been proven to be consistent in the sense that under the alternative

they converge with n to infinity.

Finally let us mention that we have also studied other test statistics, e.g. those

given in Dette et al (2005) but not presented here. These, however, showed even

less satisfactory performance, so we have skipped them in our presentation.
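Given fitted values under the null and under the alternative, the statistics $\tau_1$ and $\tau_2$ are simple weighted averages. The sketch below assumes the fits have already been evaluated at the sample points and that the weights $w(X_i)$ are supplied as a list; the function names are illustrative only.

```python
def tau1(m_hat, ms_hat, w):
    """tau_1 = n^{-1} sum_i (m_hat_i - ms_hat_i)^2 w_i."""
    n = len(m_hat)
    return sum((m - s) ** 2 * wi for m, s, wi in zip(m_hat, ms_hat, w)) / n

def tau2(Y, m_hat, ms_hat, w):
    """tau_2 = n^{-1} sum_i e_hat_i (m_hat_i - ms_hat_i) w_i with e_hat_i = Y_i - ms_hat_i."""
    n = len(Y)
    return sum((y - s) * (m - s) * wi
               for y, m, s, wi in zip(Y, m_hat, ms_hat, w)) / n
```

A typical use would set the weights to one inside a region where the density estimate is reliable and to zero elsewhere, although the choice of $w$ is left open here.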

1.3 Resampling and Choice of Parameters

As is well known, asymptotic expressions are of little help in practice for calculating the exact critical value, for several reasons: bias and variance contain unknown expressions which have to be estimated nonparametrically, and the convergence rate is quite slow for large d. For this reason it is common to use resampling methods to approximate the critical value for the particular sample statistic. These can be bootstrap methods or subsampling procedures. Unfortunately, for the bootstrap, unlike for subsampling, it is not known how to choose in practice the smoothing parameter for the pre-estimation of the model that is used to generate the bootstrap samples. From theory it is known that one should somewhat oversmooth (see for instance Härdle and Marron (1991) and the discussion below). For the choice of k (when estimating the alternative), some procedures are provided in the literature (see our brief discussion of adaptive tests in the introduction). We will come back to this point later in this section.





1.3.1 Bootstrap Tests

We give the general procedure first and then discuss some details:

1. With bandwidth h, calculate the estimate $\hat m_S$ under the null hypothesis of additivity and its resulting residuals $\hat e_i$, $i = 1, \ldots, n$.

2. With bandwidth k, calculate the estimator $\hat m$ for the conditional expectation without the additivity restriction, and the corresponding residuals $\hat u_i$, $i = 1, \ldots, n$.

3. With the results from steps 1 and 2 we can calculate our test statistics $\tau_1$, $\tau_2$, and $\tau_3$.

4. Repeat step 1 but now with a bandwidth $h_b$ which depends on h from step 1. We call the outcome $\hat m_S^b$, respectively $\tilde e_i = Y_i - \hat m_S^b(X_i)$, $i = 1, \ldots, n$. Draw random variables $e_i^*$ with $E[(e_i^*)^j] = \hat u_i^j$ (respectively $\hat e_i^j$ or $\tilde e_i^j$, see discussion below) for $j = 1, 2, 3$ (respectively $j = 1, 2$, see below again). Set $Y_i^* = \hat m_S^b(X_i) + e_i^*$, $i = 1, \ldots, n$. Repeat this B times. This defines B different bootstrap samples $\{(X_i, Y_i^{*,b})\}_{i=1}^{n}$, $b = 1, \ldots, B$.

5. For each bootstrap sample from step 4 calculate the test statistics $\tau_j^{*,b}$, $j = 1, 2, 3$, $b = 1, \ldots, B$. Then, for each test statistic $\tau_j$, $j = 1, 2, 3$, the critical value is approximated by the corresponding quantiles of the distribution of the B bootstrap analogues: $F^*(u) = \frac{1}{B} \sum_{b=1}^{B} \mathbb{1}\{\tau_j^{*,b} \le u\}$. Recall that they are generated under the null hypothesis.
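The five steps above can be sketched as a generic routine. Everything named here is an assumption of the example: `fit_null`, `fit_alt` and `statistic` are hypothetical callables supplied by the user (returning fitted values at the sample points and a scalar statistic), and the bootstrap residuals are generated by the standard wild bootstrap with Mammen's two-point multiplier, which has mean zero and reproduces the second and third moments of $\hat u_i$. This is one common choice and not necessarily the residual scheme used in the thesis.

```python
import random

SQRT5 = 5 ** 0.5
LOW, HIGH = (1 - SQRT5) / 2, (1 + SQRT5) / 2
P_LOW = (SQRT5 + 1) / (2 * SQRT5)

def mammen_multiplier(rng):
    """Two-point variable v with E[v] = 0 and E[v^2] = E[v^3] = 1."""
    return LOW if rng.random() < P_LOW else HIGH

def bootstrap_test(X, Y, statistic, fit_null, fit_alt, h, k, h_b, B=199, seed=0):
    """Return the observed statistic and its bootstrap p-value."""
    rng = random.Random(seed)
    ms_hat = fit_null(X, Y, h)                     # step 1: fit under the null
    m_hat = fit_alt(X, Y, k)                       # step 2: unrestricted fit
    u_hat = [y - m for y, m in zip(Y, m_hat)]      # unrestricted residuals
    tau = statistic(Y, m_hat, ms_hat)              # step 3: observed statistic
    ms_b = fit_null(X, Y, h_b)                     # step 4: pre-estimate with h_b
    boot = []
    for _ in range(B):
        # Wild bootstrap responses generated under the null model.
        y_star = [m + u * mammen_multiplier(rng) for m, u in zip(ms_b, u_hat)]
        boot.append(statistic(y_star, fit_alt(X, y_star, k), fit_null(X, y_star, h)))
    p_value = sum(t >= tau for t in boot) / B      # step 5: empirical tail probability
    return tau, p_value
```

Rejecting when the p-value falls below the nominal level is equivalent to comparing the observed statistic with the corresponding quantile of $F^*$.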

This procedure is well known, has proved to be consistent for many test statistics and has therefore been applied, certainly with slight modifications, to many non- or semiparametric testing problems. However, several questions of practical importance remain open: the choice of bandwidth h in step 1, the choice of bandwidth k in step 2, how to generate the bootstrap residuals $e_i^*$ in step 4 (see above), and how to choose $h_b$. Finally, how many bootstrap samples are necessary to get a reasonable approximation of the distribution in step 5? In this paper we will discuss all these questions except the last one.

1.3.2 The Choice of Bandwidths h

The problem of finding an optimal h is somewhat different from that of finding the

optimal smoothing parameter k which is directly linked to the optimal rate of the

test statistic. In that case it is clear that a theoretical optimal choice depends on the

optimal rate at which the test can detect a deviation from the null hypothesis. For

further details see the next subsection. In most cases, the estimator of the null model

can have faster convergence rates than that of the alternative, so the asymptotics

of the test statistics provide no theoretical guideline for an optimal choice of h. In

other words, we have to rely on practical issues.

As there exist data adaptive methods for finding the optimal bandwidth k for the alternative (compare next subsection), one could argue that h should be chosen according to k. This way one could guarantee that the same smoothness is imposed on the regression function regardless of whether it is estimated under the null hypothesis or not. However, it is not clear whether this is always wanted. Moreover, we will see later that on the one hand the adaptive choice of k is computationally intensive, and on the other hand $h_b$ depends on h. For k one needs a grid search, which then has to be extended to the choice of h (as it then depends on k) and thus to the choice of $h_b$. Altogether we would get a procedure that is computationally quite unattractive.

Intuitively, it seems to be desirable to look for a reasonable estimation of the null

model. This is only guaranteed with a reasonable bandwidth choice of h beforehand.

We therefore recommend cross validation or plug-in methods.

1.3.3 The Choice of Bandwidths k

It is known that a bandwidth k which is optimal for estimation is usually suboptimal

for testing. More specifically, for testing the optimal smoothing parameter has faster

convergence rates, i.e. we should undersmooth. As for regression, cross validation





bandwidths have a tendency to undersmooth in practice, and they are also quite

popular for nonparametric testing.

As an alternative, let us consider the adaptive testing approach introduced e.g.

in Spokoiny (1996,1998). It has been extended by Rodríguez-Poó et al (2004) to

nonparametric testing problems such as those we consider here. The method is the

same for each of our three test statistics, so we can skip the index $j$ of $\tau_j$, $j = 1, 2, 3$,

in this subsection. Adapted to our problem it works as follows:

We consider simultaneously a family of tests $\{\tau_k, k \in \mathcal{K}\}$, where $\mathcal{K} = \{k_1, k_2, \ldots, k_P\}$ is a finite set of reasonable bandwidths. The theoretical maximal number $P$ depends on $n$ but is of no practical relevance; for details see Horowitz & Spokoiny (2001). Define

$$\tau_{\max} = \max_{k \in \mathcal{K}} \frac{\tau_k - E_0[\tau_k]}{\mathrm{Var}^{1/2}[\tau_k]},$$

where $E_0[\cdot]$ indicates the expectation under $H_0$. This studentizing under the null is only to correct for the deviations in distribution caused by the different bandwidths $k$. Therefore, instead of $\mathrm{Var}^{1/2}[\tau_k]$ we could take something proportional to it without losing consistency, as long as it corrects for the standard deviation caused by the different $k = k_1, \ldots, k_P$.

A particularity of the bootstrap analogues of $\tau_{\max}$ is that one first needs to calculate the bootstrap statistics $(\tau_k)^{*,b}$ for all $k \in \mathcal{K}$ to afterwards get $(\tau_{\max})^{*,b}$. Note that for each $k$, the empirical moments of the bootstrap statistics $(\tau_k)^{*,b}$ (average, respectively standard deviation) can be used as a substitute for $E_0[\tau_k]$, respectively $\mathrm{Var}^{1/2}[\tau_k]$, in practice. This is what we do in our simulation study.
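The studentization by bootstrap moments described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used for the simulations: the function names `adaptive_statistic` and `bootstrap_p_value` and the input layout (one column per bandwidth of the grid, one row per bootstrap sample) are hypothetical choices made for the example.

```python
import numpy as np

def adaptive_statistic(tau, tau_boot):
    """Studentize per bandwidth with bootstrap moments, then maximize over the grid.

    tau      : shape (P,), observed statistic for each bandwidth in the grid
    tau_boot : shape (B, P), bootstrap statistics (row b, column k)
    Returns the observed maximum and the B bootstrap maxima.
    """
    tau = np.asarray(tau, dtype=float)
    tau_boot = np.asarray(tau_boot, dtype=float)
    # Bootstrap average and standard deviation stand in for E_0 and Var^{1/2}.
    center = tau_boot.mean(axis=0)
    scale = tau_boot.std(axis=0, ddof=1)  # assumed non-zero for every bandwidth
    tau_max = np.max((tau - center) / scale)
    # Each bootstrap sample is standardized with the same moments before maximizing.
    tau_max_boot = np.max((tau_boot - center) / scale, axis=1)
    return tau_max, tau_max_boot

def bootstrap_p_value(stat, boot_stats):
    """Share of bootstrap statistics at least as large as the observed one."""
    return float(np.mean(np.asarray(boot_stats) >= stat))
```

The null hypothesis would then be rejected at level $\alpha$ when the p-value returned by `bootstrap_p_value` falls below $\alpha$.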

1.3.4 The Choice of Bootstrap Residuals

From a theoretical point of view, wild bootstrap errors should be drawn from the residuals of the alternative model, i.e. $\hat u_i$ should be used in the bootstrap procedure instead of $\hat\varepsilon_i$ or $\tilde\varepsilon_i$. It is clear that this should maximize the power, as the variance of $\hat\varepsilon_i$ (and $\tilde\varepsilon_i$) can increase greatly with increasing distance between $H_0$ and the true model. Arguments in favor of using $\hat\varepsilon_i$ exist only under practical aspects: often the size distortion in bootstrap tests is worse when using $\hat u_i$ or $\tilde\varepsilon_i$; when using adaptive procedures as described in Subsection 1.3.3, it is not that clear which of the $\hat u_i$ to use or whether the $\hat u_i$ should even be estimated independently of the $k$-choice for the test; at least in the study of Dette et al (2005), to which our study comes closest, the power loss is negligible, so the size argument is decisive.

We conclude so far that if no adaptive choice of k is made, it would be desirable

to use $\hat u_i$ as long as one can control for the size distortion.

The second question is what kind of distribution for generating the random errors

should be used. In step 4 of the bootstrap procedure described above, a distribution is often taken that gives $\varepsilon^*$ with $E[(\varepsilon^*)^j] = \hat\varepsilon^j$ for $j = 1$ up to 3 (or even more). The so-called golden-cut wild bootstrap is also quite popular, see e.g. Härdle & Mammen (1993). More recently, in the context of size distortion of bootstrap tests,

Davidson & Flachaire (2001) argue that for problems with moderate sample size

the disadvantages of the higher-order-moment adapting bootstraps outweigh their

(asymptotic) advantages. We therefore compare different methods in our simulations

(see Section 1.4).

1.3.5 An Alternative: Subsampling

An increasingly popular alternative to bootstrapping is the subsampling procedure, see Politis et al (1999). To date, as subsampling is commonly believed to converge more slowly in practice than bootstrapping, it has been used almost exclusively

when the bootstrap fails, i.e. has been proven not to converge. See Neumeyer &

Sperlich (2006) as an example in a purely nonparametric testing context. However,

Rodríguez-Poó et al (2004) introduce subsampling in the context that we discuss

here, although the bootstrap is consistent, because of the size distortion their boot-

strap test suffered from (until the sample size was huge). In both papers subsampling

works well. The former also studies the automatic choice of subsample size m (with

m < n) which turns out to work in their simulations. As this method might be

remodeled to serve as a procedure for finding hb, we briefly introduce subsampling

and the automatic choice of the subsample size m:




Chapter 1 The Size Problem of Kernel Based Bootstrap Tests when the Null is Nonparametric

Let $\mathcal{Y} = \{(X_i, Y_i) \mid i = 1, \ldots, n\}$ be the original sample, and denote by $\tau(\mathcal{Y})$ the original statistic calculated from this sample, leaving aside index $j = 1, 2, 3$ for a moment. To determine the critical values we need to approximate

$$Q(z) = P\left(n\sqrt{k^d}\,\tau(\mathcal{Y}) \le z\right). \quad (1.6)$$

Recall that under $H_0$ this distribution converges to an $N(\mu, v^2)$; for $\mu$ and $v$ see Subsection 1.2.2. For finite sample size $n$, drawing $B$ subsamples $\mathcal{Y}_b$, each of size $m$, we can approximate $Q$ under $H_0$ by

$$\hat Q(z) := \frac{1}{B} \sum_{b=1}^{B} I\left\{ m\sqrt{k_m^d}\,\tau_{k_m}(\mathcal{Y}_b) \le z \right\}. \quad (1.7)$$

Note that the awkward notation comes from the fact that we have to adjust all bandwidths for the new sample size $m$. For example, imagine $k = k_0 \cdot n^{-\delta}$ for $k_0$ being constant. Then, $\tau_{k_m}$ is calculated like $\tau$ but with bandwidth $k_m = k\,n^{\delta} m^{-\delta} = k_0 m^{-\delta}$. Certainly, under the alternative $H_1$, not only $n\sqrt{k^d}\,\tau(\mathcal{Y})$ but also $m\sqrt{k_m^d}\,\tau_{k_m}(\mathcal{Y}_m)$ converges to infinity. Demanding $m/n \to 0$ guarantees that $n\sqrt{k^d}\,\tau(\mathcal{Y})$ converges (much) faster to infinity than the subsample analogues. Then, $\hat Q$ underestimates the quantiles of $Q$, which yields the rejection of $H_0$.
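The subsampling approximation of the critical value can be sketched in a few lines. This is a minimal illustration under stated assumptions: `statistic` is a hypothetical user-supplied callable `statistic(x, y, bandwidth)` returning the unscaled test statistic, the bandwidth rule $k = k_0 n^{-\delta}$ is the example rule from the text, and all function and argument names are invented for the example.

```python
import numpy as np

def subsample_quantile(x, y, statistic, m, k0, delta, d, alpha=0.05, B=200, seed=None):
    """(1 - alpha)-quantile of the scaled statistic over B subsamples of size m."""
    rng = np.random.default_rng(seed)
    n = len(y)
    k_m = k0 * m ** (-delta)  # bandwidth readjusted to the subsample size
    scaled = np.empty(B)
    for b in range(B):
        idx = rng.choice(n, size=m, replace=False)  # subsample without replacement
        scaled[b] = m * np.sqrt(k_m ** d) * statistic(x[idx], y[idx], k_m)
    return float(np.quantile(scaled, 1.0 - alpha))

def subsampling_rejects(x, y, statistic, m, k0, delta, d, alpha=0.05, B=200, seed=None):
    """Compare the scaled full-sample statistic with the subsampling quantile."""
    n = len(y)
    k = k0 * n ** (-delta)
    observed = n * np.sqrt(k ** d) * statistic(x, y, k)
    critical = subsample_quantile(x, y, statistic, m, k0, delta, d, alpha, B, seed)
    return bool(observed > critical)
```

Because the full-sample statistic is scaled by $n\sqrt{k^d}$ and the subsample statistics only by $m\sqrt{k_m^d}$, a statistic that does not vanish under the alternative pushes the observed value above the subsampling quantile once $m/n$ is small.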

The problem here is to find a proper subsample size $m$. Actually, the optimal $m$ is a function of the level $\alpha$. Again we apply resampling methods: Draw some pseudo sequences $\mathcal{Y}^{*,l}$, $l = 1, \ldots, L$, of size $n$ with the same distribution as $\mathcal{Y}$. For the desired level $\alpha$, test $H_0^*: m(x) - m_S(x) = \hat m(x) - \hat m_S(x)$ the same way as you want to test $H_0: m(x) = m_S(x)$, i.e. applying your particular test statistic to $H_0^*$ and using subsampling. From the $L$ repetitions you can determine the empirical rejection level (estimated size) for your given $\alpha$. Now find an $m$ such that this empirical rejection level is $\le \alpha$. In practice, you choose from a grid of possible $m$ the one whose estimated rejection level for $H_0^*$ is closest to $\alpha$ from below. Note that $H_0^*$ is always true up to an estimation error that should be almost the same as in your original test. The only drawback of this procedure is the enormous computational effort. For further details and examples see Politis et al (1999), Delgado et al (2001), and Neumeyer & Sperlich (2006).
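The final selection step, picking from a grid the candidate whose estimated rejection level is closest to the nominal level from below, can be written down directly. The sketch below assumes the empirical rejection levels have already been computed from the $L$ pseudo samples; the function name `closest_from_below` and the example numbers are hypothetical.

```python
def closest_from_below(grid, rejection_levels, alpha):
    """Return the grid value whose estimated rejection level is closest to
    alpha without exceeding it, or None if every candidate exceeds alpha."""
    best_value, best_gap = None, None
    for value, level in zip(grid, rejection_levels):
        if level > alpha:
            continue  # candidate does not hold the level
        gap = alpha - level
        if best_gap is None or gap < best_gap:
            best_value, best_gap = value, gap  # ties keep the earlier candidate
    return best_value

# Hypothetical grid of subsample sizes with made-up estimated sizes.
chosen_m = closest_from_below([40, 50, 60], [0.08, 0.04, 0.01], alpha=0.05)
```

The same helper applies unchanged when the grid consists of candidate bootstrap bandwidths instead of subsample sizes.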





1.3.6 The Choice of Bootstrap Bandwidth $h_b$

In general, for many test statistics one could repeat the arguments outlined in Härdle & Marron (1990, 1991): For the mean of $\hat m_h(x) - m(x)$ under the conditional distribution of $Y_1, \ldots, Y_n \mid X_1, \ldots, X_n$, respectively of $\hat m_h^*(x) - \hat m_{h_b}(x)$ under the conditional distribution of $Y_1^*, \ldots, Y_n^* \mid X_1, \ldots, X_n$, we know from Rosenblatt (1969) that

$$E^{Y|X}\left(\hat m_h(x) - m(x)\right) \approx h^2 \frac{\mu(K)}{2} m''(x), \quad (1.8)$$

$$E^{*}\left(\hat m_h^*(x) - \hat m_{h_b}(x)\right) \approx h^2 \frac{\mu(K)}{2} \hat m_{h_b}''(x), \quad (1.9)$$

where $\mu(K) = \int u^2 K(u)\,du$. Obviously, we need that $\hat m_{h_b}''(x) - m''(x) \to 0$. The optimal bandwidth $h_b$ for estimating the second derivative is known to be much larger (in rates) than the optimal $h$ for estimating the function itself. We can even give the optimal rate. For example, the optimal rate to estimate $m_S''$ is of the order $n^{-1/9}$ (instead of $n^{-1/5}$), an observation we make use of in our simulation studies in Section 1.4.

As will be seen once more in Section 1.4, the typical comment that $h_b$ has to be oversmoothing is unhelpful in practice. We therefore try the following automatic bandwidth choice: apply the same procedure used for the automatic choice of a proper subsample size $m$ (last subsection) to find an adequate $h_b$ for a given level $\alpha$. This is what we explain in more detail and afterwards try in our simulation study.

1.4 Simulation Results

To study all the points listed in the last section, we perform a huge number of

simulations. We give here only a summary of them; for example, limiting the presentation to $\tau_j$, $j = 1, 2, 3$, one particular model, one specific (random) design, and

sample size n = 100.

The model considered is as follows. We consider the same data generating process as Dette et al (2005). That is, we draw i.i.d. three-dimensional explanatory variables

$$X_i \sim N(0, \Sigma_X) \quad \text{with} \quad \Sigma_X = \begin{pmatrix} 1 & 0.2 & 0.4 \\ 0.2 & 1 & 0.6 \\ 0.4 & 0.6 & 1 \end{pmatrix}$$

and i.i.d. error terms $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$ to generate

$$Y_i = X_{1,i} + X_{2,i}^2 + 2\sin(\pi X_{3,i}) + a X_{2,i} X_{3,i} + \varepsilon_i, \quad i = 1, \ldots, n,$$

with $a = 0$ to generate an additive separable model, or $a = 2$ for the alternative. Recall that the target is a test for additivity. Unless otherwise indicated, we set $\sigma_\varepsilon = 1$. Dette et al (2005) show that in the rather unrealistic situation where $\Sigma_X$ is the identity matrix (i.e. with an uncorrelated design), the problem is greatly simplified, whereas a (much) stronger correlated design than ours leads to identification problems for moderate sample sizes.
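For readers who want to reproduce the design, one draw from this data generating process might be simulated as below. This is a sketch under assumptions: the function name `simulate_sample` and the `seed` argument are invented for the example, and the second additive component is taken to be the square of the second regressor, which is how the regression equation is read here.

```python
import numpy as np

# Covariance matrix of the three explanatory variables.
SIGMA_X = np.array([[1.0, 0.2, 0.4],
                    [0.2, 1.0, 0.6],
                    [0.4, 0.6, 1.0]])

def simulate_sample(n=100, a=0.0, sigma_eps=1.0, seed=None):
    """Draw (X, Y); a = 0 gives the additive null model, a = 2 the alternative."""
    rng = np.random.default_rng(seed)
    x = rng.multivariate_normal(np.zeros(3), SIGMA_X, size=n)
    eps = rng.normal(0.0, sigma_eps, size=n)
    additive_part = x[:, 0] + x[:, 1] ** 2 + 2.0 * np.sin(np.pi * x[:, 2])
    interaction = a * x[:, 1] * x[:, 2]  # the term that violates additivity
    return x, additive_part + interaction + eps
```

Calling `simulate_sample(n=100, a=2.0)` yields one sample under the alternative; repeating the call 250 times would mirror the number of replications reported for the tables.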

All results in the tables are calculated from 250 replications using 200 bootstrap

samples (or subsamples respectively). For real data applications 200 bootstrap samples are certainly very few; but in our simulations the results differed little when we

increased the number to 500. We used the (multiplicative) quartic kernel through-

out.

In all three test statistics we use the weighting function $w(\cdot)$ for different trimming: we cut the outer 10%, 5% or 0% of the sample, where "outer" refers to the

tails of the explanatory variables. This is done to get rid of the boundary effects in

the statistics. The tables only give results for 5% and 0% as the boundary effects

turned out not to be a major problem.

To further speed up our simulation studies, we first looked for an average cross validation bandwidth $k$, which turned out to be $k_{opt} = 0.78$. Then we did all our simulations for the non-adaptive tests (compare Subsection 1.3.3) with $k_{opt}$. This was done not only for computational reasons but also because otherwise the size of the tests would also depend on the randomness induced by the estimation of $k$. For the adaptive test procedure, $k$ ran over a grid of 10 bandwidths placed around $k_{opt}$. We verified that in most cases $\tau_{\max}$ did not refer to the boundary, i.e. to $k_{\min}$ or $k_{\max}$.


As discussed above, the bandwidth choice problem is different for $h$. Here, the parameter responsible for the size, $h_b$, depends on both $\alpha$ (the level) and $h$. Altogether, it is no problem here that $h$ is chosen by cross validation in each simulation run as recommended in Subsection 1.3.2. For the internalized marginal integration estimator, cross validation bandwidths were introduced by Kim et al (1999). For

the nuisance directions X-a (see equation (1.4) in Section 1.2.1) we used h_a = 6 • h

as recommended in Dette et al (2005) and Hengartner & Sperlich (2005).

We tried different bootstrap residuals (compare Subsection 1.3.4). Our simulations mainly seem to confirm the findings discussed above. Therefore, below we report only results referring to $\varepsilon_i^* = \hat\varepsilon_i e_i$, where the $e_i$ are i.i.d., drawn either from the golden-cut distribution

$$e_i = \begin{cases} -(\sqrt{5} - 1)/2 & \text{with probability } p = (\sqrt{5} + 1)/(2\sqrt{5}), \\ (\sqrt{5} + 1)/2 & \text{with probability } 1 - p, \end{cases}$$

or from the standard normal $N(0,1)$. However, we admit that it may be interesting to try more, different automatic choice procedures for $h_b$, in order to study again what effect the choice of residuals taken has ($\hat u_i$, $\hat\varepsilon_i$ or $\tilde\varepsilon_i$).
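Both multiplier schemes are easy to generate. The sketch below uses the standard two-point golden-cut distribution, whose support points $-(\sqrt{5}-1)/2$ and $(\sqrt{5}+1)/2$ give mean 0 and second and third moments equal to 1; the function names and the `scheme` argument are assumptions made for this illustration.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)
LOW = -(SQRT5 - 1.0) / 2.0            # drawn with probability P_LOW
HIGH = (SQRT5 + 1.0) / 2.0            # drawn with probability 1 - P_LOW
P_LOW = (SQRT5 + 1.0) / (2.0 * SQRT5)

def golden_cut_multipliers(n, seed=None):
    """n i.i.d. draws from the two-point golden-cut distribution."""
    rng = np.random.default_rng(seed)
    return np.where(rng.random(n) < P_LOW, LOW, HIGH)

def wild_bootstrap_errors(residuals, scheme="golden", seed=None):
    """Multiply each residual by an independent multiplier with mean 0, variance 1."""
    rng = np.random.default_rng(seed)
    residuals = np.asarray(residuals, dtype=float)
    if scheme == "golden":
        multipliers = golden_cut_multipliers(residuals.size, rng)
    elif scheme == "gaussian":
        multipliers = rng.standard_normal(residuals.size)
    else:
        raise ValueError("scheme must be 'golden' or 'gaussian'")
    return residuals * multipliers
```

The Gaussian multipliers match only the first two moments of the residuals, whereas the golden-cut multipliers also match the third, which is the higher-order-moment adaptation discussed in Subsection 1.3.4.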

Probably the most interesting and challenging point is the choice of $h_b$. We first give the results for several choices of $h_b$ with different bootstrap generating methods, $k$-adaptive and non-adaptive procedures. To have $h_b$ as a function of $h$, to take also into account $h/h_b \to 0$, and perhaps validate the rate $n^{-1/9}$ (motivated in Subsection 1.3.6), we set $h_b = h\,n^{1/5 - 1/\kappa}$ and try different $\kappa \le 9$.
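To see what this rule implies, note that if $h$ is of order $n^{-1/5}$, then $h_b = h\,n^{1/5 - 1/\kappa}$ is of order $n^{-1/\kappa}$: $\kappa = 5$ returns $h_b = h$, $\kappa > 5$ oversmooths and $\kappa < 5$ undersmooths relative to $h$. A small helper makes this concrete; the name `bootstrap_bandwidth` and the value $h = 0.5$ are hypothetical and chosen only for illustration.

```python
def bootstrap_bandwidth(h, n, kappa):
    """Pre-estimation bandwidth h_b = h * n**(1/5 - 1/kappa)."""
    return h * n ** (1.0 / 5.0 - 1.0 / kappa)

# Ratios h_b / h for the sample size n = 100 used in the simulations.
n, h = 100, 0.5
ratios = {kappa: bootstrap_bandwidth(h, n, kappa) / h for kappa in range(1, 10)}
```

The dictionary `ratios` is below 1 for $\kappa < 5$, equal to 1 at $\kappa = 5$ and above 1 for $\kappa > 5$, which is the direction in which the tables should be read.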

Table 1.1 shows the results for the non-adaptive golden-cut bootstrap test. These

results basically i) confirm the statements of Dette et al (2005) for our context;

and ii) show that the problem is not solved simply by different smoothing in the

pre-estimation. Undersmoothing, as generally stated from a theoretical point of

view, seems to go in the wrong direction. In particular, the hope that the results

of Rosenblatt (1969) (see equations (1.8) and (1.9)) might give us a hint or even

provide a rule of thumb for the choice of $h_b$, is not confirmed here. $\tau_3$, introduced by Rodríguez-Poó et al (2004), clearly outperforms the others in this study (as it

does in the following).





The results for the $k$-adaptive analogues, see Table 1.2, show hardly any improvement. In particular, the problem of choosing $h_b$ or, in other words, the size problem is only mitigated for $\tau_3$.

Following to some extent the findings of Davidson & Flachaire (2001), we then repeated these two studies but with the Gaussian bootstrap, see above. Though there is some improvement in both size and power, the results in Tables 1.3 and 1.4 give us hope only for test statistic $\tau_3$. Note that the observation that a slight undersmoothing produces much better results than oversmoothing has not changed over the four different trials.

Next, for comparison we also provide a small simulation study where the critical

values are approximated by subsampling, trying several subsample sizes m. The

results are given in Table 1.5 for non-adaptive tests, and in Table 1.6 for $k$-adaptive tests. We tried more sizes $m$ for the non-adaptive test but got reasonable results only for $\tau_3$. In contrast, looking at the $k$-adaptive versions, $\tau_1^{\max}$ and $\tau_2^{\max}$ seem to work, too, though with rather weak power. Table 1.6 unfortunately is misleading concerning $\tau_3^{\max}$: one needs a much smaller $m$ to get reasonable results here. A small simulation study evaluating the automatic choice of $m$ seems to indicate that this procedure might work and therefore should be tried for what is our main focus: the automatic choice of $h_b$.

Therefore, we adjusted the automatic choice of the subsample size to find an adequate $h_b$ (see Subsection 1.3.6). This was done as follows, described here in detail for $\tau_3$. Let $\{Y_i^*, X_i^*\}_{i=1}^n =: \mathcal{Y}^*$ be a member of the pseudo sequence introduced in Subsection 1.3.5. Then, for testing $H_0^*: m(x) - m_S(x) = \hat m(x) - \hat m_S(x)$ with sample $\mathcal{Y}^*$, an analogue to $\tau_3$ would be

1 ^ = IT, 2

-K^X'-XAiYj-msiXj)} w{X¡) . (1.10)





Other statistics are certainly conceivable, e.g.

2

w(X-- ¿ K , ( X t - X*){Y* ~ rhs(X;)} - ¥Lh{Xt - X3){Y0- - ms(X3)} nkd n /—1

but they should all be asymptotically equivalent to (1.10). The procedure was performed with only $L = 100$ pseudo samples $\mathcal{Y}^*$. As the results varied widely, we were forced either to enlarge $L$ considerably or to reduce $\sigma_\varepsilon$ considerably. For computational reasons we decided on the second option and repeated the study with $\sigma_\varepsilon = 0.1$.

Some results are summarized in Table 1.7. As can be seen, this time we emphasize the possibility of undersmoothing much more. You first have to look at $\tau_3^*$ to find the $\kappa$ giving the rejection level closest to $\alpha = 5\%$ from below. Here, this is always $\kappa = 3$. Note that this might also change depending on the trimming, $\alpha$, sample size, etc. It is important to understand that the lines of $\tau_3^*$ can always be calculated, i.e. without knowing the true data generating process. Therefore we call this method fully automatic. Now look at the lines for $\tau_3$, the test of interest. Obviously, $\kappa = 3$ is indeed the best possible choice; it holds the level and has the strongest power of any $\kappa$ respecting the level. This could be taken as indicating that our suggestion for selecting $h_b$ works. Unfortunately, this method does not work that well for all possible $\alpha$; specifically, it becomes quite incorrect for $\alpha > 10\%$. We repeated this study also for $\tau_1$ and $\tau_2$. The results were always somewhat worse than for $\tau_3$, so they do not change our conclusion that this procedure seems to be an interesting and promising approach but further research is necessary.

1.5 Conclusions

We discuss the choices of all "parameters" a practitioner has to use when facing a

kernel based specification test where the null hypothesis is non- or semiparametric.

We have set parameters in quotation marks because we refer here also to questions

such as how to generate bootstrap errors, etc. However, our main focus is the boot-

strap and its size distortion in practice when the sample size is small or moderate.





These points are illustrated along the popular problem of additivity testing. Natu-

rally, one looks for an optimal trade-off between controlling for size under the null

hypothesis $H_0$ and maximizing power. Even though these problems have already

been discussed and studied in theory, as yet, it is unclear how to set these para-

meters in practice. We show that theory is not just unhelpful here; at present, a

reasonable application of tests of these kinds is questionable.

We try and compare many modifications that can be found in the literature

without finding any clue to an optimal - or even a reasonable - parameter choice.

While there are different suggestions for singular problems such as which residuals

to take for the bootstrap or an adaptive choice of k, combining them gives puzzling

results. Sometimes, in practice, combining these suggestions, the power goes down

where it should increase or size becomes less precise where it should come closer to

the level.

Altogether, we have recommended certain procedures for particular test statistics. However, the main open question seems to be how to find an automatic choice of $h_b$. We suggest a new procedure, taken from subsampling theory, that seems to

be a good way to go. However, further research is necessary to provide reliable

procedures for the nonparametric testing problems considered here.





REFERENCES

Davidson, R. and Flachaire, E. (2001) The Wild Bootstrap, Tamed at Last. Working Papers 1000, Queen's University, Department of Economics.

Delgado, M.A., Rodríguez-Poó, J.M. and Wolf, M. (2001) Subsampling Cube Root Asymptotics with an Application to Manski's MSE. Economics Letters, 73, 241-250.

Dette, H., von Lieres und Wilkau, C., and Sperlich, S. (2005) A Comparison of Different Nonparametric Methods for Inference on Additive Models. J. Nonparametric Statistics, 17, 57-81.

Guerre, E. and Lavergne, P. (2005) Data-driven rate-optimal specification testing in regression models. Annals of Statistics, 33(2), 840-870.

Härdle, W. and Marron, J.S. (1990) Semiparametric Comparison of Regression Curves. Annals of Statistics, 18, 63-89.

Härdle, W. and Marron, J.S. (1991) Bootstrap Simultaneous Error Bars for Nonparametric Regression. Annals of Statistics, 19, 778-796.

Härdle, W. and Mammen, E. (1993) Comparing Nonparametric Versus Parametric Regression Fits. Annals of Statistics, 21, 1926-1947.

Härdle, W., Sperlich, S., and Spokoiny, V. (2001) Structural tests in additive regression. J. Am. Statist. Assoc., 96, 1333-1347.

Hengartner, N.W. and Sperlich, S. (2005) Rate Optimal Estimation with the Integration Method in the Presence of Many Covariates. Journal of Multivariate Analysis, 95, 246-272.

Horowitz, J.L. and Spokoiny, V. (2001) An Adaptive, Rate-optimal Test of Parametric Mean-Regression Model Against A Nonparametric Alternative. Econometrica, 69, No. 3, 599-631.

Jones, M.C., Davies, S.J. and Park, B.U. (1994) Versions of Kernel-Type Regression Estimators. Journal of the American Statistical Association, 89, 825-832.

Kallenberg, W.C.M. and Ledwina, T. (1995) Consistency and Monte-Carlo simulations of a data driven version of smooth goodness-of-fit tests. Annals of Statistics, 23, 1594-1608.

Kim, W., Linton, O.B., and Hengartner, N. (1999) A computationally efficient oracle estimator of additive nonparametric regression with bootstrap confidence intervals. The J. of Computational and Graphical Statistics, 8, 278-297.

Ledwina, T. (1994) Data-driven version of Neyman's smooth test of fit. J. Amer. Stat. Ass., 89, 1000-1005.

Neumeyer, N. and Sperlich, S. (2006) Comparison of Separable Components in Different Samples. Forthcoming in the Scandinavian Journal of Statistics.

Politis, D.N., Romano, J.P., and Wolf, M. (1999) Subsampling. Springer Series in Statistics. Springer.

Roca-Pardiñas, J. and Sperlich, S. (2006) Testing the link when the index is semiparametric - A comparison study. Working Paper, Universidad de Vigo, Spain.

Rodríguez-Poó, J.M., Sperlich, S., and Vieu, P. (2004) An Adaptive Specification Test For Semiparametric Models. Working Paper, Carlos III de Madrid, Spain.

Rosenblatt, M. (1969) Conditional Probability Density and Regression Estimators. Multivariate Analysis II, 25-31.

Spokoiny, V. (1996) Adaptive hypothesis testing using wavelets. Annals of Statistics, 24, 2477-2498.

Spokoiny, V. (1998) Adaptive and spatially adaptive testing of a nonparametric hypothesis. Math. Methods of Statist., 7, 245-273.




Trimming  α(%)  κ    under H0 (a = 0.0)      under H1 (a = 2.0)
                     τ1     τ2     τ3        τ1     τ2     τ3
0%         5    4    .000   .000   .008      .000   .032   .248
                5    .040   .000   .008      .004   .012   .184
                6    .068   .000   .008      .012   .012   .184
                7    .128   .000   .012      .016   .012   .196
                8    .176   .000   .012      .028   .012   .228
                9    .256   .000   .012      .028   .024   .252
0%        10    4    .024   .000   .032      .004   .060   .448
                5    .068   .000   .024      .008   .028   .312
                6    .120   .000   .024      .020   .020   .292
                7    .184   .000   .024      .032   .024   .300
                8    .272   .000   .020      .036   .028   .304
                9    .344   .000   .020      .056   .032   .340
5%         5    4    .012   .000   .008      .016   .052   .248
                5    .060   .000   .008      .020   .032   .192
                6    .108   .000   .008      .028   .028   .184
                7    .172   .000   .012      .040   .028   .184
                8    .284   .000   .012      .064   .032   .228
                9    .340   .000   .012      .080   .032   .244
5%        10    4    .040   .000   .024      .024   .112   .448
                5    .084   .000   .020      .036   .076   .308
                6    .168   .000   .024      .044   .052   .284
                7    .288   .000   .024      .064   .048   .292
                8    .364   .000   .020      .076   .052   .308
                9    .440   .000   .020      .116   .056   .332

Table 1.1: Rejection levels of the three original test statistics with and without trimming. Critical values are determined with golden-cut wild bootstrap, using $h_b = h\,n^{1/5 - 1/\kappa}$ for the pre-estimation.


Trimming  α(%)  κ    under H0 (a = 0.0)      under H1 (a = 2.0)
                     τ1     τ2     τ3        τ1     τ2     τ3
0%         5    4    .004   .004   .028      .044   .032   .176
                5    .004   .004   .020      .064   .056   .204
                6    .000   .000   .012      .048   .036   .204
                7    .000   .000   .000      .036   .032   .196
                8    .000   .000   .000      .036   .012   .196
                9    .000   .000   .000      .032   .008   .188
0%        10    4    .016   .012   .076      .096   .072   .316
                5    .012   .008   .072      .140   .120   .308
                6    .008   .004   .056      .132   .092   .296
                7    .000   .000   .028      .104   .052   .316
                8    .000   .000   .008      .072   .044   .296
                9    .000   .000   .008      .064   .036   .284
5%         5    4    .008   .004   .016      .080   .052   .196
                5    .000   .000   .016      .068   .024   .184
                6    .000   .000   .008      .036   .016   .188
                7    .000   .000   .004      .016   .012   .184
                8    .000   .000   .004      .008   .008   .200
                9    .000   .000   .004      .008   .004   .192
5%        10    4    .020   .012   .080      .136   .120   .328
                5    .008   .004   .064      .120   .096   .296
                6    .004   .000   .040      .116   .060   .296
                7    .000   .000   .024      .100   .036   .292
                8    .000   .000   .008      .084   .024   .284
                9    .000   .000   .008      .056   .016   .288

Table 1.2: Rejection levels of the three $k$-adaptive test statistics with and without trimming. Critical values are determined with golden-cut wild bootstrap, using $h_b = h\,n^{1/5 - 1/\kappa}$ for the pre-estimation.





Trimming  α(%)  κ    under H0 (a = 0.0)      under H1 (a = 2.0)
                     τ1     τ2     τ3        τ1     τ2     τ3
0%         5    4    .004   .000   .008      .000   .036   .340
                5    .036   .000   .012      .004   .024   .236
                6    .080   .000   .012      .008   .012   .216
                7    .132   .000   .012      .016   .016   .224
                8    .188   .000   .012      .028   .012   .240
                9    .260   .000   .012      .032   .016   .248
0%        10    4    .020   .000   .044      .012   .064   .560
                5    .072   .000   .044      .012   .036   .380
                6    .116   .000   .032      .020   .024   .336
                7    .196   .000   .028      .036   .032   .332
                8    .276   .000   .016      .044   .032   .344
                9    .352   .000   .020      .068   .032   .372
5%         5    4    .012   .000   .008      .008   .080   .324
                5    .052   .000   .012      .008   .036   .236
                6    .116   .000   .012      .028   .036   .212
                7    .176   .000   .012      .040   .028   .220
                8    .268   .000   .012      .064   .028   .244
                9    .352   .000   .012      .096   .032   .260
5%        10    4    .028   .000   .048      .036   .172   .556
                5    .088   .000   .032      .036   .092   .372
                6    .164   .000   .024      .048   .072   .332
                7    .252   .000   .020      .060   .056   .328
                8    .380   .000   .016      .092   .048   .340
                9    .436   .000   .020      .120   .064   .376

Table 1.3: Rejection levels of the three original test statistics with and without trimming. Critical values are determined with Gaussian bootstrap, using $h_b = h\,n^{1/5 - 1/\kappa}$ for the pre-estimation.





Trimming  α(%)  κ    under H0 (a = 0.0)      under H1 (a = 2.0)
                     τ1     τ2     τ3        τ1     τ2     τ3
0%         5    4    .000   .000   .036      .048   .028   .220
                5    .000   .000   .028      .048   .048   .204
                6    .000   .000   .020      .052   .032   .216
                7    .000   .000   .008      .032   .016   .200
                8    .000   .000   .008      .024   .008   .204
                9    .000   .000   .008      .016   .008   .200
0%        10    4    .028   .008   .096      .124   .096   .364
                5    .020   .004   .088      .184   .156   .340
                6    .004   .000   .056      .172   .124   .328
                7    .004   .000   .032      .116   .072   .324
                8    .004   .000   .024      .092   .048   .296
                9    .000   .000   .016      .076   .032   .304
5%         5    4    .004   .004   .024      .064   .036   .220
                5    .000   .000   .024      .048   .020   .200
                6    .000   .000   .012      .032   .012   .196
                7    .000   .000   .008      .016   .004   .200
                8    .000   .000   .008      .016   .004   .200
                9    .000   .000   .008      .016   .004   .204
5%        10    4    .012   .012   .096      .136   .100   .368
                5    .004   .000   .072      .124   .092   .332
                6    .000   .000   .048      .100   .048   .300
                7    .000   .000   .040      .072   .032   .312
                8    .000   .000   .020      .052   .012   .292
                9    .000   .000   .012      .040   .012   .296

Table 1.4: Rejection levels of the three $k$-adaptive test statistics with and without trimming. Critical values are determined with Gaussian bootstrap, using $h_b = h\,n^{1/5 - 1/\kappa}$ for the pre-estimation.





Trimming  α(%)  m    under H0 (a = 0.0)      under H1 (a = 2.0)
                     τ1     τ2     τ3        τ1     τ2     τ3
0%         5    50   .000   .000   .000      .004   .000   .028
                40   .000   .000   .040      .004   .004   .212
0%        10    50   .004   .000   .020      .004   .004   .248
                40   .028   .000   .224      .004   .004   .744
5%         5    50   .000   .000   .000      .000   .000   .032
                40   .000   .000   .052      .000   .000   .202
5%        10    50   .000   .000   .028      .000   .000   .232
                40   .000   .000   .240      .000   .000   .732

Table 1.5: Rejection levels of the three original test statistics with and without trimming. Critical values are determined with subsampling, using subsamples of sizes m.

Trimming  α(%)  m    under H0 (a = 0.0)      under H1 (a = 2.0)
                     τ1     τ2     τ3        τ1     τ2     τ3
0%         5    90   .000   .000   .000      .140   .148   .000
                80   .000   .000   .000      .148   .160   .000
                70   .056   .088   .000      .156   .168   .000
                60   .244   .336   .000      .196   .236   .000
0%        10    90   .000   .000   .000      .192   .196   .000
                80   .028   .072   .000      .192   .208   .000
                70   .208   .328   .000      .276   .308   .000
                60   .584   .680   .000      .416   .484   .000
5%         5    90   .000   .000   .000      .080   .104   .000
                80   .000   .000   .000      .080   .104   .000
                70   .008   .016   .000      .076   .088   .000
                60   .060   .084   .000      .064   .076   .000
5%        10    90   .000   .000   .000      .128   .152   .000
                80   .008   .012   .000      .140   .148   .000
                70   .048   .096   .000      .132   .160   .000
                60   .196   .304   .000      .136   .168   .000

Table 1.6: Rejection levels of the three $k$-adaptive test statistics with and without trimming. Critical values are determined with subsampling, using subsamples of sizes m.


            Trimming  Statistic   κ = 1     2      3      4      5      6      7
H0 (a = 0)  0%        τ3*         .012   .063   .028   .030   .032   .031   .029
                      τ3          .680   .392   .032   .012   .012   .012   .016
            5%        τ3*         .012   .062   .028   .030   .032   .031   .029
                      τ3          .676   .380   .024   .012   .012   .012   .020
H1 (a = 2)  0%        τ3*         .001   .019   .042   .022   .015   .011   .009
                      τ3          .972   .932   .632   .380   .272   .260   .264
            5%        τ3*         .001   .019   .042   .023   .015   .011   .010
                      τ3          .968   .936   .620   .368   .260   .252   .264

Table 1.7: Rejection levels of $\tau_3$ and $\tau_3^*$ for $\alpha = 5\%$, with and without trimming, using Gaussian bootstrap with $h_b = h\,n^{1/5 - 1/\kappa}$ for the pre-estimation.




Chapter 2

Estimating and Testing An Additive Partially Linear Model in a System of Engel Curves

2.1 Introduction

The specification of Engel curves in empirical microeconomics has been an important problem since the early studies of Working (1943) and Leser (1963) and the well-known work of Deaton and Muellbauer (1980a), in which they developed parametric structures such as the Almost Ideal and Translog demand models. Many microeconomic examples are provided in Deaton and Muellbauer (1980b) in which a separable structure is convenient for analysis and important for interpretability. However, there is increasing empirical evidence pointing to the conclusion that some sort of nonlinearity is present in the specification of Engel curves. An alternative way of investigating nonlinear effects is to model consumer behavior by means of semi- and nonparametric additive structures. Moreover, non- and semiparametric regression provides an alternative to standard parametric regression, allowing the data to determine the local shape of the conditional mean.

From an economic point of view there are many reasons why it is interesting to

recover a correct specification of Engel curves. Firstly, a correct specification allows

us to examine the nature of the effect of changes in indirect tax reforms. Secondly,

it is important to specify the response of consumers in the face of changes in total


income. Changes of this kind allow us to assess the impact on consumers' welfare.

Consumer demand has become a very important field for applying non- and semiparametric methods. An interesting analysis of the cross-sectional behavior of consumers in the context of a fully nonparametric model can be found in Bierens and Pott-Buter (1990). Papers which consider the implementation of semiparametric methods in empirical analysis of consumer demand include Banks, Blundell and Lewbel (1997) and Blundell, Duncan and Pendakur (1998). This latter paper is of special interest because its regression analysis is based on semi- and nonparametric specifications of Engel curves. It also tests the Working-Leser and Piglog null hypothesis against the well-known partial linear model in which budget expenditures are linear in the log of total expenditure. In this paper we estimate the Engel curves directly as in Lyssiotou, Pashardes and Stengos (2003) among others.

We estimate an additive partially linear model (PLM) in order to investigate consumer behavior using individual household data drawn from the Spanish Expenditure Survey (SES) and use the results obtained from the semiparametric analysis to examine the modelling of age, schooling and expenditure in a system of Engel curves. The importance of using an additive PLM lies in the fact that in the context of this model the effects of expenditure, age and schooling on consumer demand can be investigated simultaneously in the semiparametric context¹. There are several ways to estimate a nonparametric additive structure, and we mention only the most important: smooth backfitting, series estimators and marginal integration. In this paper we use internalized marginal integration to estimate the nonparametric components in the additive PLM, mainly because at the present time there is no applied or theoretical study on the testing procedure using smooth backfitting.

¹ Analysis of consumer behavior can be carried out with fully nonparametric models. However, for the sake of interpretability and implementation, additive models overcome the well-known problems coming from multidimensional Nadaraya-Watson and local polynomial regression estimators.

Most of the papers that investigate consumer behavior in a nonparametric context are focused on the appropriate way of modeling the form of the Engel curves. Those focused on the unidimensional nonparametric effect of log total expenditure on budget expenditures, taking into account some parametric indexes to reflect demographic composition, include Blundell, Browning and Crawford (2003) and references therein. In this paper we investigate consumer behavior in semi- and nonparametric terms, focusing on the nonparametric effects of total expenditure, age and schooling. In this study, unless stated otherwise, the effects of age and schooling refer to the age and schooling of the household head. There is evidence suggesting that these have a deeper effect than generally assumed in parametric demand analysis (see Lyssiotou, Pashardes and Stengos (2001)). In fact, it is common practice to include the square of age and/or schooling as well as their higher terms in parametric models to capture possible nonlinear effects.

Inference in nonparametric regression can take place in a number of ways. The

most natural is to use nonparametric regression as an alternative against a fully

parametric or semiparametric null hypothesis. With this in mind, we investigate

whether an additive PLM provides a reasonable fit to our data, using different

resampling schemes to obtain critical values of the test statistics. In this paper

we are interested in applying some recently developed test statistics which are very

popular in the literature on testing semiparametric hypotheses against nonparametric

alternatives. These test statistics are in the spirit of Härdle and Mammen

(1993) and Gozalo and Linton (2001), among others. In addition, there is a

growing interest in so-called adaptive testing methods, in which the test statistics

adapt to the unknown smoothness of the alternative; see, among others,

Horowitz and Spokoiny (2001) and Rodríguez-Póo, Sperlich and Vieu (2005). In

this paper we adapt their ideas, with some differences, to our problem using kernel

smoothers.
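As a rough illustration of how resampling can deliver critical values for a statistic in the spirit of Härdle and Mammen (1993), the following Python sketch tests a simple linear null against a univariate kernel alternative using a wild bootstrap. It is not the test applied in this chapter, where the null is an additive PLM. The function names, the Gaussian kernel, the fixed bandwidth `h` and the two-point bootstrap weights are all assumptions made for illustration.

```python
import numpy as np

def nw_smooth(x_eval, x, y, h):
    """Nadaraya-Watson estimate with a Gaussian kernel (illustrative choice)."""
    u = (x_eval[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u**2)
    return (w @ y) / w.sum(axis=1)

def linear_fit(x, y):
    """OLS fit of y on (1, x); returns the fitted values under the linear null."""
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ beta

def hm_statistic(x, y, h):
    """Scaled mean squared distance between the kernel fit of y and the
    kernel-smoothed parametric fit, evaluated at the sample points."""
    fitted = linear_fit(x, y)
    diff = nw_smooth(x, x, y, h) - nw_smooth(x, x, fitted, h)
    return len(x) * np.sqrt(h) * np.mean(diff**2), fitted

def wild_bootstrap_pvalue(x, y, h, n_boot=200, seed=0):
    """Bootstrap p-value: share of resampled statistics at least as large
    as the observed one."""
    rng = np.random.default_rng(seed)
    t_obs, fitted = hm_statistic(x, y, h)
    resid = y - fitted
    # Two-point weights with mean 0 and variance 1 (golden-section rule).
    a, b = (1 - np.sqrt(5)) / 2, (1 + np.sqrt(5)) / 2
    p_a = (np.sqrt(5) + 1) / (2 * np.sqrt(5))
    t_boot = np.empty(n_boot)
    for r in range(n_boot):
        v = np.where(rng.random(len(x)) < p_a, a, b)
        y_star = fitted + resid * v   # data generated under the null
        t_boot[r], _ = hm_statistic(x, y_star, h)
    return t_obs, np.mean(t_boot >= t_obs)
```

Smoothing the parametric fit with the same kernel before comparing it with the nonparametric fit keeps the smoothing bias comparable on both sides of the difference. An adaptive version would repeat the calculation over a grid of bandwidths, which this sketch does not attempt.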

A problem that we may well have to consider is the

endogeneity of regressors. In the context of Engel curves, total expenditure

may well be jointly determined with expenditure on different goods. The approach

used to solve this problem is instrumental variable estimation. We highlight two

recently developed procedures in the context of nonparametric regression that tackle

the problem of endogenous regressors: the so-called nonparametric two-step least

squares (NP2SLS) due to Newey and Powell (2003), and the nonparametric two-step

procedure with generated regressors and constructed variables (NP2SCV) due to Sperlich

(2005). The approach of Newey and Powell (2003) is a cumbersome procedure involving

the choice of basis expansion in the first step. Sperlich's approach, by contrast, only

requires a non-, semi- or even fully parametric construction of the regressors of interest in the

first step. Our feeling is that a generated variables approach in combination with

additive PLM can help us to overcome to some extent any possible endogeneity

problem and that is exactly the procedure implemented in this paper.

The contribution of this work can be summarized as follows. First, we are the

first (to our knowledge) to carry out an exploratory analysis of consumer behavior

with data drawn from the Family Expenditure Survey for Spain using semiparamet-

ric models. Second, we apply recently developed methods to estimate, test (vari-

ous model specifications) and correct for possible endogeneity of total expenditure.

Third, our estimates of the additive model are accompanied by a reasonable measure

of discrepancy between the fully nonparametric model and the additive

estimate. An adequate model check is necessary whenever additive

models are estimated (Dette, von Lieres and Sperlich (2004)). Additionally, our

measure of discrepancy adapts to the unknown smoothness of the nonparametric

model and this constitutes a novelty in empirical economics.

The rest of the paper is organized as follows. In Section 2 we provide some back-

ground to understand both the estimating and the testing procedures. In Section

3, we discuss the shape of Engel curves and report empirical results obtained from

the application of additive PLM. We also provide the results of testing the additive

specification as well as the linearity of each nonparametric component in additive

PLM regression. Section 4 concludes.

2.2 Additive Partially Linear Model and Testing Hypothesis

There are many fields of empirical economics in which explanatory variables and

their second power are included in regression analysis to capture nonlinear effects;


In order to estimate the functions $m_\alpha(x_\alpha)$ we first estimate the function $m(x)$ with

a multidimensional local smoother and then integrate out the variables different

from $X_\alpha$. This method can be applied to estimate all the components, and finally

the regression function $m(\cdot)$ is estimated by summing them together with an estimator $\hat\psi$ of $\psi$, so we

get that:

$$\hat m_S(X_j) = \hat\psi + \sum_{\alpha=1}^{d} \frac{1}{n} \sum_{i=1}^{n} K_h\left(X_{j\alpha} - X_{i\alpha}\right) \frac{\hat f_{\underline{\alpha}}(X_{i\underline{\alpha}})}{\hat f(X_{i\alpha}, X_{i\underline{\alpha}})} Y_i \qquad [4]$$

for $j = 1, \dots, n$. The expression for the estimate of each component $m_\alpha(\cdot)$ appearing

in [4] is called the internalized marginal integration estimator (IMIE) because of

the joint density that appears under the summation sign. For a detailed explanation

see Dette, von Lieres and Sperlich (2004) and references therein. Note that IMIE

does not provide exactly the orthogonal projection onto the space of additive func-

tions. In other words, the sum of the estimated nonparametric components does

not necessarily recover the complete conditional mean because
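A minimal Python sketch of an internalized marginal integration estimate for a single component is given here, with the joint density estimate placed inside the sum as described above. It assumes Gaussian product kernels, one common bandwidth `h` for all covariates, and simple kernel density estimates evaluated at the sample points. The function names are hypothetical, and the component is recovered only up to an additive constant.

```python
import numpy as np

def gauss_kernel(u, h):
    """Scaled Gaussian kernel K_h(u) = K(u / h) / h."""
    return np.exp(-0.5 * (u / h) ** 2) / (h * np.sqrt(2 * np.pi))

def imie_component(x_grid, X, y, alpha, h):
    """Internalized marginal integration estimate of component `alpha`
    on `x_grid`, given covariates X of shape (n, d) and responses y."""
    n, d = X.shape
    # One-dimensional kernel weights between all sample pairs, per covariate.
    k = np.stack([gauss_kernel(X[:, None, a] - X[None, :, a], h)
                  for a in range(d)])                 # shape (d, n, n)
    others = [a for a in range(d) if a != alpha]
    # Joint density estimate at each X_i (product kernel over all covariates).
    joint = k.prod(axis=0).mean(axis=1)
    # Density estimate of the nuisance covariates at each X_i.
    marginal = k[others].prod(axis=0).mean(axis=1)
    # Density ratio times response: the "internalized" weights.
    weights = marginal / joint * y
    k_grid = gauss_kernel(x_grid[:, None] - X[None, :, alpha], h)
    return (k_grid @ weights) / n
```

Because the kernel weights are not renormalized at each evaluation point, estimates near the boundary of the support are biased, so in practice the grid would be kept in the interior of the data or a boundary correction applied.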
