

Estimating Tobit models for panel data with

autocorrelated errors

Giorgio Calzolari∗ Laura Magazzini†

Paper submitted for presentation at ESEM 2008 (Milano, 27-31 August 2008).

PRELIMINARY AND INCOMPLETE‡.

Abstract

The performance of simulation-based estimators for panel data Tobit models (censored regression) with random effects and autocorrelated AR(1) errors is evaluated with Monte Carlo experiments. Examples show that poor identifiability of parameters can arise in this context when the autocorrelation parameter is moderately high. An application is provided for a model analysing the patent-R&D relationship.

1 Introduction

This paper evaluates the performance of simulation-based estimation techniques in the context of a Tobit model for panel data with autocorrelated errors and random effects. The availability of panel data (i.e. repeated observations over time on the same units) offers a number of advantages for estimation over single cross-section or time-series data. First, panel data allow controlling for time-invariant or unit-invariant characteristics, whose omission can result in biased estimates in a cross-section or time-series setting. Second, the availability of repeated observations on the same unit allows answering questions about the dynamic behaviour of economic variables that could not be handled in a time-series or cross-section context.
Despite these advantages, the application of limited dependent variable models to panel data has been hampered by the intractability of the likelihood function, which contains integrals with no closed-form solution unless restrictive (and unrealistic) hypotheses are imposed on the structure of the model.

∗Department of Statistics, Università di Firenze, <[email protected]>
†Department of Economics, Università di Verona, <[email protected]>
‡We gratefully acknowledge suggestions from C. Rampichini, E. Sentana, G. Fiorentini, D.M. Drukker, A. Gottard, D. Lubian, M.J. Lombardi, but retain full responsibility for the contents of this paper.


Simulation-based estimation procedures, developed in recent years, offer a simple solution to the problem, making it feasible to estimate models with intractable objective functions. These methods were initially applied to the estimation of models with multi-dimensional integrals in the likelihood equations or moment conditions (as in our field of investigation), due, for example, to the transformation of a latent-variable model into a model that describes the observed data, to the presence of missing data, or to models where random coefficients or heterogeneity factors are considered.
We will consider the method of indirect estimation (Gourieroux, Monfort and Renault 1993, Smith Jr 1993, Gallant and Tauchen 1996) and simulated maximum likelihood (Lerman and Manski 1981, Pakes 1986).
The simulated maximum likelihood method replaces the (intractable) likelihood function with an approximation obtained via simulation. As a result, the objective function is computationally tractable and can be used to obtain the parameter estimates.
Indirect estimation, on the contrary, makes use of an auxiliary model that can be easily estimated, and calibrates the structural parameters so that the observed endogenous variables and the values obtained by simulating the structural model have similar characteristics.
In previous work (Calzolari, Magazzini and Mealli 2001), the performance of simulation methods was assessed in the context of the Tobit model for panel data with uncorrelated disturbances. Autocorrelation of the disturbances introduces additional difficulties, which are treated in this paper.
As an alternative, a fixed effects approach might be considered. Until recent years, application of fixed effects limited dependent variable models to panel data has been limited by the well-known “incidental parameters problem” (Neyman and Scott 1948). Inconsistency in the estimates of the fixed effects (which, in the limited dependent variable case, cannot be removed by transforming the data) leads to inconsistent estimates of the variable coefficients. Alternative estimators for reducing the bias have been proposed (see Arellano and Hahn (2005) for a review). In the case of the censored regression model, Greene (2004) shows that, differently from probit and logit models, estimation of the location coefficients is consistent, whereas inconsistency emerges in the estimation of the variance components, with implications for the estimation of standard errors and marginal effects. As an advantage over the random effects specification, estimation in fixed effects models does not rely on the assumption of independence between the individual effects and the variables included in the regression. Nonetheless, this dependence can be modeled, and the random effects specification can be adjusted to take into account the relationship between the individual effects and the exogenous variables (Chamberlain 1980).
The paper proceeds as follows. In the next section we describe the econometric model. Section 3 discusses the application of simulation methods to this setting, whereas Section 4 provides the results of a preliminary Monte Carlo experiment. Section 5 presents an application of the method to the study of the patent-R&D relationship. Section 6 summarizes our findings.


2 The random effect Tobit model (panel data)

The first application of the censored regression model dates back to Tobin (1958), who analyzed the level of expenditure on durable goods. Other applications have been developed in the field of labor participation, for the analysis of the number of hours at work, where the dependent variable is set to zero when the person is not employed. All in all, the Tobit model, or censored regression model, can be used when a large number of observations on the dependent variable assume the value zero (or some other limiting value).
The data generating process can be thought of in terms of a latent variable crossing a threshold. The variable of interest is observed only if it lies within a certain range; otherwise the observation is censored and the limiting value is reported. Without loss of generality we will set the limiting value to zero.
The latent variable, y∗it, is expressed as a linear function of a set of independent variables, Xit, and an error term, νit:

y∗it = X ′itβ + νit (1)

where i denotes the unit (household, firm, country, ...), and the index t denotes time, with i = 1, ..., N; t = 1, ..., T. Observation of the dependent variable is driven by the following rule: yit = max{0, y∗it}, i.e. the dependent variable is observed only if non-negative, otherwise a zero is recorded. As a result, the likelihood function for the whole observed vector y is a mixture of discrete and continuous distributions.
When analysing panel data, the error structure can be decomposed into three independent terms:

νit = αi + λt + eit (2)

where αi is the individual effect, representing all the time-invariant (unobserved or unobservable) characteristics of unit i, λt is the time effect, representing all the characteristics of time t, invariant across the cross-sectional units in the sample, and eit is a random term that varies over time and individuals.
In standard settings, the error term eit is assumed to be serially uncorrelated. This assumption is not suited to situations where the effects of unobserved variables vary systematically over time, as in the case of serially correlated omitted variables or transitory variables whose effects last more than one period. Recent research on linear models with random effects considers serial correlation in the time dimension (Karlsson and Skoglund 2004). Accordingly, we consider disturbances that are correlated over time, produced by an AR(1) process:

eit = ρei,t−1 + wit (3)

with wit homoskedastic, uncorrelated, and with mean zero. The error term λt is not considered in the analysis, since it can easily be accounted for in a typical (short-T) panel setting by inserting time dummies in the regression.
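For concreteness, the composite error in (2)-(3) can be simulated directly; the following numpy sketch (our own illustration, with hypothetical names, not code from the paper) initializes each e_i0 from its stationary distribution so that Var(e_it) = σ²e at every t:

```python
import numpy as np

def simulate_errors(N, T, sigma2_a, sigma2_e, rho, rng):
    """Draw nu_it = alpha_i + e_it, with e_it an AR(1) process as in (3).

    The innovation s.d. is scaled so that Var(e_it) = sigma2_e for every t
    (stationary initialization of the process).
    """
    alpha = rng.normal(0.0, np.sqrt(sigma2_a), size=(N, 1))  # individual effects
    e = np.empty((N, T))
    e[:, 0] = rng.normal(0.0, np.sqrt(sigma2_e), size=N)     # stationary start
    sigma_w = np.sqrt(sigma2_e * (1.0 - rho ** 2))           # innovation s.d.
    for t in range(1, T):
        e[:, t] = rho * e[:, t - 1] + rng.normal(0.0, sigma_w, size=N)
    return alpha + e

rng = np.random.default_rng(0)
nu = simulate_errors(N=50_000, T=4, sigma2_a=1.0, sigma2_e=1.0, rho=0.7, rng=rng)
```

Sample moments of the simulated νit then reproduce the covariance structure given below in (4): a variance of σ²α + σ²e and a lag-1 covariance of σ²α + ρσ²e.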


As a result of these assumptions, the variance-covariance matrix of the error term νit has the following structure:

E[νit νjs] =   σ²α + σ²e            if i = j, t = s
               σ²α + ρ^|t−s| σ²e    if i = j, t ≠ s
               0                    if i ≠ j          (4)
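Structure (4) is easy to materialize and inspect numerically; a small sketch (our own helper, assuming the usual |ρ| < 1) builds the within-unit covariance block:

```python
import numpy as np

def tobit_cov(T, sigma2_a, sigma2_e, rho):
    """Within-unit covariance implied by (4): sigma2_a + sigma2_e * rho^|t-s|."""
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))  # |t - s|
    return sigma2_a + sigma2_e * rho ** lags

Sigma = tobit_cov(T=3, sigma2_a=1.0, sigma2_e=1.0, rho=0.5)
```

Since the i ≠ j covariances in (4) are zero, the covariance matrix of the full error vector is block-diagonal, with one such (T × T) block per unit.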

Autocorrelated disturbances in panel data linear models with random effects were first considered by Lillard and Willis (1978). The authors estimate the model parameters by first applying OLS to the data pooled over individuals and years; the variance components and the autocorrelation parameter are then estimated by applying maximum likelihood to the OLS residuals. If data are not censored, estimation can be easily handled in both the random and fixed effects approaches1.
Difficulties arise when observations on the dependent variable are censored. Even if a single time series is considered, maximum likelihood estimation of autocorrelated models requires the evaluation of multiple integrals (Zeger and Brookmeyer 1986). If the time dimension is sufficiently small, the integral may not be difficult to compute, and in special cases the likelihood can be decomposed into a product of unidimensional integrals. Consistent estimates may also be obtained by ignoring serial correlation and treating the observations as independent (Robinson 1982). However, alternative estimation procedures have been devised which are shown to perform better than the estimator obtained by ignoring serial correlation (Dagenais 1989).
In the case of panel data, the issue is further complicated by the introduction of individual effects, which capture the effect of unit-specific unobservables or unobserved ‘heterogeneity’, thereby reducing the omitted variable bias that might arise in time-series or cross-section analysis.
Empirical studies have already analyzed the Tobit model with random effects and autocorrelated disturbances, where the intractability of the likelihood function has been overcome by the application of simulated maximum likelihood. However, to our knowledge, a Monte Carlo experiment evaluating the performance of the estimator is still lacking.
Hajivassiliou (1994) applies simulation techniques to estimate the incidence and extent of external financing crises of developing countries, allowing a flexible correlation structure in the unobservables. Both multiperiod Tobit and probit models are considered, and the author assumes a one-factor plus AR(1) structure, whose coefficients are estimated via smoothly simulated maximum likelihood (based on a smooth recursive conditioning simulator) and via the method of simulated scores (based both on a smooth recursive conditioning simulator and on a Gibbs sampling simulator).
More recently, Schmit, Gould, Dong, Kaiser and Chung (2003) consider panel data on household cheese purchases to examine the impact of U.S. generic cheese

1See Bhargava, Franzini and Narendranathan (1982) for a discussion of autocorrelated models within the linear fixed effects framework.


advertising on at-home consumption. The model accounts for the panel and censored nature of the data, as well as for an autoregressive error structure. The problem of high-order integrals appearing in the likelihood function is solved using techniques for simulating the probabilities and partitioning the data, extending the procedure proposed by Zeger and Brookmeyer (1986) for the analysis of censored autocorrelated data. The authors have also applied the methodology in a study of the purchase process for frequently purchased commodities (Dong, Schmit, Kaiser and Chung 2003).

3 Simulation-based estimation

Simulation-based estimation allows us to overcome the intractability of the likelihood function. Two approaches are considered and compared: (constrained) indirect estimation and simulated maximum likelihood, employing the Geweke-Hajivassiliou-Keane (henceforth GHK) simulator (see e.g. Hajivassiliou and McFadden, 1998 for details).
Early research on the one-way Tobit model with no autocorrelation showed that simulation-based estimation performs well enough for estimation (Calzolari et al. 2001).

3.1 Indirect Estimation

Indirect estimation methods2 represent an inferential approach suitable for situations where estimation of the statistical model of interest is too difficult to be performed directly, while it is straightforward to produce simulated values from the same model. The approach was first motivated by econometric models with latent variables, but it can be applied in virtually every situation in which direct maximization of the likelihood function turns out to be difficult.
The principle underlying the so-called Efficient Method of Moments (henceforth EMM; Gallant and Tauchen 1996) is as follows. Suppose we have a sample of observations y and a model whose likelihood function L(y; θ) is difficult to handle and maximize.3 The maximum likelihood estimate of θ ∈ Θ, given by

θ̂ = arg max_{θ∈Θ} ln L(θ; y),

is thus unavailable. Let us now take an alternative model, depending on a parameter vector β ∈ B, which will be called the auxiliary model, easier to handle, and suppose we decide to use it in place of the original one. Since this model is misspecified, the quasi-ML (or pseudo-ML) estimator

β̂ = arg max_{β∈B} ln L(β; y),

is not necessarily consistent: the idea is to exploit simulations performed under the original model to correct for the inconsistency.

2See, for a general treatment, the fourth chapter of Gourieroux and Monfort (1996).
3We remark that the model could also depend on a matrix of explanatory variables X.


One now simulates a set of S vectors from the original model on the basis of an arbitrary parameter vector θ, and denotes each one of those vectors as ys(θ). Of course, using observed data y, the score function of the auxiliary model,

∂ ln L(β; y)/∂β,        (5)

is zero when evaluated at the quasi-maximum likelihood estimate β̂. However, using simulated ys(θ), the score at the same parameter value β̂ is usually not zero. The idea is to make this score “as close as possible to zero”, minimizing

[ Σ_{s=1}^{S} ∂ ln L[β; ys(θ)]/∂β |β=β̂ ]′ Ψ [ Σ_{s=1}^{S} ∂ ln L[β; ys(θ)]/∂β |β=β̂ ],        (6)

where Ψ is a symmetric nonnegative definite matrix defining the metric.4 This approach is especially useful when a closed-form expression for the score of the auxiliary model is available. In this case, the procedure is computationally faster than other indirect estimation procedures, such as “indirect inference” (Gourieroux et al. 1993), which would require iterated numerical re-estimation of the auxiliary model. In our specific case the score is available in closed form.
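The score-matching logic of (5)-(6) can be illustrated on a deliberately tiny toy model (entirely our own construction, not the paper's panel model): a scalar censored mean y = max(0, θ + ε), with an auxiliary Gaussian mean fitted to the uncensored values only. The quasi-ML estimate is biased; the EMM step recalibrates θ so that simulated data, pushed through the same auxiliary score, reproduce it:

```python
import numpy as np

rng = np.random.default_rng(1)
theta_true = 1.0
y = np.maximum(0.0, theta_true + rng.normal(size=5_000))  # censored observations

# Auxiliary model: N(beta, 1) fitted to uncensored values only -> biased quasi-ML
beta_hat = y[y > 0].mean()

# EMM step: with common random numbers, find theta whose simulated data zero the
# auxiliary score sum_j (y_j - beta_hat) over uncensored simulated values.
eps = rng.normal(size=(10, 5_000))                        # S = 10 simulated paths

def aux_score(theta):
    ys = np.maximum(0.0, theta + eps)
    return ys[ys > 0].mean() - beta_hat                   # (rescaled) score at beta_hat

lo, hi = -2.0, 4.0                                        # score is monotone in theta
for _ in range(60):                                       # solve aux_score = 0 by bisection
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if aux_score(mid) < 0.0 else (lo, mid)
theta_emm = 0.5 * (lo + hi)
```

With a scalar score, the metric Ψ in (6) is irrelevant and zeroing the score is equivalent to minimizing the quadratic form; beta_hat overstates θ (about 1.29 in expectation for θ = 1), while theta_emm recovers the neighbourhood of the true value.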

3.1.1 Constrained indirect estimation

The estimation of the auxiliary model needs to incorporate inequality constraints, both to rule out negative variances and an autocorrelation parameter greater than 1, and to avoid poorly identified regions of the parameter space of the auxiliary model. Indirect estimation in the presence of constraints is examined in Calzolari, Fiorentini and Sentana (2004): rather than the quasi-likelihood of the auxiliary model, one has to consider the Lagrangian function

Λ(δ; y) = ln L(β; y) + r′(β)λ, (7)

where r is the functional vector containing the restrictions, λ are the multipliers, and δ = (β′, λ′)′. Assuming that both the log-likelihood function and the vector of constraints are twice continuously differentiable with respect to β, the latter with a Jacobian matrix ∂r′(β)/∂β whose rank coincides with the number of effective constraints, the first-order conditions that take the constraints into account are given by:

∂Λ(δ; y)/∂β = ∂ ln L(β; y)/∂β + (∂r′(β)/∂β) λ = 0.        (8)

Under the constraints, the quadratic form (6) to be minimized thus becomes

{∂Λ[δ; ys(θ)]/∂β}′ Ψ {∂Λ[δ; ys(θ)]/∂β},        (9)

4Details on how to obtain an optimal weighting matrix can be found in Gallant and Tauchen (1996).


which is no more complex than (6) if we consider that the second term of (8),

(∂r′(β)/∂β) λ,

does not depend on simulated data and is therefore equal to the “score of the quasi-likelihood” computed at β̂ with observed data. It is thus equal to zero if the restrictions are not binding, and different from zero in case they are.
To wrap up, in our specific case we will conduct constrained EMM with non-negativity constraints for σ²α and σ²e and an upper bound (of course < 1) for |ρ|. It is important to remark (Calzolari et al. 2004) that when ρ hits its upper bound, the information on β will be contained in the Kuhn-Tucker multiplier

λ = ∂ ln L(β; y)/∂β;

therefore a by-product of the constrained approach will be to eschew the poor identification problem arising when ρ is close to 1, as will be explained in the next section.
The procedure is implemented in Fortran 77.

3.1.2 The auxiliary model

We use, as auxiliary model, the same model of interest (1), where we treat the censoring process as a random cancellation process, independent of the data. In other words, the missing variables are simply disregarded, as if missingness were ignorable. This implies a misspecification, since missingness is not at all ignorable in a Tobit model: thus, parameters estimated by applying quasi-ML to the auxiliary model will be biased (inconsistent). Correcting this bias is the purpose of the indirect estimation procedure. In addition, there will be an obvious loss of efficiency (not cured by the indirect estimation procedure) due to the complete cancellation of the censored values.
If no censoring occurs, the (T × T) covariance matrix of the i-th individual error terms is

Σ = Cov(νi) = σ²α ιι′ + σ²e Corr(ρ)        (10)

where ι is a (T × 1) vector of ones, and Corr(ρ) is the (T × T) correlation matrix of an AR(1) process with coefficient ρ. Thus, the contribution of the i-th individual data to the log-likelihood is

−(1/2) ln |Σ| − (1/2)(yi − Xiβ)′ Σ⁻¹ (yi − Xiβ).

When some observations are cancelled, the vector, still indicated as yi − Xiβ, is compacted (thus it has fewer than T elements), and the Σ matrix is compacted as well, after the rows and columns corresponding to the cancelled data are dropped: Σi will be the resulting matrix. The contribution to the log-likelihood of the non-cancelled data of the i-th individual is

−(1/2) ln |Σi| − (1/2)(yi − Xiβ)′ Σi⁻¹ (yi − Xiβ).
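A sketch of this compacting step (our own helper, written for clarity rather than speed; the additive constant of the log-likelihood is dropped, as above):

```python
import numpy as np

def aux_loglik_i(y_i, X_i, beta, Sigma):
    """Quasi-log-likelihood contribution of unit i: censored entries are
    deleted and the matching rows/columns of Sigma are dropped, giving the
    compacted matrix Sigma_i."""
    keep = y_i > 0                                # uncensored observations
    r = y_i[keep] - X_i[keep] @ beta              # compacted residual vector
    Sigma_i = Sigma[np.ix_(keep, keep)]           # compacted covariance block
    _, logdet = np.linalg.slogdet(Sigma_i)
    return -0.5 * logdet - 0.5 * r @ np.linalg.solve(Sigma_i, r)

# Example: T = 3, second and third observations censored
Sigma = np.array([[2.0, 1.0, 1.0],
                  [1.0, 2.0, 1.0],
                  [1.0, 1.0, 2.0]])
ll = aux_loglik_i(np.array([1.0, 0.0, 0.0]), np.ones((3, 1)), np.array([0.5]), Sigma)
```

With a single surviving observation the expression collapses to the scalar case, −(1/2) ln σ² − (1/2) r²/σ², which gives a quick check of the helper.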


3.1.3 Identifiability and poor identification

Still in the absence of censoring, if T = 2 the covariance matrix would be

Σ = Cov(νi) = [ σ²α + σ²e     σ²α + σ²e ρ
                σ²α + σ²e ρ   σ²α + σ²e  ]        (11)

thus it contains only two independent elements, from which it is impossible to identify separately the three parameters of the auxiliary model, σ²α, σ²e and ρ.

If T = 3, then the covariance matrix would be

Σ = Cov(νi) = [ σ²α + σ²e      σ²α + σ²e ρ    σ²α + σ²e ρ²
                σ²α + σ²e ρ    σ²α + σ²e      σ²α + σ²e ρ
                σ²α + σ²e ρ²   σ²α + σ²e ρ    σ²α + σ²e  ]        (12)

where there are three independent elements, making identification possible. Of course the situation becomes even better for larger values of T.
In practice, however, identification can be very poor when ρ is moderately high, even for values that would not be considered dangerously close to 1 in a time-series context. Some numerical examples illustrate the problem well.
Let T = 4, σ²α = 35, σ²e = 5 and ρ = 0.9. The covariance matrix is

40.00  39.50  39.05  38.65
39.50  40.00  39.50  39.05
39.05  39.50  40.00  39.50
38.65  39.05  39.50  40.00

and the logarithm of its determinant is 3.664.
Let instead σ²α = 20, σ²e = 20 and ρ = 0.975. In this case the covariance matrix is

40.00  39.50  39.01  38.54
39.50  40.00  39.50  39.01
39.01  39.50  40.00  39.50
38.54  39.01  39.50  40.00

and the logarithm of its determinant is 3.670.
Finally, if σ²α = 5, σ²e = 35 and ρ = 0.9857, the covariance matrix is

40.00  39.50  39.01  38.52
39.50  40.00  39.50  39.01
39.01  39.50  40.00  39.50
38.52  39.01  39.50  40.00

and the logarithm of its determinant is 3.673.
Of course, the matrices and the corresponding determinants are not equal, but they are quite close to each other, suggesting that an estimation procedure would


not reach convergence in a simple or straightforward way. It would be necessary to have larger values of T (so that higher powers of ρ could make the difference), or a very large number of observations, to ensure reliable (and meaningful) estimation results.
Just one more example, to stress the point of poor identification: σ²α = 9, σ²e = 1 and ρ = 0.9082 produce this covariance matrix

10.00   9.91   9.82   9.75
 9.91  10.00   9.91   9.82
 9.82   9.91  10.00   9.91
 9.75   9.82   9.91  10.00

very close to the matrix that would be produced by σ²α = 5, σ²e = 5 and ρ = 0.98205:

10.00   9.91   9.82   9.74
 9.91  10.00   9.91   9.82
 9.82   9.91  10.00   9.91
 9.74   9.82   9.91  10.00

It is quite unlikely that censoring can help identification. For this reason, our first set of Monte Carlo experiments uses moderately low values of ρ (up to 0.7). Experiments with larger values of ρ are in progress.
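These near-observational-equivalences are easy to reproduce; a quick numerical check (ours) using the structure in (4) for the first two parameter triples above:

```python
import numpy as np

def cov_4(s2a, s2e, rho, T=4):
    """Within-unit covariance block: s2a + s2e * rho^|t-s|."""
    lags = np.abs(np.subtract.outer(np.arange(T), np.arange(T)))
    return s2a + s2e * rho ** lags

S1 = cov_4(35.0, 5.0, 0.9)      # sigma2_a = 35, sigma2_e = 5,  rho = 0.9
S2 = cov_4(20.0, 20.0, 0.975)   # sigma2_a = 20, sigma2_e = 20, rho = 0.975
gap = np.abs(S1 - S2).max()     # largest element-wise difference
ld1 = np.linalg.slogdet(S1)[1]  # log-determinants quoted in the text
ld2 = np.linalg.slogdet(S2)[1]
```

Here `gap` is about 0.11 and the log-determinants (3.664 vs 3.670) differ only in the third decimal place, so a sample covariance estimated from moderate-T data can hardly tell the two parameter points apart.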

3.2 Simulated Maximum Likelihood

The likelihood function for the panel data Tobit model with random effects and autocorrelated disturbances can be written as:

L = ∏_{i=1}^{N} Li(yi) = ∏_{i=1}^{N} ∫_{{y∗i | yi = max(0, y∗i)}} φT(y∗i − X′iβ; Σ) dy∗i        (13)

where Li(yi) represents the likelihood function for the i-th unit (i = 1, ..., N), and φT is the T-variate normal density with mean zero and variance-covariance matrix Σ, given in (10). Let us indicate with yi0 the censored observations for unit i, and with yi1 the uncensored (positive) observations for unit i. We can distinguish three cases:

1. All the observations for unit i are positive. In this case no problem arises in computing the contribution to the likelihood of unit i, which is equal to Li = φT(yi − X′iβ; Σ).

2. All the observations for unit i are equal to zero. The contribution to the likelihood of unit i is equal to Li = ΦT(−X′iβ; Σ), where a T-fold integral needs to be evaluated. The GHK simulator5 is used for the evaluation of the normal probabilities.

5The algorithm was the most reliable among those examined by Hajivassiliou, McFadden and Ruud (1996). See e.g. Hajivassiliou and McFadden (1998) for details.


3. Observations for unit i display both positive and zero values. We partition the T observations of unit i into two mutually exclusive sets: one containing the censored observations, indexed by i0, and one containing the uncensored observations, indexed by i1: T = Ti0 + Ti1. As a result, Li can be decomposed as

Li(yi) = Li(yi0, yi1) = φTi1(yi1 − X′i1β; Σ1) × ΦTi0(−X′i0β | yi1; Σ0|1).

The log-likelihood is composed of two terms: the first (φTi1) has a closed-form expression, and the second (ΦTi0) is the multivariate probability that all components of {yi0} are zero (i.e. that the components of {y∗i0} are negative), conditional on the set {yi1}. The GHK simulator is employed for the evaluation of integrals of dimension higher than 1 (i.e. if Ti0 > 1; otherwise ΦTi0 requires the evaluation of a one-dimensional integral, posing no computational problems).
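A compact GHK sketch for such multivariate normal orthant probabilities (our own minimal version, stdlib plus numpy, not the authors' Fortran code): each η_j is drawn from a standard normal truncated so that the j-th inequality stays satisfied, and the product of the conditional probabilities is averaged over draws.

```python
import numpy as np
from statistics import NormalDist

nd = NormalDist()  # standard normal cdf and inverse cdf

def ghk(a, Sigma, S=2000, rng=None):
    """GHK estimate of P(Z_1 < a_1, ..., Z_T < a_T) for Z ~ N(0, Sigma)."""
    rng = rng or np.random.default_rng(0)
    L = np.linalg.cholesky(Sigma)            # lower-triangular factor
    T = len(a)
    total = 0.0
    for _ in range(S):
        eta, w = np.zeros(T), 1.0
        for j in range(T):
            b = (a[j] - L[j, :j] @ eta[:j]) / L[j, j]
            p = nd.cdf(b)                    # P(j-th inequality | eta_1..eta_{j-1})
            w *= p
            u = rng.uniform(0.0, 1.0)
            eta[j] = nd.inv_cdf(max(u * p, 1e-12))  # truncated-normal draw
        total += w
    return total / S

p_ind = ghk([0.5, 1.0], np.array([[1.0, 0.0], [0.0, 4.0]]))        # independent case
p_cor = ghk([0.0, 0.0], np.array([[1.0, 0.5], [0.5, 1.0]]), S=4000)  # correlated case
```

With a diagonal Σ every draw returns the exact product of univariate probabilities (zero simulator variance); with correlation 0.5 the estimate approaches the known orthant probability 1/4 + arcsin(0.5)/(2π) = 1/3.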

The major drawback of this estimation method is its inconsistency when the number of pseudo-random values used to approximate the likelihood function is fixed. This is due to the fact that the likelihood function is approximated, whereas the log-likelihood is maximized. However, the simulated maximum likelihood estimator has the same asymptotic performance as the maximum likelihood estimator if the number of observations (in our case N × T) and the number of simulated pseudo-random values (S) tend to infinity in such a way that √(N × T)/S tends to zero.

4 Monte Carlo results

To study the properties of these methods, we perform a simulation study and apply the methods to pseudo-observed data produced by simulating the data generating process. We considered the following equation:

y∗it = β0 + β1Xit + νit (14)

Observations are censored: y∗it is observed only if it is greater than 0, and 0 is recorded otherwise. The Xit were generated i.i.d. ∼ N(6, 4).
In the set of experiments discussed in this paper, β1 is set equal to 2, and the intercept equals -12, corresponding to approximately 50% of censored observations.
The error term is obtained as:

νit = σα αi + eit        (15)

with eit = ρ ei,t−1 + σw wit, wit i.i.d. ∼ N(0, 1), and σ²w = σ²e (1 − ρ²).
Each estimation considered N = 500 and T = 10, for a total of 5000 observations. As an example, the sample can resemble a panel of firms or a sample of households observed over a 10-year time period (approximately the number of observations in our empirical application; see Section 5). For indirect estimation we used S = 10; thus 50000 simulated observations are used to compute the score of the quasi-likelihood.

                                    β0          β1          σ²α         σ²e
True values                      -12.00       2.000       1.000       1.000
Results of indirect estimation
MC mean                          -12.01       2.001       0.9973      1.003
MC var.                          0.2423e-1    0.3391e-3   0.6968e-2   0.1407e-2
Results of maximum likelihood (Gauss-Hermite quadrature)
MC mean                          -12.00       2.001       0.9969      1.002
MC var.                          0.8926e-2    0.1193e-3   0.2346e-2   0.3847e-3

Indirect estimation: mean and var. of 1000 MC replications.
Max. Lik.: mean and var. of 100 MC replications.
Estimation by Gauss-Hermite quadrature (25 points) is obtained in STATA.

Table 1: Results of Monte Carlo experiments, uncorrelated error terms.

Three true values of ρ are used in the experiments: 0, 0.5, 0.7. Experiments with larger values of ρ are still in progress (they need a careful use of the constraints to reduce the problems of poor identification of the auxiliary model parameters). In each experiment 1000 Monte Carlo replications are performed, and the means and variances of the estimated parameters are displayed in Table 2. The results for the case ρ = 0 in Table 2 allow a comparison with the same estimation method applied to the model without autocorrelation (Table 1).
In the case of simulated likelihood, S is set to 75 (results at the bottom of Table 2). Due to large increases in computational time, the method of simulated likelihood is evaluated on the basis of 100 Monte Carlo replications. Results will be extended before Conference time.
Some considerations can be derived from the results of Tables 1 and 2.
1. When results with both methods are available, maximum likelihood (with Gauss-Hermite quadrature or simulated) has variances between one third and one half of the corresponding variances of the indirect estimation parameters. Roughly speaking, this is more or less what we might expect, since indirect estimation has completely ignored the censored values (about a half of the total).
2. Indirect estimation gets rid of the bias due to misspecification of the auxiliary model. This is obtained with a considerable reduction of computational costs with respect to maximum likelihood.
3. (Work in progress, no results displayed yet.) Use of the constrained indirect procedure may help in producing reasonable results when a large value of the autocorrelation parameter causes poor identification of the auxiliary model.
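The Monte Carlo design of (14)-(15) is straightforward to replicate; a sketch (ours) of the data generating process for one sample, with σ²α = σ²e = 1 and ρ = 0.7, confirms the roughly 50% censoring rate, since E[β0 + β1 Xit] = 2·6 − 12 = 0:

```python
import numpy as np

rng = np.random.default_rng(42)
N, T = 500, 10
beta0, beta1 = -12.0, 2.0
s2a = s2e = 1.0
rho = 0.7

X = rng.normal(6.0, 2.0, size=(N, T))            # X_it ~ N(6, 4): s.d. = 2
alpha = rng.normal(0.0, np.sqrt(s2a), size=(N, 1))
e = np.empty((N, T))
e[:, 0] = rng.normal(0.0, np.sqrt(s2e), size=N)  # stationary start of the AR(1)
sw = np.sqrt(s2e * (1.0 - rho ** 2))             # innovation s.d. from (15)
for t in range(1, T):
    e[:, t] = rho * e[:, t - 1] + rng.normal(0.0, sw, size=N)

y_star = beta0 + beta1 * X + alpha + e           # latent variable of (14)
y = np.maximum(0.0, y_star)                      # observed, censored at zero
censored_share = (y == 0.0).mean()
```

One such draw gives an N × T array with about half the entries censored, which is the pseudo-observed dataset to which both estimators are then applied.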


                     β0         β1         σ²α        σ²e        ρ
Results of indirect estimation
True values       -12.00     2.000      1.000      1.000      0.0000
MC mean (a.m.)    -10.49     1.829      0.8849     0.9186    -0.1834e-2
MC mean (m.i.)a   -11.99     1.999      1.002      1.002      0.1451e-2
MC var. (m.i.)    0.2309e-1  0.3044e-3  0.9585e-2  0.1751e-2  0.1935e-2
True values       -12.00     2.000      1.000      1.000      0.5000
MC mean (a.m.)    -10.74     1.858      0.9328     0.9131     0.4669
MC mean (m.i.)    -11.99     2.000      1.002      1.003      0.5000
MC var. (m.i.)    0.1768e-1  0.2355e-3  0.1304e-1  0.3705e-2  0.1257e-2
True values       -12.00     2.000      1.000      1.000      0.7000
MC mean (a.m.)    -10.98     1.886      0.9948     0.9005     0.6651
MC mean (m.i.)    -11.99     2.002      1.000      1.011      0.6992
MC var. (m.i.)    0.1519e-1  0.1915e-3  0.2064e-1  0.9426e-2  0.1091e-2
True values       -12.00     2.000      5.000      5.000      0.0000
MC mean (a.m.)    -7.347     1.518      3.070      3.873     -0.4718e-3
MC mean (m.i.)b   -12.01     2.000      5.044      5.014      0.1413e-2
MC var. (m.i.)    0.1386     0.1450e-2  0.4040     0.6150e-1  0.2615e-2
True values       -12.00     2.000      5.000      5.000      0.5000
MC mean (a.m.)    -7.936     1.590      3.394      3.814      0.4054
MC mean (m.i.)    -12.01     2.000      5.055      5.020      0.4996
MC var. (m.i.)    0.1235     0.1119e-2  0.5727     0.1216     0.1557e-2
True values       -12.00     2.000      5.000      5.000      0.7000
MC mean (a.m.)    -8.529     1.664      3.827      3.679      0.5992
MC mean (m.i.)    -12.02     2.000      5.052      5.055      0.6997
MC var. (m.i.)    0.1309     0.8381e-3  0.9111     0.3026     0.1329e-2
Results of simulated maximum likelihood estimation
True values       -12.00     2.000      1.000      1.000      0.7000
MC mean           -12.01     2.000      1.002      1.008      0.7006
MC var.           0.9927e-2  0.1182e-3  0.1005e-1  0.4916e-2  0.5420e-3

m.i.: model of interest; a.m.: auxiliary model.
a The algorithm did not converge in 8 cases.
b The algorithm did not converge in 4 cases.
Indirect estimation: mean and var. of 1000 MC replications.
Simulated Max. Lik.: mean and var. of 100 MC replications.

Table 2: Results of Monte Carlo experiments, autocorrelated error terms.


5 An application to the patent-R&D relation-ship

Innovation and technological change are widely recognized as the main drivers of long-term economic growth. Despite that, the empirical account of the dynamic relationship between the inputs and outputs of technological activities is hindered by the difficulty of devising indicators that can proxy, in a consistent and systematic way, the inputs and outputs of technological activities. Against this background, the literature on the sources of technological growth has relied on the level of R&D expenditure as a proxy for R&D input, and there is increasing acknowledgment of the idea that patents can be fruitfully employed as a proxy for R&D output (Griliches 1990, Jaffe and Trajtenberg 2002, Cincera 1997).

Most available empirical studies rely on count data models to investigate the relationship between patents and (log) R&D (Hall, Griliches and Hausman 1986, Hausman, Hall and Griliches 1984, Cincera 1997). Even though patent data can only take integer values, the variable lies on a large support,^6 allowing also the estimation of a linear model. Nonetheless, a large proportion of observations report zero patents; therefore the censored regression model is more appropriate.

We apply the proposed methodology to the data employed by Hall et al. (1986), which cover the patenting and R&D activity of a sample of 346 US manufacturing firms over the period 1970-1979.^7 Additional information is available for a larger set of firms but on a more limited time frame (642 US manufacturing firms observed over the period 1972-1979). The model considers the number of patents as a function of log R&D expenditure. Since the analysis of Hall et al. (1986) reveals that R&D and patents appear to be dominated by a contemporaneous relationship, with little effect of leads and lags, we focus only on the contemporaneous relationship, while allowing for autocorrelation of the error terms.

Estimates obtained by indirect estimation and simulated maximum likelihood are reported in Table 3. Computation of standard errors is in progress, and presumably it will help explain the difference in the estimated coefficients, particularly the coefficient of LogR, which is the variable of primary interest. However, the estimated ρ shows that substantial correlation (about 0.7) exists across the error terms in our data.
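The censored-regression likelihood underlying these estimates can be written down compactly. The sketch below (an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the panel's random effect and AR(1) autocorrelation are deliberately ignored) evaluates the pooled Tobit log-likelihood of Tobin (1958), the kind of simple misspecified model that can serve as an auxiliary model in indirect estimation.

```python
import numpy as np

def _phi_cdf(z):
    # Standard normal cdf via the error function (avoids a SciPy dependency).
    from math import erf
    return 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))

def tobit_loglik(params, y, X):
    """Pooled Tobit log-likelihood (Tobin 1958) for y censored at zero.

    params = (beta_1, ..., beta_k, log_sigma); panel structure is ignored,
    so this is at best an auxiliary model, not the model of interest."""
    *beta, log_sigma = params
    sigma = np.exp(log_sigma)                 # parametrization keeps sigma > 0
    xb = X @ np.asarray(beta)
    z = (y - xb) / sigma
    censored = y <= 0.0
    # Density contribution of uncensored observations ...
    ll_unc = -0.5 * np.log(2.0 * np.pi) - log_sigma - 0.5 * z**2
    # ... and probability mass P(y* <= 0) for censored ones.
    ll_cen = np.log(np.clip(_phi_cdf(-xb / sigma), 1e-300, None))
    return np.sum(np.where(censored, ll_cen, ll_unc))
```

Maximizing this function over simulated and observed data, and matching the resulting auxiliary estimates, is the basic logic of the indirect estimation column of Table 3.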

6 Summary

This paper has shown the performance of simulated estimators in the context of the Tobit model with random effects and autocorrelated disturbances.

^6 In Cincera (1997) the number of patents applied for by a firm ranges from 0 to 925, whereas in Hall et al. (1986) the maximum number of patents is 831.

^7 In a future version of the paper we will also take into consideration the data analysed by Cincera (1997).


                                Sample 1           Sample 2
Variable                        N = 346, T = 10    N = 642, T = 8
Indirect estimation
Constant                        -159.1             -181.2
Scientific sector               15.05              16.70
LogK                            14.24              12.66
LogR                            39.54              41.86
σ²_α                            4418.              5348.
σ²_e                            162.6              332.3
ρ                               0.8190             0.7629
Simulated maximum likelihood
Constant                        -47.75             -43.24
Scientific sector               17.91              14.41
LogK                            17.20              16.47
LogR                            5.63               3.17
σ²_α                            4443               4443
σ²_e                            589.4              300.6
ρ                               0.7610             0.6430
% censored                      17.49              22.88

Table 3: Estimation results

Monte Carlo experiments highlight the good performance of the methods in this context. However, a problem of poor identifiability arises for values of the autocorrelation parameter that are moderately high, and constrained indirect estimation is proposed as a tool for solving it. An example of application is presented for the analysis of the patent-R&D relationship, where high autocorrelation (about 0.7) is estimated.

References

Arellano, M. and Hahn, J.: 2005, Understanding Bias in Nonlinear Panel Models: Some Recent Developments, Invited Lecture at the Econometric Society World Congress, London.

Bhargava, A., Franzini, L. and Narendranathan, W.: 1982, Serial Correlation and the Fixed Effects Model, The Review of Economic Studies 49(4), 533–549.

Calzolari, G., Fiorentini, G. and Sentana, E.: 2004, Constrained Indirect Estimation, Review of Economic Studies 71(4), 945–973.

Calzolari, G., Magazzini, L. and Mealli, F.: 2001, Simulation-Based Estimation of Tobit Model with Random Effects, in R. Friedmann, L. Knuppel and H. Lutkepohl (eds), Econometric Studies - A Festschrift in Honour of Joachim Frohn, LIT Verlag, Berlin-Hamburg-Munster, pp. 349–369.

Chamberlain, G.: 1980, Analysis of Covariance with Qualitative Data, The Review of Economic Studies 47(1), 225–238.

Cincera, M.: 1997, Patents, R&D, and Technological Spillovers at the Firm Level: Some Evidence from Econometric Count Models for Panel Data, Journal of Applied Econometrics 12(3), 265–280.

Dagenais, M.: 1989, Small sample performance of parameter estimators for Tobit models with serial correlation, Journal of Statistical Computation and Simulation 33(1), 11–26.

Dong, D., Schmit, T., Kaiser, H. and Chung, C.: 2003, Modeling the Household Purchasing Process Using a Panel Data Tobit Model, Cornell University, Dept. of Applied Economics and Management Research Bulletin RB 2003-07.

Gallant, A. and Tauchen, G.: 1996, Which Moments to Match?, Econometric Theory 12(4), 657–681.

Gourieroux, C. and Monfort, A.: 1996, Simulation-Based Econometric Methods, Oxford University Press, USA.

Gourieroux, C., Monfort, A. and Renault, E.: 1993, Indirect Inference, Journal of Applied Econometrics 8(S1), S85–S118.

Greene, W.: 2004, Fixed Effects and Bias Due to the Incidental Parameters Problem in the Tobit Model, Econometric Reviews 23(2), 125–147.

Griliches, Z.: 1990, Patent Statistics as Economic Indicators: A Survey, Journal of Economic Literature 28(4), 1661–1707.

Hajivassiliou, V.: 1994, A Simulation Estimation Analysis of the External Debt Crises of Developing Countries, Journal of Applied Econometrics 9(2), 109–131.

Hajivassiliou, V. and McFadden, D.: 1998, The Method of Simulated Scores for the Estimation of LDV Models, Econometrica 66(4), 863–896.

Hajivassiliou, V., McFadden, D. and Ruud, P.: 1996, Simulation of multivariate normal rectangle probabilities and their derivatives: Theoretical and computational results, Journal of Econometrics 72(1-2), 85–134.

Hall, B., Griliches, Z. and Hausman, J.: 1986, Patents and R and D: Is There a Lag?, International Economic Review 27(2), 265–283.

Hausman, J., Hall, B. and Griliches, Z.: 1984, Econometric Models for Count Data with an Application to the Patents-R&D Relationship, Econometrica 52(4), 909–938.


Jaffe, A. and Trajtenberg, M.: 2002, Patents, Citations, and Innovations: A Window on the Knowledge Economy, MIT Press.

Karlsson, S. and Skoglund, J.: 2004, Maximum-likelihood based inference in the two-way random effects model with serially correlated time effects, Empirical Economics 29(1), 79–88.

Lerman, S. and Manski, C.: 1981, On the Use of Simulated Frequencies to Approximate Choice Probabilities, in C. Manski and D. McFadden (eds), Structural Analysis of Discrete Data with Econometric Applications, Vol. 10, MIT Press, pp. 305–319.

Lillard, L. and Willis, R.: 1978, Dynamic Aspects of Earning Mobility, Econometrica 46(5), 985–1012.

Neyman, J. and Scott, E.: 1948, Consistent Estimates Based on Partially Consistent Observations, Econometrica 16(1), 1–32.

Pakes, A.: 1986, Patents as Options: Some Estimates of the Value of Holding European Patent Stocks, Econometrica 54(4), 755–784.

Robinson, P.: 1982, On the Asymptotic Properties of Estimators of Models Containing Limited Dependent Variables, Econometrica 50(1), 27–41.

Schmit, T., Gould, B., Dong, D., Kaiser, H. and Chung, C.: 2003, The Impact of Generic Advertising on US Household Cheese Purchases: A Censored Autocorrelated Regression Approach, Canadian Journal of Agricultural Economics 51(1), 15–37.

Smith Jr, A.: 1993, Estimating Nonlinear Time-Series Models Using Simulated Vector Autoregressions, Journal of Applied Econometrics 8, 63–84.

Tobin, J.: 1958, Estimation of Relationships for Limited Dependent Variables, Econometrica 26(1), 24–36.

Zeger, S. and Brookmeyer, R.: 1986, Regression Analysis with Censored Autocorrelated Data, Journal of the American Statistical Association 81(395), 722–729.
