
ERSA Training Workshop

Lecture 4: Estimation of Production Functions with Micro Data

Måns Söderbom

Thursday 15 January 2009


1 Introduction

Earlier in this course you will have seen how panel data methods can be used to estimate the parameters of a production function, which may take the following form:

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + (\omega_j + u_{jt}),$$

where $y$, $k$, $l$ denote output (or value-added), capital and labour, respectively; $j$ and $t$ denote firm and time (panel data), respectively; $\omega_j$ is a firm-specific unobserved effect; $u_{jt}$ is a time-varying residual; and $\beta_k$, $\beta_l$ are unknown parameters.

The main reasons for using a panel estimator in this context are as follows:

• The researcher might suspect there is time-invariant unobserved heterogeneity across firms in underlying productivity - controlling for "fixed effects", either by differencing or by using the within transformation, is meant to take care of this.

• The researcher might suspect that the time-varying component of the residual is serially correlated in levels - pseudo-differencing the levels equation, which results in a dynamic model, is meant to take care of this. Recall: if

$$u_{jt} = \rho u_{j,t-1} + e_{jt},$$

we have

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + (\omega_j + u_{jt})$$

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + (\omega_j + \rho u_{j,t-1} + e_{jt}),$$

and since, by definition,

$$\rho u_{j,t-1} = \rho y_{j,t-1} - \rho\beta_k k_{j,t-1} - \rho\beta_l l_{j,t-1} - \rho\omega_j,$$

the production function can be written as a dynamic equation (with common factor restrictions):

$$y_{jt} = \rho y_{j,t-1} + \beta_k k_{jt} - \rho\beta_k k_{j,t-1} + \beta_l l_{jt} - \rho\beta_l l_{j,t-1} + \omega_j(1-\rho) + e_{jt}.$$

• The researcher might suspect that the time-varying component of the residual is correlated with the factor inputs (e.g. capital, labour) - using instrumental variables is meant to take care of this.
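The pseudo-differencing step in the second point can be checked numerically. The sketch below is only an illustration under made-up assumptions (normally distributed inputs, arbitrary parameter values, a function name of my own choosing): it simulates a panel with an AR(1) residual and confirms that the dynamic representation reproduces $y_{jt}$ exactly.

```python
import numpy as np


def max_identity_gap(n_firms=200, n_periods=8, beta_k=0.3, beta_l=0.6,
                     rho=0.5, seed=0):
    """Largest absolute gap between y_jt and its dynamic representation.

    All parameter values and distributions are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    omega = rng.normal(size=(n_firms, 1))          # firm-specific effect
    k = rng.normal(size=(n_firms, n_periods))      # log capital
    l = rng.normal(size=(n_firms, n_periods))      # log labour
    e = rng.normal(scale=0.1, size=(n_firms, n_periods))

    # AR(1) residual: u_jt = rho * u_j,t-1 + e_jt
    u = np.zeros((n_firms, n_periods))
    u[:, 0] = e[:, 0]
    for t in range(1, n_periods):
        u[:, t] = rho * u[:, t - 1] + e[:, t]

    # Levels equation
    y = beta_k * k + beta_l * l + omega + u

    # Dynamic equation with common factor restrictions (periods 2..T)
    rhs = (rho * y[:, :-1]
           + beta_k * k[:, 1:] - rho * beta_k * k[:, :-1]
           + beta_l * l[:, 1:] - rho * beta_l * l[:, :-1]
           + omega * (1 - rho) + e[:, 1:])
    return float(np.max(np.abs(y[:, 1:] - rhs)))
```

The gap should be zero up to floating-point rounding, which is just the algebra above restated: the only remaining residual in the dynamic equation is the serially uncorrelated $e_{jt}$.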

The panel data methods discussed earlier in this course are obviously very general, and not specific to production functions. In this lecture we discuss in more detail the econometrics of estimating production functions using panel data, typically at the firm level.


2 Why are we interested?

In so far as there is one thing on which economists appear to be able to agree, it is the desirability of higher productivity. The production function is an important tool that can be used to analyze various aspects of productivity. Here are some research questions/issues that can be addressed using a production function approach:

• Scale and productivity. In most datasets on enterprises in Sub-Saharan Africa, labour productivity (usually defined as value-added per worker) is much higher in large than in small firms (see e.g. the survey paper by Bigsten & Söderbom, WBRO, 2006). Is this because large firms have more capital per worker, or because there are increasing returns to scale? If we believe the production function above is correctly specified, we can answer this question by estimating $\beta_k$ and $\beta_l$.


• Suppose we convince ourselves there are increasing returns to scale, i.e. $\beta_k + \beta_l > 1$. One implication would be that if a fixed set of inputs (at the national level) gets allocated to a small number of large firms, this results in more aggregate output than if allocated to a large number of small firms. This may be important for policy.

• In contrast, if we convince ourselves returns to scale are constant, $\beta_k + \beta_l = 1$, a reallocation of resources between firms of differing size may not affect aggregate output (e.g. two small firms will produce as much output as one large firm using the same amount of inputs as the two small ones between them).

• In fact, the evidence on returns to scale in developing countries is most consistent with constant returns to scale (see e.g. Söderbom and Teal, 2004, for evidence on enterprises in Ghana). That is, while there are many small firms in developing countries, this does not imply foregone scale economies. You will have seen that Blundell-Bond obtain the same result for firms in the US.

• Although I won't be talking about farms in this lecture, production functions are commonly used in agricultural economics too. For example, a common view is that small farms are more productive than large farms; however, the empirical evidence on the matter is somewhat mixed (e.g. Lamb, 2003, refutes this notion, concluding that large farms are as productive as small ones)*.

• Rates of technological change.

*Lamb, R. L. "Inverse productivity: land quality, labor markets and measurement error", Journal of Development Economics, 2003, 71: 71-95.


• Rates of return on, for example, R&D or exporting ("learning-by-exporting").

• The contribution of various forms of inputs to output (e.g. skilled & unskilled labour).


3 Production functions & the basic endogeneity issue

We focus on the simple 2-factor Cobb-Douglas production function:

$$Y_j = A_j K_j^{\beta_k} L_j^{\beta_l},$$

or, in natural logarithms,

$$y_j = \beta_0 + \beta_k k_j + \beta_l l_j + \varepsilon_j,$$

where

$$\ln A_j = \beta_0 + \varepsilon_j$$

is log TFP. $\beta_0$ is a constant, interpretable as the mean of log TFP, while $\varepsilon_j$ measures the deviation in productivity from the mean, for firm $j$. TFP is typically assumed unobserved (at least partially).


Suppose we have micro data on output, capital and labour. How can the parameters of this equation be estimated?

• As you know, for OLS to consistently estimate the $\beta$-parameters, the error term must have zero mean and be uncorrelated with the explanatory variables:

$$E(\varepsilon_j) = 0,$$

$$\mathrm{Cov}(k_j, \varepsilon_j) = 0, \qquad (1)$$

$$\mathrm{Cov}(l_j, \varepsilon_j) = 0. \qquad (2)$$

The zero mean assumption is innocuous, as the intercept $\beta_0$ would pick up a non-zero mean in $\varepsilon_j$.


• The crucial assumption is zero covariance. Is this likely to hold in the present context?

• No - because it seems quite possible that the firm's capital and labour decisions are influenced by factors that are observed by the firm's manager but unobserved by the econometrician, i.e. by $\varepsilon_j$. This would set up a correlation between the regressors and the residuals, rendering the OLS estimates biased and inconsistent.


3.1 Illustration

Assumptions:

• Firms operate in perfectly competitive input and output markets (so that input and output prices are not affected by the actions of firm $j$);

• Capital is a fixed input (decided upon one period in advance, say) rented at rate $r$;

• Firms observe $\varepsilon_j$ before hiring labour (at rate $W$), and labour is a "flexible input" that can be altered without dynamic implications.


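The firm's profit is given by

$$\pi_j = pY_j - WL_j - rK_j = p\left(A_j K_j^{\beta_k} L_j^{\beta_l}\right) - WL_j - rK_j,$$

where $p$ is the output price. Assuming the firm maximizes profits, it will choose labour such that the following first-order condition is fulfilled:

$$\beta_l\, p A_j K_j^{\beta_k} L_j^{\beta_l - 1} = W,$$

which implies

$$L_j = \left(\frac{\beta_l\, p A_j}{W}\right)^{\frac{1}{1-\beta_l}} K_j^{\frac{\beta_k}{1-\beta_l}},$$

or, in logs (using $\ln A_j = \beta_0 + \varepsilon_j$),

$$l_j = \frac{1}{1-\beta_l}\left[\ln\beta_l + \ln p - \ln W + \beta_0 + \varepsilon_j + \beta_k k_j\right].$$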


• Clearly in this case $l_j$ depends on unobserved TFP (which is the interpretation assigned to the residual $\varepsilon_j$) and so estimating the production function

$$y_j = \beta_0 + \beta_k k_j + \beta_l l_j + \varepsilon_j$$

by means of OLS will give biased and inconsistent results.

• Note that, since the first-order condition for labour implies a positive correlation between $l_j$ and $\varepsilon_j$, we would expect the OLS estimate of $\beta_l$ to be upward biased.
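To see the upward bias at work, here is a small simulation sketch. It is not part of the model above in one respect, which I flag loudly: I let the log wage vary across firms (an assumption of mine), because with a common wage the first-order condition makes output an exact linear function of labour and the regression is degenerate. Parameter values, distributions and the function name are all made up for illustration.

```python
import numpy as np


def simulate_ols_bias(n=50_000, beta_0=1.0, beta_k=0.3, beta_l=0.6, seed=0):
    """OLS of y on (1, k, l) when labour is chosen after observing TFP.

    Returns the estimated capital and labour coefficients.
    """
    rng = np.random.default_rng(seed)
    k = rng.normal(size=n)                  # log capital, fixed in advance
    eps = rng.normal(scale=0.5, size=n)     # TFP deviation, seen by the firm
    # ASSUMPTION (not in the lecture): the log wage differs across firms
    ln_w = rng.normal(scale=0.5, size=n)
    ln_p = 0.0                              # output price normalised to one

    # Log labour demand implied by the first-order condition
    l = (np.log(beta_l) + ln_p - ln_w + beta_0 + eps + beta_k * k) / (1 - beta_l)

    # Log production function
    y = beta_0 + beta_k * k + beta_l * l + eps

    X = np.column_stack([np.ones(n), k, l])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(coef[1]), float(coef[2])
```

Calling `simulate_ols_bias()` should return a labour coefficient well above the true 0.6 and a capital coefficient below the true 0.3, in line with the argument in the text.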


3.2 Other endogeneity issues

• Attrition. Suppose the probability of exit is a negative function of the value of the firm, and suppose the value of the firm depends on unobserved productivity and the level of capital stock installed:

$$\Pr\left(\text{exit}_{j,t+1} = 1 \mid \varepsilon_j, k_j\right) = \Pr\left(V_j(\varepsilon_j, k_j) < \underline{V}\right) = f(\varepsilon_j, k_j),$$

where $\underline{V}$ is the threshold value below which the firm exits, and $f_1 < 0$, $f_2 < 0$. That is, the typical firm that would exit would be one with a low level of productivity and a low level of capital (this would be a low value firm).

• Think about what this means for the correlation between unobserved productivity and observed capital in the "selected sample", i.e. in the sample of survivors.


• Firms with a lot of capital are likely to survive even if they have low productivity, because they have high values.

• However, firms with little capital will only survive if they have high levels of productivity.

• Hence, in the sample of survivors there will be a negative correlation between $k_j$ and unobserved productivity $\varepsilon_j$.

• Thus, if we estimate the production function

$$y_j = \beta_0 + \beta_k k_j + \beta_l l_j + \varepsilon_j,$$

this mechanism would tend to yield a downward bias in the coefficient on $k_j$.
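The selection argument can be illustrated with a toy simulation. The sketch below rests on assumptions of my own: capital and productivity are independent in the full population, firm value is simply their sum, firms below a threshold exit, and labour is left out of the production function to keep things short.

```python
import numpy as np


def survivor_selection(n=100_000, beta_k=0.3, threshold=0.0, seed=0):
    """Compare the full population with the selected sample of survivors."""
    rng = np.random.default_rng(seed)
    k = rng.normal(size=n)       # log capital
    eps = rng.normal(size=n)     # unobserved productivity, independent of k
    # ASSUMPTION: firm value is increasing in both capital and productivity
    value = k + eps
    survive = value >= threshold

    y = beta_k * k + eps         # simplified production function (no labour)

    def slope(x, z):
        # Bivariate OLS slope of z on x (with an intercept)
        return float(np.cov(x, z)[0, 1] / np.var(x, ddof=1))

    return {
        "corr_all": float(np.corrcoef(k, eps)[0, 1]),
        "corr_survivors": float(np.corrcoef(k[survive], eps[survive])[0, 1]),
        "slope_all": slope(k, y),
        "slope_survivors": slope(k[survive], y[survive]),
    }
```

In the full population the correlation is close to zero by construction; among survivors it should be clearly negative, and the estimated capital coefficient falls accordingly.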


• Measurement errors. In general, we expect measurement errors in inputs to lead to downward bias (attenuation bias) in the estimated coefficients. Recall the attenuation bias formula. Suppose

$$y_{it} = \beta x^*_{it} + v_{it},$$

where $x^*_{it}$ is the true but unobserved value of the explanatory variable, and $v_{it}$ is a non-autocorrelated, homoskedastic error term with zero mean. We observe an imperfect measure of $x^*_{it}$, namely $x_{it}$, such that

$$x_{it} = x^*_{it} + u_{it},$$

where $u_{it}$ is a random measurement error uncorrelated with $x^*_{it}$. Our estimable equation is

$$y_{it} = \beta x_{it} + (v_{it} - \beta u_{it}),$$

so the regressor $x_{it}$ is correlated with the error term $(v_{it} - \beta u_{it})$. It can be shown that this will lead to a downward bias in the OLS estimate of $\beta$ - that is, estimated $\beta$ is lower than true $\beta$. To give you an idea of what the bias looks like, consider the following formula showing the bias caused by measurement errors:

$$\operatorname{plim} \hat\beta^{OLS} = \beta\left(\frac{\sigma^2_{x^*}}{\sigma^2_{x^*} + \sigma^2_u}\right),$$

where $\sigma^2_{x^*}$ is the variance of the true, unobserved explanatory variable, and $\sigma^2_u$ is the variance of the measurement error. The operator plim can be thought of as showing the value of estimated $\beta$ in a large sample. Loosely speaking, this is what we can expect to get if there are measurement errors in the explanatory variable. Clearly the higher the variance of the measurement error, the more severe is the bias.


• What happens if we take first differences? Clearly,

$$\operatorname{plim} \hat\beta^{FD} = \beta\left(\frac{\sigma^2_{dx^*}}{\sigma^2_{dx^*} + \sigma^2_{du}}\right),$$

where $d$ indicates that the variance refers to the differenced variable. Assuming that the variances are constant over time and that the variables have mean zero, we have $\sigma^2_{dx^*} = 2\sigma^2_{x^*}(1-\rho_{x^*})$ and $\sigma^2_{du} = 2\sigma^2_u(1-\rho_u)$, so that

$$\operatorname{plim} \hat\beta^{FD} = \beta\left(\frac{\sigma^2_{x^*}}{\sigma^2_{x^*} + \sigma^2_u\,\frac{1-\rho_u}{1-\rho_{x^*}}}\right),$$

where $\rho_u$ is the serial correlation of the measurement errors and $\rho_{x^*}$ is the serial correlation of the true values of the regressors.

Now compare the following expressions:

$$\operatorname{plim} \hat\beta^{OLS} = \beta\left(\frac{\sigma^2_{x^*}}{\sigma^2_{x^*} + \sigma^2_u}\right),$$

$$\operatorname{plim} \hat\beta^{FD} = \beta\left(\frac{\sigma^2_{x^*}}{\sigma^2_{x^*} + \sigma^2_u\,\frac{1-\rho_u}{1-\rho_{x^*}}}\right).$$

Which one has the most severe bias?

The bias of the FD estimator will be more severe than that of the levels estimator if $\frac{1-\rho_u}{1-\rho_{x^*}} > 1$, i.e. if $\rho_{x^*} > \rho_u$.

This is an important result. In most applications we assume that the serial correlation of the measurement errors is quite small or zero, while the serial correlation of the true unobserved explanatory variable is positive. In this case first differencing the data is bound to exacerbate the measurement error bias, and OLS estimation of the levels equation would be preferable to the FD model.

• In practice, estimating the coefficient on the capital stock whilst controlling for fixed effects has proved difficult - see Söderbom and Teal, 2004, for details.
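A quick way to get a feel for the levels-versus-differences comparison is to simulate it. The sketch below assumes (my choices, purely for illustration) a highly persistent true regressor, serially uncorrelated measurement error, and no fixed effect; it returns the two estimates together with the large-sample values implied by the attenuation formulas.

```python
import numpy as np


def attenuation_demo(n_firms=5000, n_periods=6, beta=1.0, rho_x=0.9,
                     sigma_u=0.5, seed=0):
    """Levels OLS vs first-difference OLS under classical measurement error."""
    rng = np.random.default_rng(seed)

    # Stationary AR(1) true regressor with unit variance
    x_star = np.empty((n_firms, n_periods))
    x_star[:, 0] = rng.normal(size=n_firms)
    for t in range(1, n_periods):
        shock = rng.normal(size=n_firms)
        x_star[:, t] = rho_x * x_star[:, t - 1] + np.sqrt(1 - rho_x**2) * shock

    # ASSUMPTION: measurement error is serially uncorrelated (rho_u = 0)
    u = rng.normal(scale=sigma_u, size=(n_firms, n_periods))
    v = rng.normal(scale=0.1, size=(n_firms, n_periods))
    x = x_star + u               # observed regressor
    y = beta * x_star + v

    def slope(a, b):
        # Bivariate OLS slope of b on a (with an intercept)
        a, b = a.ravel(), b.ravel()
        return float(np.cov(a, b)[0, 1] / np.var(a, ddof=1))

    b_levels = slope(x, y)
    b_fd = slope(np.diff(x, axis=1), np.diff(y, axis=1))

    # Large-sample values from the formulas (sigma_x*^2 = 1, rho_u = 0)
    plim_levels = beta / (1 + sigma_u**2)
    plim_fd = beta / (1 + sigma_u**2 / (1 - rho_x))
    return b_levels, b_fd, plim_levels, plim_fd
```

With these settings the first-difference estimate should be far more attenuated than the levels estimate, because differencing removes most of the persistent signal but none of the noise.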


4 Traditional solutions to the endogeneity problem

The two traditional solutions to endogeneity problems are instrumental variables and fixed effects. We are now going to write the production function as

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + \omega_{jt} + \eta_{jt},$$

i.e. we have added time subscripts reflecting the panel dimension in the data, and we have decomposed the residual $\varepsilon$ into two components, $\omega_{jt} + \eta_{jt}$:

• $\omega_{jt}$ represents the part of TFP observable to the firm but not to the econometrician - hence this is the source of endogeneity problems. You can think of $\omega_{jt}$ as a measure of the managerial quality of the firm. From now on, we will refer to $\omega_{jt}$ as "unobserved productivity".

• $\eta_{jt}$, on the other hand, is assumed not to affect the firm's input decisions. You can think of $\eta_{jt}$ as representing measurement errors in output, for example (other interpretations are possible too; see Section 2.2 in ABBP). What's important is that $\eta_{jt}$ is not a source of endogeneity bias.


4.1 Instrumental Variables

Our problem: We want to estimate

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + \omega_{jt} + \eta_{jt},$$

but we cannot use OLS, since

$$\mathrm{Cov}(l_{jt}, \omega_{jt}) \neq 0.$$

(It is likely, of course, that capital is endogenous too, but we abstract from that possibility for the moment.)

Suppose an instrument $z_{jt}$ is available that fulfills the following conditions:

1. The instrument is valid (or exogenous):

$$\mathrm{Cov}(z_{jt}, \omega_{jt}) = 0.$$

This is an exclusion restriction - $z_{jt}$ is excluded from the structural equation (the production function).

2. The instrument is informative (or relevant). This means that the instrument $z_{jt}$ must be correlated with the endogenous regressor (labour in the current example), conditional on all exogenous variables in the model (i.e. capital, if this is thought exogenous). That is, if we assume there is a linear relationship between $l_{jt}$ and $z_{jt}$ and $k_{jt}$,

$$l_{jt} = \delta_0 + \delta_1 k_{jt} + \theta_1 z_{jt} + r_{jt}, \qquad (3)$$

where $r_{jt}$ is mean zero and uncorrelated with the variables on the right-hand side, we require $\theta_1 \neq 0$.

Many economists take the view that, for instrumental variable estimation to be convincing, the instruments used must be motivated by theory. Recall the first-order condition for labour derived above - with my slightly modified notation we get

$$l_{jt} = \frac{1}{1-\beta_l}\left[\ln\beta_l + \ln p - \ln W + \beta_k k_{jt} + \omega_{jt}\right].$$

• This suggests the wage rate $W$ might be a useful instrument:

  - Our theory says it is (negatively) correlated with labour.

  - The wage rate also must be uncorrelated with $\omega_{jt}$. This may not be an entirely innocuous assumption to make. While the wage rate does not directly enter the production function, wages might be correlated with unobserved productivity for other reasons - e.g. if more productive firms have stronger market power in input markets - in which case the wage will not be a valid instrument.


• It also follows from the first-order condition above that the output price is a potential instrument - however, that has been used less often in the literature. Why might we be concerned about using the output price as an instrument?

• A similar way of reasoning can be applied for capital, if that is thought endogenous (i.e. use the cost of capital as an instrument).
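As a sketch of how the wage instrument would work if both conditions held, the simulation below generates labour from the first-order condition with a firm-specific log wage that is exogenous by construction, and compares OLS with the just-identified IV estimator $(Z'X)^{-1}Z'y$. The data-generating process, parameter values and function name are illustrative assumptions of mine; in real data the validity of the wage is exactly what is in doubt.

```python
import numpy as np


def iv_vs_ols(n=50_000, beta_k=0.3, beta_l=0.6, seed=0):
    """Labour coefficient by OLS and by IV using the log wage as instrument."""
    rng = np.random.default_rng(seed)
    k = rng.normal(size=n)                    # log capital, treated as exogenous
    omega = rng.normal(scale=0.5, size=n)     # unobserved productivity
    eta = rng.normal(scale=0.1, size=n)       # output measurement error
    # ASSUMPTION: wage varies across firms and is independent of omega
    ln_w = rng.normal(scale=0.5, size=n)

    # Labour demand from the first-order condition (ln p normalised to zero)
    l = (np.log(beta_l) - ln_w + beta_k * k + omega) / (1 - beta_l)
    y = beta_k * k + beta_l * l + omega + eta

    const = np.ones(n)
    X = np.column_stack([const, k, l])        # regressors
    Z = np.column_stack([const, k, ln_w])     # instruments: wage replaces labour

    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)  # just-identified IV estimator
    return float(b_ols[2]), float(b_iv[2])
```

Under these assumptions the IV estimate should sit close to the true 0.6 while OLS is biased upwards. Making `ln_w` depend on `omega` in the simulation is an easy way to see how the IV estimate breaks down when the exclusion restriction fails.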


Five reasons why the IV approach based on prices as instruments has not been very successful

1. Market power. Wages and capital prices (and output prices) could well be correlated with unobserved productivity if input (output) markets are not perfectly competitive: e.g. high unobserved productivity gives the firm market power and so enables it to influence the price.

2. Wages and unobserved worker quality. When labour costs are reported in firm-level datasets, they typically come in the form of average wage per worker, and you may well be concerned that the average wage in the firm is correlated with unobserved quality of the workforce. Since the unobserved quality of the workforce likely affects unobserved productivity, this would imply the average wage is an invalid instrument.


3. Law of one price. If, as is typically the case, one wants to include time dummies in the production function, there must be variation in input prices across firms at a given point in time for these to be useful instruments. If input markets are essentially national in scope, this seems unlikely. (If average wages do vary across firms, as they do in most datasets, one suspects this is at least partly picking up unobserved worker quality.)

4. Endogenous unobserved productivity. Suppose unobserved productivity $\omega_{jt}$ actually depends on input choices - e.g. investment in modern technology raises productivity. In that case it will be hard to argue that input prices are valid instruments, since these surely will affect investment.

5. Attrition. A different kind of endogeneity problem sometimes discussed in the literature is posed by endogenous attrition, i.e. that the firm's exit decision depends on unobserved productivity as well as input prices (after all, these jointly determine the profitability of the firm). In such a case we will have a Heckman type selection problem, in which all variables determining the exit decision will go into the residual of the production function in the selected sample. Clearly input prices cannot be used as instruments in this case.

The common theme across these reasons is that prices are unlikely to be valid instruments.


4.2 Fixed Effects

A second traditional solution to the endogeneity problem is fixed effects estimation, which as you know requires panel data. One key assumption underlying this approach is that unobserved productivity is constant over time,

$$\omega_{jt} = \omega_j,$$

but varies across firms. We would now write the production function as

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + (\omega_j + \eta_{jt}),$$

and use perhaps the within estimator ("fixed effects" estimator) or the first-differenced estimator to estimate the parameters - in the latter case, for example, we would thus estimate

$$y_{jt} - y_{j,t-1} = \beta_k (k_{jt} - k_{j,t-1}) + \beta_l (l_{jt} - l_{j,t-1}) + (\eta_{jt} - \eta_{j,t-1}),$$

using OLS (probably with firm-clustered standard errors since the differenced residual is likely serially correlated).

Notice that the source of endogeneity bias, $\omega_j$, has been eliminated, thus effectively solving the endogeneity problem (subject of course to strict exogeneity; see earlier lectures in this course).
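The following sketch shows the first-differenced estimator doing its job in the best case, i.e. when unobserved productivity really is time-invariant and there is no measurement error in inputs. As before, the time-varying firm-level wage, the parameter values and the helper names are assumptions made only for this illustration.

```python
import numpy as np


def fd_demo(n_firms=5000, n_periods=5, beta_k=0.3, beta_l=0.6, seed=0):
    """Levels OLS vs first-difference OLS with a firm fixed effect omega_j."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(scale=0.5, size=(n_firms, 1))         # constant over time
    k = rng.normal(size=(n_firms, n_periods))
    # ASSUMPTION: wages vary over time within firms and are exogenous
    ln_w = rng.normal(scale=0.5, size=(n_firms, n_periods))
    eta = rng.normal(scale=0.1, size=(n_firms, n_periods))

    # Labour responds to omega_j through the first-order condition
    l = (np.log(beta_l) - ln_w + beta_k * k + omega) / (1 - beta_l)
    y = beta_k * k + beta_l * l + omega + eta

    def ols(dep, *regs):
        # Pooled OLS with a constant; returns the slope coefficients only
        X = np.column_stack([np.ones(dep.size)] + [r.ravel() for r in regs])
        return np.linalg.lstsq(X, dep.ravel(), rcond=None)[0][1:]

    b_levels = ols(y, k, l)
    b_fd = ols(np.diff(y, axis=1), np.diff(k, axis=1), np.diff(l, axis=1))
    return b_levels, b_fd
```

Each returned array holds the capital and labour coefficients in that order. Differencing removes $\omega_j$, so the differenced estimates should be close to the true values while the pooled levels estimate of the labour coefficient is biased upwards.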


Three reasons why the fixed effects approach has not been very successful

1. Time invariant unobserved productivity. The assumption that unobserved productivity is fixed over time is quite restrictive, especially in longer panels.

2. Differencing may exacerbate measurement error bias. When there are measurement errors in inputs, the fixed effects estimator may well be more severely biased than the OLS estimator. Discuss.

3. Poor performance in practice. Fixed effects estimates of the capital coefficient are often implausibly low, and estimated returns to scale are often (severely) decreasing ($\beta_k + \beta_l \ll 1$).


[EXAMPLE 1. To be discussed in class]


5 The Olley and Pakes (1996) approach

Olley & Pakes (1996; henceforth OP) use a different approach to solve the endogeneity problems discussed above. Similar to the IV approach, OP derive their solution from the input demand equations; however, OP do not require factor prices to be observed. In what follows I will discuss a simplified version of the OP model.

• The production function:

$$y_{jt} = \beta_0 + \beta_k k_{jt} + \beta_l l_{jt} + (\omega_{jt} + \eta_{jt})$$

(the original OP model also allows for an effect of firm age, but I ignore that here).


Summary of key assumptions:

• Labour is a flexible input chosen in period $t$, after observing productivity $\omega_{jt}$.

• Capital is a "quasi-fixed" input chosen in period $t-1$ and subject to strictly convex adjustment costs. Capital evolves according to the equation

$$K_{jt} = (1-\delta)K_{j,t-1} + I_{j,t-1},$$

where $I_{j,t-1}$ denotes investment and $\delta$ is the depreciation rate.

• Unobserved productivity $\omega_{jt}$ is assumed to follow a first-order Markov process,

$$p\left(\omega_{j,t+1} \mid \{\omega_{j\tau}\}_{\tau=0}^{t}, \mathcal{I}_{jt}\right) = p\left(\omega_{j,t+1} \mid \omega_{jt}\right),$$

where $\mathcal{I}_{jt}$ is the firm's information set in period $t$. This means that, given the present information, future states are independent of the past states - lags of the productivity variable do not provide additional information as to what might happen to productivity in the future. Examples (in each case $\xi_{jt}$ denotes the innovation in productivity):

  - Linear process

$$\omega_{jt} = \rho\,\omega_{j,t-1} + \xi_{jt}.$$

  - Nonlinear process

$$\omega_{jt} = \rho_1\omega_{j,t-1} + \rho_2\omega^3_{j,t-1} + \xi_{jt}.$$

  - Nonparametric process

$$\omega_{jt} = f(\omega_{j,t-1}) + \xi_{jt}.$$


Recall the linear process was adopted by Blundell and Bond in their analysis of production functions based on US data.

• The profit in period $t$ is defined as

$$\Pi_t = pK_{jt}^{\beta_k} L_{jt}^{\beta_l} \exp(\beta_0 + \omega_{jt}) - W_{jt}L_{jt} - p_I I_{jt} - G(I_{jt}, K_{jt}),$$

where $p$ is the output price, $p_I$ is the price of one unit of capital, and $G(I_{jt}, K_{jt})$ is the adjustment cost for capital. Note: labour is not a function of $\eta_{jt}$ since we're assuming this term is just random noise (output measurement error).

• Since labour is assumed to be a flexible input, the static first-order condition applies:

$$\beta_l\, p K_{jt}^{\beta_k} L_{jt}^{\beta_l - 1} \exp(\beta_0 + \omega_{jt}) = W_{jt},$$

which implies

$$L_{jt} = \left(\frac{\beta_l\, p \exp(\beta_0 + \omega_{jt})}{W_{jt}}\right)^{\frac{1}{1-\beta_l}} K_{jt}^{\frac{\beta_k}{1-\beta_l}}.$$

Using this expression for labour in the profit function above, we can rewrite profits as

$$\Pi_t = (1-\beta_l)\,\beta_l^{\frac{\beta_l}{1-\beta_l}} \left(p\exp(\beta_0 + \omega_{jt})\right)^{\frac{1}{1-\beta_l}} W_{jt}^{\frac{\beta_l}{\beta_l - 1}} K_{jt}^{\frac{\beta_k}{1-\beta_l}} - p_I I_{jt} - G(I_{jt}, K_{jt}),$$

or, in more reader-friendly notation,

$$\Pi_t = \varphi(W_{jt}, \omega_{jt})\, K_{jt}^{\frac{\beta_k}{1-\beta_l}} - p_I I_{jt} - G(I_{jt}, K_{jt}).$$

You see how the labour variable has "disappeared" - replaced by the variables and parameters determining $L_{jt}$ as implied by the first-order condition for labour. Using a notation more similar to that in OP, we might therefore write profits as

$$\Pi_t = \pi(k_{jt}, \omega_{jt}) - c(I_{jt}),$$

where $\pi(k_{jt}, \omega_{jt})$ is sales minus labour costs, and $c(I_{jt})$ is the cost of investment, including strictly convex adjustment costs, for example

$$G(I_{jt}, K_{jt}) = \frac{\gamma}{2}\left(\frac{I_{jt}}{K_{jt}}\right)^2 K_{jt},$$

where $\gamma$ is a parameter measuring the marginal adjustment cost of capital.


The firm's objective

• The final important assumption underlying the OP framework concerns the behaviour of the firm.

• It is assumed that the firm chooses investment and employment to maximize the present value of current and expected future net revenues. We have already seen how labour is "optimized out" at each period, which means we can write the value of the firm as a function of capital and productivity only:

$$V(k_{jt}, \omega_{jt}) = \max_{I_t} E_t \sum_{s=t}^{\infty} \lambda^{s-t}\left[\pi(k_{js}, \omega_{js}) - c(I_{js})\right],$$

where $E_t$ denotes expectation given the information available in period $t$, and $\lambda$ is a discount factor. The choice variable (or control variable) here is investment in period $t$.

• Note: the fact that labour is not visible in this equation does not mean labour is irrelevant. Labour is not visible here because we have implicitly replaced it by the variables and parameters determining labour as implied by the first-order condition. Indeed, estimating the coefficient on labour in the production function is a central objective in the analysis.

• Alternatively, we can write the value of the firm recursively as a (stochastic) Bellman equation:

$$V(k_{jt}, \omega_{jt}) = \max_{I_t}\left\{\pi(k_{jt}, \omega_{jt}) - c(I_{jt}) + \lambda E_t\left[V(k_{j,t+1}, \omega_{j,t+1})\right]\right\}. \qquad (4)$$


The firm's investment demand

• Key for the OP approach is the firm's investment. In a model of the form outlined above, optimal investment in period $t$ will depend on

  - the existing capital stock; and

  - expectations about the future profitability of capital.

• The first-order Markov assumption implies that expected productivity in the future depends on current, but not past, productivity.

• OP hence write down an investment demand function of the following form:

$$I_{jt} = I_t(k_{jt}, \omega_{jt}).$$

This function needs to be strictly increasing in unobserved productivity for the OP procedure to work - a firm with a high value of $\omega_{jt}$ will invest strictly more than a firm with a low value of $\omega_{jt}$, conditional on $k_{jt}$.

• [EXAMPLE 2. From Bond, Söderbom and Wu, 2008. To be discussed in class]


Controlling for the endogeneity of input choice

We are now ready to discuss the estimation strategy proposed by OP. Notice that this is motivated by the theory discussed above.

• The key "trick" in OP. Recall that investment is assumed to be strictly monotonic (increasing) in $\omega_{jt}$. This implies that the investment demand function

$$I_{jt} = I_t(k_{jt}, \omega_{jt})$$

can be inverted so that productivity is expressed as a function of investment and capital:

$$\omega_{jt} = h_t(k_{jt}, I_{jt}).$$

Intuitively, capital $k_{jt}$ and investment $I_{jt}$ "tell" us what $\omega_{jt}$ must be.


• Now return to the production function:

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + (\omega_{jt} + \eta_{jt}).$$

Recall that unobserved productivity $\omega_{jt}$ is a source of endogeneity bias. We now use $\omega_{jt} = h_t(k_{jt}, I_{jt})$ and rewrite the production function as

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + h_t(k_{jt}, I_{jt}) + \eta_{jt}.$$

By including the function $h_t(k_{jt}, I_{jt})$ as an additional term on the right-hand side, we have effectively "controlled" for unobserved productivity.

• Building on this, OP proposed a two-stage procedure to estimate the parameters $\beta_l$ and $\beta_k$. This works as follows.

• First stage: Define

$$\phi_t(k_{jt}, I_{jt}) = \beta_k k_{jt} + h_t(k_{jt}, I_{jt}),$$

and rewrite the production function

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + (\omega_{jt} + \eta_{jt})$$

as

$$y_{jt} = \beta_l l_{jt} + \phi_t(k_{jt}, I_{jt}) + \eta_{jt}.$$

• In general, the function $\phi_t$ is not linear. OP propose either approximating $\phi_t$ using a polynomial, e.g.

$$\phi_t(k_{jt}, I_{jt}) = \gamma_0 + \gamma_1 I_{jt} + \gamma_2 k_{jt} + \gamma_3 (I_{jt} \cdot k_{jt}) + \gamma_4 I_{jt}^2 + \gamma_5 k_{jt}^2,$$

or using kernel methods (nonparametric). In any case, what is clear now is that, provided we control for $\phi_t(k_{jt}, I_{jt})$, we may be able to identify the labour coefficient $\beta_l$ in the first stage. Indeed, if we use the polynomial above, all we have to do is to estimate the following regression

$$y_{jt} = \gamma_0 + \beta_l l_{jt} + \gamma_1 I_{jt} + \gamma_2 k_{jt} + \gamma_3 (I_{jt} \cdot k_{jt}) + \gamma_4 I_{jt}^2 + \gamma_5 k_{jt}^2 + \eta_{jt}$$

using OLS.
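The first-stage regression can be sketched on simulated data. This is a minimal illustration rather than the OP implementation: the data-generating process, the linear investment rule and every parameter value are assumptions made for the example. A serially uncorrelated firm-specific shock to labour is included so that labour varies independently of capital and productivity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
beta_k, beta_l = 0.4, 0.6                    # assumed "true" coefficients

k = rng.normal(size=n)                       # log capital
omega = rng.normal(size=n)                   # unobserved productivity
inv = 0.5 * k + omega                        # hypothetical investment rule, strictly increasing in omega
wage_shock = rng.normal(scale=0.5, size=n)   # assumed i.i.d. source of labour variation
l = 0.6 * k + 0.8 * omega + wage_shock       # labour responds to productivity
eta = rng.normal(scale=0.3, size=n)          # measurement error in output
y = beta_k * k + beta_l * l + omega + eta

def ols(columns, dep):
    """Return OLS coefficients of dep on a constant and the given columns."""
    X = np.column_stack([np.ones(len(dep))] + list(columns))
    return np.linalg.lstsq(X, dep, rcond=None)[0]

# Naive OLS of y on l and k: omega is omitted and correlated with l.
bl_ols = ols([l, k], y)[1]

# OP first stage: control for phi_t(k, I) with a second-order polynomial.
bl_op = ols([l, inv, k, inv * k, inv**2, k**2], y)[1]

print(f"labour coefficient, naive OLS:      {bl_ols:.3f}")
print(f"labour coefficient, OP first stage: {bl_op:.3f}")
```

In this setup the naive estimate is biased upwards because labour responds to productivity, while the polynomial in investment and capital absorbs productivity and leaves only the wage shock to identify the labour coefficient.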

• [EXAMPLE 3: Applying the first-stage OP procedure to the Blundell-Bond data. To be discussed in class.]

• Second stage: We have now estimated $\beta_l$. In the second stage we shall estimate the capital coefficient $\beta_k$ - this cannot be estimated in the first stage. Note that the first-stage estimation will give us an estimate of the function $\phi_t$, e.g.

$$\phi_t(k_{jt}, I_{jt}) = \gamma_0 + \gamma_1 I_{jt} + \gamma_2 k_{jt} + \gamma_3 (I_{jt} \cdot k_{jt}) + \gamma_4 I_{jt}^2 + \gamma_5 k_{jt}^2,$$

if we are using the polynomial above.

• It follows that

$$\omega_{jt} = h_t(k_{jt}, I_{jt}) = \phi_{jt} - \beta_k k_{jt}.$$

• Now, recall that unobserved productivity follows a first-order Markov process; this means we can decompose $\omega_{jt}$ as follows:

$$
\begin{aligned}
\omega_{jt} &= E_{t-1}(\omega_{jt}) + \xi_{jt} \\
\omega_{jt} &= g(\omega_{j,t-1}) + \xi_{jt},
\end{aligned}
$$

where $\xi_{jt}$ is the innovation (shock) to productivity. If productivity follows a linear autoregressive process, for example, we would have

$$\omega_{jt} = \rho\,\omega_{j,t-1} + \xi_{jt},$$

cf. Blundell-Bond.

• The production function, again:

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + \omega_{jt} + \eta_{jt},$$

which given the insights above can be written

$$y_{jt} - \beta_l l_{jt} = \beta_k k_{jt} + g(\omega_{j,t-1}) + \xi_{jt} + \eta_{jt},$$

or

$$y_{jt} - \beta_l l_{jt} = \beta_k k_{jt} + g(\phi_{j,t-1} - \beta_0 - \beta_k k_{j,t-1}) + \xi_{jt} + \eta_{jt}. \tag{5}$$

Now, because capital is chosen one period in advance, the residual $\xi_{jt} + \eta_{jt}$ will be uncorrelated with all the right-hand side variables (remember we have already estimated $\beta_l$, which is why I have moved $\beta_l l_{jt}$ to the left-hand side here).

• Depending on how flexible you want to be, (5) can be estimated using either OLS (if $g$ is linear); NLLS (if $g$ is a polynomial); or kernel methods (if $g$ is treated nonparametrically).
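A minimal sketch of the second stage with a linear $g$, again on simulated data. Everything about the data-generating process is an assumption made for illustration, including the extra shock in the capital rule. To keep the example short, the first-stage outputs ($\beta_l$ and $\phi_{j,t-1}$) are taken as known, the constant in (5) is absorbed into the regression intercept, and $\beta_k$ is found by a grid search over the sum of squared residuals instead of a packaged NLLS routine.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
beta_k, beta_l, rho = 0.4, 0.6, 0.7      # assumed "true" values

k_lag = rng.normal(size=n)               # k_{j,t-1}
w_lag = rng.normal(size=n)               # omega_{j,t-1}
# Capital for period t is fixed at t-1. The extra shock is an assumption that
# keeps k_t from being an exact linear function of (k_{t-1}, omega_{t-1}).
k = 0.8 * k_lag + 0.5 * w_lag + rng.normal(scale=0.5, size=n)
xi = rng.normal(scale=0.5, size=n)       # productivity innovation
w = rho * w_lag + xi                     # linear first-order Markov process
l = 0.3 * k + 0.8 * w + rng.normal(scale=0.5, size=n)
eta = rng.normal(scale=0.3, size=n)
y = beta_k * k + beta_l * l + w + eta

# Stand-ins for first-stage output (taken as known here for brevity).
phi_lag = beta_k * k_lag + w_lag
lhs = y - beta_l * l

def ssr(b_k):
    """Sum of squared residuals of equation (5) for a trial b_k, with g linear."""
    dep = lhs - b_k * k
    X = np.column_stack([np.ones(n), phi_lag - b_k * k_lag])
    coef = np.linalg.lstsq(X, dep, rcond=None)[0]
    resid = dep - X @ coef
    return resid @ resid

grid = np.linspace(0.0, 1.0, 201)
bk_hat = grid[np.argmin([ssr(b) for b in grid])]
print(f"second-stage estimate of beta_k: {bk_hat:.3f}")
```

In applied work the first-stage estimates would replace the stand-ins, and standard errors would have to account for that.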

• [EXAMPLE 4: Applying the second-stage OP procedure to the Blundell-Bond data. To be discussed in class.]

5.0.1 Discussion

Scalar unobservable assumption must hold, otherwise we can't invert.

• Suppose there are two stochastic components of unobserved productivity, so that

$$I_{jt} = I_t(k_{jt}, \omega^1_{jt}, \omega^2_{jt}),$$

and suppose $\omega^1_{jt}, \omega^2_{jt}$ follow different stochastic processes; for example, let's suppose $\omega^1_{jt}$ is highly persistent whereas $\omega^2_{jt}$ exhibits only moderate serial correlation.

• In such a case, $\omega^1_{jt}$ and $\omega^2_{jt}$ will impact differently on investment. For example, conditional on capital,

$$\omega^1_{jt} = \delta > 0, \qquad \omega^2_{jt} = 0$$

will give a stronger investment response than

$$\omega^1_{jt} = 0, \qquad \omega^2_{jt} = \delta > 0,$$

because the firm understands that expected future profits are higher in the former case than in the latter case (since $\omega^1_{jt}$ is more persistent).

• Because $\omega^1_{jt}, \omega^2_{jt}$ are both unobserved, we are stuck. The investment demand equation cannot be inverted; put differently, we can't infer from capital and investment the values of $\omega^1_{jt}, \omega^2_{jt}$ separately. The OP approach won't work.
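The failure of invertibility can be seen in a two-firm toy example. The linear investment rule and its weights are hypothetical, chosen only so that the persistent component matters more for investment; the example also assumes that total productivity is the sum of the two components.

```python
import math

def investment(k, omega1, omega2):
    # Hypothetical rule: the persistent component omega1 gets the larger weight.
    return 0.5 * k + 0.9 * omega1 + 0.3 * omega2

firm_a = dict(k=1.0, omega1=0.2, omega2=0.0)
firm_b = dict(k=1.0, omega1=0.1, omega2=0.3)

inv_a = investment(**firm_a)
inv_b = investment(**firm_b)
prod_a = firm_a["omega1"] + firm_a["omega2"]   # total productivity, firm A
prod_b = firm_b["omega1"] + firm_b["omega2"]   # total productivity, firm B

same_investment = math.isclose(inv_a, inv_b)
print("same capital and investment:", same_investment)
print("total productivity:", prod_a, "versus", prod_b)
```

Both firms have identical capital and investment, so any function of $(k_{jt}, I_{jt})$ assigns them the same value even though their productivity differs.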

Zero investment levels potentially problematic

Recall that investment needs to be a strictly monotonic function of (scalar) unobserved productivity. The presence of lots of zero investments in the data strongly indicates that this is not the case - surely it's wildly unrealistic to assume that all firms that invest nothing have precisely the same level of unobserved productivity (conditional on capital).

If investment is irreversible, for example, the investment demand function will not be a strictly monotonic function of productivity, and there will be lots of investment zeros in the data (see the graph taken from Bond, Söderbom and Wu, 2008).

• Levinsohn & Petrin (2003) proposed using raw materials as a proxy for unobserved productivity in such a case. Raw materials are rarely if ever zero in datasets and so strict monotonicity might hold. Below I will discuss a generalized approach in this vein proposed by Ackerberg, Caves and Frazer (2006), so I will not discuss the Levinsohn-Petrin estimator here.

• Alternatively, we can retain the OP approach provided we simply drop all observations for which investment is equal to zero. Provided the model is correctly specified, such a procedure would control for unobserved productivity and yield consistent estimates.

Labour really flexible?

The OP approach just described is really only appropriate if labour is a flexible input. If not, e.g. because firms can't easily hire and fire workers from one day to another, then the investment demand function specified as part of the OP approach,

$$I_{jt} = I_t(k_{jt}, \omega_{jt}),$$

would no longer be correct - investment would depend on capital and unobserved productivity, but it would also depend on labour:

$$I_{jt} = I_t(k_{jt}, \omega_{jt}, l_{jt}).$$

Inverting out $\omega_{jt}$ would leave you with a function of the following form

$$\omega_{jt} = \tilde{h}_t(k_{jt}, I_{jt}, l_{jt}),$$

and so it would clearly not be possible to identify anything in the first stage:

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + \tilde{h}_t(k_{jt}, I_{jt}, l_{jt}) + \eta_{jt}.$$

Collinearity and other issues

• Ackerberg, Caves and Frazer (2006) noted that the parameter $\beta_l$ on the flexible labour input is not identified by estimating the first stage, except in a pretty special case involving either serially uncorrelated wages or serially uncorrelated optimization errors. More generally, parameters on flexible inputs in Cobb-Douglas production functions are not identified from cross-section variation if all firms face common input prices and inputs are optimally chosen (Bond & Söderbom, work in progress).

• To see this, consider the f.o.c. for labour again:

$$l_{jt} = \frac{1}{1 - \beta_l}\left[\ln \beta_l + \ln p - \ln W + \beta_k k_{jt} + \omega_{jt}\right].$$

• Identification of $\beta_l$ in OP stage 1,

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + (\omega_{jt} + \eta_{jt}),$$

requires variation across firms in $l_{jt}$ at given levels of $k_{jt}$ and $\omega_{jt}$ (or, under the assumptions of OP, at given levels of $k_{jt}$ and $h_t(k_{jt}, I_{jt})$):

$$\phi_t(k_{jt}, I_{jt}) = \beta_k k_{jt} + h_t(k_{jt}, I_{jt}).$$

• Yet the structure of the conditional labour demand function indicates that variation across firms in $l_{jt}$ is fully explained by $k_{jt}$ and $\omega_{jt}$, if the real wage is common to all firms and the labour input is optimally chosen.

• In general, identification of parameters on flexible inputs from cross-section variation thus requires either variation across firms in the real price of those inputs, or some form of optimization error in the choice of those inputs.
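The collinearity problem is easy to reproduce numerically. The sketch below generates labour from the first-order condition above with a common wage and output price, and then projects it on a constant, capital and investment. The parameter values and the linear investment rule are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
beta_k, beta_l = 0.4, 0.6      # assumed values
ln_p, ln_W = 0.0, 0.0          # common output price and wage (in logs)

k = rng.normal(size=n)
omega = rng.normal(size=n)
inv = 0.5 * k + omega          # hypothetical investment rule, invertible in omega

# Labour from the first-order condition, with no firm-specific wage variation.
l = (np.log(beta_l) + ln_p - ln_W + beta_k * k + omega) / (1 - beta_l)

# Project l on a constant, k and investment: nothing is left over.
X = np.column_stack([np.ones(n), k, inv])
coef = np.linalg.lstsq(X, l, rcond=None)[0]
max_resid = np.max(np.abs(l - X @ coef))
print(f"largest residual of l on (1, k, I): {max_resid:.2e}")
```

Because the residual is zero up to rounding error, labour has no independent variation once $\phi_t(k_{jt}, I_{jt})$ is controlled for, and a first-stage regression cannot separate $\beta_l$ from the control function.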

• As discussed in Ackerberg, Caves and Frazer (2006), identification of $\beta_l$ using the first stage of the Olley-Pakes estimation procedure further requires that any variation across firms in the real wage, or any optimization error in the choice of labour, must be serially uncorrelated. The reason is that either persistent variation in real wages or persistent optimization error in the choice of labour would affect the decision rule for capital, i.e.

$$I_{jt} = I_t(k_{jt}, \omega_{jt})$$

would no longer be correct - instead,

$$I_{jt} = I_t(k_{jt}, \omega_{jt}, W_{jt}),$$

implying that the unobserved level of log TFP could no longer be adequately proxied using a function of investment and capital alone.

• For the same reason, consistent estimation of $\beta_l$ from the first stage also requires that there must be no variation across firms in the cost of capital, and no optimization error in the investment decision.

• Thus, if inputs are optimally chosen, the only form of input price variation that allows identification of $\beta_l$ using the first stage of the estimator proposed by Olley and Pakes (1996) is the presence of serially uncorrelated variation across firms in the real wage.

5.1 The Ackerberg, Caves and Frazer (2006) approach

• Ackerberg, Caves and Frazer (2006; henceforth ACF) suggest an alternative estimation approach that avoids some of the problems discussed above (e.g. the potential collinearity problems), and that will work under less restrictive assumptions than those underlying the OP model.

• Just like OP (and LP), the ACF estimator is a two-step estimator. The main difference between the ACF approach and the OP (and LP) approach is that, with the ACF approach, no coefficients of interest will be estimated in the first stage of estimation. Instead, all input coefficients are estimated in the second stage. As we shall see, the first stage is still important, however.

• A useful starting point for the ACF approach is a three-factor Cobb-Douglas output production function,

$$q_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + \beta_m m_{jt} + \tilde{\omega}_{jt},$$

where $q_{jt}$ is output, and I have added log raw materials, denoted $m_{jt}$, to the basic specification used above (a constant is subsumed in $\tilde{\omega}_{jt}$). Raw materials is assumed to be a perfectly flexible input, and so the following static first-order condition applies:

$$\beta_m \frac{p\,Q_{jt}}{M_{jt}} = p^m,$$

where $Q_{jt}$ is the level of output, $M_{jt}$ is the level of raw materials and $p^m$ is the unit price of raw materials, assumed constant in the cross-section.

• Note that raw materials will be proportional to output. Using this in the output production function, we can obtain a value-added production function as follows:

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + \omega_{jt}.$$

Supposing that there are measurement errors in value-added, we modify this accordingly:

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + \omega_{jt} + \eta_{jt},$$

where $\eta_{jt}$ denotes the measurement error as usual (note: $\eta_{jt}$ doesn't have to be measurement error, it can be a "real" shock to output, provided it does not impact on any of the factor inputs - see ABBP for a discussion).
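The step from the gross-output function to the value-added function can be filled in as follows. This is a sketch under an assumption not stated on the slide, namely that value added is output net of the real cost of materials, $Y_{jt} = Q_{jt} - (p^m/p) M_{jt}$.

$$
\begin{aligned}
m_{jt} &= \ln \beta_m + \ln p - \ln p^m + q_{jt} && \text{(first-order condition in logs)} \\
(1 - \beta_m)\, q_{jt} &= \beta_k k_{jt} + \beta_l l_{jt} + \beta_m (\ln \beta_m + \ln p - \ln p^m) + \tilde{\omega}_{jt} && \text{(substituting for } m_{jt}\text{)} \\
y_{jt} &= \ln(1 - \beta_m) + q_{jt} && \text{(since } Y_{jt} = (1 - \beta_m)\, Q_{jt}\text{)}
\end{aligned}
$$

Combining the last two lines gives a function that is linear in $k_{jt}$ and $l_{jt}$, with input coefficients equal to the gross-output coefficients divided by $(1 - \beta_m)$ and a productivity term equal to $\tilde{\omega}_{jt}/(1 - \beta_m)$ plus a constant. Under this reading, the symbols $\beta_k$, $\beta_l$ and $\omega_{jt}$ in the value-added function are best understood as these rescaled quantities.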

• So the raw materials variable has disappeared from the scene. Why did we introduce it then? The answer is that the raw materials variable will play a role similar to that played by investment in the OP model - i.e. as a proxy for unobserved productivity.

Having set the scene, we are now ready to consider the ACF approach. Key assumptions are as follows:

• Materials is a flexible input chosen in period $t$, after observing productivity $\omega_{jt}$.

• Capital is a "quasi-fixed" input chosen in period $t - 1$ and subject to strictly convex adjustment costs.

• Labour for period $t$ is chosen before material inputs, but after capital has been chosen. Suppose labour is chosen at time $t - 0.5$.

• Unobserved productivity $\omega_{jt}$ is assumed to follow a first-order Markov process between the subperiods $t - 1$, $t - 0.5$, and $t$:

$$p(\omega_{jt} \mid I_{j,t-0.5}) = p(\omega_{jt} \mid \omega_{j,t-0.5}),$$

and

$$p(\omega_{j,t-0.5} \mid I_{j,t-1}) = p(\omega_{j,t-0.5} \mid \omega_{j,t-1}),$$

where $I_{j,s}$ here denotes the firm's information set at time $s$ (not investment).

• Capital for period $t$ production is decided in view of $\omega_{j,t-1}$ and the firm's capital in $t - 1$.

• Labour for period $t$ production is decided in view of $\omega_{j,t-0.5}$ and the firm's capital in $t$ (which is already known at this point). This may be quite realistic: labour decisions need to be made in advance, since new workers need to be trained and workers to be laid off will have to be given some period of notice.

• Materials for period $t$ production is decided in view of $\omega_{jt}$ and the firm's capital and labour in period $t$, both of which are known at this point:

$$m_{jt} = f_t(\omega_{jt}, k_{jt}, l_{jt}).$$

• Key "trick" in ACF. Under the assumption that materials is a strictly monotonic (increasing, to be consistent with the theory) function of $\omega_{jt}$, conditional on capital and labour, we can invert this function for $\omega_{jt}$, along the same lines as in OP:

$$\omega_{jt} = f_t^{-1}(m_{jt}, k_{jt}, l_{jt}),$$

and so we rewrite the value-added function

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + \omega_{jt} + \eta_{jt}$$

as

$$y_{jt} = \beta_k k_{jt} + \beta_l l_{jt} + f_t^{-1}(m_{jt}, k_{jt}, l_{jt}) + \eta_{jt}.$$

Note the close similarity with OP: by including the function $f_t^{-1}(m_{jt}, k_{jt}, l_{jt})$ as an additional term on the right-hand side, we have effectively "controlled" for unobserved productivity. The remaining residual $\eta_{jt}$ is innocuous since it has no impact on factor inputs (e.g. because it's simply measurement error in output).

• The snag here is that no parameter of interest can be identified based on this specification. But don't let that distract you. The key goal in the first stage is to get rid of the $\eta_{jt}$ term - why this is desirable will become clearer later.

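• First stage. Regress log value added on a polynomial function of capital, labour and raw materials, e.g.

$$
\begin{aligned}
y_{jt} = {} & \alpha_{0t} + \alpha_{1t} k_{jt} + \alpha_{2t} l_{jt} + \alpha_{3t} m_{jt} \\
& + \alpha_{4t} k_{jt}^2 + \alpha_{5t} l_{jt}^2 + \alpha_{6t} m_{jt}^2 \\
& + \alpha_{7t} k_{jt} m_{jt} + \alpha_{8t} l_{jt} m_{jt} + \alpha_{9t} k_{jt} l_{jt} + \eta_{jt},
\end{aligned}
$$

using OLS. The estimated $\alpha$-parameters are not the parameters of interest.

• Now define

$$
\begin{aligned}
\Phi_t &= \beta_k k_{jt} + \beta_l l_{jt} + f_t^{-1}(m_{jt}, k_{jt}, l_{jt}), \\
\Phi_t &= \beta_k k_{jt} + \beta_l l_{jt} + \omega_{jt},
\end{aligned}
$$

which represents log value added, net of the term $\eta_{jt}$. Having estimated the first-stage regression, we can thus estimate $\Phi_t$ simply by requesting the predicted values.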

• The next task is to decompose unobserved productivity:

$$\omega_{jt} = E_{t-1}(\omega_{jt}) + \xi_{jt},$$

or

$$\omega_{jt} = E(\omega_{jt} \mid \omega_{j,t-1}) + \xi_{jt}$$

(remember the first-order Markov property), where $\xi_{jt}$ is independent of all information known in period $t - 1$.

• Given the timing assumption that capital was decided in period $t - 1$, it must then be that

$$E[\xi_{jt} k_{jt}] = 0, \tag{6}$$

which you will recognize as an orthogonality condition that can be used to estimate the parameters of interest.

• Labour on the other hand is chosen in $t - 0.5$, and so at that time part of $\xi_{jt}$ has been observed - hence, $l_{jt}$ is not uncorrelated with $\xi_{jt}$:

$$E[\xi_{jt} l_{jt}] \neq 0.$$

However, lagged labour, $l_{j,t-1}$, was chosen in period $t - 0.5 - 1$ (that is, $t - 1.5$), and so at that point nothing was known about the innovation to productivity in period $t$:

$$E[\xi_{jt} l_{j,t-1}] = 0. \tag{7}$$

• The two moments (6) and (7) can therefore be used to identify $\beta_k$ and $\beta_l$. This is what happens in the second stage.

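• Second stage. We have the following population moments:

$$
\begin{aligned}
E[\xi_{jt} k_{jt}] &= 0, \\
E[\xi_{jt} l_{j,t-1}] &= 0.
\end{aligned}
$$

• Provided we have a random sample, we can appeal to the analogy principle and replace population moments by sample moments. We can then obtain consistent estimates of $\beta_k$ and $\beta_l$ by minimizing the criterion function

$$
\left[\sum_{t=1}^{T} \sum_{j=1}^{N} \xi_{jt} \begin{pmatrix} k_{jt} \\ l_{j,t-1} \end{pmatrix}\right]' \cdot C \cdot \left[\sum_{t=1}^{T} \sum_{j=1}^{N} \xi_{jt} \begin{pmatrix} k_{jt} \\ l_{j,t-1} \end{pmatrix}\right],
$$

where the three factors have dimensions (1 x 2), (2 x 2) and (2 x 1), respectively, with respect to the parameters $\beta_k, \beta_l$. Since the model is exactly identified, the choice of $C$ is irrelevant - the minimum will always occur at zero.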

• The computation of $\xi_{jt}$. We saw above that

$$\Phi_t = \beta_k k_{jt} + \beta_l l_{jt} + \omega_{jt};$$

hence

$$\omega_{jt} = \Phi_t - \beta_k k_{jt} - \beta_l l_{jt}.$$

Also, remember that we have an estimate of $\Phi_t$ from the first stage; thus, conditional on the parameters $\beta_k, \beta_l$ we can compute $\omega_{jt}$.

• Moreover, remember that

$$\omega_{jt} = E(\omega_{jt} \mid \omega_{j,t-1}) + \xi_{jt},$$

which we write as a nonparametric regression:

$$\omega_{jt} = \varphi(\omega_{j,t-1}) + \xi_{jt}.$$

This suggests the following estimation recipe for the second stage:

1. Guess $\beta_k, \beta_l$ - denote these by $\beta_k^G, \beta_l^G$.

2. Compute

$$\omega^G_{jt} = \Phi_t - \beta_k^G k_{jt} - \beta_l^G l_{jt}$$

(remember $\Phi_t$ is fixed since it was estimated in the first stage).

3. Regress $\omega^G_{jt}$ on $\omega^G_{j,t-1}$ using some suitable technique - e.g. linear regression (perhaps allowing for a polynomial), or nonparametric techniques. Compute the productivity innovation, based on the results:

$$\xi^G_{jt} = \omega^G_{jt} - \varphi(\omega^G_{j,t-1}).$$

4. Compute the criterion function

$$
\left[\sum_{t=1}^{T} \sum_{j=1}^{N} \xi^G_{jt} \begin{pmatrix} k_{jt} \\ l_{j,t-1} \end{pmatrix}\right]' \cdot C \cdot \left[\sum_{t=1}^{T} \sum_{j=1}^{N} \xi^G_{jt} \begin{pmatrix} k_{jt} \\ l_{j,t-1} \end{pmatrix}\right].
$$

5. Check if this looks like the global minimum; if it does, then STOP (you have obtained your estimates); if not, judiciously change $\beta_k^G, \beta_l^G$ and go back to step 2 above.

Of course, you'd use some pre-programmed minimization routine to do this.

To get standard errors, the easiest procedure is probably to rely on bootstrapping (you should include the first stage as well).
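The recipe can be sketched end to end on a simulated panel. This is an illustration under strong simplifying assumptions, not ACF's own code: capital follows an exogenous AR(1) process, labour reacts to current productivity plus an i.i.d. shock, materials demand is linear, first-stage coefficients are pooled over time, the productivity process is linear, $C$ is the identity matrix, and a coarse grid search over $[0,1]^2$ stands in for a pre-programmed minimization routine. All parameter values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, burn = 2000, 10, 20
beta_k, beta_l, rho = 0.4, 0.6, 0.7        # assumed "true" values

# --- Simulated panel (all functional forms below are assumptions) ---
omega = np.zeros((N, T + burn))
k = np.zeros((N, T + burn))
for t in range(1, T + burn):
    omega[:, t] = rho * omega[:, t - 1] + rng.normal(scale=0.5, size=N)
    k[:, t] = 0.8 * k[:, t - 1] + rng.normal(size=N)   # unrelated to xi_t
omega, k = omega[:, burn:], k[:, burn:]                # drop burn-in periods
l = 0.3 * k + 0.8 * omega + rng.normal(size=(N, T))    # correlated with xi_t
m = omega + 0.2 * k + 0.3 * l                          # strictly increasing in omega
y = beta_k * k + beta_l * l + omega + rng.normal(scale=0.3, size=(N, T))

# --- First stage: OLS of y on a second-order polynomial in (k, l, m) ---
kv, lv, mv = k.ravel(), l.ravel(), m.ravel()
P = np.column_stack([np.ones(N * T), kv, lv, mv, kv**2, lv**2, mv**2,
                     kv * mv, lv * mv, kv * lv])
Phi_hat = (P @ np.linalg.lstsq(P, y.ravel(), rcond=None)[0]).reshape(N, T)

# --- Second stage: steps 1 to 5 of the recipe ---
def criterion(b_k, b_l):
    w = Phi_hat - b_k * k - b_l * l                    # step 2: implied productivity
    w_now = w[:, 1:].ravel() - w[:, 1:].mean()
    w_lag = w[:, :-1].ravel() - w[:, :-1].mean()
    slope = (w_lag @ w_now) / (w_lag @ w_lag)          # step 3: linear regression
    xi = w_now - slope * w_lag                         # implied innovation
    g = np.array([np.mean(xi * k[:, 1:].ravel()),      # sample analogue of (6)
                  np.mean(xi * l[:, :-1].ravel())])    # sample analogue of (7)
    return g @ g                                       # step 4 with C = identity

grid = np.linspace(0.0, 1.0, 51)                       # step 1: candidate guesses
values = np.array([[criterion(bk, bl) for bl in grid] for bk in grid])
i, j = np.unravel_index(np.argmin(values), values.shape)   # step 5
bk_hat, bl_hat = grid[i], grid[j]
print(f"ACF estimates: beta_k = {bk_hat:.2f}, beta_l = {bl_hat:.2f}")
```

The grid is restricted to the unit square partly for speed and partly because criteria of this kind can have more than one zero; with real data it is worth trying several starting values.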

Relation between ACF and DPD approach

• ACF discuss how their estimator compares with the type of estimator ("DPD") used by Blundell and Bond. They identify distinct advantages and disadvantages of both approaches.

• The main advantage of ACF:

  - Unobserved productivity can follow an arbitrary first-order Markov process. That is, ACF can accommodate a nonparametric process, such as $\omega_{jt} = f(\omega_{j,t-1}) + \xi_{jt}$. This is not possible with the DPD approach. ACF can do this because they recover the unobserved productivity term $\omega_{jt}$. First-stage estimation is important in this context, as this procedure eliminates the "irrelevant" part of the residual (e.g. output measurement error).

• The main advantage of DPD:

  - Easy to allow for firm fixed effects - i.e. unobserved time-invariant heterogeneity across firms. Recall we have suggested earlier in this course that this is arguably the main advantage of having panel data. This is not possible in the ACF approach, if one were modelling the dynamics of $\omega_{jt}$ nonparametrically.

• However, the estimators can be made very similar to each other - if we were using a linear AR1 model for productivity, so that $\omega_{jt} = \rho\,\omega_{j,t-1} + \xi_{jt}$, then the second stage of ACF would literally be the dynamic COMFAC model adopted by Blundell & Bond.
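The link to the dynamic COMFAC model can be made explicit by quasi-differencing. The derivation below is a sketch that uses only the value-added function and the linear AR1 assumption just stated.

$$
\begin{aligned}
y_{jt} &= \beta_k k_{jt} + \beta_l l_{jt} + \omega_{jt} + \eta_{jt}, \qquad \omega_{jt} = \rho\,\omega_{j,t-1} + \xi_{jt}, \\
y_{jt} - \rho\, y_{j,t-1} &= \beta_k (k_{jt} - \rho\, k_{j,t-1}) + \beta_l (l_{jt} - \rho\, l_{j,t-1}) + \xi_{jt} + (\eta_{jt} - \rho\,\eta_{j,t-1}), \\
y_{jt} &= \rho\, y_{j,t-1} + \beta_k k_{jt} - \rho \beta_k k_{j,t-1} + \beta_l l_{jt} - \rho \beta_l l_{j,t-1} + \xi_{jt} + (\eta_{jt} - \rho\,\eta_{j,t-1}).
\end{aligned}
$$

The coefficients on the lagged inputs equal minus $\rho$ times the coefficients on the current inputs, which is the common factor (COMFAC) restriction. In ACF the first stage has already removed $\eta_{jt}$, so the same quasi-differencing is applied to $\Phi_t$ rather than to $y_{jt}$.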

ERSA Training Workshop

Måns Söderbom

University of Gothenburg, Sweden

January 2009

Estimation of Production Functions with Micro Data

Example 1

OLS and FE results based on firm-level panel data on Ghanaian manufacturing firms

. xi: reg lvad lk lemp i.year, cluster(firm)
i.year            _Iyear_1992-2000    (naturally coded; _Iyear_1992 omitted)

Linear regression                                      Number of obs =    1438
                                                       F( 10,   246) =  143.11
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.7385
                                                       Root MSE      =  1.1832

                                 (Std. Err. adjusted for 247 clusters in firm)
------------------------------------------------------------------------------
             |               Robust
        lvad |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          lk |   .2926854    .034091     8.59   0.000      .225538    .3598328
        lemp |   .7983413   .0644447    12.39   0.000     .6714075    .9252751
 _Iyear_1993 |   .0893364   .0910193     0.98   0.327    -.0899401     .268613
 _Iyear_1994 |   .1384356    .104344     1.33   0.186    -.0670859    .3439571
 _Iyear_1995 |   .5282265   .0917705     5.76   0.000     .3474704    .7089825
 _Iyear_1996 |   .3443942   .1054644     3.27   0.001     .1366657    .5521226
 _Iyear_1997 |  -.0112335   .1326423    -0.08   0.933     -.272493     .250026
 _Iyear_1998 |   .0657469   .1221722     0.54   0.591      -.17489    .3063838
 _Iyear_1999 |   .2862481   .1229291     2.33   0.021     .0441203     .528376
 _Iyear_2000 |   .2987483   .1240461     2.41   0.017     .0544203    .5430764
       _cons |   4.662286   .2365325    19.71   0.000     4.196399    5.128173
------------------------------------------------------------------------------

. test lk+lemp=1

 ( 1)  lk + lemp = 1

       F(  1,   246) =    4.93
            Prob > F =    0.0274

. xi: xtreg lvad lk lemp i.year, fe cluster(firm)
i.year            _Iyear_1992-2000    (naturally coded; _Iyear_1992 omitted)

Fixed-effects (within) regression               Number of obs      =      1438
Group variable: firm                            Number of groups   =       247

R-sq:  within  = 0.0938                         Obs per group: min =         1
       between = 0.8142                                        avg =       5.8
       overall = 0.7074                                        max =         9

                                                F(10,246)          =     10.87
corr(u_i, Xb)  = 0.7317                         Prob > F           =    0.0000

                                 (Std. Err. adjusted for 247 clusters in firm)
------------------------------------------------------------------------------
             |               Robust
        lvad |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          lk |   .0481399   .1981945     0.24   0.808    -.3422347    .4385146
        lemp |   .5240789   .1008461     5.20   0.000      .325447    .7227108
 _Iyear_1993 |   .1918263   .0920392     2.08   0.038     .0105409    .3731117
 _Iyear_1994 |   .2632723    .108263     2.43   0.016     .0500317    .4765129
 _Iyear_1995 |   .5801393    .101955     5.69   0.000     .3793231    .7809555
 _Iyear_1996 |   .4213335   .1173489     3.59   0.000     .1901968    .6524702
 _Iyear_1997 |   .0633159   .1362419     0.46   0.643    -.2050336    .3316654
 _Iyear_1998 |   .1064584   .1312125     0.81   0.418    -.1519848    .3649016
 _Iyear_1999 |   .3280545   .1301013     2.52   0.012     .0717999    .5843091
 _Iyear_2000 |   .3077651   .1381064     2.23   0.027     .0357432     .579787
       _cons |     8.0316   1.989564     4.04   0.000     4.112848    11.95035
-------------+----------------------------------------------------------------
     sigma_u |  1.4748081
     sigma_e |  .87081827
         rho |   .7414847   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. test lk+lemp=1

 ( 1)  lk + lemp = 1

       F(  1,   246) =    4.51
            Prob > F =    0.0348

Figure 1: Investment Policy for Quadratic Adjustment Costs Only

Figure 2: Investment Policy for Partial Irreversibility Only

Figure 3: Investment Policy for Fixed Adjustment Costs Only

[Figures not reproduced. Axis labels in each panel: const1*Zt/Kt-1 and It/Kt, with tick marks running from -1 to 1.]

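Example 3: OP estimate of the labour coefficient – BB data

use usbal89.dta, clear
tsset id year
* ik: first difference of k, used below as the investment term
ge ik=d.k
* second- and third-order polynomial terms in ik and k
ge ikk=ik*k
ge ik2=ik^2
ge k2=k^2
ge ik3=ik^3
ge k3=k^3
ge ik2k=ik2*k
ge ikk2=ik*k2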

a) OLS, no OP correction for endogeneity

. xi: reg y k n i.year, robust cluster(id)
i.year            _Iyear_1982-1989    (naturally coded; _Iyear_1982 omitted)

Linear regression                                      Number of obs =    4072
                                                       F(  9,   508) = 2507.63
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.9693
                                                       Root MSE      =  .35256

                                   (Std. Err. adjusted for 509 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           k |   .4322828   .0274846    15.73   0.000     .3782853    .4862803
           n |   .5578836   .0308763    18.07   0.000     .4972227    .6185445
 _Iyear_1983 |  -.0568626   .0083657    -6.80   0.000    -.0732982   -.0404269
 _Iyear_1984 |   -.050041   .0110933    -4.51   0.000    -.0718355   -.0282465
 _Iyear_1985 |  -.0875714   .0135255    -6.47   0.000    -.1141442   -.0609987
 _Iyear_1986 |   -.092866    .016461    -5.64   0.000     -.125206   -.0605259
 _Iyear_1987 |  -.0580931   .0174944    -3.32   0.001    -.0924634   -.0237228
 _Iyear_1988 |  -.0211632   .0185846    -1.14   0.255    -.0576754     .015349
 _Iyear_1989 |  -.0382923    .020265    -1.89   0.059    -.0781058    .0015213
       _cons |   3.046843   .0915369    33.29   0.000     2.867005     3.22668
------------------------------------------------------------------------------

b) OP with linear terms only, interacted with time

. xi: reg y i.year*k i.year|ik n , robust cluster(id)
i.year            _Iyear_1982-1989    (naturally coded; _Iyear_1982 omitted)
i.year*k          _IyeaXk_#           (coded as above)
i.year|ik         _IyeaXik_#          (coded as above)

Linear regression                                      Number of obs =    3563
                                                       F( 21,   508) = 1127.34
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.9692
                                                       Root MSE      =  .35206

                                   (Std. Err. adjusted for 509 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 _Iyear_1983 |  -.0117025   .0397574    -0.29   0.769    -.0898116    .0664066
 _Iyear_1984 |  -.0003504   .0386859    -0.01   0.993    -.0763545    .0756537
 _Iyear_1985 |   .0087549   .0338052     0.26   0.796    -.0576604    .0751701
 _Iyear_1986 |  -.0039311   .0271325    -0.14   0.885    -.0572368    .0493746
 _Iyear_1987 |  -.0064952   .0213817    -0.30   0.761    -.0485027    .0355122
 _Iyear_1988 |  (dropped)
 _Iyear_1989 |  -.0085923   .0218335    -0.39   0.694    -.0514875    .0343028
           k |   .4305507   .0278018    15.49   0.000       .37593    .4851714
_IyeaXk_1983 |   .0003229   .0039407     0.08   0.935    -.0074191    .0080649
_IyeaXk_1984 |  (dropped)
_IyeaXk_1985 |  -.0045234   .0034495    -1.31   0.190    -.0113004    .0022536
_IyeaXk_1986 |  -.0071593   .0049993    -1.43   0.153    -.0169812    .0026626
_IyeaXk_1987 |   .0011513   .0052254     0.22   0.826    -.0091146    .0114173
_IyeaXk_1988 |   .0069001   .0056709     1.22   0.224    -.0042412    .0180414
_IyeaXk_1989 |   .0056234   .0059086     0.95   0.342    -.0059849    .0172316
          ik |  -.1679462   .1046535    -1.60   0.109    -.3735531    .0376607
_IyeaXi~1983 |   .2370482   .1352143     1.75   0.080    -.0285999    .5026963
_IyeaXi~1984 |   .2041958   .1285209     1.59   0.113     -.048302    .4566937
_IyeaXi~1985 |  (dropped)
_IyeaXi~1986 |   .2056813   .1355581     1.52   0.130    -.0606422    .4720048
_IyeaXi~1987 |   .1325166   .1263455     1.05   0.295    -.1157073    .3807406
_IyeaXi~1988 |   .1283877    .148174     0.87   0.387    -.1627217     .419497
_IyeaXi~1989 |    .124821   .1406473     0.89   0.375    -.1515009     .401143
           n |   .5609091   .0303342    18.49   0.000     .5013132     .620505
       _cons |   2.996726   .0999766    29.97   0.000     2.800308    3.193145
------------------------------------------------------------------------------

c) OP, polynomial terms up to 3rd order, interacted with time

. xi: reg y i.year*k i.year|ik i.year|ikk i.year|ik2 i.year|k2 i.year|ik3
  i.year|k3 i.year|ik2k i.year|ikk2 n , robust cluster(id)
i.year            _Iyear_1982-1989    (naturally coded; _Iyear_1982 omitted)
i.year*k          _IyeaXk_#           (coded as above)
i.year|ik         _IyeaXik_#          (coded as above)
i.year|ikk        _IyeaXikk_#         (coded as above)
i.year|ik2        _IyeaXik2_#         (coded as above)
i.year|k2         _IyeaXk2_#          (coded as above)
i.year|ik3        _IyeaXik3_#         (coded as above)
i.year|k3         _IyeaXk3_#          (coded as above)
i.year|ik2k       _IyeaXik2a#         (coded as above)
i.year|ikk2       _IyeaXikka#         (coded as above)

Linear regression                                      Number of obs =    3563
                                                       F( 70,   508) =  628.25
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.9708
                                                       Root MSE      =  .34482

                                   (Std. Err. adjusted for 509 clusters in id)
------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 _Iyear_1983 |  -.1269729   .1447766    -0.88   0.381    -.4114075    .1574616
 _Iyear_1984 |   .0142485   .1417061     0.10   0.920    -.2641536    .2926507
 _Iyear_1985 |   -.125383   .1042455    -1.20   0.230    -.3301883    .0794223
 _Iyear_1986 |  -.0997699   .1192445    -0.84   0.403    -.3340429    .1345031
 _Iyear_1987 |  (dropped)
 _Iyear_1988 |   .1148153   .1008125     1.14   0.255    -.0832454     .312876
 _Iyear_1989 |   .0660589   .1243963     0.53   0.596    -.1783356    .3104534
           k |  (dropped)
_IyeaXk_1983 |   .4731363   .0809219     5.85   0.000     .3141534    .6321191
_IyeaXk_1984 |   .4266294   .0900538     4.74   0.000     .2497057    .6035532
_IyeaXk_1985 |   .4882909   .0942897     5.18   0.000     .3030452    .6735366
_IyeaXk_1986 |   .4260074   .0949073     4.49   0.000     .2395482    .6124665
_IyeaXk_1987 |   .3817328   .0918138     4.16   0.000     .2013512    .5621144
_IyeaXk_1988 |   .3188845   .0938174     3.40   0.001     .1345666    .5032024
_IyeaXk_1989 |    .341171    .101227     3.37   0.001     .1422958    .5400461
          ik |   .4519549   .6751802     0.67   0.504    -.8745344    1.778444
_IyeaXik_1~3 |    .190244   .7111746     0.27   0.789    -1.206961    1.587449
_IyeaXik_1~4 |  -.7517249   .8771012    -0.86   0.392    -2.474917    .9714673
_IyeaXik_1~5 |  (dropped)
_IyeaXik_1~6 |  -.5032231   .8679689    -0.58   0.562    -2.208474    1.202027
_IyeaXik_1~7 |  -1.070335   .8999296    -1.19   0.235    -2.838377    .6977074
_IyeaXik_1~8 |  -.5777575   .8575433    -0.67   0.501    -2.262525     1.10701
_IyeaXik_1~9 |  -.7952576   .8732094    -0.91   0.363    -2.510804    .9202887
         ikk |   .2549711   .2312303     1.10   0.271    -.1993143    .7092564
_IyeaXikk_~3 |  -.6242853   .3032464    -2.06   0.040    -1.220057   -.0285138
_IyeaXikk_~4 |  -.1361169   .3207742    -0.42   0.671    -.7663243    .4940905
_IyeaXikk_~5 |  -.5145717   .3433347    -1.50   0.135    -1.189102     .159959
_IyeaXikk_~6 |  -.0358771   .3145985    -0.11   0.909    -.6539514    .5821971
_IyeaXikk_~7 |   .1029058   .3823818     0.27   0.788    -.6483386    .8541503
_IyeaXikk_~8 |  -.1844971   .2655423    -0.69   0.488    -.7061934    .3371993
_IyeaXikk_~9 |  (dropped)


ik2 | .0286008 .6650747 0.04 0.966 -1.278035 1.335236

_IyeaXik2_~3 | .1570187 .715695 0.22 0.826 -1.249068 1.563105

_IyeaXik2_~4 | -.1403274 .7453135 -0.19 0.851 -1.604604 1.323949

_IyeaXik2_~5 | (dropped)

_IyeaXik2_~6 | .7096964 .7661279 0.93 0.355 -.7954727 2.214866

_IyeaXik2_~7 | .4335649 .8305117 0.52 0.602 -1.198095 2.065225

_IyeaXik2_~8 | -.0142493 .8681223 -0.02 0.987 -1.719801 1.691303

_IyeaXik2_~9 | 1.113024 .8462974 1.32 0.189 -.5496499 2.775698

k2 | -.0228113 .016048 -1.42 0.156 -.0543398 .0087173

_IyeaXk2_1~3 | -.0049018 .0093979 -0.52 0.602 -.0233655 .0135618

_IyeaXk2_1~4 | (dropped)

_IyeaXk2_1~5 | -.0073795 .0095139 -0.78 0.438 -.0260709 .0113119

_IyeaXk2_1~6 | .0075806 .0131017 0.58 0.563 -.0181596 .0333208

_IyeaXk2_1~7 | .0157766 .0113246 1.39 0.164 -.0064723 .0380255

_IyeaXk2_1~8 | .0279305 .0149167 1.87 0.062 -.0013755 .0572365

_IyeaXk2_1~9 | .022291 .0181463 1.23 0.220 -.01336 .0579421

ik3 | -.6825286 .2868345 -2.38 0.018 -1.246056 -.1190008

_IyeaXik3_~3 | .5448936 .2999866 1.82 0.070 -.0444736 1.134261

_IyeaXik3_~4 | .7374314 .3611197 2.04 0.042 .0279594 1.446903

_IyeaXik3_~5 | .5114732 .3572593 1.43 0.153 -.1904143 1.213361

_IyeaXik3_~6 | (dropped)

_IyeaXik3_~7 | .3755959 .3004034 1.25 0.212 -.21459 .9657818

_IyeaXik3_~8 | .9349795 .3288258 2.84 0.005 .2889535 1.581006

_IyeaXik3_~9 | .051696 .305618 0.17 0.866 -.5487349 .6521268

k3 | .0022769 .0009018 2.52 0.012 .0005053 .0040486

_IyeaXk3_1~3 | (dropped)

_IyeaXk3_1~4 | -.00012 .0004725 -0.25 0.800 -.0010482 .0008083

_IyeaXk3_1~5 | .0001134 .0006357 0.18 0.858 -.0011356 .0013624

_IyeaXk3_1~6 | -.0008324 .0007789 -1.07 0.286 -.0023627 .0006979

_IyeaXk3_1~7 | -.0012529 .0007453 -1.68 0.093 -.0027172 .0002114

_IyeaXk3_1~8 | -.0019506 .0009004 -2.17 0.031 -.0037196 -.0001815

_IyeaXk3_1~9 | -.0015104 .0010739 -1.41 0.160 -.0036202 .0005994

ik2k | .2063111 .1328624 1.55 0.121 -.0547163 .4673386

_IyeaXik2a~3 | -.2057004 .1397434 -1.47 0.142 -.4802466 .0688457

_IyeaXik2a~4 | -.23376 .1737731 -1.35 0.179 -.5751625 .1076424

_IyeaXik2a~5 | -.1579847 .1739745 -0.91 0.364 -.4997828 .1838134

_IyeaXik2a~6 | -.408215 .1566099 -2.61 0.009 -.7158978 -.1005321

_IyeaXik2a~7 | -.3061767 .1599863 -1.91 0.056 -.6204929 .0081394

_IyeaXik2a~8 | (dropped)

_IyeaXik2a~9 | -.3404561 .13897 -2.45 0.015 -.6134827 -.0674294

ikk2 | .0513622 .0248273 2.07 0.039 .0025853 .1001391

_IyeaXikka~3 | (dropped)

_IyeaXikka~4 | -.0542164 .0271281 -2.00 0.046 -.1075135 -.0009194

_IyeaXikka~5 | -.0298157 .0324681 -0.92 0.359 -.0936039 .0339725

_IyeaXikka~6 | -.0789603 .0330882 -2.39 0.017 -.1439668 -.0139538

_IyeaXikka~7 | -.0868584 .0349478 -2.49 0.013 -.1555184 -.0181985

_IyeaXikka~8 | -.0577613 .0301529 -1.92 0.056 -.117001 .0014783

_IyeaXikka~9 | -.0777462 .0320057 -2.43 0.015 -.1406262 -.0148663

n | .5939467 .0296368 20.04 0.000 .535721 .6521725

_cons | 3.194397 .1735615 18.40 0.000 2.85341 3.535384

------------------------------------------------------------------------------
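For reference, the regression in (c) can be read as the first stage of the OP procedure. The display below is a sketch under stated assumptions rather than a formula taken from the slides: it assumes that `n` is the labour input (written l in the introduction), that `ik` is the investment variable, and that names such as `ikk`, `ik2` and `ik2k` denote the products and powers ik·k, ik^2 and ik^2·k. The residual symbol \eta_{jt} and the coefficients \gamma_{pq,t} are notation introduced here for illustration.

```latex
% First stage as implied by the regressor list (the reading of variable names is an assumption)
y_{jt} = \beta_n n_{jt} + \phi_t(ik_{jt}, k_{jt}) + \eta_{jt},
\qquad
\phi_t(ik, k) \approx \sum_{(p,q) \in S} \gamma_{pq,t}\, ik^{p} k^{q}
```

Here S collects the exponent pairs of the included terms, and the t subscript on \gamma reflects the interactions with the year dummies. Only the coefficient on `n` (0.594 in the output above) is pinned down at this stage; the capital coefficient is absorbed into \phi_t and has to be recovered in a second step.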


Example 4: OP estimate of the capital coefficient – BB data

Based on 2nd order polynomial in 1st stage (see (c) in Example 3)

/* 2nd order polynomial terms , interacted with time */

xi: reg y i.year*k i.year|ik i.year|ikk i.year|ik2 i.year|k2 i.year|ik3 i.year|k3 i.year|ik2k i.year|ikk2 n, robust cluster(id)

(output omitted)

predict predy, xb

ge phihat=predy-_b[n]*n

ge phihat_1=l.phihat

keep if e(sample)==1

drop if phihat_1==. | k_1==.

ge y_bn=y-_b[n]*n
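In symbols, the data steps above construct the objects shown below. This is a sketch using notation introduced here for illustration (\hat{y} for the first-stage fitted value `predy`, \tilde{y} for `y_bn`). It also assumes that the panel has already been declared (for example with `tsset`), so that `l.phihat` is a one-period lag, and that `k_1` holds lagged capital.

```latex
% phihat: fitted value net of the estimated labour contribution
\hat{\phi}_{jt} = \hat{y}_{jt} - \hat{\beta}_n n_{jt},
\qquad
% phihat_1: its one-period lag
\hat{\phi}_{j,t-1} = L\,\hat{\phi}_{jt},
\qquad
% y_bn: output net of the estimated labour contribution
\tilde{y}_{jt} = y_{jt} - \hat{\beta}_n n_{jt}
```

Dropping observations with a missing lag reduces the sample from the 3563 observations reported for regression (c) in Example 3 to the 3054 used in the next step. The difference of 509 equals the number of clusters, which is consistent with each firm losing its first observation.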

. nl (y_bn = {b0} + {bk} * k + {bm}*(phihat_1 - {bk} * k_1) + {bm2}*(phihat_1 - {bk} * k_1)^2), initial(b0 3 bk 0.3 bm 0.1 bm2 0)

(obs = 3054)

Iteration 0: residual SS = 404.649

Iteration 1: residual SS = 364.3411

Iteration 2: residual SS = 359.907

Iteration 3: residual SS = 359.6591

Iteration 4: residual SS = 359.6583

Iteration 5: residual SS = 359.6583

Iteration 6: residual SS = 359.6583

Iteration 7: residual SS = 359.6583

Source | SS df MS

-------------+------------------------------ Number of obs = 3054

Model | 2474.18542 3 824.728475 R-squared = 0.8731

Residual | 359.658315 3050 .117920759 Adj R-squared = 0.8730

-------------+------------------------------ Root MSE = .3433959

Total | 2833.84374 3053 .928216095 Res. dev. = 2134.209

------------------------------------------------------------------------------

y_bn | Coef. Std. Err. t P>|t| [95% Conf. Interval]

-------------+----------------------------------------------------------------

/b0 | 12.81522 2.699174 4.75 0.000 7.522835 18.1076

/bk | .3875877 .0072252 53.64 0.000 .3734209 .4017544

/bm | -6.683843 1.653108 -4.04 0.000 -9.925161 -3.442526

/bm2 | 1.148054 .2538301 4.52 0.000 .6503589 1.645749

------------------------------------------------------------------------------

Parameter b0 taken as constant term in model & ANOVA table
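Written out, the `nl` command above fits the second-stage equation below. This is a transcription of the command into notation, not a formula quoted from the slides: \tilde{y}_{jt} stands for `y_bn` (output net of the estimated labour contribution), \hat{\phi}_{j,t-1} for `phihat_1`, and \xi_{jt} for the regression error; all three symbols are introduced here for illustration.

```latex
\tilde{y}_{jt} = b_0 + b_k k_{jt}
  + b_m \left( \hat{\phi}_{j,t-1} - b_k k_{j,t-1} \right)
  + b_{m2} \left( \hat{\phi}_{j,t-1} - b_k k_{j,t-1} \right)^2
  + \xi_{jt}
```

The term in parentheses is the implied lagged productivity (up to a constant), and the quadratic in it approximates the unknown function describing how productivity evolves. Because b_k appears both on current capital and inside the productivity term, the equation is nonlinear in the parameters, which is why `nl` is used rather than `reg`. In the output above, the capital coefficient `/bk` is estimated at 0.388 with a reported standard error of 0.007.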