The Linear Regression Model with Panel Data
February 16, 2011
Summary
Readings: Chapter 7 of textbook.
I will cover the pooled and individual effects models.
A very popular version of the individual effects model, the stochastic frontier model, will also be discussed.
Computational tools: Gibbs sampler with data augmentation.
Notation
$y_{it}$ and $\varepsilon_{it}$ denote the $t$th observations (for $t = 1, \dots, T$) of the dependent variable and error, respectively, for the $i$th individual ($i = 1, \dots, N$).
$y_i$ and $\varepsilon_i$ denote the vectors of $T$ observations on the dependent variable and error, respectively, for the $i$th individual.
Sometimes it is important to distinguish between the intercept and slope coefficients.
Hence, define $X_i$ to be a $T \times k$ matrix containing $T$ observations on each of the $k$ explanatory variables (including the intercept) for the $i$th individual. $\tilde{X}_i$ is the $T \times (k - 1)$ matrix equal to $X_i$ with the intercept removed.
If we stack observations for all $N$ individuals together, we obtain the $TN$-vectors:
$$y = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix} \quad \text{and} \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_N \end{bmatrix}$$
Similarly, stacking observations on all explanatory variables together yields the $TN \times k$ matrix:
$$X = \begin{bmatrix} X_1 \\ \vdots \\ X_N \end{bmatrix}$$
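To make the stacking concrete, here is a minimal NumPy sketch; the names y_list and X_list and the simulated values are illustrative, not from the lecture:

```python
import numpy as np

# Illustrative dimensions: N individuals, T periods, k regressors.
N, T, k = 3, 5, 2

rng = np.random.default_rng(0)
y_list = [rng.normal(size=T) for _ in range(N)]         # y_i: T-vectors
X_list = [np.column_stack([np.ones(T),                  # intercept column
                           rng.uniform(size=(T, k - 1))])
          for _ in range(N)]                            # X_i: T x k

# Stack into the TN-vector y and the TN x k matrix X.
y = np.concatenate(y_list)      # shape (T*N,)
X = np.vstack(X_list)           # shape (T*N, k)
```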
The Pooled Model
Assume the same regression relationship holds for every individual:
$$y_i = X_i \beta + \varepsilon_i,$$
for $i = 1, \dots, N$, where $\beta$ is the $k$-vector of regression coefficients, including the intercept.
This is just a linear regression model of the sort discussed in previous lectures.
No new issues arise.
Individual Effects Models
The model is of the form:
$$y_{it} = \alpha_i + \beta x_{it} + \varepsilon_{it}$$
Different intercept for every individual ("individual effect")
Same slope for every individual
The Likelihood Function
The likelihood function is based on the regression equation:
$$y_i = \alpha_i \iota_T + \tilde{X}_i \tilde{\beta} + \varepsilon_i$$
where $\iota_T$ is a $T \times 1$ vector of ones.
Properties of the multivariate Normal distribution imply a likelihood function of the form:
$$p(y \mid \alpha, \tilde{\beta}, h) = \prod_{i=1}^{N} \frac{h^{T/2}}{(2\pi)^{T/2}} \exp\left[-\frac{h}{2}\left(y_i - \alpha_i \iota_T - \tilde{X}_i \tilde{\beta}\right)'\left(y_i - \alpha_i \iota_T - \tilde{X}_i \tilde{\beta}\right)\right]$$
where $\alpha = (\alpha_1, \dots, \alpha_N)'$ and $h$ is the error precision.
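On the computational side, this likelihood is most conveniently evaluated on the log scale. A minimal sketch, assuming the per-individual data are stored as lists of arrays (function and variable names are illustrative):

```python
import numpy as np

def log_likelihood(alpha, beta_tilde, h, y_list, Xt_list):
    """Log of p(y | alpha, beta_tilde, h) for the individual effects model.

    alpha      : length-N array of individual intercepts
    beta_tilde : (k-1)-vector of slope coefficients
    h          : error precision
    y_list     : list of N arrays y_i, each of length T
    Xt_list    : list of N arrays X_i-tilde, each T x (k-1), no intercept
    """
    ll = 0.0
    for a_i, y_i, Xt_i in zip(alpha, y_list, Xt_list):
        T = len(y_i)
        resid = y_i - a_i - Xt_i @ beta_tilde
        ll += 0.5 * T * (np.log(h) - np.log(2 * np.pi)) \
              - 0.5 * h * resid @ resid
    return ll
```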
The Prior
A Non-hierarchical Prior
Any sort of prior can be used, including a noninformative one.
Here we consider two types of priors which are computationally simple and commonly used.
The individual effects model can be written as:
$$y = X^* \beta^* + \varepsilon$$
where $X^*$ is a $TN \times (N + k - 1)$ matrix given by
$$X^* = \begin{bmatrix} \iota_T & 0_T & \cdots & 0_T & \tilde{X}_1 \\ 0_T & \iota_T & \cdots & 0_T & \tilde{X}_2 \\ \vdots & & \ddots & \vdots & \vdots \\ 0_T & 0_T & \cdots & \iota_T & \tilde{X}_N \end{bmatrix}$$
and
$$\beta^* = \begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_N \\ \tilde{\beta} \end{bmatrix}$$
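The block structure of $X^*$ is easy to build in code via a Kronecker product. A sketch, with illustrative names:

```python
import numpy as np

def build_X_star(Xt_list):
    """Build X* = [individual dummies | stacked X_i-tilde] for the
    individual effects model written as one big regression.

    Xt_list : list of N arrays, each T x (k-1) (intercept removed).
    Returns a TN x (N + k - 1) matrix.
    """
    N = len(Xt_list)
    T = Xt_list[0].shape[0]
    dummies = np.kron(np.eye(N), np.ones((T, 1)))  # diagonal blocks of iota_T
    Xt = np.vstack(Xt_list)                        # TN x (k-1)
    return np.hstack([dummies, Xt])
```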
The individual effects model can thus be written as a regression model (with individual dummy variables).
Use an independent Normal-Gamma prior (but one could also use the natural conjugate prior):
$$\beta^* \sim N(\underline{\beta}^*, \underline{V}^*)$$
$$h \sim G(\underline{s}^{-2}, \underline{\nu})$$
A Hierarchical Prior
Hierarchical priors are popular in many cases with high-dimensional parameter spaces (such as the individual effects model).
Consider the prior:
$$\alpha_i \sim N(\mu_\alpha, V_\alpha)$$
with $\alpha_i$ and $\alpha_j$ independent of one another for $i \neq j$.
The hierarchical structure of the prior arises if we treat $\mu_\alpha$ and $V_\alpha$ as unknown parameters which require their own priors.
We assume $\mu_\alpha$ and $V_\alpha$ to be independent of one another, with:
$$\mu_\alpha \sim N(\underline{\mu}_\alpha, \underline{\sigma}_\alpha^2)$$
and
$$V_\alpha^{-1} \sim G(\underline{V}_\alpha^{-1}, \underline{\nu}_\alpha)$$
The hierarchical prior assumes all intercepts are drawn from the same distribution.
This extra structure (if consistent with patterns in the data) allows for more accurate estimation.
For the remaining parameters, we assume a non-hierarchical prior of the independent Normal-Gamma variety:
$$\tilde{\beta} \sim N(\underline{\beta}, \underline{V}_\beta)$$
and
$$h \sim G(\underline{s}^{-2}, \underline{\nu})$$
This model is analogous to the frequentist random effects model.
Bayesian Computation
Posterior Inference under the Hierarchical Prior
Under the non-hierarchical prior, we have a linear regression model with an independent Normal-Gamma prior. Hence, posterior inference can be carried out using the methods of Chapter 4.
Under the hierarchical prior, a Gibbs sampler can be used.
The relevant full conditional posterior distributions are:
$$\tilde{\beta} \mid y, h, \alpha, \mu_\alpha, V_\alpha \sim N(\overline{\beta}, \overline{V}_\beta)$$
$$h \mid y, \tilde{\beta}, \alpha, \mu_\alpha, V_\alpha \sim G(\overline{s}^{-2}, \overline{\nu})$$
$$\alpha_i \mid y, \tilde{\beta}, h, \mu_\alpha, V_\alpha \sim N(\overline{\alpha}_i, \overline{V}_i)$$
$$\mu_\alpha \mid y, \tilde{\beta}, h, \alpha, V_\alpha \sim N(\overline{\mu}_\alpha, \overline{\sigma}_\alpha^2)$$
$$V_\alpha^{-1} \mid y, \tilde{\beta}, h, \alpha, \mu_\alpha \sim G(\overline{V}_\alpha^{-1}, \overline{\nu}_\alpha)$$
where the formulae for the arguments of these densities are given in the textbook, pages 152-154.
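A minimal sketch of one sweep of this Gibbs sampler is given below. It assumes the standard conjugate conditional-update formulae (the exact expressions are in the textbook, pp. 152-154) and Koop's Gamma parametrization $G(\mu, \nu)$ with mean $\mu$ and degrees of freedom $\nu$; all names, the dictionary layout, and the hyperparameter labels are illustrative assumptions:

```python
import numpy as np

def draw_koop_gamma(mu, nu, rng):
    """Draw from Koop's G(mu, nu): mean mu, degrees of freedom nu."""
    return rng.gamma(shape=nu / 2.0, scale=2.0 * mu / nu)

def gibbs_sweep(state, data, prior, rng):
    """One sweep over the five full conditionals (sketch only;
    exact formulae are in the textbook, pp. 152-154)."""
    alpha = state["alpha"]                     # length-N array of intercepts
    h, mu_a, V_a = state["h"], state["mu_a"], state["V_a"]
    y, Xt = data["y_list"], data["Xt_list"]    # per-individual arrays
    N, T = len(y), len(y[0])

    # 1. beta_tilde | y, h, alpha: regression with y_i - alpha_i as dependent.
    XtX = sum(X.T @ X for X in Xt)
    Xty = sum(X.T @ (yi - ai) for X, yi, ai in zip(Xt, y, alpha))
    V_bar = np.linalg.inv(prior["Vb_inv"] + h * XtX)
    b_bar = V_bar @ (prior["Vb_inv"] @ prior["b0"] + h * Xty)
    beta_t = rng.multivariate_normal(b_bar, V_bar)

    # 2. h | y, beta_tilde, alpha.
    sse = sum(float(r @ r) for r in
              (yi - ai - X @ beta_t for yi, ai, X in zip(y, alpha, Xt)))
    nu_bar = T * N + prior["nu0"]
    s2_bar = (sse + prior["nu0"] * prior["s2_0"]) / nu_bar
    h = draw_koop_gamma(1.0 / s2_bar, nu_bar, rng)

    # 3. alpha_i | rest: independent Normals across individuals.
    V_i = 1.0 / (1.0 / V_a + h * T)
    for i in range(N):
        m_i = V_i * (mu_a / V_a + h * np.sum(y[i] - Xt[i] @ beta_t))
        alpha[i] = rng.normal(m_i, np.sqrt(V_i))

    # 4. mu_alpha | alpha, V_alpha.
    s2_mu = 1.0 / (1.0 / prior["sig2_mu"] + N / V_a)
    m_mu = s2_mu * (prior["mu0"] / prior["sig2_mu"] + alpha.sum() / V_a)
    mu_a = rng.normal(m_mu, np.sqrt(s2_mu))

    # 5. V_alpha^{-1} | alpha, mu_alpha.
    nu_a = N + prior["nu_a0"]
    Va_bar = (np.sum((alpha - mu_a) ** 2) + prior["nu_a0"] * prior["Va0"]) / nu_a
    V_a = 1.0 / draw_koop_gamma(1.0 / Va_bar, nu_a, rng)

    return {"alpha": alpha, "beta_t": beta_t, "h": h, "mu_a": mu_a, "V_a": V_a}
```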
The derivations above are simple extensions of those for the Normal linear regression model.
The Gibbs sampler requires only random number generation from Normal and Gamma distributions.
Note: the random coefficients model is given by:
$$y_i = X_i \beta_i + \varepsilon_i$$
where $\beta_i$ varies across individuals.
Discussed in the textbook, pages 155-157. (A simple extension of the individual effects model, so I will not discuss it here.)
Efficiency Analysis and the Stochastic Frontier Model
To motivate the model, let the output of firm $i$ at time $t$, $Y_{it}$, be produced using a vector of inputs, $X^*_{it}$.
Firms have access to a common best-practice technology for turning inputs into output:
$$Y_{it} = f(X^*_{it}; \beta).$$
The production frontier measures the maximum amount of output that can be obtained from a given level of inputs.
The deviation of actual from maximum feasible output is a measure of inefficiency.
Formally:
$$Y_{it} = f(X^*_{it}; \beta)\tau_i$$
where $0 < \tau_i \leq 1$ is a measure of firm-specific efficiency, and $\tau_i = 1$ indicates that firm $i$ is fully efficient.
Example: $\tau_i = 0.75$ means that firm $i$ is producing only 75% of the output it could have if it were operating according to best-practice technology.
In this specification, we have assumed each firm has a particular efficiency level which is constant over time. This assumption can be relaxed.
Adding a random error, $\zeta_{it}$, to the model to capture measurement (or specification) error:
$$Y_{it} = f(X^*_{it}; \beta)\tau_i \zeta_{it}$$
It is common for $f(\cdot)$ to be log-linear (e.g. Cobb-Douglas or translog):
$$y_{it} = X_{it}\beta + \varepsilon_{it} - z_i$$
where $y_{it} = \ln(Y_{it})$, $\varepsilon_{it} = \ln(\zeta_{it})$, $z_i = -\ln(\tau_i)$, and $X_{it}$ is the counterpart of $X^*_{it}$ with the inputs transformed to logarithms.
Stacking into matrices:
$$y_i = X_i\beta + \varepsilon_i - z_i\iota_T$$
$z_i$ is referred to as the inefficiency.
Since $0 < \tau_i \leq 1$, $z_i$ is a non-negative random variable.
$X_{it}$ is assumed to contain an intercept, and $\beta_1$ is its coefficient.
Note that this model is of the form of an individual effects model: $\beta_1 - z_i$ plays the same role that $\alpha_i$ did in the individual effects model.
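A small runnable sketch of this change of variables, using a one-input Cobb-Douglas frontier with illustrative parameter values (nothing here is from the textbook's example):

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 4, 5
tau = rng.uniform(0.7, 1.0, size=N)               # firm efficiencies, 0 < tau <= 1
inputs = rng.uniform(1.0, 2.0, size=(N, T))       # a single raw input X*_it
zeta = np.exp(rng.normal(0.0, 0.1, size=(N, T)))  # measurement error

beta0, beta1 = 1.0, 0.5                           # Cobb-Douglas parameters
Y = np.exp(beta0) * inputs**beta1 * tau[:, None] * zeta  # frontier form

# Change of variables to the linear regression form:
y = np.log(Y)            # y_it = ln(Y_it)
x = np.log(inputs)       # logged input
z = -np.log(tau)         # inefficiency z_i = -ln(tau_i) >= 0
# Check: y_it = beta0 + beta1 * x_it + ln(zeta_it) - z_i
assert np.allclose(y, beta0 + beta1 * x + np.log(zeta) - z[:, None])
```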
Bayesian Inference in the Stochastic Frontier Model
Very similar to the individual effects model, so we will only sketch out the details.
The important new issue here is the inefficiency term, $z_i$, so we focus on that.
Hierarchical prior for the inefficiencies: since $z_i > 0$, we cannot use a Normal hierarchical prior.
Common choices include the truncated-Normal and members of the family of Gamma distributions.
Here we will use the exponential distribution (which is the Gamma with two degrees of freedom):
$$z_i \sim G(\mu_z, 2)$$
$\mu_z > 0$ requires a prior. We use:
$$\mu_z^{-1} \sim G(\underline{\mu}_z^{-1}, \underline{\nu}_z)$$
Now set up a Gibbs sampler.
Derive the full conditional posterior distributions similarly to the random effects model:
$$\beta \mid y, h, z, \mu_z \sim N(\overline{\beta}, \overline{V})$$
$$h \mid y, \beta, z, \mu_z \sim G(\overline{s}^{-2}, \overline{\nu})$$
$$p(z_i \mid y_i, X_i, \beta, h, \mu_z) \propto f_N\!\left(z_i \mid \overline{X}_i\beta - \overline{y}_i - (Th\mu_z)^{-1},\,(Th)^{-1}\right)1(z_i \geq 0)$$
$$\mu_z^{-1} \mid y, \beta, h, z \sim G(\overline{\mu}_z^{-1}, \overline{\nu}_z)$$
where $\overline{X}_i$ and $\overline{y}_i$ denote averages over the $T$ observations for individual $i$, and the formulae for the arguments of these densities are given in the book.
The Gibbs sampler involves drawing from Normal, truncated Normal and Gamma distributions, all straightforward to do.
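The only nonstandard step is the draw of $z_i$. A sketch using scipy.stats.truncnorm, assuming the mean and variance expressions displayed above (names are illustrative):

```python
import numpy as np
from scipy.stats import truncnorm

def draw_z(y_i, X_i, beta, h, mu_z, rng):
    """Draw z_i from N(m_i, (T h)^{-1}) truncated to z_i >= 0,
    with m_i = mean(X_i beta - y_i) - (T h mu_z)^{-1}."""
    T = len(y_i)
    m = np.mean(X_i @ beta - y_i) - 1.0 / (T * h * mu_z)
    sd = np.sqrt(1.0 / (T * h))
    a = (0.0 - m) / sd                 # standardized lower bound
    return truncnorm.rvs(a, np.inf, loc=m, scale=sd, random_state=rng)
```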
Empirical Illustration: Efficiency Analysis with Stochastic Frontier Models
To illustrate Bayesian inference in the stochastic frontier model, artificial data were generated from:
$$y_{it} = 1.0 + 0.75x_{2,it} + 0.25x_{3,it} - z_i + \varepsilon_{it}$$
for $i = 1, \dots, 100$ and $t = 1, \dots, 5$,
with $\varepsilon_{it} \sim N(0, 0.04)$, $z_i \sim G(-\ln[0.85], 2)$, $x_{2,it} \sim U(0, 1)$ and $x_{3,it} \sim U(0, 1)$.
Note: the inefficiency distribution is selected to imply that the median of the efficiency distribution is 0.85. (A code sketch of this data-generating process follows below.)
Priors are relatively noninformative (see textbook).
Posterior results based on Gibbs sampler
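A sketch of this data-generating process in NumPy; under Koop's parametrization, $G(-\ln 0.85, 2)$ is an exponential distribution with mean $-\ln(0.85)$ (the seed and variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
N, T = 100, 5

x2 = rng.uniform(0.0, 1.0, size=(N, T))
x3 = rng.uniform(0.0, 1.0, size=(N, T))
eps = rng.normal(0.0, np.sqrt(0.04), size=(N, T))   # N(0, 0.04): variance 0.04

# z_i ~ G(-ln 0.85, 2) in Koop's parametrization: an exponential
# distribution with mean -ln(0.85).
z = rng.exponential(scale=-np.log(0.85), size=N)

y = 1.0 + 0.75 * x2 + 0.25 * x3 - z[:, None] + eps
```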
Table 7.3 contains posterior means and standard deviations for the parameters.
With stochastic frontier models, interest often centers on the firm-specific efficiencies, $\tau_i$ for $i = 1, \dots, N$.
Since $\tau_i = \exp(-z_i)$ and the Gibbs sampler yields draws of $z_i$, we can simply transform them and average to obtain $E(\tau_i \mid y)$.
There are $N = 100$ efficiencies; we select the firms which have the minimum, median and maximum values of $E(\tau_i \mid y)$.
These are labelled $\tau_{\min}$, $\tau_{med}$ and $\tau_{\max}$ in Table 7.3.
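Given a matrix of retained Gibbs draws of $z$ (one row per draw, one column per firm, an illustrative storage layout), these quantities can be computed as:

```python
import numpy as np

def efficiency_summary(z_draws):
    """z_draws: S x N array of post-burn-in Gibbs draws of z_i."""
    tau_draws = np.exp(-z_draws)          # tau_i = exp(-z_i), draw by draw
    tau_mean = tau_draws.mean(axis=0)     # E(tau_i | y) for each firm
    order = np.argsort(tau_mean)
    # Firms with minimum, (upper) median and maximum posterior mean efficiency.
    i_min, i_med, i_max = order[0], order[len(order) // 2], order[-1]
    return tau_mean, (i_min, i_med, i_max)
```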
The histogram in Figure 7.5 plots the posterior means of the efficiencies of all 100 firms; it might be presented to give a rough idea of how efficiencies are distributed across firms.
Table 7.3: Posterior Results for Artificial Data Set from Stochastic Frontier Model

Parameter   Mean    Standard Deviation
β1          0.98    0.03
β2          0.74    0.03
β3          0.27    0.03
h           26.69   1.86
µz          0.15    0.02
τmin        0.56    0.05
τmed        0.89    0.06
τmax        0.97    0.03
[Figure 7.5: histogram of the posterior means of the efficiencies of all 100 firms]
An important issue in efficiency analysis is whether point estimates can be treated as a reliable guide to the ranking of firms.
Important policy recommendations may hang on a finding that firm A is less efficient than firm B.
Simply relying on point estimates which indicate that firm A is less efficient than firm B may lead to inappropriate policy advice.
But the Gibbs sampler output can be used in a straightforward manner to shed light on this issue.
For instance, $p(\tau_A < \tau_B \mid y)$ is the probability that firm A is less efficient than firm B.
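Such probabilities are estimated by the fraction of Gibbs draws in which the event occurs. A sketch, using the same illustrative z_draws layout as above:

```python
import numpy as np

def prob_less_efficient(z_draws, a, b):
    """Estimate p(tau_a < tau_b | y) from Gibbs draws of z.
    Since tau = exp(-z) is decreasing in z, tau_a < tau_b iff z_a > z_b."""
    return np.mean(z_draws[:, a] > z_draws[:, b])
```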
We find $p(\tau_{\max} > \tau_{med} \mid y) = 0.89$, $p(\tau_{\max} > \tau_{\min} \mid y) = 1.00$ and $p(\tau_{med} > \tau_{\min} \mid y) = 1.00$.
Thus, we can conclude that firms which are ranked far apart in terms of their efficiency estimates do truly differ in efficiency.
However, it is likely that, for example, a researcher would be very uncertain about saying the 12th-ranked firm is more efficient than the 13th-ranked one.
Figure 7.6 plots the posteriors of $\tau_{\min}$, $\tau_{med}$ and $\tau_{\max}$.
[Figure 7.6: posteriors of $\tau_{\min}$, $\tau_{med}$ and $\tau_{\max}$]
Extensions/Applications
Panel data topics are popular right now in the econometrics literature.
The panel data models introduced in this chapter are useful for modelling heterogeneity of various sorts.
This is a crucial issue in many fields.
E.g. marketing has consumer heterogeneity;
in labour economics, individuals may vary in many ways that cannot be directly observed by the econometrician (e.g. they may differ in their returns to schooling, their value of leisure, their productivity, etc.).
Dynamic panel data models are very hot these days (i.e. $T$ is large enough that you have to start worrying about time series and unit root issues).