topic1 panel
TRANSCRIPT
-
8/3/2019 Topic1 Panel
1/16
Panel Data
-
Outline
Panel Data
Fixed-effects vs. random-effects
First-differencing or fixed-effects
Strict Exogeneity Assumption
-
Panel Data (or Longitudinal Data)
A typical panel data set has both a cross-sectional dimension and a time-series dimension. In particular, the same cross-sectional units (e.g. individuals, families, firms, cities, states) are observed over time.
Panel data is different from pooling independent cross sections across time (or pooled OLS). Estimating the latter is a simple extension of OLS.
-
Large N or Large T?
N is the number of cross-sectional units and T is the number of time periods.
Small N and small T (of little use)
* Large N and small T (Traditional Panel Data)
N is large enough for the Law of Large Numbers to apply while T is not.
Convenient to use if cross-sectional units are independent.
Small N and Large T
T is large enough for the Law of Large Numbers to apply while N is not.
Autocorrelation has to be addressed.
Large N and Large T (Still under exploration)
-
Fixed Effects Panel-data Model
(individual-specific intercepts)
$y_{it} = \beta_0 + \delta_t + \beta_1 x_{it1} + \beta_2 x_{it2} + a_i + u_{it}$
Strict Exogeneity Assumption
$\mathrm{Cov}(X_{it}, u_{is}) = 0$ for all t and s
Ruling out dynamic models, which have lagged dependent variables (e.g. $y_{i,t-1}$) as explanatory variables. Models with lags of the independent variables as explanatory variables are still fine.
The effects of time-constant independent variables cannot be directly estimated because they are mixed in with $a_i$.
$\delta_t$ (time-specific intercepts) controls for common shocks to all agents at period t.
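A short sketch of why a lagged dependent variable conflicts with strict exogeneity. The coefficient $\rho$ is a symbol introduced here for illustration only, and the derivation assumes $u_{i,t-1}$ is uncorrelated with $a_i$ and with $y_{i,t-2}$:

```latex
y_{it} = \rho\, y_{i,t-1} + a_i + u_{it}
\quad\text{and}\quad
y_{i,t-1} = \rho\, y_{i,t-2} + a_i + u_{i,t-1}
\;\Longrightarrow\;
\operatorname{Cov}(y_{i,t-1},\, u_{i,t-1}) = \operatorname{Var}(u_{i,t-1}) \neq 0
```

The regressor dated t is correlated with the error dated t-1, so the condition "for all t and s" fails at s = t-1.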
-
Names
The individual-specific intercept $a_i$ may be called a fixed effect or unobserved heterogeneity.
The term $u_{it}$ is called the idiosyncratic error.
The sum $a_i + u_{it}$ is often called the composite error.
If $\mathrm{Cov}(X_{it}, a_i)$ is nonzero but the pooled OLS method is used, estimates of all parameters might be biased. This bias can be called heterogeneity bias.
Balanced Panel indicates panel data with observations for the same time periods for all individuals. Otherwise, the data are unbalanced.
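Heterogeneity bias can be illustrated with a small simulation. This is a minimal sketch under assumed settings: the sample sizes, the true slope of 1.0, the 0.8 loading of the regressor on the unobserved effect, and the function names are all hypothetical choices, not taken from the slides.

```python
import numpy as np

def simulate_panel(n_units=500, n_periods=5, beta=1.0, seed=0):
    """Simulate y_it = beta * x_it + a_i + u_it with Cov(x_it, a_i) != 0."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n_units, 1))                    # unobserved heterogeneity a_i
    x = 0.8 * a + rng.normal(size=(n_units, n_periods))  # regressor correlated with a_i
    u = rng.normal(size=(n_units, n_periods))            # idiosyncratic error u_it
    y = beta * x + a + u
    return x, y

def pooled_ols_slope(x, y):
    """Slope from pooled OLS of y on x with one common intercept."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())

def within_slope(x, y):
    """Slope from the within (fixed-effects) estimator: demean by unit first."""
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return float((xd * yd).sum() / (xd ** 2).sum())

x, y = simulate_panel()
b_pooled = pooled_ols_slope(x, y)
b_within = within_slope(x, y)
print(f"pooled OLS slope:  {b_pooled:.3f}")
print(f"within (FE) slope: {b_within:.3f}")
```

In this setup the pooled slope absorbs part of the effect of the unobserved term, while demeaning by unit removes it.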
-
Random Effects Models
$y_{it} = \beta_0 + \delta_t + \beta_1 x_{it1} + \beta_2 x_{it2} + a_i + u_{it}$
Key assumption: $a_i$ is uncorrelated with each explanatory variable in all time periods.
Difference between RE and FE estimators
In FE, we effectively control for $a_i$ using dummy variables.
In RE, $a_i$ is omitted and is part of the disturbance.
RE estimates are more efficient (or more precise) if the RE assumption is valid.
-
Random Effects Models
(continued)
Difference between RE and pooled OLS
Since $a_i$ is in the error term, observations over time are correlated for the same individual i.
In the RE approach, the correlation over time is eliminated using a GLS (generalized least squares) transformation.
In pooled OLS, the GLS correction is not used.
Hausman test
Compare the RE and FE estimates. If the estimates are very different, then the RE assumption is probably invalid. In this case FE has to be used. Otherwise, RE is more efficient.
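The comparison of RE and FE estimates is commonly formalized as the statistic H = d'(V_FE - V_RE)^{-1} d with d = b_FE - b_RE, which under the RE assumption is approximately chi-square with as many degrees of freedom as coefficients compared. The sketch below computes it from estimates supplied by the user. The function name and the numbers are made up for illustration and do not come from the slides.

```python
import numpy as np

def hausman_statistic(b_fe, b_re, v_fe, v_re):
    """Return H = d' (V_FE - V_RE)^{-1} d, where d = b_FE - b_RE."""
    d = np.atleast_1d(np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float))
    v_diff = np.atleast_2d(np.asarray(v_fe, dtype=float) - np.asarray(v_re, dtype=float))
    return float(d @ np.linalg.solve(v_diff, d))

# Made-up estimates for a single slope coefficient (illustration only).
h = hausman_statistic(b_fe=[1.02], b_re=[1.35], v_fe=[[0.010]], v_re=[[0.004]])

CHI2_CRIT_1DF_5PCT = 3.841  # 5% critical value, chi-square with 1 degree of freedom
reject_re = h > CHI2_CRIT_1DF_5PCT
print(f"H = {h:.2f}, reject RE assumption at 5%: {reject_re}")
```

Only coefficients on time-varying regressors can be compared, since FE cannot estimate the others.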
-
Estimation of the Fixed-effect Panel Data Model
Fixed-effects (or Within) Estimator
Each variable is demeaned (i.e. its average over time for the same unit is subtracted).
Dummy Variable Regression (i.e. put in a dummy variable for each cross-sectional unit, along with other explanatory variables.) This may cause estimation difficulty when N is large.
First-difference Estimator
Each variable is differenced once over time, so we are effectively estimating the relationship between changes of variables.
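The three estimators above can be sketched with NumPy on a simulated balanced panel. The sample sizes, the true slope of 2.0, and the variable names are assumptions made for this illustration. The model here has one regressor and no time effects, so the differenced regression has no intercept.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods, beta = 50, 4, 2.0      # hypothetical sizes and true slope
a = rng.normal(size=(n_units, 1))          # unit effect a_i
x = a + rng.normal(size=(n_units, n_periods))
y = beta * x + a + rng.normal(size=(n_units, n_periods))

# 1) Within estimator: subtract each unit's time average, then OLS without intercept.
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
b_within = float((xd * yd).sum() / (xd ** 2).sum())

# 2) Dummy variable regression: x plus one dummy per unit (N + 1 columns).
dummies = np.kron(np.eye(n_units), np.ones((n_periods, 1)))
design = np.column_stack([x.reshape(-1), dummies])
coef, *_ = np.linalg.lstsq(design, y.reshape(-1), rcond=None)
b_lsdv = float(coef[0])

# 3) First-difference estimator: OLS of changes in y on changes in x.
dx = np.diff(x, axis=1)
dy = np.diff(y, axis=1)
b_fd = float((dx * dy).sum() / (dx ** 2).sum())

print(f"within {b_within:.4f}  dummy-variable {b_lsdv:.4f}  first-difference {b_fd:.4f}")
```

The within and dummy-variable slopes agree up to floating-point error, which is why demeaning is the usual route when N is large and the dummy matrix would have N columns.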
-
First Differencing or Fixed-Effect?
Theoretically, when N is large and T is small but greater than 2 (with T = 2 the two estimators coincide), FE is more efficient when $u_{it}$ are serially uncorrelated while FD is more efficient when $u_{it}$ follows a random walk.
When T is large and N is small
FD has an advantage for processes with large positive autocorrelation. FE is more sensitive to nonnormality, heteroskedasticity, and serial correlation in the idiosyncratic errors.
On the other hand, FE is less sensitive to violation of the strict exogeneity assumption. So FE is preferred when the processes are weakly dependent over time.
-
With Classical Measurement Errors
When T>2, the measurement error bias using the FE estimator may be smaller than that with the FD approach but higher than that with OLS. (Griliches and Hausman, 1986)
Natural IV for Measurement Error: Lagged dependent variables
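The ordering of the measurement error bias can be illustrated by simulation. This sketch rests on several assumptions that are not in the slides: a true slope of 1.0, a persistent AR(1) true regressor with coefficient 0.8, i.i.d. measurement error with variance 0.5, and no unit effect, so that measurement error is the only source of bias.

```python
import numpy as np

rng = np.random.default_rng(2)
n_units, n_periods, beta, rho = 2000, 5, 1.0, 0.8   # hypothetical settings

# True regressor: stationary AR(1) with unit variance, so it is persistent over time.
x_true = np.empty((n_units, n_periods))
x_true[:, 0] = rng.normal(size=n_units)
for t in range(1, n_periods):
    shock = rng.normal(size=n_units)
    x_true[:, t] = rho * x_true[:, t - 1] + np.sqrt(1 - rho ** 2) * shock

y = beta * x_true + rng.normal(size=(n_units, n_periods))
# Observed regressor = truth + classical (i.i.d.) measurement error.
x_obs = x_true + np.sqrt(0.5) * rng.normal(size=(n_units, n_periods))

def slope(xm, ym):
    """No-intercept OLS slope on data that is already centered or differenced."""
    return float((xm * ym).sum() / (xm ** 2).sum())

b_ols = slope(x_obs - x_obs.mean(), y - y.mean())
b_fe = slope(x_obs - x_obs.mean(axis=1, keepdims=True),
             y - y.mean(axis=1, keepdims=True))
b_fd = slope(np.diff(x_obs, axis=1), np.diff(y, axis=1))
print(f"OLS {b_ols:.3f}  FE {b_fe:.3f}  FD {b_fd:.3f}  (true slope {beta})")
```

Demeaning and differencing remove much of the persistent signal in the regressor but little of the noise, so in this setup all three slopes are attenuated, FD the most and OLS the least.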
-
Violation of the Strict Exogeneity Assumption
Parameter estimates are inconsistent; a natural experiment approach (e.g. IV) is needed.
-
With Strict Exogeneity and Dependent Observations
Parameter estimates are consistent.
Standard error estimates could still be biased:
Cross-sectional correlation or serial correlation (over time) in error terms
Heteroskedasticity
-
Possible Solutions (Need Large N and Zero Cross-Sectional Correlation)
Heteroskedasticity
Use White robust standard errors
Autocorrelation
Group the sample time dimension into two periods and apply the first-difference estimator (need large N). (Performs the best with the D-in-D approach; Bertrand et al. 2004)
Clustered robust errors
Newey-West standard errors (which also account for heteroskedasticity)
Cross-sectional Correlations
Clustered robust errors
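As a sketch of the heteroskedasticity correction, White standard errors replace the usual variance formula with the sandwich (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}. The code below implements the basic version without a small-sample adjustment (often labeled HC0). The function name and the simulated data are assumptions made for illustration.

```python
import numpy as np

def ols_with_white_se(X, y):
    """OLS coefficients with conventional and White (HC0) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    n, k = X.shape
    # Conventional formula: s^2 (X'X)^{-1}, valid under homoskedasticity.
    se_conv = np.sqrt(np.diag(XtX_inv) * (resid @ resid) / (n - k))
    # White formula: the "meat" is the sum over observations of e_i^2 * x_i x_i'.
    meat = (X * resid[:, None] ** 2).T @ X
    se_white = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return b, se_conv, se_white

rng = np.random.default_rng(3)
n = 5000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + np.abs(x) * rng.normal(size=n)  # error variance grows with |x|
X = np.column_stack([np.ones(n), x])

b, se_conv, se_white = ols_with_white_se(X, y)
print("slope:", round(float(b[1]), 3))
print("conventional SE:", round(float(se_conv[1]), 4), " White SE:", round(float(se_white[1]), 4))
```

In this design the conventional formula understates the uncertainty of the slope, which is the situation the robust correction is meant to handle.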
-
Clustered Standard Errors
Key Assumption
Correlations within a cluster (a group of firms, a region, different years for the same firm, different years for the same region) are the same for different observations.
Procedure
Identify clusters using economic theory (clustered by industry, year, industry and year)
Let the computer calculate clustered standard errors
Try different ways of defining clusters and see how estimated standard errors are affected.
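The procedure above can be sketched with a hand-rolled cluster-robust sandwich estimator, which sums the scores x_it * e_it within each cluster before forming the outer product. This is a minimal version without a small-sample correction. The function name, the firm and year identifiers, and the simulated data are hypothetical.

```python
import numpy as np

def cluster_robust_se(X, y, cluster_ids):
    """OLS with cluster-robust (sandwich) standard errors, no small-sample correction."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(cluster_ids):
        idx = cluster_ids == g
        score = X[idx].T @ resid[idx]   # sum of x_it * e_it within cluster g
        meat += np.outer(score, score)
    return b, np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

rng = np.random.default_rng(4)
n_firms, n_years = 200, 6
firm = np.repeat(np.arange(n_firms), n_years)
year = np.tile(np.arange(n_years), n_firms)
# Both the regressor and the error have a firm-level component, so they persist within firm.
x = rng.normal(size=n_firms)[firm] + 0.5 * rng.normal(size=firm.size)
e = rng.normal(size=n_firms)[firm] + rng.normal(size=firm.size)
y = 1.0 + 0.5 * x + e
X = np.column_stack([np.ones(firm.size), x])

_, se_by_obs = cluster_robust_se(X, y, np.arange(firm.size))  # one cluster per observation (White)
_, se_by_firm = cluster_robust_se(X, y, firm)
_, se_by_year = cluster_robust_se(X, y, year)
print("slope SE by observation:", round(float(se_by_obs[1]), 4))
print("slope SE by firm:       ", round(float(se_by_firm[1]), 4))
print("slope SE by year:       ", round(float(se_by_year[1]), 4))
```

Comparing the three results is the "try different ways of defining clusters" step. With only six year clusters the year-clustered estimate is itself unreliable, since the method relies on a large number of clusters.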
-
Unbalanced Panels
If a panel data set is unbalanced for reasons uncorrelated with $u_{it}$, estimation consistency using FE will not be affected.
The attrition problem: If an unbalanced panel is a result of some selection process related to $u_{it}$, then an endogeneity problem is present and needs to be dealt with using some correction methods.