-
Introduction to Pooled Cross Sections
EMET 8002, Lecture 5
August 13, 2009
-
Administrative Matters
Consultation hours: Tuesdays 3 pm to 5 pm
I'm going to be away again next Tuesday, so I'll hold consultation hours next week on Wednesday, August 19th from 3 pm to 5 pm
Case Studies projects have now been assigned. If you did not receive the email that I sent on Tuesday, you can view the assignments on the course website
If you have not already done so, please contact your supervisor immediately. A few of the projects will require you to apply for data access, which should be done immediately!
-
Outline
Introduce pooled cross sections regression analysis (Chapter 13 in the text)
A potential pitfall of the difference-in-differences estimation strategy
Introduction to two-period panel data and the first difference estimator
-
Pooling Independent Cross Sections across Time
What is it? It is obtained by sampling randomly from a large population at two or more points in time
For example, randomly sampling households from ACT residents in 2000 and in 2008
-
A Spreadsheet view

Pooled cross sections      Panel
Unit   Period              Unit   Period
1      1                   1      1
2      1                   2      1
3      1                   3      1
4      2                   1      2
5      2                   2      2
6      2                   3      2
-
Formal notation
We can denote the pooled cross section as a random sample:

$\{(y_{it}, x_{1it}, x_{2it}, \dots, x_{kit})\}$, with
$i = \underbrace{1, 2, \dots, N_1}_{\text{Period 1}},\ \underbrace{N_1 + 1, \dots, N_1 + N_2}_{\text{Period 2}},\ \dots,\ \underbrace{\dots, N_1 + N_2 + \dots + N_T}_{\text{Period } T}$ and $t = 1, 2, \dots, T$
-
Pooling Independent Cross Sections across Time
Consider the regression model:

$y_{it} = \beta_0 + \beta_1 x_{1it} + \beta_2 x_{2it} + \dots + \beta_k x_{kit} + u_{it}$,
$i = 1, \dots, N_1, N_1 + 1, \dots, N_1 + N_2, \dots$ (observations grouped by period), $t = 1, 2, \dots, T$

Benefits: Pooling can lead to larger sample sizes. This leads to more precise estimators and test statistics with more power. However, this is only true if the relationship between the dependent variable and at least some of the explanatory variables remains constant over time
If the x's are changing over time, pooling can also provide additional variation in x with which to estimate its effect on y
Note: the error term may have the structure $u_{it} = \delta_t + \varepsilon_{it}$
For now, we'll make the following assumptions: $\delta_t \mid X \sim (0, \sigma_\delta^2)$ and $\varepsilon_{it} \mid X \sim (0, \sigma_\varepsilon^2)$
And that the two components of the error term are independent
-
Pooling Independent Cross Sections across Time
Suppose the true error structure does include a year component, but we ignore that and run OLS on the following equation:

$y_{it} = \beta_0 + \beta_1 x_{1it} + \beta_2 x_{2it} + \dots + \beta_k x_{kit} + u_{it}$

Does it matter? Yes! This introduces serial correlation between observations within the same time period: for $i \neq j$,

$E[u_{it} u_{jt} \mid X] = E[(\delta_t + \varepsilon_{it})(\delta_t + \varepsilon_{jt}) \mid X] = E[\delta_t^2 \mid X] = \sigma_\delta^2 \neq 0$
-
Pooling Independent Cross Sections across Time
This violates one of our assumptions for OLS! The OLS coefficient estimates are still unbiased and consistent
However, the variance-covariance matrix estimate is biased/inconsistent, which leads to incorrect standard errors and incorrect inference
This is similar to the problem of serial correlation in time series models, which we saw previously.
Thus, it makes sense to include time dummies (a.k.a. year effects):

$y_{it} = \beta_0 + \beta_1 x_{1it} + \beta_2 x_{2it} + \dots + \beta_k x_{kit} + \sum_{t=2}^{T} \delta_t D_t + \varepsilon_{it}$
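The within-year correlation can be seen in a small simulation. This is a minimal numpy sketch on hypothetical simulated data (not the lecture's dataset): a random year shock shared by all observations in a year makes the average cross-product of two same-year errors approximate $\sigma_\delta^2$ instead of zero, while stripping out the year component (which is what the year dummies estimate) removes it.

```python
import numpy as np

# Simulated illustration (hypothetical data): u_it = delta_t + eps_it,
# with sigma_delta = sigma_eps = 1.
rng = np.random.default_rng(0)
T, n = 20000, 2                         # many "years", two observations each
delta = rng.normal(0.0, 1.0, T)         # random year effects delta_t
eps = rng.normal(0.0, 1.0, (T, n))      # idiosyncratic component eps_it
u = delta[:, None] + eps                # composite error

# Two different observations in the same year share delta_t, so the
# average cross-product approximates sigma_delta^2 = 1, not 0:
cross = np.mean(u[:, 0] * u[:, 1])

# Subtracting the year component (what year dummies estimate) removes it:
cross_no_year = np.mean(eps[:, 0] * eps[:, 1])
print(cross, cross_no_year)
```

With the year component removed, the cross-product averages to roughly zero, which is why adding year dummies restores valid inference here.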
-
Interpretation of the year effects
How do we interpret the year dummies?

$E[y_{it} \mid t = 1, \mathbf{x}] = \beta_0 + \beta_1 x_{1it} + \dots + \beta_k x_{kit}$
$E[y_{it} \mid t = j, \mathbf{x}] = \beta_0 + \beta_1 x_{1it} + \dots + \beta_k x_{kit} + \delta_j$
$E[y_{it} \mid t = j, \mathbf{x}] - E[y_{it} \mid t = 1, \mathbf{x}] = \delta_j$

In words, each time dummy coefficient $\delta_j$ is the difference in the conditional expected value of y between the year t = j and the base year (t = 1)
-
Pooling Independent Cross Sections across Time
Further benefits: A further benefit of these data is that we can explore changes in the coefficients over time
This amounts to allowing some or all of the β's to have t-subscripts
In a two-period pooled cross section dataset with one explanatory variable:

$y_{it} = \beta_0 + \delta_0 D2_t + \beta_1 x_{it} + \delta_1 (D2_t \times x_{it}) + u_{it}$

We can then use F-tests (including Chow tests) to test for changes in the regression model over time
Note: While changes in the coefficients may be interesting, one has to be very cautious in interpreting the source of the changes (e.g., as the impact of a policy or of changing economic structure)
-
Example 13.1: Women's Fertility over Time
Dependent variable is the number of children born to a woman.
Explanatory variables include socio-demographic characteristics.
Pooled time series of cross sections (GSS: 1972, 1974, 1976, 1978, 1980, 1982, 1984).
N = 1,129. Dataset: FERTIL1.RAW
One question of interest is: After controlling for other observable factors, what happened to the fertility rate over time?
-
Example 13.1: Women's Fertility over Time
In Stata:
sort year
by year: summarize kids

Year                      1972    1974    1976    1978    1980    1982    1984
Mean number of children   3.026   3.208   2.803   2.804   2.817   2.403   2.237
Year t - 1972                     0.182  -0.223  -0.222  -0.209  -0.623  -0.789
-
Example 13.1: Women's Fertility over Time
In Stata: reg kids educ age agesq black east northcen west farm othrural town smcity y74 y76 y78 y80 y82 y84

kids        Coef.       Std. Err.   t       P>|t|   [95% Conf. Interval]
educ        -.1284268   .0183486    -7.00   0.000   -.1644286   -.092425
age          .5321346   .1383863     3.85   0.000    .2606065    .8036626
agesq       -.005804    .0015643    -3.71   0.000   -.0088733   -.0027347
black       1.075658    .1735356     6.20   0.000    .7351631   1.416152
east         .217324    .1327878     1.64   0.102   -.0432192    .4778672
northcen     .363114    .1208969     3.00   0.003    .125902     .6003261
west         .1976032   .1669134     1.18   0.237   -.1298978    .5251041
farm        -.0525575   .14719      -0.36   0.721   -.3413592    .2362443
othrural    -.1628537   .175442     -0.93   0.353   -.5070887    .1813814
town         .0843532   .124531      0.68   0.498   -.1599893    .3286957
smcity       .2118791   .160296      1.32   0.187   -.1026379    .5263961
y74          .2681825   .172716      1.55   0.121   -.0707039    .6070689
y76         -.0973795   .1790456    -0.54   0.587   -.448685     .2539261
y78         -.0686665   .1816837    -0.38   0.706   -.4251483    .2878154
y80         -.0713053   .1827707    -0.39   0.697   -.42992      .2873093
y82         -.5224842   .1724361    -3.03   0.003   -.8608214   -.184147
y84         -.5451661   .1745162    -3.12   0.002   -.8875846   -.2027477
_cons       -7.742457   3.051767    -2.54   0.011  -13.73033    -1.754579
-
Example 13.1: Women's Fertility over Time
There may be heteroskedasticity in the previous model. This could be related to the observed characteristics, or it could simply be that the error variance is changing over time
Nonetheless, the usual heteroskedasticity-robust standard errors and t statistics are still valid
Just use the robust option with the regress command in Stata
-
Allowing the effect to change across periods
We can also interact year dummy variables with key explanatory variables to see if the effect of that variable changed over time
-
Example 13.2: Changes in the returns to education and the gender wage gap
Consider the following regression model pooled over the years 1978 and 1985:

$\log(wage) = \beta_0 + \delta_0\, y85 + \beta_1\, educ + \delta_1 (y85 \times educ) + \beta_2\, exper + \beta_3\, exper^2 + \beta_4\, union + \beta_5\, female + \delta_5 (y85 \times female) + u$

The dataset is CPS78_85.RAW
reg lwage y85 educ y85educ exper expersq union female y85fem
-
Example 13.2: Changes in the returns to education and the gender wage gap

lwage       Coef.       Std. Err.   t       P>|t|   [95% Conf. Interval]
y85          .1178062   .1237817     0.95   0.341   -.125075     .3606874
educ         .0747209   .0066764    11.19   0.000    .0616206    .0878212
y85educ      .0184605   .0093542     1.97   0.049    .000106     .036815
exper        .0295843   .0035673     8.29   0.000    .0225846    .036584
expersq     -.0003994   .0000775    -5.15   0.000   -.0005516   -.0002473
union        .2021319   .0302945     6.67   0.000    .1426888    .2615749
female      -.3167086   .0366215    -8.65   0.000   -.3885663   -.244851
y85fem       .085052    .051309      1.66   0.098   -.0156251    .185729
_cons        .4589329   .0934485     4.91   0.000    .2755707    .642295
-
Chow test for structural change across time
We can apply the Chow test to see if a multiple regression function differs across two time periods
We can do this in pooled cross sections by interacting all explanatory variables with time dummies and performing an F-test that the interactions are jointly insignificant
Usually, we allow the intercept to change over time and only test whether the slope parameters have changed
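As a concrete illustration, here is a minimal Chow-type test in numpy on simulated two-period data (all numbers are hypothetical, not from the text): interact x with the period-2 dummy, then form the F statistic from the restricted and unrestricted sums of squared residuals.

```python
import numpy as np

# Hypothetical simulated data: the slope on x changes between periods.
rng = np.random.default_rng(1)
n = 400
d2 = np.repeat([0.0, 1.0], n // 2)      # period-2 dummy
x = rng.normal(0.0, 1.0, n)
y = 1.0 + 0.5 * d2 + 1.0 * x + 0.8 * (d2 * x) + rng.normal(0.0, 1.0, n)

def ssr(X, y):
    """Sum of squared residuals from OLS of y on X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

ones = np.ones(n)
X_r = np.column_stack([ones, d2, x])            # restricted: constant slope
X_u = np.column_stack([ones, d2, x, d2 * x])    # unrestricted: slope may shift
q = 1                                           # number of restrictions
k = X_u.shape[1]
F = (ssr(X_r, y) - ssr(X_u, y)) / q / (ssr(X_u, y) / (n - k))
print(F)    # a large F rejects "no change in slope"
```

The intercept is allowed to differ in both models (d2 appears in each), so only the slope change is being tested, matching the usual practice described above.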
-
Policy Analysis with Pooled Cross Sections
This type of data can be useful in identifying the impacts of policies (or government programs) on various outcomes
They are especially helpful if the policy experiment has before & after and treatment & control dimensions
Consider a simple example: We wish to estimate the impact of participating in a government program (i.e., the treatment) on an outcome, y
Let participation in the program be captured by the dummy variable:

$D_i = 1$ if affected ("treatment"); $D_i = 0$ if unaffected ("control")
-
A simple estimate of the treatment effect
One estimator of the treatment effect is given by the difference of means: $\bar{y}_T - \bar{y}_C$
In a regression context, we could estimate this difference by:

$y_i = \alpha + \beta D_i + u_i$

Such a regression would work if:

$E[y_i \mid D_i = 0] = \alpha + E[u_i \mid D_i = 0]$
$E[y_i \mid D_i = 1] = \alpha + \beta + E[u_i \mid D_i = 1]$
$E[y_i \mid D_i = 1] - E[y_i \mid D_i = 0] = \beta$ iff $E[u_i \mid D_i = 1] = E[u_i \mid D_i = 0]$
-
A simple estimate of the treatment effect
The difference in means between the treatment and control groups and the OLS estimate of $\beta$ are consistent estimates of the treatment effect ONLY if there are no other differences between the treatment and control groups
Sometimes we can add covariates (X's) to help control for differences between the treatment and control groups
Nonetheless, this is often an implausible assumption to make
However, we may be able to use time variation in the application of the program (before & after) combined with variation in treatment
-
Empirical example: The effect of building an incinerator on house prices
The hypothesis that we are interested in testing is that the announcement of the pending construction of an incinerator would cause the prices of houses located nearby to fall, relative to houses further away.
A house is considered to be close if it is within 3 miles of the incinerator.
We have data on house prices for houses that sold in 1978, before the announcement of the incinerator, and in 1981, after the announcement.
We begin by regressing the real house price on a dummy variable for whether the house is close to the incinerator, using data from 1981 (dataset KIELMC.RAW)
-
Empirical example: The effect of building an incinerator on house prices
Coefficients, with standard errors in parentheses and t statistics in brackets:

Data:                     1981                      1978                      1978 & 1981
Near Incinerator          -30,688 (5,828) [-5.27]   -18,824 (4,745) [-3.97]   -18,824 (4,875) [-3.86]
Year 1981                                                                      18,790 (4,050) [4.64]
Near Incinerator x 1981                                                       -11,864 (7,457) [-1.59]
-
The Difference-in-Differences Model
Consider the following simple example where we allow:

$E[u_i \mid D_i = 1] \neq E[u_i \mid D_i = 0]$

The model is:

$y_{it} = \alpha + \beta D_{it} + \gamma_t + u_{it}$

Suppose that the differences between treatment and control groups can be written:

$E[u_{it} \mid D_{i1} = 0, D_{i2} = 0] = \eta_C$
$E[u_{it} \mid D_{i1} = 0, D_{i2} = 1] = \eta_T$

Also assume that the time effects can be written (normalized) as:

$\gamma_t = 0$ for $t = 1$; $\gamma_t = \gamma$ for $t = 2$
-
The Difference-in-Differences Model
The expected outcomes in the before period (period 1) are:

$E[y_{i1} \mid D_{i1} = 0, D_{i2} = 0] = \alpha + \eta_C$
$E[y_{i1} \mid D_{i1} = 0, D_{i2} = 1] = \alpha + \eta_T$

In the after period (period 2):

$E[y_{i2} \mid D_{i1} = 0, D_{i2} = 0] = \alpha + \eta_C + \gamma$
$E[y_{i2} \mid D_{i1} = 0, D_{i2} = 1] = \alpha + \eta_T + \gamma + \beta$

An estimate of $\beta$ can then be recovered by comparing:

$\left(E[y_{i2} \mid D_{i1} = 0, D_{i2} = 1] - E[y_{i1} \mid D_{i1} = 0, D_{i2} = 1]\right) - \left(E[y_{i2} \mid D_{i1} = 0, D_{i2} = 0] - E[y_{i1} \mid D_{i1} = 0, D_{i2} = 0]\right) = \beta$
-
The Difference-in-Differences Model
The difference-in-differences estimator would then be based on:

$(\bar{y}_{T,2} - \bar{y}_{T,1}) - (\bar{y}_{C,2} - \bar{y}_{C,1})$, since $\bar{y}_{T,2} - \bar{y}_{T,1} \approx \gamma + \beta$ and $\bar{y}_{C,2} - \bar{y}_{C,1} \approx \gamma$

Or, alternatively:

$(\bar{y}_{T,2} - \bar{y}_{C,2}) - (\bar{y}_{T,1} - \bar{y}_{C,1})$, since $\bar{y}_{T,2} - \bar{y}_{C,2} \approx (\eta_T - \eta_C) + \beta$ and $\bar{y}_{T,1} - \bar{y}_{C,1} \approx (\eta_T - \eta_C)$

In a regression framework, we would estimate this as:

$y_{it} = \alpha + \gamma\, AFTER_{it} + \eta\, D_{it} + \beta\,(D_{it} \times AFTER_{it}) + u_{it}$
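In the saturated two-group, two-period case, the regression and the four-means computation coincide exactly. A small numpy sketch on simulated data (hypothetical numbers) verifies that the OLS coefficient on the D x AFTER interaction equals the difference-in-differences of the group means:

```python
import numpy as np

# Hypothetical simulated data with a true treatment effect of 2.0.
rng = np.random.default_rng(2)
n = 1000
D = rng.integers(0, 2, n).astype(float)        # treatment-group dummy
after = rng.integers(0, 2, n).astype(float)    # post-period dummy
y = 1.0 + 0.5 * after + 0.3 * D + 2.0 * (D * after) + rng.normal(0.0, 1.0, n)

# Difference-in-differences of the four cell means:
did = ((y[(D == 1) & (after == 1)].mean() - y[(D == 1) & (after == 0)].mean())
       - (y[(D == 0) & (after == 1)].mean() - y[(D == 0) & (after == 0)].mean()))

# OLS of y on a constant, AFTER, D, and D*AFTER:
X = np.column_stack([np.ones(n), after, D, D * after])
b = np.linalg.lstsq(X, y, rcond=None)[0]
print(did, b[3])    # identical up to floating-point error
```

Because the regression is saturated (one parameter per cell), the interaction coefficient is algebraically the diff-in-diff of cell means, not just a close approximation.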
-
The Difference-in-Differences Model
Of course, we could also add covariates. In this specification, we denote:

$\gamma$ → common time effect
$\eta$ → permanent differences (across T, C)
$\beta$ → treatment effect (difference-in-differences)
-
Difference-in-difference
The key distinction between the difference-in-differences estimator and the difference in means is that we have relaxed the assumption about the distribution of the error terms across the treatment and control groups.
We no longer require the conditional expectation of the error term to be equal across groups; we only require the conditional expectation for each group to be constant over time
This may still be a strong assumption! You need to think about the validity of making this assumption.
-
Another Example: The Effect of Workers' Comp. on Injury Duration (Kentucky)
Notes: Dependent variable is log duration of workers' comp benefits. Controls include: age, sex, married, whether a hospital stay was required, indicators for the type of injury, and industry of job. "After" corresponds to the increase in the cap of weekly WC benefits.

Data:              Before and After   Before and After   After           After
High Earner        0.115 (0.048)      0.233 (0.049)      0.274 (0.054)   0.462 (0.051)
After              0.047 (0.041)      0.014 (0.045)
After*HighEarner   0.175 (0.064)      0.229 (0.070)
Controls           Yes                No                 Yes             No
R-squared          0.188              0.022              0.214           0.031
Sample size        5347               5347               2567            2567
-
Examples of difference-in-difference
For a good example of a paper using this strategy, see Duflo, Esther. (2001). Schooling and labor market consequences of school construction in Indonesia: Evidence from an unusual policy experiment. American Economic Review. Vol. 91, No. 4, pp. 795-813.
For a good example of when the impact of the policy/program might have spillover effects on the control group, see Miguel, Edward and Michael Kremer. (2004). Worms: Identifying impacts on education and health in the presence of treatment externalities. Econometrica. Vol. 72, No. 1, pp. 159-217.
-
How much should we trust diff-in-diff estimates?
The following discussion is based on an excellent paper by Bertrand, Duflo, and Mullainathan (2004) in the Quarterly Journal of Economics
Many papers that employ difference-in-differences estimators use many years of data and focus on serially correlated outcomes, but ignore that the resulting standard errors are inconsistent
Diff-in-diff estimates are usually based on estimating an equation of the form:

$Y_{ist} = A_s + B_t + c X_{ist} + \beta I_{st} + \varepsilon_{ist}$

where i denotes individuals, s denotes a state or group membership, and t denotes the time period
-
How much should we trust diff-in-diff estimates?
An important point, which we will address later in the course, is possible correlation of the error terms across individuals within a state/group in a given year.
We are going to ignore this potential problem for now and assume that the econometricians have appropriately dealt with correlation within state-year cells. Hence, let's think of the data as being averaged over individuals within a state in each given year
Three factors make serial correlation an important issue in the difference-in-differences context:
Estimation often relies on long time series
The most commonly used dependent variables are typically highly serially correlated
The treatment variable, I_st, changes very little within a state over time
-
How much should we trust diff-in-diff estimates?
How severe is the problem? They examine how diff-in-diff performs on placebo laws, where treated states (in the U.S.) are chosen at random, as is the year of passage of the placebo law
Since the laws are fictitious, a significant effect should only be found 5% of the time (i.e., the true null hypothesis of no effect is falsely rejected 5% of the time)
They use wages as the dependent variable over 21 years
They find rejection rates of the null hypothesis as high as 45%! In other words, there is statistical evidence that these fake laws affected wages in close to half of the simulations
-
How much should we trust diff-in-diff estimates?
Does this matter practically? They find 92 diff-in-diff papers published between 1990 and 2000 in the following journals: the American Economic Review, the Industrial and Labor Relations Review, the Journal of Labor Economics, the Journal of Political Economy, the Journal of Public Economics, and the Quarterly Journal of Economics
69 of these papers have more than 2 time periods
Only 4 papers collapse the data into before-after; thus, 65 papers have a potential serial correlation problem
Only 5 provide a serial correlation correction
-
How much should we trust diff-in-diff estimates?
Some results: When the treatment variable is not serially correlated, rejection rates of H0: no effect are close to 5%
The overrejection problem worsens with the degree of serial correlation in the dependent variable
-
How much should we trust diff-in-diff estimates?
Solutions: Parametric methods
Specify an autocorrelation structure for the error term, estimate its parameters, and use these parameters to compute standard errors. This does not do a very good job of remedying the problem:
With short time series, the OLS estimate of the autocorrelation parameter is downward biased
The autocorrelation structure may be incorrectly specified
Block bootstrap
Bootstrapping is an advanced technique
It does poorly when the number of states/groups becomes small
-
How much should we trust diff-in-diff estimates?
Solutions (continued): Ignore time series information: average the before and after data and estimate the model on two periods
This is difficult when treatment occurs at different times across states, since there is no longer a uniform before and after, and it is not even defined for control states. This can be corrected for, though
This procedure works well, even when there are a small number of states/groups
Arbitrary variance-covariance matrix
Does quite well in general, although the rejection rate increases above 5% when the number of states/groups is small
Can be implemented in Stata using the cluster option (at the state/group level, not the state-time cell)
-
How much should we trust diff-in-diff estimates?
Main message: There is no one preferred correction mechanism
Collapsing the data into pre- and post-periods produces consistent standard errors, even when the number of states is small (although the power of this procedure declines fast)
Allowing for an arbitrary autocorrelation process is also viable when the number of groups is sufficiently large
Doing good econometrics is not easy! Be very, very careful that all your assumptions are met!
-
Panel data
If we have repeated observations on the same individuals (units, i) then we have longitudinal, or panel, data:

$\{(y_{it}, \mathbf{x}_{it})\}, \quad i = 1, 2, 3, \dots, N; \quad t = 1, 2, \dots, T$

Benefits of panel data: Similar to repeated cross sections; BUT most importantly, we can exploit repeated observations on the same individual in order to control for certain types of unobserved heterogeneity, which otherwise might contaminate OLS estimation
Panel data allow for richer controls for unobserved heterogeneity than just systematic differences between treatment and control.
-
Panel Data
Begin with two periods, for simplicity
Of course, we can do all the same stuff with panel data as with pooled cross sections. However, we can do more,
and we will also have an additional statistical consideration, with the loss of independence across observations
Consider the simple regression model:

$y_{it} = \beta_0 + \beta_1 x_{it} + v_{it}$

Omitted variables bias for $\beta_1$ arises when $corr(x_{it}, v_{it}) \neq 0$

Under certain assumptions, we will be able to exploit panel data in order to fix this bias (unobserved heterogeneity)
-
Panel Data
When $corr(x_{it}, v_{it}) \neq 0$, this violates assumption MLR.4 (zero conditional mean)
Hence, OLS is no longer valid
Under some circumstances we can cope with this problem using panel data. This is another example of when one of the core OLS assumptions fails to hold.
-
Fixed Effects Error Structure
Imagine we can write the error term as:

$v_{it} = \delta_t + a_i + u_{it}$, where
$v_{it}$ → composite error
$\delta_t$ → time effect
$a_i$ → fixed effect
$u_{it}$ → idiosyncratic effect

Furthermore, assume that ALL of the omitted variables bias is due to $a_i$, i.e., the correlation of x with fixed (and unobserved) individual characteristics:

$corr(x_{it}, a_i) \neq 0$
-
First Difference Estimator
Consider the First Differenced (FD) estimator, based on:

$y_{i2} = \beta_0 + \delta_0 + \beta_1 x_{i2} + a_i + u_{i2}$
$y_{i1} = \beta_0 + \beta_1 x_{i1} + a_i + u_{i1}$
$\Delta y_i = \delta_0 + \beta_1 \Delta x_i + \Delta u_i$

The key point is that the fixed effects fall out. By assumption, we also require:

$corr(\Delta x_i, \Delta u_i) = 0$

Thus, by differencing we have eliminated the heterogeneity bias.
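A simulated two-period sketch (hypothetical numbers, not from the text) of why differencing helps: x is built to be correlated with the fixed effect $a_i$, so pooled OLS in levels is biased upward, while the FD regression recovers $\beta$.

```python
import numpy as np

# Hypothetical simulation: true beta = 1, but corr(x, a) > 0 biases levels OLS.
rng = np.random.default_rng(3)
n = 20000
a = rng.normal(0.0, 1.0, n)                  # unobserved fixed effect
x1 = a + rng.normal(0.0, 1.0, n)             # period-1 x, correlated with a
x2 = a + rng.normal(0.0, 1.0, n)             # period-2 x, correlated with a
beta, delta0 = 1.0, 0.3                      # slope and period-2 time effect
y1 = beta * x1 + a + rng.normal(0.0, 1.0, n)
y2 = delta0 + beta * x2 + a + rng.normal(0.0, 1.0, n)

# Pooled OLS in levels: the slope absorbs cov(x, a)/var(x) = 0.5 of bias.
X = np.column_stack([np.ones(2 * n), np.concatenate([x1, x2])])
b_pool = np.linalg.lstsq(X, np.concatenate([y1, y2]), rcond=None)[0][1]

# First differences: Delta y = delta0 + beta * Delta x + Delta u,
# and a_i has dropped out.
Xd = np.column_stack([np.ones(n), x2 - x1])
b_fd = np.linalg.lstsq(Xd, y2 - y1, rcond=None)[0][1]
print(b_pool, b_fd)    # roughly 1.5 versus 1.0
```

The size of the levels bias here follows directly from the omitted-variable formula: $cov(x, a)/var(x) = 1/2$ under these simulation settings.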
-
Example: Crime and unemployment
We have data on crime and unemployment rates for 46 cities in 1982 and 1987
Cities are the unit of observation i; 1982 and 1987 are the periods of observation t
We'll try three specifications:

$crmrte_{it} = \beta_0 + \beta_1 unem_{it} + u_{it}, \quad t = 1987$
$crmrte_{it} = \beta_0 + \delta_0 d87_t + \beta_1 unem_{it} + u_{it}, \quad t = 1982, 1987$
$\Delta crmrte_i = \delta_0 + \beta_1 \Delta unem_i + \Delta u_i$
-
Simple Example: Unemployment and crime (standard errors in parentheses)

Data:               1987 (Levels)    1982, 1987 (Levels)   1982, 1987 (FD)
Constant            128.38 (20.76)   93.42 (12.74)         15.40 (4.70)
Unemployment Rate   -4.16 (3.42)     0.427 (1.19)          2.22 (0.88)
Y87                                  7.94 (7.98)
N                   46               92                    46
R-squared           0.033            0.012                 0.127
-
Interpretation
Controlling for unemployment, crime rose between 1982 and 1987 in these cities
Using just cross-sectional data (i.e., only the 1987 data) would suggest that higher unemployment is associated with lower crime rates; this is certainly not what we expect!
Using the first-differencing specification suggests that the partial correlation between crime and unemployment is positive when we control for city fixed effects (i.e., the negative partial correlation we observed in the 1987 cross section was biased)
-
Caveats to the first difference estimator
It may be incorrect to assume that $corr(\Delta x_i, \Delta u_i) = 0$
We need variation in the x's. This means we cannot include variables that do not change over time across observations (e.g., race, country of birth, etc.)
It also means we cannot include variables for which the change would be the same for all observations (e.g., age)
Also, we cannot expect to get precise estimates on variables, such as education, which will tend to change for relatively few observations in a dataset
-
The FD Estimator more generally
The FD estimator provides a powerful strategy for dealing with omitted variables bias when panel data are available.
More generally, we can apply the model to multiple time periods (not just two):

$y_{it} = \beta x_{it} + \sum_{t=2}^{T} \delta_t D_t + a_i + u_{it}$

In which case the FD estimator is based on:

$\Delta y_{it} = \beta \Delta x_{it} + \sum_{t=2}^{T} \delta_t \Delta D_t + \Delta u_{it}$
-
The FD Estimator and Program Evaluation
Of course, we can also use this framework for policy evaluation (difference-in-differences, as before).
The added benefit is that we can control for the unobserved fixed effects at the level of the individual unit.
We do not require the same simple structure as with the pooled cross sections.
But the framework is no panacea, since there may be very good reasons why

$cov(\Delta x_{it}, \Delta u_{it}) \neq 0$

For example, unexplained changes in y (the error term) may be correlated with changes in policy.
-
Additional Considerations
Given that there is a time-series dimension to the FD estimator (and panel data more generally), we may need to account for serial correlation.
In addition, we may need to deal with heteroskedasticity.
While there are GLS (serial correlation) procedures available, the easiest solution would be to use a Newey-West variance-covariance matrix.
-
Example: County Crime Rates (NC)
Panel of North Carolina counties, 1981-1987. How do various law enforcement variables affect the crime rate?
Base specification includes (in logs):
Probability of arrest; probability of conviction (conditional on arrest); probability of prison (conditional on conviction); average sentence (conditional on prison); police per capita
Covariates: Region, urban, pop density, tax revenues; year effects
Estimated in levels and FD (ignoring serial correlation, etc.)
-
Example: County crime rates in North Carolina, 1981-1987.
                   Pooled Cross Sections   First Differencing
log(prbarr)        -0.720 (0.037)          -0.327 (0.030)
log(prbconv)       -0.546 (0.026)          -0.238 (0.018)
log(prbpris)        0.248 (0.067)          -0.165 (0.026)
log(avgsen)        -0.087 (0.058)          -0.022 (0.022)
log(polpc)          0.366 (0.030)           0.398 (0.027)
Year effects        Yes                     Yes
No. observations    630                     540
R-squared           0.57                    0.43
-
Interpretation
Consider the impact of the probability of being arrested: The first-differencing estimates suggest that we were overestimating the negative impact on the crime rate (i.e., increasing the probability of arrest has less of an impact once you remove county fixed effects)
-
Potential pitfalls
It can be worse than pooled OLS if one or more of the explanatory variables is subject to measurement error
Differencing a poorly measured regressor reduces its variation relative to its correlation with the differenced error (see Wooldridge, 2002, Chapter 11 for more details)
This could be a problem with explanatory variables from household or firm surveys, especially ones in developing countries
-
Differencing with More Than Two Time Periods
More on the error structure: When doing FD estimation with more than two time periods, we must assume that $\Delta u_{it}$ is uncorrelated over time (no serial correlation)
This assumption is sometimes reasonable, but it will not hold if we instead assume that the $u_{it}$ are uncorrelated over time:
If the $u_{it}$ are serially uncorrelated with constant variance, then $\Delta u_{it}$ and $\Delta u_{i,t-1}$ are negatively correlated (correlation = -0.5)
If $u_{it}$ follows a stable AR(1) process, then $\Delta u_{it}$ will be serially correlated
Only when $u_{it}$ follows a random walk will $\Delta u_{it}$ be serially uncorrelated
-
Differencing with More Than Two Time Periods
Testing for serial correlation in the first-differenced equation:
First, we estimate our first-differenced equation and obtain the residuals $\hat{r}_{it} = \widehat{\Delta u}_{it}$
Then run a simple pooled OLS regression of the residual on the lagged residual for t = 3,…,T, i = 1,…,N, and compute a standard t test for the coefficient on the lagged residual
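The mechanics can be sketched with simulated errors (hypothetical numbers; for brevity this uses the true $\Delta u_{it}$ in place of the FD residuals). When $u_{it}$ is i.i.d., regressing $\Delta u_{it}$ on $\Delta u_{i,t-1}$ should give a slope near $-0.5$, the negative correlation induced by differencing that was noted above.

```python
import numpy as np

# Hypothetical simulation: serially uncorrelated u_it, N units, T periods.
rng = np.random.default_rng(4)
N, T = 500, 6
u = rng.normal(0.0, 1.0, (N, T))      # i.i.d. idiosyncratic errors
du = np.diff(u, axis=1)               # Delta u_it for t = 2, ..., T

# Pool the (Delta u_it, Delta u_i,t-1) pairs over i and t = 3, ..., T and
# run OLS through the origin; the slope estimates the correlation of -0.5.
r = du[:, 1:].ravel()
r_lag = du[:, :-1].ravel()
rho = (r_lag @ r) / (r_lag @ r_lag)
print(rho)    # close to -0.5
```

A slope near zero, by contrast, would be consistent with $u_{it}$ following a random walk, in which case $\Delta u_{it}$ is serially uncorrelated.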
-
Differencing with More Than Two Time Periods
Correcting for serial correlation: In the presence of AR(1) serial correlation we can use the Prais-Winsten FGLS estimator
The Cochrane-Orcutt procedure is less preferred, since we lose N observations by dropping the first time period
However, standard PW procedures will treat the observations as if they followed an AR(1) process over both i and t, which makes no sense in this situation since we have assumed independence across i
A detailed treatment of how to do this can be found in Wooldridge (2002)
-
Assumptions for Pooled OLS Using First Differences
Assumption FD.1: For each i, the model is

$y_{it} = \beta_1 x_{it1} + \dots + \beta_k x_{itk} + a_i + u_{it}, \quad t = 1, \dots, T$

where the $\beta_j$ are the parameters to be estimated and $a_i$ is the unobserved effect.
Assumption FD.2: We have a random sample from the cross section.
Assumption FD.3: Each explanatory variable changes over time (for at least some i), and no perfect linear relationships exist among the explanatory variables.
-
Assumptions for Pooled OLS Using First Differences
Assumption FD.4: For each t, the expected value of the idiosyncratic error given the explanatory variables in all time periods and the unobserved effect is zero: $E(u_{it} \mid X_i, a_i) = 0$
As stated, this assumption is stronger than is necessary for consistency (it suffices that $\Delta u_{it}$ is uncorrelated with $\Delta x_{itj}$ for all j = 1,…,k and for all t = 2,…,T).
Under assumptions FD.1 through FD.4, the first-difference estimator is unbiased.
-
Assumptions for Pooled OLS Using First Differences
Assumption FD.5: The variance of the differenced errors, conditional on all explanatory variables, is constant (i.e., homoskedastic): $var(\Delta u_{it} \mid X_i) = \sigma^2$, t = 2,…,T.
Assumption FD.6: For all $t \neq s$, the differences in the idiosyncratic errors are uncorrelated (conditional on all explanatory variables): $cov(\Delta u_{it}, \Delta u_{is} \mid X_i) = 0$, $t \neq s$.
Under assumptions FD.1 through FD.6, the FD estimator of $\beta_j$ is the best linear unbiased estimator (conditional on the explanatory variables).
-
Comparison of assumptions with standard OLS
Notice the strong similarities between the first differencing assumptions (FD) and those for standard OLS (MLR):
MLR.1 and FD.1 are basically the same, except we've now added repeated observations and an unobserved effect for each cross-sectional observation
MLR.2 and FD.2 are the same
MLR.3 and FD.3 are the same, except we've added the condition that there has to be at least some time variation for each of the explanatory variables
MLR.4 and FD.4 are the same, except that the condition is across all time periods (clearly FD.4 is the same as MLR.4 if T = 1)
-
Comparison of assumptions with standard OLS
FD.5 is the same as MLR.5: homoskedasticity, but of the differenced error terms
FD.6 is new. It assumes that there is no correlation over time of the differenced error terms (clearly this was not an issue when T = 1)
But we had a no-serial-correlation assumption in time series models
-
Practice questions
In-chapter questions: 13.1, 13.3, 13.4, 13.5
End-of-chapter questions: C13.2, C13.7, C13.11 (i-iv)