
Panel Data Seminar

Discrete Response Models

Romain Aeberhardt Laurent Davezies

Crest-Insee

11 April 2008

Aeberhardt and Davezies (Crest-Insee) Panel Data Seminar 11 April 2008 1 / 29


Overview and Strategies

Contents

1 Overview and Strategies

2 Simple Approaches and their Drawbacks
- Linear Probability Model
- Fixed effects : the Incidental Parameters Problem
- Random Effects : the assumptions are too strong

3 Classical Remedies
- Conditional Logit : removing the Fixed Effects
- Chamberlain’s and Mundlak’s Approaches : relaxing the Random Effects assumption

4 Extensions
- Dynamic framework
- Semi-Parametric approach


Introduction

Panel data characterized by an outcome of the form : $y_{it} = F(x_{it}\beta + \alpha_i + u_{it})$

Main advantage of panel data : possibility to take into account the unobserved heterogeneity $\alpha_i$

Main difficulty with panel data : dealing with unobserved heterogeneity, in particular the relationship between $\alpha_i$ and $x_{it}$


Important reminder

The usual denomination of “Fixed Effects” and “Random Effects” is misleading

Fixed Effects “means” no assumption concerning the dependence between $\alpha_i$ and $x_{it}$

Random Effects “means” in general an independence assumption between $\alpha_i$ and $x_{it}$ (although it can be relaxed)


Simple strategies

Linear Probability Model
- Good for a quick start
- But bad properties (worse than in cross section)

Probit / Logit with Fixed Effects as dummies
- Conceptually simple
- But ML estimators are consistent only when $N \to \infty$ and $T \to \infty$ (incidental parameters problem)

Simple Random Effects Probit
- Computationally quite easy (already implemented)
- But one strong assumption of no correlation between unobserved heterogeneity and covariates
- So one misses the point of using panel data


Classical Remedies

Conditional Logit
- In the spirit of the Within or FD transformations
- No assumptions required on the correlation between unobserved heterogeneity and covariates
- But the identification hinges on the functional form (logit)

Chamberlain’s and Mundlak’s Approaches
- Based on the RE framework, computationally easy
- Relaxes the no correlation assumption
- Allows only for a restricted relation between unobserved heterogeneity and covariates


Extensions

Dynamic framework
- Relaxes the strict exogeneity assumption
- In particular, allows for the presence of the lagged dependent variable among the covariates
- Question of state dependence vs. unobserved heterogeneity
- Raises a new issue : the initial conditions problem

Semi-parametric models


Main Reference for this class

Econometric Analysis of Cross Section and Panel Data, J. M. Wooldridge


Simple Approaches and their Drawbacks


Simple Approaches and their Drawbacks Linear Probability Model

Linear Probability Model : good for a quick start

Main advantage : makes it possible to use all the simple and well-known methods developed for linear models (FE, RE, Chamberlain’s approach, ...)

Same problems as in the cross section case (predicted values outside the unit interval, heteroskedasticity)

Even less appealing : it implies $-x_{it}\beta \le \alpha_i \le 1 - x_{it}\beta$ for all $t$


Simple Approaches and their Drawbacks Fixed effects : the Incidental Parameters Problem

First idea : using dummies for fixed effects

Interest : no assumption on the correlation structure between $\alpha_i$ and $x_{it}$

A priori simple : just add dummies in the equation and use standard estimation procedures

Danger : ML estimators are asymptotically unbiased and consistent only if $N \to \infty$ and $T \to \infty$

Intuition :
- In the ML framework the number of regressors is fixed, and here it increases with $N$
- Fixed effects are biased and poorly estimated when $T$ is small
- This contaminates the rest of the coefficients through the ML procedure
- Difference with the linear case : there, the estimation of $\beta$ does not depend on the $\alpha_i$ (Frisch-Waugh)

This is called the “incidental parameters problem”


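Chamberlain’s illustration of the incidental parameters problem

Very simple framework : ML estimation of a logit model with two independent time periods, fixed effects and one explanatory variable $x_{it}$ s.t. $\forall i$, $x_{i1} = 0$ and $x_{i2} = 1$

$$P(y_{it} = 1 \mid x, \alpha) = \frac{e^{\alpha_i + x_{it}\beta}}{1 + e^{\alpha_i + x_{it}\beta}}$$

The ML estimates of the fixed effects are :
- if $y_{i1} = 0$ and $y_{i2} = 0$ then $\hat{\alpha}_i = -\infty$
- if $y_{i1} = 1$ and $y_{i2} = 1$ then $\hat{\alpha}_i = +\infty$
- if $y_{i1} + y_{i2} = 1$ then $\hat{\alpha}_i = -\hat{\beta}/2$

and $\hat{\beta} = 2\log(n_2/n_1) \xrightarrow{P} 2\beta$

with $n_1 = \#\{i \mid y_{i1} = 1, y_{i2} = 0\}$ and $n_2 = \#\{i \mid y_{i1} = 0, y_{i2} = 1\}$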
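
The limit $2\beta$ can be checked by simulation. The following is a minimal sketch, not part of the original slides: it assumes $\alpha_i \sim N(0, 1)$ and $\beta = 1$ (both arbitrary choices), and the function name `simulate_two_period_logit` is hypothetical. It uses the closed form above for the dummy-variable estimator and compares it with $\log(n_2/n_1)$, which is what maximizing the likelihood conditional on $y_{i1} + y_{i2} = 1$ gives in this design.

```python
import math
import random


def simulate_two_period_logit(n, beta, seed=0):
    """Simulate y_i1 (x_i1 = 0) and y_i2 (x_i2 = 1) and count both kinds of switchers."""
    rng = random.Random(seed)
    n1 = n2 = 0
    for _ in range(n):
        alpha = rng.gauss(0.0, 1.0)                   # assumed heterogeneity distribution
        p1 = 1.0 / (1.0 + math.exp(-alpha))           # P(y_i1 = 1 | alpha_i), x_i1 = 0
        p2 = 1.0 / (1.0 + math.exp(-(alpha + beta)))  # P(y_i2 = 1 | alpha_i), x_i2 = 1
        y1 = rng.random() < p1
        y2 = rng.random() < p2
        if y1 and not y2:
            n1 += 1                                   # (y_i1, y_i2) = (1, 0)
        elif y2 and not y1:
            n2 += 1                                   # (y_i1, y_i2) = (0, 1)
    return n1, n2


n1, n2 = simulate_two_period_logit(n=200_000, beta=1.0)
beta_dummies = 2.0 * math.log(n2 / n1)   # ML with one dummy per individual
beta_conditional = math.log(n2 / n1)     # conditional ML on switchers only
print(beta_dummies, beta_conditional)
```

With a large $N$ the first number should sit near $2\beta$ and the second near $\beta$, whatever distribution is assumed for $\alpha_i$.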


Simple Approaches and their Drawbacks Random Effects : the assumptions are too strong

RE : simple procedure but strong assumptions

Basic assumptions :

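- $P(y_{it} = 1 \mid x_{it}, \alpha_i) = \Phi(x_{it}\beta + \alpha_i)$
- $y_{i1}, y_{i2}, \dots, y_{iT}$ independent conditional on $(x_i, \alpha_i)$

Density of $(y_{i1}, \dots, y_{iT})$ conditional on $(x_i, \alpha_i)$ :

$$f(y_{i1}, \dots, y_{iT} \mid x_i, \alpha_i, \beta) = \prod_{t=1}^{T} f(y_{it} \mid x_{it}, \alpha_i, \beta) = \prod_{t=1}^{T} \Phi(x_{it}\beta + \alpha_i)^{y_{it}} \left[1 - \Phi(x_{it}\beta + \alpha_i)\right]^{1 - y_{it}}$$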


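One needs to integrate out $\alpha_i$, which requires an additional assumption :

$$\alpha_i \mid x_i \sim N(0, \sigma_\alpha^2)$$

The conditional density becomes

$$f(y_{i1}, \dots, y_{iT} \mid x_i, \beta, \sigma_\alpha) = \int_{-\infty}^{+\infty} \left[\prod_{t=1}^{T} f(y_{it} \mid x_{it}, \alpha, \beta)\right] \frac{1}{\sigma_\alpha} \varphi\left(\frac{\alpha}{\sigma_\alpha}\right) d\alpha$$

This is already implemented or easy to implement in standard software

The independence assumption of $\alpha_i$ and $x_i$ is very strong

One misses the point of using panel data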

But this procedure will be the basis for more complicated approaches
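
To make the integration step concrete, here is a minimal sketch of one individual's likelihood contribution in the RE probit. It is an illustration under stated assumptions, not the routine used by any particular package: the covariate is scalar, the integral is approximated with a plain trapezoid rule on a truncated grid (production software typically uses Gauss-Hermite quadrature), and the names `re_probit_likelihood`, `n_grid` and `width` are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cdf, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def re_probit_likelihood(y, x, beta, sigma_alpha, n_grid=2001, width=8.0):
    """f(y_i1..y_iT | x_i, beta, sigma_alpha): alpha integrated out by a trapezoid rule."""
    lo, hi = -width * sigma_alpha, width * sigma_alpha
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        alpha = lo + k * step
        cond = 1.0  # product over t of f(y_it | x_it, alpha, beta)
        for y_t, x_t in zip(y, x):
            p = norm_cdf(x_t * beta + alpha)
            cond *= p if y_t == 1 else 1.0 - p
        weight = 0.5 if k in (0, n_grid - 1) else 1.0  # trapezoid end-point weights
        total += weight * cond * norm_pdf(alpha / sigma_alpha) / sigma_alpha
    return total * step


# Sanity check: the four possible sequences for T = 2 have probabilities summing to one.
x_i = [0.2, -0.5]
total_prob = sum(re_probit_likelihood([a, b], x_i, beta=0.8, sigma_alpha=1.5)
                 for a in (0, 1) for b in (0, 1))
print(total_prob)
```

The sample log-likelihood is the sum over individuals of the log of this quantity, to be maximized over $(\beta, \sigma_\alpha)$.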


Classical Remedies


Classical Remedies Conditional Logit : removing the Fixed Effects

Conditional Logit : make the αi vanish

In the spirit of the linear FE model

Requires no assumption on αi

$y_{i1}, \dots, y_{iT}$ independent conditional on $(x_i, \alpha_i)$

The distribution of $(y_{i1}, \dots, y_{iT})$ conditional on $x_i$, $\alpha_i$ and $n_i = \sum_{t=1}^{T} y_{it}$ does not depend on $\alpha_i$


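Example with $T = 2$, the result is based on

$$\frac{P(y_{i1} = 1, y_{i2} = 0 \mid \alpha_i, x_i)}{P(y_{i1} = 0, y_{i2} = 1 \mid \alpha_i, x_i)} = e^{(x_{i1} - x_{i2})\beta}$$

and then

$$P(y_{i1} = 0, y_{i2} = 1 \mid y_{i1} + y_{i2} = 1, \alpha_i, x_i) = \frac{1}{1 + e^{(x_{i1} - x_{i2})\beta}}$$

independent of $\alpha_i$ and hence,

$$P(y_{i1} = 0, y_{i2} = 1 \mid y_{i1} + y_{i2} = 1, x_i) = \frac{1}{1 + e^{(x_{i1} - x_{i2})\beta}}$$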
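
The cancellation of $\alpha_i$ is easy to verify numerically. The sketch below (illustrative values, scalar covariate, hypothetical function name `prob_01_given_switch`) computes the conditional probability directly from the logit model for several values of $\alpha_i$ and compares it with the closed form.

```python
import math


def logistic(z):
    """Logistic cdf, Lambda(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def prob_01_given_switch(alpha, x1, x2, beta):
    """P(y_i1 = 0, y_i2 = 1 | y_i1 + y_i2 = 1, alpha_i, x_i) in the fixed-effects logit."""
    p1 = logistic(alpha + x1 * beta)
    p2 = logistic(alpha + x2 * beta)
    p_01 = (1.0 - p1) * p2      # independence of y_i1 and y_i2 given (x_i, alpha_i)
    p_10 = p1 * (1.0 - p2)
    return p_01 / (p_01 + p_10)


x1, x2, beta = 0.3, 1.2, 0.7    # arbitrary illustrative values
closed_form = 1.0 / (1.0 + math.exp((x1 - x2) * beta))
values = [prob_01_given_switch(a, x1, x2, beta) for a in (-3.0, 0.0, 2.5)]
print(values, closed_form)
```

All entries of `values` coincide with `closed_form` up to floating-point error, although the underlying $\alpha_i$ differ widely.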


Conditional log likelihood for observation $i$ is

$$cll_i(\beta) = 1\{n_i = 1\} \Big( w_i \log \Lambda[(x_{i2} - x_{i1})\beta] + (1 - w_i) \log\big(1 - \Lambda[(x_{i2} - x_{i1})\beta]\big) \Big)$$

where $\Lambda$ is the logistic cdf and $w_i = 1\{y_{i1} = 0, y_{i2} = 1\}$

Same properties as the “usual” likelihood

The identification uses only the individuals who change state

Main drawback : the identification hinges on the functional form (logit) and there is no similar strategy with probit for example

There is still a conditional independence assumption for the $y_{it}$ : i.e. no serial correlation in the $u_{it}$, and no state dependence.
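
As an illustration of how this conditional log likelihood can be maximized, here is a minimal sketch for $T = 2$ and a scalar covariate, using plain Newton steps. The function name `conditional_logit_t2` and the fixed iteration count are assumptions made for the example; real applications would rely on an existing conditional logit routine.

```python
import math


def logistic(z):
    """Logistic cdf, Lambda(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def conditional_logit_t2(y1, y2, x1, x2, n_iter=25):
    """Maximise sum_i cll_i(beta) over a scalar beta by Newton steps (T = 2)."""
    # Only individuals who change state (n_i = 1) contribute: keep (x_i2 - x_i1, w_i).
    switchers = [(xb - xa, 1.0 if (ya, yb) == (0, 1) else 0.0)
                 for ya, yb, xa, xb in zip(y1, y2, x1, x2) if ya + yb == 1]
    beta = 0.0
    for _ in range(n_iter):
        grad = sum(d * (w - logistic(d * beta)) for d, w in switchers)
        info = sum(d * d * logistic(d * beta) * (1.0 - logistic(d * beta))
                   for d, _ in switchers)
        if info == 0.0:   # no switchers, or no variation in x among them
            break
        beta += grad / info
    return beta


# Toy data: x_i1 = 0 and x_i2 = 1 for everyone, so beta_hat = log(n2 / n1) = log(3).
y1 = [0, 0, 0, 1, 1, 0]
y2 = [1, 1, 1, 0, 1, 0]
beta_hat = conditional_logit_t2(y1, y2, [0.0] * 6, [1.0] * 6)
print(beta_hat)
```

The last two individuals never change state and drop out of the estimation, which is the sense in which identification uses only the switchers.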


Classical Remedies Chamberlain’s and Mundlak’s Approaches : relaxing the Random Effects assumption

Back to the RE

Relaxing the crucial RE assumption $\alpha_i \mid x_i \sim N(0, \sigma_\alpha^2)$ by specifying a special form of dependence

Mundlak (1978) : $\alpha_i \mid x_i \sim N(\psi + \bar{x}_i\xi, \sigma_a^2)$, where $\bar{x}_i$ is the time average of the $x_{it}$

Chamberlain (1980), more general form : instead of $\bar{x}_i$, he uses the vector of all explanatory variables across all time periods, $x_i$

We can use standard RE probit software by just adding all the $x_i$ to all time periods (Chamberlain), or only the $\bar{x}_i$ (Mundlak)

Restrictive in the sense that it specifies a distribution of $\alpha_i$ given $x_i$

Still strong assumptions on the distribution tails for $\alpha_i$

At least allows for some correlation

Can be extended, for instance by specifying the higher moments of the distribution of $\alpha_i \mid x_i$
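
In practice both devices amount to a data-preparation step before calling a standard RE probit routine. The sketch below shows that step only, for a balanced panel with one scalar covariate; the function names `mundlak_augment` and `chamberlain_augment` and the row layout are hypothetical choices for the example.

```python
def mundlak_augment(x_panel):
    """x_panel[i][t] = x_it. Returns rows (i, t, x_it, xbar_i): the mean is added to every period."""
    rows = []
    for i, x_i in enumerate(x_panel):
        xbar_i = sum(x_i) / len(x_i)
        for t, x_it in enumerate(x_i):
            rows.append((i, t, x_it, xbar_i))
    return rows


def chamberlain_augment(x_panel):
    """Returns rows (i, t, x_it, x_i1, ..., x_iT): the whole history is added to every period."""
    rows = []
    for i, x_i in enumerate(x_panel):
        for t, x_it in enumerate(x_i):
            rows.append((i, t, x_it, *x_i))
    return rows


print(mundlak_augment([[1.0, 3.0]]))
print(chamberlain_augment([[1.0, 3.0]]))
```

Mundlak's version adds one extra regressor per covariate, Chamberlain's adds $T$ of them, which is the price of the more general form.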


Strict exogeneity

All the previous procedures hinge on the strict exogeneity of $x_{it}$ conditional on $\alpha_i$ : $x_{it}$ independent of $u_{it'}$ at all time periods $t'$

Very difficult to correct for endogeneity in nonlinear models

But an easy test can be implemented :
- Let $w_{it}$ be a subset of $x_{it}$ which potentially fails the strict exogeneity assumption
- Include $w_{it+1}$ as an additional set of covariates
- Under the null hypothesis of strict exogeneity, the coefficients on $w_{it+1}$ should be statistically insignificant
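
The only non-standard part of this test is building the lead $w_{it+1}$, which costs the last period of each individual. A minimal sketch of that step, assuming a scalar $w_{it}$ stored as a list per individual (the name `add_lead` is hypothetical), could look like this; the augmented model is then estimated with whichever procedure is being tested, and the coefficient on the lead is tested for significance.

```python
def add_lead(w_panel):
    """w_panel[i][t] = w_it. Returns rows (i, t, w_it, w_i,t+1).

    The last period of each individual has no lead and is dropped.
    """
    rows = []
    for i, w_i in enumerate(w_panel):
        for t in range(len(w_i) - 1):
            rows.append((i, t, w_i[t], w_i[t + 1]))
    return rows


print(add_lead([[1, 2, 3], [4, 5, 6]]))
```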


Extensions


Extensions Dynamic framework

State dependence vs. unobserved heterogeneity

Dynamic framework :

$$P(y_{it} = 1 \mid y_{it-1}, \dots, y_{i0}, x_i, \alpha_i) = G(x_{it}\delta + \rho y_{it-1} + \alpha_i)$$

$x_{it}$ are supposed to be strictly exogenous, but $y_{it-1}$ appears on the RHS so we lose the strict exogeneity ($y_{it-1}$ depends on $u_{it-1}$)

Extensions of the previous approaches
- Conditional logit, cf. Chamberlain (1985, 1993), Magnac (2000), Honore and Kyriazidou (1997)
- Extension of the RE framework but raises the initial conditions problem


Conditional Logit in a dynamic framework

You need at least 4 observations per individual

Intuition : in order to make the $\alpha_i$ vanish, you need to consider the two sets of events :

$A = \{y_{i0} = d_0, y_{i1} = 0, y_{i2} = 1, y_{i3} = d_3\}$

and

$B = \{y_{i0} = d_0, y_{i1} = 1, y_{i2} = 0, y_{i3} = d_3\}$

With no other covariates, see Chamberlain (1985), Magnac (2000)

Extensions with strictly exogenous covariates, see Honore and Kyriazidou (2000)
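
Why these two events work can be checked numerically in the case without covariates, taking $G$ to be the logistic cdf. The sketch below is an illustration only (function names are hypothetical): it computes $P(A \mid A \cup B, \alpha_i)$ from the dynamic logit and compares it with $1 / (1 + e^{\rho(d_0 - d_3)})$, which follows from taking the ratio $P(A)/P(B) = e^{\rho(d_3 - d_0)}$ and does not involve $\alpha_i$.

```python
import math


def logistic(z):
    """Logistic cdf, Lambda(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def path_prob(path, alpha, rho):
    """P(y_i1, y_i2, y_i3 | y_i0, alpha_i) for path = (y_i0, y_i1, y_i2, y_i3), no covariates."""
    prob = 1.0
    for prev, cur in zip(path[:-1], path[1:]):
        p_one = logistic(rho * prev + alpha)   # P(y_it = 1 | y_it-1, alpha_i)
        prob *= p_one if cur == 1 else 1.0 - p_one
    return prob


def prob_a_given_a_or_b(d0, d3, alpha, rho):
    """P(A | A or B, alpha_i); the initial-condition term P(y_i0 = d0 | alpha_i) cancels."""
    p_a = path_prob((d0, 0, 1, d3), alpha, rho)
    p_b = path_prob((d0, 1, 0, d3), alpha, rho)
    return p_a / (p_a + p_b)


rho = 0.9   # arbitrary illustrative value
for d0 in (0, 1):
    for d3 in (0, 1):
        print(d0, d3, [prob_a_given_a_or_b(d0, d3, a, rho) for a in (-2.0, 0.0, 1.5)])
```

Only individuals with $d_0 \neq d_3$ carry information on $\rho$, since the conditional probability equals $1/2$ otherwise.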


Back to RE framework, the initial conditions problem

Form of the joint density of the observations ranging from $0$ to $T$ for an individual $i$ :

$$f(y_{i0}, y_{i1}, \dots, y_{iT} \mid \alpha_i, x_i, \beta) = \prod_{t=1}^{T} f(y_{it} \mid y_{it-1}, x_{it}, \alpha_i, \beta) \, f(y_{i0} \mid x_i, \alpha_i)$$

Goal : integrating out $\alpha_i$ in order to obtain :

$$f(y_{i0}, y_{i1}, \dots, y_{iT} \mid x_i, \beta) = \int \prod_{t=1}^{T} f(y_{it} \mid y_{it-1}, x_{it}, \alpha_i, \beta) \, f(y_{i0} \mid x_i, \alpha_i) \, g(\alpha_i \mid x_i) \, d\alpha_i$$

Initial conditions problem : specifying $f(y_{i0} \mid x_i, \alpha_i)$


Initial conditions problem : Heckman’s approach

Specify $f(y_{i0} \mid x_i, \alpha_i)$ and then specify a density for $\alpha_i$ given $x_i$

For instance, assume that $y_{i0}$ follows a probit model with success probability $\Phi(\eta + x_i\pi + \gamma\alpha_i)$

Then integrate out $\alpha_i$ by specifying for instance $\alpha_i \mid x_i \sim N(m_i, \sigma_i^2)$

Problem : it is very difficult to specify the density of $y_{i0}$ given $(x_i, \alpha_i)$

Problem : because the “true” density of $y_{i0}$ given $(x_i, \alpha_i)$ is not known and is supposed to depend on $y_{i,-1}$, estimators are biased when $T < +\infty$


Initial conditions problem : Wooldridge’s approach

Instead of working on the full density

$$f(y_{i0}, y_{i1}, \dots, y_{iT} \mid \alpha_i, x_i, \beta)$$

Wooldridge prefers to work on the conditional density

$$f(y_{i1}, \dots, y_{iT} \mid y_{i0}, \alpha_i, x_i, \beta)$$

Advantage : remaining agnostic on the density of $y_{i0}$ given $(x_i, \alpha_i)$

Then specify a density for $\alpha_i$ given $(y_{i0}, x_i)$ and keep conditioning on $y_{i0}$ in addition to $x_i$

$$f(y_{i1}, \dots, y_{iT} \mid y_{i0}, x_i, \theta) = \int_{-\infty}^{+\infty} f(y_{i1}, \dots, y_{iT} \mid y_{i0}, x_i, \alpha, \beta) \, h(\alpha \mid y_{i0}, x_i, \gamma) \, d\alpha$$

For example, with $h(\alpha \mid y_{i0}, x_i, \gamma)$ the density of $N(\psi + \xi_0 y_{i0} + x_i\xi, \sigma_a^2)$

$$y_{it} = 1\{\psi + x_{it}\delta + \rho y_{it-1} + \xi_0 y_{i0} + x_i\xi + a_i + e_{it} > 0\}$$

We can use standard RE probit software by just adding $y_{i0}$ and $x_i$ to all time periods
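
As with Chamberlain's device, this approach comes down to preparing an augmented data set and then running an ordinary RE probit. Here is a minimal sketch of that preparation, assuming a balanced panel, a scalar covariate and a hypothetical row layout; the function name `wooldridge_rows` is made up for the example.

```python
def wooldridge_rows(y_panel, x_panel):
    """Build regressors for the dynamic RE probit conditional on y_i0.

    y_panel[i] = [y_i0, y_i1, ..., y_iT]; x_panel[i] = [x_i1, ..., x_iT].
    Returns rows (i, t, y_it, x_it, y_i,t-1, y_i0, x_i1, ..., x_iT) for t = 1..T.
    """
    rows = []
    for i, (y_i, x_i) in enumerate(zip(y_panel, x_panel)):
        y_i0 = y_i[0]
        for t in range(1, len(y_i)):
            # x_i[t - 1] is x_it because the covariate list starts at period 1.
            rows.append((i, t, y_i[t], x_i[t - 1], y_i[t - 1], y_i0, *x_i))
    return rows


print(wooldridge_rows([[1, 0, 1]], [[0.5, 2.0]]))
```

Period 0 is used only as a conditioning variable: it supplies $y_{i0}$ and the first lag, but contributes no row of its own.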


Extensions Semi-Parametric approach

Reminder on Manski’s approach in cross section (1988)

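Model $y_i = 1\{x_i\beta + \varepsilon_i > 0\}$

Until now, the conditional density $f(\varepsilon \mid x_i)$ was specified

Can we relax this assumption ?
- $E(\varepsilon \mid X) = 0$ is not enough to identify $\beta$ (Manski, 1988)
- $\mathrm{med}(\varepsilon \mid X) = 0$ allows identification of $\beta / \|\beta\|$ under one more technical assumption concerning $X$ : there must be one continuous variable $X_k$, s.t. the density of $X_k \mid X_{-k}$ is positive everywhere a.s.

$$\beta_0 = \arg\max_{\beta} E\big((2Y - 1) 1\{X'\beta > 0\}\big)$$

$$\hat{\beta}_{MS} \in \arg\max_{\beta} \sum_{i=1}^{n} Y_i 1\{X_i'\beta \ge 0\} + (1 - Y_i) 1\{X_i'\beta < 0\}$$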
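
The sample objective simply counts correct sign predictions, so it is a step function of $\beta$ and cannot be maximized by gradient methods. The sketch below illustrates this with two covariates and a brute-force search over unit vectors; the angular grid, the function names and the toy data are assumptions made for the example, not a recommended algorithm.

```python
import math


def max_score_objective(X, Y, b):
    """Number of observations whose outcome is correctly predicted by the sign of x_i'b."""
    score = 0
    for x_i, y_i in zip(X, Y):
        index = sum(x_k * b_k for x_k, b_k in zip(x_i, b))
        score += y_i if index >= 0 else 1 - y_i
    return score


def max_score_two_covariates(X, Y, n_angles=360):
    """Grid search over b = (cos a, sin a), i.e. under the normalisation ||b|| = 1."""
    best_b, best_score = None, -1
    for k in range(n_angles):
        a = 2.0 * math.pi * k / n_angles
        b = (math.cos(a), math.sin(a))
        s = max_score_objective(X, Y, b)
        if s > best_score:   # ties are broken by keeping the first maximiser found
            best_b, best_score = b, s
    return best_b, best_score


# Noiseless toy data generated with beta = (1, 0): y_i = 1{x_i1 >= 0}.
X = [(1.0, 2.0), (-1.0, 0.5), (0.5, -3.0), (-2.0, -1.0)]
Y = [1, 0, 1, 0]
b_hat, score_hat = max_score_two_covariates(X, Y)
print(b_hat, score_hat)
```

Rescaling `b` by any positive constant leaves the objective unchanged, which is why only $\beta / \|\beta\|$ can be identified, and the maximiser is in general a set rather than a point.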


$$\hat{\beta}_{MS} \xrightarrow{P} \beta_0$$

$$n^{1/3}\big(\hat{\beta}_{MS} - \beta_0\big) \xrightarrow{L} D$$

See Kim and Pollard (1990) for the exact definition of D


Extensions to panel data

See Honore and Kyriazidou (1997) :

Extension to dynamic panel data with exogenous covariates

$$P(y_{i0} = 1 \mid x_i, \alpha_i) = p_0(x_i, \alpha_i)$$

$$P(y_{it} = 1 \mid x_i, \alpha_i, y_{i0}, \dots, y_{it-1}) = F(x_{it}\beta + \gamma y_{it-1} + \alpha_i)$$

with $T = 4$, $\beta$ and $\gamma$ may be estimated by maximizing w.r.t. $b$ and $g$

$$\sum_{i=1}^{n} 1\{x_{i2} - x_{i3} = 0\} (y_{i2} - y_{i1}) \, \mathrm{sgn}\big((x_{i2} - x_{i1}) b + g (y_{i3} - y_{i0})\big)$$
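
The objective above can be written down directly. The sketch below evaluates it for given $(b, g)$ with a scalar covariate; the function names, the data layout and the convention $\mathrm{sgn}(0) = 0$ are assumptions made for the example, and the maximization itself (again a step function) is not shown.

```python
def sign(z):
    """sgn(z) with the convention sgn(0) = 0 (an assumption of this sketch)."""
    return (z > 0) - (z < 0)


def hk_objective(y, x, b, g):
    """Evaluate the objective for y[i] = (y_i0, y_i1, y_i2, y_i3), x[i] = (x_i1, x_i2, x_i3)."""
    total = 0
    for (y0, y1, y2, y3), (x1, x2, x3) in zip(y, x):
        if x2 - x3 == 0:   # indicator 1{x_i2 - x_i3 = 0}
            total += (y2 - y1) * sign((x2 - x1) * b + g * (y3 - y0))
    return total


y = [(0, 0, 1, 1), (1, 1, 0, 0), (0, 1, 0, 1)]
x = [(0.0, 1.0, 1.0), (0.0, 1.0, 1.0), (0.0, 1.0, 2.0)]
print(hk_objective(y, x, b=1.0, g=3.0), hk_objective(y, x, b=1.0, g=-3.0))
```

Individuals with $y_{i1} = y_{i2}$ contribute nothing, and neither do those with $x_{i2} \neq x_{i3}$, such as the third individual in the toy data.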
