cds m phil econometrics limited dependent variable models ... · 10/05/2011 · limited dependent...


CDS M Phil Econometrics
Vijayamohanan Pillai N
3-Mar-14

Limited Dependent Variable Models: Tobit

Introduction

Limited Dependent Variable Models: Truncation and Censoring

Maddala, G. 1983. Limited Dependent and Qualitative Variables in Econometrics. Cambridge University Press.

Truncation

A truncated distribution is the part of an untruncated distribution that is above or below some specified value.

If a continuous random variable x has pdf f(x) and a is a constant, then the density of the truncated RV is

$$f(x \mid x > a) = \frac{f(x)}{\operatorname{Prob}(x > a)}$$
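The definition can be checked numerically. The following is a minimal sketch for the standard normal case, using only the Python standard library; the function names and the simple midpoint integrator are illustrative choices, not part of the lecture. It evaluates f(x | x > a) and confirms that the truncated density integrates to one for the three truncation points used in the next figure.

```python
import math

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def truncated_pdf(x, a):
    """Density of a standard normal truncated from below at a: f(x | x > a)."""
    if x <= a:
        return 0.0
    return norm_pdf(x) / (1.0 - norm_cdf(a))

def integrate(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

for a in (-0.5, 0.0, 0.5):
    total = integrate(lambda x: truncated_pdf(x, a), a, 10.0)
    print(f"a = {a:+.1f}: integral of truncated density = {total:.6f}")
```

Dividing by Prob(x > a) is what rescales the surviving part of the density so that it is again a proper density.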

[Figure: Truncated standard normal distribution for a = −0.5, 0, and 0.5]

Truncation

Truncation occurs when some observations on both the dependent variable and the regressors are lost.

For example, income may be the dependent variable and only low-income people are included in the sample.

In effect, truncation occurs when the sample data are drawn from a subset of a larger population.

Censoring

Censoring occurs when the value of an observation is only partially known.

One of the earliest attempts to analyse a statistical problem involving censored data was Daniel Bernoulli's 1766 analysis of smallpox morbidity and mortality data to demonstrate the efficacy of vaccination.

Censored Regression Model

Censoring occurs when data on the dependent variable are lost (or limited) but not data on the regressors.

When the dependent variable is censored, values in a certain range are all transformed to (or reported as) a single value.

For example, people of all income levels may be included in the sample, but for some reason the income of high-income people may be top-coded as, say, Rs 100,000.

Censoring of this kind is a defect in the sample.

An Example

A labor supply model estimates the relationship between hours worked by employees and characteristics of employees such as age, education and family status.

For people who are unemployed, it is not possible to observe the number of hours they would have worked had they had employment.

Still, we know age, education and family status for those observations.

Another Example

Suppose we are interested in finding out the amount of money a household (HH) spends on a house in relation to socio-economic variables.

Many HHs may not have purchased a house: expenditure is zero for them.

Another Example

• Suppose we are interested in studying how much an individual desired to give to charity.
• For many people the amount we observe is zero, i.e. they give nothing to charity.
• For others, we observe the actual amount they contributed.

Censored & Truncated Regression Model

Truncated regression models are used for data where whole observations are missing, so that the values of both the dependent and the independent variables are unknown.

Censored regression models are used for data where only the value of the dependent variable (hours of work, for example) is unknown, while the values of the independent variables (age, education, family status) are still available.
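The difference between the two kinds of sample can be made concrete with a small simulation. This is only a sketch: the latent labor-supply equation, its coefficients, and the variable names are hypothetical values chosen for illustration, not figures from the lecture.

```python
import random

random.seed(0)  # arbitrary seed, used only to make the run reproducible

# Hypothetical latent model: hours* = -10 + 2.5 * education + u, u ~ N(0, 10^2)
population = []
for _ in range(1000):
    education = random.uniform(8, 20)
    hours_star = -10 + 2.5 * education + random.gauss(0, 10)
    population.append((education, hours_star))

# Censored sample: every person is kept, but latent hours <= 0 are recorded as 0.
# The regressor (education) is still observed for those people.
censored = [(edu, max(h, 0.0)) for edu, h in population]

# Truncated sample: people with latent hours <= 0 are dropped entirely,
# so neither their hours nor their education is observed.
truncated = [(edu, h) for edu, h in population if h > 0]

n_at_limit = sum(1 for _, h in censored if h == 0.0)
print(len(censored), "rows in the censored sample,", n_at_limit, "recorded at the limit 0")
print(len(truncated), "rows in the truncated sample")
```

The censored sample keeps the full set of regressor values and piles the limited observations up at a single value; the truncated sample is simply smaller.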

Tobit Model

Censored regression or truncated regression?

The original Tobit model was suggested by James Tobin (1918–2002).

Some examples in the empirical literature

Empirical applications typically analyze a dependent variable that is zero for a significant fraction of the observations.

Tobit Model

• The structural equation in the Tobit model is:

$$y_i^* = x_i\beta + u_i$$

• where u_i ∼ N(0, σ²).

y* is a latent variable that is observed for values greater than τ and censored otherwise.

• The observed y is defined by the following measurement equation, where τ_y is the value recorded for censored observations:

$$y_i = \begin{cases} y_i^* & \text{if } y_i^* > \tau \\ \tau_y & \text{if } y_i^* \le \tau \end{cases}$$

• In the typical Tobit model, we assume that τ = 0, i.e. the data are censored at 0.

• Thus, we have

$$y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0 \end{cases}$$

Tobit Model

$$y_i^* = x_i\beta + u_i, \qquad u_i \sim N(0, \sigma^2)$$

$$y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0 \end{cases}$$

This model contains a Probit model for y_i being zero or positive and a standard Regression model for the positive values of y_i.

The Probit model may, for example, describe the influence of explanatory variables on the decision whether or not to donate to charity, while the Regression model measures the effect of the explanatory variables on the size of the amount for donating individuals.
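The two components can be written out explicitly. As a sketch under the model stated above (latent equation with normal errors and censoring at 0), the probability of a positive outcome has the probit form, and the mean of the positive outcomes is the regression function plus a correction term:

$$P(y_i > 0 \mid x_i) = P(u_i > -x_i\beta) = \Phi\!\left(\frac{x_i\beta}{\sigma}\right)$$

$$E[y_i \mid y_i > 0, x_i] = x_i\beta + \sigma\,\frac{\phi(x_i\beta/\sigma)}{\Phi(x_i\beta/\sigma)}$$

Note that the probit part by itself identifies only the ratio β/σ, whereas the positive observations carry the information on the scale σ.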

Why Use the Tobit Model?

• Why not just use the observations for which y > 0 and estimate the model using OLS?
• The answer: if you do, your parameter estimates will be biased and inconsistent.
• The degree of bias will also increase as the number of observations that take on the value of zero increases.

Why Use the Tobit Model?

$$y_i^* = x_i\beta + u_i, \qquad u_i \sim N(0, \sigma^2)$$

$$y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0 \end{cases}$$

Neglecting the truncation can lead to biased estimates of α and β:

$$E[y_i \mid y_i > 0, x_i] = x_i\beta + \sigma\,\frac{\phi(x_i\beta/\sigma)}{\Phi(x_i\beta/\sigma)}$$

Why Use the Tobit Model?

$$E[y \mid \text{truncation}] = \mu + \sigma\lambda(\alpha)$$

$$E[y_i \mid y_i > 0, x_i] = x_i\beta + \sigma\,\frac{\phi(x_i\beta/\sigma)}{\Phi(x_i\beta/\sigma)}$$

The last term on the RHS, σλ(α), contains the inverse Mills ratio (hazard function) for the standard normal distribution. Here φ is the pdf and Φ is the cdf, so that Φ(x_iβ/σ) = P(y_i > 0).

Inverse Mills Ratio

Named after John P. Mills, the inverse Mills ratio is the ratio of the probability density function to the cumulative distribution function of a distribution.

If x is a random variable distributed normally with mean µ and variance σ², then

$$E[x \mid x > \alpha] = \mu + \sigma\,\frac{\phi\!\left(\frac{\alpha-\mu}{\sigma}\right)}{1 - \Phi\!\left(\frac{\alpha-\mu}{\sigma}\right)} = \mu + \sigma\,\frac{\phi(z)}{\Phi(z)}, \qquad z = \frac{\mu - \alpha}{\sigma},$$

where α is a constant, φ denotes the standard normal pdf, and Φ is the standard normal cdf. The second equality uses the symmetry of the standard normal distribution: φ(−z) = φ(z) and 1 − Φ(−z) = Φ(z).
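The truncated-mean formula can be verified against direct numerical integration. The sketch below uses only the Python standard library; the helper names and the example values of µ, σ and α are illustrative assumptions, not taken from the slides.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z):
    """Inverse Mills ratio: lambda(z) = phi(z) / Phi(z)."""
    return norm_pdf(z) / norm_cdf(z)

def truncated_mean(mu, sigma, alpha):
    """E[x | x > alpha] for x ~ N(mu, sigma^2), with z = (mu - alpha) / sigma."""
    z = (mu - alpha) / sigma
    return mu + sigma * inverse_mills(z)

def truncated_mean_numeric(mu, sigma, alpha, n=100000):
    """The same quantity by midpoint-rule integration, as an independent check.
    Assumes alpha lies below mu + 12 * sigma, where the upper tail is cut off."""
    hi = mu + 12.0 * sigma
    h = (hi - alpha) / n
    num = den = 0.0
    for i in range(n):
        x = alpha + (i + 0.5) * h
        w = norm_pdf((x - mu) / sigma)
        num += x * w
        den += w
    return num / den

# Hypothetical example values
mu, sigma, alpha = 1.0, 2.0, 0.5
print("closed form:", truncated_mean(mu, sigma, alpha))
print("numerical  :", truncated_mean_numeric(mu, sigma, alpha))
```

In the Tobit setting the same function applies with µ = x_iβ and α = 0, which gives z = x_iβ/σ.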

Why Use the Tobit Model?

• Consider, for example, the amount a person gives to charity.
• Suppose the true relationship between the amount a person wants to give to charity and that person's income is as shown in the figure.

[Figure: desired donations plotted against income; the red line indicates the true regression line for the relationship between income and donations]

The lower-income people would actually like to give negative amounts (i.e. get money back!).

Why Use the Tobit Model?

In reality, we do not observe individuals making negative contributions. What we observe is that they give nothing.

The observed data look like this:

[Figure: observed donations plotted against income]

Why Use the Tobit Model?

If we simply estimated the model by OLS, the parameter estimates would be biased downwards: OLS tends to underestimate the magnitude of the slope.

[Figure: the true relationship compared with the OLS regression line]
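The attenuation of the OLS slope can be reproduced in a small simulation of the charity example. This is a sketch under assumed values: the intercept, slope, error standard deviation and income range below are hypothetical, and the `ols` helper is a plain one-regressor least-squares fit written for this illustration.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of a simple least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(1)  # arbitrary seed, for reproducibility only
ALPHA, BETA, SIGMA = -20.0, 1.0, 10.0  # hypothetical true parameters

income = [random.uniform(0, 50) for _ in range(20000)]
desired = [ALPHA + BETA * inc + random.gauss(0, SIGMA) for inc in income]  # latent y*
donated = [max(d, 0.0) for d in desired]                                   # observed y

_, slope_latent = ols(income, desired)    # infeasible benchmark: uses the unobserved y*
_, slope_censored = ols(income, donated)  # OLS on all observations, zeros included
positive = [(inc, d) for inc, d in zip(income, donated) if d > 0]
_, slope_positive = ols([p[0] for p in positive], [p[1] for p in positive])  # y > 0 only

print(f"slope using latent y*       : {slope_latent:.3f}")
print(f"slope using censored y      : {slope_censored:.3f}")
print(f"slope using only y > 0 rows : {slope_positive:.3f}")
```

Both feasible OLS fits, with the zeros included and with the zeros discarded, give a slope below the true value used to generate the data.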

Interpreting Tobit Estimates

Interpreting Tobit estimates is a bit more complex than interpreting estimated coefficients from the OLS model.

In particular, the estimated coefficients represent the marginal effect of x on the latent variable y*, not on the observed variable y. That is:

$$\frac{\partial E[y_i^* \mid x]}{\partial x_k} = \beta_k$$

Interpreting Tobit Estimates

What we want to explain is the observed amount of charitable contributions, not the desired amount of charitable contributions. Thus, what we want is the expected value of y conditional on y being greater than zero: E[y | y > 0, x].

That is:

$$E[y_i \mid y_i > 0, x_i] = x_i\beta + \sigma\,\frac{\phi(x_i\beta/\sigma)}{\Phi(x_i\beta/\sigma)} = x_i\beta + \sigma\,\frac{\phi(z)}{\Phi(z)}, \qquad z = \frac{x_i\beta}{\sigma}$$

The desired marginal effects are the derivative of this function with respect to x.

We have

$$E(y) = E(y \mid y > 0)\, P(y > 0)$$

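Interpreting Tobit Estimates

With z = x_iβ/σ, we have E[y_i | y_i > 0, x_i] = x_iβ + σφ(z)/Φ(z) and P(y_i > 0) = Φ(z), so that

$$E(y) = \left[x_i\beta + \sigma\,\frac{\phi(z)}{\Phi(z)}\right]\Phi(z) = x_i\beta\,\Phi(z) + \sigma\,\phi(z)$$

and therefore

$$\frac{\partial E(y)}{\partial x_k} = \beta_k\,\Phi(z)$$

in contrast to the marginal effect on the latent variable:

$$\frac{\partial E[y_i^* \mid x]}{\partial x_k} = \beta_k$$

Recall the density of a truncated random variable:

$$f(x \mid x > a) = \frac{f(x)}{\operatorname{Prob}(x > a)}$$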
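The result that the marginal effect on the observed y is the coefficient scaled by Φ(z) can be checked with a numerical derivative. The sketch below uses hypothetical parameter values and standard-library helpers written for this illustration; it compares the closed form against a central finite difference of E(y).

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_y(xb, sigma):
    """Unconditional mean of observed y: E(y) = xb * Phi(z) + sigma * phi(z), z = xb / sigma."""
    z = xb / sigma
    return xb * norm_cdf(z) + sigma * norm_pdf(z)

def marginal_effect(beta_k, xb, sigma):
    """Closed-form marginal effect on observed y: beta_k * Phi(xb / sigma)."""
    return beta_k * norm_cdf(xb / sigma)

# Hypothetical values: intercept -20, slope 1, sigma 10, evaluated at x = 25
alpha, beta, sigma, x = -20.0, 1.0, 10.0, 25.0
xb = alpha + beta * x

analytic = marginal_effect(beta, xb, sigma)

# Central finite difference of E(y) with respect to x
h = 1e-5
numeric = (expected_y(alpha + beta * (x + h), sigma)
           - expected_y(alpha + beta * (x - h), sigma)) / (2.0 * h)

print("beta_k * Phi(z)      :", analytic)
print("numerical derivative :", numeric)
```

Because Φ(z) lies between 0 and 1, the effect on the observed variable is always smaller in magnitude than the coefficient itself, and it approaches the coefficient as the probability of being uncensored approaches one.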

Estimation

Method of maximum likelihood: Olsen's (1978) reparameterization simplifies ML estimation.

James Heckman has proposed a simple alternative to the ML method:

J. J. Heckman, "Sample Selection Bias as a Specification Error," Econometrica, vol. 47, 1979, pp. 153–161.
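To make the ML approach concrete, the following sketch writes down the log-likelihood of the Tobit model censored at 0 for a single regressor. Uncensored observations contribute a normal density term, and censored observations contribute the probability P(y* ≤ 0) = Φ(−x_iβ/σ). The function and variable names, and the four data points, are hypothetical; the code uses the original (α, β, σ) parameterization rather than Olsen's, and it does not perform the maximization itself.

```python
import math

def norm_logpdf(z):
    """Log of the standard normal density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(params, x, y):
    """Log-likelihood of a one-regressor Tobit model censored at 0.

    params = (alpha, beta, sigma), with sigma > 0.
    """
    alpha, beta, sigma = params
    ll = 0.0
    for xi, yi in zip(x, y):
        xb = alpha + beta * xi
        if yi > 0:
            # Uncensored observation: density of y_i given x_i
            ll += norm_logpdf((yi - xb) / sigma) - math.log(sigma)
        else:
            # Censored observation: probability that the latent y* is <= 0
            ll += math.log(norm_cdf(-xb / sigma))
    return ll

# Hypothetical data: two censored and two uncensored observations
x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 0.0, 1.5, 3.0]
print(tobit_loglik((-3.0, 1.5, 1.0), x, y))
print(tobit_loglik((0.0, 0.5, 1.0), x, y))
```

An ML routine would search over (α, β, σ) for the values that make this function as large as possible; any general-purpose numerical optimizer could be used for that step.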

Heckman Alternative

Consists of a two-step estimating procedure:

Step 1: estimate the probability of, say, a consumer owning a house, on the basis of the probit model.

Step 2: estimate the model

$$y_i^* = x_i\beta + u_i, \qquad y_i = \begin{cases} y_i^* & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0 \end{cases}$$

by adding to it the inverse Mills ratio (IMR), or the hazard rate, that is derived from the probit estimate:

$$E[y_i \mid y_i > 0, x_i] = x_i\beta + \sigma\,\underbrace{\frac{\phi(x_i\beta/\sigma)}{\Phi(x_i\beta/\sigma)}}_{\text{IMR}}$$

The Heckman procedure yields consistent estimates of the parameters, but they are not as efficient as the ML estimates.
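As a sketch of how the two steps fit together for the Tobit model above, the procedure can be written as follows. The indicator d_i and the symbols γ, λ̂_i and e_i are notation introduced here for illustration and do not appear in the slides.

$$\text{Step 1 (probit on all observations):} \quad d_i = \mathbf{1}[y_i > 0], \qquad P(d_i = 1 \mid x_i) = \Phi(x_i\gamma), \qquad \gamma = \beta/\sigma$$

$$\text{Step 2 (OLS on observations with } y_i > 0\text{):} \quad y_i = x_i\beta + \sigma\,\hat{\lambda}_i + e_i, \qquad \hat{\lambda}_i = \frac{\phi(x_i\hat{\gamma})}{\Phi(x_i\hat{\gamma})}$$

The coefficient on the estimated inverse Mills ratio in the second step serves as an estimate of σ, and including that regressor is what removes the bias that plain OLS on the positive observations would suffer.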

An Example of Tobit Model

[Figure: scatter plot of Extramarital (vertical axis) against Education (horizontal axis)]

Of the 601 responses, 451 individuals had no extramarital affairs, and 150 individuals had one or more affairs.

451 observations lie along the horizontal axis ⇒ a censored sample ⇒ a Tobit model may be appropriate.

[Slide: Ray Fair Model: OLS]

[Slide: Ray Fair Model: Tobit]

[Slide: Compare OLS & Tobit estimates]

[Slide: Tobit: Coefficient Estimates & Marginal Effects. Compare.]

Tobit in Stata

Statistics > Linear models & related > Censored regression > Tobit regression

Also

Statistics > Postestimation > (1) Tests: Test parameters; (2) Marginal effects or elasticities