
Truncation and Censoring

Laura Magazzini

[email protected]

Laura Magazzini (@univr.it) Truncation and Censoring 1 / 35


Truncation and censoring

Truncation: sample data are drawn from a subset of a larger population of interest

. Characteristic of the distribution from which the sample data are drawn

. Example: studies of income based on incomes above or below the poverty line (of limited usefulness for inference about the whole population)

Censoring: values of the dependent variable in a certain range are all transformed to (or reported at) a single value

. Defect in the sample data

. Example: in studies of income, people below the poverty line are reported at the poverty line

Truncation and censoring introduce similar distortions into conventional statistical results


Truncation

Aim: infer the characteristics of a full population from a sample drawn from a restricted population

. Example: characteristics of people with income above $100,000

Let Y be a continuous random variable with pdf f(y). The conditional distribution of y given y > a (with a a constant) is:

$$f(y \mid y > a) = \frac{f(y)}{\Pr(y > a)}$$

If y is normally distributed with mean µ and standard deviation σ:

$$f(y \mid y > a) = \frac{\frac{1}{\sigma}\,\phi\!\left(\frac{y-\mu}{\sigma}\right)}{1 - \Phi(\alpha)}$$

where α = (a − µ)/σ


Moments of truncated distributions

E(Y | y < a) < E(Y)

E(Y | y > a) > E(Y)

V(Y | truncation) < V(Y)


Moments of the truncated normal distribution

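Let y ∼ N(µ, σ²) and let a be a constant

$$E(y \mid \text{truncation}) = \mu + \sigma\lambda(\alpha)$$

$$\mathrm{Var}(y \mid \text{truncation}) = \sigma^2\,[1 - \delta(\alpha)]$$

. α = (a − µ)/σ

. φ(α) is the standard normal density

. λ(α) is called the inverse Mills ratio:

$$\lambda(\alpha) = \frac{\phi(\alpha)}{1 - \Phi(\alpha)} \ \text{ if truncation is } y > a, \qquad \lambda(\alpha) = -\frac{\phi(\alpha)}{\Phi(\alpha)} \ \text{ if truncation is } y < a$$

. δ(α) = λ(α)[λ(α) − α], where 0 < δ(α) < 1 for any α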
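The moment formulas above can be checked numerically. The following is a minimal sketch, assuming NumPy and SciPy are available; the function name `truncated_normal_moments` and the parameter values are illustrative choices, not part of the slides. It compares the inverse-Mills-ratio formulas with the moments reported by `scipy.stats.truncnorm`.

```python
import numpy as np
from scipy.stats import norm, truncnorm


def truncated_normal_moments(mu, sigma, a, truncation="below"):
    """Mean and variance of y ~ N(mu, sigma^2) truncated at a.

    truncation="below" keeps y > a; truncation="above" keeps y < a.
    """
    alpha = (a - mu) / sigma
    if truncation == "below":
        lam = norm.pdf(alpha) / norm.sf(alpha)    # phi / (1 - Phi)
    else:
        lam = -norm.pdf(alpha) / norm.cdf(alpha)  # -phi / Phi
    delta = lam * (lam - alpha)
    return mu + sigma * lam, sigma**2 * (1.0 - delta)


# Illustrative values (assumptions, not taken from the slides)
mu, sigma, a = 1.0, 2.0, 2.5
mean_gt, var_gt = truncated_normal_moments(mu, sigma, a, "below")
mean_lt, var_lt = truncated_normal_moments(mu, sigma, a, "above")

# Reference values: truncnorm takes the bounds in standardized units
alpha = (a - mu) / sigma
ref_gt = truncnorm(alpha, np.inf, loc=mu, scale=sigma)
ref_lt = truncnorm(-np.inf, alpha, loc=mu, scale=sigma)

print(mean_gt, ref_gt.mean(), var_gt, ref_gt.var())
print(mean_lt, ref_lt.mean(), var_lt, ref_lt.var())
```

In both cases the truncated variance is below σ², and the truncated mean moves away from µ in the direction of the retained tail.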


Example: a truncated log-normal income distribution

From the New York Post (1987): “The typical upper affluent American... makes $142,000 per year... The people surveyed had household income of at least $100,000”

. Does this tell us anything about the typical American?

. “... only 2 percent of Americans make the grade”

. Degree of truncation in the sample: 98%

. The $142,000 is probably quite far from the mean in the full population

Assuming lognormally distributed income in the population (the log of income has a normal distribution), the information can be employed to deduce the population mean income

Let x = income (in thousands of dollars) and y = ln x

$$E[y \mid y > \log 100] = \mu + \frac{\sigma\,\phi(\alpha)}{1 - \Phi(\alpha)}$$

By substituting the resulting µ and σ into $E[x] = E[e^{y}] = e^{\mu + \sigma^2/2}$, we get E[x] = $22,087

. The 1987 Statistical Abstract of the US listed average household income of about $25,000 (a relatively good estimate based on little information!)
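The slide does not spell out the intermediate steps, so the following is only one plausible reconstruction. It assumes that the 2 percent figure pins down α through Pr(y > ln 100) = 0.02, and that ln 142 is treated as an approximation to the truncated mean of log income. Both assumptions are mine; they give two linear equations in µ and σ.

```python
import numpy as np
from scipy.stats import norm

share_above = 0.02   # "only 2 percent of Americans make the grade"
cutoff = 100.0       # truncation point, thousands of dollars
reported = 142.0     # reported income in the truncated sample, thousands

alpha = norm.ppf(1.0 - share_above)      # from Pr(y > ln 100) = 0.02
lam = norm.pdf(alpha) / share_above      # inverse Mills ratio phi / (1 - Phi)

# Two equations in (mu, sigma):
#   ln(cutoff)   = mu + alpha * sigma    (definition of alpha)
#   ln(reported) = mu + lam * sigma      (ASSUMPTION: ln 142 ~ E[y | y > ln 100])
sigma = (np.log(reported) - np.log(cutoff)) / (lam - alpha)
mu = np.log(cutoff) - alpha * sigma

mean_income = np.exp(mu + sigma**2 / 2.0)   # lognormal mean, thousands of dollars
print(alpha, mu, sigma, mean_income)
```

Under these assumptions the implied population mean is roughly 22 thousand dollars, in line with the figure quoted on the slide; small differences can arise from rounding of the intermediate values.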


The truncated regression model

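$$y_i^* = x_i'\beta + \varepsilon_i, \qquad \varepsilon_i \mid x_i \sim N(0, \sigma^2)$$

Unit i is observed only if $y_i^*$ crosses a threshold:

$$y_i = \begin{cases} \text{n.a.} & \text{if } y_i^* \le a \\ y_i^* & \text{if } y_i^* > a \end{cases}$$

$$E[y_i \mid y_i^* > a] = x_i'\beta + \sigma\lambda(\alpha_i), \qquad \text{with } \alpha_i = (a - x_i'\beta)/\sigma$$

The marginal effect in the subpopulation is:

$$\frac{\partial E[y_i \mid y_i^* > a]}{\partial x_i} = \beta + \sigma\,\frac{d\lambda(\alpha_i)}{d\alpha_i}\,\frac{\partial \alpha_i}{\partial x_i} = \dots = \beta\,(1 - \delta(\alpha_i))$$

. Since 0 < δ(αi) < 1, the marginal effect in the subpopulation is smaller in absolute value than the corresponding coefficient

. If the interest is in the linear relationship between y* and x (population), then β can be directly interpreted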


Estimation

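OLS of y on x leads to inconsistent estimates

. The model is

$$y_i \mid y_i^* > a \;=\; E(y_i \mid y_i^* > a) + \varepsilon_i \;=\; x_i'\beta + \sigma\lambda(\alpha_i) + \varepsilon_i$$

. By construction, the error term is heteroskedastic

. Omitted variable bias (λi is not included in the regression)

. In applications, it is usually found that the OLS estimates are biased toward zero

Under the normality assumption, MLE can be obtained

. The density of an observed value is

$$f(y \mid y > a) = \frac{\frac{1}{\sigma}\,\phi\!\left(\frac{y-\mu}{\sigma}\right)}{1 - \Phi(\alpha)}, \qquad \text{with } \alpha = \frac{a-\mu}{\sigma}$$

. The log-likelihood can be written as

$$\log L = \sum_{i=1}^{N} \log\left[\sigma^{-1}\phi\!\left(\frac{y_i - x_i'\beta}{\sigma}\right)\right] - \sum_{i=1}^{N} \log\left[1 - \Phi\!\left(\frac{a - x_i'\beta}{\sigma}\right)\right]$$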
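The log-likelihood above can be maximized numerically. The sketch below is illustrative only: it simulates data from Y* = −1.5 + 0.5x + ε/2 with truncation at a = 0, but the distribution of x (uniform on [0, 6]), the sample size and the function name `truncreg_negloglik` are assumptions made here, and σ is parameterized through log σ to keep it positive.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def truncreg_negloglik(params, y, X, a):
    """Negative log-likelihood of the truncated regression (observed if y* > a)."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)
    xb = X @ beta
    ll = (norm.logpdf((y - xb) / sigma) - log_sigma   # log[sigma^-1 phi(.)]
          - norm.logsf((a - xb) / sigma))             # - log[1 - Phi(.)]
    return -ll.sum()


rng = np.random.default_rng(0)
n = 5000                                    # larger than the slide's N = 100
x = rng.uniform(0.0, 6.0, n)                # assumed regressor distribution
y_star = -1.5 + 0.5 * x + 0.5 * rng.standard_normal(n)

keep = y_star > 0.0                         # truncation: unit dropped if y* <= 0
X = np.column_stack([np.ones(keep.sum()), x[keep]])
y = y_star[keep]

ols = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS on the truncated sample
start = np.append(ols, np.log(y.std()))
res = minimize(truncreg_negloglik, start, args=(y, X, 0.0), method="BFGS")
beta_hat, sigma_hat = res.x[:-1], np.exp(res.x[-1])

print("OLS :", ols)
print("MLE :", beta_hat, sigma_hat)
```

In this setup the OLS slope on the truncated sample is attenuated toward zero, while the MLE slope should be close to the true value of 0.5.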


Example: simulated data

Y* = −1.5 + 0.5x + ε/2, N = 100, a = 0 (figure not reproduced in the transcript)


Censored data

Censored regression models generally apply when the variable to be explained is partly continuous but has positive probability mass at one or more points

Assume there is a variable with quantitative meaning y* and we are interested in E[y* | x]

If y* and x were observed for everyone in the population, standard regression methods (ordinary or nonlinear least squares) could be applied

In the case of censored data, y* is not observable for part of the population

. Conventional regression methods fail to account for the qualitative difference between limit (censored) and nonlimit (continuous) observations

. Two cases: top coding / corner solution outcome


Top coding: example
Data generating process

Let wealth* denote actual family wealth, measured in thousands of dollars

Suppose that wealth* follows the linear regression model E[wealth* | x] = x′β

Censored data: we observe wealth only when wealth* < 200

. When wealth* is greater than 200 we know that it is, but we do not know the actual value of wealth

Therefore observed wealth can be written as

wealth = min(wealth*, 200)

Top coding: example
Estimation of β

We assume that wealth* given x has a homoskedastic normal distribution

wealth* = x′β + ε, ε | x ∼ N(0, σ²)

Recorded wealth is: wealth = min(wealth*, 200) = min(x′β + ε, 200)

β is estimated via maximum likelihood using a mixture of discrete and continuous distributions (details later...)


Example: seats demanded and tickets sold


Corner solution outcomes

Still labeled “censored regression models”

Pioneering work by Tobin (1958): household purchase of durable goods

Let y be an observable choice or outcome describing some economic agent, such as an individual or a firm, with the following characteristics: y takes on the value zero with positive probability but is a continuous random variable over strictly positive values

. Examples: amount of life insurance coverage chosen by an individual, family contributions to an individual retirement account, and firm expenditures on research and development

. We can imagine economic agents solving an optimization problem, and for some agents the optimal choice will be the corner solution, y = 0

. The issue here is not data observability but individual behaviour

. We are interested in features of the distribution of y given x, such as E[y | x] and Pr(y = 0 | x)


The censored normal distribution

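$$y^* \sim N(\mu, \sigma^2)$$

Observed data are censored at a = 0:

$$y = \begin{cases} 0 & \text{if } y^* \le 0 \\ y^* & \text{if } y^* > 0 \end{cases}$$

The distribution is a mixture of a discrete and a continuous distribution

. If y* ≤ 0: f(y) = Pr(y = 0) = Pr(y* ≤ 0) = Φ(−µ/σ) = 1 − Φ(µ/σ)

. If y* > 0: $f(y) = \frac{1}{\sigma}\,\phi\!\left(\frac{y-\mu}{\sigma}\right)$

$$E[y] = 0 \times \Pr(y = 0) + E[y \mid y > 0] \times \Pr(y > 0) = (\mu + \sigma\lambda)\,\Phi\!\left(\frac{\mu}{\sigma}\right)$$

with λ = φ(µ/σ)/Φ(µ/σ)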
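As a quick sanity check of the censored-normal mean, the sketch below compares the closed-form expression with a Monte Carlo average. The values of µ and σ, the seed and the helper name `censored_normal_mean` are arbitrary assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import norm


def censored_normal_mean(mu, sigma):
    """E[y] for y = max(0, y*), y* ~ N(mu, sigma^2)."""
    z = mu / sigma
    lam = norm.pdf(z) / norm.cdf(z)          # phi(mu/sigma) / Phi(mu/sigma)
    return (mu + sigma * lam) * norm.cdf(z)  # E[y | y > 0] * Pr(y > 0)


rng = np.random.default_rng(1)
mu, sigma = -0.5, 1.5                        # illustrative values
y = np.maximum(rng.normal(mu, sigma, size=1_000_000), 0.0)

analytic = censored_normal_mean(mu, sigma)
simulated = y.mean()
share_zero = (y == 0.0).mean()               # estimate of Pr(y = 0) = Phi(-mu/sigma)

print(analytic, simulated, share_zero, norm.cdf(-mu / sigma))
```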


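The censored regression model
Tobit model (Tobin, 1958)

Let y* be a continuous variable (latent variable): $y_i^* = x_i'\beta + \varepsilon_i$, where $\varepsilon_i \mid x_i \sim N(0, \sigma^2)$

The observed data y are

$$y_i = \max(0, y_i^*) = \begin{cases} 0 & \text{if } y_i^* \le 0 \\ y_i^* & \text{if } y_i^* > 0 \end{cases}$$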

Why not OLS?

Estimates can be obtained by MLE


Estimation

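A positive probability is assigned to the observations $y_i = 0$:

$$\begin{aligned}
\Pr(y_i = 0 \mid x_i) &= \Pr(y_i^* \le 0 \mid x_i) \\
&= \Pr(x_i'\beta + \varepsilon_i \le 0) \\
&= \Pr(\varepsilon_i \le -x_i'\beta) \\
&= 1 - \Pr(\varepsilon_i < x_i'\beta) \\
&= 1 - \Phi\!\left(\frac{x_i'\beta}{\sigma}\right)
\end{aligned}$$

The likelihood can be written as:

$$\begin{aligned}
L(\beta, \sigma^2 \mid y) &= \prod_{y_i = 0}\left(1 - \Phi\!\left(\frac{x_i'\beta}{\sigma}\right)\right) \prod_{y_i > 0} \frac{1}{\sigma}\,\phi\!\left(\frac{y_i - x_i'\beta}{\sigma}\right) \\
&= \prod_{y_i = 0}\left(1 - \Phi\!\left(\frac{x_i'\beta}{\sigma}\right)\right) \prod_{y_i > 0} \frac{1}{\sqrt{2\pi\sigma^2}}\, \exp\!\left[-\frac{1}{2}\left(\frac{y_i - x_i'\beta}{\sigma}\right)^2\right]
\end{aligned}$$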
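The Tobit likelihood above can be coded directly. The following is a minimal sketch under assumptions of my own: data are simulated from Y* = −1.5 + 0.5x + ε/2 with x uniform on [0, 6], the sample size is 2000, and `tobit_negloglik` is a hypothetical helper name. The parameter σ is estimated through log σ.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def tobit_negloglik(params, y, X):
    """Negative log-likelihood of the Tobit model censored at zero."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)
    xb = X @ beta
    ll_limit = norm.logcdf(-xb / sigma)                      # log[1 - Phi(x'b/sigma)]
    ll_nonlimit = norm.logpdf((y - xb) / sigma) - log_sigma  # log[sigma^-1 phi(.)]
    return -np.where(y <= 0.0, ll_limit, ll_nonlimit).sum()


rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0.0, 6.0, n)                                 # assumed regressor
y = np.maximum(-1.5 + 0.5 * x + 0.5 * rng.standard_normal(n), 0.0)
X = np.column_stack([np.ones(n), x])

ols = np.linalg.lstsq(X, y, rcond=None)[0]                   # OLS on censored data
start = np.append(ols, np.log(y.std()))
res = minimize(tobit_negloglik, start, args=(y, X), method="BFGS")
beta_hat, sigma_hat = res.x[:-1], np.exp(res.x[-1])

print("OLS  :", ols)
print("Tobit:", beta_hat, sigma_hat)
```

With these simulated data, OLS on the censored outcome understates the slope, while the Tobit estimates should be close to (−1.5, 0.5) and σ = 0.5.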


Marginal effect in the tobit model

In the case of censored data, the β estimated from the tobit model can be employed to study the effect of x on E[y* | x]

In the case of a corner solution outcome, the estimated β are not sufficient, since E[y | x] and E[y | x, y > 0] depend on β in a non-linear way

$$\frac{\partial E[y_i \mid x_i]}{\partial x_i} = \beta\,\Phi\!\left(\frac{x_i'\beta}{\sigma}\right)$$

$$\frac{\partial E[y_i \mid x_i]}{\partial x_i} = \Pr(y_i > 0)\,\frac{\partial E[y_i \mid x_i, y_i > 0]}{\partial x_i} + E[y_i \mid x_i, y_i > 0]\,\frac{\partial \Pr[y_i > 0]}{\partial x_i}$$

A change in xi has two effects:

(1) It affects the conditional mean of $y_i^*$ in the positive part of the distribution

(2) It affects the probability that the observation will fall in the positive part of the distribution
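The decomposition above can be verified numerically. The sketch below relies on the standard Tobit expressions E[y | x, y > 0] = x′β + σλ(z) and ∂E[y | x, y > 0]/∂x = β[1 − λ(z)(z + λ(z))], with z = x′β/σ and λ(z) = φ(z)/Φ(z); these are not written out on the slide and are taken here as known results. The function name and parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm


def tobit_marginal_effects(xb, beta, sigma):
    """Pieces of the marginal-effect decomposition at index value xb = x'beta."""
    z = xb / sigma
    lam = norm.pdf(z) / norm.cdf(z)
    prob_pos = norm.cdf(z)                        # Pr(y > 0 | x)
    cond_mean = xb + sigma * lam                  # E[y | x, y > 0]
    d_cond_mean = beta * (1.0 - lam * (z + lam))  # effect on the conditional mean
    d_prob = beta * norm.pdf(z) / sigma           # effect on Pr(y > 0 | x)
    total = beta * prob_pos                       # dE[y | x]/dx = beta * Phi(z)
    return prob_pos, cond_mean, d_cond_mean, d_prob, total


beta = np.array([-1.5, 0.5])                      # illustrative coefficients
sigma = 0.5
x_row = np.array([1.0, 4.0])                      # intercept and x = 4

prob_pos, cond_mean, d_cond_mean, d_prob, total = tobit_marginal_effects(
    x_row @ beta, beta, sigma
)
decomposed = prob_pos * d_cond_mean + cond_mean * d_prob
print(total, decomposed)
```

The two printed vectors agree, which illustrates that the scaled coefficient βΦ(x′β/σ) is the sum of the two effects listed above.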


Example: simulated data

Y* = −1.5 + 0.5x + ε/2, N = 100 (figure not reproduced in the transcript)


Some issues in specification

Heteroskedasticity

. MLE is inconsistent

. However, the problem can be approached directly by using σi in the likelihood function instead of σ. Specifying a particular model for σi provides the empirical model for estimation

Misspecification of Pr(y* < 0)

. In the tobit model, a variable that increases the probability of an observation being a non-limit observation also increases the mean of the variable

- Example: loss due to fire in buildings

. A more general model has been devised involving a decision equation and a regression equation for nonlimit observations

Non-normality

. MLE is inconsistent

. Research is ongoing both on alternative estimators and on methods for testing this type of misspecification


Sample selection

What if observation is driven by a different process?

(1) Data observability

. Saving function (in the population):

saving = β0 + β1 income + β2 age + β3 married + β4 kids + u

. Survey data only include families whose household head was 45 years of age or older

(2) Individual behaviour (Boyes, Hoffman, Low, 1989; Greene, 1992)

. y1 = 1 if individual i defaults on a loan/credit card, 0 otherwise

. y2 = 1 if individual i is granted a loan/credit card, 0 otherwise

. For a given individual, y1 is not observed unless y2 equals 1


Sample selection / incidental truncation

Let y and z have a bivariate distribution with correlation ρ

We are interested in the distribution of y given that another variable z exceeds a particular value

. Intuition: if y and z are positively correlated, then the truncation of z should push the distribution of y to the right

The truncated joint distribution is

$$f(y, z \mid z > a) = \frac{f(y, z)}{\Pr(z > a)}$$

To obtain the incidentally truncated marginal density of y, we should integrate z out of this expression


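Moments of the incidentally truncated bivariate normal distribution

Let y and z have a bivariate normal distribution with means µy and µz, standard deviations σy and σz, and correlation ρ

$$E[y \mid z > a] = \mu_y + \rho\,\sigma_y\,\lambda(\alpha_z)$$

$$V[y \mid z > a] = \sigma_y^2\,[1 - \rho^2\,\delta(\alpha_z)]$$

. $\alpha_z = (a - \mu_z)/\sigma_z$

. $\lambda(\alpha_z) = \phi(\alpha_z)/[1 - \Phi(\alpha_z)]$

. $\delta(\alpha_z) = \lambda(\alpha_z)[\lambda(\alpha_z) - \alpha_z]$

If the truncation is z < a, then $\lambda(\alpha_z) = -\phi(\alpha_z)/\Phi(\alpha_z)$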
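A simulation makes the incidental-truncation formulas concrete. The sketch below draws from a bivariate normal with parameter values that are arbitrary assumptions, keeps the draws with z > a, and compares the sample mean and variance of y with the expressions above.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Illustrative parameter values (assumptions)
mu_y, mu_z, sd_y, sd_z, rho, a = 1.0, 0.0, 2.0, 1.0, 0.6, 0.5

cov = [[sd_y**2, rho * sd_y * sd_z],
       [rho * sd_y * sd_z, sd_z**2]]
draws = rng.multivariate_normal([mu_y, mu_z], cov, size=1_000_000)
y_sel = draws[draws[:, 1] > a, 0]          # y is kept only when z > a

alpha_z = (a - mu_z) / sd_z
lam = norm.pdf(alpha_z) / norm.sf(alpha_z)
delta = lam * (lam - alpha_z)

mean_formula = mu_y + rho * sd_y * lam
var_formula = sd_y**2 * (1.0 - rho**2 * delta)

print(y_sel.mean(), mean_formula)
print(y_sel.var(), var_formula)
```

With positive ρ the selected mean of y lies above µy, matching the intuition that truncating z from below pushes the distribution of y to the right.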


Example: A model of labor supply

Consider a population of women where only a subsample is engaged in market employment

We are interested in identifying the determinants of the labor supply for all women

A simple model of female labor supply consists of 2 equations

(1) Wage equation: the difference between a person’s market wage and her reservation wage, as a function of characteristics such as age, education, number of children, ... plus unobservables

(2) Hours equation: the desired number of labor hours supplied depends on the wage, home characteristics (e.g. presence of small children), marital status, ... plus unobservables

Truncation: equation (2) describes the desired hours, but an actual figure is observed only if the individual is working, i.e. when the market wage exceeds the reservation wage

The hours variable is incidentally truncated


Example: A model of labor supply
When is OLS on the working sample appropriate?

Assume working women are chosen randomly

If the working subsample has similar endowments of characteristics (both observed and unobserved) as the nonworking sample, OLS is an option

BUT the decision to work is not random: the working and nonworking samples potentially have different characteristics

. When the relationship is purely through observables, appropriate conditioning variables can be included in the relevant equation

. If unobservable characteristics affecting the work decision are correlated with the unobservable characteristics affecting wage, then the resulting relationship cannot be tackled by including appropriate controls

. A bias is induced due to “sample selection”


Regression in a model of selection (1)

Equation that determines sample selection

$$z_i^* = w_i'\gamma + u_i$$

The equation of primary interest is

$$y_i = x_i'\beta + \varepsilon_i$$

where $y_i$ is observed only when $z_i^*$ is greater than zero (otherwise data are not available)

. This model is closely related to the Tobit model, although it is less restrictive: the parameters explaining the censoring are not constrained to equal those explaining the variation in the observed dependent variable. For this reason the model is also known as Tobit type two.


Regression in a model of selection (2)

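If $u_i$ and $\varepsilon_i$ have a bivariate normal distribution with zero means and correlation ρ,

$$\begin{aligned}
E[y_i \mid y_i \text{ is observed}] &= E[y_i \mid z_i^* > 0] \\
&= E[y_i \mid u_i > -w_i'\gamma] \\
&= x_i'\beta + E[\varepsilon_i \mid u_i > -w_i'\gamma] \\
&= x_i'\beta + \rho\,\sigma_\varepsilon\,\lambda_i(\alpha_u)
\end{aligned}$$

where $\alpha_u = -w_i'\gamma/\sigma_u$ and $\lambda_i(\alpha_u) = \phi(\alpha_u)/[1 - \Phi(\alpha_u)] = \phi(w_i'\gamma/\sigma_u)/\Phi(w_i'\gamma/\sigma_u)$

So, the regression model can be written as

$$\begin{aligned}
y_i \mid z_i^* > 0 &= E[y_i \mid z_i^* > 0] + \upsilon_i \\
&= x_i'\beta + \rho\,\sigma_\varepsilon\,\lambda_i(\alpha_u) + \upsilon_i
\end{aligned}$$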


Regression in a model of selection (3)

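$$E[y_i \mid z_i^* > 0] = x_i'\beta + \rho\,\sigma_\varepsilon\,\lambda_i(\alpha_u)$$

OLS regression using the observed data will lead to inconsistent estimates (omitted variable bias)

The marginal effect of the regressors on $y_i$ in the observed sample consists of two components:

. Direct effect on the mean of $y_i$ (β)

. In addition, if the variable appears in the probability that $z_i^*$ is positive, then it will influence $y_i$ through its presence in $\lambda_i$

$$\frac{\partial E[y_i \mid z_i^* > 0]}{\partial x_{ik}} = \beta_k - \gamma_k\left(\frac{\rho\,\sigma_\varepsilon}{\sigma_u}\right)\delta_i(\alpha_u)$$

where the minus sign follows from $\partial\alpha_u/\partial x_{ik} = -\gamma_k/\sigma_u$ and $d\lambda/d\alpha = \delta$, with $\alpha_u = -w_i'\gamma/\sigma_u$

Most often $z_i^*$ is not observed; rather, we can infer its sign but not its magnitude

. Since there is no information on the scale of z*, the disturbance variance in the selection equation cannot be estimated (we let $\sigma_u^2 = 1$)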


Regression in a model of selection (4)

Selection mechanisms

$$z_i^* = w_i'\gamma + u_i,$$

where we observe $z_i = 1$ if $z_i^* > 0$ and 0 otherwise.

. $\Pr(z_i = 1 \mid w_i) = \Phi(w_i'\gamma)$

. $\Pr(z_i = 0 \mid w_i) = 1 - \Phi(w_i'\gamma)$

Regression model

$$y_i = x_i'\beta + \varepsilon_i,$$

where $y_i$ is observed only when $z_i$ is equal to one (otherwise data are not available)

. $(u_i, \varepsilon_i) \sim$ bivariate normal$[0, 0, 1, \sigma_\varepsilon, \rho]$


Estimation

Least squares using the observed data produces inconsistent estimates of β (omitted variable)

Least squares regression of y on x and λ would be a consistent estimator

. However, even if λi were observed, OLS would be inefficient: the υi are heteroskedastic

Maximum likelihood estimation can be applied

Heckman (1979) proposed a two-step procedure


Maximum likelihood estimation

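The log likelihood for observation i, $\log L_i = l_i$, can be written as:

. If $y_i$ is not observed

$$l_i = \log \Phi(-w_i'\gamma)$$

. If $y_i$ is observed

$$l_i = \log \Phi\!\left(\frac{w_i'\gamma + (y_i - x_i'\beta)\,\rho/\sigma_\varepsilon}{\sqrt{1 - \rho^2}}\right) - \frac{1}{2}\left(\frac{y_i - x_i'\beta}{\sigma_\varepsilon}\right)^2 - \log\!\left(\sqrt{2\pi}\,\sigma_\varepsilon\right)$$

σε and ρ are not directly estimated, because they are constrained (σε must be greater than 0 and ρ must lie between −1 and 1)

Directly estimated are log σε and atanh ρ:

$$\operatorname{atanh}\rho = \frac{1}{2}\log\!\left(\frac{1 + \rho}{1 - \rho}\right)$$

Estimation would be simplified if ρ = 0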
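The per-observation log likelihood and the reparameterization can be sketched in code. This is an illustration under my own naming: `selection_loglik_i`, `xb` (for x′β) and `wg` (for w′γ) are hypothetical names, and σu is normalized to 1. Working with log σε and atanh ρ lets an unconstrained optimizer respect σε > 0 and −1 < ρ < 1.

```python
import numpy as np
from scipy.stats import norm


def selection_loglik_i(y, xb, wg, observed, log_sigma, atanh_rho):
    """Log-likelihood contribution of one observation in the selection model.

    xb = x_i'beta, wg = w_i'gamma; sigma_u is normalized to 1.
    """
    sigma = np.exp(log_sigma)        # guarantees sigma_eps > 0
    rho = np.tanh(atanh_rho)         # guarantees -1 < rho < 1
    if not observed:
        return norm.logcdf(-wg)
    resid = (y - xb) / sigma
    arg = (wg + rho * resid) / np.sqrt(1.0 - rho**2)
    return norm.logcdf(arg) - 0.5 * resid**2 - np.log(np.sqrt(2.0 * np.pi) * sigma)


# With rho = 0 the contribution splits into a probit part and a normal density part
y, xb, wg, sigma = 1.3, 0.8, 0.4, 1.7          # illustrative values
l_rho0 = selection_loglik_i(y, xb, wg, True, np.log(sigma), 0.0)
l_split = norm.logcdf(wg) + norm.logpdf(y, loc=xb, scale=sigma)
print(l_rho0, l_split)
```

The equality of the two printed values illustrates why estimation is simpler when ρ = 0: the selection equation and the regression can then be estimated separately.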


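Two-step procedure
Heckman (1979)

$$\begin{aligned}
y_i \mid z_i^* > 0 &= E[y_i \mid z_i^* > 0] + \upsilon_i \\
&= x_i'\beta + \rho\,\sigma_\varepsilon\,\lambda_i(\alpha_u) + \upsilon_i
\end{aligned}$$

1 Estimate the probit equation by MLE to obtain estimates of γ. For each observation in the selected sample, compute $\hat\lambda_i$ (inverse Mills ratio)

2 Estimate β and $\beta_\lambda = \rho\sigma_\varepsilon$ by least squares regression of y on x and $\hat\lambda$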
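The two steps can be sketched on simulated data. Everything in the data-generating process below is an assumption made for illustration: the coefficient values, the extra selection variable `v` that is excluded from the outcome equation, and the sample size. The probit is fitted by direct likelihood maximization with SciPy rather than with a dedicated econometrics package, and no standard-error correction is attempted.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def probit_negloglik(gamma, z, W):
    """Negative log-likelihood of the probit selection equation."""
    idx = W @ gamma
    return -(z * norm.logcdf(idx) + (1.0 - z) * norm.logcdf(-idx)).sum()


rng = np.random.default_rng(3)
n = 5000
x = rng.standard_normal(n)
v = rng.standard_normal(n)               # hypothetical variable excluded from x
rho, sigma_eps = 0.7, 1.0
u = rng.standard_normal(n)               # selection error, variance 1
eps = sigma_eps * (rho * u + np.sqrt(1.0 - rho**2) * rng.standard_normal(n))

W = np.column_stack([np.ones(n), x, v])
z = (W @ np.array([0.5, 0.5, 1.0]) + u > 0).astype(float)
y = 1.0 + 2.0 * x + eps                  # treated as observed only where z == 1
sel = z == 1

# Step 1: probit by MLE, then the inverse Mills ratio on the selected sample
gamma_hat = minimize(probit_negloglik, np.zeros(3), args=(z, W), method="BFGS").x
idx = W[sel] @ gamma_hat
lam_hat = norm.pdf(idx) / norm.cdf(idx)

# Step 2: least squares of y on x and lambda-hat (selected sample only)
X_aug = np.column_stack([np.ones(sel.sum()), x[sel], lam_hat])
coef = np.linalg.lstsq(X_aug, y[sel], rcond=None)[0]
beta_hat, b_lambda = coef[:2], coef[2]

print("gamma_hat:", gamma_hat)
print("beta_hat :", beta_hat, " b_lambda:", b_lambda)
```

In this design the coefficient on the inverse Mills ratio estimates ρσε = 0.7, and the slope on x should be close to its true value of 2.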


Estimators of the variance and standard errors

Second step standard errors need to be adjusted to account for the first step estimation

The estimation of σε needs to be adjusted:

. At each observation, the true conditional variance of the disturbance would be $\sigma_i^2 = \sigma_\varepsilon^2\,(1 - \rho^2\delta_i)$

. A consistent estimator of $\sigma_\varepsilon^2$ is given by:

$$\hat\sigma_\varepsilon^2 = \frac{1}{n}\,e'e + \hat{\bar\delta}\,b_\lambda^2$$

To test hypotheses, an estimate of the asymptotic covariance matrix of the coefficients (including βλ) is needed

. Two problems arise: (1) the disturbance term υi is heteroskedastic; (2) there are unknown parameters in λi

. Formulas are rather cumbersome, but can be calculated using the matrix of independent variables, the sample estimates of $\sigma_\varepsilon^2$ and ρ, and the assumed known values of λi and δi


Two-step procedure
Discussion

Identification: exclusion restriction

. Although the inverse Mills ratio is nonlinear in the single index $w_i'\gamma$, the function mapping this index into the inverse Mills ratio is approximately linear for certain ranges of the index

. Accordingly, the inclusion of additional variables in wi in the first step can be important for identification of the second step estimates

. In the real world, there are few candidates for simultaneous inclusion in wi and exclusion from xi

Inclusion of the inverse Mills ratio into the equation of interest is driven by the normality assumption

. Recent research includes specific attempts to move away from the normality assumption:

$$y_i \mid z_i^* > 0 = x_i'\beta + \mu(w_i'\gamma) + \upsilon_i$$

where $\mu(w_i'\gamma)$ is called “selectivity correction”


Selection in qualitative response models

The problem of sample selection has been modeled in other settings besides the linear regression model

Binary choice models have been considered, as well as count data models

For example in the case of the Poisson model:

. $y_i \mid \varepsilon_i \sim \text{Poisson}(\lambda_i)$

. $\log \lambda_i = x_i'\beta + \varepsilon_i$

. $(y_i, x_i)$ are only observed when $z_i = 1$, where $z_i^* = w_i'\gamma + u_i$ and $z_i = 1$ if $z_i^* > 0$, 0 otherwise

. Assume that $(\varepsilon_i, u_i)$ have a bivariate normal distribution with non-zero correlation

. Selection affects the mean (and the variance) of $y_i$ and, in the observed data, $y_i$ no longer has a Poisson distribution
