qualitative and limited dependent variable models


Post on 19-Dec-2015


Page 1: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

Page 2: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

• Models that describe choice behavior

• Models for dependent variables that are limited, that is, whose range of values is constrained, or whose values are not completely observable

Page 3: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

e.g.

• a worker decides to drive to work or not

• a high school graduate decides to go to college or not

• a household decides to purchase a house or rent

• why are some loan applications accepted and others not?

Page 4: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

Qualitative Choice Models:

Binary choice models

These arise when an individual chooses between two alternatives.

Models with Binary Dependent Variables

• The dependent variable assumes only 2 values: 1 if the outcome is chosen and 0 if not

Page 5: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

• For these models, least squares estimation is not the best choice.

• If the true choice probability is a nonlinear function of the regressors (as in probit and logit), OLS applied to the 0/1 outcome is biased and inconsistent for that probability

• The OLS error term is also heteroscedastic

• Instead, maximum likelihood estimation (MLE) is the usual method

Page 6: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

• MLE of probit (or logit) discrete choice model

• Slopes can only be estimated up to a scale factor

• But coefficient signs and t-values have the usual interpretation

Page 7: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

An example

• Y = Affairs = 1 if individual has had an affair

• X1 = Dummy for male

• X2 = Years of marriage

• X3 = Dummy for kids in the marriage

• X4 = Dummy for religion

• X5 = Years of education

• X6 = Dummy for Happy

Page 8: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

The logit (MLE) estimates give

Estimated logit index (log-odds of Y = 1) = 1.29 + 0.25X1 + 0.05X2 + 0.44X3 - 0.89X4 + 0.01X5 - 0.87X6

X2, X4 and X6 have p-values below 0.05.

Logit coefficients do not directly measure marginal effects on the probability, so it is hard to interpret their size.

However, we can interpret the signs.

Page 9: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

Odds ratio

• An odds ratio measures the effect of a regressor on the odds, P(Y=1)/P(Y=0).

• Suppose the odds ratio for Happy is 0.42 (the exponential of its logit coefficient, exp(-0.87) ≈ 0.42). How do we interpret this number?

• Happy is a dummy variable. If a person switches from an unhappy relationship to a happy one, the odds of an affair fall to 42% of what they were before.
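The 0.42 figure can be reproduced by exponentiating the logit coefficient on Happy. The sketch below does this for all six slopes reported on the earlier slide; the dictionary keys are illustrative labels chosen here, not variable names from the original data set.

```python
import math

# Logit slope coefficients as reported on the earlier slide (two decimals).
# The keys are illustrative labels, not names from the original data set.
logit_coefs = {
    "male": 0.25,
    "years_married": 0.05,
    "kids": 0.44,
    "religion": -0.89,
    "education": 0.01,
    "happy": -0.87,
}

# exp(coefficient) is the factor by which the odds P(Y=1)/P(Y=0) are
# multiplied when the regressor rises by one unit, other things equal.
odds_ratios = {name: math.exp(b) for name, b in logit_coefs.items()}

for name, ratio in odds_ratios.items():
    print(f"{name:>14}: odds ratio = {ratio:.2f}")
```

A ratio below 1 (negative coefficient) lowers the odds, and a ratio above 1 (positive coefficient) raises them.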

Page 10: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

Suppose an individual had odds of 4 (that is, 4 to 1) of having an affair.

i.e. P(Y=1) = 4/5 and P(Y=0) = 1/5, so there is an 80% chance that the individual will have an affair.

If the individual's marriage becomes Happy, the odds fall to 42% of what they were before.

4*0.42 = 1.68, so P(Y=1) = 1.68/2.68 ≈ 0.63. There is a 63% chance the individual will have an affair.
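The arithmetic on this slide uses the identity probability = odds / (1 + odds). A minimal sketch of the same calculation, with function names chosen here for illustration:

```python
def odds_to_prob(odds: float) -> float:
    """Convert odds P/(1-P) into the probability P."""
    return odds / (1.0 + odds)


def prob_to_odds(prob: float) -> float:
    """Convert a probability P into odds P/(1-P)."""
    return prob / (1.0 - prob)


baseline_odds = 4.0       # odds of 4 to 1, as on the slide
odds_ratio_happy = 0.42   # odds ratio for Happy, as on the slide

# A one-unit change in the dummy multiplies the odds by the odds ratio.
new_odds = baseline_odds * odds_ratio_happy

print(f"baseline probability: {odds_to_prob(baseline_odds):.2f}")
print(f"new odds:             {new_odds:.2f}")
print(f"new probability:      {odds_to_prob(new_odds):.2f}")
```

The odds fall by 58%, but the probability falls by only 17 percentage points (from 0.80 to about 0.63), which is why odds ratios should not be read as changes in probability.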

Page 11: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

The linear probability model

Suppose we wish to explain an individual’s choice between driving to work (private transportation) and taking the bus (public transportation)

Individual’s choice can be represented by a dummy variable

Page 12: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

Y = 1 if the individual drives to work

Y = 0 if the individual takes the bus

If we collect a random sample of workers, the outcome Y has the probability function

$f(y) = p^{y}(1-p)^{1-y}, \quad y = 0, 1$

where p = probability that y = 1.

E(y) = p
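The result E(y) = p follows directly from the Bernoulli probability function $f(y) = p^{y}(1-p)^{1-y}$. The short derivation below also gives the variance, a standard result that matters once the model is written as a regression:

```latex
\begin{aligned}
f(y) &= p^{y}(1-p)^{1-y}, \qquad y \in \{0, 1\} \\
E(y) &= \sum_{y \in \{0,1\}} y\, f(y) = 0 \cdot (1-p) + 1 \cdot p = p \\
\operatorname{var}(y) &= E(y^{2}) - [E(y)]^{2} = p - p^{2} = p(1-p)
\end{aligned}
```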

Page 13: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

Explanatory variable: x = difference between travel time by bus and travel time by car (bus time minus car time)

Expectation:

As x increases, an individual would be more inclined to drive to work

We expect a positive relationship between x and p (the probability to drive to work)

Page 14: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

The linear regression model applied to a binary outcome is the linear probability model:

y = E(y) + e

y = p + e

y = β0 + β1x + e

This model is heteroscedastic: the variance of the error term is var(e) = p(1 - p) = (β0 + β1x)(1 - β0 - β1x), which depends on x and so varies from one observation to another.
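To see the heteroscedasticity concretely, recall that a 0/1 outcome with success probability p has variance p(1 - p). The sketch below evaluates that variance at several values of x; the intercept and slope are hypothetical numbers picked for illustration, not estimates from any data.

```python
def lpm_error_variance(x: float, beta0: float, beta1: float) -> float:
    """Error variance in the linear probability model at a given x.

    With p = beta0 + beta1 * x, the error e = y - p has variance p * (1 - p).
    """
    p = beta0 + beta1 * x
    return p * (1.0 - p)


# Hypothetical parameter values, chosen only for illustration.
beta0, beta1 = 0.5, 0.05

for x in (-8, -4, 0, 4, 8):
    p = beta0 + beta1 * x
    var_e = lpm_error_variance(x, beta0, beta1)
    print(f"x = {x:>3}: p = {p:.2f}, var(e) = {var_e:.4f}")
```

The variance peaks where p = 0.5 and shrinks as p approaches 0 or 1, so it cannot be constant across observations unless the slope is zero.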

Page 15: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

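• If we use OLS to estimate the model, we obtain the fitted equation

$\hat{p} = \hat{\beta}_0 + \hat{\beta}_1 x$

where $\hat{p}$ is the estimate of E(y) and the hats denote the OLS estimates of the intercept and slope.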

Page 16: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

Consider what happens when you use this model to predict behavior:

If you substitute different values of x into the equation, you might obtain values of $\hat{p}$ that are

1. Less than 0 (negative)

2. Greater than 1

These values do not make sense as probabilities.
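The problem is easy to reproduce. The sketch below fits the linear probability model by OLS to a small made-up sample (x is the bus-minus-car time difference in minutes, y = 1 if the person drives); the data are hypothetical and serve only to show fitted values escaping the unit interval.

```python
def ols_simple(x, y):
    """OLS intercept and slope for a regression of y on a single x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical sample: time difference in minutes and the 0/1 choice.
x = [-30, -20, -10, -5, 0, 5, 10, 20, 30, 40]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = ols_simple(x, y)
fitted = [b0 + b1 * xi for xi in x]

for xi, p_hat in zip(x, fitted):
    flag = "" if 0.0 <= p_hat <= 1.0 else "  <- not a valid probability"
    print(f"x = {xi:>4}: p_hat = {p_hat:6.3f}{flag}")
```

In this sample the fitted value is negative at x = -30 and exceeds 1 at x = 40, even though both points lie inside the observed range of x.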

Page 17: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

• In a linear regression model, the slope coefficient implies that a one-unit increase in x has the same effect on y at every level of x

• But in probability models (binary dependent variable models), a constant rate of increase is impossible because the probability must satisfy

0 ≤ p ≤ 1

Page 18: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

To overcome this problem we use nonlinear probit and logit models.
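Both models pass a linear index z = β0 + β1x through a cumulative distribution function, so the resulting probability stays between 0 and 1 for any x. A minimal sketch of the two link functions, using only the standard library (the function names are chosen here for illustration):

```python
import math


def logit_prob(z: float) -> float:
    """Logit model: logistic CDF evaluated at the index z."""
    return 1.0 / (1.0 + math.exp(-z))


def probit_prob(z: float) -> float:
    """Probit model: standard normal CDF evaluated at the index z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for z in (-5, -2, -1, 0, 1, 2, 5):
    print(f"z = {z:>2}: logit = {logit_prob(z):.4f}, probit = {probit_prob(z):.4f}")
```

Both curves are S-shaped: the probability responds most strongly to x near p = 0.5 and flattens out as it approaches 0 or 1.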

In EViews, choose Objects, New Object, Equation and select the Binary Estimation option.

Specify your equation in the equation box.

EViews estimates the model by maximum likelihood.
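For readers without EViews, the idea behind maximum likelihood estimation of a logit model can be sketched in a few lines. The code below maximizes the log-likelihood with Newton's method for one regressor; it is an illustration under stated assumptions (made-up data, a plain Newton iteration), not a description of the algorithm EViews itself uses.

```python
import math


def logistic(z: float) -> float:
    """Logistic CDF: maps the index z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(x, y, b0, b1):
    """Logit log-likelihood for P(y = 1) = logistic(b0 + b1 * x)."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = logistic(b0 + b1 * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logit(x, y, max_iter=50, tol=1e-10):
    """Maximize the logit log-likelihood by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        p = [logistic(b0 + b1 * xi) for xi in x]
        # Score: gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), symmetric 2x2.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: information matrix inverse times the score.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


# Hypothetical sample: bus-minus-car time in minutes, 1 = drives to work.
x = [-30, -20, -10, -5, 0, 5, 10, 20, 30, 40]
y = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

b0_hat, b1_hat = fit_logit(x, y)
print(f"intercept = {b0_hat:.4f}, slope = {b1_hat:.4f}")
print(f"log-likelihood = {log_likelihood(x, y, b0_hat, b1_hat):.4f}")
```

The sample deliberately contains some overlap between drivers and bus riders. If the two groups were perfectly separated by x, the likelihood would have no finite maximum and the iteration would not settle.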

Page 19: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

• The estimates produced by a probit or logit model that look like slopes and intercepts in the output are actually standardized slopes and intercepts, known only up to a scale factor

• So the size of these coefficients is not interpreted in the usual way (as the change in Y for a one-unit change in X). Instead, we focus on the sign and the statistical significance of these coefficients, in particular the "slopes."

Page 20: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

• If a "slope" is positive, it means that an increase in the corresponding X variable increases the latent propensity to choose the 1 alternative, thereby also increasing the predicted probability of choosing the 1 alternative. (Analogously for negative slope coefficients.)
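For the logit model, the link between the sign of a "slope" and the probability can be made explicit: the marginal effect of x on P(Y = 1) is β1 · p · (1 - p), which always has the sign of β1 but changes size with x. The sketch below evaluates it for hypothetical coefficient values chosen only for illustration.

```python
import math


def logistic(z: float) -> float:
    """Logistic CDF: maps the index z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_marginal_effect(x: float, b0: float, b1: float) -> float:
    """Derivative of P(Y = 1) with respect to x in a logit model."""
    p = logistic(b0 + b1 * x)
    return b1 * p * (1.0 - p)


# Hypothetical logit coefficients, for illustration only.
b0, b1 = -0.3, 0.08

for x in (-30, -10, 3.75, 20, 40):
    p = logistic(b0 + b1 * x)
    effect = logit_marginal_effect(x, b0, b1)
    print(f"x = {x:>6}: p = {p:.3f}, marginal effect = {effect:.4f}")
```

The effect is largest where the fitted probability is 0.5 (here at x = 3.75) and shrinks toward zero in both tails, so no single number summarizes the change in probability for a one-unit change in x.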

Page 21: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

• If the fitted probability is greater than 0.5, we predict the individual will choose 1. If it is less than 0.5, we predict they will choose 0.

• The asymptotic t-ratios allow us to test the null hypothesis that a "slope" is zero. If a t-ratio exceeds (roughly) 2 in absolute value, we can reject the hypothesis that the relevant X variable has no effect on fitted choice probabilities.
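The two rules on this slide can be written as small helper functions. This is a sketch with hypothetical inputs: the fitted probabilities, the coefficient and its standard error are made-up numbers, and assigning a fitted probability of exactly 0.5 to choice 0 is an arbitrary convention adopted here because the slide does not cover that case.

```python
def predict_choice(fitted_prob: float, threshold: float = 0.5) -> int:
    """Predict 1 if the fitted probability exceeds the threshold, else 0.

    A fitted probability exactly equal to the threshold is assigned to 0;
    that tie-breaking choice is an assumption of this sketch.
    """
    return 1 if fitted_prob > threshold else 0


def t_ratio(coef: float, std_err: float) -> float:
    """Asymptotic t-ratio for the hypothesis that the coefficient is zero."""
    return coef / std_err


def rejects_zero_slope(coef: float, std_err: float, critical: float = 2.0) -> bool:
    """Apply the rule of thumb |t| > 2."""
    return abs(t_ratio(coef, std_err)) > critical


# Hypothetical fitted probabilities for three individuals.
for p_hat in (0.82, 0.47, 0.10):
    print(f"p_hat = {p_hat:.2f} -> predicted choice {predict_choice(p_hat)}")

# Hypothetical slope estimate and standard error.
coef, std_err = 0.08, 0.03
print(f"t-ratio = {t_ratio(coef, std_err):.2f}, "
      f"reject zero slope: {rejects_zero_slope(coef, std_err)}")
```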

Page 22: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

Missing Observations

• Collect data on Sleep and Age

• Sleep is observed for every individual, but 20% of the Age values are missing

• How do you use all the data to show the effect of Age on Sleep?

Page 23: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

Non-Experimental Data

• Non-experimental data can sometimes make it very difficult to draw policy implications from regression analysis

Page 24: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

GUN CONTROL

• Suppose your sample consists of households that have been victimized by robbery. The dependent variable takes a value of 1 if a household member is shot during the robbery and 0 otherwise. One of your explanatory variables is a dummy variable equal to 1 if there is a handgun present in the house, 0 otherwise. You find that when a handgun is present in a household, an occupant of that house is much more likely to be shot in the process of a robbery than when no handgun is present. Therefore, to minimize injury and loss of life from robbery incidents, private ownership of handguns should be banned. Evaluate this policy proposal and the "evidence" upon which it is premised.

Page 25: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

• Briefly describe the nature of the true "experiment" that would allow an unambiguous determination of the effect of handgun presence on robbery shootings via a regression like this.

Page 26: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

LEGALIZATION OF MARIJUANA:

• Suppose you have a random sample of at-risk 18-year-olds. The dependent variable is the number of times each teenager has used heroin. Among the explanatory variables is a dummy variable that takes a value of 1 if the subject experimented with marijuana prior to age 13, and 0 otherwise. You find that the coefficient on this dummy variable is positive and strongly statistically significant. Therefore, we should not legalize marijuana use (which would make it much more accessible to pre-teens) since this will lead to widespread use of heroin. Evaluate this policy proposal and the "evidence" upon which it is premised.

Page 27: QUALITATIVE AND LIMITED DEPENDENT VARIABLE MODELS

• Briefly describe the nature of the true "experiment" that would allow an unambiguous determination of the effect of pre-teen marijuana use on subsequent heroin use via a regression like this.