
ECON 351 - Interactions and Dummies

Maggie Jones

1 / 25

Readings

- Chapter 6: Section on Models with Interaction Terms

- Chapter 7: Full Chapter

2 / 25

Interaction Terms with Continuous Variables

- In some regressions we might expect the partial effect of one variable to depend on the magnitude of another variable

- An example of this is the effect of adding another bedroom to a house on its price, which might vary with the original size of the house: larger houses should see a larger price increase from adding a new bedroom

3 / 25

Interaction Terms with Continuous Variables

- We can formalize this as (for example)

y = β0 + β1x1 + β2x2 + β3x1 ∗ x2 + u

- Where now, the partial effect of x2 on y is:

dy/dx2 = β2 + β3x1

- So that the effect of a one-unit increase in x2 on y depends on the value of x1 (see the sketch below)
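A minimal Python sketch of this setup (not from the original slides; the house-price variables, the simulated numbers, and the statsmodels calls are illustrative assumptions):

```python
# Estimating an interaction model price = b0 + b1*sqrft + b2*bdrms + b3*sqrft*bdrms + u
# on simulated data, then computing the partial effect of an extra bedroom.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
sqrft = rng.uniform(800, 3500, n)          # x1: house size in square feet
bdrms = rng.integers(1, 6, n)              # x2: number of bedrooms
price = 50 + 0.05 * sqrft + 5 * bdrms + 0.01 * sqrft * bdrms + rng.normal(0, 20, n)
df = pd.DataFrame({"price": price, "sqrft": sqrft, "bdrms": bdrms})

# "sqrft * bdrms" in a formula expands to sqrft + bdrms + sqrft:bdrms
res = smf.ols("price ~ sqrft * bdrms", data=df).fit()

# Partial effect of an extra bedroom: d price / d bdrms = b2 + b3 * sqrft,
# so it depends on the size of the house.
b2 = res.params["bdrms"]
b3 = res.params["sqrft:bdrms"]
for size in (1000, 3000):
    print(f"effect of one more bedroom at {size} sqft: {b2 + b3 * size:.2f}")
```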

4 / 25

Dummy Variables

- Sometimes we wish to incorporate qualitative information into our regressions

- Examples of qualitative information include the colour of a car, the city in which someone lives, whether a person identifies as male or female, etc.

- A dummy variable (also called a binary variable) is one that contains information on two outcomes (male or female, owns a computer or does not, is white or non-white, etc.)

- These variables are represented in regression analysis as a 0 or a 1 (see the coding sketch below)
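A small sketch of how such 0/1 coding might be done in pandas (the column names and data are made up for illustration):

```python
# Coding qualitative information as 0/1 dummy variables.
import pandas as pd

df = pd.DataFrame({"gender": ["female", "male", "female"],
                   "owns_computer": [True, False, True]})

# female = 1 if the person identifies as female, 0 otherwise
df["female"] = (df["gender"] == "female").astype(int)

# booleans convert directly to 0/1
df["computer"] = df["owns_computer"].astype(int)
print(df)
```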

5 / 25

Example: Female and Married are Dummy Variables

… we might define female to be a binary variable taking on the value one for females and the value zero for males. The name in this case indicates the event with the value one. The same information is captured by defining male to be one if the person is male and zero if the person is female. Either of these is better than using gender because this name does not make it clear when the dummy variable is one: does gender = 1 correspond to male or female? What we call our variables is unimportant for getting regression results, but it always helps to choose names that clarify equations and expositions.

Suppose in the wage example that we have chosen the name female to indicate gender. Further, we define a binary variable married to equal one if a person is married and zero otherwise. Table 7.1 gives a partial listing of a wage data set that might result. We see that Person 1 is female and not married, Person 2 is female and married, Person 3 is male and not married, and so on.

Why do we use the values zero and one to describe qualitative information? In a sense, these values are arbitrary: any two different values would do. The real benefit of capturing qualitative information using zero-one variables is that it leads to regression models where the parameters have very natural interpretations, as we will see now.


Question 7.1

Suppose that, in a study comparing election outcomes between Democratic and Republican candidates, you wish to indicate the party of each candidate. Is a name such as party a wise choice for a binary variable in this case? What would be a better name?

Table 7.1

A Partial Listing of the Data in WAGE1.RAW

person   wage    educ   exper   female   married
1        3.10    11     2       1        0
2        3.24    12     22      1        1
3        3.00    11     2       0        0
4        6.00    8      44      0        1
5        5.30    12     7       0        1
...      ...     ...    ...     ...      ...
525      11.56   16     5       0        1
526      3.50    14     5       1        0

6 / 25

Dummy Variables

- When we include dummy variables in the regression model, we interpret the effect of the dummy variables as a change in intercept. E.g.,

y = β0 + δ0D + β1x + u

- where now D = 1 if the individual belongs to a specific group and 0 otherwise, and x is a continuous variable as before

- We can evaluate the change in intercept by taking the expected value of y conditional on x and D:

E[y|D = 0, x] = β0 + β1x

E[y|D = 1, x] = β0 + δ0 + β1x

- The difference E[y|D = 1, x] − E[y|D = 0, x] = δ0 is the change in intercept from belonging to the group identified by D (a fitted sketch follows below)
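A minimal simulated sketch of this intercept shift (the data, numbers, and variable names are illustrative; assumes statsmodels is available):

```python
# A 0/1 dummy D shifts the intercept of the regression of y on x by delta0.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 400
D = rng.integers(0, 2, n)                               # group membership: 0 or 1
x = rng.normal(10, 2, n)
y = 2.0 - 1.5 * D + 0.8 * x + rng.normal(0, 1, n)       # true delta0 = -1.5

res = smf.ols("y ~ D + x", data=pd.DataFrame({"y": y, "D": D, "x": x})).fit()

b0, d0 = res.params["Intercept"], res.params["D"]
print("intercept for D = 0:", b0)                       # estimate of beta0
print("intercept for D = 1:", b0 + d0)                  # estimate of beta0 + delta0
print("change in intercept (delta0):", d0)
```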

7 / 25

Dummy Variables

- Example: the effect on wages of being female, controlling for education:

wage = β0 + δ0 female + β1 education + u

- where female = 1 if the individual identifies as female and 0 otherwise, and education is a continuous variable as before

- We can show the change in intercept graphically for a hypothetical wage regression:

8 / 25

…males, and δ0 is the difference in intercepts between females and males. We could choose females as the base group by writing the model as

wage = α0 + γ0 male + β1 educ + u,

where the intercept for females is α0 and the intercept for males is α0 + γ0; this implies that α0 = β0 + δ0 and α0 + γ0 = β0. In any application, it does not matter how we choose the base group, but it is important to keep track of which group is the base group.

Some researchers prefer to drop the overall intercept in the model and to include dummy variables for each group. The equation would then be wage = β0 male + α0 female + β1 educ + u, where the intercept for men is β0 and the intercept for women is α0. There is no dummy variable trap in this case because we do not have an overall intercept. However, this formulation has little to offer, since testing for a difference in the intercepts is more difficult, and there is no generally agreed upon way to compute R-squared in regressions without an intercept. Therefore, we will always include an overall intercept for the base group.


Figure 7.1
Graph of wage = β0 + δ0 female + β1 educ for δ0 < 0.
[wage plotted against educ: two parallel lines with slope β1; men: wage = β0 + β1 educ, with intercept β0; women: wage = (β0 + δ0) + β1 educ, with intercept β0 + δ0.]

9 / 25

Dummy Variables

- The graph on the previous slide shows an example of the wage regression when there is a positive intercept term, β0, and when women have a lower average wage than men, conditional on their level of education (δ0 < 0)

- Note that by including a dummy variable for "female" in the regression model, we have implicitly selected a base group, i.e. a group whose average wage (given that education is 0) is equal to the intercept term

- All other dummy variables in the regression are measured with respect to the base group

- e.g., δ0 measures the change in intercept from being female, relative to male (or non-female)

10 / 25

Dummy Variables for Multiple Categories

- It is often the case that we have a qualitative variable that has more than two groups

- e.g., race: black, white, hispanic, indigenous, asian, etc.

- e.g., highest level of schooling: no school, college, bachelor's degree, master's, etc.

- e.g., geographic location: Western Canada, Eastern Canada, Northern Canada

11 / 25

Dummy Variables for Multiple Categories

- We can use dummy variables for multiple categories in the same way that we use dummy variables for two categories

- Note that if we have g groups, we can only include g − 1 dummy variables in the regression model

- If we include all g dummy variables, then the constant term will be an exact linear combination of the dummy variables and we will have perfect collinearity; this is called the dummy variable trap (see the sketch below)
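A small pandas sketch of the g − 1 rule (the region labels and data are made up for illustration):

```python
# With g = 3 regions we include only g - 1 = 2 dummies; including all 3
# would sum to the constant column and cause perfect collinearity
# (the dummy variable trap).
import pandas as pd

df = pd.DataFrame({"region": ["West", "East", "North", "West", "East"]})

# drop_first=True keeps g - 1 dummies; the dropped category is the base group
dummies = pd.get_dummies(df["region"], prefix="region", drop_first=True).astype(int)
print(dummies)

# Including all g dummies reproduces the trap when a constant is also present:
all_dummies = pd.get_dummies(df["region"], prefix="region").astype(int)
print(all_dummies.sum(axis=1))   # every row sums to 1, i.e. the constant
```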

12 / 25

Dummy Variables for Multiple Categories

- For example, suppose we are interested in whether or not different races have different wages, conditional on education

- We can split race into: black, white, hispanic, indigenous, asian, other

- Then our regression model may look like the following:

wage = α0 + δ1 black + δ2 hisp + δ3 indig + δ4 asian + δ5 other + α1 educ + u

- where we have omitted "white" so that all other dummy variables are measured with respect to "white"

- α0 is thus the intercept when all the dummies are equal to 0, i.e., the intercept for "white" (a fitted sketch follows below)
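A minimal simulated sketch of estimating such a model, letting patsy's treatment coding make "white" the base group (the category labels, data, and numbers are illustrative assumptions):

```python
# g - 1 race dummies are built automatically, with "white" as the omitted group.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 600
races = ["white", "black", "hisp", "indig", "asian", "other"]
df = pd.DataFrame({
    "race": rng.choice(races, n),
    "educ": rng.integers(8, 18, n),
})
df["wage"] = 3 + 0.5 * df["educ"] + rng.normal(0, 2, n)

# C(race, Treatment("white")) creates one dummy per non-base category
res = smf.ols('wage ~ C(race, Treatment("white")) + educ', data=df).fit()
print(res.params)
```

Treatment coding drops the reference category, so each reported race coefficient is an intercept shift relative to "white", matching the equation above.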

13 / 25

Dummy Variables for Multiple Categories

- We can work this out for each category with expectations as follows:

- E(wage|white, educ = 0) = α0

- E(wage|black, educ = 0) = α0 + δ1

- E(wage|hisp, educ = 0) = α0 + δ2

- E(wage|indig, educ = 0) = α0 + δ3

- E(wage|asian, educ = 0) = α0 + δ4

- E(wage|other, educ = 0) = α0 + δ5

14 / 25

Interactions Involving Dummy Variables

- We can interact two dummy variables to see if the effect of one category depends on belonging to the other category

- e.g. does the effect of race on wage vary by gender?

- e.g. does the effect of gender on wage vary by marital status?

- e.g. does the effect of parental education on educational achievement vary by single-parent household status?

15 / 25

Interactions Involving Dummy Variables

- For instance, the question "does the effect of gender on wage vary by marital status?" can be addressed with the following regression:

wage = β0 + β1 female + β2 married + β3 female × married + u

- The interpretation of each of the dummy variables is as an intercept shift relative to the omitted category

- Here, the omitted category corresponds to the intercept when all dummies are equal to 0: female = 0, married = 0, female × married = 0, i.e. the intercept for non-female, non-married people (see the sketch below)
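A minimal simulated sketch of this dummy-by-dummy interaction and the four implied group intercepts (data and numbers are made up):

```python
# Interacting two dummies and recovering the four group intercepts.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 800
df = pd.DataFrame({
    "female": rng.integers(0, 2, n),
    "married": rng.integers(0, 2, n),
})
df["wage"] = (6 - 1.0 * df["female"] + 2.0 * df["married"]
              - 0.5 * df["female"] * df["married"] + rng.normal(0, 1, n))

# "female * married" expands to female + married + female:married
res = smf.ols("wage ~ female * married", data=df).fit()
b = res.params
print("non-female, non-married:", b["Intercept"])
print("female, non-married:    ", b["Intercept"] + b["female"])
print("non-female, married:    ", b["Intercept"] + b["married"])
print("female, married:        ",
      b["Intercept"] + b["female"] + b["married"] + b["female:married"])
```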

16 / 25

Interactions Involving Dummy Variables and Continuous Variables

- We have established that dummy variables shift the intercept up or down depending on "group membership"

- Interacting two dummy variables translates to another shift in intercept

- Interacting a dummy variable with a continuous variable is interpreted as a change in slope

- There are many relationships that might vary based on group membership

- e.g. does the number of weeks worked increase wages more for people with higher levels of schooling?

- e.g. is the return to education greater for men or women? Is it greater for minorities or non-minorities?

17 / 25

Interactions Involving Dummy Variables and Continuous Variables

- We'll consider as an example whether the effect of education on wages varies by gender:

wage = β0 + δ0 female + β1 education + δ1 female × education + u

- E(wage|fem = 0, edu = 0) = β0 is the intercept for non-female

- E(wage|fem = 1, edu = 0) = β0 + δ0 is the intercept for female

- dwage/dedu = β1 is the change in wage associated with a one-unit change in education for non-female

- dwage/dedu = β1 + δ1 is the change in wage associated with a one-unit change in education for female (a fitted sketch follows below)
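A minimal simulated sketch of this slope-shift specification (the data and numbers are made up; assumes statsmodels):

```python
# A female x education interaction lets both the intercept and the slope
# on education differ by gender.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 1000
df = pd.DataFrame({
    "female": rng.integers(0, 2, n),
    "educ": rng.integers(8, 18, n),
})
# true slopes: 0.9 for non-female, 0.9 - 0.2 for female
df["wage"] = (4 - 1.0 * df["female"] + 0.9 * df["educ"]
              - 0.2 * df["female"] * df["educ"] + rng.normal(0, 2, n))

res = smf.ols("wage ~ female * educ", data=df).fit()
b = res.params
print("slope on educ, non-female:", b["educ"])                     # beta1
print("slope on educ, female:    ", b["educ"] + b["female:educ"])  # beta1 + delta1

# p-value for H0: the return to education is the same for women and men
print(res.pvalues["female:educ"])
```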

18 / 25


Allowing for Different Slopes

We have now seen several examples of how to allow different intercepts for any number of groups in a multiple regression model. There are also occasions for interacting dummy variables with explanatory variables that are not dummy variables to allow for differences in slopes. Continuing with the wage example, suppose that we wish to test whether the return to education is the same for men and women, allowing for a constant wage differential between men and women (a differential for which we have already found evidence). For simplicity, we include only education and gender in the model. What kind of model allows for a constant wage differential as well as different returns to education? Consider the model

log(wage) = (β0 + δ0 female) + (β1 + δ1 female) educ + u.   (7.16)

If we plug female = 0 into (7.16), then we find that the intercept for males is β0, and the slope on education for males is β1. For females, we plug in female = 1; thus, the intercept for females is β0 + δ0, and the slope is β1 + δ1. Therefore, δ0 measures the difference in intercepts between women and men, and δ1 measures the difference in the return to education between women and men. Two of the four cases for the signs of δ0 and δ1 are presented in Figure 7.2.


Figure 7.2
Graphs of equation (7.16). (a) δ0 < 0, δ1 < 0; (b) δ0 < 0, δ1 > 0.
[Each panel plots wage against educ, with separate lines for men and women.]

19 / 25

Note that all the standard t statistics and F statistics are computed as before. Nothing changes with statistical testing for interactions or dummy variables. The only thing that changes with interactions and dummy variables, relative to what we saw in previous chapters, is our interpretation of the regression coefficients.

20 / 25


Testing for Differences in Regression Functions Across Groups

- We may be faced with situations in which we hypothesize that the entire regression function is different for two groups

- Start with the standard regression function

y = β0 + β1x1 + · · · + βkxk + u   (1)

- If the entire regression function were different based on some group membership, e.g. D = 1 if the individual belongs to group 1 and D = 0 otherwise, then we can rewrite this as:

y = β0 + δ0D + β1x1 + · · · + βkxk + δ1x1 × D + · · · + δkxk × D + e   (2)

- Here we have allowed both the intercept and all of the slope parameters to vary based on group membership

22 / 25

Testing for Differences in Regression Functions Across Groups

- If the regression function were exactly the same for group 1 and group 2, then δ0 = δ1 = · · · = δk = 0

- This suggests that a natural way to test this hypothesis is with an F statistic

- H0 : δ0 = δ1 = · · · = δk = 0

- Ha : at least one δj ≠ 0

- (2) is the unrestricted model

- (1) is the restricted model

23 / 25

Testing for Differences in Regression Functions Across Groups

- Then the F statistic is:

F = [(SSR_R − SSR_UR)/(k + 1)] / [SSR_UR/(n − 2k − 2)]

- which follows an F distribution with (k + 1, n − 2k − 2) degrees of freedom

- This statistic is called the Chow statistic

24 / 25

Testing for Differences in Regression Functions Across Groups

- The Chow statistic can also be computed as:

F = [(SSR_R − SSR_1 − SSR_2)/(k + 1)] / [(SSR_1 + SSR_2)/(n − 2k − 2)]

- where SSR_1 is the SSR obtained from running the restricted model (1) using only the sample of data from group 1, and SSR_2 is the SSR obtained from running the restricted model (1) using only the sample of data from group 2

- This suggests that there are two ways to compute the F statistic: one using the restricted and unrestricted models on the full sample, and one using the restricted model on the full sample, the group 1 sample, and the group 2 sample (both are illustrated in the sketch below)
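A minimal simulated sketch showing both computations of the Chow statistic (the variable names, data, and the choice of k = 2 regressors are illustrative assumptions):

```python
# (i) pooled restricted vs. fully interacted unrestricted model, and
# (ii) restricted model run separately on the two groups.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(5)
n = 1000
df = pd.DataFrame({
    "D": rng.integers(0, 2, n),            # group indicator
    "x1": rng.normal(0, 1, n),
    "x2": rng.normal(0, 1, n),
})
df["y"] = (1 + 0.5 * df["D"] + 2 * df["x1"] + 1 * df["x2"]
           + 0.7 * df["D"] * df["x1"] + rng.normal(0, 1, n))

k = 2                                                                # regressors in (1)
restricted = smf.ols("y ~ x1 + x2", data=df).fit()                   # model (1)
unrestricted = smf.ols("y ~ x1 + x2 + D + D:x1 + D:x2", data=df).fit()  # model (2)

# (i) standard F statistic from the restricted and unrestricted SSRs
F1 = ((restricted.ssr - unrestricted.ssr) / (k + 1)) / (unrestricted.ssr / (n - 2 * k - 2))

# (ii) restricted model (1) estimated on each group separately
ssr1 = smf.ols("y ~ x1 + x2", data=df[df["D"] == 1]).fit().ssr
ssr2 = smf.ols("y ~ x1 + x2", data=df[df["D"] == 0]).fit().ssr
F2 = ((restricted.ssr - ssr1 - ssr2) / (k + 1)) / ((ssr1 + ssr2) / (n - 2 * k - 2))

p = stats.f.sf(F1, k + 1, n - 2 * k - 2)   # p-value from F(k+1, n-2k-2)
print(F1, F2, p)                           # F1 and F2 coincide
```

The two statistics coincide because fitting the fully interacted model (2) on the pooled sample is equivalent to fitting (1) separately on each group, so SSR_UR = SSR_1 + SSR_2.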

25 / 25