Introduction to Statistics: Political Science (Class 7)
Part I: Interactions Wrap-up
Part II: Why Experiment in Political Science?
Why use an interaction term?
• Theoretical reason to think the relationship between one potential IV and the DV depends on the value of another IV
Was CER turned into a partisan issue by political rhetoric?
• DV: Support for Comparative Effectiveness Research (CER) – ranges from 0 “strongly oppose” to 100 “strongly support”
• We think the relationship between party affiliation and support depends on whether an individual is politically engaged (we measure this using “voted in 2008”)
Predictor                                     Coef.      SE        T        P
Party Affiliation (-3=strong R; 3=strong D)   1.286   0.878    1.460    0.143
Voted in 2008                                -1.138   1.484   -0.770    0.443
Party Affiliation x Voted in 2008             3.575   0.918    3.900    0.000
Constant                                     61.100   1.358   44.980    0.000
Support = 61.100 + 1.286*Party – 1.138*Voted + 3.575*Party*Voted + u
Support = 61.100 + Party*1.286 + Party*Voted*3.575 – 1.138*Voted + u
Support = 61.100 + Party(1.286 + Voted*3.575) – 1.138*Voted + u
OR
Support = 61.100 + Party*1.286 + Voted*Party*3.575 – Voted*1.138 + u
Support = 61.100 + Party*1.286 + Voted(Party*3.575 – 1.138) + u
Regression estimates an equation…
Predicted values: Did not vote (Voted = 0). Each "term" column is the coefficient times the value.
Party Aff.  Voted  Party Aff. term  Voted term  Party x Voted term  Constant  Predicted Value
Coefficients                 1.286      -1.138               3.575    61.100
        -3      0           -3.858           0                   0    61.100           57.242
        -2      0           -2.572           0                   0    61.100           58.528
        -1      0           -1.286           0                   0    61.100           59.814
         0      0            0.000           0                   0    61.100           61.100
         1      0            1.286           0                   0    61.100           62.386
         2      0            2.572           0                   0    61.100           63.672
         3      0            3.858           0                   0    61.100           64.959
Predicted values: Voted (Voted = 1). Each "term" column is the coefficient times the value.
Party Aff.  Voted  Party Aff. term  Voted term  Party x Voted term  Constant  Predicted Value
Coefficients                 1.286      -1.138               3.575    61.100
        -3      1           -3.858    -1.13775            -10.7258    61.100           45.378
        -2      1           -2.572    -1.13775             -7.1505    61.100           50.240
        -1      1           -1.286    -1.13775            -3.57525    61.100           55.101
         0      1            0.000    -1.13775                   0    61.100           59.962
         1      1            1.286    -1.13775            3.575252    61.100           64.824
         2      1            2.572    -1.13775            7.150504    61.100           69.685
         3      1            3.858    -1.13775            10.72576    61.100           74.547
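The arithmetic behind these predicted values can be sketched in a few lines of Python. This is only an illustration: the function and constant names are made up for this example, and it uses the rounded coefficients from the regression table, so results can differ from the slide values in the third decimal place.

```python
# Rounded coefficients from the regression table (DV: support for CER, 0-100).
CONSTANT = 61.100
B_PARTY = 1.286         # Party Affiliation (-3 = strong R; 3 = strong D)
B_VOTED = -1.138        # Voted in 2008 (0 = no, 1 = yes)
B_INTERACTION = 3.575   # Party Affiliation x Voted in 2008


def predicted_support(party: int, voted: int) -> float:
    """Predicted support: constant plus each coefficient times its value."""
    return (CONSTANT
            + B_PARTY * party
            + B_VOTED * voted
            + B_INTERACTION * party * voted)


def party_slope(voted: int) -> float:
    """Slope on party affiliation, which depends on whether the person voted."""
    return B_PARTY + B_INTERACTION * voted


for voted in (0, 1):
    label = "Voted" if voted else "Did not vote"
    print(f"{label}: slope on party = {party_slope(voted):.3f}")
    for party in range(-3, 4):
        print(f"  party={party:+d}  predicted={predicted_support(party, voted):.3f}")
```

The second function is the grouped form Party(1.286 + Voted*3.575): the slope on party is 1.286 for non-voters and 1.286 + 3.575 for voters.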
[Figure: Support for Comparative Effectiveness Research (y-axis, 40 to 80) by party affiliation (x-axis, Strong Republican to Strong Democrat), with separate series for Did not Vote and Voted]
Why/how does this work?
• Remember: OLS “blindly” identifies the coefficients on the IVs you specify that minimize the sum of the squared residuals
• If the relationship between X1 and Y does not depend on the value of X2, then the estimated coefficient on the interaction will be approximately 0, because that will lead to the best fit!
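A small simulation can make this concrete. The sketch below is not from the lecture: the data are simulated, the variable names and true coefficients are arbitrary assumptions, and the true model deliberately contains no interaction. OLS (here via NumPy's least-squares solver) should then estimate an interaction coefficient close to 0.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated (hypothetical) data: x1 runs from -3 to 3, x2 is a 0/1 indicator.
x1 = rng.integers(-3, 4, size=n).astype(float)
x2 = rng.integers(0, 2, size=n).astype(float)

# True model has NO interaction: the effect of x1 is 1.5 regardless of x2.
y = 2.0 + 1.5 * x1 - 1.0 * x2 + rng.normal(0.0, 1.0, size=n)

# Fit y on a constant, x1, x2, and the product x1*x2 by least squares.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, b_x1, b_x2, b_interaction = coefs

print(f"constant    = {intercept:.3f}")
print(f"x1          = {b_x1:.3f}")
print(f"x2          = {b_x2:.3f}")
print(f"x1 x x2     = {b_interaction:.3f}  (true value is 0)")
```

The estimate is not exactly 0 in any one sample because of random noise, which is why the standard error and p-value on the interaction term matter.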
Two primary threats to identifying causal relationships
• Reverse causation
– If we find an association, what causes what?
• Confounding / missing variables
– Unaccounted-for factors that might lead to biased estimates of the relationship between an explanatory variable and outcome
Experimental data
• Emphasis on the data gathering process
• Randomized intervention
– Defining characteristic of experiments. What’s so great about it?
The logic of random assignment
• If each of you were to roll a die and:
– Be assigned to group 1 if you roll a 1, 2, or 3
– Be assigned to group 2 if you roll a 4, 5, or 6
• On average, how would the two groups differ?
Benefits of Random Assignment
• Random assignment ensures that, on average, treatment and control groups will be similar in every respect except for the fact that one group is “treated”
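The die-roll logic can be checked with a quick simulation. This is a sketch under stated assumptions: the population and its "age" trait are invented for illustration, and any pre-existing characteristic would work the same way.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical pre-existing trait for each person (made up for this example).
age = rng.normal(45.0, 15.0, size=n)

# Each person rolls a fair die: 1-3 -> group 1, 4-6 -> group 2.
rolls = rng.integers(1, 7, size=n)
group = np.where(rolls <= 3, 1, 2)

share_group_1 = (group == 1).mean()
mean_age_1 = age[group == 1].mean()
mean_age_2 = age[group == 2].mean()
diff = mean_age_1 - mean_age_2

print(f"share in group 1: {share_group_1:.3f}")
print(f"mean age, group 1: {mean_age_1:.2f}")
print(f"mean age, group 2: {mean_age_2:.2f}")
print(f"difference: {diff:.2f}")
```

Because the die knows nothing about age, the two group means differ only by chance, and the same holds for traits we never measured.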
Does media bias affect party attachments?
• Observational (survey)
– What is your main source of TV news?
– Fox News: 63% Republicans, 22% Democrats
– CNN: 25% Republicans, 63% Democrats
• If we run a regression predicting party identification with main news source as the independent variable…
– Missing variables?
– Reverse causation?
Does media bias affect attitudes?
• Experiment – recruit a bunch of New Haven residents
– Randomly assign to watch:
• A conservative news program OR
• A liberal program OR
• A placebo or nothing
– Measure issue attitudes
• Compare attitudes across groups
Media Experiment
• What confounds would we account for?
• Treatment is – by design – not correlated with anything else. So no confounds!
• Is reverse causation a problem?
External validity
• Limits of examining effect of media bias on party attachments “in the lab”?
– Is this how people really watch TV?
– Is one “session” enough?
– Demand effects?
– Is the sample likely to be affected in a unique way?
Do GOTV efforts work?
• During a presidential election year, campaigns spend loads of money on efforts to get people to vote
• But how do we know if they work?
• One possibility: survey people
– Ask if they were contacted
– Ask if they voted
Do GOTV efforts work?
               Not Contacted   Contacted
Did not Vote   374 (33.8%)     124 (12.5%)
Voted          731 (66.2%)     870 (87.5%)
DV = Turnout
Predictor    Coef      SE       T       P
Contacted   0.214   0.018   11.87   0.000
Constant    0.662   0.012   53.38   0.000
• Being contacted increases the probability that someone will turn out by 21 percentage points????
– What else could explain (confound) this relationship?
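To see where the 0.214 and 0.662 come from, the sketch below rebuilds individual-level data from the four cell counts in the contact-by-turnout table and regresses turnout on contact. The variable names are illustrative. With a single 0/1 predictor, the OLS constant is the turnout rate among the not contacted and the slope is the difference in turnout rates.

```python
import numpy as np

# Cell counts from the survey table, keyed by (contacted, voted).
counts = {(0, 0): 374, (0, 1): 731, (1, 0): 124, (1, 1): 870}

# Expand the counts into one row per respondent.
contacted = np.concatenate(
    [np.full(c, key[0], dtype=float) for key, c in counts.items()])
voted = np.concatenate(
    [np.full(c, key[1], dtype=float) for key, c in counts.items()])

# OLS of turnout on a constant and the contact indicator.
X = np.column_stack([np.ones_like(contacted), contacted])
coefs, *_ = np.linalg.lstsq(X, voted, rcond=None)
intercept, slope = coefs

# Turnout rates computed directly from the table.
rate_not_contacted = 731 / (374 + 731)
rate_contacted = 870 / (124 + 870)

print(f"constant  = {intercept:.3f}  (turnout if not contacted: {rate_not_contacted:.3f})")
print(f"contacted = {slope:.3f}  (difference in rates: {rate_contacted - rate_not_contacted:.3f})")
```

The regression reproduces the gap in the table, but it says nothing about why the gap exists: if campaigns contact people who were already likely to vote, the same numbers would appear.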
GOTV: lab or survey experiment
• Lab or survey experiment: embed a randomized treatment (text) in a survey
• Effects of GOTV messages:
– Randomly present some people with a message encouraging them to vote and not others
– Ask them how likely they say they are to vote
– See if people presented with the message say they are more likely to vote
• Strengths of this? Weaknesses?
GOTV: field experiment
• Field experiment: intervention done while people are going about their business
• Effects of GOTV messages:
– Randomly send some people on the voter rolls a message encouraging them to vote and not others.
– Check the voter rolls after the election and see if people who were sent a message were more likely to vote.
Benefits of Field Experiments
• What are some of the benefits of a field experiment like this?
• Big one: External validity
Toolbox
• Multivariate regression and experiments are two ways to attempt to make inferences about causality
• Benefits of observational analysis:
– Can “find” data – don’t have to gather it yourself
– Sometimes the only reasonable approach (What causes wars? How does GDP affect infant mortality?)
Toolbox
• Costs of observational:
– Difficult (impossible?) to definitively determine causation
• Did we measure every possible confound?
• Did we specify the controlled relationships properly?
• What causes what?
Baby, bathwater
• This does not mean that multivariate regression is useless!
– If we think carefully about what the right regression model should be… we can get to pretty darn good (i.e., defensible) estimates
• This means think theoretically:
– Do we have a strong prior expectation that X causes Y, rather than Y causing X?
– What factors might confound our estimates?