Introduction to Statistics: Political Science (Class 1)
Answering Political Questions with Quantitative Data
(political variables, review of bivariate regression, & thinking about causality)
Why learn how to answer political questions with quantitative data?
• Area to apply/practice using statistics
– Tools can be applied elsewhere (on the job, health decisions [Atkins/gluten free?])
• Understand cause and effect in politics
– Academic reasons – develop knowledge that can be passed on to others
– As a citizen – evaluate evidence about policies; who deserves credit/blame
• Prepare for your future responsibilities as political officials???
What types of questions can data analysis help us to answer?
• International relations
– Why do countries go to war?
• Comparative politics
– Why does the rate of infant mortality vary across countries?
• Policy
– How can we improve student test scores?
• Public opinion/political behavior
– How do people decide whether to vote?
– What policies does the public support and why?
Today’s agenda…
Measuring political concepts
Review of bivariate regression
Thinking about causality
Measurement: Units of analysis
• What are the cases/rows in political data?
• Actors: individuals, elected officials
• Geographic/political units: states, countries, precincts
• Events: individual congressional races, elections (e.g., “seats won”), court cases
• Unit/Time: country-year, individual at time T
Measurement: Data Sources
• Government / historical records
– Vote by precinct; GDP/economic data; individual turnout
• Expert assessments
– Level of democracy; presidents’ personalities
• Surveys
– Reported attitudes / behaviors
For example…
• Distribution of a variable in politics
• What is this “margin of error +/- 3%”?
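One way to see where a figure like "+/- 3%" can come from is the usual 95% margin-of-error approximation for a survey proportion. The sketch below assumes a simple random sample; the function name and the sample size of 1,067 are made up for illustration and do not come from any poll discussed in class.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    return z * standard_error


# Hypothetical poll size, chosen only for illustration.
moe = margin_of_error(1067)
print(f"Margin of error: +/- {moe * 100:.1f} percentage points")
```

Using proportion=0.5 gives the widest (most cautious) margin for a given sample size, which is why it is a common default.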
Relationships between variables (regression analysis)
• Two types of variables:
– Dependent variable (or predicted variable or “regressand”) – what we want to predict
– Independent variable (or explanatory variable or “regressor”) – what we use to predict it
• Bivariate regression model
Y = β0 + β1X + u
How does presidential approval affect midterm election outcomes?
• Unit of analysis: midterm election (1950-2006)
• Dependent variable: seats gained by incumbent president’s party (House)
• Independent variable: presidential approval on Labor Day of election year
– 0 (no one approves) to 100 (everyone approves)
                        Coef     SE Coef   T       P
Presidential Approval   1.32     0.50      2.64    0.020
Constant                -93.32   27.28     -3.42   0.005
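The T column in this output is the coefficient divided by its standard error. A minimal check using the numbers reported for the midterm regression (the function name is just for illustration):

```python
def t_statistic(coefficient, standard_error):
    """T value as reported in regression output: coefficient / standard error."""
    return coefficient / standard_error


# Values copied from the regression output for the midterm example.
t_approval = t_statistic(1.32, 0.50)
t_constant = t_statistic(-93.32, 27.28)

print(f"Presidential Approval: T = {t_approval:.2f}")
print(f"Constant:              T = {t_constant:.2f}")
```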
[Figure: scatterplot of Seats Gained in the House by President's Party (y-axis, -60 to 10) against Presidential Approval on Labor Day (x-axis, 40 to 65); points labeled by midterm election year, 1950-2006]
Y = β0 + β1X + u
Seats = -93.32 + (1.32 * Approval) + u
In 1978, Carter’s approval was 49%
Remember: in regression analysis (aka “Ordinary Least Squares”), the “best fit” line is the one that minimizes the sum of the squared residuals
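A small sketch of what "minimizes the sum of the squared residuals" means in practice. The data below are made up for illustration (they are NOT the midterm data), and the helper names are hypothetical. The code fits the bivariate line with the standard closed-form formulas, then checks that nudging the intercept or slope away from the fitted values only makes the sum of squared residuals larger.

```python
def fit_ols(x, y):
    """Closed-form bivariate OLS: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared gaps between observed y and the line's prediction."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up illustrative data, not the lecture's data set.
approval = [40, 45, 50, 55, 60, 65]
seats = [-45, -30, -35, -12, -20, 3]

b0, b1 = fit_ols(approval, seats)
best = sum_squared_residuals(approval, seats, b0, b1)

# Try a few other lines: shift the intercept and/or tilt the slope.
alternatives = []
for d0, d1 in [(5, 0), (-5, 0), (0, 0.2), (0, -0.2), (3, -0.1)]:
    ssr = sum_squared_residuals(approval, seats, b0 + d0, b1 + d1)
    alternatives.append(ssr)
    print(f"intercept {b0 + d0:8.2f}, slope {b1 + d1:5.2f}: SSR = {ssr:9.2f}")

print(f"OLS line: intercept {b0:.2f}, slope {b1:.2f}, SSR = {best:.2f}")
```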
-15
Obama’s approval rating was 46%
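Plugging an approval rating into the fitted equation gives the predicted seat change. A minimal sketch using the estimated coefficients from the midterm regression (the constant and function names are only for illustration; the error term u is left out because a prediction uses the fitted line alone):

```python
INTERCEPT = -93.32  # estimated constant from the regression output
SLOPE = 1.32        # estimated coefficient on presidential approval


def predicted_seat_change(approval):
    """Predicted House seats gained by the president's party at a given approval (0-100)."""
    return INTERCEPT + SLOPE * approval


for label, approval in [("Carter, 1978", 49), ("Obama", 46)]:
    prediction = predicted_seat_change(approval)
    print(f"{label}: approval {approval} -> predicted seat change {prediction:.2f}")
```

The gap between a prediction like this and the actual result in a given year is that year's residual.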
Democratic Peace
• Theory: Democracies tend not to go to war with one another – why would this be?
• What does a democracy look like? How could we measure “democracy”?
Polity III Democracy score (0-10)
• Competitiveness of Executive Recruitment
– Selection (e.g., hereditary, military-based, rigged) (0 points)
– Dual/Transactional (one hereditary/one by elections) (1 point)
– Election (2 points)
• Constraints on Chief Executive
– Unlimited Authority (0 points)
– Substantial limitations (2 points)
– Parity/Subordination (4 points)
• Openness of Executive Recruitment
– 0 or 1 point
• Competitiveness of participation
– Repressed/no participation (0 points)
– Factional (ethnic/parochial factions battle it out; 1 point)
– Transitional
– Competitive (stable and enduring secular political groups compete for political influence at the national level; 3 points)
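A rough sketch of how the components add up to a 0-10 score. The dictionary keys and function name are hypothetical labels, and only categories whose point values appear on the slide are included (the "Transitional" participation category is left out because its points are not listed here). This is an illustration of the additive idea, not the official Polity coding procedure.

```python
# Point values as listed on the slide; keys are illustrative labels.
EXECUTIVE_RECRUITMENT_COMPETITIVENESS = {"selection": 0, "dual": 1, "election": 2}
EXECUTIVE_CONSTRAINTS = {"unlimited": 0, "substantial": 2, "parity": 4}
PARTICIPATION_COMPETITIVENESS = {"repressed": 0, "factional": 1, "competitive": 3}


def democracy_score(recruitment, constraints, openness_points, participation):
    """Sum the four component scores into a single 0-10 democracy score."""
    if openness_points not in (0, 1):
        raise ValueError("openness of executive recruitment is worth 0 or 1 point")
    return (
        EXECUTIVE_RECRUITMENT_COMPETITIVENESS[recruitment]
        + EXECUTIVE_CONSTRAINTS[constraints]
        + openness_points
        + PARTICIPATION_COMPETITIVENESS[participation]
    )


# Hypothetical countries at the two ends of the scale.
most_democratic = democracy_score("election", "parity", 1, "competitive")
least_democratic = democracy_score("selection", "unlimited", 0, "repressed")
print(most_democratic, least_democratic)
```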
Democratic Peace?
• Units of analysis: country-dyad-years
– Restricted to “relevant” dyads (1945-2008)
• Dependent variable: number of years the pair of countries have been at peace
• Independent variable: sum of countries’ democracy scores (0-20)
                   Coef    SE Coef   T       P
Democracy Scores   0.259   0.023     11.34   0.000
Constant           23.21   0.253     91.82   0.000
Why are these SEs so small / T values so big???
N=35,554
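One way to build intuition for the sample-size question: with everything else held fixed, the standard error of a slope shrinks roughly in proportion to 1 / sqrt(N). The sketch below uses simulated data, not the dyad data; the "true" intercept, slope, and noise level are arbitrary assumptions, and the two sample sizes are chosen to contrast a handful of cases with N = 35,554.

```python
import math
import random


def slope_and_se(x, y):
    """Bivariate OLS slope and its conventional standard error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_variance = ssr / (n - 2)  # two estimated parameters
    return slope, math.sqrt(residual_variance / sxx)


def simulate(n, rng):
    """Simulated data: x on a 0-20 scale, y = 23 + 0.25x + noise (all values assumed)."""
    x = [rng.uniform(0, 20) for _ in range(n)]
    y = [23.0 + 0.25 * xi + rng.gauss(0, 20) for xi in x]
    return slope_and_se(x, y)


rng = random.Random(0)  # fixed seed so the run is repeatable
slope_small, se_small = simulate(15, rng)
slope_large, se_large = simulate(35554, rng)

print(f"N = 15:     slope = {slope_small:.3f}, SE = {se_small:.3f}")
print(f"N = 35554:  slope = {slope_large:.3f}, SE = {se_large:.3f}")
```

Because T is the coefficient divided by its SE, a much smaller SE at large N also means a much bigger T value, even when the estimated slope itself is modest.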
Causal relationships
• Identifying associations is nice, but usually we want to identify causality
• Two primary threats
– Reverse causation
  • If we find an association, what causes what?
– Confounding / missing variables
  • Additional factors that might lead us to give too much “credit” to an explanatory variable
Reverse Causation?
[Diagram: Contact by a Political Campaign and Intent to Vote, with the direction of the causal arrow between them marked “?”]
NOTE: Solid lines = proposed causal relationship; dotted lines = non-causal correlation
Let’s say we have some survey data…
Missing variable?
[Diagram: Hot Weather causes both Forest Fires and Ice Cream Sales (“Common Response”); the link between Forest Fires and Ice Cream Sales is marked “?”]
NOTE: Solid lines = proposed causal relationship; dotted lines = non-causal correlation
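The forest fires / ice cream example can be imitated with simulated data. In the sketch below, hot weather drives both outcomes and neither outcome affects the other; all numbers are invented. The two outcomes end up strongly correlated, but once the part of each that is explained by weather is removed (by taking residuals from a bivariate regression on weather), the leftover correlation is close to zero.

```python
import math
import random


def correlation(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def residuals(y, x):
    """Residuals from a bivariate OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Simulated (standardized) data: weather is the common cause of both outcomes.
hot_weather = [rng.gauss(0, 1) for _ in range(n)]
ice_cream_sales = [h + rng.gauss(0, 0.5) for h in hot_weather]
forest_fires = [h + rng.gauss(0, 0.5) for h in hot_weather]

raw = correlation(ice_cream_sales, forest_fires)
adjusted = correlation(
    residuals(ice_cream_sales, hot_weather),
    residuals(forest_fires, hot_weather),
)

print(f"Correlation, ignoring weather:      {raw:.2f}")
print(f"Correlation, after removing weather: {adjusted:.2f}")
```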
What else might explain midterm outcomes?
Were we giving too much “credit” to presidential approval ratings as an explanation in our bivariate analysis?
[Diagram: Presidential Approval (Labor Day before election) and Midterm Outcomes, with Economic Conditions as a possible confounder marked “?”]
Democratic Peace?
[Diagram: Level of Democracy in Pair of Countries and Pair of Countries (do not) Go to War, with Military Power of Pair of Countries as a possible confounder marked “?”]
Explanations for lower likelihood of war that might confound the relationship between
democracy and peace?
For the next few weeks…
Thinking about and accounting for more than one possible explanation
– Next 4 classes: using multivariate regression to deal with known, measured confounds
– Later: dealing with unknown confounds and reverse causation
Goals
• By the end of the semester you will be...
• ...able to conduct and interpret multivariate regression analysis and analyze experimental data
• ...better prepared to understand quantitative findings reported in political science (and other) research
• ...able to think critically about and recognize the strengths and weaknesses of these analyses
Grading/expectations
• No new books – but you’re encouraged to have *a book*
• 4 homework assignments
– Conduct and interpret analysis
– Think about how analyses could be improved
• Participation
– If you don’t understand, ask!
• The final: about 1/3 focused on first segment of the class, 2/3 on this segment
Note on next week
• First homework assignment will be handed out this Thursday. Due next Thursday.
• No class next Tuesday
• TAs will hold extra office hours on Monday (November 1st – see syllabus for times)
• Take a look at the homework before Monday – you may need help!
Next time (Thursday)
• What multiple regression analysis (regression with more than one explanatory variable) can get us