Applying Geostatistical Methods to Lattice Data: An Initial
Examination of U.S. Presidential Elections in Iowa
A.C. Thomas
Statistics 225
December 14, 2004
Sources/Guides
• Main source: “Hierarchical Models”, chapters 2 and 3 (geostatistical and spatial data)
• Data sources: http://www.sos.state.ia.us/elections/results/ (1996/2000)
• http://www.cnn.com/ (2004)
• Special thanks: Brad Carlin (UMN), Andy Gelman (Columbia), Paul Edlefsen (Harvard)
• GeoR: P.J. Ribeiro and P.J. Diggle
Motivation
• In this course, we have learned three methods of examining spatial data, each suited to different conditions but partly interchangeable
• Often, we may not have the tools to examine a data set using one method (e.g. the shortcomings of R in manipulating lattice data)
• In this case, we will compare and contrast the effectiveness of a geostatistical method used on lattice data to a lattice method through self cross-validation
Interrelationship
• Geostats and kriging: using variograms and distance relationships to predict quantities across distances
• Lattices: using neighbour relationships to predict quantities across distances
• Direct similarities: some weighting schemes across distances directly resemble covariograms
Why election data?
• Why not?
• Spatial organization is well understood, constant in time (county borders have not changed across data sets) and built into R (maps library)
• While specific challengers change over time, parties are relatively constant, as are other control variables
• Ramifications are germane to the functioning of society (and the insatiable appetite of news junkies and policy wonks)
Questions:
• For this data set, does a geostatistical approximation produce a result comparable in error to a lattice model?
• If so, can we use fitted information from one election to predict the complete results of the next one? (And how much are we off?)
Chosen model: Iowa
Why Iowa?
• 99 counties which have roughly equal area, removing a possible nuisance (and are rectilinear, so easier to draw)
• Swing state, with a rough vote balance over time
• Not too big, not too small in either population or size
Simplification: No third parties
• For now, considering only the votes for Democrat and Republican candidates in presidential elections from 1996-2004
• Not so bad in 2000/2004, when independent vote was about 3% of total
• Worse in 1996, when Perot’s successful campaign drew a lot: up to 10% of total votes
Iowa in 1996 (Dole, Clinton)
Iowa in 2000 (Bush, Gore)
Iowa in 2004 (Bush, Kerry)
Initial impressions
• There seems to be a tendency to vote more Republican the further west we look
• (Observation, courtesy Matt Anthony: as we go east, we hit Illinois, a Democratic core.)
• What is the population distribution by county over time?
Iowa’s total voters, 1996
Iowa’s total voters, 2000
Iowa’s total voters, 2004
Quick-and-dirty non-spatial analysis
• Question: how does population size correlate with the Democratic vote?
• Correlation between blue vote and “total” vote:
• 1996: 0.18
• 2000: 0.30
• 2004: 0.29
• So population would appear to be an important covariate.
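The correlation check above can be sketched in a few lines. This is a minimal Python illustration, not the original R analysis: the county figures are hypothetical, and it assumes the "blue vote" is measured as the Democratic share of the two-party vote, which the slide does not specify.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical (Democratic votes, Republican votes) per county; NOT the Iowa data.
counties = [(5200, 4800), (1500, 2500), (900, 1600), (12000, 9000), (2100, 2900)]
total_votes = [d + r for d, r in counties]
dem_share = [d / (d + r) for d, r in counties]  # assumed definition of "blue vote"
rho = pearson(total_votes, dem_share)
```

A positive value means larger counties lean more Democratic, which is the pattern the slide reports for all three elections.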
Geostatistical analysis
• Locations: centroids of each county (obtained through centroid.polygon function in maps library of R)
• Data: Republican percentage of vote (arbitrarily chosen, not necessarily personal political affiliation)
Initial data plots: Unaltered
Initial fitting
• Semivariogram appears to increase without bound, suggesting nonstationarity
• Plan: use Universal Kriging with this semivariogram
• Problem: Trend appears to be a power law with power greater than 2 (impossible to fit with conventional definitions)
• Possible solutions: a) remove trend from data. b) don’t care.
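To make the "increases without bound" diagnosis concrete, here is a minimal sketch of a binned empirical semivariogram. It is written in plain Python rather than geoR, the grid and response values are hypothetical, and the function name and binning scheme are illustrative choices, not the ones used in the original analysis.

```python
from math import hypot

def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Return [(bin midpoint, semivariance, pair count)] for non-empty distance bins."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
            k = int(d // bin_width)
            if k < n_bins:  # pairs beyond the last bin are ignored
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # Semivariance is half the mean squared difference within each bin.
    return [((k + 0.5) * bin_width, sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_bins) if counts[k] > 0]

# Hypothetical 4x4 grid of "centroids" with a pure west-to-east linear trend.
coords = [(x, y) for x in range(4) for y in range(4)]
values = [0.60 - 0.05 * x for x, y in coords]
gamma = empirical_semivariogram(coords, values, bin_width=1.0, n_bins=4)
```

With an unremoved trend like this one, the semivariance keeps growing with distance instead of levelling off at a sill, which is the signature of nonstationarity described above.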
Plan A: Remove trend from data
• What it does: lets us remove known spatial dependence, look at other trends
• Initial look: major discrepancies.
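Plan A can be sketched as an ordinary least-squares fit of a second-degree polynomial surface in the centroid coordinates, followed by subtraction. The code below is an assumption-laden illustration in plain Python: the helper names are hypothetical, the data are synthetic, and the original work was done in R, whose exact fitting routine the slides do not state.

```python
def design_row(x, y):
    """Second-degree polynomial terms in the coordinates."""
    return [1.0, x, y, x * x, x * y, y * y]

def solve(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - tail) / m[r][r]
    return beta

def detrend(coords, values):
    """Fit the quadratic surface via the normal equations; return (residuals, fitted)."""
    rows = [design_row(x, y) for x, y in coords]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xtz = [sum(r[i] * z for r, z in zip(rows, values)) for i in range(p)]
    beta = solve(xtx, xtz)
    fitted = [sum(b * t for b, t in zip(beta, row)) for row in rows]
    residuals = [z - f for z, f in zip(values, fitted)]
    return residuals, fitted

# Synthetic data that follow an exact quadratic trend, so residuals should vanish.
coords = [(float(x), float(y)) for x in range(5) for y in range(4)]
values = [0.5 - 0.03 * x + 0.01 * y + 0.004 * x * x for x, y in coords]
residuals, fitted = detrend(coords, values)
```

Keeping `fitted` alongside `residuals` matters for the later step of returning the trend to the predictions before comparing them with the observed values.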
Plan B: Don’t care.
• The goodness of fit only tails off at the end
• Preliminary results show the other option to be extremely inaccurate due to noise levels in residual data
Second trend removed, data centered
Exploratory Kriging
Meaningful Kriging
• Since we want to test the predictive power of this method, we should test it on our current data through cross-validation
• Key: remove one point, use the semivariogram fitted to the remaining points to interpolate the value at the removed centroid, and repeat for each county
• Then, return trend to data and compare with original values
• Use universal kriging with second-degree trend
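The leave-one-out loop described above can be sketched independently of the interpolator. In this Python illustration the predictor is inverse-distance weighting, used purely as a simple stand-in; it is not universal kriging, and the coordinates and values are hypothetical.

```python
from math import hypot

def idw_predict(train_coords, train_values, target, power=2.0):
    """Inverse-distance-weighted mean; a stand-in for the kriging predictor."""
    weights = [1.0 / hypot(x - target[0], y - target[1]) ** power
               for x, y in train_coords]
    return sum(w * v for w, v in zip(weights, train_values)) / sum(weights)

def leave_one_out(coords, values, predict):
    """Predict each site from all the other sites; return the predictions in order."""
    predictions = []
    for i in range(len(coords)):
        rest_coords = coords[:i] + coords[i + 1:]
        rest_values = values[:i] + values[i + 1:]
        predictions.append(predict(rest_coords, rest_values, coords[i]))
    return predictions

# Hypothetical centroids (distinct locations) and Republican vote shares.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
values = [0.58, 0.52, 0.47, 0.57, 0.51, 0.46]
predicted = leave_one_out(coords, values, idw_predict)
errors = [p - v for p, v in zip(predicted, values)]
```

Passing the predictor as a function means the same harness could wrap a kriging routine; in the workflow on this slide, that routine would operate on detrended values and the trend would be added back before computing `errors`.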
1996 Redux – Predicted Values
• In total, Dole “receives” 9,726 more votes than predicted.
• Absolute error: 43,526
• Total 2-party votes: 1,112,902
Fitting variograms between models
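• For all, the power model was the appropriate choice: $\gamma(t) = \tau^2 + \sigma^2 t^{\lambda}$
• 1996: $\sigma^2$ = 9.24e-4, $\lambda$ = 1.98, $\tau^2$ = 0.031
• 2000: $\sigma^2$ = 9.93e-4, $\lambda$ = 2.00, $\tau^2$ = 0
• 2004: $\sigma^2$ = 1.16e-3, $\lambda$ = 2.00, $\tau^2$ = 0.025
• All roughly identical, even with different total averages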
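As a quick way to compare the three fits, the power model can be evaluated directly. This sketch assumes the model has the form nugget plus scale times distance raised to an exponent, and that in each year's listing the first value is the scale, the second the exponent and the third the nugget; the symbols were lost in the transcript, so that assignment is an assumption. The distance units are likewise not stated in the slides.

```python
def power_variogram(t, sigma2, lam, tau2):
    """Power-model semivariogram: tau2 + sigma2 * t**lam for t > 0, and 0 at t = 0."""
    if t == 0:
        return 0.0
    return tau2 + sigma2 * t ** lam

# Fitted values as listed on the slide; the parameter-to-symbol mapping is assumed.
FITS = {
    1996: dict(sigma2=9.24e-4, lam=1.98, tau2=0.031),
    2000: dict(sigma2=9.93e-4, lam=2.00, tau2=0.0),
    2004: dict(sigma2=1.16e-3, lam=2.00, tau2=0.025),
}

for year, pars in FITS.items():
    curve = [power_variogram(t, **pars) for t in (0.5, 1.0, 2.0)]
    print(year, [round(g, 4) for g in curve])
```

Because the exponent is close to 2 in every year, the curves differ mainly by a vertical shift (the nugget) and a small change in scale.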
2000 Predicted
• Prediction: Bush gets 26,000 more votes
• Absolute error: 181,880
• Total Bush/Gore votes: 1,272,890
2004 Prediction
• Prediction: Bush gets 32,094 more votes
• Absolute difference: 74,458
• Total votes: 1,479,702
“Naïve Neighbour”
• For a baseline comparison, take the simplest (stupidest) lattice cross-validation test – “ask your neighbour”, trivial SAR weights
• Predicted value at a square is simply the mean of border-sharing neighbours (data is Republican percentage of vote)
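The "ask your neighbour" baseline is simple enough to write out in full. The sketch below uses a hypothetical 2x2 block of counties, and it assumes that the reported error figures are the net and the summed absolute differences between predicted and actual Republican votes per county; the slides do not spell out that definition or its sign convention.

```python
def naive_neighbour(shares, neighbours):
    """Predict each county's share as the mean share of its border-sharing neighbours."""
    return {c: sum(shares[n] for n in neighbours[c]) / len(neighbours[c])
            for c in shares}

def vote_errors(shares, predicted, totals):
    """Net and absolute error in Republican votes, summed over counties.

    Positive net error means the prediction gives Republicans more votes than
    they actually received (an assumed sign convention).
    """
    diffs = [(predicted[c] - shares[c]) * totals[c] for c in shares]
    return sum(diffs), sum(abs(d) for d in diffs)

# Hypothetical counties: Republican share, two-party turnout, shared borders.
shares = {"A": 0.55, "B": 0.48, "C": 0.60, "D": 0.50}
totals = {"A": 10000, "B": 20000, "C": 5000, "D": 8000}
neighbours = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}

predicted = naive_neighbour(shares, neighbours)
net_error, absolute_error = vote_errors(shares, predicted, totals)
```

Weighting the share differences by each county's turnout is what turns a percentage prediction into the vote counts quoted on the following slides.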
“NN” 1996
• Dole: 10,819 more predicted
• Total deviation: 40,923
“NN” 2000
• Bush gets 28,535 extra in prediction
• Total deviation: 59,670
“NN” 2004
• Bush gets 37,175 more
• Total deviation: 76,926
Cross-validation summary
Year   Geostat error   NN error   Geostat total error   NN total error   Voting pop.
1996           9,726     10,819                43,526           40,923     1,112,902
2000          26,000     28,535                61,485           59,670     1,272,890
2004          32,094     37,175                74,458           76,926     1,479,702
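The summary figures are easier to compare as percentages of the two-party vote. The short script below does that arithmetic using only the numbers in the table; the dictionary layout and function name are illustrative.

```python
# (net geostat, net NN, total geostat, total NN, two-party votes) from the summary table
SUMMARY = {
    1996: (9726, 10819, 43526, 40923, 1112902),
    2000: (26000, 28535, 61485, 59670, 1272890),
    2004: (32094, 37175, 74458, 76926, 1479702),
}

def as_percentages(row):
    """Express each error in the row as a percentage of the voting population."""
    *errors, total = row
    return [100.0 * e / total for e in errors]

relative = {year: as_percentages(row) for year, row in SUMMARY.items()}

for year, pcts in relative.items():
    print(year, " ".join(f"{p:5.2f}%" for p in pcts))
```

On this scale the total errors of both methods sit at roughly 4 to 5 percent of the votes cast in every year, which is large relative to the margins in a swing state.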
Conclusions
• Data is definitely not stationary, even after removing trends
• Good kriging is about as effective as “naïve neighbour”, both without covariates
• Prediction with these tools at this simple level is not yet accurate enough
• Each method overpredicts the Republican vote
• Fitting information for each year is very close
Future Developments and Unanswered Questions – New!
• I’ve since introduced universal co-kriging with population, past voting behavior and second-degree spatial dependences using the gstat package.
• Needed: data from the last 4 elections, conveniently packaged. Other prediction using spatial methods.