POL 51: Scientific Study of Politics
Prof. B. Jones
Dept. of Political Science
UC-Davis
Plots and Z-scores
How to do some of the “stuff” in HW 4
  Multiple plots on a single page
  Creating z-scores and finding p-values
  Visualizing political data
Data: Obama vote share by county
Dot Chart: Obama Vote
dotchart(obamapercent, labels=row.names, cex=.7, xlim=c(0, 100), main="Support for Obama", xlab="Percent Obama")
abline(v=50)
Returns:
[Figure: dot chart titled "Support for Obama"; x-axis "Percent Obama", 0 to 100; one row per California county. Counties in the order extracted from the chart: Modoc, Lassen, Shasta, Tehama, Glenn, Sierra, Colusa, Kern, Yuba, Sutter, Tulare, Calaveras, Kings, Amador, Madera, Mariposa, Tuolumne, Plumas, Siskiyou, Inyo, El Dorado, Placer, Del Norte, Orange, Butte, Stanislaus, Fresno, Riverside, Trinity, San Bernardino, Nevada, San Luis Obispo, Merced, San Diego, San Joaquin, Ventura, Mono, Sacramento, Lake, San Benito, Santa Barbara, Imperial, Alpine, Humboldt, Solano, Napa, Yolo, Monterey, Contra Costa, Los Angeles, Santa Clara, Mendocino, San Mateo, Sonoma, Santa Cruz, Marin, Alameda, San Francisco.]
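For readers without the course data, here is a minimal sketch of the same kind of chart. The county labels and percentages are made-up placeholders, not real election returns, and `toy_pct` and `toy_sorted` are hypothetical names. It assumes the values are sorted before plotting, which is what produces a low-to-high pattern. `pdf(NULL)` is only there so the sketch writes no file; drop it to draw on screen.

```r
# Toy data: hypothetical percentages, NOT real election returns
toy_pct <- c("County A" = 62, "County B" = 35, "County C" = 48,
             "County D" = 71, "County E" = 55)

# Sort ascending; dotchart() draws the first element at the bottom
toy_sorted <- sort(toy_pct)

pdf(NULL)  # null graphics device: nothing is written to disk
dotchart(toy_sorted, labels = names(toy_sorted), cex = .7,
         xlim = c(0, 100), main = "Toy dot chart", xlab = "Percent")
abline(v = 50)  # vertical reference line at 50 percent
invisible(dev.off())
```

The line at 50 splits the observations into those below and above a majority, which is the comparison the slide's chart invites.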
Interpretation?
Geographical patterns?
  Central Valley
  Coastal
  SoCal, NorCal?
Why might you observe these patterns?
Z-scores
NB: we’re doing this for learning purposes
Z-scores
Easy: compute the mean and standard deviation.
Then derive the z-score using the formula from the last slide set: z = (x - mean) / sd
R code on next slide
Z-scores and R
#Z scores for Obama
meanobama<-mean(obamapercent)
sdobama<-sd(obamapercent)
zobama<-(obamapercent-meanobama)/sdobama
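As a quick sanity check on the formula, the sketch below standardizes a small toy vector by hand and compares the result with base R's `scale()`. The vector `x` is a hypothetical stand-in for `obamapercent`, not course data.

```r
# Toy vector standing in for obamapercent (hypothetical values)
x <- c(30, 40, 50, 60, 70)

z_manual <- (x - mean(x)) / sd(x)   # subtract the mean, divide by the sd
z_scale  <- as.vector(scale(x))     # scale() centers and divides by the sd by default

all.equal(z_manual, z_scale)        # TRUE
mean(z_manual)                      # 0, up to rounding error
sd(z_manual)                        # 1, up to rounding error
```

Any standardized variable has mean 0 and standard deviation 1, so these two checks are a handy way to catch a typo in the formula.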
Interpretation
Z-scores are in the metric of standard deviations.
A large z (in absolute value) implies the observation is further away from the mean than observations with small z.
Z=0 means the observation is exactly at the mean.
Dotchart (code):
par(mfcol=c(1,1))
dotchart(zobama, labels=row.names, cex=.7, xlim=c(-3, 3), main="Obama Vote Z-scores", xlab="Z-score")
abline(v=0)
abline(v=1, col="red")
abline(v=-1, col="red")
abline(v=2, col="dark red")
abline(v=-2, col="dark red")
[Figure: dot chart titled "Obama Vote Z-scores"; x-axis "Z-score", -3 to 3; one row per California county, in the same order as the "Support for Obama" dot chart (Modoc through San Francisco).]
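The reference lines at 1 and 2 standard deviations suggest a simple way to sort observations into bands. The sketch below does this for a handful of hypothetical z-scores; in the slides the input would be `zobama`. The names `z` and `band` and the band labels are illustrative assumptions.

```r
# Hypothetical z-scores, labeled a through g
z <- c(a = -2.3, b = -1.2, c = -0.4, d = 0, e = 0.8, f = 1.5, g = 2.1)

# Bin each observation by its distance from the mean in standard deviations:
# [0, 1), [1, 2), and 2 or more
band <- cut(abs(z), breaks = c(0, 1, 2, Inf),
            labels = c("within 1 SD", "1 to 2 SD", "beyond 2 SD"),
            include.lowest = TRUE, right = FALSE)

table(band)             # counts per band: 3, 2, 2
names(z)[abs(z) >= 2]   # "a" "g": the observations outside the lines at -2 and 2
```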
Probability Values
Z-scores far from zero (large in absolute value) are less likely to be observed than z-scores near zero.
Consult a z-distribution table
  Probability area is given
Can think about probabilities in the “tails”
  One-tail (upper or lower)
  Two-tail (upper + lower)
Or compute the areas in R
R code
twotailp<- 2*pnorm(-abs(zobama))
#Gives us the combined area in the upper and lower tails beyond z
onetailp<- pnorm(-abs(zobama))
#Gives us the 1-tail probability area; if we
#subtract this from 1, this gives us the area
#below this z score (if z is positive) or the
#area above this z score (if z is negative)
zp<-cbind(county, onetailp, twotailp, zobama ); zp
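To see what these areas look like for familiar values, the sketch below applies the same `pnorm()` calls to a few illustrative z-scores. The vector `z` is chosen for the example and does not come from the course data.

```r
z <- c(-1.96, 0, 1, 1.96)       # illustrative z-scores

onetail <- pnorm(-abs(z))       # area beyond |z| in a single tail
twotail <- 2 * onetail          # area beyond |z| in both tails

round(cbind(z, onetail, twotail), 4)
# z = 1.96 (or -1.96): one-tail area about 0.025, two-tail area about 0.05
# z = 0: one-tail area 0.5, two-tail area 1

# For a positive z, 1 minus the one-tail area is the area below z
all.equal(1 - pnorm(-abs(1.96)), pnorm(1.96))   # TRUE
```

Because the normal curve is symmetric, -1.96 and 1.96 get identical tail areas; the sign of z only says which tail the observation sits in.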
Plots
4 plots on one page:
par(mfcol=c(2,2))
boxplot(obamapercent, ylab="Vote Percent", main="Obama Vote: Box Plot", col="blue")
hist(zobama, xlab="Obama Vote as Z-Scores", ylab="Frequency", main="Histogram of Standardized Obama Vote", col="blue")
hist(obamapercent, ylab="Frequency", xlab="Vote Percent", main="Obama Vote: Histogram", col="blue")
plot(zobama, onetailp, ylab="One-Tail p", xlab="Z-score", main="Z-scores and p-values", col="blue")
[Figure: four panels on one page. "Obama Vote: Box Plot" (y-axis: Vote Percent); "Histogram of Standardized Obama Vote" (x-axis: Obama Vote as Z-Scores, y-axis: Frequency); "Obama Vote: Histogram" (x-axis: Vote Percent, y-axis: Frequency); "Z-scores and p-values" (x-axis: Z-score, y-axis: One-Tail p).]
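A self-contained sketch of the four-panel layout, using simulated data in place of the county file. The variables `x`, `z`, `oldpar`, and `layout_after` are hypothetical names, and the simulated values are not election data. It also shows one common habit the slides do not mention: saving the old `par()` settings and restoring them afterwards. `pdf(NULL)` keeps the sketch from writing a file.

```r
pdf(NULL)                        # null device so the sketch writes no file
oldpar <- par(mfcol = c(2, 2))   # 2 x 2 grid, filled column by column

set.seed(1)
x <- rnorm(50, mean = 50, sd = 10)   # simulated stand-in data (hypothetical)
z <- (x - mean(x)) / sd(x)           # standardized version of x

boxplot(x, main = "Panel 1: top left")
hist(z,    main = "Panel 2: bottom left")
hist(x,    main = "Panel 3: top right")
plot(z, pnorm(-abs(z)), main = "Panel 4: bottom right")

par(oldpar)                      # restore the previous layout
layout_after <- par("mfcol")     # back to one plot per page
invisible(dev.off())
```

With `mfcol` the panels fill down the first column before moving to the second; `par(mfrow=c(2,2))` would place the same four calls row by row instead.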