
POL 51: Scientific Study of Politics

Prof. B. Jones
Dept. of Political Science
UC-Davis

Plots and Z-scores

How to do some of the “stuff” in HW 4:
  Multiple plots on a single page
  Creating z-scores and finding p-values
  Visualizing political data
Data: Obama vote share by county

Dot Chart: Obama Vote

dotchart(obamapercent, labels=row.names, cex=.7, xlim=c(0, 100), main="Support for Obama", xlab="Percent Obama")

abline(v=50)

Returns:

[Figure: dot chart "Support for Obama"; x-axis "Percent Obama", 0 to 100; one row per California county, listed from Modoc through San Francisco.]

Interpretation?

Geographical patterns?
  Central Valley
  Coastal
  SoCal, NorCal?

Why might you observe these patterns?

Z-scores

NB: we’re doing this for learning purposes

Z-scores

Easy: create the mean and standard deviation.
Then derive the z-score using the formula from the last slide set: z = (x - mean) / sd
R code on next slide

Z-scores and R

# Z-scores for Obama
meanobama <- mean(obamapercent)
sdobama <- sd(obamapercent)
zobama <- (obamapercent - meanobama)/sdobama
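As a quick check of the formula, here is a minimal sketch on made-up numbers. The vector `share` is hypothetical (not the course data), and the comparison with R's built-in `scale()` is an addition for illustration only.

```r
# Hypothetical vote shares for five counties (illustration only, not the HW data)
share <- c(30, 45, 50, 55, 70)

# z-score by hand: subtract the mean, divide by the sample standard deviation
z_manual <- (share - mean(share)) / sd(share)

# scale() centers and rescales in one step; as.numeric() drops the matrix attributes
z_scale <- as.numeric(scale(share))

print(round(z_manual, 3))
print(all.equal(z_manual, z_scale))  # TRUE if the two methods agree
```

After standardizing, the z-scores have mean 0 and standard deviation 1, which is why they can be read in units of standard deviations.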

Interpretation

Z-scores are in the metric of standard deviations.
A large |z| implies the observation is farther from the mean than observations with a small |z|.
Z=0 means the observation is exactly at the mean.

Dotchart (code):

par(mfcol=c(1,1))
dotchart(zobama, labels=row.names, cex=.7, xlim=c(-3, 3), main="Obama Vote Z-scores", xlab="Z-score")

abline(v=0)
abline(v=1, col="red")
abline(v=-1, col="red")
abline(v=2, col="darkred")
abline(v=-2, col="darkred")

[Figure: dot chart "Obama Vote Z-scores"; x-axis "Z-score", -3 to 3; one row per California county, listed from Modoc through San Francisco.]

Probability Values

Z-scores far from zero (large in absolute value) are less likely to be observed than z-scores near zero.

Consult a z-distribution table
  Probability area is given
Can think about probabilities in the “tails”
  One-tail (upper or lower)
  Two-tail (upper + lower)
R
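To make the one-tail and two-tail areas concrete, here is a minimal sketch for a single z-score. The value z = 1.96 is a hypothetical choice for illustration; `pnorm()` is R's standard normal cumulative distribution function.

```r
# Hypothetical z-score chosen for illustration
z <- 1.96

# Lower-tail area P(Z <= -|z|); by symmetry this equals the upper-tail area P(Z >= |z|)
one_tail <- pnorm(-abs(z))

# Two-tail area: both tails combined
two_tail <- 2 * one_tail

# Area below z (the complement of the upper tail when z is positive)
below <- pnorm(z)

print(round(c(one_tail = one_tail, two_tail = two_tail, below = below), 4))
```

For z = 1.96 the one-tail area is about 0.025 and the two-tail area about 0.05, which matches what a z-distribution table reports.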

R code

# Area in the upper and lower tails of z combined
twotailp <- 2*pnorm(-abs(zobama))

# One-tail probability area. Subtracting this from 1 gives the area
# below this z-score (if z is positive) or above it (if z is negative)
onetailp <- pnorm(-abs(zobama))

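# data.frame() keeps the numeric columns numeric; cbind() with the county names would coerce every column to text
zp <- data.frame(county, onetailp, twotailp, zobama); zp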

Plots

4 plots on one page:

par(mfcol=c(2,2))

boxplot(obamapercent, ylab="Vote Percent", main="Obama Vote: Box Plot", col="blue")

hist(zobama, xlab="Obama Vote as Z-Scores", ylab="Frequency", main="Histogram of Standardized Obama Vote", col="blue")

hist(obamapercent, ylab="Frequency", xlab="Vote Percent", main="Obama Vote: Histogram", col="blue")

plot(zobama, onetailp, ylab="One-Tail p", xlab="Z-score", main="Z-scores and p-values", col="blue")

[Figure: four panels on one page. "Obama Vote: Box Plot" (y-axis: Vote Percent); "Histogram of Standardized Obama Vote" (x-axis: Obama Vote as Z-Scores, y-axis: Frequency); "Obama Vote: Histogram" (x-axis: Vote Percent, y-axis: Frequency); "Z-scores and p-values" (x-axis: Z-score, y-axis: One-Tail p).]
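The four-panel layout can be tried without the county data. This is a minimal sketch on simulated numbers (the vectors `x` and `z` are hypothetical). It uses `mfrow`, which fills the page row by row, whereas `mfcol` fills it column by column.

```r
# Simulated data for illustration only (not the county data)
set.seed(51)
x <- rnorm(58, mean = 55, sd = 13)
z <- (x - mean(x)) / sd(x)

# Split the page into a 2 x 2 grid; par() returns the old setting so it can be restored
old <- par(mfrow = c(2, 2))
boxplot(x, main = "Box Plot")
hist(x, main = "Histogram")
hist(z, main = "Histogram of Z-scores")
plot(z, pnorm(-abs(z)), xlab = "Z-score", ylab = "One-Tail p", main = "Z-scores and p-values")
par(old)  # restore the previous layout
```

Saving and restoring the `par()` settings keeps later plots from being squeezed into the 2 x 2 grid by accident.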
