Section 9.1 Notes - Washington-Liberty


Post on 28-Jul-2020


Significance Tests (Hypothesis Tests)

Confidence intervals ESTIMATE an unknown parameter (μ or p in our case) by giving us a possible range of values for that parameter with some level of confidence.

We could use this estimate to make statements (with some level of confidence) about the true parameter value.

For Example: One of your classmates is running for student government. He claims that he has 63% of the class's vote. You collect a random sample of 50 students and find that 28 of them will vote for him. Does your data support his claim? Use a 95% confidence interval to help argue your case.
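One way to check the student-government example numerically is sketched below. It assumes the usual large-sample (Wald) interval for a proportion; the function name `proportion_ci` is made up for this illustration and is not part of the notes.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample (Wald) confidence interval for a population proportion."""
    p_hat = successes / n
    # Critical value z* for the requested confidence level (about 1.96 for 95%).
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Data from the example: 28 of 50 sampled students say they will vote for him.
low, high = proportion_ci(28, 50)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
print("Claimed value 0.63 inside interval:", low <= 0.63 <= high)
```

If the claimed 63% falls inside the interval, the sample does not give convincing evidence against the claim; that is not the same as showing the claim is true.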

Are there other ways that we could answer questions like these?

Free-Throw Activity: A basketball player claims to make 80% of the free throws that he attempts. We think he might be exaggerating. To test this claim, we'll ask him to shoot some free throws.

Have the player shoot 25 free throws - record how many he made.

Do we have enough data to decide whether the player's claim is valid?

How many shots do we need to make a decision?
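The activity can also be imitated by simulation. The sketch below assumes each shot is independent with the claimed 80% chance of going in, and asks how often such a shooter would make a given number of shots or fewer. The observed result of 16 makes is hypothetical, and the function names are invented for illustration.

```python
import random

def simulate_makes(n_shots, p_make, rng):
    """Count the makes in one simulated set of n_shots independent free throws."""
    return sum(rng.random() < p_make for _ in range(n_shots))

def proportion_at_most(observed, n_shots=25, p_make=0.80, trials=10_000, seed=1):
    """Estimate how often a true p_make shooter makes `observed` or fewer shots."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = sum(simulate_makes(n_shots, p_make, rng) <= observed
               for _ in range(trials))
    return hits / trials

# Hypothetical outcome: the player makes 16 of 25.
print(proportion_at_most(16))
```

Changing `n_shots` and `observed` together (for example 32 of 50) gives a feel for how a larger number of shots makes the same shooting percentage harder to explain by chance.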


What are some observations we have about this activity?

When is it easy to decide if a claim is wrong? When is it hard?

How far off from the claim does the sample data need to be to convince us it is wrong?

The Princeton Review claims that their practice course will improve SAT scores for 90% of their participants. You are doubtful. In order to test Princeton Review's claim your whole senior class (say, 475 students) decides to take the course and test whether Princeton Review's claim is correct.

Your entire class takes the course and then takes the SAT a second time. Each student from your senior class anonymously reports whether or not their score improved on the second SAT test.

Here is the sample data:
475 students
410 students had improved scores

You want to know what the probability is of getting this kind of sample data (or worse) if in fact Princeton Review is correct and 90% (and no fewer) of participants have improved scores.

What is the probability that at most 410 of 475 randomly selected students improved their score when the true proportion of students who improve after taking the Princeton Review course is 90%?
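A minimal sketch of that calculation, assuming the number of improved scores follows a binomial distribution with n = 475 and p = 0.90 (students treated as independent). The helper names `binom_pmf` and `binom_cdf` are made up for this example.

```python
from math import exp, lgamma, log

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed on the log scale to avoid overflow."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# Probability of 410 or fewer improved scores if the 90% claim is true.
prob = binom_cdf(410, 475, 0.90)
print(f"P(X <= 410 | n = 475, p = 0.90) = {prob:.4f}")
```

A small result here means the class data would be unusual if the 90% claim were true.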


So based on our results either:

(1) Princeton Review's claim is correct (p = 0.9) and, by very bad luck, a very unlikely outcome has occurred (our class results).

OR

(2) Princeton Review's claim is wrong and the proportion of improvement is actually less than 0.9, so our sample result is not an unlikely outcome.

An outcome that would rarely happen if a claim were true is good evidence that the claim is NOT true.

Basics of a Significance Test

1. Make a statement about a parameter - i.e. that μ or p is equal to some value (the test is designed to find evidence against this statement - this is the null hypothesis Ho)

2. Determine an alternate hypothesis - i.e. that μ or p is different from the value claimed in the Ho - could be less than, greater than, or just not equal to

3. Collect sample data

4. Determine the probability of getting this sample data, or more extreme, given your Ho is true

5. Conclude whether or not Ho has been shown to be false and can be rejected in favor of the alternative (Ha).
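The five steps above can be sketched for the Princeton Review data. This is one possible version only: it uses a normal approximation for step 4, and the 0.05 cutoff in step 5 is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Steps 1-2: hypotheses. Ho: p = 0.90, Ha: p < 0.90 (one-sided).
p0 = 0.90

# Step 3: sample data from the Princeton Review example.
n, improved = 475, 410
p_hat = improved / n

# Step 4: probability of a result this extreme or more, assuming Ho is true.
# Normal approximation; n*p0 and n*(1 - p0) are both at least 10 here.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = NormalDist().cdf(z)  # lower tail, matching Ha: p < p0

# Step 5: conclusion at an assumed cutoff.
alpha = 0.05  # assumed for illustration
decision = "reject Ho" if p_value < alpha else "fail to reject Ho"
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```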


Hypotheses

Null Hypothesis (Ho)
- the claim that is being tested in any significance test
- test is trying to find evidence against Ho
- ALWAYS a statement about the parameter
- ALWAYS stating that there is "no difference", that the parameter is EQUAL to something

Alternative Hypothesis (Ha)
- the claim about the population that we are trying to find evidence for
- ALWAYS stating that the true parameter is less than, greater than, or not equal to the value put forth in Ho
- can be one-sided or two-sided

ALWAYS establish your hypotheses BEFORE you have seen the data - otherwise it is cheating!

CHECK YOUR UNDERSTANDING:

For each of the following settings, (a) describe the parameter of interest, and (b) state appropriate hypotheses for a significance test.

1. According to the Web site sleepdeprivation.com, 85% of teens are getting less than eight hours of sleep a night. Jannie wonders if this result holds in her large high school. She asks an SRS of 100 students at the school how much sleep they get on a typical night. In all, 75 of the responders said less than 8 hours.

2. As part of its 2010 census marketing campaign, the U.S. Census Bureau advertised "10 questions, 10 minutes - that's all it takes". On the census form itself, we read, "The U.S. Census Bureau estimates that, for the average household, this form will take about 10 minutes to complete, including time for reviewing the instructions and answers." We suspect that the actual time it takes to complete the form may be longer than advertised.


P-Value

The probability that the statistic (we calculated from our sample data) would take a value as extreme or more extreme than the one actually observed IF the null hypothesis (Ho) is true.

The smaller the p-value, the greater the evidence AGAINST Ho, provided by our data.

Large p-values do NOT prove Ho true, they just fail to give us convincing evidence against Ho. Failing to find evidence against Ho means only that the data are consistent with Ho, not that we have clear evidence that Ho is true.

Statistical Significance

The final step of a significance test is to state conclusions.

Determine whether or not you reject Ho or fail to reject Ho. Note, once again, we do NOT accept Ho as true, we only fail to reject it.

We reject Ho if our sample result is too unlikely to happen by chance.

How do we determine "too unlikely"?


Some more practice...

Explain what's wrong with the stated hypotheses, then give correct hypotheses.

1. A change is made that should improve student satisfaction with the parking situation at your school. Right now, 37% of students approve of the parking that's provided. The null hypothesis is tested against the alternative .

2. In planning a study of the birth weights of babies whose mothers did not see a doctor before delivery, a researcher states the hypotheses as


3. A Gallup Poll report on a national survey of 1028 teenagers revealed that 72% of teens said they seldom or never argue with their friends. Yvonne wonders whether this national result would be true in her large high school. So she surveys a random sample of 150 students at her school and finds that 96 students in the sample said they rarely or never argue with friends. A significance test yields a p-value of 0.0291.

(a) State hypotheses for Yvonne's significance test. Be sure to define any parameters.

(b) Interpret this result in context.

(c) Do the data provide convincing evidence against the null hypothesis? Explain.

4. Asked to explain the meaning of "statistically significant at the α = 0.05 level," a student says, "This means that the probability that the null hypothesis is true is less than 0.05." Is this explanation correct? Why or why not?

The p-value is the probability that the statistic (we calculated from our sample data) would take a value as extreme or more extreme than the one actually observed IF the null hypothesis (Ho) is true.
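The p-value quoted in problem 3 can be reproduced with a short calculation. The sketch below assumes a one-proportion z test with a two-sided alternative, which is the setup that matches the reported 0.0291; the variable names are only for illustration.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.72          # national proportion, used as the null value
n, count = 150, 96
p_hat = count / n  # sample proportion, 0.64

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
# Two-sided alternative (Ha: p != 0.72), so double the tail area beyond |z|.
p_value = 2 * NormalDist().cdf(-abs(z))
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

Note that 0.0291 is a statement about how unusual the sample is if Ho is true. It is not the probability that Ho is true, which is the confusion in problem 4.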


Error

With Hypothesis Tests there is always the possibility that we will make a mistake. There are two types we could make:

1. We reject the null hypothesis when in fact it's true.
2. We fail to reject the null hypothesis when in fact it's false.


Suppose I believe that the mean speed on Washington Blvd near W-L is over 40 mph.

I establish the following hypotheses:

Ho: μ = 40

Ha: μ > 40

Keep in mind the ACTUAL mean speed is unknown - but it is either exactly 40 mph or (based on my hypotheses) it is higher.

Type I Error


A consumer advocacy group claims that the mean mileage for the Carter Motor Company's new sedan is less than 32 miles per gallon. Identify the type I error for the test.

a. Reject the claim that the mean is equal to 32 miles per gallon when it is actually 32 miles per gallon.

b. Reject the claim that the mean is equal to 32 miles per gallon when it is actually less than 32 miles per gallon.

c. Fail to reject the claim that the mean is equal to 32 miles per gallon when it is actually less than 32 miles per gallon.

d. Fail to reject the claim that the mean is equal to 32 miles per gallon when it is actually greater than 32 miles per gallon.

Type II Error


Highway safety engineers test new road signs, hoping that increased reflectivity will make them more visible to drivers. Volunteers drive through a test course with several of the new and old style signs and rate which kind shows up the best.

What would a type I error be in this case?

What would a type II error be in this case?

The feasibility of constructing a profitable electricity-producing windmill depends on the average velocity of the wind. For a certain type of windmill, the average wind speed would have to exceed 20 mph in order for its construction to be feasible. To test whether or not a particular site is appropriate for this windmill, 50 readings of the wind velocity are taken, and the average is calculated. The test is designed to answer the question, is the site feasible? That is, is there sufficient evidence to conclude that the average wind velocity exceeds 20 mph? We want to test the following hypotheses.

H0: μ = 20
Ha: μ > 20

What would a type II error in this case mean?


How can we reduce error?

Type I: Reduce the significance level (α)

Type II: Increase Power (probability of the test to CORRECTLY reject Ho)
- Increase sample size
- Increase effect size
- Increase the significance level (α)

SEE HANDOUT ON ERROR AND POWER!

Power

The power of a test is the probability that it correctly rejects a false null hypothesis.

In order to calculate the power of a test we imagine the null hypothesis is false. The value of the power depends on how far the true parameter lies from the value of the null hypothesis. This distance is called the effect size.
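To make the idea concrete, here is a rough power calculation for the windmill hypotheses from earlier in these notes (H0: μ = 20, Ha: μ > 20, n = 50). The notes give no standard deviation or true mean, so σ = 5 mph and a true mean of 22 mph are assumptions chosen purely for illustration, as is the use of a z test with known σ. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of the upper-tailed z test of H0: mu = mu0 vs Ha: mu > mu0,
    when the true mean is mu_true and sigma is treated as known."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)       # reject H0 when z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # effect size in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)

# Assumed values for illustration only: sigma = 5 mph, true mean = 22 mph.
base = power_one_sided_z(20, 22, sigma=5, n=50)
print(f"baseline power:      {base:.3f}")
print(f"larger sample (100): {power_one_sided_z(20, 22, sigma=5, n=100):.3f}")
print(f"larger effect (23):  {power_one_sided_z(20, 23, sigma=5, n=50):.3f}")
print(f"larger alpha (0.10): {power_one_sided_z(20, 22, sigma=5, n=50, alpha=0.10):.3f}")
```

Each of the three changes raises the power relative to the baseline, which is the same as lowering the chance of a Type II error. Raising α does so at the cost of a higher Type I error rate.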


Increasing Power

DECREASE TYPE II ERROR!!!

Section 9.1 Homework: p. 546 #s 1-27 odd, 28-30 all
