stat 13, intro. to statistical methods for the life and …frederic/13/f16/day09.pdf1. comparing 2...

Stat 13, Intro. to Statistical Methods for the Life and Health Sciences.

1. Comparing 2 proportions, and smoking and gender example continued.2. 5 number summary, IQR, and Geysers.3. Comparing two means with simulations, and the bicycling example. Read ch6.

NO LECTURE THU NOV 3! Review for the midterm will be Nov 1.Recall there is also no lecture or office hour Tue Nov 8. http://www.stat.ucla.edu/~frederic/13/F16 .Bring a PENCIL and CALCULATOR and any books or notes you want to the midterm and final. HW3 is due Tue Nov 1. 4.CE.10, 5.3.28, 6.1.17, and 6.3.14. In 5.3.28d, use the theory-based formula. You do not need to use an applet.

1

HW3 is due Thu Nov 3. 4.CE.10, 5.3.28, 6.1.17, and 6.3.14.

4.CE.10 starts out "Studies have shown that children in the U.S. who have been spanked have a significantly lower IQ score on average...."

5.3.28 starts out "Recall the data from the Physicians' Health Study: Of the 11,034 physicians who took the placebo ...."

6.1.17 starts out "The graph below displays the distribution of word lengths ...."

6.3.14 starts out "In an article titled 'Unilateral Nostril Breathing Influences Lateralized Cognitive Performance' that appeared ...."

2

Aclarificationontheformulas• Themarginoferrorforthedifferenceinproportionsis

𝑀𝑢𝑙𝑡𝑖𝑝𝑙𝑖𝑒𝑟 ⨯ SE,whereSE = 345(78345):5

+ 34<(7834<):<

�

Intesting,thenullhypothesisisnodifferencebetweenthetwogroups,soweusedtheSE

�̂�(1 − �̂�)𝑛7

+�̂�(1 − �̂�)

𝑛B

�

where�̂� istheproportioninbothgroupscombined.But

inCIs,weusetheformula 345(78345):5

+ 34<(7834<):<

�

becausewearenotassuming�̂�7 =�̂�BwithCIs.

SmokingandGender• Ourstatisticistheobservedsampledifferenceinproportions,0.097.

• Pluggingin1.96 ⨯ 345(78345):5

+ 34<(7834<):<

� =0.044,

weget0.097± 0.044asour95%CI.• Wecouldalsowritethisintervalas(0.053,0.141).• Weare95%confidentthattheprobabilityofaboy

babywhereneitherfamilysmokesminustheprobabilityofaboybabywherebothparentssmokeisbetween0.053and0.141.

SmokingandGender

• Howwouldtheintervalchangeiftheconfidencelevelwas99%?

• TheSE= 345(78345):5

+ 34<(7834<):<

� =.0224.

• Previously,fora95%CI,itwas0.097± 1.96x.0224=0.097± 0.044.• Fora99%CI,itis0.097± 2.576x.0224=0.097± 0.058.

SmokingandGender• Writtenasthestatistic± marginoferror,the99%CIforthedifferencebetweenthetwoproportionsis

0.097± 0.058.• Marginoferror• 0.058forthe99%confidenceinterval• 0.044forthe95%confidenceinterval

SmokingandGender• Howwouldthe95%confidenceintervalchangeifwewereestimating

𝜋smoker – 𝜋nonsmoker

insteadof𝜋nonsmoker – 𝜋smoker?

SmokingandGender• (−0.141,−0.053)or −0.097± 0.044insteadof• (0.053,0.141)or 0.097± 0.044.

• Thenegativesignsindicatetheprobabilityofaboyborntosmokingparentsislowerthanthatfornonsmokingparents.

SmokingandGenderValidityConditionsofTheory-Based• Sameaswithasingleproportion.• Shouldhaveatleast10observationsineachofthecellsofthe2x2table.

SmokingParents Non-smokingParents

Total

Male 255 1975 2230Female 310 1627 1937Total 565 3602 4167

SmokingandGender• Thestrongsignificantresultinthisstudyyieldedquiteabitofpresswhenitcameout.• Soonotherstudiescameoutwhichfoundnorelationshipbetweensmokingandgender(Parazinni etal.2004,Obel etal.2003).• James(2004)arguedthatconfoundingvariableslike socialfactors,diet,environmentalexposureorstresswerethereasonfortheassociationbetweensmokingandgenderofthebaby.Theseareallconfoundedsinceitwasanobservationalstudy.Differentstudiescouldeasilyhavehaddifferentlevelsoftheseconfoundingfactors.

2.Fivenumbersummary,IQR,andgeysers.

6.1: Comparing Two Groups: Quantitative Response6.2: Comparing Two Means: Simulation-Based Approach6.3: Comparing Two Means: Theory-Based Approach

Section 6.1

ExploringQuantitativeData

Quantitativevs.CategoricalVariables

• Categorical• Valuesforwhicharithmeticdoesnotmakesense.• Gender,ethnicity,eyecolor…

• Quantitative• Youcanaddorsubtractthevalues,etc.• Age,height,weight,distance,time…

GraphsforaSingleVariable

Categorical

Quantitative

Bar Graph Dot Plot

ComparingTwoGroupsGraphically

Categorical

Quantitative

NotationCheckStatistics� �̅� Samplemean� �̂� Sampleproportion.

Parameters� 𝜇 Populationmean� 𝜋 Population

proportionorprobability.

Statistics summarize a sample and parameters summarize a population

Quartiles• Suppose25%oftheobservationsliebelowacertainvaluex.Thenxiscalledthelowerquartile(or25th percentile).• Similarly,if25%oftheobservationsaregreaterthanx,thenxiscalledtheupperquartile (or75thpercentile).• Thelowerquartilecanbecalculatedbyfindingthemedian,andthendeterminingthemedianofthevaluesbelowtheoverallmedian.Similarlytheupperquartileismedian{xi :xi> overallmedian}.

IQRandFive-NumberSummary• Thedifferencebetweenthequartilesiscalledtheinter-quartilerange (IQR),anothermeasureofvariabilityalongwithstandarddeviation.• Thefive-numbersummary forthedistributionofaquantitativevariableconsistsoftheminimum,lowerquartile,median,upperquartile,andmaximum.• TechnicallytheIQRisnottheinterval(25thpercentile,75thpercentile),butthedifference75th percentile– 25th .• Differentsoftwareusedifferentconventions,butwewillusetheconventionthat,ifthereisarangeofpossiblequantiles,youtakethemiddleofthatrange.• Forexample,supposedataare1,3,7,7,8,9,12,14.• M=7.5,25th percentile=5,75th percentile=10.5.IQR=5.5.

IQRandFive-NumberSummary• Formediansandquartiles,wewillusetheconvention,ifthereisarangeofpossibilities,takethemiddleoftherange.• InR,thisistype=2.type=1meanstaketheminimum.• x=c(1,3,7,7,8,9,12,14)• quantile(x,.25,type=2)##5.5• IQR(x,type=2)##5.5• IQR(x,type=1)##6.Canyouseewhy?

• Forexample,supposedataare1,3,7,7,8,9,12,14.• M=7.5,25th percentile=5,75th percentile=10.5.IQR=5.5.

GeyserEruptionsExample6.1

OldFaithfulInter-EruptionTimes

• Howdothefive-numbersummaryandIQRdifferforinter-eruptiontimesbetween1978and2003?

OldFaithfulInter-EruptionTimes

• 1978IQR=81– 58=23• 2003IQR=98– 87=11

Boxplots

Min Qlower Med Qupper Max

Boxplots(Outliers)• Adatavaluethatismorethan1.5× IQRabovetheupperquartileorbelowthelowerquartileisconsideredanoutlier.

• Whentheseoccur,thewhiskersonaboxplotextendouttothefarthestvaluenotconsideredanoutlierandoutliersarerepresentedbyadotoranasterisk.

CancerPamphletReadingLevels

• Shortetal.(1995)comparedreadinglevelsofcancelpatientsandreadabilitylevelsofcancerpamphlets.Whatisthe:• Medianreadinglevel?• Meanreadinglevel?

• Arethedataskewedonewayortheother?

• Skewedabittotheright• Meantotherightofmedian

3.ComparingTwoMeans:Simulation-BasedApproachandbicyclingtoworkexample.Section 6.2

Comparisonwithproportions.

• Wewillbecomparingmeans,muchthesamewaywecomparedtwoproportionsusingrandomizationtechniques.• Thedifferencehereisthattheresponsevariableisquantitative(theexplanatoryvariableisstillbinarythough).Soifcardsareusedtodevelopanulldistribution,numbersgoonthecardsinsteadofwords.

BicyclingtoWorkExample6.2

BicyclingtoWork• Doesbicycleweightaffectcommutetime?• BritishMedicalJournal(2010)presentedtheresultsofarandomizedexperimentdonebyJeremyGroves,whowantedtoknowifbicycleweightaffectedhiscommutetowork.• For56days(JanuarytoJuly)Grovestossedacointodecideifhewouldbikethe27milestoworkonhiscarbonframebike(20.9lbs)orsteelframebicycle(29.75lbs).• Herecordedthecommutetimeforeachtrip.

BicyclingtoWork• Whataretheobservationalunits?• Eachtriptoworkonthe56differentdays.

• Whataretheexplanatoryandresponsevariables?• ExplanatoryiswhichbikeGrovesrode(categorical–binary)• Responsevariableishiscommutetime(quantitative)

BicyclingtoWork• Nullhypothesis: Commutetimeisnotaffectedbywhichbikeisused.• Alternativehypothesis: Commutetimeisaffectedbywhichbikeisused.

BicyclingtoWork• Inchapter5weusedthedifferenceinproportions of“successes”betweenthetwogroups.• Nowwewillcomparethedifferenceinaverages betweenthetwogroups.• Theparametersofinterestare:• µcarbon =Longtermaveragecommutetimewithcarbonframedbike• µsteel =Longtermaveragecommutetimewithsteelframedbike.

BicyclingtoWork• µisthepopulationmean.Itisaparameter.• Usingthesymbolsµcarbon andµsteel,wecanrestatethehypotheses.

• H0: µcarbon =µsteel• Ha: µcarbon ≠µsteel .

BicyclingtoWorkRemember:• Thehypothesesareaboutthelongtermassociationbetweencommutetimeandbikeused,notjusthis56trips.• Hypothesesarealwaysaboutpopulationsorprocesses,notthesampledata.

BicyclingtoWork

Samplesize Samplemean SampleSD

Carbonframe 26 108.34min 6.25min

Steelframe 30 107.81min 4.89 min

BicyclingtoWork• Thesampleaverageandvariabilityforcommutetimewashigherforthecarbonframebike• Doesthisindicateatendency?• Orcouldahigheraveragejustcomefromtherandomassignment?PerhapsthecarbonframebikewasrandomlyassignedtodayswheretrafficwasheavierorweathersloweddownDr.Grovesonhiswaytowork?

BicyclingtoWork• Isitpossible togetadifferenceof0.53minutesifcommutetimeisn’taffectedbythebikeused?• ThesametypeofquestionwasaskedinChapter5forcategoricalresponsevariables.• Thesameanswer.Yesit’spossible,howlikelythough?

BicyclingtoWork• The3SStrategyStatistic:• Chooseastatistic:• Theobserveddifferenceinaveragecommutetimes

�̅�carbon – �̅�steel =108.34- 107.81=0.53minutes

BicyclingtoWorkSimulation:• Wecanimaginesimulatingthisstudywithindexcards.• Writeall56timeson56cards.

• Shuffleall56cardsandrandomlyredistributeintotwostacks:• Onewith26cards(representingthetimesforthecarbon-framebike)• Another30cards(representingthetimesforthesteel-framebike)

BicyclingtoWorkSimulation(continued):• Shufflingassumesthenullhypothesisofnoassociationbetweencommutetimeandbike• Aftershufflingwecalculatethedifferenceintheaveragetimesbetweenthetwostacksofcards.• Repeatthismanytimestodevelopanulldistribution• Let’sseewhatthislookslike

mean = 107.87mean = 108.34

CarbonFrameSteelFrame114116 123

113

113

118 109106111

119

mean = 108.27

108.27 – 107.87 = 0.40

103103 112

102

110

102 107100101

104

103105 106 102111

106108 106105 107

mean = 107.81

116116 118

113

113

113 105104110

109

111111 105

102

106

109 10898103

110

112102 106 102101

105

Shuffled Differences in Means

mean = 108.37mean = 108.27


113

113

118 109106111

119

mean = 107.69

107.69 – 108.37 = -0.68

103103 112

102

110

102 107100101

104

103105 106 102111

106108 106105 107

mean = 107.87

116116 118

113

113

113 105104110

109

111111 105

102

106

109 10898103

110

112102 106 102101

105


mean = 108.13mean = 107.69


113

113

118 109106111

119

mean = 107.97

107.97 – 108.13 = -0.16

103103 112

102

110

102 107100101

104

103105 106 102111

106108 106105 107

mean = 108.37

116116 118

113

113

113 105104110

109

111111 105

102

106

109 10898103

110

112102 106 102101

105


MoreSimulations-2.11

-1.20 -1.21-1.93 -1.53

-1.11

0.71-0.52

1.79

0.022.53 1.90

-0.98

0.81 0.551.89

-0.31

-2.50

0.38

-1.51

0.22

1.500.13

0.44

1.46

-0.64-1.10Nineteen of our 30 simulated statistics

were as or more extreme than our observed difference in means of 0.53, hence our estimated p-value for this null distribution is 19/30 = 0.63.


BicyclingtoWork• Using1000simulations,weobtainap-valueof72%.• Whatdoesthisp-valuemean?• Ifmeancommutetimesforthebikesarethesameinthelongrun,andwerepeatedrandomassignmentofthelighterbiketo26daysandtheheavierto30days,adifferenceasextremeas0.53minutesormorewouldoccurinabout72%oftherepetitions.• Therefore,wedonothavestrongevidencethatthecommutetimesforthetwobikeswilldifferinthelongrun.ThedifferenceobservedbyDr.Grovesisnotstatisticallysignificant.

BicyclingtoWork• HaveweproventhatthebikeGroveschoosesisnotassociatedwithcommutetime?(Canweconcludethenull?)• No,alargep-valueisnot“strongevidencethatthenullhypothesisistrue.”• Itsuggeststhatthenullhypothesisisplausible• Therecouldbeasmalllong-termdifference.Buttherealsocouldbenodifference.

BicyclingtoWork• Imaginewewanttogeneratea95%confidenceintervalforthelong-rundifferenceinaveragecommutingtime.• Sampledifferenceinmeans± 1.96⨯SEforthedifferencebetweenthetwomeans

• Fromsimulations,theSE=standarddeviationofthedifferences=1.47.• 0.53± 1.96(1.47)=0.53± 2.88• -2.35to3.41.• Whatdoesthismean?

BicyclingtoWork• Weare95%confidentthatthetruelongtermdifference(carbon– steel)inaveragecommutingtimesisbetween-2.41and3.47minutes.Thecarbonframedbikeisbetween2.41minutesfasterand3.47minutesslowerthanthesteelframedbike.• Doesitmakesensethattheintervalcontains0,basedonourp-value?

BicyclingtoWorkScopeofconclusions• Canwegeneralizeourconclusiontoalargerpopulation?• TwoKeyquestions:• Wasthesamplerandomlyobtainedandrepresentativeoftheoverallpopulationofinterest?• Wasthisanexperiment?Weretheobservationalunitsrandomlyassignedtotreatments?

BicyclingtoWork• Wasthesamplerepresentativeofanoverallpopulation?• WhataboutthepopulationofalldaysDr.Grovesmightbiketowork?• No,Grovescommutedonconsecutivedaysinthisstudyanddidnotincludeallseasons.

• Wasthisanexperiment?Weretheobservationalunitsrandomlyassignedtotreatments?• Yes,heflippedacoinforthebike.• Wecanprobablydrawcause-and-effectconclusionshere.

BicyclingtoWork• WecannotgeneralizebeyondGrovesandhistwobikes.• Alimitationisthatthisstudyisnotdouble-blind• Theresearcherandthesubject(whichhappenedtobethesamepersonhere)werenotblindtowhichtreatmentwasbeingused.• Dr.Grovesknewwhichbikehewasriding,andthismighthaveaffectedhisstateofmindorhischoiceswhileriding.

stat 13, intro. to statistical methods for the life and …frederic/13/f16/day09.pdf1. comparing 2...

Documents