stat 13, intro. to statistical methods for the life and …frederic/13/f16/day09.pdf1. comparing 2...
TRANSCRIPT
Stat 13, Intro. to Statistical Methods for the Life and Health Sciences.
1. Comparing 2 proportions, and smoking and gender example continued.2. 5 number summary, IQR, and Geysers.3. Comparing two means with simulations, and the bicycling example. Read ch6.
NO LECTURE THU NOV 3! Review for the midterm will be Nov 1.Recall there is also no lecture or office hour Tue Nov 8. http://www.stat.ucla.edu/~frederic/13/F16 .Bring a PENCIL and CALCULATOR and any books or notes you want to the midterm and final. HW3 is due Tue Nov 1. 4.CE.10, 5.3.28, 6.1.17, and 6.3.14. In 5.3.28d, use the theory-based formula. You do not need to use an applet.
1
HW3 is due Thu Nov 3. 4.CE.10, 5.3.28, 6.1.17, and 6.3.14.
4.CE.10 starts out "Studies have shown that children in the U.S. who have been spanked have a significantly lower IQ score on average...."
5.3.28 starts out "Recall the data from the Physicians' Health Study: Of the 11,034 physicians who took the placebo ...."
6.1.17 starts out "The graph below displays the distribution of word lengths ...."
6.3.14 starts out "In an article titled 'Unilateral Nostril Breathing Influences Lateralized Cognitive Performance' that appeared ...."
2
Aclarificationontheformulas• Themarginoferrorforthedifferenceinproportionsis
𝑀𝑢𝑙𝑡𝑖𝑝𝑙𝑖𝑒𝑟 ⨯ SE,whereSE = 345(78345):5
+ 34<(7834<):<
�
Intesting,thenullhypothesisisnodifferencebetweenthetwogroups,soweusedtheSE
�̂�(1 − �̂�)𝑛7
+�̂�(1 − �̂�)
𝑛B
�
where�̂� istheproportioninbothgroupscombined.But
inCIs,weusetheformula 345(78345):5
+ 34<(7834<):<
�
becausewearenotassuming�̂�7 =�̂�BwithCIs.
SmokingandGender• Ourstatisticistheobservedsampledifferenceinproportions,0.097.
• Pluggingin1.96 ⨯ 345(78345):5
+ 34<(7834<):<
� =0.044,
weget0.097± 0.044asour95%CI.• Wecouldalsowritethisintervalas(0.053,0.141).• Weare95%confidentthattheprobabilityofaboy
babywhereneitherfamilysmokesminustheprobabilityofaboybabywherebothparentssmokeisbetween0.053and0.141.
SmokingandGender
• Howwouldtheintervalchangeiftheconfidencelevelwas99%?
• TheSE= 345(78345):5
+ 34<(7834<):<
� =.0224.
• Previously,fora95%CI,itwas0.097± 1.96x.0224=0.097± 0.044.• Fora99%CI,itis0.097± 2.576x.0224=0.097± 0.058.
SmokingandGender• Writtenasthestatistic± marginoferror,the99%CIforthedifferencebetweenthetwoproportionsis
0.097± 0.058.• Marginoferror• 0.058forthe99%confidenceinterval• 0.044forthe95%confidenceinterval
SmokingandGender• Howwouldthe95%confidenceintervalchangeifwewereestimating
𝜋smoker – 𝜋nonsmoker
insteadof𝜋nonsmoker – 𝜋smoker?
SmokingandGender• (−0.141,−0.053)or −0.097± 0.044insteadof• (0.053,0.141)or 0.097± 0.044.
• Thenegativesignsindicatetheprobabilityofaboyborntosmokingparentsislowerthanthatfornonsmokingparents.
SmokingandGenderValidityConditionsofTheory-Based• Sameaswithasingleproportion.• Shouldhaveatleast10observationsineachofthecellsofthe2x2table.
SmokingParents Non-smokingParents
Total
Male 255 1975 2230Female 310 1627 1937Total 565 3602 4167
SmokingandGender• Thestrongsignificantresultinthisstudyyieldedquiteabitofpresswhenitcameout.• Soonotherstudiescameoutwhichfoundnorelationshipbetweensmokingandgender(Parazinni etal.2004,Obel etal.2003).• James(2004)arguedthatconfoundingvariableslike socialfactors,diet,environmentalexposureorstresswerethereasonfortheassociationbetweensmokingandgenderofthebaby.Theseareallconfoundedsinceitwasanobservationalstudy.Differentstudiescouldeasilyhavehaddifferentlevelsoftheseconfoundingfactors.
2.Fivenumbersummary,IQR,andgeysers.
6.1: Comparing Two Groups: Quantitative Response6.2: Comparing Two Means: Simulation-Based Approach6.3: Comparing Two Means: Theory-Based Approach
Section 6.1
ExploringQuantitativeData
Quantitativevs.CategoricalVariables
• Categorical• Valuesforwhicharithmeticdoesnotmakesense.• Gender,ethnicity,eyecolor…
• Quantitative• Youcanaddorsubtractthevalues,etc.• Age,height,weight,distance,time…
GraphsforaSingleVariable
Categorical
Quantitative
Bar Graph Dot Plot
ComparingTwoGroupsGraphically
Categorical
Quantitative
NotationCheckStatistics� �̅� Samplemean� �̂� Sampleproportion.
Parameters� 𝜇 Populationmean� 𝜋 Population
proportionorprobability.
Statistics summarize a sample and parameters summarize a population
Quartiles• Suppose25%oftheobservationsliebelowacertainvaluex.Thenxiscalledthelowerquartile(or25th percentile).• Similarly,if25%oftheobservationsaregreaterthanx,thenxiscalledtheupperquartile (or75thpercentile).• Thelowerquartilecanbecalculatedbyfindingthemedian,andthendeterminingthemedianofthevaluesbelowtheoverallmedian.Similarlytheupperquartileismedian{xi :xi> overallmedian}.
IQRandFive-NumberSummary• Thedifferencebetweenthequartilesiscalledtheinter-quartilerange (IQR),anothermeasureofvariabilityalongwithstandarddeviation.• Thefive-numbersummary forthedistributionofaquantitativevariableconsistsoftheminimum,lowerquartile,median,upperquartile,andmaximum.• TechnicallytheIQRisnottheinterval(25thpercentile,75thpercentile),butthedifference75th percentile– 25th .• Differentsoftwareusedifferentconventions,butwewillusetheconventionthat,ifthereisarangeofpossiblequantiles,youtakethemiddleofthatrange.• Forexample,supposedataare1,3,7,7,8,9,12,14.• M=7.5,25th percentile=5,75th percentile=10.5.IQR=5.5.
IQRandFive-NumberSummary• Formediansandquartiles,wewillusetheconvention,ifthereisarangeofpossibilities,takethemiddleoftherange.• InR,thisistype=2.type=1meanstaketheminimum.• x=c(1,3,7,7,8,9,12,14)• quantile(x,.25,type=2)##5.5• IQR(x,type=2)##5.5• IQR(x,type=1)##6.Canyouseewhy?
• Forexample,supposedataare1,3,7,7,8,9,12,14.• M=7.5,25th percentile=5,75th percentile=10.5.IQR=5.5.
GeyserEruptionsExample6.1
OldFaithfulInter-EruptionTimes
• Howdothefive-numbersummaryandIQRdifferforinter-eruptiontimesbetween1978and2003?
OldFaithfulInter-EruptionTimes
• 1978IQR=81– 58=23• 2003IQR=98– 87=11
Boxplots
Min Qlower Med Qupper Max
Boxplots(Outliers)• Adatavaluethatismorethan1.5× IQRabovetheupperquartileorbelowthelowerquartileisconsideredanoutlier.
• Whentheseoccur,thewhiskersonaboxplotextendouttothefarthestvaluenotconsideredanoutlierandoutliersarerepresentedbyadotoranasterisk.
CancerPamphletReadingLevels
• Shortetal.(1995)comparedreadinglevelsofcancelpatientsandreadabilitylevelsofcancerpamphlets.Whatisthe:• Medianreadinglevel?• Meanreadinglevel?
• Arethedataskewedonewayortheother?
• Skewedabittotheright• Meantotherightofmedian
3.ComparingTwoMeans:Simulation-BasedApproachandbicyclingtoworkexample.Section 6.2
Comparisonwithproportions.
• Wewillbecomparingmeans,muchthesamewaywecomparedtwoproportionsusingrandomizationtechniques.• Thedifferencehereisthattheresponsevariableisquantitative(theexplanatoryvariableisstillbinarythough).Soifcardsareusedtodevelopanulldistribution,numbersgoonthecardsinsteadofwords.
BicyclingtoWorkExample6.2
BicyclingtoWork• Doesbicycleweightaffectcommutetime?• BritishMedicalJournal(2010)presentedtheresultsofarandomizedexperimentdonebyJeremyGroves,whowantedtoknowifbicycleweightaffectedhiscommutetowork.• For56days(JanuarytoJuly)Grovestossedacointodecideifhewouldbikethe27milestoworkonhiscarbonframebike(20.9lbs)orsteelframebicycle(29.75lbs).• Herecordedthecommutetimeforeachtrip.
BicyclingtoWork• Whataretheobservationalunits?• Eachtriptoworkonthe56differentdays.
• Whataretheexplanatoryandresponsevariables?• ExplanatoryiswhichbikeGrovesrode(categorical–binary)• Responsevariableishiscommutetime(quantitative)
BicyclingtoWork• Nullhypothesis: Commutetimeisnotaffectedbywhichbikeisused.• Alternativehypothesis: Commutetimeisaffectedbywhichbikeisused.
BicyclingtoWork• Inchapter5weusedthedifferenceinproportions of“successes”betweenthetwogroups.• Nowwewillcomparethedifferenceinaverages betweenthetwogroups.• Theparametersofinterestare:• µcarbon =Longtermaveragecommutetimewithcarbonframedbike• µsteel =Longtermaveragecommutetimewithsteelframedbike.
BicyclingtoWork• µisthepopulationmean.Itisaparameter.• Usingthesymbolsµcarbon andµsteel,wecanrestatethehypotheses.
• H0: µcarbon =µsteel• Ha: µcarbon ≠µsteel .
BicyclingtoWorkRemember:• Thehypothesesareaboutthelongtermassociationbetweencommutetimeandbikeused,notjusthis56trips.• Hypothesesarealwaysaboutpopulationsorprocesses,notthesampledata.
BicyclingtoWork
Samplesize Samplemean SampleSD
Carbonframe 26 108.34min 6.25min
Steelframe 30 107.81min 4.89 min
BicyclingtoWork• Thesampleaverageandvariabilityforcommutetimewashigherforthecarbonframebike• Doesthisindicateatendency?• Orcouldahigheraveragejustcomefromtherandomassignment?PerhapsthecarbonframebikewasrandomlyassignedtodayswheretrafficwasheavierorweathersloweddownDr.Grovesonhiswaytowork?
BicyclingtoWork• Isitpossible togetadifferenceof0.53minutesifcommutetimeisn’taffectedbythebikeused?• ThesametypeofquestionwasaskedinChapter5forcategoricalresponsevariables.• Thesameanswer.Yesit’spossible,howlikelythough?
BicyclingtoWork• The3SStrategyStatistic:• Chooseastatistic:• Theobserveddifferenceinaveragecommutetimes
�̅�carbon – �̅�steel =108.34- 107.81=0.53minutes
BicyclingtoWorkSimulation:• Wecanimaginesimulatingthisstudywithindexcards.• Writeall56timeson56cards.
• Shuffleall56cardsandrandomlyredistributeintotwostacks:• Onewith26cards(representingthetimesforthecarbon-framebike)• Another30cards(representingthetimesforthesteel-framebike)
BicyclingtoWorkSimulation(continued):• Shufflingassumesthenullhypothesisofnoassociationbetweencommutetimeandbike• Aftershufflingwecalculatethedifferenceintheaveragetimesbetweenthetwostacksofcards.• Repeatthismanytimestodevelopanulldistribution• Let’sseewhatthislookslike
mean = 107.87mean = 108.34
CarbonFrameSteelFrame114116 123
113
113
118 109106111
119
mean = 108.27
108.27 – 107.87 = 0.40
103103 112
102
110
102 107100101
104
103105 106 102111
106108 106105 107
mean = 107.81
116116 118
113
113
113 105104110
109
111111 105
102
106
109 10898103
110
112102 106 102101
105
Shuffled Differences in Means
mean = 108.37mean = 108.27
CarbonFrameSteelFrame114116 123
113
113
118 109106111
119
mean = 107.69
107.69 – 108.37 = -0.68
103103 112
102
110
102 107100101
104
103105 106 102111
106108 106105 107
mean = 107.87
116116 118
113
113
113 105104110
109
111111 105
102
106
109 10898103
110
112102 106 102101
105
Shuffled Differences in Means
mean = 108.13mean = 107.69
CarbonFrameSteelFrame114116 123
113
113
118 109106111
119
mean = 107.97
107.97 – 108.13 = -0.16
103103 112
102
110
102 107100101
104
103105 106 102111
106108 106105 107
mean = 108.37
116116 118
113
113
113 105104110
109
111111 105
102
106
109 10898103
110
112102 106 102101
105
Shuffled Differences in Means
MoreSimulations-2.11
-1.20 -1.21-1.93 -1.53
-1.11
0.71-0.52
1.79
0.022.53 1.90
-0.98
0.81 0.551.89
-0.31
-2.50
0.38
-1.51
0.22
1.500.13
0.44
1.46
-0.64-1.10Nineteen of our 30 simulated statistics
were as or more extreme than our observed difference in means of 0.53, hence our estimated p-value for this null distribution is 19/30 = 0.63.
Shuffled Differences in Means
BicyclingtoWork• Using1000simulations,weobtainap-valueof72%.• Whatdoesthisp-valuemean?• Ifmeancommutetimesforthebikesarethesameinthelongrun,andwerepeatedrandomassignmentofthelighterbiketo26daysandtheheavierto30days,adifferenceasextremeas0.53minutesormorewouldoccurinabout72%oftherepetitions.• Therefore,wedonothavestrongevidencethatthecommutetimesforthetwobikeswilldifferinthelongrun.ThedifferenceobservedbyDr.Grovesisnotstatisticallysignificant.
BicyclingtoWork• HaveweproventhatthebikeGroveschoosesisnotassociatedwithcommutetime?(Canweconcludethenull?)• No,alargep-valueisnot“strongevidencethatthenullhypothesisistrue.”• Itsuggeststhatthenullhypothesisisplausible• Therecouldbeasmalllong-termdifference.Buttherealsocouldbenodifference.
BicyclingtoWork• Imaginewewanttogeneratea95%confidenceintervalforthelong-rundifferenceinaveragecommutingtime.• Sampledifferenceinmeans± 1.96⨯SEforthedifferencebetweenthetwomeans
• Fromsimulations,theSE=standarddeviationofthedifferences=1.47.• 0.53± 1.96(1.47)=0.53± 2.88• -2.35to3.41.• Whatdoesthismean?
BicyclingtoWork• Weare95%confidentthatthetruelongtermdifference(carbon– steel)inaveragecommutingtimesisbetween-2.41and3.47minutes.Thecarbonframedbikeisbetween2.41minutesfasterand3.47minutesslowerthanthesteelframedbike.• Doesitmakesensethattheintervalcontains0,basedonourp-value?
BicyclingtoWorkScopeofconclusions• Canwegeneralizeourconclusiontoalargerpopulation?• TwoKeyquestions:• Wasthesamplerandomlyobtainedandrepresentativeoftheoverallpopulationofinterest?• Wasthisanexperiment?Weretheobservationalunitsrandomlyassignedtotreatments?
BicyclingtoWork• Wasthesamplerepresentativeofanoverallpopulation?• WhataboutthepopulationofalldaysDr.Grovesmightbiketowork?• No,Grovescommutedonconsecutivedaysinthisstudyanddidnotincludeallseasons.
• Wasthisanexperiment?Weretheobservationalunitsrandomlyassignedtotreatments?• Yes,heflippedacoinforthebike.• Wecanprobablydrawcause-and-effectconclusionshere.
BicyclingtoWork• WecannotgeneralizebeyondGrovesandhistwobikes.• Alimitationisthatthisstudyisnotdouble-blind• Theresearcherandthesubject(whichhappenedtobethesamepersonhere)werenotblindtowhichtreatmentwasbeingused.• Dr.Grovesknewwhichbikehewasriding,andthismighthaveaffectedhisstateofmindorhischoiceswhileriding.