orsocrates.bmcc.cuny.edu/jsamuels/math150/150lecture-ch10.pdf · 10.6 summary of ch10 we have to...

hypothesis testing OR testing a claim (ch10)

we will do the exact same calculations, but we will use them to answer a different question

from before:
ex) p = .17 (high cholesterol)
in the sample, with 80 people: p̂ < .12 ... what is the probability of this happening?

so far, we said
- we know the population mean
- we know the standard deviation
OR
- we know the population proportion

...but then why would we take a sample???

we will now flip our perspective

we will take a "guess" about the population mean (or proportion) and consider it, given the results of the sample

e.g. what if our sample had a proportion of .02 with high cholesterol....does it seem reasonable that the whole population has a proportion of .17 with high cholesterol?

ex) recall that: p = .17 ... n = 80 ... µ_p̂ = .17 ... σ_p̂ = .0420 ... p̂ has a normal distribution
find P(p̂ < .02)

hey! was p = .17 correct??
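
a minimal sketch of this calculation in Python (the lecture itself uses Excel), assuming p̂ is approximately normal with mean p and standard deviation sqrt(p(1-p)/n). the variable names are made up for illustration:

```python
from math import sqrt
from statistics import NormalDist

p = 0.17   # guessed population proportion
n = 80     # sample size

# standard deviation of the sample proportion p-hat
sigma_p_hat = sqrt(p * (1 - p) / n)

# P(p-hat < .02), if the guess p = .17 were correct
prob = NormalDist(mu=p, sigma=sigma_p_hat).cdf(0.02)

print(round(sigma_p_hat, 4))  # 0.042
print(prob)                   # a very small probability
```

a probability this small is what makes us doubt the guess p = .17.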

hypothesis testing (formally)

1. a claim is made about the population
2. get data from a sample
3. do calculations and assess the plausibility of the claim

what is a claim? (note: here, we will make claims about µ or p ... could make lots of different claims...we will, but later)

CLAIM:

p = value
ex) p = .17 "i think that 17% of Americans have high cholesterol"

µ = value
ex) µ = 40,000 "the average income in US is $40,000"

µ > value
ex) µ > 76 "the average basketball player is over 76 inches"

consider the claim µ = 40000
if the claim is wrong, then µ ≠ 40000 ... is one possibility

set up a hypothesis test....3 ways

hypothesis = claim

note: for technical reasons, Ho is always "="
what about "≥" ... it gets confusing, so skip it

note that we identify the type of hypothesis test (and the picture) using H1

Dick Vitale claims that the average college basketball player is over 76"
what do you conclude about this claim, from your sample of 40 players with x̄ = 77" (σ =

in a hypothesis test, your probability is called a "level of confidence"

if µ = 76, then the probability that x̄ is 77" or more is .0008 ... that's unlikely
so probably, µ > 76
how probably?

.9992 probably

conclusion:
we are .9992 confident that µ > 76

(99.92% confident)

ex) pollution level in a river
an environmental group takes 30 measurements from a river, and they get an average pollution level of 4.5 cc/L of a certain pollutant (σ = 1.2)
they want to claim that the average pollution level of the river exceeds the EPA regulation of 4 cc/L
what can you conclude about the claim, and at what confidence?

here's the idea:
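
a sketch of the calculation in Python (not the lecture's Excel steps), assuming Ho: µ = 4 and H1: µ > 4, so the p-value is the area in the right tail. variable names are illustrative:

```python
from math import sqrt
from statistics import NormalDist

mu0 = 4.0     # EPA limit, used as Ho: mu = 4
x_bar = 4.5   # sample mean
sigma = 1.2   # population standard deviation
n = 30        # number of measurements

sigma_x_bar = sigma / sqrt(n)         # standard deviation of x-bar
z = (x_bar - mu0) / sigma_x_bar       # test statistic
p_value = 1 - NormalDist().cdf(z)     # right tail, since H1: mu > 4
confidence = 1 - p_value

print(round(z, 2), round(p_value, 4), round(confidence, 4))
```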

ex) we want to find out if the average American family has more than 1.8 kids. (because if it does, then there is a strain on the school system)
we take a sample of 500 families, and we get x̄ = 1.92, σ = .9
what can we conclude?

conclusion:
confidence: we are .9986 (99.86%) confident that the average number of kids in an American family is greater than 1.8
p-value: we conclude that the average number of kids is above 1.8 with p-value = .0014
accept/reject: we reject Ho, accept H1 with .9986 confidence

note: strongest results are "100% confidence" or "p-value = 0"

note: the industry standard for results:
you want confidence of .95 (95%) or more ... p-value of .05, or less

Q: in this example, what can you conclude at 95% confidence? (or 5% significance)
A: at 95% confidence, reject Ho & accept H1

note that
.9986 confidence beats .95 confidence
.0014 p-value beats .05 significance

terminology:
- z-score from data is called test statistic
- z-score from probability is called critical value
- significance (notation: α "alpha")
- confidence
- p-value = 1 - (your confidence) [it's the shaded area]
- reject/accept

you reject Ho and accept H1 if:
(your confidence) > (requested confidence)
(your p-value) < (requested significance)
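
the reject/accept rule can be sketched in Python, using the kids example (x̄ = 1.92, σ = .9, n = 500, Ho: µ = 1.8, H1: µ > 1.8). the function names right_tail_z_test and decide are hypothetical helpers, not anything from the lecture:

```python
from math import sqrt
from statistics import NormalDist

def right_tail_z_test(x_bar, mu0, sigma, n):
    """Return (z, p_value) for Ho: mu = mu0 against H1: mu > mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    return z, 1 - NormalDist().cdf(z)

def decide(p_value, alpha):
    """Reject Ho when the p-value is below the requested significance."""
    return "reject Ho, accept H1" if p_value < alpha else "do not reject Ho"

z, p_value = right_tail_z_test(x_bar=1.92, mu0=1.8, sigma=0.9, n=500)
print(round(p_value, 4), round(1 - p_value, 4))  # 0.0014 0.9986
print(decide(p_value, alpha=0.05))               # reject Ho, accept H1
```

comparing the p-value with α is the same test as comparing your confidence with the requested confidence, since confidence = 1 - p-value.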

ex) a drug company is trying to make 200mg pills
they want to make sure there are exactly 200mg in each pill
they take a sample of 500 pills, and find that x̄ = 199.88, σ = .9
is the average pill different from 200mg? what can they conclude?

conclusion:
the average pill is different from 200mg with .9972 confidence [p-value = .0028]
reject Ho, accept H1 with 99.72% confidence

note: we never "accept Ho"
if we think the mean is 200 (µ = 200), well, it could also be 200.0001
the best we can do is support the claim that it is not equal to, less than, or greater than the value (≠, <, >)

Q: what can you conclude at .999 confidence (or .001 significance)?
A: at .999 confidence, do not reject Ho, do not accept H1
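
a sketch of the pill example in Python, assuming H1: µ ≠ 200, so the p-value counts both tails. the result should land near the slide's .0028; small differences come from rounding z:

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n = 199.88, 200, 0.9, 500

z = (x_bar - mu0) / (sigma / sqrt(n))      # test statistic (negative here)
p_value = 2 * NormalDist().cdf(-abs(z))    # two tails, since H1 is "not equal"

print(round(z, 2), round(p_value, 4))
print("reject Ho at .05 significance:", p_value < 0.05)
print("reject Ho at .001 significance:", p_value < 0.001)
```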

ex) is the average fashion model under 102lbs?
we asked 50 models, and their average weight was 97lbs. (σ = 11 lbs)

what can we conclude?

conclusion:
we conclude that the average model weighs less than 102lbs with 99.93% confidence [p-value = .0007]
OR:
reject Ho (and accept H1) with 99.93% confidence
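
a sketch of this left-tailed test in Python, assuming Ho: µ = 102 and H1: µ < 102, so the p-value is the area to the left of z:

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n = 97, 102, 11, 50

z = (x_bar - mu0) / (sigma / sqrt(n))   # negative, since x-bar is below 102
p_value = NormalDist().cdf(z)           # left tail, since H1: mu < 102

print(round(z, 2), round(p_value, 4), round(1 - p_value, 4))
```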

hypothesis test about µ, what if we don't know the population st.dev σ ?

we have to use the sample standard deviation s instead
...but there's a price to pay
the distribution is no longer normal

..so what is the distribution? and how do we find probabilities?

the distribution of x̄ in this case is a Student t distribution..."t distribution"

it's similar to the normal distribution
- symmetric
- middle value is 0
- graph kind of looks like a bell curve
wrinkle: the distribution changes depending on n
we need "degrees of freedom" = df = n-1

note: the t-distribution is similar to the normal distribution in shape, but it is not the same

note: Excel does not like negative t-values
enter the positive value, the answer is the same due to symmetry

ex) P(t > 2.3) n=47

ex) P(t > 3.29) n=33

ex) P(t < -2.73) n=58
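
the lecture finds these with Excel's tdist. as a rough alternative, here is a sketch in plain Python that integrates the t density numerically (Simpson's rule). t_pdf and t_upper_tail are hypothetical helper names, and this is an approximation rather than a library routine:

```python
import math

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + t * t / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t): Simpson's rule on [0, |t|], then symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3        # area to the right of |t|
    return upper if t >= 0 else 1 - upper

p1 = t_upper_tail(2.3, df=47 - 1)     # P(t > 2.3),  n = 47
p2 = t_upper_tail(3.29, df=33 - 1)    # P(t > 3.29), n = 33
p3 = t_upper_tail(2.73, df=58 - 1)    # P(t < -2.73) = P(t > 2.73) by symmetry

print(round(p1, 4), round(p2, 4), round(p3, 4))
```

the third call passes the positive value, just like the Excel note says.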

ex) how many M&Ms in a bag?
supposed to be 1000, but I claim it's more than that.
sample 80 bags, get x̄ = 1036, s = 97.3
what can you conclude?

note: when we use s instead of σ, s replaces it in our formulas

note: Excel does NOT take negative t-scores

note: the t (or z) value which you calculate from your data is known as the test statistic

conclusion:
we are 99.93% confident that the average number of M&Ms in a bag is more than 1000 [p-value = .0007]

sometimes, you will be asked "what can you conclude at 98% confidence?" OR "what can you conclude at 2% significance?"

ex) we are 99.93% confident that the average number of M&Ms in a bag is more than 1000 [p-value = .0007]
are you 98% confident? ....yes
at 2% significance (or at 98% confidence), reject Ho, accept H1, the average number of M&Ms is more than 1000

ex) we are 89.94% confident that p > .5 [p-value = .1006 or 10.06%]
are we 98% confident? ....no, we cannot conclude at 98% confidence (or 2% significance) that p > .5
- do not reject Ho, do not accept H1

Compare

                      normal distribution         t-distribution

statistic             z                           t
given                 σ                           s

test statistic        z = (x̄ - µ_x̄) / σ_x̄         t = (x̄ - µ_x̄) / s_x̄

standard deviation    σ_x̄ = σ / √n                s_x̄ = s / √n

Excel                 =normsdist(z)               =tdist(t,df,#tails)
                      gives area to the left      gives area in tail(s)

when to use?          have σ                      don't have σ, have s
                      large sample size           small sample size

ex) does the average person own more than 3 hats?
you survey 15 people, who respond: 2 5 0 8 3 3 4 3 9 1 2 2 4 12 5
what can you conclude, and at what confidence?

...what do you need?

you can find the mean and standard deviation using Excel:
type the data values into Excel, then...
=average(
=stdev(
...then highlight your data values

followup:
at 10% significance, do you reject H0?

at 5% significance, do you reject H0?

conclusion: we are 91.41% confident that µ > 3, the average person owns more than 3 hats. [p-value = .0859]
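
a sketch of the whole hats calculation in Python, starting from the raw data. statistics.mean and statistics.stdev play the role of Excel's average and stdev; the tail area is found by numerically integrating the t density, which is an approximation chosen here and not the lecture's method, so expect a p-value close to (not exactly) .0859:

```python
import math
from statistics import mean, stdev

hats = [2, 5, 0, 8, 3, 3, 4, 3, 9, 1, 2, 2, 4, 12, 5]
n = len(hats)
x_bar = mean(hats)                       # like =average(...)
s = stdev(hats)                          # like =stdev(...), sample st. dev.
t = (x_bar - 3) / (s / math.sqrt(n))     # test statistic for Ho: mu = 3
df = n - 1

def t_pdf(x, df):
    """Density of the Student t distribution."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_value = t_upper_tail(t, df)            # right tail, since H1: mu > 3
print(x_bar, round(s, 3), round(t, 2), round(p_value, 4))
print("reject H0 at 10% significance:", p_value < 0.10)
print("reject H0 at 5% significance:", p_value < 0.05)
```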

ex) H0: µ = 4
H1: µ < 4
x̄ = 3.2
n = 24
for the population, σ = 1.3, normal distribution
...is x̄ normal?

ex) H0: µ = 42
H1: µ < 42
from the sample of 45 people, the mean is 40.6 and the standard deviation is 5.1
what can you conclude?

ex) H0: p = .62
H1: p < .62
p̂ = .54 ... n = 300

what's your conclusion about the claim, and at what confidence?

10.4 hypothesis testing...with proportions

note that the steps are the same, but some of the formulas are slightly different
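
a sketch of the proportion version in Python, applied to the example with H0: p = .62, H1: p < .62. it assumes the standard deviation of p̂ is sqrt(p(1-p)/n), computed from the value of p in H0. proportion_z is a hypothetical helper name:

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(p_hat, p0, n):
    """Test statistic for a claim about a proportion, with Ho: p = p0."""
    sigma_p_hat = sqrt(p0 * (1 - p0) / n)   # uses p0 from Ho, not p_hat
    return (p_hat - p0) / sigma_p_hat

z = proportion_z(p_hat=0.54, p0=0.62, n=300)
p_value = NormalDist().cdf(z)               # left tail, since H1: p < .62

print(round(z, 2), round(p_value, 4), round(1 - p_value, 4))
```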

ex) does the public believe the death penalty should be used?
we want to find out if a majority of Americans support the death penalty
we survey 120 people, and 67 say yes.
what can we conclude, and at what confidence?

conclusion:
we are 89.97% confident that a majority of Americans support the death penalty
p > .5 with .8997 confidence [p-value = .1003, or 10.03%]
reject Ho, accept H1 with .8997 confidence

follow-up question: what can you conclude at 5% significance?
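
a sketch of the survey calculation and the follow-up in Python, assuming Ho: p = .5 and H1: p > .5. the p-value should come out close to the slide's .1003 (the slide rounds z before looking up the area):

```python
from math import sqrt
from statistics import NormalDist

yes, n, p0 = 67, 120, 0.5
p_hat = yes / n                                # sample proportion

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)     # test statistic
p_value = 1 - NormalDist().cdf(z)              # right tail, since H1: p > .5

print(round(p_hat, 4), round(z, 2), round(p_value, 4))
print("reject Ho at 5% significance:", p_value < 0.05)
```

by the rule (your p-value) < (requested significance), a p-value near .10 is not below .05.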

ex) do more than 60% of people support legalizing gay marriage?
out of 180 people surveyed, 119 said yes.
what can you conclude, and at what confidence?

conclusion: we are 95.25% confident that p > .6 (more than 60% support legalizing gay marriage)
reject Ho, accept H1 with .9525 confidence
p-value = .0475

10.6 summary of ch10

we have to identify, given a problem, what approach to use

ex) the Health and Safety Board recommends at most 4 hours of tv each day. you think people watch more. you survey 130 people, who watch an average of 4.5 hours a day, with a standard deviation of 1.7
what can you conclude, and at what confidence?

ex) you need to know if proposition 285 will pass with 2/3 of the vote.
you survey 200 people and 141 say yes
what can you conclude, and at what confidence?

ex) gasoline regulations state that gas sold at the pump should contain exactly .18 liters of ethanol in each liter of gas. you think it might be different. you survey 65 gas stations and find an average of .161 liters with a standard deviation of .06
what can you conclude and at what confidence?

ex) what is the average number of large-screen tv's sold at a Best Buy store yearly?
management insists they sell more than 100 tvs. 45 stores are surveyed, and they sell an average of 107.3 (in the industry, σ = 25.2)
what can you conclude at 95% confidence?

ex) the US RDA for calcium is 1000mg. the Dairy Food Association is concerned that teens are not getting enough. they conduct a survey of 500 teens, who consume an average of 989mg of calcium. from the sample, standard deviation s = 110
what can you conclude about the DFA claim, and at what confidence?

ex) is the average fashion model under 102lbs?
we asked 50 models, and their average weight was 97lbs. (σ = 11 lbs)
what can we conclude?
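
one way to sketch the "what approach to use" decision in Python. choose_test is a hypothetical helper, and it only looks at two things: whether the claim is about a mean or a proportion, and whether σ is given (following the Compare table: have σ, use z; have only s, use t):

```python
def choose_test(parameter, sigma_known=False):
    """Suggest an approach from what the problem gives (illustrative only)."""
    if parameter == "proportion":
        return "normal (z) test for a proportion"
    if parameter == "mean":
        if sigma_known:
            return "normal (z) test for a mean"
        return "t test for a mean, df = n - 1"
    raise ValueError("parameter must be 'mean' or 'proportion'")

print(choose_test("mean"))                    # tv hours: st. dev. from the sample
print(choose_test("proportion"))              # proposition 285: 141 of 200 say yes
print(choose_test("mean", sigma_known=True))  # Best Buy: sigma = 25.2 is given
```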

ex) the Health and Safety Board recommends at most 4 hours of tv each day. you think people watch more. you survey 130 people, who watch an average of 4.5 hours a day, with a standard deviation of 1.7
what can you conclude, and at what confidence?

we are .9995 confident that µ > 4
reject Ho, accept H1 with .9995 confidence

ex) you need to know if proposition 285 will pass with 2/3 of the vote.
you survey 200 people and 141 say yes
what can you conclude, and at what confidence?

we are 87.08% confident that proposition 285 will pass with 2/3 of the vote
reject Ho, accept H1 with .8708 confidence
p-value = .1292

ex) gasoline regulations state that gas sold at the pump should contain exactly .18 liters of ethanol in each liter of gas. you survey 65 gas stations and find an average of .161 liters with a standard deviation of .06
what can you conclude and at what confidence?

ex) what is the average number of large-screen tv's sold at a Best Buy store yearly?
management insists they sell more than 100 tvs. 45 stores are surveyed, and they sell an average of 107.3 (in the industry, σ = 25.2)
what can you conclude at 95% confidence?
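
a sketch of the Best Buy example in Python using the critical value approach, assuming Ho: µ = 100 and H1: µ > 100. NormalDist().inv_cdf plays the role of Excel's normsinv here:

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n = 107.3, 100, 25.2, 45

z = (x_bar - mu0) / (sigma / sqrt(n))    # test statistic (z-score from data)
critical = NormalDist().inv_cdf(0.95)    # critical value for 95% confidence, right tail
p_value = 1 - NormalDist().cdf(z)

print(round(z, 2), round(critical, 3), round(p_value, 4))
print("reject Ho at 95% confidence:", z > critical)
```

comparing z with the critical value gives the same decision as comparing the p-value with .05.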

on the exam you will be given this:

Excel commands:

Normdist
Normsdist
Norminv
Normsinv
Tdist
Tinv
Average
Stdev
sqrt
