Chapter 21 – More About Tests
Review of Hypothesis Testing
Claim made about population proportion
1. Set up H0 and HA
2. Choose model and check conditions
   One-proportion z-test
3. Draw Normal curve and find P-value
4. State conclusion in context of problem
$$z = \frac{\hat{p} - p_0}{SD(\hat{p})}, \qquad SD(\hat{p}) = \sqrt{\frac{p_0 q_0}{n}}$$
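The test statistic above can be sketched in a few lines of Python. This is only an illustration, not the course's own tool: the function name `one_proportion_z_test` and its `alternative` argument are assumptions, and the numbers come from the Euro coin example later in this chapter (140 heads in 250 spins, H0: p = 0.5).

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0 using the Normal model.

    Hypothetical helper for illustration; `alternative` may be
    "two-sided", "greater", or "less".
    """
    p_hat = successes / n
    sd = sqrt(p0 * (1 - p0) / n)       # SD(p-hat) uses the null value p0
    z = (p_hat - p0) / sd
    std_normal = NormalDist()          # standard Normal: mean 0, SD 1
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    else:                              # two-sided: count both tails
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value

# Euro example: 140 heads in 250 spins, null value p0 = 0.5
z, p_value = one_proportion_z_test(140, 250, 0.5)
print(f"z = {z:.2f}, two-sided P-value = {p_value:.3f}")
```

The conditions for the Normal model still have to be checked separately; the sketch does not do that.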
More on the Null Hypothesis
The null hypothesis is always a statement about the population proportion.
Since we can never really prove the null hypothesis, it is usually better to set up what you want to show as the alternative hypothesis. Then you can reject the null hypothesis in favor of the alternative hypothesis.
More on P-values
The P-value is a conditional probability: the probability that we would get results at least as unusual as the ones we saw, given that the null hypothesis is true.
The lower the P-value, the more confident we are in rejecting the null hypothesis
A high P-value means we aren’t surprised by what we observed. It doesn’t prove the null hypothesis, but it gives no reason to reject it.
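The P-value is not the probability that the null hypothesis is true, nor the probability that the findings were due to chance. It is the probability of seeing results at least this unusual if random sampling variation alone were at work.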
Example: Reading Program
A new reading program may reduce the number of elementary school students who read below grade level. Statistical analysis of the results of a large-scale test showed that the percentage of students who did not attain the grade-level standard was reduced from 15.9% to 15.1%. The hypothesis that the new reading program produced no improvement was rejected with a P-value of 0.023
Explain what the P-value means.
There’s only a 2.3% probability of seeing a sample proportion as low as 15.1% by natural sampling variation, if the true percentage of children who did not attain the grade-level standard is 15.9%.
Would you recommend the reading program to your local school?
Example from DeVeaux, Intro to Stats
Alpha Levels
Earlier we wondered how low a P-value needs to be before we decide to reject the null hypothesis.
We can use an alpha level to set a threshold on our P-value. The alpha level is also called the significance level.
If our P-value is less than our alpha level, we will reject the null hypothesis
We would then say that the results are statistically significant
More on Alpha Levels
Alpha levels are represented using the symbol α
Typically we use α = 0.1, 0.05, or 0.01
When in doubt, we use α = 0.05
The choice partially depends on the importance of the claim being made. The more important the claim or the higher the stakes, the lower the alpha level you would use.
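The decision rule can be written out as a short sketch. The helper name `decide` is an assumption made for illustration; the P-value of 0.023 is the one reported in the reading program example, and the three alpha levels are the typical ones listed above.

```python
def decide(p_value, alpha=0.05):
    """Compare a P-value with the alpha level and report the decision."""
    if p_value < alpha:
        return "reject H0 (statistically significant)"
    return "fail to reject H0"

p_value = 0.023  # P-value reported in the reading program example
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: {decide(p_value, alpha)}")
```

The same P-value leads to rejection at α = 0.10 and α = 0.05 but not at α = 0.01, which is why the alpha level should be chosen before looking at the data.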
Statistically Significant
When we get a P-value below our alpha level (let’s assume 0.05), we can say “we reject the null hypothesis at the 5% level of significance”
Sometimes, statistical significance doesn’t mean the difference is important in the context of the situation
On the other hand, a difference that is important in practice may fail to be statistically significant. A larger sample size can sometimes rectify this.
One-sided Alternative Hypothesis
We might have a one-sided or two-sided alternative hypothesis
Figure (one-sided alternative) from DeVeaux, Intro to Stats
Two-sided Alternative Hypothesis
Notice we have α/2 in each tail
Figure (two-sided alternative) from DeVeaux, Intro to Stats
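To see how the choice of alternative changes the P-value, here is a small sketch that computes both tail areas for the same z-score. The value z = 1.90 and the helper name `tail_areas` are assumptions chosen only for illustration.

```python
from statistics import NormalDist

def tail_areas(z):
    """Return (one_sided, two_sided) P-values for an upper-tail z-score."""
    upper_tail = 1 - NormalDist().cdf(z)   # area to the right of z
    return upper_tail, 2 * upper_tail      # two-sided doubles the tail

z = 1.90  # assumed z-score, for illustration only
one_sided, two_sided = tail_areas(z)
print(f"one-sided P-value: {one_sided:.4f}")
print(f"two-sided P-value: {two_sided:.4f}")
```

With α = 0.05, this z-score would be significant against a one-sided alternative but not against a two-sided one, because the two-sided test splits α across both tails.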
Critical Values for Hypothesis Testing
Just like we used critical values in confidence intervals, we will use them with alpha levels
We could always get these from our z-table, but they are used commonly and will be provided from here on out.
If our z-score is more extreme than the critical value, then we will have a P-value smaller than our alpha level
Table from DeVeaux, Intro to Stats
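The critical values can also be recovered from the Normal model directly. The sketch below uses Python's standard-library `NormalDist`; the function name `critical_value` is an assumption for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard Normal: mean 0, SD 1

def critical_value(alpha, two_sided=True):
    """z* leaving alpha in one tail, or alpha/2 in each of two tails."""
    tail = alpha / 2 if two_sided else alpha
    return std_normal.inv_cdf(1 - tail)

print("alpha   one-sided   two-sided")
for alpha in (0.10, 0.05, 0.01):
    one = critical_value(alpha, two_sided=False)
    two = critical_value(alpha, two_sided=True)
    print(f"{alpha:<7} {one:<11.3f} {two:.3f}")
```

For α = 0.05 this gives roughly 1.645 (one-sided) and 1.96 (two-sided), the familiar values from the z-table.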
Confidence Intervals and Hypothesis Tests
Confidence intervals and hypothesis tests are built on the same calculations with the same assumptions and conditions
Our conclusion about the null should be consistent with whether or not the proportion in the claim falls within the confidence interval
A 95% confidence interval corresponds with a two-sided hypothesis test with α = 5%
Example: Is Euro a fair coin?
Soon after the Euro was introduced as currency in Europe, it was widely reported that someone had spun a Euro 250 times and gotten heads 140 times. Estimate the true proportion of heads using a 95% confidence interval.
(remember to check conditions)
Does your confidence interval provide evidence that the coin is unfair when spun? Explain.
What is the significance level?
Example from DeVeaux, Intro to Stats
$$CI:\ \hat{p} \pm z^{*}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.56 \pm 1.96\sqrt{\frac{(0.56)(0.44)}{250}} = 0.56 \pm 0.062$$
$$CI:\ (0.498,\ 0.622)$$
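The interval for the Euro example can be checked numerically. This is a minimal sketch using the counts from the example (140 heads in 250 spins); the variable names are assumptions, and the success/failure check uses the usual "at least 10" rule of thumb.

```python
from math import sqrt
from statistics import NormalDist

heads, n = 140, 250
# Success/failure condition: at least 10 successes and 10 failures
assert heads >= 10 and n - heads >= 10

p_hat = heads / n                          # sample proportion, 0.56
z_star = NormalDist().inv_cdf(0.975)       # about 1.96 for 95% confidence
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"95% CI: ({lower:.3f}, {upper:.3f})")
print("0.5 inside interval:", lower <= 0.5 <= upper)
```

Because 0.5 falls (barely) inside the interval, the data do not give evidence at the 5% level that the coin is unfair when spun.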
Errors in Hypothesis Testing
Even with our careful analysis and lots of evidence, we can make an incorrect decision.
Two ways we can make mistakes with hypothesis testing:
Type I: null hypothesis is true, but we reject it
Type II: null hypothesis is false, but we fail to reject it
Which error is more serious depends on the situation.
Type I Error
In medical terms, this would be a false positive: a healthy person is incorrectly diagnosed with a disease. Penalty for mistake?
In jury terms, this would mean an innocent person is convicted. Penalty for mistake?
Setting α determines the probability of a Type I Error
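A quick simulation can make this concrete: if the null hypothesis really is true and we test at α = 0.05, we should wrongly reject in about 5% of samples. The sketch below assumes a fair coin (p0 = 0.5), samples of size 250, and a two-sided test with critical value 1.96; the function name, trial count, and seed are arbitrary choices for illustration.

```python
import random
from math import sqrt

def type_i_error_rate(p0=0.5, n=250, z_star=1.96, trials=4000, seed=1):
    """Simulate samples for which H0 is true and return the rejection rate."""
    rng = random.Random(seed)
    sd = sqrt(p0 * (1 - p0) / n)               # SD(p-hat) under H0
    rejections = 0
    for _ in range(trials):
        # Draw one sample of n trials with true success probability p0
        successes = sum(rng.random() < p0 for _ in range(n))
        z = (successes / n - p0) / sd
        if abs(z) > z_star:                    # two-sided rejection rule
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Proportion of true nulls rejected: {rate:.3f}")
```

The simulated rate should land near 0.05, though not exactly on it, because of simulation noise and because counts are discrete.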
Type II Error
In medical terms, this would be a false negative: an infected person goes undiagnosed. Penalty for mistake?
In jury terms, this would mean a guilty person is not convicted. Penalty for mistake?
It is much more difficult to determine the probability of a Type II Error (designated by β), because it depends on the true, unknown value of the population proportion.
Example: Spam Filter
Suppose a spam filter uses a point system to score each email based on sender, subject, and keywords. The higher the point total, the more likely that the message is spam. We can think of the filter’s decision as a hypothesis test. The null hypothesis is that the email is a real message. A high point score is evidence that the message is junk, so the filter rejects the null hypothesis and classifies it as spam.
When the filter allows spam to slip through into your inbox, which kind of error is this?
Which kind of error is it when a real message gets classified as junk?
If the filter has a default cutoff score of 50, but you reset it to 60, is that analogous to choosing a higher or lower value of α for a hypothesis test?
Reducing Errors
We can reduce α to lower the chance of a Type I Error, but then that will have the effect of raising β
The only way to really reduce both Type I and Type II errors simultaneously is to increase our sample size, which will reduce our standard deviations.
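The trade-off can be illustrated by computing β for one specific alternative. The sketch below assumes a one-sided test of H0: p = 0.5 against HA: p > 0.5 and a hypothetical true proportion of 0.56; both the function name `beta` and the chosen numbers are assumptions, since β can only be computed once a particular true value is supposed.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

def beta(p0, p_true, n, alpha):
    """P(Type II error) for an upper-tail test when the true proportion is p_true."""
    z_star = std_normal.inv_cdf(1 - alpha)
    # Smallest sample proportion that leads to rejecting H0
    cutoff = p0 + z_star * sqrt(p0 * (1 - p0) / n)
    # Sampling distribution of p-hat when the true proportion is p_true
    sd_true = sqrt(p_true * (1 - p_true) / n)
    # Beta is the chance p-hat falls below the cutoff anyway
    return std_normal.cdf((cutoff - p_true) / sd_true)

for n in (250, 1000):
    for alpha in (0.05, 0.01):
        b = beta(0.5, 0.56, n, alpha)
        print(f"n = {n:4d}, alpha = {alpha:.2f}: beta = {b:.3f}")
```

For a fixed sample size, lowering α raises β; increasing n lowers β at every alpha level.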
Hypothesis Testing in Minitab
Choose Stat > Basic Statistics > 1 Proportion
Hypothesis Testing in Minitab (cont’d)
Fill in text fields and check Perform Hypothesis Test checkbox
# successes
sample size
proportion in claim
check (unchecked gives confidence interval only)
Click Options button and then choose alternative and set confidence level
confidence level
alternative hypothesis (not equal gives CI)
check
Hypothesis Testing in Minitab (cont’d)
Click OK and then OK again
You will see results in session window: sample proportion, Z-value, P-value