Chapter 21 – More About Tests

Review of Hypothesis Testing

Claim made about population proportion

1. Set up H0 and HA

2. Choose model and check conditions: one-proportion z-test

3. Draw Normal curve and find P-value

4. State conclusion in context of problem

$$z = \frac{\hat{p} - p_0}{SD(\hat{p})}, \qquad SD(\hat{p}) = \sqrt{\frac{p_0 q_0}{n}}$$
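The test statistic above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `one_proportion_z_test` and the sample numbers (60 successes in 100 trials, p0 = 0.5) are assumptions made up for the example, not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for a one-proportion z-test of H0: p = p0."""
    p_hat = successes / n
    sd = sqrt(p0 * (1 - p0) / n)       # SD(p-hat), computed under H0
    z = (p_hat - p0) / sd
    std_normal = NormalDist()          # standard Normal model
    if alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value


# Hypothetical data: 60 successes in 100 trials, testing H0: p = 0.5 vs HA: p > 0.5
z, p_value = one_proportion_z_test(60, 100, 0.5, alternative="greater")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The standard deviation uses p0 and q0 rather than the sample proportion because the model is built assuming the null hypothesis is true.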

More on the Null Hypothesis

The null hypothesis is always a statement about the population proportion.

Since we can never really prove the null hypothesis, it is usually better to set up what you want to show as the alternative hypothesis. Then you can reject the null hypothesis in favor of the alternative hypothesis.

More on P-values

P-value is a conditional probability: the probability that we will get results at least as unusual as the ones we saw, given that the null hypothesis is true.

The lower the P-value, the more confident we are in rejecting the null hypothesis.

A high P-value means we aren’t surprised by what we observed. It doesn’t prove the null hypothesis, but it gives us no reason to reject it.

Essentially, the P-value is the probability of seeing results this unusual from random sampling variation alone, assuming the null hypothesis is true. It is not the probability that the null hypothesis is true.
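One way to see the conditional nature of a P-value is to simulate many samples in a world where the null hypothesis is true and count how often the result is at least as unusual as the one observed. The sketch below does this with made-up numbers (60 successes in 100 tries, H0: p = 0.5); the function name, the number of trials, and the seed are all assumptions for illustration.

```python
import random


def simulated_p_value(observed, n, p0, trials=20_000, seed=1):
    """Estimate P(at least `observed` successes in n tries), given H0: p = p0."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    as_extreme = 0
    for _ in range(trials):
        # Draw one sample of size n in a world where H0 is true
        successes = sum(rng.random() < p0 for _ in range(n))
        if successes >= observed:
            as_extreme += 1
    return as_extreme / trials


# Hypothetical data: 60 successes in 100 tries, upper-tail alternative
estimate = simulated_p_value(observed=60, n=100, p0=0.5)
print(f"Simulated one-sided P-value: {estimate:.3f}")
```

Every simulated sample comes from the null model, so the estimate says how often chance alone would produce a result this extreme. It does not say how likely the null hypothesis is.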

Example: Reading Program

A new reading program may reduce the number of elementary school students who read below grade level. Statistical analysis of the results of a large-scale test showed that the percentage of students who did not attain the grade-level standard was reduced from 15.9% to 15.1%. The hypothesis that the new reading program produced no improvement was rejected with a P-value of 0.023

Explain what the P-value means.

There’s only a 2.3% probability of seeing a sample proportion as low as 15.1% by natural sampling variation if the true percentage of children who did not attain the grade-level standard is 15.9%.

Would you recommend the reading program to your local school?

Example from DeVeaux, Intro to Stats

Alpha Levels

We have been wondering how low a P-value needs to be before we decide to reject the null hypothesis.

We can use an alpha level, α, to set a threshold on our P-value. The alpha level is also called the significance level.

If our P-value is less than our alpha level, we will reject the null hypothesis

We would then say that the results are statistically significant

More on Alpha Levels

Alpha levels are represented using the symbol α

Typically we use α = 0.1, 0.05, or 0.01

When in doubt, we use α = 0.05

The choice partially depends on the importance of the claim being made. The more important the claim or the higher the stakes, the lower an alpha level you would use.
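The decision rule is simple enough to write down directly. The sketch below applies it to the P-value of 0.023 from the reading program example at each of the typical alpha levels; the helper name `decide` is an assumption for illustration.

```python
def decide(p_value, alpha=0.05):
    """Compare a P-value with the alpha level and report the decision."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


p_value = 0.023  # P-value reported in the reading program example
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: {decide(p_value, alpha)}")
```

The same P-value leads to rejecting the null hypothesis at α = 0.1 and α = 0.05 but not at α = 0.01, which is why the alpha level should be chosen before looking at the results.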

Statistically Significant

When we get a P-value below our alpha level (let’s assume 0.05), we can say “we reject the null hypothesis at the 5% level of significance”

Sometimes, statistical significance doesn’t mean the difference is important in the context of the situation

On the other hand, sometimes an important difference may turn out to not be statistically significant. Sometimes a larger sample size can rectify this.
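The effect of sample size can be illustrated with the proportions from the reading program example (15.9% under the null, 15.1% observed). The slides do not give the sample size, so the two values of n below are hypothetical, as is the function name.

```python
from math import sqrt
from statistics import NormalDist


def lower_tail_p_value(p_hat, p0, n):
    """One-sided (lower-tail) P-value for a one-proportion z-test."""
    sd = sqrt(p0 * (1 - p0) / n)       # SD(p-hat) under H0
    z = (p_hat - p0) / sd
    return NormalDist().cdf(z)


# Same observed drop, two hypothetical sample sizes
for n in (1_000, 10_000):
    p = lower_tail_p_value(0.151, 0.159, n)
    print(f"n = {n}: P-value = {p:.3f}")
```

With the smaller sample the P-value is well above 0.05, while with the larger sample it falls below 0.05, even though the observed difference is identical. Whether a drop of 0.8 percentage points matters in practice is a separate question that the test cannot answer.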

One-sided Alternative Hypothesis

We might have a one-sided or two-sided alternative hypothesis

[Figure: one-sided alternative. From DeVeaux, Intro to Stats]

Two-sided Alternative Hypothesis

Notice we have α/2 in each tail

[Figure: two-sided alternative. From DeVeaux, Intro to Stats]
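The difference between the two kinds of alternative shows up in how the tail area is counted. The sketch below uses a hypothetical z-score of 1.8 (not from the slides) and compares the one-sided and two-sided P-values.

```python
from statistics import NormalDist

z = 1.8  # hypothetical z-score, chosen only for illustration

upper_tail = 1 - NormalDist().cdf(z)   # one-sided alternative (HA: p > p0)
two_sided = 2 * upper_tail             # two-sided alternative counts both tails

print(f"one-sided P-value: {upper_tail:.4f}")
print(f"two-sided P-value: {two_sided:.4f}")
```

For this z-score the one-sided P-value is below 0.05 and the two-sided P-value is above it, so the choice of alternative should be made from the research question before seeing the data.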

Critical Values for Hypothesis Testing

Just like we used critical values in confidence intervals, we will use them with alpha levels

We could always get these from our z-table, but they are used commonly and will be provided from here on out.

If our z-score is more extreme than the critical value, then we will have a P-value smaller than our alpha level

Table from DeVeaux, Intro to Stats
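Critical values like those in the table can also be computed from the Normal model. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `critical_value` and the particular alpha levels shown are assumptions for illustration and may not match the table exactly.

```python
from statistics import NormalDist


def critical_value(alpha, two_sided=False):
    """Return z* so that the rejection region has total area alpha."""
    tail_area = alpha / 2 if two_sided else alpha   # alpha/2 in each tail if two-sided
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.05, 0.01, 0.001):
    one = critical_value(alpha)
    two = critical_value(alpha, two_sided=True)
    print(f"alpha = {alpha}: one-sided z* = {one:.3f}, two-sided z* = {two:.3f}")
```

A z-score more extreme than z* lands in the rejection region, which is the same as having a P-value smaller than α.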

Confidence Intervals and Hypothesis Tests

Confidence intervals and hypothesis tests are built on the same calculations with the same assumptions and conditions

Our conclusion about the null should be consistent with whether or not the proportion in the claim falls within the confidence interval

A 95% confidence interval corresponds with a two-sided hypothesis test with α = 5%

Example: Is Euro a fair coin?

Soon after the Euro was introduced as currency in Europe, it was widely reported that someone had spun a Euro 250 times and gotten heads 140 times. Estimate the true proportion of heads using a 95% confidence interval.

(remember to check conditions)

Does your confidence interval provide evidence that the coin is unfair when spun? Explain.

What is the significance level?

Example from DeVeaux, Intro to Stats

$$CI: \hat{p} \pm z^* \sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.56 \pm 1.96 \sqrt{\frac{(0.56)(0.44)}{250}} = 0.56 \pm 0.062$$

$$CI: (0.498, 0.622)$$
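The interval for the Euro example can be checked with a short calculation. This is a sketch using the numbers given in the example (140 heads in 250 spins); the variable names are assumptions, and the success/failure check uses the usual "at least 10" rule.

```python
from math import sqrt
from statistics import NormalDist

heads, n = 140, 250                    # data from the Euro example
p_hat = heads / n

# Success/failure condition: at least 10 successes and 10 failures
assert heads >= 10 and n - heads >= 10

z_star = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin, p_hat + margin

print(f"95% CI: ({low:.3f}, {high:.3f})")           # 95% CI: (0.498, 0.622)
print("0.5 inside interval:", low <= 0.5 <= high)   # 0.5 inside interval: True
```

Because 0.5 falls inside the interval, a two-sided test at α = 0.05 would fail to reject the hypothesis that the coin is fair.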

Errors in Hypothesis Testing

Even with our careful analysis and lots of evidence, we can make an incorrect decision.

Two ways we can make mistakes with hypothesis testing:

Type I: null hypothesis is true, but we reject it

Type II: null hypothesis is false, but we fail to reject it

Which error is more serious depends on the situation.

Type I Error

In medical terms, this would be a false positive: a healthy person is incorrectly diagnosed with a disease. Penalty for mistake?

In jury terms, this would mean an innocent person is convicted. Penalty for mistake?

Setting α determines the probability of a Type I Error

Type II Error

In medical terms, this would be a false negative: an infected person goes undiagnosed. Penalty for mistake?

In jury terms, this would mean a guilty person is not convicted. Penalty for mistake?

It is much more difficult to determine the probability of a Type II Error (designated by β).

Example: Spam Filter

Suppose a spam filter uses a point system to score each email based on sender, subject, and keywords. The higher the point total, the more likely that the message is spam. We can think of the filter’s decision as a hypothesis test. The null hypothesis is that the email is a real message. A high point score would be evidence that it is junk, so the filter will reject the null hypothesis and classify the message as spam.

When the filter allows spam to slip through into your inbox, which kind of error is this?

Which kind of error is it when a real message gets classified as junk?

If the filter has a default cutoff score of 50, but you reset it to 60, is that analogous to choosing a higher or lower value of α for a hypothesis test?

Reducing Errors

We can reduce α to lower the chance of a Type I Error, but then that will have the effect of raising β

The only way to really reduce both Type I and Type II errors simultaneously is to increase our sample size, which will reduce our standard deviations.
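The trade-off between α and β can be made concrete for one specific alternative. The sketch below computes β for an upper-tail one-proportion z-test; the function name and all of the numbers (p0 = 0.5, a true proportion of 0.6, n = 100 or 400) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(p0, p_true, n, alpha):
    """Beta for the upper-tail test of H0: p = p0 when the true proportion is p_true."""
    z_star = NormalDist().inv_cdf(1 - alpha)
    cutoff = p0 + z_star * sqrt(p0 * (1 - p0) / n)   # reject H0 when p-hat exceeds this
    sd_true = sqrt(p_true * (1 - p_true) / n)        # SD of p-hat under the true proportion
    return NormalDist().cdf((cutoff - p_true) / sd_true)


# Hypothetical setting: H0: p = 0.5, true proportion 0.6
for n, alpha in ((100, 0.05), (100, 0.01), (400, 0.05)):
    beta = type_ii_error(0.5, 0.6, n, alpha)
    print(f"n = {n}, alpha = {alpha}: beta = {beta:.3f}")
```

Lowering α from 0.05 to 0.01 at n = 100 raises β, while increasing n to 400 at α = 0.05 lowers β. Note that β depends on the assumed true proportion, which is part of why it is harder to pin down than α.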

Hypothesis Testing in Minitab

Choose Stat > Basic Statistics > 1 Proportion

Hypothesis Testing in Minitab (cont’d)

Fill in the text fields and check the Perform Hypothesis Test checkbox

Dialog fields: number of successes, sample size, proportion in claim

(Leaving Perform Hypothesis Test unchecked gives a confidence interval only)

Click the Options button, then choose the alternative and set the confidence level

Options fields: confidence level, alternative hypothesis (not equal gives CI), and the checkbox to check


Click OK and then OK again

You will see the results in the session window: sample proportion, Z-value, P-value

