hypothesis testing

Hypothesis Testing

A hypothesis is a claim or statement about a property of a population (in our case, about the mean or a proportion of the population).

A hypothesis test (or test of significance) is a standard procedure for testing a claim or statement about a property of a population.

It is extremely important to realize that we are not making definitive conclusions. We are giving probabilistic conclusions: we conclude either that the results we got could plausibly be due to chance, or that they are unlikely to be.

Examples

If we flip a coin 100 times, and 52 come up heads, this could easily occur by chance. There is not sufficient evidence to suggest that the coin is unfair.

If we flip a coin 100 times, and 75 come up heads, this would be an extremely rare event if the coin were fair. The extremely low probability is evidence that the coin may not be fair.

Note: It would be very sloppy of us to conclude in the second example that the coin is definitely unfair. Although extremely rare, 75 heads is still possible by chance from a fair coin.
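To put rough numbers on "could easily occur by chance" and "extremely rare", the sketch below sums exact binomial probabilities for a fair coin. The helper name `prob_at_least` is an illustrative assumption, not something from these notes.

```python
from math import comb

def prob_at_least(k, n=100, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p52 = prob_at_least(52)  # roughly 0.38: unremarkable for a fair coin
p75 = prob_at_least(75)  # well below one in a million, but not zero

print(f"P(at least 52 heads) = {p52:.4f}")
print(f"P(at least 75 heads) = {p75:.2e}")
```

The second probability is tiny but still positive, which is why the conclusion is "the coin may not be fair" rather than "the coin is unfair".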

Another Example

A light bulb is advertised as having a mean life of 1000 hours. From a sample, we find the mean life of our sample to be 900 hours. The 95% confidence interval for the population mean is 850 < μ < 1050 hours.

We CANNOT conclude:

That the actual mean life of light bulbs is 900 hours

That the advertised life is wrong

That the advertised life is correct

We CAN conclude:

From our sample, we are 95% confident that the population mean is between 850 hours and 1050 hours. Since 1000 hours is included in that interval, we do not have sufficient evidence to say that the advertised life is wrong.

Another approach

Claim: The mean life of light bulbs is less than 1000 hours

Working Assumption: The mean life of light bulbs is 1000 hours

The sample resulted in a mean life of 900 hours

Assuming that μ = 1000, the probability that the mean of our sample would be less than 900 is $P(\bar{x} < 900) = 0.0951$

There are two possible explanations for why our sample came out with a mean life of 900 hours. Either this occurred by chance (with probability 9.5%), or the actual mean life of light bulbs is less than 1000. Since the probability (9.5%) isn’t horribly small, we decide that random chance is a reasonable explanation. There isn’t sufficient evidence to support the claim that the mean life of light bulbs is less than 1000 hours.
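The notes do not give the population standard deviation or the sample size behind the 0.0951 figure. As a minimal sketch, the code below assumes a hypothetical standard error of about 76.3 hours, chosen only because it puts the sample mean 1.31 standard errors below 1000 and so reproduces a left-tail probability near 0.0951. The function name is also an assumption.

```python
from statistics import NormalDist

def left_tail_probability(sample_mean, assumed_mean, standard_error):
    """P(sample mean falls at or below the observed value) if assumed_mean is true."""
    z = (sample_mean - assumed_mean) / standard_error
    return NormalDist().cdf(z)

# Hypothetical standard error (sigma / sqrt(n)); not stated in the notes.
standard_error = 100 / 1.31

p = left_tail_probability(900, 1000, standard_error)
print(round(p, 4))
```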

Formal Hypothesis Testing

The brief process

Convert your claim into a symbolic null and alternative hypothesis

Calculate a test statistic

Compare the test statistic to critical values OR Find a probability

Write a conclusion

Components of a Formal Hypothesis Test

The null hypothesis (denoted H0) is a statement that the value of a population parameter (such as proportion or mean) is equal to some claimed value.

The alternative hypothesis (denoted H1 or Ha) is a statement that the value of a population parameter somehow differs from the null hypothesis. The symbolic form must be a >, < or ≠ statement.

We will be testing the null hypothesis directly (by assuming it’s true) to reach a conclusion to either reject H0 or fail to reject H0.

Note: We cannot support a claim that a parameter is equal to a value. So, the null hypothesis must always include equality, and the alternative hypothesis must be inequality.

Process

1. Identify the claim to be tested and express it in symbolic form.

2. Give the symbolic form that must be true when the original claim is false.

3. Pick the one not including equality to be H1, and let the null hypothesis be that the parameter equals the value being considered.

Example

Claim: The mean IQ of statistics students is greater than 110.

Symbolic form: μ > 110

Opposite: μ ≤ 110

H0: μ = 110

H1: μ > 110

Note: While often your claim will be the alternative hypothesis, it won’t always be.

Test Statistics

A test statistic is a value computed from the sample data, used in making the decision whether or not to reject the null hypothesis.

Z value for proportion: $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$

Z value for mean (sigma known): $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

T value for mean (sigma unknown): $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$

The test statistic indicates how far our sample deviates from the assumed population parameter.
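The three test statistics listed above can be sketched as small Python functions. The function and parameter names are illustrative assumptions. Note that the z and t statistics for a mean use the same arithmetic; they differ in which standard deviation is plugged in and in which distribution the result is compared against (standard normal versus Student's t with n − 1 degrees of freedom).

```python
from math import sqrt

def z_proportion(p_hat, p, n):
    """z statistic for a sample proportion p_hat against a claimed proportion p."""
    q = 1 - p
    return (p_hat - p) / sqrt(p * q / n)

def z_mean(x_bar, mu, sigma, n):
    """z statistic for a sample mean when the population sigma is known."""
    return (x_bar - mu) / (sigma / sqrt(n))

def t_mean(x_bar, mu, s, n):
    """t statistic for a sample mean when only the sample standard deviation s is known."""
    return (x_bar - mu) / (s / sqrt(n))

# Coin example from earlier in the notes: 75 heads in 100 flips, claimed p = 0.5.
z_coin = z_proportion(0.75, 0.5, 100)
print(z_coin)
```

For the coin example the sample proportion sits about five standard errors above 0.5, which matches the earlier observation that 75 heads would be extremely rare for a fair coin.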

Critical region and significance

The critical region (or rejection region) is the set of all values of the test statistic that cause us to reject the null hypothesis.

The significance level (α) is the probability that the test statistic will fall in the critical region when the null hypothesis is actually true. Common values are 0.01, 0.05 and 0.10.

A critical value is any value that separates the critical region from values of the test statistic that would not cause us to reject the null hypothesis.

Example

Using a significance level of α = 0.05, let's find the critical value for each of these alternative hypotheses:

p ≠ 0.5: The critical region is in both tails of the normal distribution. Using the same method we used in chapter 6, we find the critical values to be z = -1.96 and z = 1.96

p < 0.5: The critical region is in the left tail of the normal distribution. Using the methods from 5.2, we find c so P(z < c) = 0.05. The critical value is -1.645

p > 0.5: The critical region is in the right tail of the normal distribution. Using the methods from 5.2, we find c so P(z < c) = 0.95. The critical value is 1.645
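The critical values in this example can be checked with the inverse CDF of the standard normal distribution. This is a minimal sketch using Python's standard library; the function name and the `tail` labels are assumptions made for illustration.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Critical z value(s) for a test at significance level alpha.

    tail is "two" (H1 uses ≠), "left" (H1 uses <) or "right" (H1 uses >).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        c = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split between both tails
        return (-c, c)
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    raise ValueError("tail must be 'two', 'left' or 'right'")

for tail in ("two", "left", "right"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```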

P-Value

The P-value is the probability, assuming the null hypothesis is true, of getting a value of the test statistic that is at least as extreme as the one obtained for the sample data. If the P-value is very small (such as less than 0.05), we will reject the null hypothesis.

See pullout for help on how to calculate P-value. The exact process depends on your alternative hypothesis.
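As a rough illustration of how the calculation depends on the alternative hypothesis, the sketch below computes a P-value from a z test statistic for each of the three cases. It covers only z statistics; a t statistic would need the t distribution, which Python's standard library does not provide. The function name and `tail` labels are assumptions.

```python
from statistics import NormalDist

def p_value(z, tail):
    """P-value for a z test statistic.

    tail is "left" (H1 uses <), "right" (H1 uses >) or "two" (H1 uses ≠).
    """
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)                    # area to the left of z
    if tail == "right":
        return 1 - cdf(z)                # area to the right of z
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))     # both tails, at least as far from 0 as z
    raise ValueError("tail must be 'left', 'right' or 'two'")

# z = -1.31 is an assumed value that gives a left-tail area close to 0.0951.
p_left = p_value(-1.31, "left")
p_two = p_value(-1.31, "two")
print(round(p_left, 4), round(p_two, 4))
```

The two-tailed P-value is twice the one-tailed value, so the same test statistic gives weaker evidence against H0 when the alternative is ≠.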

Decisions and Conclusions

Our final conclusion will always be one of these:

1. Reject the null hypothesis

2. Fail to reject the null hypothesis

Traditional Method

Reject H0 if the test statistic falls within the critical region

Otherwise fail to reject the null hypothesis


P-value method

Reject H0 if P-value ≤ α

Fail to reject H0 if P-value > α
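The P-value rule is simple enough to write as a one-line comparison. The sketch below applies it to the light bulb probability of 0.0951 from earlier in the notes at two common significance levels; the function name is an illustrative assumption.

```python
def decide_by_p_value(p_value, alpha=0.05):
    """Apply the P-value method: reject H0 when the P-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# Light bulb example: P-value of 0.0951 for the claim that the mean is below 1000.
decision_05 = decide_by_p_value(0.0951, alpha=0.05)
decision_10 = decide_by_p_value(0.0951, alpha=0.10)
print(decision_05)
print(decision_10)
```

The same data lead to different decisions at α = 0.05 and α = 0.10, which is why the significance level should be chosen before looking at the results.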

Less common methods

Find P-value, and leave conclusion to the reader

Look at whether the claimed value of the population parameter falls in the confidence interval estimate

Final Wording

If your original claim contains equality (became H0)

Reject H0: “There is sufficient evidence to warrant rejection of the claim that…”

Fail to Reject H0 : “There is not sufficient evidence to warrant rejection of the claim that…”

If your original claim does not contain equality (was H1)

Reject H0: “The sample data support the claim that…”

Fail to Reject H0 : “There is not sufficient sample evidence to support the claim that…”
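The four wordings above form a small lookup on two yes/no questions: did the original claim contain equality, and was H0 rejected? A minimal sketch, with a hypothetical function name and parameters:

```python
def final_wording(claim, claim_contains_equality, reject_h0):
    """Return the conclusion sentence for a claim, following the four cases above."""
    if claim_contains_equality:  # the original claim became H0
        if reject_h0:
            prefix = "There is sufficient evidence to warrant rejection of the claim that"
        else:
            prefix = "There is not sufficient evidence to warrant rejection of the claim that"
    else:  # the original claim became H1
        if reject_h0:
            prefix = "The sample data support the claim that"
        else:
            prefix = "There is not sufficient sample evidence to support the claim that"
    return f"{prefix} {claim}."

# Light bulb example: the claim has no equality and H0 was not rejected.
sentence = final_wording(
    "the mean life of light bulbs is less than 1000 hours",
    claim_contains_equality=False,
    reject_h0=False,
)
print(sentence)
```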

Homework

7-2: 1-35 every other odd

Every odd recommended.
