
Upload: josue-madison

Post on 16-Dec-2015


Hypothesis testing

Is it statistically significant?

Most of the slides are adapted from: http://wweb.uta.edu/insyopma/baker/STATISTICS/Keller7/Keller%20PP%20slides-7/Chapter11.ppt


Example

• In a trial, a jury must decide between two hypotheses.
– Null hypothesis H0: The defendant is innocent.
– Alternative hypothesis H1: The defendant is guilty.

• The jury must make a decision on the basis of evidence presented.

• In the language of statistics, convicting the defendant is called rejecting the null hypothesis in favor of the alternative hypothesis. That is, the jury is saying that there is enough evidence to conclude that the defendant is guilty.

• If the jury acquits, then it is stating that there is not enough evidence to support the alternative hypothesis.


Goal

• Assess the evidence that the data provide against some claim (the null hypothesis) about the population.

• The procedure begins with the assumption that the null hypothesis is true.

• The goal is to determine whether there is enough evidence to infer that the alternative hypothesis is true, or the null is not likely to be true.

• Results of the test are expressed in terms of a probability.
– This probability measures how well the data and the hypothesis agree.


Example usage in our work

• Usually we need to find out whether the performance of our proposed algorithm is statistically superior to that of a baseline algorithm.

• General procedure
– For i = 1:n
  • Generate a random train/test dataset.
  • Run both algorithms and get performance numbers.
  • Record the difference between the performance of the proposed and baseline algorithms. Let it be xi.
– Null hypothesis H0: μx <= 0, where μx is the mean of the differences, i.e. the baseline is at least as good as the proposed algorithm.
– Alternative hypothesis H1: μx > 0, i.e. our algorithm is better.
• We want to reject H0 in favor of H1 at a chosen significance level (e.g. α = 0.05).
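The procedure above can be sketched roughly as follows. This is only an illustration under stated assumptions: `run_proposed` and `run_baseline` are hypothetical stand-ins that return simulated accuracy numbers (the means and noise levels are made up), and the paired t statistic is one common way to summarize the recorded differences.

```python
import math
import random
import statistics

def run_proposed(rng):
    # Hypothetical stand-in: accuracy of the proposed algorithm on one random split.
    return rng.gauss(0.85, 0.02)

def run_baseline(rng):
    # Hypothetical stand-in: accuracy of the baseline algorithm on the same split.
    return rng.gauss(0.80, 0.02)

def collect_differences(n, seed=0):
    """Repeat the experiment n times and record x = proposed - baseline."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n):
        # In a real experiment both algorithms would be trained and tested
        # on the same freshly generated train/test split.
        diffs.append(run_proposed(rng) - run_baseline(rng))
    return diffs

def paired_t_statistic(diffs):
    """t statistic for H0: mean difference <= 0 versus H1: mean difference > 0."""
    n = len(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_err

diffs = collect_differences(n=30)
t_stat = paired_t_statistic(diffs)
print(f"mean difference = {statistics.mean(diffs):.4f}, t = {t_stat:.2f}")
```

A large positive t is evidence against H0; a value near zero or negative gives no support for the claim that the proposed algorithm is better.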


Statistical Example

• Consider mean demand for computers during assembly lead time. The operations manager wants to know whether the mean is different from 350 units.

• Our test statistic is the sample mean demand x̄.
• Null hypothesis: H0: μ = 350
• Alternative hypothesis: H1: μ ≠ 350
• Assume σ = 75 and the sample size n = 25, and the sample mean is calculated to be x̄ = 370.16.
• A large value of x̄ (say, 600) provides enough evidence.
• If x̄ is close to 350 (say, 355), then this does not provide a great deal of evidence to infer that the population mean is different from 350.


Statistical Example

• Assume H0 : μ = 350 is true.
• Determine the sampling distribution of the sample mean x̄.
• Assume x̄ has mean 350.
• The standard deviation (standard error) of x̄ is σ/√n = 75/sqrt(25) = 15.

Is the Sample Mean in the Guts of the Sampling Distribution?


Hypothesis Test 1: Unstandardized test

• Let the guts be the central 95% of the distribution [α = 0.05].
• The critical values that define the guts lie 1.96 standard deviations of x̄ on either side of the hypothesized mean.

UCV = 350 + 1.96*15 = 379.4
LCV = 350 – 1.96*15 = 320.6
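As a quick check of the critical values above, here is a minimal sketch using Python's standard library. The function name `critical_values` is my own; the numbers (μ0 = 350, σ = 75, n = 25, x̄ = 370.16) come from the example.

```python
import math
from statistics import NormalDist

def critical_values(mu0, sigma, n, alpha=0.05):
    """Lower and upper critical values of the sample mean for a two-tailed z-test."""
    std_err = sigma / math.sqrt(n)            # standard deviation of x-bar
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    return mu0 - z * std_err, mu0 + z * std_err

lcv, ucv = critical_values(mu0=350, sigma=75, n=25)
x_bar = 370.16
in_guts = lcv < x_bar < ucv   # True means: fail to reject H0
print(f"LCV = {lcv:.1f}, UCV = {ucv:.1f}, x-bar in guts: {in_guts}")
```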


Hypothesis Test 2: Z-score

• Calculate the z-score of the sample mean.
• Z = (x̄ – μ)/σx̄ = (370.16 – 350)/15 = 1.344
• Is this z-score in the guts of the sampling distribution?


Hypothesis Test 3: p-value

• Increase the “rejection region” until it “captures” the sample mean.
– P(x̄ > 370.16) = P(Z > 1.344) = 0.0901
– p-value = double this area for a two-tailed test = 0.1802
• Since the rejection region is defined to be 5% and our sample mean is in the 18.02% region, it is not in the rejection region.
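The z-score and two-tailed p-value can be reproduced with a short sketch like the one below (the function name is my own). Computing the normal CDF directly gives a p-value of roughly 0.179, slightly below the 0.1802 quoted above, which matches a table lookup at z rounded to 1.34; the conclusion is the same either way.

```python
import math
from statistics import NormalDist

def z_test_two_tailed(x_bar, mu0, sigma, n):
    """Return (z, p_value) for H0: mu == mu0 against a two-sided alternative."""
    std_err = sigma / math.sqrt(n)
    z = (x_bar - mu0) / std_err
    upper_tail = 1 - NormalDist().cdf(abs(z))   # area beyond |z| in one tail
    return z, 2 * upper_tail                    # doubled for a two-tailed test

z, p_value = z_test_two_tailed(x_bar=370.16, mu0=350, sigma=75, n=25)
reject = p_value < 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```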


Statistical Conclusions

• Unstandardized Test Statistic
– Since LCV (320.6) < x̄ (370.16) < UCV (379.4), we fail to reject the null hypothesis at a 5% level of significance.
• Standardized Test Statistic
– Since -Zα/2 (-1.96) < Z (1.344) < Zα/2 (1.96), we fail to reject the null hypothesis at a 5% level of significance.
• P-value
– Since p-value (0.1802) > 0.05 [α], we fail to reject the null hypothesis at a 5% level of significance.


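Interpreting the p-value

• p-value between 0 and .01: Overwhelming Evidence (Highly Significant)
• p-value between .01 and .05: Strong Evidence (Significant)
• p-value between .05 and .10: Weak Evidence (Not Significant)
• p-value above .10: No Evidence (Not Significant)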


Assumptions

• The test statistic follows a normal distribution.
– Generally one appeals to the central limit theorem to justify assuming that a test statistic varies normally.
– Central limit theorem (CLT): the mean (x̄) of a sufficiently large number of independent random variables (x), each with finite mean and variance, will be approximately normally distributed.
– If the variation of the test statistic is strongly non-normal, a Z-test should not be used.
• σ should be known or estimated reliably.
– The standard deviation of x̄ is σ/√n when σ is the same for each x and all x are independent of each other. [Follows from var(c·x) = c² var(x) and var(x1 + x2) = var(x1) + var(x2) if x1 and x2 are independent.]
• A Z-test is appropriate when you are handling moderate to large samples (n > 30).
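The two variance rules quoted in the bracket give the standard deviation of the sample mean in a few steps. The derivation below is a sketch assuming n independent observations x_1, ..., x_n that share the same variance σ²:

```latex
\begin{aligned}
\operatorname{Var}(\bar{x})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
   = \frac{1}{n^2}\operatorname{Var}\!\left(\sum_{i=1}^{n} x_i\right) \\
  &= \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(x_i)
   = \frac{n\sigma^2}{n^2}
   = \frac{\sigma^2}{n},
\qquad\text{so}\qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.
\end{aligned}
```

With σ = 75 and n = 25 this gives 75/5 = 15, the value used in the demand example.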


Hypothesis Test 4: Student’s t-test

• The test statistic t = (x̄ – μ)/(s/√n) follows a Student’s t-distribution with n – 1 degrees of freedom if the null hypothesis is true.
• The t-test is appropriate for small sample sizes (n < 30).
• Useful when the population’s standard deviation is unknown.
• s = sample standard deviation.
• Once a t value is determined, a p-value can be found using a table of values from Student's t-distribution.
• The Student t-distribution is symmetric and bell-shaped, like the normal distribution, but has heavier tails.
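A one-sample t statistic can be computed along these lines. This is a sketch assuming the standard form t = (x̄ − μ0)/(s/√n) with n − 1 degrees of freedom; the demand figures are made up for illustration, and the critical value 2.262 (two-tailed, α = 0.05, 9 degrees of freedom) is taken from a standard t table rather than computed.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Made-up demand figures, for illustration only.
demand = [355, 372, 341, 390, 366, 358, 380, 349, 375, 362]
t, df = one_sample_t(demand, mu0=350)

# Two-tailed critical value for alpha = 0.05 and 9 degrees of freedom,
# looked up in a standard t table.
T_CRIT_DF9 = 2.262
reject = abs(t) > T_CRIT_DF9
print(f"t = {t:.3f}, df = {df}, reject H0: {reject}")
```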


Student’s t-test

• A two-sample t-test can be used to compare the means of two independent samples, each consisting of independent and identically distributed observations.
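One common form of the two-sample test is the pooled-variance (equal-variance) version, sketched below. This is an assumption on my part, since the slide does not say which variant is meant; the function name is also my own.

```python
import math
import statistics

def pooled_two_sample_t(a, b):
    """Return (t, degrees of freedom) for H0: both samples have the same mean.

    Assumes the two populations share a common variance.
    """
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / std_err
    return t, n1 + n2 - 2

t, df = pooled_two_sample_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"t = {t:.3f}, df = {df}")
```

The resulting t is compared against a t table with n1 + n2 − 2 degrees of freedom, in the same way as in the one-sample case.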


Hypothesis Test 5: Wilcoxon Signed-Rank Test

• A non-parametric statistical hypothesis test.
• Alternative to the paired Student's t-test when the population cannot be assumed to be normally distributed.
• H0 : μ = 0
• Rank the xi by their magnitude |xi|. Let their ranks be Ri.
• The Wilcoxon signed-rank statistic uses the indicator φi = I(xi > 0).
• Score S, computed from the φi and Ri.
• Find the critical value for the given sample size n and the desired confidence level.
• Compare S to the critical value, and reject H0 if S is less than or equal to the critical value.
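The steps above can be sketched as follows. The score formula is not legible in the transcript, so this sketch assumes the common convention S = min(W+, W−), where W+ is the sum of the ranks of positive differences and W− the sum for negative ones; it also drops zero differences and gives tied magnitudes their average rank, which are conventional choices rather than something the slide states.

```python
def wilcoxon_signed_rank(diffs):
    """Return (W+, W-, S) for paired differences, with S = min(W+, W-)."""
    nonzero = [d for d in diffs if d != 0]   # zero differences are dropped
    ordered = sorted(nonzero, key=abs)       # rank by magnitude
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied magnitudes starting at position i.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1           # ranks are 1-based; ties share the average
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)

def reject_h0(diffs, critical_value):
    """Reject H0 when S is less than or equal to the tabulated critical value."""
    return wilcoxon_signed_rank(diffs)[2] <= critical_value

print(wilcoxon_signed_rank([1, -2, 3, 4, 5]))
```

The critical value has to come from a Wilcoxon signed-rank table; for example, standard tables list 8 for n = 10 non-zero differences in a two-sided test at the 5% level.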


Thank you


Two types of Errors

• Type I error
– When we reject a true null hypothesis. That is, the jury convicts an innocent person.
– P(Type I error) = α [usually 0.05 or 0.01]
• Type II error
– When we don’t reject a false null hypothesis. That is, when a guilty defendant is acquitted.
– P(Type II error) = β

                     H0 true         H0 false
Reject H0            Type I error    Correct
Do not reject H0     Correct         Type II error

• The two probabilities are inversely related. Decreasing one increases the other, for a fixed sample size.
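The trade-off between α and β can be illustrated numerically for the two-tailed z-test from the demand example. In this sketch the true mean of 370 is a hypothetical value chosen only to make H0 false; the function name is also my own.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """beta = P(fail to reject H0 | true mean is mu_true) for a two-tailed z-test."""
    std_err = sigma / math.sqrt(n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lcv, ucv = mu0 - z * std_err, mu0 + z * std_err
    sampling = NormalDist(mu_true, std_err)   # distribution of x-bar under the true mean
    return sampling.cdf(ucv) - sampling.cdf(lcv)

# Hypothetical true mean of 370; sigma and n as in the demand example.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=350, mu_true=370, sigma=75, n=25)
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

With the sample size held fixed, shrinking α widens the non-rejection region, so β grows.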