hypothesis test

Upload: musa-aman

Post on 04-Nov-2015


Douglas A. Lind, William G. Marchal and Samuel A. Wathen. Statistical Techniques in Business and Economics, Chapters 10, 11

Hypothesis Test

What is a Hypothesis?
A Hypothesis is a statement about the value of a population parameter developed for the purpose of testing. Examples of hypotheses made about a population parameter are: The mean monthly income for systems analysts is $3,625.

What is Hypothesis Testing?
Hypothesis testing is a procedure, based on sample evidence and probability theory, used to determine whether the hypothesis is a reasonable statement and should not be rejected, or is unreasonable and should be rejected.

Hypothesis Testing Steps

Important Things to Remember about H0 and H1
H0: null hypothesis and H1: alternate hypothesis
H0 and H1 are mutually exclusive and collectively exhaustive
H0 is always presumed to be true
H1 has the burden of proof
A random sample (n) is used to reject H0
If we conclude 'do not reject H0', this does not necessarily mean that the null hypothesis is true; it only suggests that there is not sufficient evidence to reject H0. Rejecting the null hypothesis, then, suggests that the alternative hypothesis may be true.
Equality is always part of H0 (e.g. "=", "≥", "≤")
"≠", "<" and ">" are always part of H1

How to Set Up a Claim as Hypothesis
In actual practice, the status quo is set up as H0
If the claim is boastful, the claim is set up as H1. Remember, H1 has the burden of proof
In problem solving, look for key words and convert them into symbols. Some key words include: improved, better than, as effective as, different from, has changed, etc.

Parts of a Distribution in Hypothesis Testing

One-tail vs. Two-tail Test

Hypothesis Setups for Testing a Mean (μ)

Hypothesis Setups for Testing a Proportion (π)

Example
Jamestown Steel Company manufactures and assembles desks and other office equipment at several plants in western New York State. The weekly production of the Model A325 desk at the Fredonia Plant follows the normal probability distribution with a mean of 200 and a standard deviation of 16. Recently, because of market expansion, new production methods have been introduced and new employees hired. The vice president of manufacturing would like to investigate whether there has been a change in the weekly production of the Model A325 desk.

Problem
Step 1: State the null hypothesis and the alternate hypothesis.
H0: μ = 200
H1: μ ≠ 200
(note: the keyword in the problem is "has changed")

Step 2: Select the level of significance.
α = 0.01 as stated in the problem

Step 3: Select the test statistic.
Use the Z-distribution since σ is known

Step 4: Formulate the decision rule.
Reject H0 if |Z| > Zα/2

Step 5: Make a decision and interpret the result.
Because 1.55 does not fall in the rejection region, H0 is not rejected. We cannot conclude that the population mean is different from 200. So we would report to the vice president of manufacturing that the sample evidence does not show that the production rate at the Fredonia Plant has changed from 200 per week.
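The decision rule in Steps 4 and 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the sample mean and sample size behind Z = 1.55 are not given in the text, and so the reported Z value is used directly.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def two_tailed_critical_value(alpha):
    """Return z_(alpha/2), the cutoff for a two-tailed z test."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def reject_two_tailed(z, alpha):
    """Decision rule from Step 4: reject H0 if |Z| > z_(alpha/2)."""
    return abs(z) > two_tailed_critical_value(alpha)


z = 1.55      # computed Z reported in Step 5
alpha = 0.01  # significance level from Step 2
critical = two_tailed_critical_value(alpha)
decision = reject_two_tailed(z, alpha)
print(f"critical value = {critical:.3f}, reject H0: {decision}")
# prints: critical value = 2.576, reject H0: False
```

Because |1.55| is smaller than 2.576, the sketch reaches the same decision as Step 5: H0 is not rejected.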

Type of Errors in Hypothesis Testing
Type I Error: rejecting the null hypothesis when it is actually true. The probability of a Type I error is denoted by the Greek letter α, also known as the significance level of a test.

Type II Error: accepting the null hypothesis when it is actually false. The probability of a Type II error is denoted by the Greek letter β.

Test Error

p-Value in Hypothesis Testing
The p-value is the probability of observing a sample value as extreme as, or more extreme than, the value observed, given that the null hypothesis is true.

In testing a hypothesis, we can also compare the p-value with the significance level (α). If the p-value < significance level, H0 is rejected; otherwise H0 is not rejected.

p-Value in Hypothesis Testing - Example
Recall the last problem where the hypothesis and decision rules were set up as:
H0: μ ≤ 200
H1: μ > 200
Reject H0 if Z > Zα, where Z = 1.55 and Zα = 2.33

Reject H0 if p-value < α
0.0606 is not < 0.01

Conclude: Fail to reject H0
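The p-value of 0.0606 quoted above can be reproduced with the standard normal distribution. The sketch below assumes a one-tailed (upper-tail) test, as in the setup H1: μ > 200; the function name is hypothetical.

```python
from statistics import NormalDist


def upper_tail_p_value(z):
    """P(Z >= z) under the standard normal distribution."""
    return 1 - NormalDist().cdf(z)


p_value = upper_tail_p_value(1.55)  # Z = 1.55 from the example
alpha = 0.01                        # significance level
reject = p_value < alpha            # reject H0 only if p-value < alpha
print(round(p_value, 4), reject)
# prints: 0.0606 False
```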

What does it mean when p-value < α?
(a) .10, we have some evidence that H0 is not true.

(b) .05, we have strong evidence that H0 is not true.

(c) .01, we have very strong evidence that H0 is not true.

(d) .001, we have extremely strong evidence that H0 is not true.
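As a small illustration, the scale above can be written as a lookup function. The function name is hypothetical, and returning None for p-values of .10 or more is an assumption, since the scale gives no label for that range.

```python
def evidence_against_h0(p_value):
    """Map a p-value to the strength-of-evidence label from the scale above.

    Returns None when p_value >= 0.10, a range the scale does not cover
    (an assumption of this sketch).
    """
    if p_value < 0.001:
        return "extremely strong"
    if p_value < 0.01:
        return "very strong"
    if p_value < 0.05:
        return "strong"
    if p_value < 0.10:
        return "some"
    return None


label = evidence_against_h0(0.0606)  # p-value from the earlier example
print(label)
# prints: some
```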

Testing for the Population Mean: Population Standard Deviation Unknown
When the population standard deviation (σ) is unknown, the sample standard deviation (s) is used in its place.
The test statistic follows the t-distribution with n − 1 degrees of freedom and is computed using the formula:
t = (x̄ − μ) / (s / √n)

Testing for the Population Mean: Population Standard Deviation Unknown - Example
The McFarland Insurance Company Claims Department reports the mean cost to process a claim is $60. An industry comparison showed this amount to be larger than most other insurance companies, so the company instituted cost-cutting measures. To evaluate the effect of the cost-cutting measures, the Supervisor of the Claims Department selected a random sample of 26 claims processed last month. The sample information is reported below. At the .01 significance level, is it reasonable to conclude that the mean cost to process a claim is now less than $60?

Testing for a Population Mean with an Unknown Population Standard Deviation - Example
Step 1: State the null hypothesis and the alternate hypothesis.
H0: μ ≥ $60
H1: μ < $60
(note: the keyword in the problem is "now less than")

Step 2: Select the level of significance.
α = 0.01 as stated in the problem

Step 3: Select the test statistic.
Use the t-distribution since σ is unknown

Step 4: Formulate the decision rule.
Reject H0 if t < −tα,n−1

Step 5: Make a decision and interpret the result.Because -1.818 does not fall in the rejection region, H0 is not rejected at the .01 significance level. We have not demonstrated that the cost-cutting measures reduced the mean cost per claim to less than $60. The difference of $3.58 ($56.42 - $60) between the sample mean and the population mean could be due to sampling error.

Testing for a Population Mean with an Unknown Population Standard Deviation - Example
The current rate for producing 5 amp fuses at Neary Electric Co. is 250 per hour. A new machine has been purchased and installed that, according to the supplier, will increase the production rate. A sample of 10 randomly selected hours from last month revealed the mean hourly production on the new machine was 256 units, with a sample standard deviation of 6 per hour.

At the .05 significance level can Neary conclude that the new machine is faster?

Testing for a Population Mean with an Unknown Population Standard Deviation - Example continued
Step 1: State the null and the alternate hypothesis.
H0: μ ≤ 250; H1: μ > 250

Step 2: Select the level of significance. It is .05.

Step 3: Find a test statistic. Use the t distribution because the population standard deviation is not known and the sample size is less than 30.

Step 4: State the decision rule. There are 10 − 1 = 9 degrees of freedom. The null hypothesis is rejected if t > 1.833.

Step 5: Make a decision and interpret the results. The null hypothesis is rejected. The mean number produced is more than 250 per hour.
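The computation behind this decision is not shown in the transcript, so here is a minimal sketch using the figures given for Neary Electric (mean 256, s = 6, n = 10, hypothesized mean 250). The function name is hypothetical, and the critical value 1.833 is taken from Step 4 rather than computed.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / sqrt(n))


t = t_statistic(sample_mean=256, mu0=250, s=6, n=10)
t_critical = 1.833       # one-tailed, alpha = .05, 9 degrees of freedom (Step 4)
reject = t > t_critical  # upper-tailed rule: reject H0 if t > 1.833
print(f"t = {t:.3f}, reject H0: {reject}")
# prints: t = 3.162, reject H0: True
```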

Tests Concerning Proportion
A Proportion is the fraction or percentage that indicates the part of the population or sample having a particular trait of interest.
The sample proportion is denoted by p and is found by x/n.
The test statistic is computed as follows:
z = (p − π) / √(π(1 − π) / n)

Assumptions in Testing a Population Proportion using the z-Distribution
A random sample is chosen from the population.
It is assumed that the binomial assumptions discussed in Chapter 6 are met: (1) the sample data collected are the result of counts; (2) the outcome of an experiment is classified into one of two mutually exclusive categories, a success or a failure; (3) the probability of a success is the same for each trial; and (4) the trials are independent.
The test we will conduct shortly is appropriate when both nπ and n(1 − π) are at least 5.
When the above conditions are met, the normal distribution can be used as an approximation to the binomial distribution.

Test Statistic for Testing a Single Population Proportion
In this statistic, p is the sample proportion, π is the hypothesized population proportion, and n is the sample size.

Test Statistic for Testing a Single Population Proportion - Example
Suppose prior elections in a certain state indicated it is necessary for a candidate for governor to receive at least 80 percent of the vote in the northern section of the state to be elected. The incumbent governor is interested in assessing his chances of returning to office and plans to conduct a survey of 2,000 registered voters in the northern section of the state. Using the hypothesis-testing procedure, assess the governor's chances of reelection.

Test Statistic for Testing a Single Population Proportion - Example
Step 1: State the null hypothesis and the alternate hypothesis.
H0: π ≥ .80
H1: π < .80
(note: the keyword in the problem is "at least")

Step 2: Select the level of significance.
α = 0.01 as stated in the problem

Step 3: Select the test statistic.
Use the Z-distribution since the assumptions are met and nπ and n(1 − π) are both ≥ 5
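The check in Step 3 and the z statistic for a proportion can be sketched as follows. The survey result is not given in the transcript, so the count of supporters below is purely illustrative and labeled hypothetical; the function names are also hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def normal_approximation_ok(n, pi0):
    """Check that n*pi and n*(1 - pi) are both at least 5."""
    return n * pi0 >= 5 and n * (1 - pi0) >= 5


def proportion_z(x, n, pi0):
    """z = (p - pi) / sqrt(pi * (1 - pi) / n), with p = x / n."""
    p = x / n
    return (p - pi0) / sqrt(pi0 * (1 - pi0) / n)


n, pi0, alpha = 2000, 0.80, 0.01               # values from the example
assumptions_met = normal_approximation_ok(n, pi0)
lower_critical = NormalDist().inv_cdf(alpha)   # -z_alpha for a lower-tailed test

hypothetical_x = 1550                          # HYPOTHETICAL count, not from the text
z = proportion_z(hypothetical_x, n, pi0)
reject = z < lower_critical                    # lower-tailed rule for H1: pi < .80
print(assumptions_met, round(lower_critical, 3), round(z, 3), reject)
```

With the real survey count in place of the hypothetical one, the same comparison of z against the lower critical value gives the decision for the governor's example.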

Testing for a Population Proportion - Example
Step 4: Formulate the decision rule.
Reject H0 if Z < −Zα

Null Hypothesis H0: (μ1 − μ2) = D [or μ1 = μ2]
Alternate Hypothesis H1: (μ1 − μ2) > D or (μ1 − μ2) < D [μ1 > μ2 or μ1 < μ2] (one-tailed test)
Alternate Hypothesis H1: (μ1 − μ2) ≠ D [μ1 ≠ μ2] (two-tailed test)
Test Statistics:
z = ((x̄1 − x̄2) − D) / √(σ1²/n1 + σ2²/n2)
(replace σ with s when σ is not available)
Rejection Region:
z > zα or z < −zα (one-tailed test)
z > zα/2 or z < −zα/2 (two-tailed test)

Mary Jo Fitzpatric is the Vice President for Nursing Services at St. Luke's Memorial Hospital. Recently she noticed in the job postings for nurses that those that are unionised seem to offer higher wages. She decided to investigate and gathered the following sample information.

Group      Mean Wage   Sample Standard Deviation   Sample Size
Union      $20.75      $2.25                       40
Nonunion   $19.80      $1.90                       45

Would it be reasonable for her to conclude that there is a significant difference in earnings between union and non-union nurses? Use the .01 significance level.

Example
A manpower-development statistician is asked to determine whether the hourly wages of semiskilled workers are the same in two cities. The results of the survey are presented in the following table:

Hourly Wage Rate

Solution
Step 1: This is a two-tailed test. The hypothesis is stated below. The significance level is 0.05 (given).
H0: μ1 = μ2 versus H1: μ1 ≠ μ2

Step 2: Since this is a test of the means and the degrees of freedom (n1 + n2 − 2) is in excess of 30, a z test is appropriate. The critical values are ±1.96 (from a z table).

Step 3:

Solution
Step 4: Sketch the distribution, locate the critical values and the test statistic.
Step 5: Decide! Since the test statistic value lies within the rejection region, there is sufficient statistical evidence based on this sample to reject H0.
The hypothesized difference between the parameters does not have to be zero; it can be non-zero. For example: H0: μ1 − μ2 ≤ 0.10 versus H1: μ1 − μ2 > 0.10
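A minimal sketch of the two-sample z statistic, allowing a non-zero hypothesized difference D, is given below. It uses the union and nonunion wage figures from the nursing example given earlier, with sample standard deviations standing in for σ. The function name and the parameter d0 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """z = ((mean1 - mean2) - D) / sqrt(s1^2/n1 + s2^2/n2)."""
    standard_error = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((mean1 - mean2) - d0) / standard_error


# Union vs. nonunion nurses, two-tailed test at the .01 level
z = two_sample_z(20.75, 2.25, 40, 19.80, 1.90, 45)
critical = NormalDist().inv_cdf(1 - 0.01 / 2)  # z_(alpha/2)
reject = abs(z) > critical
print(f"z = {z:.3f}, critical = {critical:.3f}, reject H0: {reject}")
# prints: z = 2.089, critical = 2.576, reject H0: False
```

For a one-tailed test with a non-zero difference, pass the hypothesized difference as d0 and compare z with zα instead of zα/2.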

Problem: 9-3
Two research laboratories have independently produced drugs that provide relief to arthritis sufferers. The first drug was tested on a group of 90 arthritis sufferers and produced an average of 8.5 hours of relief, and a sample standard deviation of 1.8 hours. The second drug was tested on 80 arthritis sufferers, producing an average of 7.9 hours of relief, and a sample standard deviation of 2.1 hours. At the 0.05 level of significance, does the second drug provide a significantly shorter period of relief?

Problem: 9-6
Notwithstanding the Equal Pay Act of 1963, in 1993 it still appeared that men earned more than women in similar jobs. A random sample of 38 male machine-tool operators found a mean hourly wage of $11.38, and the sample standard deviation was $1.84. A random sample of 45 female machine-tool operators found their mean wage to be $8.42, and the sample standard deviation was $1.31. On the basis of these samples, is it reasonable to conclude (at α = 0.01) that the male operators are earning over $2.00 more per hour than the female operators?

Testing Hypothesis about Difference between Two Population Proportions
Null Hypothesis H0: (p1 − p2) = D [or p1 = p2]
Alternate Hypothesis H1: (p1 − p2) > D or (p1 − p2) < D [p1 > p2 or p1 < p2] (one-tailed test)
Alternate Hypothesis H1: (p1 − p2) ≠ D [p1 ≠ p2] (two-tailed test)
Test Statistics:

Rejection Region:
z > zα or z < −zα (one-tailed test)
z > zα/2 or z < −zα/2 (two-tailed test)

Testing Hypothesis about Difference between Two Population Proportions: Pooled

Example:
According to a report by the American Cancer Society, more men than women smoke and twice as many smokers die prematurely as nonsmokers. In random samples of 200 males and 200 females, 62 of the males and 54 of the females were smokers. Is there sufficient evidence to conclude that the proportion of male smokers is higher than the proportion of female smokers when α = .01?

A financial analyst wants to compare the turnover rates, in percent, for shares of oil-related stocks versus other stocks. She selected 32 oil-related stocks and 49 other stocks. The mean turnover of oil-related stocks is 31.4 percent and the standard deviation 5.1 percent. For the other stocks, the mean rate was computed to be 34.9 percent and the standard deviation 6.7 percent. Is there a significant difference in the turnover rates of the two types of stock?
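The pooled test statistic itself did not survive in this transcript. The sketch below uses the standard textbook pooled form, which is an assumption about what the slide showed, and applies it to the smoking example above (62 of 200 males, 54 of 200 females). The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pooled_two_proportion_z(x1, n1, x2, n2):
    """z for H0: p1 = p2, using the pooled sample proportion.

    pooled = (x1 + x2) / (n1 + n2)
    z = (x1/n1 - x2/n2) / sqrt(pooled * (1 - pooled) * (1/n1 + 1/n2))
    """
    prop1, prop2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (prop1 - prop2) / standard_error


z = pooled_two_proportion_z(62, 200, 54, 200)
critical = NormalDist().inv_cdf(1 - 0.01)  # one-tailed z_alpha at alpha = .01
reject = z > critical                      # H1: male proportion is higher
print(f"z = {z:.3f}, critical = {critical:.3f}, reject H0: {reject}")
# prints: z = 0.882, critical = 2.326, reject H0: False
```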

Problem: 9-22
A coal-fired power plant is considering two different systems for pollution abatement. The first system has reduced the emission of pollutants to acceptable levels 68 percent of the time, as determined from 200 air samples. The second, more expensive system has reduced the emission of pollutants to acceptable levels 76 percent of the time, as determined from 250 air samples. If the expensive system is significantly more effective than the inexpensive system in reducing pollutants to acceptable levels, then the management of the power plant will install the expensive system. Which system will be installed if management uses a significance level of 0.02 in making its decision?

Problem: 9-23
A group of clinical physicians is performing tests on patients to determine the effectiveness of a new antihypertensive drug. Patients with high blood pressure were randomly chosen and then randomly assigned to either the control group (which received a well-established antihypertensive) or the treatment group (which received the new drug). The doctors noted the percentage of patients whose blood pressure was reduced to a normal level within 1 year. At the 0.01 level of significance, test appropriate hypotheses to determine whether the new drug is significantly more effective than the older drug in reducing high blood pressure.

Group       Proportion That Improved   Number of Patients
Treatment   0.45                       120
Control     0.36                       150

Test for Difference between Means: Small Sample Size
For small sample sizes, we must compute a 'pooled' estimate (a.k.a. a weighted average) of the variances for the two groups. This estimate is:
sp² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)

and then, the estimated standard error is:
sp √(1/n1 + 1/n2)

Tests for Difference Between Two Means: Small Sample Size

Example:
A company wishes to test whether the sensitivity achieved by a new program is significantly higher than that achieved under the legacy program. The following information is available from test results.

Sensitivity
           Mean   Standard Deviation   Sample Size
Proposed   92     15                   12
P.M.O.     84     19                   15

Solution
Step 1: This is a one-tailed test. The hypothesis is stated below. The significance level is 0.05 (given).
H0: μ1 ≤ μ2 versus H1: μ1 > μ2

Step 2: Since this is a test of the means and neither n1 nor n2 is in excess of 30, a t test is appropriate. The critical value is 1.708 (from a t table with 25 degrees of freedom).

Step 3:
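The Step 3 computation is missing from the transcript. A minimal sketch, assuming the usual pooled-variance t statistic for two small independent samples and using the Proposed and P.M.O. figures from the example, is shown below. The function names are hypothetical, and the critical value 1.708 is taken from Step 2 rather than computed.

```python
from math import sqrt


def pooled_variance(s1, n1, s2, n2):
    """Weighted average of the two sample variances."""
    return ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """t = (mean1 - mean2) / sqrt(sp^2 * (1/n1 + 1/n2)), df = n1 + n2 - 2."""
    sp2 = pooled_variance(s1, n1, s2, n2)
    standard_error = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


# Proposed: mean 92, s = 15, n = 12; P.M.O.: mean 84, s = 19, n = 15
t = pooled_t(92, 15, 12, 84, 19, 15)
t_critical = 1.708       # one-tailed, alpha = .05, 25 degrees of freedom (Step 2)
reject = t > t_critical
print(f"t = {t:.3f}, reject H0: {reject}")
# prints: t = 1.190, reject H0: False
```

Under these assumptions the statistic falls short of 1.708, so H0 is not rejected.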

Solution
Step 4: Sketch the distribution, locate the critical values and the test statistic.
Step 5: Decide! Since the test statistic value lies within the retention region, there is not sufficient statistical evidence based on this sample to reject H0.

Problem: 9-8
A credit-insurance organization has developed a new high-tech method of training new sales personnel. The company sampled 16 employees who were trained the original way and found average daily sales to be $688 and the sample standard deviation was $32.63. They also sampled 11 employees who were trained using the new method and found average daily sales to be $706 and the sample standard deviation was $24.84. At α = 0.05, can the company conclude that average daily sales have increased under the new plan?

Problem: 9-9
A large stock-brokerage firm wants to determine how successful its new account executives have been at recruiting clients. After completing their training, new account execs spend several weeks calling prospective clients, trying to get the prospects to open accounts with the firm. The following data give the numbers of new accounts opened in their first 2 weeks by 10 randomly chosen female account execs and by 8 randomly chosen male account execs. At α = 0.05, does it appear that the women are more effective at generating new accounts than the men are?