Hypothesis Testing

Upload: melvyn

Post on 18-Mar-2016


Page 1: Hypothesis Testing

Hypothesis Testing

Page 2: Hypothesis Testing

Example:

• Test the performance of two lists in terms of response rates

• Sample (1,000) from the first list provides a response rate of 3.5%

• Sample (1,200) from the second list provides a response rate of 4.5%

• Do the two lists (population) really have a difference or is it an artifact of the sample?

Page 3: Hypothesis Testing

The Logic of Hypothesis Testing

• When you want to make statements about a population, you usually draw samples

• How generalizable is your sample-based finding?

• Evidence has to be evaluated statistically before arriving at a conclusion regarding the hypothesis

• How much weight the evidence carries depends on the sample size: a finding from a sample with more observations is more reliable than one from a sample with fewer

Page 4: Hypothesis Testing

Steps in Hypothesis Testing

1) Problem definition

2) Clearly state the null and alternate hypotheses

3) Choose the relevant test and the appropriate probability distribution

4) Choose the critical value
   - Determine the significance level
   - Determine the degrees of freedom
   - Decide if one- or two-tailed test

5) Compute the relevant test statistic

6) Compare the test statistic and the critical value: does the test statistic fall in the critical region?
   - Yes: reject null
   - No: do not reject null

Page 5: Hypothesis Testing

Basic Concepts of Hypothesis Testing

• The null and alternate hypotheses

• Choosing the relevant statistical test and appropriate probability distribution. Depends on
   - Size of the sample
   - Whether the population standard deviation is known or not

• Choosing the critical value. The three criteria used are
   - Significance level
   - Degrees of freedom
   - One- or two-tailed test

Page 6: Hypothesis Testing

Significance Level

• Indicates the percentage of sample means that is outside the cut-off limits (critical value)

• The higher the significance level ($\alpha$) used for testing a hypothesis, the higher the probability of rejecting a null hypothesis when it is true (Type I error)

• Accepting a null hypothesis when it is false is called a Type II error and its probability is ($\beta$)

Page 7: Hypothesis Testing

Significance Level (Contd.)

• When choosing a level of significance, there is an inherent tradeoff between these two types of errors

• Power of hypothesis test ($1 - \beta$)
   - A good test of hypothesis ought to reject a null hypothesis when it is false
   - $1 - \beta$ should be as high a value as possible
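The tradeoff between the two error types can be illustrated numerically. The sketch below is not from the slides: it assumes a lower-tailed z-test with made-up values (hypothesized mean 5000, true mean 4900, population standard deviation 250, n = 50), and the helper name `type_ii_error` is hypothetical. It shows that tightening the significance level raises the Type II error probability and so lowers the power.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """Beta for a lower-tailed z-test of Ho: mu >= mu0 when the true mean is mu_true."""
    se = sigma / sqrt(n)
    # Ho is rejected when the sample mean falls below this cut-off.
    cutoff = mu0 - std_normal.inv_cdf(1 - alpha) * se
    # Beta = P(sample mean >= cutoff | mu = mu_true), i.e. failing to reject a false Ho.
    return 1 - NormalDist(mu_true, se).cdf(cutoff)


# Illustrative (assumed) numbers only.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=5000, mu_true=4900, sigma=250, n=50)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

With the sample size held fixed, the only way to reduce both error probabilities at once is to collect more observations.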

Page 8: Hypothesis Testing

Degree of Freedom

• The number or bits of "free" or unconstrained data used in calculating a sample statistic or test statistic

• A sample mean ($\bar{X}$) has n degrees of freedom

• A sample variance ($s^2$) has (n - 1) degrees of freedom

Page 9: Hypothesis Testing

One or Two-tail Test

• One-tailed hypothesis test
   - Determines whether a particular population parameter is larger or smaller than some predefined value
   - Uses one critical value of the test statistic

• Two-tailed hypothesis test
   - Determines the likelihood that a population parameter is within certain upper and lower bounds
   - May use one or two critical values
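As a small illustration of how the choice of tails changes the critical value, the sketch below looks up standard normal critical values for a significance level of 0.05. The value 0.05 is an assumption chosen for illustration; `NormalDist` comes from Python's standard library.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# One-tailed: all of alpha sits in a single tail.
z_one_tailed = std_normal.inv_cdf(1 - alpha)
# Two-tailed: alpha is split evenly between the two tails.
z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-tailed critical z: {z_one_tailed:.3f}")
print(f"two-tailed critical z: +/-{z_two_tailed:.3f}")
```

At the same significance level the two-tailed cut-off is further from zero, so a given test statistic is harder to declare significant.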

Page 10: Hypothesis Testing

Hypothesis Testing

                          DATA ANALYSIS OUTCOME
In Population             Accept Null Hypothesis    Reject Null Hypothesis
Null Hypothesis True      Correct Decision          Type I Error
Null Hypothesis False     Type II Error             Correct Decision

Page 11: Hypothesis Testing

Hypothesis Testing About a Single Mean - Step-by-Step

1) Formulate hypotheses
2) Select appropriate formula
3) Select significance level
4) Calculate z or t statistic
5) Calculate degrees of freedom (for t-test)
6) Obtain critical value from table
7) Make decision regarding the null hypothesis

Page 12: Hypothesis Testing

Hypothesis Testing About a Single Mean - Example 1 (2 tailed)

• Ho: $\mu = 5000$ (hypothesized value of population)

• Ha: $\mu \neq 5000$ (alternative hypothesis)

• $n = 100$, $\bar{X} = 4960$, $\sigma = 250$, $\alpha = 0.05$

Rejection rule: if $|z_{calc}| > z_{\alpha/2}$ then reject Ho.
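The slide stops at the rejection rule, so here is a minimal sketch of the remaining arithmetic. It reads the slide's values as a sample mean of 4960, a known population standard deviation of 250, n = 100 and a significance level of 0.05 (the symbols were lost in extraction, so that reading is an assumption); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, n, x_bar, sigma, alpha = 5000, 100, 4960, 250, 0.05

# z statistic for a single mean with known population standard deviation.
z_calc = (x_bar - mu0) / (sigma / sqrt(n))
# Two-tailed critical value: alpha is split across both tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z_calc) > z_crit
print(f"z_calc = {z_calc:.2f}, z_crit = {z_crit:.2f}, reject Ho: {reject}")
```

Under these values the statistic is -1.6, inside the two-tailed cut-off of about 1.96, so Ho is not rejected.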

Page 13: Hypothesis Testing

Hypothesis Testing About a Single Mean - Example 2

• Ho: $\mu = 1000$ (hypothesized value of population)

• Ha: $\mu \neq 1000$ (alternative hypothesis)

• $n = 12$, $\bar{X} = 1087.1$, $s = 191.6$, $\alpha = 0.01$

Rejection rule: if $|t_{calc}| > t_{df, \alpha/2}$ then reject Ho.
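A sketch of the calculation for this small-sample case, assuming the slide's values are a sample mean of 1087.1, a sample standard deviation of 191.6, n = 12 and a significance level of 0.01. Python's standard library has no t-distribution, so the critical value is hard-coded from a standard t-table (11 degrees of freedom, 0.005 in each tail); that lookup is not shown on the slide.

```python
from math import sqrt

mu0, n, x_bar, s = 1000, 12, 1087.1, 191.6

# t statistic: the sample standard deviation stands in for the unknown sigma.
t_calc = (x_bar - mu0) / (s / sqrt(n))
df = n - 1

# Two-tailed critical value t(df=11, alpha/2=0.005), copied from a t-table.
t_crit = 3.106

reject = abs(t_calc) > t_crit
print(f"t_calc = {t_calc:.3f}, df = {df}, t_crit = {t_crit}, reject Ho: {reject}")
```

The statistic comes out near 1.57, well below the table value, so Ho is not rejected at the 0.01 level.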

Page 14: Hypothesis Testing

Hypothesis Testing About a Single Mean - Example 3 (1 tailed)

• Ho: $\mu \geq 5000$ (hypothesized value of population)

• Ha: $\mu < 5000$ (alternative hypothesis)

• $n = 50$, $\bar{X} = 4970$, $\sigma = 250$, $\alpha = 0.01$

Rejection rule: if $z_{calc} < -z_{\alpha}$ then reject Ho.
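For the one-tailed case the whole significance level sits in the lower tail. The sketch below assumes the slide's values are a sample mean of 4970, a population standard deviation of 250, n = 50 and a significance level of 0.01, and that the rejection region is the lower tail (consistent with Ha: the mean is below 5000).

```python
from math import sqrt
from statistics import NormalDist

mu0, n, x_bar, sigma, alpha = 5000, 50, 4970, 250, 0.01

z_calc = (x_bar - mu0) / (sigma / sqrt(n))
# Lower-tail critical value: all of alpha is in the left tail.
z_crit = NormalDist().inv_cdf(alpha)

reject = z_calc < z_crit
print(f"z_calc = {z_calc:.3f}, z_crit = {z_crit:.3f}, reject Ho: {reject}")
```

The statistic (about -0.85) does not fall below the cut-off (about -2.33), so Ho is not rejected.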

Page 15: Hypothesis Testing

Hypothesis Test of Difference between Means

• Mayor of a city wants to see if males and females earn the same

• A random sample of 400 males and 576 females was taken and following was found

                 Males      Females
Mean             $105.70    $112.80
Standard dev     $5.00      $4.80

Page 16: Hypothesis Testing

Hypothesis Test of Difference between Means

• The appropriate test depends on
   - whether the samples are related or unrelated
   - whether the population standard deviations are known or not
   - if not, whether they can be assumed to be equal or not

Page 17: Hypothesis Testing

Hypothesis Test of Difference between Means

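• In the salary example, the hypotheses are Ho: $\mu_1 - \mu_2 = c$ (with $c = 0$) and Ha: $\mu_1 - \mu_2 \neq c$

• Since we have unrelated samples with known (for large samples, we can use the sample SD as the population SD) but unequal $\sigma$'s, the standard error of the difference in means is

$$S_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{5^2}{400} + \frac{(4.80)^2}{576}} = .32$$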

Page 18: Hypothesis Testing

Hypothesis Test of Difference between Means

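• The calculated value of z is

$$z_{calc} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_{\bar{X}_1 - \bar{X}_2}} = \frac{(105.70 - 112.80) - 0}{.32} = -22.19$$

• For $\alpha = .01$ and a two-tailed test, the Z-table value is 2.58

• Since $|z_{calc}|$ is greater than $z_{\alpha/2}$, the null hypothesis is rejected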
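The salary comparison can be reproduced in a few lines. This is a sketch using the figures from the slides (400 males with mean $105.70 and standard deviation $5.00; 576 females with mean $112.80 and standard deviation $4.80); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n1, mean1, s1 = 400, 105.70, 5.00  # males
n2, mean2, s2 = 576, 112.80, 4.80  # females
alpha = 0.01

# Standard error of the difference for unrelated samples with unequal variances.
se_diff = sqrt(s1**2 / n1 + s2**2 / n2)
# Under Ho the hypothesized difference mu1 - mu2 is 0.
z_calc = ((mean1 - mean2) - 0) / se_diff
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z_calc) > z_crit
print(f"se = {se_diff:.4f}, z_calc = {z_calc:.2f}, z_crit = {z_crit:.2f}, reject Ho: {reject}")
```

Because the code keeps the unrounded standard error, the statistic differs slightly in the second decimal from the slide's value, which divides by the rounded .32. The decision is the same either way.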

Page 19: Hypothesis Testing

Hypothesis Testing of Proportion

• Quality control dept of a light bulb company claims 95% of its products are defect free

• The CEO checks 225 bulbs and finds only 87% to be defect free

• Is the claim of 95% true at .05 level of significance ?

• So we have hypothesized values $p_o = 0.95$, $q_o = 0.05$ and sample values $p = 0.87$, $q = 0.13$

Page 20: Hypothesis Testing

Hypothesis Testing of Proportion

• The null hypothesis is Ho: $p = 0.95$

• The alternate hypothesis is Ha: $p \neq 0.95$

• First, calculate the standard error of the proportion using hypothesized values as

$$\sigma_p = \sqrt{\frac{p_o q_o}{n}} = \sqrt{\frac{.95 \times .05}{225}} = .0145$$

• Since np and nq are large, we can use the Z table. The appropriate z value is 1.96

Page 21: Hypothesis Testing

Hypothesis Testing of Proportion

• The limits of the acceptance region are

$$p_o \pm 1.96\,\sigma_p = .95 \pm (1.96 \times .0145) = (.922, .978)$$

• Since the sample proportion of 0.87 does not fall within the acceptance region, the CEO should reject the quality control department’s claim
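The acceptance-region calculation for the light bulb example can be sketched as follows, using the slide's figures (claimed defect-free rate 0.95, 225 bulbs checked, observed rate 0.87, significance level 0.05). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0, n, p_sample, alpha = 0.95, 225, 0.87, 0.05

# Standard error uses the hypothesized proportion, not the sample proportion.
se = sqrt(p0 * (1 - p0) / n)
z = NormalDist().inv_cdf(1 - alpha / 2)

lower, upper = p0 - z * se, p0 + z * se
reject = not (lower <= p_sample <= upper)
print(f"acceptance region: ({lower:.3f}, {upper:.3f}), reject claim: {reject}")
```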

Page 22: Hypothesis Testing

Hypothesis Testing of Difference between Proportions

• Manager wants to see if John and Linda, two salespeople, have the same conversion

• He picks samples and finds that

            Sample size    Number converted    Proportion converted
John        100            84                  0.84 (= $p_j$)
Linda       100            82                  0.82 (= $p_l$)

Page 23: Hypothesis Testing

Hypothesis Testing of Difference between Proportions

• Are their conversion rates different at 0.05 significance level?

• The null hypothesis is Ho: $p_j = p_l$

• The alternate hypothesis is Ha: $p_j \neq p_l$

• The best estimate of p (proportion of success) is

$$\hat{p} = \frac{n_1 p_j + n_2 p_l}{n_1 + n_2} = 0.83$$

also, $\hat{q} = 1 - \hat{p} = .17$

Page 24: Hypothesis Testing

Hypothesis Testing of Difference between Proportions

• An estimate of the standard error of the difference of proportions is

$$\hat{\sigma}_{p_j - p_l} = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}} = .053$$

• The z value can be calculated as

$$z_{calc} = \frac{(p_j - p_l) - 0}{\hat{\sigma}_{p_j - p_l}} = .38$$

• The z value obtained from the table is 1.96 (for $\alpha = .05$). Thus, we fail to reject the null hypothesis
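A sketch of the full two-proportion calculation, using the figures from the slides (John converts 84 of 100, Linda converts 82 of 100, significance level 0.05). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_j, conv_j = 100, 84  # John
n_l, conv_l = 100, 82  # Linda
alpha = 0.05

p_j, p_l = conv_j / n_j, conv_l / n_l
# Pooled estimate of the common proportion under Ho: p_j = p_l.
p_hat = (conv_j + conv_l) / (n_j + n_l)
q_hat = 1 - p_hat

se_diff = sqrt(p_hat * q_hat / n_j + p_hat * q_hat / n_l)
z_calc = ((p_j - p_l) - 0) / se_diff
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z_calc) > z_crit
print(f"p_hat = {p_hat:.2f}, se = {se_diff:.3f}, z_calc = {z_calc:.2f}, reject Ho: {reject}")
```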

Page 25: Hypothesis Testing

The Probability Values (P-value) Approach to Hypothesis Testing

• P-value provides researcher with an alternative method of testing a hypothesis without pre-specifying $\alpha$

• Largest level of significance at which we would not reject Ho

Page 26: Hypothesis Testing

The Probability Values (P-value) Approach to Hypothesis Testing

Difference Between Using $\alpha$ and p-value

• Hypothesis testing with a pre-specified $\alpha$
   - Researcher is trying to determine, "is the probability of what has been observed less than $\alpha$?"
   - Reject or fail to reject Ho accordingly

Page 27: Hypothesis Testing

The Probability Values (P-value) Approach to Hypothesis Testing

Using the p-Value

• Researcher can determine "how unlikely is the result that has been observed?"

• Decide whether to reject or fail to reject Ho without being bound by a pre-specified significance level

• In general, the smaller the p-value, the greater is the researcher's confidence in sample findings

Page 28: Hypothesis Testing

The Probability Values (P-value) Approach to Hypothesis Testing: Example

• Ho: $\mu = 25$ (hypothesized value of population)

• Ha: $\mu \neq 25$ (alternative hypothesis)

• $n = 50$, $\bar{X} = 25.2$, $\sigma = 0.7$

• $SE(\bar{X}) = \sigma / \sqrt{n} = 0.1$; $Z = (\bar{X} - \mu) / SE(\bar{X}) = 2$

• From Z-table, prob Z > 2 is 0.0228. As this is a 2-tailed test, the p-value is $2 \times 0.0228 = .0456$
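The p-value in this example can be computed directly rather than read from a table. The sketch below assumes the slide's values (hypothesized mean 25, sample mean 25.2, standard error rounded to 0.1); the helper name `two_tailed_p_value` is hypothetical.

```python
from statistics import NormalDist


def two_tailed_p_value(x_bar, mu0, se):
    """Return the z statistic and the two-tailed p-value under a normal model."""
    z = (x_bar - mu0) / se
    # Probability of a result at least this extreme in either tail.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# se=0.1 is the slide's rounded standard error (0.7 / sqrt(50) is about 0.099).
z, p_value = two_tailed_p_value(x_bar=25.2, mu0=25, se=0.1)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The computed value can differ from the slide's .0456 in the last digit because the table probability 0.0228 is itself rounded. Either way the p-value is below 0.05 and above 0.01, so the decision depends on which significance level the researcher would have chosen.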

Page 29: Hypothesis Testing

The Probability Values (P-value) Approach to Hypothesis Testing

Using the p-Value

• P-value is generally sensitive to sample size

• For a given observed difference, a larger sample yields a lower p-value

• P-value can report the impact of the sample size on the reliability of the results

Page 30: Hypothesis Testing

Relationship between C.I and Hypothesis Testing (Example 1)

• A direct marketer knows that the average number of purchases per month in the entire database is 5.6

• By sampling ‘loyals’ he finds that their average is 6.1 (i.e., $\bar{X} = 6.1$)

• Is it merely a sampling accident?

• Ho: $\mu = 5.6$ (hypothesized value of population)

• Ha: $\mu \neq 5.6$ (alternative hypothesis)

• $n = 35$, $\sigma = 2.5$

Page 31: Hypothesis Testing

Relationship between C.I and Hypothesis Testing (Example 1)

• Std err $\sigma_{\bar{X}} = \sigma / \sqrt{n} = 0.42$

• The appropriate Z for $\alpha = .05$ is 1.96

• The Confidence Interval is $\mu_0 \pm 1.96\,\sigma_{\bar{X}} = (4.78, 6.42)$

• Since 6.1 falls in the interval, we cannot reject the null hypothesis
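A sketch of the interval check for the 'loyals' example, assuming the slide's values (hypothesized mean 5.6, sample mean 6.1, population standard deviation 2.5, n = 35, significance level 0.05). It also computes the z statistic to show that the interval check and the z-test lead to the same decision.

```python
from math import sqrt
from statistics import NormalDist

mu0, n, x_bar, sigma, alpha = 5.6, 35, 6.1, 2.5, 0.05

se = sigma / sqrt(n)
z = NormalDist().inv_cdf(1 - alpha / 2)

# Interval around the hypothesized mean; sample means inside it are
# consistent with Ho at this significance level.
lower, upper = mu0 - z * se, mu0 + z * se
inside = lower <= x_bar <= upper

# Equivalent z-test on the same data.
z_calc = (x_bar - mu0) / se
print(f"interval = ({lower:.2f}, {upper:.2f}), inside: {inside}, z_calc = {z_calc:.2f}")
```

The limits can differ from the slide's in the second decimal because the slide rounds the standard error to 0.42 before multiplying.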

Page 32: Hypothesis Testing

Confidence Intervals and Hypothesis Testing

• Hypothesis testing and Confidence Intervals are two sides of the same coin.

$$t = \frac{\bar{X} - \mu}{s_{\bar{x}}}$$

$$\bar{X} \pm t\,s_{\bar{x}} = \text{Interval estimate for } \mu$$

Page 33: Hypothesis Testing

Relationship between C.I and Hypothesis Testing (Example 2)

• Revisit the first example we started with

• Test the performance of two lists in terms of response rates

• Sample (1,000) from the first list provides a response rate of 3.5%

• Sample (1,200) from the second list provides a response rate of 4.5%

• Do the two lists (population) really have a difference or is it an artifact of the sample?

Page 34: Hypothesis Testing

Relationship between C.I and Hypothesis Testing (Example 2)

– C.I. of list 1:
   • (0.035) +/- 1.96*(SE1)
   • SE1 = Sqrt[(0.035*0.965)/1000] = 0.006
   • C.I.1 = (0.0232, 0.0467)

– C.I. of list 2:
   • (0.045) +/- 1.96*(SE2)
   • SE2 = Sqrt[(0.045*0.955)/1200] = 0.006
   • C.I.2 = (0.033, 0.0568)

– What can we infer based on these confidence intervals?
   • The two intervals overlap, so there is a lack of sufficient evidence to infer that there is any difference between the response rates of the two lists.
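The two intervals can be recomputed with a short sketch. It uses the slide's figures (3.5% of 1,000 and 4.5% of 1,200, with z = 1.96); the helper name `proportion_ci` is hypothetical.

```python
from math import sqrt


def proportion_ci(p, n, z=1.96):
    """Normal-approximation confidence interval for a sample proportion."""
    se = sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se


ci1 = proportion_ci(0.035, 1000)  # list 1
ci2 = proportion_ci(0.045, 1200)  # list 2

# Two intervals overlap when each one starts before the other ends.
overlap = ci1[0] <= ci2[1] and ci2[0] <= ci1[1]
print(f"C.I.1 = ({ci1[0]:.4f}, {ci1[1]:.4f})")
print(f"C.I.2 = ({ci2[0]:.4f}, {ci2[1]:.4f})")
print(f"intervals overlap: {overlap}")
```

The limits differ slightly from the slide's because the slide rounds both standard errors to 0.006 before multiplying; the overlap, and therefore the conclusion, is unchanged.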