hypothesis testing

61
Transform to the power of digital Hypothesis Testing December 2014

Upload: sumit-sharma

Post on 17-Jul-2015

93 views

Category:

Business


1 download

TRANSCRIPT

Page 1: Hypothesis testing

Transform to the power of digital

Hypothesis TestingDecember 2014

Page 2: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Theory vs Hypothesis

2

Page 3: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

A claim(assumption) about a population parameter

What is a hypothesis?

3

– population mean

– population proportion

Example: The mean monthly cell phone bill of this city is = $42

Example: The proportion of adults in this city with cell phones is p = .68

Page 4: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

States the assumption (numerical) to be tested

Example: The average number of TV sets in U.S. Homes is at least 3

( )

Is always about a population parameter, not about a sample statistic

A claim(assumption) about a population parameter

What is a hypothesis?

4

Null Hypothesis(H0)

3μ:H0

3μ:H0 3x:H0

Page 5: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

States the assumption (numerical) to be tested

Example: The average number of TV sets in U.S. Homes is at least 3

( )

Is always about a population parameter, not about a sample statistic

The opposite of the null hypothesis

Example: The average number of TV sets in U.S. homes is less than 3 ( HA: < 3 )

A claim(assumption) about a population parameter

What is a hypothesis?

5

Null Hypothesis(H0)

3μ:H0

3μ:H0 3x:H0

Alternative Hypothesis(HA)

Page 6: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

CLAIM: The population mean age is 50H0: = 50

Hypothesis Testing Process

6

Population Sample

Select a random sample

Page 7: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

CLAIM: The population mean age is 50H0: = 50

Hypothesis Testing Process

Suppose the sample mean age is 20x = 20

Population Sample

Select a random sample

Page 8: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

CLAIM: The population mean age is 50H0: = 50

Hypothesis Testing Process

8

Suppose the sample mean age is 20x = 20

Population Sample

Is x=20 likely if = 50?

Select a random sample

Page 9: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Hypothesis Testing Process

9

Is x=20 likely if = 50?

Page 10: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Reject the null hypothesis

Hypothesis Testing Process

10

Is x=20 likely if = 50?

NO

Page 11: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Do not reject the null hypothesis

Reject the null hypothesis

Hypothesis Testing Process

11

Is x=20 likely if = 50?

NO

YES

Page 12: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Reason for rejecting H0

12

Sampling Distribution of x

If it is unlikely that we would get a sample mean of this value ...

20x

Page 13: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Reason for rejecting H0

13

Sampling Distribution of x

= 50If H0 is true

If it is unlikely that we would get a sample mean of this value ...

20

... if in fact this were the population mean…

x

Page 14: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Reason for rejecting H0

14

Sampling Distribution of x

= 50If H0 is true

If it is unlikely that we would get a sample mean of this value ...

... then we reject the null hypothesis that

= 50.

20

... if in fact this were the population mean…

x

Page 15: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Level of significance,

15

Defines unlikely values of sample statistic if null hypothesis is true

Defines rejection region of the sampling distribution

Page 16: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Level of significance,

16

Defines unlikely values of sample statistic if null hypothesis is true

Defines rejection region of the sampling distribution

Page 17: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Rejection region is shaded

Level of significance,

17

Defines unlikely values of sample statistic if null hypothesis is true

Defines rejection region of the sampling distribution

H0: μ ≥ 3

HA: μ < 3 0

H0: μ ≤ 3

HA: μ > 3

a

a

Represents critical value

Lower tail test

0Upper tail test

Page 18: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Rejection region is shaded

Level of significance,

18

Defines unlikely values of sample statistic if null hypothesis is true

Defines rejection region of the sampling distribution

H0: μ ≥ 3

HA: μ < 3 0

H0: μ ≤ 3

HA: μ > 3

H0: μ = 3

HA: μ ≠ 3

a

a

/2

Represents critical value

Lower tail test

0

0

a /2a

Upper tail test

Two tailed test

Page 19: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Critical Value Approach to Testing

19

Convert sample statistic (e.g.: x ) to test statistic ( Z or t statistic )

Page 20: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Critical Value Approach to Testing

20

Convert sample statistic (e.g.: x ) to test statistic ( Z or t statistic )

n

σμx

z

n

sμx

t 1n

Known Unknown

s : population standard deviations : sample standard deviation

n : sample sizeμ : assumed population mean

x : sample mean

Page 21: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Critical Value Approach to Testing

21

Convert sample statistic (e.g.: x ) to test statistic ( Z or t statistic )

Determine the critical value(s) for a specified level of significance from a table or computer

Page 22: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Critical Value Approach to Testing

22

Convert sample statistic (e.g.: x ) to test statistic ( Z or t statistic )

Determine the critical value(s) for a specified level of significance from a table or computer

If the test statistic falls in the rejection region, reject H0 ; otherwise do not reject H0

Page 23: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

23

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Page 24: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

24

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Specify the population value of interest1

The mean number of TVs in US homes

Page 25: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

25

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Specify the population value of interest1

Formulate the appropriate null and alternative hypotheses2

H0: μ 3

HA: μ < 3 (This is a lower tail test)

Page 26: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

26

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Specify the desired level of significance3Suppose that = .05 is chosen for this test

Specify the population value of interest1

Formulate the appropriate null and alternative hypotheses2

Page 27: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

27

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Determine the rejection region4

Specify the population value of interest1

Formulate the appropriate null and alternative hypotheses2

Specify the desired level of significance3

Page 28: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

28

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Determine the rejection region4

Reject H0 Do not reject H0

= .05

-zα= -1.645

This is a one-tailed test with =.05.

Since σ is known, the cutoff value is a z value

Reject H0 if z < z = -1.645 ;

otherwise do not reject H0

z

Page 29: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

29

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Obtain sample evidence and compute the test statistic5

Suppose a sample is taken with the following results:

n = 100, x = 2.84

( = 0.8 is assumed known)

2.0.08

.16

100

0.832.84

n

σμx

z

Page 30: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Reach a decision and interpret the result6

Reject H0 Do not reject H0

= .05

-1.645 0

-2.0

z

Page 31: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

31

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Reach a decision and interpret the result6

Reject H0 Do not reject H0

= .05

-1.645 0

-2.0

z

Since z = -2.0 < -1.645, we reject the null hypothesis that the mean number of TVs in US homes is at least 3

Page 32: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

32

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Reach a decision and interpret the result6

Reject H0 Do not reject H0

= .05

-1.645 0

-2.0

z

Since z = -2.0 < -1.645, we reject the null hypothesis that the mean number of TVs in US homes is at least 3

Page 33: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Steps to Follow

33

Specify the population value of interest1

Page 34: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Steps to Follow

34

Specify the population value of interest1

Formulate the appropriate null and alternative hypotheses2

Page 35: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Steps to Follow

35

Specify the population value of interest1

Formulate the appropriate null and alternative hypotheses2

Specify the desired level of significance3

Page 36: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Steps to Follow

36

Determine the rejection region4

Specify the population value of interest1

Formulate the appropriate null and alternative hypotheses2

Specify the desired level of significance3

Page 37: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Steps to Follow

37

Determine the rejection region4

Specify the population value of interest1

Formulate the appropriate null and alternative hypotheses2

Specify the desired level of significance3

Obtain sample evidence and compute the test statistic5

Page 38: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Steps to Follow

38

Determine the rejection region4

Specify the population value of interest1

Formulate the appropriate null and alternative hypotheses2

Specify the desired level of significance3

Obtain sample evidence and compute the test statistic5

Reach a decision and interpret the result6

Page 39: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Break

39

Page 40: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

p – Value Approach to Testing

40

p-value: Probability of obtaining a test statistic more extreme ( ≤ or ) than the observed sample value given H0 is true

Page 41: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

p – Value Approach to Testing

41

p-value: Probability of obtaining a test statistic more extreme ( ≤ or ) than the observed sample value given H0 is true

Convert sample statistic (e.g.: x ) to test statistic ( Z or t statistic )

Same as before

Page 42: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

p – Value Approach to Testing

42

p-value: Probability of obtaining a test statistic more extreme ( ≤ or ) than the observed sample value given H0 is true

Convert sample statistic (e.g.: x ) to test statistic ( Z or t statistic )

Obtain the p-value from a table or computer

Page 43: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

p – Value Approach to Testing

43

p-value: Probability of obtaining a test statistic more extreme ( ≤ or ) than the observed sample value given H0 is true

Convert sample statistic (e.g.: x ) to test statistic ( Z or t statistic )

Obtain the p-value from a table or computer

Compare the p-value with

If p-value < , reject H0

If p-value , do not reject H0

Page 44: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

44

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Determine the rejection region4

Specify the population value of interest1

Formulate the appropriate null and alternative hypotheses2

Specify the desired level of significance3

Page 45: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

45

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Obtain sample evidence and compute the test statistic5

Suppose a sample is taken with the following results:

n = 100, x = 2.84

( = 0.8 is assumed known)

2.8684100

0.81.6453

n

σzμx αα zα= 1.645

Page 46: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Obtain sample evidence and compute the test statistic5

p-value =.0228

= .05

2.8684 3

2.84

x

.02282.0)P(z

1000.8

3.02.84zP

3.0)μ|2.84xP(

Page 47: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Here: p-value = .0228 = .05

Since .0228 < .05, we reject the null hypothesis

Example

47

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)

Reach a decision and interpret the result6Compare the p-value with

If p-value < , reject H0

If p-value , do not reject H0

p-value =.0228

= .05

2.8684 3

2.84

Page 48: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Two-Tailed Test

48

Reject H0Reject H0

a/2=.025

Do not reject H0

tα/2

a/2=.025

Page 49: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Hypothesis Tests for Proportions

49

Involves categorical values

Page 50: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Hypothesis Tests for Proportions

50

Involves categorical values

Two possible outcomes

“Success” (possesses a certain characteristic)

“Failure” (does not possesses that characteristic)

Page 51: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Hypothesis Tests for Proportions

51

Involves categorical values

Two possible outcomes

“Success” (possesses a certain characteristic)

“Failure” (does not possesses that characteristic)

Fraction or proportion of population in the “success” category is denoted by p

sizesample

sampleinsuccessesofnumber

n

xp

Page 52: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Hypothesis Tests for Proportions

52

Involves categorical values

Two possible outcomes

“Success” (possesses a certain characteristic)

“Failure” (does not possesses that characteristic)

Fraction or proportion of population in the “success” category is denoted by p

sizesample

sampleinsuccessesofnumber

n

xp

p can be approximated by a normal distribution with mean

and standard deviation pμP

n

p)p(1σ

p

Page 53: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Hypothesis Tests for Proportions

53

Involves categorical values

Two possible outcomes

“Success” (possesses a certain characteristic)

“Failure” (does not possesses that characteristic)

Fraction or proportion of population in the “success” category is denoted by p

sizesample

sampleinsuccessesofnumber

n

xp

p can be approximated by a normal distribution with mean

and standard deviation pμP

n

p)p(1σ

p

n)p(p

ppz

1

Page 54: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

54

A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses.

Test at the = .05 significance level.

Page 55: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

55

A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses.

Test at the = .05 significance level.

2.47

500.08).08(1

.08.05

np)p(1

ppz

Page 56: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

56

A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses.

Test at the = .05 significance level.

2.47

500.08).08(1

.08.05

np)p(1

ppz

Critical Values: ± 1.96

z0

Reject Reject

.025.025

1.96-2.47

-1.96

Page 57: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Example

57

A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses.

Test at the = .05 significance level.

2.47

500.08).08(1

.08.05

np)p(1

ppz

Critical Values: ± 1.96

z0

Reject Reject

.025.025

1.96-2.47

-1.96

Reject H0 at = .05

There is sufficient evidence to reject the company’s claim of 8% response rate.

Page 58: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Errors in Making Decisions

58

Reject a true null hypothesisType I error

Fail to reject a false null hypothesisType II error

The probability of Type I Error is Called level of significance of the test

The probability of Type I Error is β1- β is called the power of the test, i.e. Prob(rejecting the false null hypothesis)

Page 59: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Errors in Making Decisions

59

Reject a true null hypothesisType I error

Fail to reject a false null hypothesisType II error

The probability of Type I Error is Called level of significance of the test

The probability of Type I Error is β1- β is called the power of the test, i.e. Prob(rejecting the false null hypothesis)

H0 True H0 False

Reject No error (1 - α)

Type II Error ( β )

Do not reject Type I Error(α)

No Error ( 1 - β )

Key:Outcome

(Probability)

State of Nature

Page 60: Hypothesis testing

Copyright © 2013 Capgemini Consulting. All rights reserved.

Hypothesis Song

60

Page 61: Hypothesis testing

Transform to the power of digital

Thank you