hypothesis testing

Post on 17-Jul-2015

Transform to the power of digital

Hypothesis Testing

December 2014

Copyright © 2013 Capgemini Consulting. All rights reserved.

Theory vs Hypothesis

What is a hypothesis?

A claim (assumption) about a population parameter

– population mean

Example: The mean monthly cell phone bill of this city is μ = $42

– population proportion

Example: The proportion of adults in this city with cell phones is p = .68

Null Hypothesis (H0)

States the assumption (numerical) to be tested

Example: The average number of TV sets in U.S. homes is at least 3 (H0: μ ≥ 3)

Is always about a population parameter, not about a sample statistic: write H0: μ ≥ 3, not H0: x̄ ≥ 3

Alternative Hypothesis (HA)

The opposite of the null hypothesis

Example: The average number of TV sets in U.S. homes is less than 3 (HA: μ < 3)

Hypothesis Testing Process

CLAIM: The population mean age is 50 (H0: μ = 50)

Population → Sample: Select a random sample

Suppose the sample mean age is 20 (x̄ = 20)

Is x̄ = 20 likely if μ = 50?

NO: Reject the null hypothesis

YES: Do not reject the null hypothesis

Reason for rejecting H0

(Figure: sampling distribution of x̄, centered at μ = 50 if H0 is true, with the observed x̄ = 20 far out in the lower tail)

If it is unlikely that we would get a sample mean of this value (x̄ = 20) ...

... if in fact this were the population mean (μ = 50) ...

... then we reject the null hypothesis that μ = 50.
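To make "unlikely" concrete, the short sketch below puts a number on the tail probability of the observed sample mean under H0. The slides give neither the population standard deviation nor the sample size, so `sigma` and `n` here are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 50.0    # population mean claimed by H0 (from the slides)
x_bar = 20.0   # observed sample mean (from the slides)
sigma = 15.0   # ASSUMPTION: population standard deviation, not given in the slides
n = 25         # ASSUMPTION: sample size, not given in the slides

# Under H0 the sample mean is approximately normal with mean mu_0
# and standard deviation sigma / sqrt(n) (the standard error).
standard_error = sigma / sqrt(n)
sampling_dist = NormalDist(mu=mu_0, sigma=standard_error)

# Probability of a sample mean this low or lower if H0 were true
prob = sampling_dist.cdf(x_bar)
z = (x_bar - mu_0) / standard_error

print(f"standard error = {standard_error}, z = {z}")
print(f"P(x_bar <= {x_bar} | mu = {mu_0}) = {prob:.3g}")
```

With these assumed values the observed mean lies ten standard errors below 50, so the probability is effectively zero and H0 would be rejected.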

Level of significance, α

Defines unlikely values of sample statistic if null hypothesis is true

Defines rejection region of the sampling distribution

(Figure: three sampling distributions centered at 0; the rejection region is shaded and its boundary represents the critical value)

Lower tail test: H0: μ ≥ 3, HA: μ < 3 (area α in the lower tail)

Upper tail test: H0: μ ≤ 3, HA: μ > 3 (area α in the upper tail)

Two tailed test: H0: μ = 3, HA: μ ≠ 3 (area α/2 in each tail)
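The critical values that bound the three rejection regions can be looked up in a table or computed. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`, assuming a z (standard normal) test statistic; the function name `critical_values` and the tail labels are illustrative choices, not part of the slides.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the z critical value(s) bounding the rejection region.

    tail is 'lower', 'upper', or 'two' (illustrative labels).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "lower":
        return (std_normal.inv_cdf(alpha),)
    if tail == "upper":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        return (-z, z)
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


for tail in ("lower", "upper", "two"):
    values = [round(z, 3) for z in critical_values(0.05, tail)]
    print(f"{tail:>5} tail, alpha = 0.05: {values}")
```

For α = .05 this gives about -1.645 for the lower tail test, 1.645 for the upper tail test, and ±1.96 for the two tailed test.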

Critical Value Approach to Testing

Convert sample statistic (e.g.: x̄) to test statistic (Z or t statistic)

σ known: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

σ unknown: $t_{n-1} = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

σ : population standard deviation

s : sample standard deviation

n : sample size

μ : assumed population mean

x̄ : sample mean

Determine the critical value(s) for a specified level of significance from a table or computer

If the test statistic falls in the rejection region, reject H0 ; otherwise do not reject H0
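The conversion from sample statistic to test statistic can be sketched in a few lines. This is only an illustration: the helper names are made up for this example, and the sample data are hypothetical. The standard library has no t distribution, so the t critical value would still come from a table or a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def z_statistic(x_bar, mu_0, sigma, n):
    """Test statistic when the population standard deviation sigma is known."""
    return (x_bar - mu_0) / (sigma / sqrt(n))


def t_statistic(sample, mu_0):
    """Test statistic and degrees of freedom when sigma is unknown.

    The sample standard deviation s stands in for sigma.
    """
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    return (mean(sample) - mu_0) / (s / sqrt(n)), n - 1


# HYPOTHETICAL data, invented for illustration only
sample = [2, 3, 2, 3, 2, 3, 2, 3]
t, df = t_statistic(sample, mu_0=3)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting t value would then be compared with the critical value for n - 1 degrees of freedom at the chosen level of significance.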

Example

Test the claim that the true mean number of TV sets in US homes is at least 3 (assume σ = 0.8)

1. Specify the population value of interest

The mean number of TVs in US homes

2. Formulate the appropriate null and alternative hypotheses

H0: μ ≥ 3

HA: μ < 3 (This is a lower tail test)

3. Specify the desired level of significance

Suppose that α = .05 is chosen for this test

4. Determine the rejection region

This is a one-tailed test with α = .05.

Since σ is known, the cutoff value is a z value: -zα = -1.645

Reject H0 if z < -zα = -1.645; otherwise do not reject H0

5. Obtain sample evidence and compute the test statistic

Suppose a sample is taken with the following results: n = 100, x̄ = 2.84 (σ = 0.8 is assumed known)

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-.16}{.08} = -2.0$

6. Reach a decision and interpret the result

(Figure: standard normal curve with the rejection region of area α = .05 below z = -1.645; the test statistic z = -2.0 falls inside it)

Since z = -2.0 < -1.645, we reject the null hypothesis that the mean number of TVs in US homes is at least 3
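The six steps of this example can be reproduced in a few lines. The sketch below uses only the numbers given in the slides and Python's standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 3.0     # boundary value of H0: mu >= 3
sigma = 0.8    # population standard deviation, assumed known
n = 100        # sample size
x_bar = 2.84   # observed sample mean
alpha = 0.05   # level of significance

# Step 4: lower tail rejection region, critical value about -1.645
z_crit = NormalDist().inv_cdf(alpha)

# Step 5: test statistic
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Step 6: decision
decision = "reject H0" if z < z_crit else "do not reject H0"
print(f"z = {z:.2f}, critical value = {z_crit:.3f}: {decision}")
# prints: z = -2.00, critical value = -1.645: reject H0
```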

Steps to Follow

1. Specify the population value of interest

2. Formulate the appropriate null and alternative hypotheses

3. Specify the desired level of significance

4. Determine the rejection region

5. Obtain sample evidence and compute the test statistic

6. Reach a decision and interpret the result

Break

p-Value Approach to Testing

p-value: Probability of obtaining a test statistic as extreme or more extreme (≤ or ≥) than the observed sample value, given H0 is true

Convert sample statistic (e.g.: x̄) to test statistic (Z or t statistic), same as before

Obtain the p-value from a table or computer

Compare the p-value with α

If p-value < α, reject H0

If p-value ≥ α, do not reject H0
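As a rough sketch of "obtain the p-value from a computer", the function below returns the p-value of a z statistic for each kind of test, followed by the comparison with α. It assumes a standard normal test statistic; the names `p_value` and `decide` and the tail labels are illustrative, not taken from the slides.

```python
from statistics import NormalDist


def p_value(z, tail):
    """p-value of a z test statistic for a 'lower', 'upper', or 'two' tailed test."""
    std_normal = NormalDist()
    if tail == "lower":
        return std_normal.cdf(z)                    # P(Z <= z)
    if tail == "upper":
        return 1 - std_normal.cdf(z)                # P(Z >= z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))     # both tails
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


def decide(p, alpha):
    """Apply the rule: reject H0 when the p-value is below alpha."""
    return "reject H0" if p < alpha else "do not reject H0"


p = p_value(-2.0, "lower")
print(f"p-value = {p:.4f}: {decide(p, alpha=0.05)}")
```

For a lower tail test with z = -2.0 the p-value is about .0228, which is below α = .05.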

Example

Test the claim that the true mean number of TV sets in US homes is at least 3 (assume σ = 0.8)

5. Obtain sample evidence and compute the test statistic

Suppose a sample is taken with the following results: n = 100, x̄ = 2.84 (σ = 0.8 is assumed known)

Critical value of the sample mean (zα = 1.645):

$\bar{x}_\alpha = \mu - z_\alpha \frac{\sigma}{\sqrt{n}} = 3 - 1.645 \cdot \frac{0.8}{\sqrt{100}} = 2.8684$

$\text{p-value} = P(\bar{x} \le 2.84 \mid \mu = 3.0) = P\left(z \le \frac{2.84 - 3.0}{0.8 / \sqrt{100}}\right) = P(z \le -2.0) = .0228$

(Figure: sampling distribution of x̄ centered at 3; the critical value 2.8684 cuts off a lower tail area of α = .05, and the observed x̄ = 2.84 cuts off a lower tail area equal to the p-value, .0228)

6. Reach a decision and interpret the result

Compare the p-value with α

If p-value < α, reject H0

If p-value ≥ α, do not reject H0

Here: p-value = .0228, α = .05

Since .0228 < .05, we reject the null hypothesis
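The two numbers in this example, the critical sample mean 2.8684 and the p-value .0228, can be checked by modelling the sampling distribution of x̄ directly. This is a sketch with the slide's values; it relies on `statistics.NormalDist` from the Python standard library, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 3.0, 0.8, 100
x_bar, alpha = 2.84, 0.05

# Sampling distribution of the sample mean if mu = 3 (boundary of H0)
sampling_dist = NormalDist(mu=mu_0, sigma=sigma / sqrt(n))

# Critical value in units of the sample mean: lower tail area alpha
x_crit = sampling_dist.inv_cdf(alpha)

# p-value: probability of a sample mean of 2.84 or less under H0
p = sampling_dist.cdf(x_bar)

print(f"critical sample mean = {x_crit:.4f}")
print(f"p-value = {p:.4f}")
print("reject H0" if p < alpha else "do not reject H0")
```

Both views agree: x̄ = 2.84 is below the critical sample mean exactly when the p-value is below α.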

Two-Tailed Test

(Figure: sampling distribution with a rejection region of area α/2 = .025 in each tail, beyond the critical values ±tα/2; do not reject H0 between them)

Hypothesis Tests for Proportions

Involves categorical values

Two possible outcomes

“Success” (possesses a certain characteristic)

“Failure” (does not possess that characteristic)

Fraction or proportion of population in the “success” category is denoted by p

Sample proportion: $\bar{p} = \frac{x}{n} = \frac{\text{number of successes in sample}}{\text{sample size}}$

p̄ can be approximated by a normal distribution with mean $\mu_{\bar{p}} = p$ and standard deviation $\sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}}$

Test statistic: $z = \frac{\bar{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$

Example

A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses.

Test at the α = .05 significance level.

$z = \frac{\bar{p} - p}{\sqrt{\frac{p(1-p)}{n}}} = \frac{.05 - .08}{\sqrt{\frac{.08(1 - .08)}{500}}} = -2.47$

Critical Values: ± 1.96

(Figure: standard normal curve with a rejection region of area .025 in each tail, beyond z = -1.96 and z = 1.96; the test statistic z = -2.47 falls in the lower rejection region)

Reject H0 at α = .05

There is sufficient evidence to reject the company’s claim of 8% response rate.
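The mailing example can be checked with a short script. The sketch below reads the claim as H0: p = .08 against HA: p ≠ .08, which is an interpretation inferred from the two-sided critical values ±1.96 rather than something the slides state explicitly; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_0 = 0.08       # claimed response rate (value under H0)
n = 500          # sample size
responses = 25   # number of successes in the sample
alpha = 0.05     # level of significance

p_bar = responses / n                      # sample proportion, 0.05
std_error = sqrt(p_0 * (1 - p_0) / n)      # standard deviation of p_bar under H0
z = (p_bar - p_0) / std_error              # test statistic

# Two tailed test: alpha/2 in each tail
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit

print(f"z = {z:.2f}, critical values = ±{z_crit:.2f}")
print("reject H0" if reject else "do not reject H0")
```

The standard error uses the hypothesized proportion .08, not the sample proportion, because the test statistic is computed assuming H0 is true.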

Errors in Making Decisions

Type I error: Reject a true null hypothesis

Type II error: Fail to reject a false null hypothesis

The probability of Type I Error is α, called the level of significance of the test

The probability of Type II Error is β; 1 - β is called the power of the test, i.e. Prob(rejecting a false null hypothesis)

                    State of Nature
Decision            H0 True              H0 False
Do not reject H0    No error (1 - α)     Type II Error (β)
Reject H0           Type I Error (α)     No error (1 - β)

Key: Outcome (Probability)
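To see how α, β, and power relate numerically, the sketch below reuses the TV example (H0: μ ≥ 3, σ = 0.8, n = 100, α = .05) and computes β and the power for one specific alternative. The true mean of 2.8 is a hypothetical value chosen for illustration; the slides do not give one.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, alpha = 3.0, 0.8, 100, 0.05
mu_true = 2.8   # ASSUMPTION: hypothetical true mean under which H0 is false

std_error = sigma / sqrt(n)

# Decision rule of the lower tail test: reject H0 when x_bar < x_crit
x_crit = NormalDist(mu=mu_0, sigma=std_error).inv_cdf(alpha)

# Type I error: probability of rejecting when mu really is 3
type_1 = NormalDist(mu=mu_0, sigma=std_error).cdf(x_crit)

# Power: probability of rejecting when the true mean is mu_true
power = NormalDist(mu=mu_true, sigma=std_error).cdf(x_crit)
beta = 1 - power   # Type II error: failing to reject a false H0

print(f"alpha = {type_1:.3f}, beta = {beta:.3f}, power = {power:.3f}")
```

Under this assumed alternative the power comes out at roughly 0.80. A true mean closer to 3 would lower it, and a larger sample would raise it.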

Hypothesis Song

Thank you
