hypothesis testing

Transform to the power of digital

Hypothesis TestingDecember 2014

Copyright © 2013 Capgemini Consulting. All rights reserved.

Theory vs Hypothesis

2


A claim(assumption) about a population parameter

What is a hypothesis?

3

– population mean

– population proportion

Example: The mean monthly cell phone bill of this city is = $42

Example: The proportion of adults in this city with cell phones is p = .68


States the assumption (numerical) to be tested

Example: The average number of TV sets in U.S. Homes is at least 3

( )

Is always about a population parameter, not about a sample statistic



4

Null Hypothesis(H0)

3μ:H0

3μ:H0 3x:H0


States the assumption (numerical) to be tested

Example: The average number of TV sets in U.S. Homes is at least 3

( )

Is always about a population parameter, not about a sample statistic

The opposite of the null hypothesis

Example: The average number of TV sets in U.S. homes is less than 3 ( HA: < 3 )



5

Null Hypothesis(H0)

3μ:H0

3μ:H0 3x:H0

Alternative Hypothesis(HA)


CLAIM: The population mean age is 50H0: = 50

Hypothesis Testing Process

6

Population Sample

Select a random sample




Suppose the sample mean age is 20x = 20

Population Sample





8

Suppose the sample mean age is 20x = 20

Population Sample

Is x=20 likely if = 50?




9



Reject the null hypothesis


10


NO


Do not reject the null hypothesis

Reject the null hypothesis


11


NO

YES


Reason for rejecting H0

12

Sampling Distribution of x

If it is unlikely that we would get a sample mean of this value ...

20x



13


= 50If H0 is true


20

... if in fact this were the population mean…

x



14


= 50If H0 is true


... then we reject the null hypothesis that

= 50.

20

... if in fact this were the population mean…

x


Level of significance,

15

Defines unlikely values of sample statistic if null hypothesis is true

Defines rejection region of the sampling distribution



16




Rejection region is shaded


17



H0: μ ≥ 3

HA: μ < 3 0

H0: μ ≤ 3

HA: μ > 3

a

a

Represents critical value

Lower tail test

0Upper tail test


Rejection region is shaded


18



H0: μ ≥ 3

HA: μ < 3 0

H0: μ ≤ 3

HA: μ > 3

H0: μ = 3

HA: μ ≠ 3

a

a

/2

Represents critical value

Lower tail test

0

0

a /2a

Upper tail test

Two tailed test


Critical Value Approach to Testing

19

Convert sample statistic (e.g.: x ) to test statistic ( Z or t statistic )



20


n

σμx

z

n

sμx

t 1n

Known Unknown

s : population standard deviations : sample standard deviation

n : sample sizeμ : assumed population mean

x : sample mean



21


Determine the critical value(s) for a specified level of significance from a table or computer



22


Determine the critical value(s) for a specified level of significance from a table or computer

If the test statistic falls in the rejection region, reject H0 ; otherwise do not reject H0


Example

23

Test the claim that the true mean # of

TV sets in US homes is at least 3

(Assume σ = 0.8)


Example

24



(Assume σ = 0.8)

Specify the population value of interest1

The mean number of TVs in US homes


Example

25



(Assume σ = 0.8)


Formulate the appropriate null and alternative hypotheses2

H0: μ 3

HA: μ < 3 (This is a lower tail test)


Example

26



(Assume σ = 0.8)

Specify the desired level of significance3Suppose that = .05 is chosen for this test




Example

27



(Assume σ = 0.8)

Determine the rejection region4



Specify the desired level of significance3


Example

28



(Assume σ = 0.8)


Reject H0 Do not reject H0

= .05

-zα= -1.645

This is a one-tailed test with =.05.

Since σ is known, the cutoff value is a z value

Reject H0 if z < z = -1.645 ;

otherwise do not reject H0

z


Example

29



(Assume σ = 0.8)

Obtain sample evidence and compute the test statistic5

Suppose a sample is taken with the following results:

n = 100, x = 2.84

( = 0.8 is assumed known)

2.0.08

.16

100

0.832.84

n

σμx

z


Example



(Assume σ = 0.8)

Reach a decision and interpret the result6


= .05

-1.645 0

-2.0

z


Example

31



(Assume σ = 0.8)



= .05

-1.645 0

-2.0

z

Since z = -2.0 < -1.645, we reject the null hypothesis that the mean number of TVs in US homes is at least 3


Example

32



(Assume σ = 0.8)



= .05

-1.645 0

-2.0

z

Since z = -2.0 < -1.645, we reject the null hypothesis that the mean number of TVs in US homes is at least 3


Steps to Follow

33



Steps to Follow

34




Steps to Follow

35





Steps to Follow

36






Steps to Follow

37







Steps to Follow

38








Break

39


p – Value Approach to Testing

40

p-value: Probability of obtaining a test statistic more extreme ( ≤ or ) than the observed sample value given H0 is true



41



Same as before



42



Obtain the p-value from a table or computer



43



Obtain the p-value from a table or computer

Compare the p-value with

If p-value < , reject H0

If p-value , do not reject H0


Example

44



(Assume σ = 0.8)






Example

45



(Assume σ = 0.8)


Suppose a sample is taken with the following results:

n = 100, x = 2.84

( = 0.8 is assumed known)

2.8684100

0.81.6453

n

σzμx αα zα= 1.645


Example



(Assume σ = 0.8)


p-value =.0228

= .05

2.8684 3

2.84

x

.02282.0)P(z

1000.8

3.02.84zP

3.0)μ|2.84xP(


Here: p-value = .0228 = .05

Since .0228 < .05, we reject the null hypothesis

Example

47



(Assume σ = 0.8)

Reach a decision and interpret the result6Compare the p-value with

If p-value < , reject H0

If p-value , do not reject H0

p-value =.0228

= .05

2.8684 3

2.84


Two-Tailed Test

48

Reject H0Reject H0

a/2=.025

Do not reject H0

tα/2

a/2=.025


Hypothesis Tests for Proportions

49

Involves categorical values



50


Two possible outcomes

“Success” (possesses a certain characteristic)

“Failure” (does not possesses that characteristic)



51





Fraction or proportion of population in the “success” category is denoted by p

sizesample

sampleinsuccessesofnumber

n

xp



52






sizesample


n

xp

p can be approximated by a normal distribution with mean

and standard deviation pμP

n

p)p(1σ

p



53






sizesample


n

xp

p can be approximated by a normal distribution with mean

and standard deviation pμP

n

p)p(1σ

p

n)p(p

ppz

1


Example

54

A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses.

Test at the = .05 significance level.


Example

55



2.47

500.08).08(1

.08.05

np)p(1

ppz


Example

56



2.47

500.08).08(1

.08.05

np)p(1

ppz

Critical Values: ± 1.96

z0

Reject Reject

.025.025

1.96-2.47

-1.96


Example

57



2.47

500.08).08(1

.08.05

np)p(1

ppz

Critical Values: ± 1.96

z0

Reject Reject

.025.025

1.96-2.47

-1.96

Reject H0 at = .05

There is sufficient evidence to reject the company’s claim of 8% response rate.


Errors in Making Decisions

58

Reject a true null hypothesisType I error

Fail to reject a false null hypothesisType II error

The probability of Type I Error is Called level of significance of the test

The probability of Type I Error is β1- β is called the power of the test, i.e. Prob(rejecting the false null hypothesis)


Errors in Making Decisions

59

Reject a true null hypothesisType I error

Fail to reject a false null hypothesisType II error

The probability of Type I Error is Called level of significance of the test

The probability of Type I Error is β1- β is called the power of the test, i.e. Prob(rejecting the false null hypothesis)

H0 True H0 False

Reject No error (1 - α)

Type II Error ( β )

Do not reject Type I Error(α)

No Error ( 1 - β )

Key:Outcome

(Probability)

State of Nature


Hypothesis Song

60

Transform to the power of digital

Thank you

hypothesis testing

Business

capgemini consulting

null hypothesish0 copyright

random sample copyright

notion of innocent

sample mean age

proven guiltyrefers

status quoalways

population parameterwhat