1 outline 1.count data 2.properties of the multinomial experiment 3.testing the null hypothesis...


Upload: candace-mccoy

Post on 20-Jan-2018



Outline

1. Count data
2. Properties of the multinomial experiment
3. Testing the null hypothesis
4. Examples


Count data

• Sometimes, the data we have to analyze are produced by counting things.

– How many people choose each of Brands A, B, and C of coffee?


Count data

• Usually, we count things in a sample in order to make an inference to a population.

– E.g., are the proportions of people choosing each brand different from one another?

– Or, are the proportions of people choosing each brand different from some hypothetical values in the population?


Count data

• To answer such questions, we need to know approximately how much difference between the various counts could be produced by sampling error.

• We determine that quantity using the ‘multinomial probability distribution,’ an extension of the binomial probability distribution.


Properties of the Multinomial Experiment

1. There are n identical trials
2. There are k possible outcomes on each trial
3. The probabilities of the outcomes are the same across trials
4. Trials are all independent of each other
5. The multinomial random variables are the k values n1, n2, …, nk.
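The five properties above are what make the standard multinomial probability formula apply. As a rough sketch (the function name `multinomial_pmf` is an illustrative choice, not something from the slides), the probability of one particular set of counts n1, …, nk can be computed like this:

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Probability of observing exactly `counts` in sum(counts) independent trials."""
    n = sum(counts)
    # Multinomial coefficient n! / (n1! * n2! * ... * nk!)
    coefficient = factorial(n)
    for c in counts:
        coefficient //= factorial(c)
    # Product of p_i ** n_i over the k categories
    return coefficient * prod(p ** c for p, c in zip(probs, counts))

# With k = 2 the formula reduces to the binomial: one head in two fair flips.
print(multinomial_pmf([1, 1], [0.5, 0.5]))  # 0.5

# Chance that 60 people split exactly 20/20/20 when all three brands are equally popular.
print(multinomial_pmf([20, 20, 20], [1 / 3, 1 / 3, 1 / 3]))
```

Even when the null hypothesis is true, any single exact split is fairly unlikely, which is why the test asks how far the counts are from the expected values rather than whether they match exactly.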


Testing the null hypothesis

• We often want to test the null hypothesis that all the categories are equal in frequency.

• If we asked 60 people which of Brands A, B, and C they prefer, equal frequency would look like this:

A    B    C
20   20   20


Testing the null hypothesis

• At other times, we might want to test a specific null hypothesis, such as that B and C are equally popular, but A is twice as popular as either:

A    B    C
30   15   15

• In both cases, we call the values shown the “expected values.”
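Both sets of expected values come from the same arithmetic: split the sample size in proportion to the popularity weights stated by the null hypothesis. A minimal sketch, assuming a hypothetical helper named `expected_counts` and integer weights:

```python
from fractions import Fraction

def expected_counts(n, weights):
    """Split n observations across categories in proportion to `weights`."""
    total = sum(weights)
    return [n * Fraction(w, total) for w in weights]

equal = expected_counts(60, [1, 1, 1])     # all brands equally popular
a_twice = expected_counts(60, [2, 1, 1])   # A twice as popular as B or C

print([int(e) for e in equal])    # [20, 20, 20]
print([int(e) for e in a_twice])  # [30, 15, 15]
```

`Fraction` keeps the arithmetic exact; the `int(...)` conversion is only for display and is exact here because 60 divides evenly in both cases.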


Testing the null hypothesis

• The null hypothesis can be tested using the statistic $\chi^2$:

$$\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$

• $\chi^2$ increases as the observed values, $n_i$, get further from the expected values, $E(n_i)$.
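The statistic translates almost directly into code. The sketch below uses a hypothetical function name, `chi_square`, and made-up observed counts chosen only to show how the value grows as the counts move away from expected values of 20 each:

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

expected = [20, 20, 20]
# The observed counts below are illustrative only, not data from the slides.
print(chi_square([20, 20, 20], expected))  # 0.0
print(chi_square([22, 18, 20], expected))  # 0.4
print(chi_square([30, 10, 20], expected))  # 10.0
```

Because each difference is squared, it does not matter whether a count falls above or below its expected value; only the size of the gap contributes.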


Chi-square – example

• Suppose we want to know whether there is any population preference for brands of coffee among brands A, B, and C.

• We need to know two things:

– How should choices among the brands be distributed in a sample if there is no preference (all are equally popular)?

– How are choices distributed in our sample?


Chi-square – example

• We ask a sample of 90 people for their preference

• If there is no preference, each brand should be chosen by ⅓ of the people asked:

A    B    C
30   30   30

These are the “expected values”: the counts expected if the null hypothesis is true.


Chi-square – example

• We ask a sample of 90 people for their preference

• The actual choices look like this:

A    B    C
15   42   33

These are the “observed values.”


Expected vs. Observed Values

Expected values (each value = ⅓ × 90):

A    B    C
30   30   30

Observed values:

A    B    C
15   42   33


Chi-square – example

$$\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$

$$\chi^2 = \frac{(15-30)^2}{30} + \frac{(42-30)^2}{30} + \frac{(33-30)^2}{30} = 7.5 + 4.8 + 0.3 = 12.6$$


Chi-square – the formal hypothesis test

HO: PA = PB = PC = ⅓

HA: At least one P differs from ⅓

Test statistic:

$$\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$

where d.f. = k – 1 and k = number of categories


Chi-square – the formal hypothesis test

• Rejection region: $\chi^2_{obt} > \chi^2_{crit} = \chi^2(.05, 2) = 5.9915$

• (note: the rejection region is always $> \chi^2_{crit}$)

• Decision: since $\chi^2_{obt} = 12.6 > \chi^2_{crit} = 5.9915$, reject HO. Brands are not equally popular.
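The whole coffee test fits in a few lines. This is only a sketch: it relies on the fact that with exactly 2 degrees of freedom the upper-tail probability of the chi-square distribution is exp(-x/2), so the critical value can be written as -2 ln(α). That shortcut does not carry over to other degrees of freedom, and the variable names are illustrative.

```python
from math import log

observed = [15, 42, 33]            # brands A, B, C (from the slides)
n, k = sum(observed), len(observed)
expected = [n / k] * k             # 30 each under HO

chi2_obt = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

alpha = 0.05
# For d.f. = 2 only: P(chi-square > x) = exp(-x / 2), so x_crit = -2 ln(alpha).
chi2_crit = -2 * log(alpha)

print(round(chi2_obt, 4), round(chi2_crit, 4))   # 12.6 5.9915
print("reject HO" if chi2_obt > chi2_crit else "do not reject HO")  # reject HO
```

In practice the critical value would usually be read from a chi-square table or a statistics library; the closed form is used here just to keep the example dependency-free.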


Chi-square – Example 1

At a recent meeting of the Coin Flippers Society, each member flipped three coins simultaneously and the number of tails occurring was recorded. Shown below are the numbers of members who had certain numbers of tails. Is there evidence that the coin flipping outcomes were different from what would be expected if all the coins used were fair? (α = .01)

Number of Tails    Number of Members
0                  65
1                  182
2                  194
3                  59


Chi-square – Example 1

• Shown below are the numbers of members who had certain numbers of tails.

• Number of tails = the categories people fall into

• Number of members = number of people in each category.

• Number of members is the dependent variable. Do you see why?


Chi-square – Example 1

• To begin, we need to compute the expected values for each of the categories. That is, we need to figure out how many of our 500 members would fall into each category if all the coins used were fair.

• Wait a minute! How do we know there are 500 members?


Chi-square – Example 1

Number of Tails    Number of Members
0                  65
1                  182
2                  194
3                  59

Σ = 500


Chi-square – Example 1

• How many possible outcomes are there for one trial?

HHH
HHT
HTH
THH
HTT
THT
TTH
TTT

There are 8 possible outcomes


Chi-square – Example 1

• Of these eight possible outcomes, how many involve getting 0 tails? Just one – HHH.

• How many involve getting 1 tail? 3 – HHT, HTH, THH.

• How many involve getting 2 tails? 3 – HTT, THT, TTH.

• How many involve getting 3 tails? 1 - TTT
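The counting above can be checked by enumerating every sequence and grouping by the number of tails. This is a small sketch with illustrative variable names, assuming each of the eight sequences is equally likely (which is what "fair coins" means here):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely sequences for three fair coins: HHH, HHT, ..., TTT
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# Group the sequences by how many tails they contain
tails_counts = Counter(seq.count("T") for seq in outcomes)

# Probability of each number of tails = matching sequences / total sequences
probabilities = {
    tails: Fraction(count, len(outcomes))
    for tails, count in sorted(tails_counts.items())
}

print(len(outcomes))  # 8
for tails, p in probabilities.items():
    print(tails, p, float(p))
```

The resulting fractions (1/8, 3/8, 3/8, 1/8) are the probabilities that go into the null hypothesis for this example.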


Chi-square – Example 1

HO: P0 = .125, P1 = .375, P2 = .375, P3 = .125

HA: At least one P is different from the value specified in HO.

Test statistic:

$$\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$

Rejection region: $\chi^2_{obt} > \chi^2_{crit} = \chi^2(.01, 3) = 11.3449$


Chi-square – Example 1

Now we compute the expected values using (a) the probabilities in HO and (b) our sample n:

P0 * 500 = .125 * 500 = 62.5

P1 * 500 = .375 * 500 = 187.5

P2 * 500 = .375 * 500 = 187.5

P3 * 500 = .125 * 500 = 62.5


Chi-square – Example 1

$$\chi^2 = \frac{(65-62.5)^2}{62.5} + \frac{(182-187.5)^2}{187.5} + \frac{(194-187.5)^2}{187.5} + \frac{(59-62.5)^2}{62.5} = 0.68267$$

Decision: Do not reject HO, since 0.68267 is well below the critical value of 11.3449. There is no evidence that the coin flipping outcomes were different from what would be expected if all the coins used were fair.
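As a cross-check on the arithmetic, the sketch below recomputes the statistic and also reports an upper-tail probability. The helper `chi2_sf_df3` is a hypothetical name; it uses a closed form that holds only for exactly 3 degrees of freedom, chosen so the example needs nothing beyond the standard library.

```python
from math import erfc, exp, pi, sqrt

def chi2_sf_df3(x):
    """P(chi-square > x) for exactly 3 degrees of freedom (closed form)."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)

observed = [65, 182, 194, 59]             # members with 0, 1, 2, 3 tails
probs = [0.125, 0.375, 0.375, 0.125]      # fair-coin probabilities from HO
n = sum(observed)                         # 500
expected = [p * n for p in probs]         # 62.5, 187.5, 187.5, 62.5

chi2_obt = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi2_sf_df3(chi2_obt)

print(round(chi2_obt, 5))     # 0.68267
print(p_value > 0.01)         # True, so HO is not rejected at alpha = .01
```

Comparing the p-value with α leads to the same decision as comparing the statistic with the critical value; the two are different views of one test.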


Chi-square – Example 2

There is an “old wives’ tale” that babies don’t tend to be born randomly during the day but tend more to be born in the middle of the night, specifically between the hours of 1 AM and 5 AM. To investigate this, a researcher collects birth-time data from a large maternity hospital. The day was broken into 4 parts: Morning (5 AM to 1 PM), Mid-day (1 PM to 5 PM), Evening (5 PM to 1 AM), and Mid-night (1 AM to 5 AM). The numbers of births at these times for the last three months (January to March) are shown on the next slide.


Chi-square – Example 2

Morning      110
Mid-day       50
Evening      100
Mid-night    100

Does it appear that births are not randomly distributed throughout the day? (α = .01)


Chi-square – Example 2

• The critical thing about a chi-square question is usually the expected values. In the previous example, we computed the expected values on the basis of probabilities of various outcomes for a fair coin.

• In this question, expected values for the number of births in each segment of the day will be based on one variable: the length of each segment in hours.


Chi-square – Example 2

Morning: 5 AM to 1 PM = 8 hours

Mid-day: 1 PM to 5 PM = 4 hours

Evening: 5 PM to 1 AM = 8 hours

Mid-night: 1 AM to 5 AM = 4 hours

These periods are not all equal in length!


Chi-square – Example 2

If time of day were irrelevant to when babies are born, we would expect every period of, say, 4 hours to produce the same number of babies. Since the Morning and Evening segments each contain two 4-hour periods and the Mid-day and Mid-night segments each contain one 4-hour period, the expected proportions of births will be:

Morning    Mid-day    Evening    Mid-night
1/3        1/6        1/3        1/6


Chi-square – Example 2

Our sample totals 360 babies. In 1/6 of a day (4 hours) we would expect 360/6 = 60 babies to be born, under the null hypothesis, giving these expected values for the four segments of the day:

Morning    Mid-day    Evening    Mid-night
120        60         120        60
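The same expected values fall out of the segment lengths directly. A brief sketch (the dictionary layout and names are illustrative choices), assuming births are spread evenly over the 24 hours under the null hypothesis:

```python
from fractions import Fraction

hours = {"Morning": 8, "Mid-day": 4, "Evening": 8, "Mid-night": 4}
total_births = 360
hours_in_day = sum(hours.values())   # 24

# Under HO a segment's share of births equals its share of the day
proportions = {seg: Fraction(h, hours_in_day) for seg, h in hours.items()}

# int() is exact here because 360 is divisible by 6
expected = {seg: int(p * total_births) for seg, p in proportions.items()}

print(expected)
# {'Morning': 120, 'Mid-day': 60, 'Evening': 120, 'Mid-night': 60}
```

If the segments had been cut at different hours, only the `hours` dictionary would need to change; the rest of the calculation stays the same.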


Chi-square – Example 2

HO: Pmorn = 1/3, Pmidday = 1/6, Peven = 1/3, Pmidnight = 1/6

HA: At least one P is different from the value specified in HO.

Test statistic:

$$\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$

Rejection region: $\chi^2_{obt} > \chi^2_{crit} = \chi^2(.01, 3) = 11.3449$


Chi-square – Example 2

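$$\chi^2_{obt} = \frac{(110-120)^2}{120} + \frac{(50-60)^2}{60} + \frac{(100-120)^2}{120} + \frac{(100-60)^2}{60}$$

$$= \frac{100}{120} + \frac{100}{60} + \frac{400}{120} + \frac{1600}{60}$$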


Chi-square – Example 2

$\chi^2_{obt} = 32.50$

Decision: Reject HO. Births are not randomly scattered throughout the day.
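Listing the four terms separately shows where the large statistic comes from. This sketch again uses the closed-form upper-tail probability that is valid only for 3 degrees of freedom; the variable names are illustrative.

```python
from math import erfc, exp, pi, sqrt

observed = [110, 50, 100, 100]   # Morning, Mid-day, Evening, Mid-night
expected = [120, 60, 120, 60]    # from segment lengths of 8, 4, 8, 4 hours

# One term per segment: (observed - expected)^2 / expected
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2_obt = sum(terms)

# Closed-form upper-tail probability, valid only for d.f. = 3
p_value = erfc(sqrt(chi2_obt / 2)) + sqrt(2 * chi2_obt / pi) * exp(-chi2_obt / 2)

print([round(t, 2) for t in terms])   # [0.83, 1.67, 3.33, 26.67]
print(round(chi2_obt, 2))             # 32.5
print(p_value < 0.01)                 # True, so HO is rejected at alpha = .01
```

Most of the statistic comes from the Mid-night segment, where 100 births were observed against 60 expected. The overall test only says that the distribution departs from the null hypothesis, but this breakdown is in line with the tale that motivated the study.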