chi square (2)

29
Chi Square Up to this point, the inference to the population has been concerned with “scores” on one or more variables, such as CAT scores, mathematics achievement, and hours spent on the computer. We used these scores to make the inferences about population means. To be sure not all research questions involve score data. Today the data that we analyze consists of frequencies; that is, the number of individuals falling into categories. In other words, the variables are measured on a nominal scale. The test statistic for frequency data is Pearson Chi-Square. The magnitude of Pearson Chi-Square reflects the amount of discrepancy between observed frequencies and expected frequencies.

Upload: kevin-nyakundi

Post on 10-Apr-2016

219 views

Category:

Documents


0 download

DESCRIPTION

Introduction to Chi Square

TRANSCRIPT

Page 1: Chi Square (2)

Chi SquareUp to this point, the inference to the population has

been concerned with “scores” on one or more variables, such as CAT scores, mathematics achievement, and hours spent on the computer.

We used these scores to make the inferences about population means. To be sure not all research questions involve score data.

Today the data that we analyze consists of frequencies; that is, the number of individuals falling into categories. In other words, the variables are measured on a nominal scale.

The test statistic for frequency data is Pearson Chi-Square. The magnitude of Pearson Chi-Square reflects the amount of discrepancy between observed frequencies and expected frequencies.

Page 2: Chi Square (2)

Chi Square used when data are nominal (both IV and

DV) Comparing frequencies of distributions

occurring in different categories or groups Tests whether group distributions are

different Shoppers’ preference for the taste of 3 brands of

candy determines the association between IV and

DV by counting the frequencies of distribution Gender relative to study preference (alone or in

group)

Page 3: Chi Square (2)

Chi Square• Test statistic is based on counts that

represent the number of items that fall in each category

• For testing significance of patterns in qualitative data

• Test statistics measures the agreement between actual counts and expected counts assuming the null hypothesis

Page 4: Chi Square (2)

Chi Square1. Establish the level of significance:α2. Formulate the statistical hypothesis3. Calculate the test statistic4. Determine the degree of freedom5. Compare computed test statistic against a

tabled/critical value6. Determine the appropriate test

Page 5: Chi Square (2)

Chi Square2 tables are included in most statistics texts

and consist of columns and rows, with columns representing areas under the curve and rows associated with the degrees of freedom (df) which, for 2 tests of homogeneity and independence are: df = (r-1)(c-1).

Page 6: Chi Square (2)

Chi SquareThe chi-square distribution can be used to

see whether or not an observed counts agree with an expected counts. Let

o = observed count ande = Expected count

Page 7: Chi Square (2)

Formula

e

eo

FFF 2

2 )(

Page 8: Chi Square (2)

8

4. Calculating Test Statistics

e

eo

FFF 2

2 )(

Observed

frequencies

Expe

cted

fre

quen

cy

Expected

frequency

Page 9: Chi Square (2)

An example of a one variable problem or “Goodness of fit”Example from book:Four candidates for a school board position.

You poll a random sample of 200 voters regarding their candidate of choice. Do differences exist among the proportions of registered voters preferring each school board candidate.

Page 10: Chi Square (2)

Example continuedMartzial Breece Dunton artesaniObserved Frequency

fo = 40 fo = 62 fo = 56 fo = 42 Σfo = 200

ExpectedFrequency

fe =50 fe = 50 fe = 50 fe = 50 Σfe = 200

(fo - fe )2

fe

(40 – 50)2

50 (-10)2

50= 2.00

(62 – 50)2

50= (12)2

50= 2.88

( 56 – 50)2

50 (6)2

50= .72

(42 – 50)2

50 (-8)2

50= 1.28

χ = Σ fo – fe )2

fe

= 2.00 + 2.88 + .72 + 1.28 + 6.88

Page 11: Chi Square (2)

Example continuedThe one varialbe chi square test is quite simple.

The expected frequencies are simply the total n divided by the number of groups. In this case there were four candidates and n = 200, therefore, 50 voters go in each cell.

The observed frequencies are simply what is truly obtained in the sample.

Next you compute the discrepancies between the observed and expected frequencies divided by the expected frequencies and then sum to get your chi square statistic. In this case chi square = 6.88.

Page 12: Chi Square (2)

The computed value of the Pearson chi- square statistic is compared with the critical value to determine if the computed value is improbable

The critical tabled values are based on sampling distributions of the Pearson chi-square statistic

If calculated 2 is greater than 2 table value, reject Ho

Page 13: Chi Square (2)

Example continuedNext you compute your df.For a one variable case it is simple df = C – 1; where C = number of columns here that would be 4 – 1 = 3.Now you find your chi square critical in Table F in

the back of the book for 3 degrees of freedom and alpha is .05. The result is 7.81.

Now we see that our chi square obtained of 6.88 falls short of the chi square critical of 7.81 and we would retain our Null hypothesis. This means that all four candidates are doing equally well in the polls.

Page 14: Chi Square (2)

Chi Square for two variablesIn a two variable chi square the principal is

the same as before. The big difference is in calculating the expected frequencies for each cell. See page 349 for example.

The formula for calculating the expected frequencies is

fe = (frow ) (fcol ) nThe easiest way to learn chi square is by

example.

Page 15: Chi Square (2)

ExampleSuppose a researcher is interested in voting

preferences on gun control issues.A questionnaire was developed and sent to a

random sample of 90 voters. The researcher also collects information

about the political party membership of the sample of 90 respondents.

Page 16: Chi Square (2)

Bivariate Frequency Table or Contingency Table

Favor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90

Page 17: Chi Square (2)

17

Bivariate Frequency Table or Contingency Table

Favor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90

Obser

ved

frequ

encie

s

Page 18: Chi Square (2)

18

Bivariate Frequency Table or Contingency Table

Favor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90

Row

frequency

Page 19: Chi Square (2)

19

Bivariate Frequency Table or Contingency Table

Favor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90Column frequency

Page 20: Chi Square (2)

20

1. Determine Appropriate Test1. Party Membership ( 2 levels) and Nominal2. Voting Preference ( 3 levels) and Nominal

Page 21: Chi Square (2)

2. Establish Level of Significance

Alpha of .05

21

Page 22: Chi Square (2)

22

3. Determine The Hypothesis• Ho : There is no difference between D & R in

their opinion on gun control issue.

• Ha : There is an association between responses to the gun control survey and the party membership in the population.

Page 23: Chi Square (2)

23

4. Calculating Test Statistics

Favor Neutral Oppose f row

Democrat fo =10fe =13.9

fo =10fe =13.9

fo =30fe=22.2

50

Republican fo =15fe =11.1

fo =15fe =11.1

fo =10fe =17.8

40

f column 25 25 40 n = 90

Page 24: Chi Square (2)

24

4. Calculating Test StatisticsFavor Neutral Oppose f row

Democrat fo =10fe =13.9

fo =10fe =13.9

fo =30fe=22.2

50

Republican fo =15fe =11.1

fo =15fe =11.1

fo =10fe =17.8

40

f column 25 25 40 n = 90

= 50*25/90

Page 25: Chi Square (2)

25

4. Calculating Test StatisticsFavor Neutral Oppose f row

Democrat fo =10fe =13.9

fo =10fe =13.9

fo =30fe=22.2

50

Republican fo =15fe =11.1

fo =15fe =11.1

fo =10fe =17.8

40

f column 25 25 40 n = 90

= 40* 25/90

Page 26: Chi Square (2)

26

4. Calculating Test Statistics

8.17)8.1710(

11.11)11.1115(

11.11)11.1115(

2.22)2.2230(

89.13)89.1310(

89.13)89.1310(

222

2222

= 11.03

Page 27: Chi Square (2)

5. Determine Degrees of Freedom

•df = (R-1)(C-1)

27

Num

ber of

levels in

column

variable

Num

ber of levels in row

variable

Page 28: Chi Square (2)

5. Determine Degrees of Freedom

df = (R-1)(C-1) =(2-1)(3-1) = 2

28

Page 29: Chi Square (2)

29

6. Compare computed test statistic against a tabled/critical valueα = 0.05df = 2Critical tabled value = 5.991Test statistic, 11.03, exceeds critical valueNull hypothesis is rejectedDemocrats & Republicans differ significantly

in their opinions on gun control issues