brm chi square test
Chi-Square Test
Introduction
• The chi-square test is a nonparametric test
• It makes no assumption about the distribution of the population
• It uses data measured at the nominal scale
In the chi-square test we compare the observed frequency of each category with its expected frequency.
The comparison of observed and expected frequencies is used to calculate the value of the chi-square statistic.
The symbol for chi-square is χ², and the formula is as follows:
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where
• O is the observed frequency, and
• E is the expected frequency.
How would you test whether absenteeism is evenly distributed across the week?
The HR Manager at Georgetown Paper Ltd. is concerned about absenteeism among workers. He decides to sample the company records to determine whether absenteeism is distributed evenly throughout the six-day work week.
Week Day    Number Absent
Mon         12
Tue         9
Wed         11
Thu         10
Fri         9
Sat         9
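The slide poses the absenteeism question without working it through. A minimal sketch of the calculation follows, using only the six counts in the table. The alpha level of .05 is an assumption (the slide does not state one for this example), and 11.070 is the tabled critical value for df = 5 at that level.

```python
# Observed absences per weekday, taken from the table above.
absences = {"Mon": 12, "Tue": 9, "Wed": 11, "Thu": 10, "Fri": 9, "Sat": 9}

total = sum(absences.values())        # 60 absences in the sample
expected = total / len(absences)      # 10 per day if absenteeism is evenly spread

# Chi-square statistic: sum of (O - E)^2 / E over the six days.
chi_square = sum((o - expected) ** 2 / expected for o in absences.values())

df = len(absences) - 1                # 6 categories - 1 = 5
critical_value = 11.070               # tabled value for df = 5, alpha = .05 (assumed alpha)

reject_h0 = chi_square >= critical_value
print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject_h0}")
```

Under these assumptions the statistic is 0.8, far below the critical value, so the sample gives no evidence that absenteeism is unevenly distributed across the week.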
Chi-Square Test (goodness-of-fit test)
Suppose students who are required to purchase a computer for college use can select Apple computers, IBM computers, or some other brand of computer. We want to know whether there is a significant difference among the frequencies with which these three brands are selected, or whether the students select equally among the three brands. The data for 100 students are recorded in the table.
Computer    No. of people who preferred it
IBM         47
Apple       36
Other       17
Frequency Table
Frequency with which students select computer brand
Computer    Observed Frequency    Expected Frequency    (O-E)²/E
IBM         47                    33.333                5.604
Apple       36                    33.333                0.213
Other       17                    33.333                8.003
Total (chi-square)                                      13.820
From the table we can see that: χ² = 13.820
Equal expected frequencies
This table also indicates the expected frequency for each category. Since there are 100 observations and three categories (Apple, IBM, and Other), the expected frequency for each category is 100/3, or 33.333.
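The frequency table can be reproduced with a short sketch like the one below. It uses the observed counts from the slide; the variable names are illustrative only.

```python
# Observed brand choices for the 100 students (from the frequency table).
observed = {"IBM": 47, "Apple": 36, "Other": 17}

n = sum(observed.values())            # 100 observations
expected = n / len(observed)          # equal expected frequency: 100 / 3

# Contribution of each brand to the statistic: (O - E)^2 / E.
contributions = {
    brand: (count - expected) ** 2 / expected
    for brand, count in observed.items()
}
chi_square = sum(contributions.values())

for brand, value in contributions.items():
    print(f"{brand:<6} O = {observed[brand]:>2}  E = {expected:.3f}  (O-E)^2/E = {value:.3f}")
print(f"chi-square = {chi_square:.3f}")
```

The total agrees with the table's 13.820. The individual contributions can differ from the table in the third decimal place, since the table appears to round the expected frequency to 33.333 before squaring.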
Critical Value for Chi-Square
The degrees of freedom
The degrees of freedom (df) determine which chi-square distribution is used to find the critical value.
Degrees of freedom (df) = C - 1, where C is the number of categories
Null hypothesis:
• There are no differences between the observed and the expected frequencies.
Alternate hypothesis:
• There are significant differences between the observed and the expected frequencies.
Set the alpha level.
• Alpha level at .05
• Find the critical value for the degrees of freedom and the alpha level
df = C - 1 = 3 - 1 = 2
For df = 2 and alpha = .05, the critical value is 5.991
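The critical value is normally read from a chi-square table, but for the special case df = 2 it can be checked by hand. A sketch, assuming only the standard fact that the upper-tail probability of a chi-square variable with 2 degrees of freedom is exp(-x/2); the function name is hypothetical and the shortcut does not apply to other df.

```python
import math


def critical_value_df2(alpha: float) -> float:
    """Critical value of the chi-square distribution with 2 degrees of freedom.

    For df = 2 the upper-tail probability is exp(-x / 2), so solving
    exp(-x / 2) = alpha gives x = -2 * ln(alpha). Valid only for df = 2.
    """
    return -2.0 * math.log(alpha)


categories = 3                        # IBM, Apple, Other
df = categories - 1                   # C - 1 = 2
critical = critical_value_df2(0.05)
print(f"df = {df}, critical value = {critical:.3f}")
```

This reproduces the tabled value of 5.991 for alpha = .05.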
• Write the decision rule for rejecting the null hypothesis.
Reject H0 if χ² ≥ 5.991.
Write a summary statement based on the decision.
Since our calculated value (13.820) is greater than 5.991, we reject the null hypothesis in favor of the alternative hypothesis: students do not select equally among the three brands.
Chi-Square (goodness-of-fit test) with equal expected frequencies
Now suppose the expected frequencies are not equal. In a national study, students required to buy computers for college use bought IBM computers 50% of the time, Apple computers 25% of the time, and other computers 25% of the time. A survey of 100 new students shows that 36 bought Apple computers, 47 bought IBM computers, and 17 bought some other brand of computer. We want to know whether these frequencies of computer-buying behavior are similar to or different from the national study data.
• Null hypothesis: there are no differences between the observed and the expected frequencies.
• Alternate hypothesis: there are significant differences between the observed and expected frequencies.
• Set the alpha level: alpha is .05
Computer    Observed Frequency    Expected Frequency    (O-E)²/E
IBM         47                    50                    0.18
Apple       36                    25                    4.84
Other       17                    25                    2.56
Total (chi-square)                                      7.58
From the table we can see that: χ² = 7.58
Chi-Square (goodness-of-fit test) with unequal expected frequencies
• Find the critical value for the degrees of freedom and the alpha level
df = (C - 1) = 2
For df = 2 and alpha = .05, the critical value is 5.991
• Write the decision rule for rejecting the null hypothesis.
Reject H0 if χ² ≥ 5.991.
Write a summary statement based on the decision.
Since our calculated value (7.58) is greater than 5.991, we reject the null hypothesis in favor of the alternative hypothesis: the buying behavior of the surveyed students differs from the national study data.
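The unequal-frequency case differs from the earlier example only in how the expected frequencies are obtained. A minimal sketch, taking the national shares and observed counts from the slide; the variable names are illustrative.

```python
# Observed purchases among the 100 surveyed students.
observed = {"IBM": 47, "Apple": 36, "Other": 17}
# Shares reported by the national study.
national_share = {"IBM": 0.50, "Apple": 0.25, "Other": 0.25}

n = sum(observed.values())
# Expected frequency = national share x sample size (50, 25, 25).
expected = {brand: share * n for brand, share in national_share.items()}

chi_square = sum(
    (observed[brand] - expected[brand]) ** 2 / expected[brand]
    for brand in observed
)

critical_value = 5.991                # df = 2, alpha = .05
reject_h0 = chi_square >= critical_value
print(f"chi-square = {chi_square:.2f}, reject H0: {reject_h0}")
```

The result matches the table total of 7.58, which exceeds 5.991, so H0 is rejected.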
Chi-Square Test: Test of Independence
We want to know whether there is a significant difference between the frequencies with which males and females come from small, medium, or large cities. The two variables we are considering here are hometown size (small, medium, or large) and gender (male or female). Another way of putting our research question is: Is gender independent of size of hometown?
The data for 30 females and 6 males are in the following table.
Frequency with which males and females come from small, medium, and large cities
          Small    Medium    Large    Totals
Female    10       14        6        30
Male      4        1         1        6
Totals    14       15        7        36
The formula for chi-square:
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where
O is the observed frequency, and
E is the expected frequency.
• The degrees of freedom for the two-dimensional chi-square statistic are:
df = (C - 1)(R - 1)
• where C is the number of columns (levels of the first variable) and R is the number of rows (levels of the second variable).
• In the table above we have the observed frequencies. Now we must calculate the expected frequency for each of the six cells. For the two-variable chi-square we find the expected frequencies with the formula:
Expected Frequency for a Cell = (Column Total × Row Total) / Grand Total
• In the table above we can see that the Column Totals are 14 (small), 15 (medium), and 7 (large), while the Row Totals are 30 (female) and 6 (male). The grand total is 36.
• Using the formula we can thus find the expected frequency for each cell.
• The expected frequency for the small female cell is 14 × 30 / 36 = 11.667
• The expected frequency for the medium female cell is 15 × 30 / 36 = 12.500
• The expected frequency for the large female cell is 7 × 30 / 36 = 5.833
• The expected frequency for the small male cell is 14 × 6 / 36 = 2.333
• The expected frequency for the medium male cell is 15 × 6 / 36 = 2.500
• The expected frequency for the large male cell is 7 × 6 / 36 = 1.167
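The six expected frequencies listed above can be generated from the row and column totals. A small sketch, with the observed counts taken from the slide's table and the list layout chosen only for illustration:

```python
# Observed counts: rows are Female, Male; columns are Small, Medium, Large.
observed = [
    [10, 14, 6],   # Female
    [4, 1, 1],     # Male
]

row_totals = [sum(row) for row in observed]            # [30, 6]
col_totals = [sum(col) for col in zip(*observed)]      # [14, 15, 7]
grand_total = sum(row_totals)                          # 36

# Expected frequency for a cell = (column total x row total) / grand total.
expected = [
    [row_total * col_total / grand_total for col_total in col_totals]
    for row_total in row_totals
]

for label, row in zip(["Female", "Male"], expected):
    print(label, [round(value, 3) for value in row])
```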
We can put these expected frequencies in our table and also include the values for (O - E)²/E. The sum of all these will of course be the value of chi-square.
Observed frequencies, expected frequencies, and (O - E)²/E for males and females from small, medium, and large cities
          Small                           Medium                          Large                           Totals
          Observed  Expected  (O-E)²/E    Observed  Expected  (O-E)²/E    Observed  Expected  (O-E)²/E
Female    10        11.667    0.238       14        12.500    0.180       6         5.833     0.005       30
Male      4         2.333     1.191       1         2.500     0.900       1         1.167     0.024       6
Totals    14                              15                              7                               36
From the table we can see that: χ² = 2.538
and df = (C - 1)(R - 1) = (3 - 1)(2 - 1) = (2)(1) = 2
State the null hypothesis and the alternative hypothesis.
Set the alpha level.
Calculate the value of the appropriate statistic. Also indicate the degrees of freedom:
df = (C - 1)(R - 1) = (2)(1) = 2; at this df and α = .05 the critical value is 5.991
Write the decision rule for rejecting the null hypothesis.
Reject H0 if χ² ≥ 5.991.
To write the decision rule, we find the critical value by looking at the chi-square table and noting the tabled value in the column for the .05 level and the row for 2 df.
Write a summary statement based on the decision.
Fail to reject H0
Note: Since our calculated value (2.538) is not greater than 5.991, we fail to reject the null hypothesis: the data do not show that gender and size of hometown are related.
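The whole test of independence can be sketched end to end as follows. The counts and the critical value of 5.991 come from the slides; the dictionary layout and names are assumptions made for illustration.

```python
# Observed counts by gender and hometown size (from the slide's table).
observed = {
    "Female": {"Small": 10, "Medium": 14, "Large": 6},
    "Male": {"Small": 4, "Medium": 1, "Large": 1},
}
sizes = ["Small", "Medium", "Large"]

row_totals = {gender: sum(cells.values()) for gender, cells in observed.items()}
col_totals = {size: sum(observed[gender][size] for gender in observed) for size in sizes}
grand_total = sum(row_totals.values())

# Accumulate (O - E)^2 / E over all six cells.
chi_square = 0.0
for gender in observed:
    for size in sizes:
        e = row_totals[gender] * col_totals[size] / grand_total
        chi_square += (observed[gender][size] - e) ** 2 / e

df = (len(sizes) - 1) * (len(observed) - 1)   # (C - 1)(R - 1) = 2
critical_value = 5.991                        # df = 2, alpha = .05
reject_h0 = chi_square >= critical_value
print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject_h0}")
```

With unrounded expected frequencies the statistic comes out near 2.537; the slide's 2.538 appears to reflect rounding the expected frequencies to three decimals first. Either way the value is well below 5.991, so H0 is not rejected.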
Example
Ms. Jan Kilpatrick is the marketing manager for a manufacturer of sports cards. She plans to begin selling a series of cards with pictures and playing statistics of former Major League Baseball players. One of the problems is the selection of the former players. At a baseball card show at South Mall last weekend, she set up a booth and offered cards of the following six Hall of Fame baseball players: Tom Seaver, Nolan Ryan, Ty Cobb, George Brett, Hank Aaron, and Johnny Bench. At the end of the day she sold a total of 120 cards. The number of cards sold for each old-time player is shown in the table on the right. Can she conclude the sales are not the same for each player?
Step 1: State the null hypothesis and the alternate hypothesis.
H0: there is no difference between the observed frequencies (fo) and the expected frequencies (fe)
H1: there is a difference between fo and fe
Step 2: Select the level of significance.
α = 0.05 as stated in the problem
Step 3: Select the test statistic.
The test statistic follows the chi-square distribution, designated as χ²
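Step 4: Formulate the decision rule.
Reject H0 if $\chi^2 > \chi^2_{\alpha,\,k-1}$, where k is the number of categories:
$\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] > \chi^2_{\alpha,\,k-1} = \chi^2_{.05,\,6-1} = \chi^2_{.05,\,5} = 11.070$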
Step 5: Compute the value of the chi-square statistic and make a decision
$\chi^2 = \sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] = 34.40$
The computed χ² of 34.40 is in the rejection region, beyond the critical value of 11.070. The decision, therefore, is to reject H0 at the .05 level.
Conclusion: The difference between the observed and the expected frequencies is not due to chance. Rather, the differences between fo and fe are large enough to be considered significant. It is unlikely that card sales are the same among the six players.
Contingency Analysis - Example
Step 1: State the null hypothesis and the alternate hypothesis.
H0: There is no relationship between adjustment to civilian life and where the individual lives after being released from prison.
H1: There is a relationship between adjustment to civilian life and where the individual lives after being released from prison.
Step 2: Select the level of significance.
α = 0.01 as stated in the problem
Step 3: Select the test statistic.
The test statistic follows the chi-square distribution, designated as χ²
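Step 4: Formulate the decision rule.
Reject H0 if $\chi^2 > \chi^2_{\alpha,\,(r-1)(c-1)}$, where r is the number of rows and c is the number of columns:
$\sum \left[ \frac{(f_o - f_e)^2}{f_e} \right] > \chi^2_{\alpha,\,(r-1)(c-1)} = \chi^2_{.01,\,(2-1)(4-1)} = \chi^2_{.01,\,(1)(3)} = \chi^2_{.01,\,3} = 11.345$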
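Computing Expected Frequencies (fe)
$f_e = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$, for example $\frac{(120)(50)}{200} = 30$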
Computing the Chi-square Statistic
Conclusion
χ² = 5.729
The computed χ² of 5.729 is in the “Do not reject H0” region. The null hypothesis is not rejected at the .01 significance level.
We conclude there is no evidence of a relationship between adjustment to civilian life and where the prisoner resides after being released from prison. For the Federal Correction Agency’s advisement program, the data give no evidence that adjustment to civilian life is related to where the ex-prisoner lives.
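The critical values quoted in these examples (5.991, 11.070, and 11.345) come from a printed chi-square table. As a cross-check, the sketch below evaluates the chi-square cumulative distribution with a standard power series for the regularized lower incomplete gamma function. The function name is hypothetical, and this is an illustrative check rather than part of the original procedure.

```python
import math


def chi_square_cdf(x: float, df: int) -> float:
    """P(X <= x) for a chi-square variable with `df` degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma
    function P(a, z) with a = df / 2 and z = x / 2.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a                    # n = 0 term of the series
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)           # next term: z^n / (a (a+1) ... (a+n))
        total += term
        n += 1
    # Prefactor z^a * e^(-z) / Gamma(a), computed in log space.
    log_prefactor = a * math.log(z) - z - math.lgamma(a)
    return total * math.exp(log_prefactor)


# Each tabled critical value should leave roughly alpha in the upper tail.
for critical, df, alpha in [(5.991, 2, 0.05), (11.070, 5, 0.05), (11.345, 3, 0.01)]:
    upper_tail = 1.0 - chi_square_cdf(critical, df)
    print(f"df = {df}: P(chi-square > {critical}) = {upper_tail:.4f} (table alpha = {alpha})")

# Upper-tail probabilities of the two computed statistics from the step-by-step examples.
card_sales_tail = 1.0 - chi_square_cdf(34.40, 5)     # baseball card example, alpha = .05
adjustment_tail = 1.0 - chi_square_cdf(5.729, 3)     # contingency example, alpha = .01
print(f"card sales: reject H0 = {card_sales_tail < 0.05}")
print(f"adjustment: reject H0 = {adjustment_tail < 0.01}")
```

Comparing the upper-tail probability with alpha leads to the same decisions as comparing the statistic with the critical value: reject H0 for the card sales, do not reject H0 for the adjustment example.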