Chi-Square
Non-parametric (distribution-free) test
Nominal-level dependent measure
Categorical Variables
• Generally the count of objects falling in each of several categories.
• Examples:
    number of fraternity, sorority, and nonaffiliated members of a class
    number of students choosing answers 1, 2, 3, 4, or 5
• Emphasis on frequency in each category
One-way Classification
• Observations sorted on only one dimension
• Example: Observe children and count red, green, yellow, or orange Jello choices
    Are these colors chosen equally often, or is there a preference for one over the others?
One-way--cont.
• Want to compare observed frequencies with frequencies predicted by the null hypothesis
• Chi-square test used to compare expected and observed frequencies
    Called goodness-of-fit chi-square (χ²)
Goodness-of-Fit Chi-square
• Fombonne (1989): Season of birth and childhood psychosis
• Are children born at particular times of year more likely to be diagnosed with childhood psychosis?
• He knew the % of normal children born in each month
    e.g., 8.4% of normal children are born in January
Fombonne's Data
Chi-Square (χ²)
• Compare Observed (O) with Expected (E)
• Take size of E into account
    With large E, a large (O − E) is not unusual.
    With small E, a large (O − E) is unusual.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
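Dividing each squared deviation by E is what makes the same raw difference count for more when E is small. A minimal Python sketch of that idea follows; the helper name `cell_contribution` and the example counts are illustrative assumptions, not taken from the slides.

```python
def cell_contribution(observed, expected):
    """One term of the chi-square sum: (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected

# Same raw deviation (O - E = 10), very different expected frequencies.
large_e = cell_contribution(1010, 1000)  # deviation is small relative to E
small_e = cell_contribution(20, 10)      # deviation is large relative to E

print(large_e, small_e)
```

The cell with the small expected frequency contributes far more to the total, even though O − E is identical in both cases.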
Calculation of χ²
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(13 - 17.47)^2}{17.47} + \frac{(12 - 16.43)^2}{16.43} + \cdots + \frac{(28 - 16.64)^2}{16.64} = 14.58$$
$$\chi^2_{.05}(11) = 19.68$$
Abbreviated Chi-Square Table (upper-tail critical values)
df    α = .25     .10      .05      .025     .01
 1      1.32     2.71     3.84     5.02     6.63
 2      2.77     4.61     5.99     7.38     9.21
 3      4.11     6.25     7.82     9.35    11.35
 4      5.39     7.78     9.49    11.14    13.28
 5      6.63     9.24    11.07    12.83    15.09
 6      7.84    10.64    12.59    14.45    16.81
 …       …        …        …        …        …
11     13.70    17.28    19.68    21.92    24.72
Conclusions
• Obtained χ² = 14.58
• df = c − 1, where c = # categories
• Critical value of χ² on 11 df = 19.68
• Since 19.68 > 14.58, do not reject H0
• Conclude that there is no evidence that the birth-month distribution of children with psychoses differs from that of normal children.
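The decision rule used here can be sketched in a few lines of Python. The function name `goodness_of_fit_decision` is a hypothetical helper, and the dictionary holds only the α = .05 critical values that this deck quotes; it is a sketch under those assumptions, not a general-purpose table.

```python
# Critical values at alpha = .05 quoted in this deck (df -> critical chi-square).
CRITICAL_05 = {1: 3.84, 3: 7.82, 11: 19.68}

def goodness_of_fit_decision(chi_square, n_categories, critical_values=CRITICAL_05):
    """Return (df, critical value, reject H0?) for a goodness-of-fit test."""
    df = n_categories - 1                 # df = c - 1
    critical = critical_values[df]
    reject = chi_square > critical        # reject only if obtained exceeds critical
    return df, critical, reject

# Fombonne example: 12 birth months, obtained chi-square of 14.58.
df, critical, reject = goodness_of_fit_decision(14.58, n_categories=12)
print(df, critical, reject)
```

With 12 months the test has 11 df, and because 14.58 does not exceed 19.68 the null hypothesis is retained.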
Jello Choices
        Red    Green    Yellow    Orange
        35     20       25        20
• Is there a significant preference for one color of jello over the other colors?
        Red    Green    Yellow    Orange
  O:    35     20       25        20
  E:    25     25       25        25
$$\chi^2 = \frac{(35 - 25)^2}{25} + \frac{(20 - 25)^2}{25} + \frac{(25 - 25)^2}{25} + \frac{(20 - 25)^2}{25} = 6$$
No jello color was chosen significantly more often than any other jello color, χ²(3, N = 100) = 6, p > .05
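The jello calculation can be reproduced with a short Python sketch. The function name `goodness_of_fit` is an assumption for illustration; when no expected frequencies are supplied it assumes the null hypothesis of equal preference, so each E is N divided by the number of categories.

```python
def goodness_of_fit(observed, expected=None):
    """Goodness-of-fit chi-square: sum of (O - E)^2 / E over categories."""
    n = sum(observed)
    if expected is None:
        # Null hypothesis of equal preference: every category expects N / c.
        expected = [n / len(observed)] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

jello = {"Red": 35, "Green": 20, "Yellow": 25, "Orange": 20}
chi_square = goodness_of_fit(list(jello.values()))
df = len(jello) - 1

print(chi_square, df)
```

The result is then compared with the critical value on 3 df (7.82 at α = .05), which it does not exceed.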
Contingency Tables
• Two independent variables
    Are men happier than women?
• Male vs. Female × Happy vs. Not Happy
    Intimacy (Yes/No) × Depression/Nondepression
Intimacy and Depression
• Everitt & Smith (1979)
• Asked depressed and non-depressed women about intimacy with boyfriend/husband
• Data on next slide
Data
Chi-Square on Contingency Table
• Same formula
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
• Expected frequencies
$$E = \frac{RT \times CT}{GT}$$
    RT = Row total, CT = Column total, GT = Grand total
Expected Frequencies
• E11 = 37 × 138 / 419 = 12.19
• E12 = 37 × 281 / 419 = 24.81
• E21 = 382 × 138 / 419 = 125.81
• E22 = 382 × 281 / 419 = 256.19
• Enter on following table
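The rule E = RT × CT / GT generalizes to any table. The Python sketch below applies it to the observed counts used in this example's chi-square calculation (26, 11, 112, 270); the function name `expected_frequencies` and the row/column layout are illustrative assumptions.

```python
def expected_frequencies(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

# Rows: depressed, non-depressed. Columns: lack of intimacy yes, no.
observed = [[26, 11],
            [112, 270]]
expected = expected_frequencies(observed)

for row in expected:
    print([round(e, 2) for e in row])
```

Rounded to two decimals, the four values agree with E11 through E22 as listed.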
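Observed and Expected Freq.
(expected frequencies in parentheses)
                   Lack of Intimacy
                   Yes              No               Total
Depressed          26 (12.19)       11 (24.81)        37
Non-depressed      112 (125.81)     270 (256.19)     382
Total              138              281              419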
Chi-Square Calculation
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(26 - 12.19)^2}{12.19} + \frac{(11 - 24.81)^2}{24.81} + \frac{(112 - 125.81)^2}{125.81} + \frac{(270 - 256.19)^2}{256.19} = 25.61$$
$$\chi^2_{.05}(1) = 3.84$$
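As a check on the arithmetic, here is a minimal Python sketch of the whole test for a contingency table. The function name `chi_square_independence` is hypothetical; the critical value 3.84 is the one quoted for 1 df at α = .05.

```python
def chi_square_independence(observed):
    """Pearson chi-square and df for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # E = RT * CT / GT
            chi_square += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square, df

chi_square, df = chi_square_independence([[26, 11], [112, 270]])
print(round(chi_square, 2), df, chi_square > 3.84)
```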
Degrees of Freedom
• For contingency table, df = (R − 1)(C − 1)
• For our example this is (2 − 1)(2 − 1) = 1
    Note that knowing any one cell and the marginal totals, you could reconstruct all other cells.
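The point about reconstructing cells is what "1 degree of freedom" means in a 2 × 2 table. A small Python sketch, using the marginal totals from this example and a hypothetical helper `fill_2x2`, shows that one cell plus the margins fixes the rest:

```python
def fill_2x2(cell_11, row_totals, col_totals):
    """Rebuild a 2x2 table from its top-left cell and the marginal totals."""
    cell_12 = row_totals[0] - cell_11   # rest of row 1
    cell_21 = col_totals[0] - cell_11   # rest of column 1
    cell_22 = row_totals[1] - cell_21   # rest of row 2
    return [[cell_11, cell_12], [cell_21, cell_22]]

table = fill_2x2(26, row_totals=(37, 382), col_totals=(138, 281))
print(table)
```

Only one cell is free to vary; the other three follow by subtraction.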
Conclusions
• Since 25.61 > 3.84, reject H0
• Conclude that depression and intimacy are not independent.
    How one responds to "satisfaction with intimacy" depends on whether they are depressed.
    Could be depression --> dissatisfaction, lack of intimacy --> depression, depressed people see world as not meeting needs, etc.
Larger Contingency Tables
• Jankowski & Leitenberg (pers. comm.)
    Does abuse continue?
• Do adults who are, and are not, being abused differ in childhood history of abuse?
• One variable = adult abuse (yes or no)
• Other variable = number of abuse categories (out of 4) suffered as children
    Sexual, Physical, Alcohol, or Personal violence
Observed frequencies (expected frequencies in parentheses)
Number of Childhood     Adult Abuse
Abuse Categories        No               Yes             Total
0                       512 (494.49)     54 (71.51)      566
1                       227 (230.65)     37 (33.35)      264
2                       59 (64.65)       15 (9.35)        74
3-4                     18 (26.21)       12 (3.79)        30
Total                   816              118             934
Chi-Square Calculation
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(512 - 494.49)^2}{494.49} + \frac{(54 - 71.51)^2}{71.51} + \cdots + \frac{(18 - 26.21)^2}{26.21} + \frac{(12 - 3.79)^2}{3.79} = 29.62$$
$$\chi^2_{.05}(3) = 7.82$$
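For a larger table it helps to see which cells drive the total. The Python sketch below recomputes the statistic for the 4 × 2 abuse table and keeps each cell's contribution; the variable names are illustrative assumptions, and the total should land close to the 29.62 above, with small differences due to rounding of the expected frequencies.

```python
# Rows: number of childhood abuse categories. Columns: adult abuse (No, Yes).
observed = {
    "0": (512, 54),
    "1": (227, 37),
    "2": (59, 15),
    "3-4": (18, 12),
}

col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

contributions = {}
for label, row in observed.items():
    row_total = sum(row)
    cells = []
    for o, col_total in zip(row, col_totals):
        e = row_total * col_total / grand_total  # E = RT * CT / GT
        cells.append((o - e) ** 2 / e)           # this cell's share of chi-square
    contributions[label] = cells

chi_square = sum(sum(cells) for cells in contributions.values())
df = (len(observed) - 1) * (len(col_totals) - 1)

for label, cells in contributions.items():
    print(label, [round(c, 2) for c in cells])
print(round(chi_square, 2), df)
```

The largest single contribution comes from the "3-4 categories, abused as adult" cell, where 12 cases were observed against an expectation of under 4.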
Conclusions
• 29.62 > 7.82
    Reject H0
    Conclude that adult abuse is related to childhood abuse
    Increasing levels of childhood abuse are associated with greater levels of adult abuse.
        e.g., approximately 10% of nonabused children are later abused as adults.
Nonindependent Observations
• We require that observations be independent.
    Only one score from each respondent
    Sum of frequencies must equal number of respondents
• If we don't have independence of observations, test is not valid.
Small Expected Frequencies
• Rule of thumb: E > 5 in each cell
    Not firm rule
    Violated in earlier example, but probably not a problem
• More of a problem in tables with few cells.
• Never have expected frequency of 0.
• Collapse adjacent cells if necessary.
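A quick way to apply the rule of thumb is to list the cells whose expected frequency falls below the threshold. This Python sketch uses a hypothetical helper `small_expected_cells` and the childhood/adult abuse counts from the earlier example; the threshold of 5 is the rule of thumb above, not a hard cutoff.

```python
def small_expected_cells(observed, threshold=5.0):
    """Return (row index, column index, E) for cells with E below threshold."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    flagged = []
    for i, rt in enumerate(row_totals):
        for j, ct in enumerate(col_totals):
            e = rt * ct / grand_total
            if e < threshold:
                flagged.append((i, j, e))
    return flagged

# Rows: 0, 1, 2, 3-4 childhood abuse categories. Columns: adult abuse No, Yes.
observed = [[512, 54], [227, 37], [59, 15], [18, 12]]
flagged = small_expected_cells(observed)
print(flagged)
```

Only the last row's "Yes" cell is flagged, which is the violation the slide refers to.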
Effect Size
• Phi and Cramér's Phi
    Define N and k
    Not limited to 2×2 tables
Effect Size--cont.
• Everitt & Smith data
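The formulas from the original slides did not survive, so the Python sketch below assumes the standard definitions: phi = sqrt(χ²/N) for a 2 × 2 table, and Cramér's phi = sqrt(χ²/(N(k − 1))), where N is the total sample size and k is the smaller of the number of rows and columns. The function names are illustrative, and the χ² and N values are the ones reported earlier in the deck.

```python
import math

def phi_coefficient(chi_square, n):
    """Phi for a 2x2 table: sqrt(chi-square / N)."""
    return math.sqrt(chi_square / n)

def cramers_phi(chi_square, n, k):
    """Cramer's phi: sqrt(chi-square / (N * (k - 1))), k = min(rows, columns)."""
    return math.sqrt(chi_square / (n * (k - 1)))

# Everitt & Smith data: chi-square = 25.61, N = 419, 2x2 table.
phi = phi_coefficient(25.61, 419)

# Abuse data: chi-square = 29.62, N = 934, 4x2 table, so k = 2.
v_abuse = cramers_phi(29.62, 934, k=2)

print(round(phi, 3), round(v_abuse, 3))
```

For a 2 × 2 table k − 1 = 1, so Cramér's phi reduces to phi; the more general form is what allows the measure to be used with larger tables.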
Effect Size--Odds Ratio
• Odds Dep | Lack Intimacy
    26/112 = .232
• Odds Dep | No Lack
    11/270 = .041
• Odds Ratio = .232/.041 = 5.69
• Odds of being depressed are 5.69 times greater if experiencing lack of intimacy.
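The odds ratio compares depressed to non-depressed counts within each intimacy group. A minimal Python sketch, with the helper name `odds` as an assumption, carries the division through without intermediate rounding; the result is close to the figure on the slide, and the small difference comes from rounding.

```python
def odds(events, non_events):
    """Odds = number with the outcome / number without it."""
    return events / non_events

odds_lack = odds(26, 112)      # depressed vs. not, among those lacking intimacy
odds_no_lack = odds(11, 270)   # depressed vs. not, among those not lacking it
odds_ratio = odds_lack / odds_no_lack

print(round(odds_lack, 3), round(odds_no_lack, 3), round(odds_ratio, 2))
```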
Effect Size--Risk Ratio
• Risk Depression | Lack Intimacy
    26/138 = .188
• Risk Depression | No Lack
    11/281 = .039
• Risk Ratio = .188/.039 = 4.83
• Risk of being depressed is 4.83 times greater if experiencing lack of intimacy.
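Risk differs from odds in its denominator: it divides by everyone in the group, not just those without the outcome. The Python sketch below (helper name `risk` is an assumption) computes the ratio without intermediate rounding, which gives a value slightly below the slide's figure.

```python
def risk(events, group_total):
    """Risk = number with the outcome / everyone in the group."""
    return events / group_total

risk_lack = risk(26, 138)      # 26 depressed of 138 lacking intimacy
risk_no_lack = risk(11, 281)   # 11 depressed of 281 not lacking intimacy
risk_ratio = risk_lack / risk_no_lack

print(round(risk_lack, 3), round(risk_no_lack, 3), round(risk_ratio, 2))
```

Because depression is fairly uncommon in both groups, the risk ratio and the odds ratio are of similar size here, though the odds ratio is always the further of the two from 1.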