Fundamental Statistics in Applied Linguistics Research
Spring 2010
Weekend MA Program on Applied English
Dr. Da-Fu Huang
7. Finding group differences with Chi-Square when all variables are categorical
7.1 Two main uses of the chi-square test
- Test for goodness of fit of the data
- Test for group independence
7.2 Test for goodness of fit
- Only one categorical variable with 2 or more levels of choices
- Whether observed frequencies match the frequencies expected if every choice were equally likely
- Measures how good the fit is to the probabilities that we expect

Desired foreign language at one university (χ² = 8.2, p = .09, df = 4)

Chinese  Spanish  French  German  Japanese
23       20       15      13      29
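The goodness-of-fit statistic for the one-university survey can be reproduced by hand. The following is a minimal sketch in Python using only the standard library. The function names are illustrative, not part of the course materials, and the p-value shortcut is a closed form that holds only for even degrees of freedom (df = 4 here). With SciPy installed, `scipy.stats.chisquare` runs the same test in one call.

```python
import math

def chi_square_gof(observed):
    """Chi-square goodness of fit against equal expected frequencies."""
    n = sum(observed)
    expected = n / len(observed)      # every level equally likely
    stat = sum((o - expected) ** 2 / expected for o in observed)
    df = len(observed) - 1            # df = number of levels - 1
    return stat, df

def chi2_p_even_df(stat, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even df")
    half = stat / 2
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

# Counts for Chinese, Spanish, French, German, Japanese
observed = [23, 20, 15, 13, 29]
stat, df = chi_square_gof(observed)
p = chi2_p_even_df(stat, df)
print(f"chi-square = {stat:.1f}, df = {df}, p = {p:.3f}")
```

Each of the five languages has an expected count of 100 / 5 = 20, so the statistic is the sum of the squared deviations from 20, divided by 20.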
7.3 Test for group independence
- 2 or more categorical variables with 2 or more levels of choices
- Whether there is any association between the variables

Desired foreign language at two universities

        Language
        Chinese  Spanish  French  German  Japanese  Total
HT U    23       20       15      13      29        100
BC U    14       25       10      26      25        100
Total   37       45       25      39      54        200
Observed and expected frequencies for the foreign language survey

        Chinese       Spanish       French        German        Japanese      Total
Observed frequencies
HT U    23            20            15            13            29            100
BC U    14            25            10            26            25            100
Total   37            45            25            39            54            200
Expected frequencies
HT U    (100*37)/200  (100*45)/200  (100*25)/200  (100*39)/200  (100*54)/200
BC U    (100*37)/200  (100*45)/200  (100*25)/200  (100*39)/200  (100*54)/200
HT U    18.5          22.5          12.5          19.5          27
BC U    18.5          22.5          12.5          19.5          27

χ² = Σ [(O – E)² / E] = 8.374, p = .07, df = 4
For a test of group independence, df = (# rows – 1) × (# columns – 1) = (2 – 1) × (5 – 1) = 4; for a goodness-of-fit test, df = # levels – 1.
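The expected frequencies and the χ² value for the two-university survey can be checked with a few lines of code. This is a minimal sketch in plain Python; the variable names are illustrative. Each expected count is the row total times the column total, divided by the grand total.

```python
# Observed counts: Chinese, Spanish, French, German, Japanese
observed = {
    "HT U": [23, 20, 15, 13, 29],
    "BC U": [14, 25, 10, 26, 25],
}

row_totals = {univ: sum(counts) for univ, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected count = (row total * column total) / grand total
expected = {
    univ: [row_totals[univ] * ct / grand_total for ct in col_totals]
    for univ in observed
}

# Sum (O - E)^2 / E over every cell of the table
chi_square = sum(
    (o - e) ** 2 / e
    for univ in observed
    for o, e in zip(observed[univ], expected[univ])
)
df = (len(observed) - 1) * (len(col_totals) - 1)

print(expected["HT U"])
print(round(chi_square, 3), df)
```

Because both universities have 100 respondents, the two rows of expected counts are identical.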
7.4 Situations that look like Chi-square but are not
- Scenario #1: Case study, only one participant
  - The binomial test
- Scenario #2: Binary choice, only one variable with exactly 2 levels
  - The binomial test
- Scenario #3: Matched pairs with categorical outcome
  - The McNemar test
- Scenario #4: Summary over a number of similar items by the same participants

Application activities (8.1.4): pp. 215-216
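For Scenario #2 (one variable with exactly 2 levels), the binomial test can be computed exactly. The following is a rough sketch in plain Python. The counts are hypothetical and chosen only for illustration, the function name is an assumption, and the two-sided rule used (summing all outcomes no more likely than the observed one) is one common convention.

```python
from math import comb

def binomial_test_two_sided(successes, n, p=0.5):
    """Exact two-sided binomial test.

    Adds up the probabilities of every outcome that is no more
    likely than the observed one under the null proportion p.
    """
    def pmf(k):
        return comb(n, k) * p ** k * (1 - p) ** (n - k)

    observed_prob = pmf(successes)
    # The small tolerance guards against floating-point ties
    return sum(
        pmf(k) for k in range(n + 1)
        if pmf(k) <= observed_prob * (1 + 1e-9)
    )

# Hypothetical data: 15 of 20 learners choose option A over option B
p_value = binomial_test_two_sided(15, 20)
print(round(p_value, 4))
```

With p = 0.5 the distribution is symmetric, so the result equals twice the probability of 15 or more choices of option A.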
7.5 Data inspection: Tables and Crosstabs
7.5.1 Summary tables for goodness-of-fit data
- Analyze > Descriptive Statistics > Frequencies

Student English proficiency

               Frequency  Percent  Valid Percent  Cumulative Percent
Valid  Low     9          16.7     16.7           16.7
       Mid     26         48.1     48.1           64.8
       High    19         35.2     35.2           100.0
       Total   54         100.0    100.0
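The Percent and Cumulative Percent columns of the frequency table follow directly from the counts. A minimal sketch in plain Python, assuming no missing cases so that Percent and Valid Percent coincide; the variable names are illustrative.

```python
# Frequencies for Student English proficiency
counts = {"Low": 9, "Mid": 26, "High": 19}
total = sum(counts.values())

rows = []
cumulative = 0
for level, freq in counts.items():
    cumulative += freq
    percent = round(100 * freq / total, 1)
    cumulative_percent = round(100 * cumulative / total, 1)
    rows.append((level, freq, percent, cumulative_percent))

for row in rows:
    print(*row)
print("Total", total)
```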
7.5.2 Summary tables for group-independence data (crosstabs)
- Analyze > Descriptive Statistics > Crosstabs
- Move variables into Row, Column, and Layer (when more than 2 variables)

Student English proficiency * Major1 Crosstabulation
Count

                                    Major1
Student English proficiency         non-English majors  English majors  Total
High                                9                   4               13
Mid                                 25                  4               29
Low                                 12                  0               12
Total                               46                  8               54
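A crosstab is a tally of how often each combination of levels occurs across cases. The sketch below shows the idea in plain Python. Since the case-level data are not given, it rebuilds one record per student from the cell counts above purely for illustration, then counts them back into cells and margins.

```python
from collections import Counter

# Cell counts from the crosstabulation: (proficiency, major) -> count
cells = {
    ("High", "non-English majors"): 9,
    ("High", "English majors"): 4,
    ("Mid", "non-English majors"): 25,
    ("Mid", "English majors"): 4,
    ("Low", "non-English majors"): 12,
    ("Low", "English majors"): 0,
}

# Illustration only: one (proficiency, major) record per student
records = [pair for pair, n in cells.items() for _ in range(n)]

crosstab = Counter(records)                          # cell counts
row_totals = Counter(prof for prof, _ in records)    # proficiency margin
col_totals = Counter(major for _, major in records)  # major margin

print(dict(row_totals))
print(dict(col_totals))
print(len(records))
```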
7.5.3 Bar plots with one and two categorical variables
- Graphs > Legacy Dialogs > Bar
- With one variable, choose Simple, and Summaries For Groups Of Cases
- With 2 variables, choose Clustered, and Summaries For Groups Of Cases. Put the variables in the “Category Axis” and “Define clusters by” boxes
7.6 Assumptions of Chi-Square (pp. 226-228)
- Independence of observations (no repeated measures)
- Nominal data (no inherent rank or order)
- Data are normally distributed (in practice, an expected count of at least 5 in every cell)
- Non-occurrences must be included as well as occurrences
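The cell-count assumption can be checked before running the test. The following is a minimal sketch in plain Python, applied to the Student English proficiency by Major1 counts from section 7.5.2; the variable names are illustrative. It computes the expected count for each cell and flags those below 5.

```python
# Rows: High, Mid, Low; columns: non-English majors, English majors
observed = [
    [9, 4],
    [25, 4],
    [12, 0],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count = (row total * column total) / grand total
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Cells whose expected count falls below 5
small_cells = [
    (i, j)
    for i, row in enumerate(expected)
    for j, e in enumerate(row)
    if e < 5
]
share_small = len(small_cells) / (len(row_totals) * len(col_totals))
min_expected = min(min(row) for row in expected)

print(len(small_cells), share_small, round(min_expected, 2))
```

All three flagged cells are in the English majors column, which has only 8 students in total.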
7.7 Chi-Square statistic test
7.7.1 One-way goodness-of-fit Chi-Square in SPSS
- Analyze > Nonparametric Tests > Chi-Square
- Put variable in “Test Variable List” box

Test Statistics

             Student English proficiency
Chi-Square   8.111a
df           2
Asymp. Sig.  .017
a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 18.0.

Student English proficiency

        Observed N  Expected N  Residual
Low     9           18.0        -9.0
Mid     26          18.0        8.0
High    19          18.0        1.0
Total   54
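The residuals and the chi-square value in this output can be traced back to the observed counts. This is a minimal sketch in plain Python with illustrative variable names. The p-value line relies on the fact that, for df = 2 only, the upper-tail chi-square probability equals exp(-χ²/2).

```python
import math

observed = {"Low": 9, "Mid": 26, "High": 19}
n = sum(observed.values())
expected = n / len(observed)          # equal expected frequency per level

# Residual = observed N - expected N
residuals = {level: o - expected for level, o in observed.items()}

chi_square = sum(r ** 2 / expected for r in residuals.values())
df = len(observed) - 1

# Closed form valid for df = 2 only
p_value = math.exp(-chi_square / 2)

print(residuals)
print(round(chi_square, 3), df, round(p_value, 3))
```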
7.7.2 Two-way group-independence Chi-Square in SPSS
- Analyze > Descriptive Statistics > Crosstabs
- Tick “Display clustered bar charts” box for a bar plot
- Open Statistics and tick “Chi-Square” and “Phi and Cramer’s V” boxes
- Open Cells and tick “Expected values” and all of the boxes under “Percentages”
Chi-Square statistic test (Two-way group-independence)

Test Statistics

             Student English proficiency  students from different colleges
Chi-Square   8.111a                       16.000b
df           2                            4
Asymp. Sig.  .017                         .003
a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 18.0.
b. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 10.8.
Chi-Square Tests

                              Value    df  Asymp. Sig. (2-sided)
Pearson Chi-Square            10.431a  8   .236
Likelihood Ratio              10.737   8   .217
Linear-by-Linear Association  2.182    1   .140
N of Valid Cases              54
a. 11 cells (73.3%) have expected count less than 5. The minimum expected count is .33.

- Likelihood Ratio: an alternative to the Pearson Chi-Square. Should be equivalent to the Chi-Square when sample sizes are large.
- Linear-by-Linear Association: assumes that the variables are ordinal. Report this if your variables have inherent rank.
Measures of effect size for the chi-square
- Phi (2 x 2 contingency tables, with 2 levels per variable)
- Cramer’s V (larger than 2 x 2, with more than 2 levels per variable)

Symmetric Measures

                                Value  Approx. Sig.
Nominal by Nominal  Phi         .440   .236
                    Cramer's V  .311   .236
N of Valid Cases                54
- w = phi (2 x 2 tables); w = V × √(r – 1) (more than 2 levels), where V = Cramer’s V and r = the number of rows or columns, whichever is smaller
- Odds ratio = (N11 × N22) / (N12 × N21), with table subscripts:

N11  N12
N21  N22
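The effect sizes can be computed from the chi-square value, the sample size, and the table dimensions. The following is a minimal sketch in plain Python; the function names are illustrative. It uses the Pearson chi-square of 10.431 with N = 54 from the SPSS output and assumes a 3 x 5 table, which is inferred from df = 8 and the 15 cells implied by the footnote. The counts passed to the odds-ratio function are hypothetical.

```python
import math

def phi(chi_square, n):
    """Phi = sqrt(chi-square / N)."""
    return math.sqrt(chi_square / n)

def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi-square / (N * (r - 1))), r = min(rows, cols)."""
    r = min(n_rows, n_cols)
    return math.sqrt(chi_square / (n * (r - 1)))

def effect_size_w(v, n_rows, n_cols):
    """w = V * sqrt(r - 1), r = min(rows, cols)."""
    return v * math.sqrt(min(n_rows, n_cols) - 1)

def odds_ratio(n11, n12, n21, n22):
    """Odds ratio for a 2 x 2 table = (N11 * N22) / (N12 * N21)."""
    return (n11 * n22) / (n12 * n21)

chi_square, n = 10.431, 54          # from the Chi-Square Tests output
n_rows, n_cols = 3, 5               # assumed from df = 8 and 15 cells

phi_value = phi(chi_square, n)
v_value = cramers_v(chi_square, n, n_rows, n_cols)
w_value = effect_size_w(v_value, n_rows, n_cols)

print(round(phi_value, 3), round(v_value, 3), round(w_value, 3))

# Hypothetical 2 x 2 counts, for illustration only
print(odds_ratio(20, 10, 5, 15))
```

The odds ratio is undefined when N12 or N21 is zero, so a table with an empty off-diagonal cell needs separate handling.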
Reporting Chi-square test results
- Contingency table with a summary of data and statistical results
- Chi-square value
- df
- p-value
- Effect size (for test for group independence)
  - Phi, Cramer’s V, w, or odds ratio
- Example reporting (p. 239)

Contingency table (2 X 3)

Student English proficiency * Major1 Crosstabulation
Count

                                    Major1
Student English proficiency         non-English majors  English majors  Total
High                                9                   4               13
Mid                                 25                  4               29
Low                                 12                  0               12
Total                               46                  8               54