statistics study guide chi-square

Upload: ldlewis

Post on 02-Nov-2015


DESCRIPTION

Chi-Square Hypothesis Tests for Association or Independence

TRANSCRIPT

  • AP Statistics Page 1 of 4 Study Sheet: Chi-Square Hypothesis Tests for Association or Independence

    _________________________________ Copyright 2011 Apex Learning Inc. (See Terms of Use at www.apexvs.com/TermsOfUse)

    Introduction

    By now you should be familiar with how to use the $\chi^2$ statistic in goodness-of-fit tests. The idea in a goodness-of-fit test is to determine how well a sample set of outcomes matches an expected set of outcomes. For example, if you rolled a six-sided die 60 times, you'd expect to get 10 ones, 10 twos, and so on up to 10 sixes. The $\chi^2$ statistic allows you to compare your set of observations with this set of expectations, but this isn't its only use. The $\chi^2$ statistic can also be used to determine whether two categorical variables are associated in some way.

    Association and Independence

    Categorical variables may be related or they may be independent. For example, gender might be related to one's opinions about television programs, but it might not be related to restaurant preferences. If knowing the values of one variable gives us information about the values of another variable, we say the variables are related or associated. If knowing the values of one variable gives us no information about the values of another variable, we say the variables are not related or are not associated. We can also say they are independent.
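One way to see what "gives us information" means is to compare the distribution of one variable within each category of the other. The sketch below uses counts invented purely for illustration (they are not from this study sheet), and `conditional_distribution` is a hypothetical helper name. Groups 1 and 2 have identical row proportions, so knowing which of those two groups a person belongs to says nothing about the second variable; Group 3 has different proportions, which is what an association looks like.

```python
# Hypothetical counts, invented for illustration only (not from the study sheet).
# Each key is a category of the first variable; each list holds counts for the
# three categories of the second variable.
table = {
    "Group 1": [30, 20, 50],
    "Group 2": [60, 40, 100],
    "Group 3": [10, 70, 20],
}


def conditional_distribution(counts):
    """Return each count as a proportion of its row total."""
    total = sum(counts)
    return [count / total for count in counts]


proportions = {group: conditional_distribution(counts) for group, counts in table.items()}

for group, props in proportions.items():
    print(group, [round(p, 2) for p in props])
```

Matching row proportions correspond to independence; the chi-square test described in this sheet measures how far the observed table is from that situation.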

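    Computing the $\chi^2$ Statistic

    If two variables are independent, then observed values tend to match expected values. If they match exactly, then

    $$\chi^2 = \sum \frac{(O - E)^2}{E} = 0,$$

    where $O$ is the observed count and $E$ is the expected count in each cell.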

    We judge whether the differences between observed and expected values could be the result of chance by looking at how far $\chi^2$ is from 0. To compute $\chi^2$:

    Arrange your data into a two-way table with the categories of one variable running vertically and the categories of the other running horizontally.

    Calculate the marginal distributions of each variable (that is, determine row and column totals).

    Calculate the expected value of each cell using the following formula: $\text{expected value} = \text{row total} \times \dfrac{\text{column total}}{\text{sample size}}$.

    Calculate $\dfrac{(\text{observed} - \text{expected})^2}{\text{expected}}$ for each cell of the table. Add these values together to get $\chi^2$.

    Calculate the degrees of freedom by multiplying the number of rows minus one by the number of columns minus one:

    $df = (\text{number of rows} - 1)(\text{number of columns} - 1)$
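The steps above can be sketched in a few lines of Python. This is only an illustrative outline: the function name `chi_square_independence` is hypothetical, and the 2 x 2 table is made-up data chosen so the arithmetic is easy to follow by hand.

```python
def chi_square_independence(observed):
    """Return (expected, statistic, df) for a two-way table given as a list of rows."""
    # Marginal distributions: row totals and column totals.
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)  # sample size

    # Expected value of each cell: row total * column total / sample size.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Add (observed - expected)^2 / expected over every cell.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom: (rows - 1) * (columns - 1).
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return expected, statistic, df


# Hypothetical 2 x 2 table, invented for illustration only.
observed = [[10, 20], [30, 40]]
expected, statistic, df = chi_square_independence(observed)
print(expected)
print(round(statistic, 4))
print(df)
```

For this made-up table the row totals are 30 and 70, the column totals are 40 and 60, and the sample size is 100, so the first expected value is 30 x 40 / 100 = 12.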


    Hypothesis Tests for Independence

    Hypothesis tests for independence are done in much the same way as other hypothesis tests, except that the null hypothesis is usually stated in words, rather than symbolically. To conduct a hypothesis test for independence, follow these steps:

    State your null and alternative hypotheses:

    $H_0$: The variables are independent, or there is no association between the variables.

    $H_a$: The variables are not independent, or there is an association between the variables.

    Justify your use of the test: The expected value for every cell of your table must be at least five for the chi-square test to be valid.

    Compute your test statistic and p-value: The chi-square test statistic is computed as $\chi^2 = \sum \dfrac{(O - E)^2}{E}$.

    Find the p-value by using a table or by using the χ²cdf function on the TI-83/TI-84 or Chi square Cdf on the TI-89 calculator. Although you'll probably use your calculator for your calculations, you should know how to calculate this statistic and find your p-value without it.

    State your conclusion: Be sure to state your conclusion in terms of the hypothesis being tested. The p-value is the probability of getting a $\chi^2$ statistic at least as large as yours by chance alone if the variables are independent. A small p-value is evidence against independence.
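The justification and p-value steps can also be checked without a calculator. The sketch below is one possible approach using only Python's standard library: `chi2_sf` and `expected_counts_ok` are hypothetical helper names, and the p-value uses the closed-form upper-tail formulas for a chi-square distribution with a whole-number degrees of freedom rather than a table lookup. The expected counts passed to `expected_counts_ok` are made-up values for illustration.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-square >= x) for integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k = 0 .. df/2 - 1 of (x/2)^k / k!
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum over k = 1 .. (df-1)/2
    # of (x/2)^(k - 1/2) / Gamma(k + 1/2)
    terms = sum(
        half ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms


def expected_counts_ok(expected, minimum=5):
    """Check the condition that every expected count is at least `minimum`."""
    return all(e >= minimum for row in expected for e in row)


alpha = 0.05
# 7.815 is the usual tabled .05 critical value for 3 degrees of freedom,
# so the p-value here should come out very close to .05.
p_value = chi2_sf(7.815, 3)
print(round(p_value, 3))

# Hypothetical expected counts: the first table meets the condition, the second does not.
print(expected_counts_ok([[12.0, 18.0], [28.0, 42.0]]))
print(expected_counts_ok([[3.2, 9.8], [6.8, 20.2]]))
```

The decision rule is then a single comparison: reject the null hypothesis of independence when the p-value is less than `alpha`.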

    Key Terms

    Two-way table: A two-way table compares two categorical variables. The rows of the table take on the values of the row variable and the columns take on the values of the column variable. We usually say there are r rows and c columns. Thus a two-way table can be described as an $r \times c$ table.

    Chi-Square Test for Independence or Association: We use the $\chi^2$ statistic to test hypotheses when we want to determine whether two categorical variables are independent. The variables are independent if knowledge of the values of one variable gives no information about the values of the other. If they aren't independent, we say they're associated or related.

    Worked Examples

    Performing a Chi-Square Significance Test of Association

    You want to see whether there's a relationship between gender and preferred exercise routine, so you collect the following data:

                Cardio   Weights   Both   Neither
    Male            43        65     47        23
    Female          72        24     43        18

    A. Set up your null and alternative hypotheses.

    B. Compute the marginal distributions for each variable (row and column totals).

    C. Compute the expected values for each cell.

    D. Justify your use of a chi-square hypothesis test of independence. Are all necessary conditions met?


    E. Compute your test statistic ($\chi^2$).

    F. Determine your degrees of freedom.

    G. Find your p-value in a table.

    H. What conclusion would you draw if $\alpha = .05$?

    Answers

    A. $H_0$: The variables are independent; there is no relationship between gender and preferred exercise routine.
       $H_a$: The variables are not independent; there is a relationship between gender and preferred exercise routine.

    B.

                Cardio   Weights   Both   Neither   TOTAL
    Male            43        65     47        23     178
    Female          72        24     43        18     157
    TOTAL          115        89     90        41     335

    C.

                Cardio                  Weights                 Both                    Neither                 TOTAL
    Male        (115/335)(178) = 61.1   (89/335)(178) = 47.29   (90/335)(178) = 47.82   (41/335)(178) = 21.79   178
    Female      (115/335)(157) = 53.9   (89/335)(157) = 41.71   (90/335)(157) = 42.18   (41/335)(157) = 19.22   157
    TOTAL       115                     89                      90                      41                      335

    D. All the expected values are greater than 5, so a chi-square hypothesis test of independence will be valid for this situation.

    E.

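    $$
    \begin{aligned}
    \chi^2 &= \sum \frac{(O - E)^2}{E} \\
           &= \frac{(43 - 61.1)^2}{61.1} + \frac{(65 - 47.29)^2}{47.29} + \frac{(47 - 47.82)^2}{47.82} + \frac{(23 - 21.79)^2}{21.79} \\
           &\quad + \frac{(72 - 53.9)^2}{53.9} + \frac{(24 - 41.71)^2}{41.71} + \frac{(43 - 42.18)^2}{42.18} + \frac{(18 - 19.22)^2}{19.22} \\
           &= 5.36 + 6.63 + .014 + .0672 + 6.078 + 7.52 + .0159 + .07744 \\
           &= 25.76254 \approx 25.76
    \end{aligned}
    $$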

    F. df = 3
       degrees of freedom = (number of rows − 1)(number of columns − 1) = (2 − 1)(4 − 1) = (1)(3) = 3


    G. p < .0005
       Find 3 degrees of freedom in the left column of the table and move across until you find the range containing your $\chi^2$ statistic. For 3 degrees of freedom, the highest value of $\chi^2$ on the table is 17.73 (p = .0005). Because 25.76 is larger than 17.73, your p-value is less than .0005.

    H. Since $p < \alpha$ (p < .0005 and $\alpha = .05$), we have evidence that allows us to reject the null hypothesis, which stated that the variables are independent. If the variables were independent, a $\chi^2$ statistic of 25.76 or larger would occur due to chance alone less than .05% of the time, so we can reasonably claim that the variables are associated.
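As a cross-check on the worked example, the whole calculation can be reproduced with a short script. This is a sketch under stated assumptions, not part of the original study sheet: the helper names are hypothetical, the p-value is computed from the closed-form upper-tail formula for exactly 3 degrees of freedom instead of being read from a table, and expected values are kept unrounded, so the statistic can differ from the hand calculation in the last decimal place.

```python
import math

# Observed counts from the worked example
# (rows: Male, Female; columns: Cardio, Weights, Both, Neither).
observed = [
    [43, 65, 47, 23],
    [72, 24, 43, 18],
]


def expected_counts(table):
    """Expected count per cell: row total * column total / sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def chi2_sf_df3(x):
    """Upper-tail probability for a chi-square distribution with exactly 3 df."""
    half = x / 2.0
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half)


expected = expected_counts(observed)
statistic = chi_square(observed, expected)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi2_sf_df3(statistic)

for row in expected:
    print([round(e, 2) for e in row])
print("all expected counts at least 5:", all(e >= 5 for row in expected for e in row))
print("chi-square:", round(statistic, 2))
print("df:", df)
print("p-value below .0005:", p_value < 0.0005)
print("reject H0 at alpha = .05:", p_value < 0.05)
```

With unrounded expected values the statistic comes out at roughly 25.77, which agrees with the hand-computed 25.76 up to rounding and leads to the same conclusion.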