1 chapter 11: analyzing the association between categorical variables section 11.1: what is...
![Page 1: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/1.jpg)
Chapter 11: Analyzing the Association Between Categorical Variables
Section 11.1: What is Independence and What is Association?
Learning Objectives
1. Comparing Percentages
2. Independence vs. Dependence
Learning Objective 1: Example: Is There an Association Between Happiness and Family Income?
The percentages in a particular row of a table are called conditional percentages
They form the conditional distribution for happiness, given a particular income level
Guidelines when constructing tables with conditional distributions:
Make the response variable the column variable
Compute conditional proportions for the response variable within each row
Include the total sample sizes
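The guidelines above can be sketched in a few lines of Python. The counts and category labels below are hypothetical, chosen only for illustration (the slide's own table is not reproduced in this transcript), and `conditional_distributions` is an illustrative helper name. Rows hold the explanatory variable (income), columns hold the response (happiness), and each row reports its sample size alongside the conditional proportions.

```python
# Hypothetical counts, for illustration only (NOT the slide's data).
# Rows: explanatory variable (income). Columns: response variable (happiness).
counts = {
    "Above average": {"Not too happy": 20, "Pretty happy": 100, "Very happy": 80},
    "Average": {"Not too happy": 60, "Pretty happy": 240, "Very happy": 100},
    "Below average": {"Not too happy": 80, "Pretty happy": 160, "Very happy": 60},
}


def conditional_distributions(table):
    """Return, for each row, its sample size and row-wise conditional proportions."""
    result = {}
    for row_label, row in table.items():
        n = sum(row.values())  # total sample size for this row
        result[row_label] = {
            "n": n,
            "proportions": {col: count / n for col, count in row.items()},
        }
    return result


for label, info in conditional_distributions(counts).items():
    shares = ", ".join(f"{col}: {p:.2f}" for col, p in info["proportions"].items())
    print(f"{label} (n={info['n']}): {shares}")
```

Each printed row is one conditional distribution of the response, given a level of the explanatory variable, so the proportions in a row sum to 1.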
Learning Objective 2: Independence vs. Dependence
For two variables to be independent, the population percentage in any category of one variable is the same for all categories of the other variable
For two variables to be dependent (or associated), the population percentages in the categories are not all the same
Are race and belief in life after death independent or dependent?
The conditional distributions in the table are similar but not exactly identical
It is tempting to conclude that the variables are dependent
The definition of independence between variables refers to a population
The table summarizes a sample, not a population
Even if variables are independent, we would not expect the sample conditional distributions to be identical
Because of sampling variability, each sample percentage typically differs somewhat from the true population percentage
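A small simulation may make the point about sampling variability concrete. In this sketch every group is drawn from the same population percentage, so the variables are independent by construction, yet the sample percentages will generally not be identical. The population proportion (0.75), the group size (500), and the function name `sample_percentages` are all assumptions made for illustration.

```python
import random


def sample_percentages(p_true, n_per_group, n_groups, seed=0):
    """Draw n_groups random samples that all share one population proportion."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    percentages = []
    for _ in range(n_groups):
        yes = sum(rng.random() < p_true for _ in range(n_per_group))
        percentages.append(100.0 * yes / n_per_group)
    return percentages


# Three groups with the SAME true percentage (independence holds by design).
pcts = sample_percentages(p_true=0.75, n_per_group=500, n_groups=3)
print([f"{p:.1f}%" for p in pcts])
```

The sample percentages land close to 75% but typically differ from it and from each other, which is why similar-but-not-identical sample conditional distributions do not by themselves establish dependence.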
Section 11.2: How Can We Test Whether Categorical Variables Are Independent?
Learning Objectives
1. A Significance Test for Categorical Variables
2. What Do We Expect for Cell Counts if the Variables Are Independent?
3. How Do We Find the Expected Cell Counts?
4. The Chi-Squared Test Statistic
5. The Chi-Squared Distribution
6. The Five Steps of the Chi-Squared Test of Independence
7. Chi-Squared is Also Used as a “Test of Homogeneity”
8. Chi-Squared and the Test Comparing Proportions in 2x2 Tables
9. Limitations of the Chi-Squared Test
Learning Objective 1: A Significance Test for Categorical Variables
Create a table of frequencies divided into the categories of the two variables
The hypotheses for the test are:
H0: The two variables are independent
Ha: The two variables are dependent (associated)
The test assumes random sampling and a large sample size (expected cell counts of at least 5)
Learning Objective 2: What Do We Expect for Cell Counts if the Variables Are Independent?
The count in any particular cell is a random variable
Different samples have different count values
The mean of its distribution is called an expected cell count
This is found under the presumption that H0 is true
Learning Objective 3: How Do We Find the Expected Cell Counts?
Expected Cell Count:
For a particular cell,
$$\text{Expected cell count} = \frac{(\text{Row total}) \times (\text{Column total})}{\text{Total sample size}}$$
The expected frequencies are values that have the same row and column totals as the observed counts, but for which the conditional distributions are identical (this is the assumption of the null hypothesis).
Learning Objective 3: How Do We Find the Expected Cell Counts? Example
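The slide's worked table is not part of this transcript, so here is a minimal sketch of the expected-count rule (row total times column total, divided by the total sample size) on a hypothetical 2x2 table. The counts and the function name `expected_counts` are assumptions for illustration.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical observed counts (rows: two groups, columns: two responses).
observed = [[30, 70],
            [50, 50]]
expected = expected_counts(observed)
print(expected)  # [[40.0, 60.0], [40.0, 60.0]]
```

The expected table keeps the observed row and column totals, and every row has the same conditional distribution (40% and 60%), which is what independence requires.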
Learning Objective 4: The Chi-Squared Test Statistic
The chi-squared statistic summarizes how far the observed cell counts in a contingency table fall from the expected cell counts for a null hypothesis
$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
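As a rough illustration of the statistic, the sketch below sums (observed count − expected count)² / expected count over all cells. The table is hypothetical and `chi_squared_statistic` is an illustrative name, not something defined in the slides.

```python
def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count under H0
            statistic += (obs - exp) ** 2 / exp
    return statistic


# Hypothetical 2x2 table of observed counts.
table = [[30, 70],
         [50, 50]]
stat = chi_squared_statistic(table)
print(f"chi-squared = {stat:.3f}")
```

For this table the expected counts are 40 and 60 in each row, so the four cell contributions are 2.5, 1.667, 2.5, and 1.667, giving a statistic of about 8.333.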
Learning Objective 4: Example: Happiness and Family Income
State the null and alternative hypotheses for this test
H0: Happiness and family income are independent
Ha: Happiness and family income are dependent (associated)
Report the chi-squared statistic and explain how it was calculated:
To calculate the chi-squared statistic, for each cell, calculate:
$$\frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
Sum the values for all the cells
The chi-squared value is 73.4
The larger the chi-squared value, the greater the evidence against the null hypothesis of independence and in support of the alternative hypothesis that happiness and income are associated
Learning Objective 5: The Chi-Squared Distribution
To convert the chi-squared test statistic to a P-value, we use the sampling distribution of the statistic
For large sample sizes, this sampling distribution is well approximated by the chi-squared probability distribution
Main properties of the chi-squared distribution:
It falls on the positive part of the real number line
The precise shape of the distribution depends on the degrees of freedom: df = (r-1)(c-1), where r is the number of rows and c is the number of columns
The mean of the distribution equals the df value
It is skewed to the right
The larger the chi-squared value, the greater the evidence against H0: independence
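These properties can be checked informally by simulation. The sketch below relies on a standard fact that the slides do not state: a chi-squared variable with df degrees of freedom can be generated as the sum of df squared independent standard normal values. The 3x3 table size, the number of draws, and the function names are assumptions for illustration.

```python
import random
import statistics


def degrees_of_freedom(r, c):
    """df = (r-1)(c-1) for a table with r rows and c columns."""
    return (r - 1) * (c - 1)


def simulate_chi_squared(df, n_draws, seed=0):
    """Draw chi-squared values as sums of df squared standard normal values."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(n_draws)]


df = degrees_of_freedom(3, 3)  # a hypothetical 3x3 table gives df = 4
draws = simulate_chi_squared(df, n_draws=20000)
mean_draw = statistics.mean(draws)
median_draw = statistics.median(draws)
print(f"df = {df}, mean = {mean_draw:.2f}, median = {median_draw:.2f}, min = {min(draws):.3f}")
```

The simulated values are all nonnegative, their mean sits near df, and the median falls below the mean, which is what a right-skewed distribution looks like.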
Learning Objective 6: The Five Steps of the Chi-Squared Test of Independence
1. Assumptions:
Two categorical variables
Randomization
Expected counts ≥ 5 in all cells
2. Hypotheses:
H0: The two variables are independent
Ha: The two variables are dependent (associated)
3. Test Statistic:
$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
4. P-value:
Right-tail probability above the observed chi-squared value, for the chi-squared distribution with df = (r-1)(c-1)
5. Conclusion:
Report P-value and interpret in context
If a decision is needed, reject H0 when P-value ≤ significance level
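The five steps can be strung together in one short sketch. Everything here is illustrative: the 2x3 table is hypothetical, the function names are made up, and the right-tail probability uses the standard closed-form series for integer degrees of freedom so that only the Python standard library is needed (statistical software would normally supply this value). Step 2 (the hypotheses) is implicit: H0 is independence.

```python
import math


def expected_counts(observed):
    """Expected counts under H0: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi2_right_tail(x, df):
    """Right-tail probability of a chi-squared distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:  # even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # odd df: two-sided normal tail plus a finite correction series
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


def chi_squared_test(observed, alpha=0.05):
    """Run steps 1, 3, 4, 5 of the chi-squared test of independence."""
    expected = expected_counts(observed)
    # Step 1 (assumptions): expected counts of at least 5 in all cells.
    counts_ok = all(e >= 5 for row in expected for e in row)
    # Step 3 (test statistic): sum of (observed - expected)^2 / expected.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Step 4 (P-value): right-tail probability with df = (r-1)(c-1).
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    p_value = chi2_right_tail(statistic, df)
    # Step 5 (conclusion): reject H0 when P-value <= significance level.
    return {"counts_ok": counts_ok, "statistic": statistic, "df": df,
            "p_value": p_value, "reject_h0": p_value <= alpha}


# Hypothetical 2x3 table of observed counts.
result = chi_squared_test([[20, 30, 50],
                           [40, 30, 30]])
print(result)
```

For this hypothetical table the statistic is about 11.67 with df = 2, and the P-value is well below 0.05, so H0 would be rejected at that significance level.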
Learning Objective 7: Chi-Squared is Also Used as a “Test of Homogeneity”
The chi-squared test does not depend on which is the response variable and which is the explanatory variable
When a response variable is identified and the population conditional distributions are identical, they are said to be homogeneous
The test is then referred to as a test of homogeneity
Learning Objective 8: Chi-Squared and the Test Comparing Proportions in 2x2 Tables
In practice, contingency tables of size 2x2 are very common. They often occur in summarizing the responses of two groups on a binary response variable.
Denote the population proportion of success by p1 in group 1 and p2 in group 2
If the response variable is independent of the group, p1=p2, so the conditional distributions are equal
H0: p1=p2 is equivalent to H0: independence
$$z^2 = \chi^2, \quad \text{where} \quad z = \frac{\hat{p}_1 - \hat{p}_2}{se_0}$$
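The link between the two tests can be verified numerically. The sketch below assumes the usual pooled form of the null standard error, se0 = sqrt(p̂(1 − p̂)(1/n1 + 1/n2)) with p̂ the overall sample proportion, which the slide names but does not spell out. The counts and function names are hypothetical.

```python
import math


def two_proportion_z(successes1, n1, successes2, n2):
    """z statistic for H0: p1 = p2, using the pooled standard error se0."""
    p1_hat = successes1 / n1
    p2_hat = successes2 / n2
    pooled = (successes1 + successes2) / (n1 + n2)
    se0 = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se0


def chi_squared_2x2(successes1, n1, successes2, n2):
    """Chi-squared statistic for the 2x2 table built from the same counts."""
    observed = [[successes1, n1 - successes1],
                [successes2, n2 - successes2]]
    row_totals = [n1, n2]
    col_totals = [sum(col) for col in zip(*observed)]
    n = n1 + n2
    return sum(
        (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(2) for j in range(2)
    )


# Hypothetical data: 30 successes out of 100 in group 1, 50 out of 100 in group 2.
z = two_proportion_z(30, 100, 50, 100)
chi2 = chi_squared_2x2(30, 100, 50, 100)
print(f"z = {z:.4f}, z^2 = {z ** 2:.4f}, chi-squared = {chi2:.4f}")
```

The squared z statistic and the chi-squared statistic agree up to floating-point rounding, which is the equivalence the slide describes for 2x2 tables.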
Learning Objective 8: Example: Aspirin and Heart Attacks Revisited
What are the hypotheses for the chi-squared test for these data?
The null hypothesis is that whether a doctor has a heart attack is independent of whether he takes placebo or aspirin
The alternative hypothesis is that there’s an association
Report the test statistic and P-value for the chi-squared test:
The test statistic is 25.01 with a P-value of 0.000 (to three decimal places)
This is very strong evidence that the population proportion of heart attacks differed for those taking aspirin and for those taking placebo
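A reported P-value of 0.000 is a rounded value, not an exact zero. As a sketch, the right-tail probability for the reported statistic can be computed directly: a 2x2 table has df = (2-1)(2-1) = 1, and for df = 1 the chi-squared right tail equals the two-sided tail of a standard normal, which Python exposes through `math.erfc`. The function name is illustrative.

```python
import math


def chi2_right_tail_df1(x):
    """P(chi-squared with df = 1 exceeds x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi2_right_tail_df1(25.01)  # statistic reported on the slide
print(f"{p_value:.2e}")
print(f"{p_value:.3f}")  # prints 0.000
```

The unrounded value is on the order of 10^-7, which is why software reports it as 0.000 to three decimals.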
The sample proportions indicate that the aspirin group had a lower rate of heart attacks than the placebo group
Learning Objective 9: Limitations of the Chi-Squared Test
If the P-value is very small, strong evidence exists against the null hypothesis of independence
But…
The chi-squared statistic and the P-value tell us nothing about the nature or the strength of the association
We know that there is statistical significance, but the test alone does not indicate whether there is practical significance as well
The chi-squared test is often misused. Some examples are:
when some of the expected frequencies are too small
when separate rows or columns are dependent samples
data are not random
quantitative data are classified into categories - results in loss of information
Learning Objective 10: “Goodness of Fit” Chi-Squared Tests
The Chi-Squared test can also be used for testing particular proportion values for a categorical variable.
The null hypothesis is that the distribution of the variable follows a given probability distribution; the alternative is that it does not
The test statistic is calculated in the same manner where the expected counts are what would be expected in a random sample from the hypothesized probability distribution
For this particular case, the test statistic is referred to as a goodness-of-fit statistic.
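A goodness-of-fit statistic can be sketched as follows. The example is hypothetical: 120 rolls of a die tested against equal probabilities of 1/6. The expected count for each category is the sample size times the hypothesized probability. The degrees of freedom shown (number of categories minus 1) is the usual convention when the hypothesized probabilities are fully specified; the slide itself does not state it.

```python
def goodness_of_fit_statistic(observed, hypothesized_probs):
    """Chi-squared statistic with expected count = n * hypothesized probability."""
    n = sum(observed)
    statistic = 0.0
    for obs, prob in zip(observed, hypothesized_probs):
        exp = n * prob
        statistic += (obs - exp) ** 2 / exp
    return statistic


# Hypothetical counts for faces 1..6 in 120 rolls of a die.
rolls = [18, 22, 16, 25, 20, 19]
fair_die = [1 / 6] * 6  # H0: every face has probability 1/6

gof = goodness_of_fit_statistic(rolls, fair_die)
gof_df = len(rolls) - 1  # assumed convention: categories minus 1
print(f"chi-squared = {gof:.2f}, df = {gof_df}")
```

Here each expected count is 20, and the squared deviations (4, 4, 16, 25, 0, 1) sum to 50, so the statistic is about 2.5.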
Section 11.3: How Strong is the Association?
Learning Objectives
1. Analyzing Contingency Tables
2. Measures of Association
3. Difference of Proportions
4. The Ratio of Proportions: Relative Risk
5. Properties of the Relative Risk
6. Large Chi-Squared Does Not Mean There’s a Strong Association
Learning Objective 1: Analyzing Contingency Tables
Is there an association?
The chi-squared test of independence addresses this
When the P-value is small, we infer that the variables are associated
How do the cell counts differ from what independence predicts?
To answer this question, we compare each observed cell count to the corresponding expected cell count
How strong is the association?
Analyzing the strength of the association reveals whether the association is an important one, or if it is statistically significant but weak and unimportant in practical terms
Learning Objective 2: Measures of Association
A measure of association is a statistic or a parameter that summarizes the strength of the dependence between two variables
A measure of association is useful for comparing associations
Learning Objective 3: Difference of Proportions
An easily interpretable measure of association is the difference between the proportions making a particular response
Case (a) exhibits the weakest possible association – no association. The difference of proportions is 0
Case (b) exhibits the strongest possible association: The difference of proportions is 1
In practice, we don’t expect data to follow either extreme (0% difference or 100% difference), but the stronger the association, the larger the absolute value of the difference of proportions
Learning Objective 3: Difference of Proportions Example: Do Student Stress and Depression Depend on Gender?
Which response variable, stress or depression, has the stronger sample association with gender?
The difference of proportions between females and males was 0.35 – 0.16 = 0.19 for feeling stressed
The difference of proportions between females and males was 0.08 – 0.06 = 0.02 for feeling depressed
In the sample, stress (with a difference of proportions = 0.19) has a stronger association with gender than depression has (with a difference of proportions = 0.02)
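The comparison above is simple arithmetic, sketched here with the sample proportions quoted on the slides. The dictionary layout and the function name are illustrative choices.

```python
def difference_of_proportions(p1, p2):
    """Difference between two sample proportions making a particular response."""
    return p1 - p2


# Sample proportions quoted in the slides (females vs. males).
stress = {"female": 0.35, "male": 0.16}
depression = {"female": 0.08, "male": 0.06}

stress_diff = difference_of_proportions(stress["female"], stress["male"])
depression_diff = difference_of_proportions(depression["female"], depression["male"])

print(f"stress: {stress_diff:.2f}, depression: {depression_diff:.2f}")
stronger = "stress" if abs(stress_diff) > abs(depression_diff) else "depression"
print(f"stronger sample association with gender: {stronger}")
```

Comparing absolute values keeps the comparison valid regardless of which group is listed first.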
Learning Objective 4: The Ratio of Proportions: Relative Risk
Another measure of association is the ratio of two proportions: p1/p2
In medical applications in which the proportion refers to an adverse outcome, it is called the relative risk
Learning Objective 4: Example: Relative Risk for Seat Belt Use and Outcome of Auto Accidents
Treating the auto accident outcome as the response variable, find and interpret the relative risk
The adverse outcome is death
The relative risk is formed for that outcome
For those who wore a seat belt, the proportion who died equaled 510/412,878 = 0.00124
For those who did not wear a seat belt, the proportion who died equaled 1601/164,128 = 0.00975
The relative risk is the ratio:
0.00124/0.00975 = 0.127
The proportion of subjects wearing a seat belt who died was 0.127 times the proportion of subjects not wearing a seat belt who died
Many find it easier to interpret the relative risk by reordering the rows of data so that the relative risk has value above 1.0
Reversing the order of the rows, we calculate the ratio: 0.00975/0.00124 = 7.9
The proportion of subjects not wearing a seat belt who died was 7.9 times the proportion of subjects wearing a seat belt who died
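The relative risk calculations can be reproduced from the counts quoted in the slides (510 deaths among 412,878 belted subjects, 1601 deaths among 164,128 unbelted subjects). The variable and function names are illustrative. The sketch works from the unrounded proportions, so intermediate digits may differ slightly from a hand calculation that uses the rounded values.

```python
def relative_risk(p1, p2):
    """Ratio of two proportions, p1 / p2."""
    return p1 / p2


# Counts quoted in the slides.
p_died_belt = 510 / 412_878       # proportion who died, seat belt worn
p_died_no_belt = 1601 / 164_128   # proportion who died, no seat belt

rr_belt_vs_none = relative_risk(p_died_belt, p_died_no_belt)
rr_none_vs_belt = relative_risk(p_died_no_belt, p_died_belt)  # rows reversed

print(f"proportions: {p_died_belt:.5f}, {p_died_no_belt:.5f}")
print(f"relative risk (belt vs. none): {rr_belt_vs_none:.3f}")
print(f"relative risk (none vs. belt): {rr_none_vs_belt:.1f}")
```

Reversing the rows simply inverts the ratio, so the two relative risks describe the same association from opposite directions.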
A relative risk of 7.9 represents a strong association
This is far from the value of 1.0 that would occur if the proportion of deaths were the same for each group
Wearing a seat belt has a practically significant effect in enhancing the chance of surviving an auto accident
Learning Objective 5: Properties of the Relative Risk
The relative risk can equal any nonnegative number
When p1= p2, the variables are independent and relative risk = 1.0
Values farther from 1.0 (in either direction) represent stronger associations
Learning Objective 6: Large Chi-Squared Does Not Mean There’s a Strong Association
A large chi-squared value provides strong evidence that the variables are associated
It does not imply that the variables have a strong association
This statistic merely indicates (through its P-value) how certain we can be that the variables are associated, not how strong that association is
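One way to see this is to hold the strength of association fixed and change only the sample size. In the hypothetical sketch below, both tables have the same difference of proportions (0.04), but the second has 100 times as many subjects, so its chi-squared statistic is 100 times larger. The counts and function names are assumptions for illustration.

```python
def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2 / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(observed)
        for j, obs in enumerate(row)
    )


def difference_of_proportions(observed):
    """Difference between the two rows in the proportion falling in column 1."""
    return observed[0][0] / sum(observed[0]) - observed[1][0] / sum(observed[1])


# Hypothetical tables with the SAME weak association but different sample sizes.
small = [[52, 48], [48, 52]]
large = [[5200, 4800], [4800, 5200]]

chi2_small, chi2_large = chi_squared_statistic(small), chi_squared_statistic(large)
diff_small, diff_large = difference_of_proportions(small), difference_of_proportions(large)
print(f"small: chi-squared = {chi2_small:.2f}, difference = {diff_small:.2f}")
print(f"large: chi-squared = {chi2_large:.2f}, difference = {diff_large:.2f}")
```

The statistic grows with the sample size while the difference of proportions stays put, so a large chi-squared value reflects strong evidence rather than a strong association.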
Section 11.4: How Can Residuals Reveal The Pattern of Association?
Learning Objectives
1. Association Between Categorical Variables
2. Residual Analysis
Learning Objective 1: Association Between Categorical Variables
The chi-squared test and measures of association such as (p1 – p2) and p1/p2 are fundamental methods for analyzing contingency tables
The P-value for the chi-squared test summarizes the strength of evidence against H0: independence
If the P-value is small, then we conclude that somewhere in the contingency table the population cell proportions differ from independence
The chi-squared test does not indicate whether all cells deviate greatly from independence or perhaps only some of them do so
![Page 65: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/65.jpg)
65
Learning Objective 2: Residual Analysis
A cell-by-cell comparison of the observed counts with the counts that are expected when H0 is true reveals the nature of the evidence against H0
The difference between an observed and expected count in a particular cell is called a residual
![Page 66: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/66.jpg)
66
Learning Objective 2: Residual Analysis
The residual is negative when fewer subjects are in the cell than expected under H0
The residual is positive when more subjects are in the cell than expected under H0
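The cell-by-cell comparison can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual expected count under independence (row total × column total / n). The function names and the 2x3 table of counts are hypothetical and are not taken from the slides.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def residuals(observed):
    """Raw residuals: observed count minus expected count, cell by cell."""
    expected = expected_counts(observed)
    return [[o - e for o, e in zip(obs_row, exp_row)]
            for obs_row, exp_row in zip(observed, expected)]


# Hypothetical 2x3 table of counts (illustrative only, not from the slides).
table = [[30, 50, 20],
         [20, 50, 30]]

for row in residuals(table):
    print(row)
# [5.0, 0.0, -5.0]
# [-5.0, 0.0, 5.0]
```

In this made-up table, the first cell holds 5 more subjects than expected under H0 (positive residual) and the last cell of the first row holds 5 fewer (negative residual).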
![Page 67: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/67.jpg)
67
Learning Objective 2: Residual Analysis
To determine whether a residual is large enough to indicate strong evidence of a deviation from independence in that cell, we use an adjusted form of the residual: the standardized residual
![Page 68: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/68.jpg)
68
Learning Objective 2: Residual Analysis
The standardized residual for a cell = (observed count – expected count)/se
A standardized residual reports the number of standard errors that an observed count falls from its expected count
The se describes how much the difference would tend to vary in repeated sampling if the variables were independent. Its formula is complex. Software can be used to find its value
A large standardized residual value provides evidence against independence in that cell
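The slide leaves the se to software. As a rough sketch of what such software might compute, the code below assumes the commonly used form se = sqrt(expected count × (1 – row proportion) × (1 – column proportion)). That formula is an assumption here, not something stated on the slide, and the function name and counts are hypothetical.

```python
import math


def standardized_residuals(observed):
    """(observed - expected) / se for every cell of a contingency table.

    Assumes se = sqrt(expected * (1 - row proportion) * (1 - column proportion)),
    which is not given on the slides.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(observed):
        out_row = []
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            se = math.sqrt(expected
                           * (1 - row_totals[i] / n)
                           * (1 - col_totals[j] / n))
            out_row.append((obs - expected) / se)
        result.append(out_row)
    return result


# Hypothetical 2x3 table of counts (illustrative only, not from the slides).
table = [[30, 50, 20],
         [20, 50, 30]]

for row in standardized_residuals(table):
    print([round(z, 2) for z in row])
```

Each printed value is the number of standard errors by which the observed count falls above (positive) or below (negative) its expected count.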
![Page 69: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/69.jpg)
69
Learning Objective 2: Example: Standardized Residuals for Religiosity and Gender
“To what extent do you consider yourself a religious person?”
![Page 70: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/70.jpg)
70
Interpret the standardized residuals in the table
The table exhibits large positive residuals for the cells for females who are very religious and for males who are not at all religious
In these cells, the observed count is much larger than the expected count
There is strong evidence that the population has more subjects in these cells than if the variables were independent
Learning Objective 2: Example: Standardized Residuals for Religiosity and Gender
![Page 71: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/71.jpg)
71
The table exhibits large negative residuals for the cells for females who are not at all religious and for males who are very religious
In these cells, the observed count is much smaller than the expected count
There is strong evidence that the population has fewer subjects in these cells than if the variables were independent
Learning Objective 2: Example: Standardized Residuals for Religiosity and Gender
![Page 72: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/72.jpg)
72
Chapter 11: Analyzing the Association Between Categorical Variables
Section 11.5: What if the Sample Size is Small?
Fisher’s Exact Test
![Page 73: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/73.jpg)
73
Learning Objectives
1. Fisher’s Exact Test
2. Example using Fisher’s Exact Test
3. Summary of Fisher’s Exact Test of Independence for 2x2 Tables
![Page 74: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/74.jpg)
74
Learning Objective 1: Fisher’s Exact Test
The chi-squared test of independence is a large-sample test
When any of the expected frequencies is small (less than about 5), small-sample tests are more appropriate
Fisher’s exact test is a small-sample test of independence
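One way to apply the "less than about 5" guideline is to compute every expected count and look at the smallest. The sketch below assumes expected count = row total × column total / n. The function names, the default threshold argument, and both tables are hypothetical.

```python
def min_expected_count(observed):
    """Smallest expected count under independence (row total * column total / n)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return min(r * c / n for r in row_totals for c in col_totals)


def large_sample_ok(observed, threshold=5):
    """True when every expected count is at least `threshold` (a rule of thumb)."""
    return min_expected_count(observed) >= threshold


# Hypothetical tables (illustrative only, not from the slides).
small_table = [[3, 1],
               [1, 3]]
large_table = [[30, 50, 20],
               [20, 50, 30]]

print(min_expected_count(small_table), large_sample_ok(small_table))  # 2.0 False
print(min_expected_count(large_table), large_sample_ok(large_table))  # 25.0 True
```

When the check fails, as for the small table, a small-sample method such as Fisher’s exact test is the more appropriate choice.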
![Page 75: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/75.jpg)
75
Learning Objective 1: Fisher’s Exact Test
The calculations for Fisher’s exact test are complex
Statistical software can be used to obtain the P-value for the test that the two variables are independent
The smaller the P-value, the stronger the evidence that the variables are associated
![Page 76: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/76.jpg)
76
Learning Objective 2: Fisher’s Exact Test Example: Tea Tastes Better with Milk Poured First?
This is an experiment conducted by Sir Ronald Fisher
His colleague, Dr. Muriel Bristol, claimed that when drinking tea she could tell whether the milk or the tea had been added to the cup first
![Page 77: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/77.jpg)
77
Experiment: Fisher asked her to taste eight cups of tea:
Four had the milk added first
Four had the tea added first
She was asked to indicate which four had the milk added first
The order of presenting the cups was randomized
Learning Objective 2: Fisher’s Exact Test Example: Tea Tastes Better with Milk Poured First?
![Page 78: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/78.jpg)
78
Results:
Learning Objective 2: Fisher’s Exact Test Example: Tea Tastes Better with Milk Poured First?
![Page 79: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/79.jpg)
79
Analysis:
Learning Objective 2: Fisher’s Exact Test Example: Tea Tastes Better with Milk Poured First?
![Page 80: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/80.jpg)
80
The one-sided version of the test pertains to the alternative that her predictions are better than random guessing
Does the P-value suggest that she had the ability to predict better than random guessing?
Learning Objective 2: Fisher’s Exact Test Example: Tea Tastes Better with Milk Poured First?
![Page 81: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/81.jpg)
81
The P-value of 0.243 does not give much evidence against the null hypothesis
The data did not support Dr. Bristol’s claim that she could tell whether the milk or the tea had been added to the cup first
Learning Objective 2: Fisher’s Exact Test Example: Tea Tastes Better with Milk Poured First?
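The one-sided P-value of 0.243 can be reproduced by counting selections. The sketch below assumes she correctly identified three of the four milk-first cups; the results table is not reproduced in this transcript, but that outcome is consistent with the reported P-value. Under random guessing, each of the C(8, 4) = 70 ways of picking four cups is equally likely. The function name is hypothetical.

```python
from math import comb


def prob_correct(k, cups_per_type=4):
    """P(exactly k milk-first cups are picked) when four cups are chosen at random."""
    total = comb(2 * cups_per_type, cups_per_type)  # 70 possible selections
    ways = comb(cups_per_type, k) * comb(cups_per_type, cups_per_type - k)
    return ways / total


# Assumption: three of the four milk-first cups were identified correctly.
observed_correct = 3

# One-sided P-value: probability of doing at least this well by guessing.
p_value = sum(prob_correct(k) for k in range(observed_correct, 5))
print(round(p_value, 3))  # 0.243
```

The sum is (16 + 1)/70: there are 16 ways to get exactly three correct and 1 way to get all four correct.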
![Page 82: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/82.jpg)
82
Learning Objective 3: Summary of Fisher’s Exact Test of Independence for 2x2 Tables
Assumptions:
Two binary categorical variables
Data are random
Hypotheses:
H0: the two variables are independent (p1 = p2)
Ha: the two variables are associated (p1 ≠ p2 or p1 > p2 or p1 < p2)
![Page 83: 1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?](https://reader030.vdocuments.us/reader030/viewer/2022032723/56649f525503460f94c76127/html5/thumbnails/83.jpg)
83
Learning Objective 3: Summary of Fisher’s Exact Test of Independence for 2x2 Tables
Test Statistic:
First cell count (this determines the others given the margin totals)
P-value:
Probability that the first cell count equals the observed value or a value even more extreme as predicted by Ha
Conclusion:
Report the P-value and interpret in context. If a decision is required, reject H0 when P-value ≤ significance level
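The summary can be turned into a small routine. This is a sketch under stated assumptions rather than a reference implementation: with the margins fixed, the first cell count is treated as hypergeometric, and the two-sided P-value sums the probabilities of all first-cell values that are no more likely than the observed one (one common convention, not specified on the slide). The function name, the `alternative` labels, and the example counts are hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_2x2(table, alternative="two-sided"):
    """P-value of Fisher's exact test for a 2x2 table [[a, b], [c, d]].

    "greater" corresponds to Ha: p1 > p2 and "less" to Ha: p1 < p2, where
    p1 and p2 are the first-column proportions in rows 1 and 2.
    """
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Exact probability that the first cell equals x, given the margins.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), total)

    # Values the first cell can take once the margin totals are fixed.
    support = range(max(0, col1 - row2), min(row1, col1) + 1)

    if alternative == "greater":
        p = sum(prob(x) for x in support if x >= a)
    elif alternative == "less":
        p = sum(prob(x) for x in support if x <= a)
    elif alternative == "two-sided":
        p_obs = prob(a)
        p = sum(prob(x) for x in support if prob(x) <= p_obs)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return float(p)


# Hypothetical counts (illustrative only, not from the slides).
example = [[7, 2],
           [3, 8]]
alpha = 0.05  # hypothetical significance level
p = fisher_exact_2x2(example, alternative="greater")
print(p, "reject H0" if p <= alpha else "do not reject H0")
```

Exact fractions are used for the individual probabilities so that the "no more likely than observed" comparison in the two-sided case is not affected by floating-point rounding.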