ANALYSIS OF VARIANCE (ANOVA) Avjinder Singh Kaler and Kristi Mai

Upload: avjinder-avi-kaler

Post on 21-Apr-2017

Page 1: Analysis of Variance (ANOVA)

ANALYSIS OF VARIANCE (ANOVA) Avjinder Singh Kaler and Kristi Mai

Page 2: Analysis of Variance (ANOVA)

Estimating a Population Variance/Standard Deviation
• $\chi^2$ (Chi-Square) Distribution

Comparing Variation in Two Samples
• F Distribution

One-Way Analysis of Variance (ANOVA)

Multiple Comparison Tests
• Tukey Test

Two-Way Analysis of Variance (ANOVA)

Page 3: Analysis of Variance (ANOVA)

Main Ideas:

• The sample variance is the best point estimate of the population variance, and the sample standard deviation is typically used to estimate the population standard deviation
• We can use a sample variance to construct a confidence interval (C.I.) for the true value of a population variance, and we can likewise use a sample standard deviation to construct a C.I. for the true value of a population standard deviation
• We can also test claims about a population variance or standard deviation

Page 4: Analysis of Variance (ANOVA)

If a population has a normal distribution, then the following formula describes the $\chi^2$ distribution:

$$\chi^2 = \frac{(n-1)\,s^2}{\sigma^2}$$

This is a chi-square score and is a measure of relative standing

We NEED degrees of freedom for the $\chi^2$ distribution
• $df = n - 1$ (in this situation)
• Although this value for degrees of freedom is common, $df$ are NOT always $n - 1$

Properties of the Chi-Square Distribution:
• The $\chi^2$ distribution is NOT symmetric like the t-distribution or the Normal distribution
  Note: Because the distribution is NOT symmetric, the C.I. will NOT be $s^2 \pm E$
• The values of $\chi^2$ can be zero or positive but cannot be negative
• The $\chi^2$ distribution is different for different degrees of freedom
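As a minimal sketch of the formula on this slide, the following Python snippet computes the chi-square score and its degrees of freedom for one sample. The data values and the claimed population standard deviation `sigma0` are made up for illustration and do not come from the slides.

```python
import statistics

def chi_square_score(sample, sigma0):
    """Return (chi2, df) for a claimed population standard deviation sigma0."""
    n = len(sample)
    s2 = statistics.variance(sample)   # sample variance (divides by n - 1)
    chi2 = (n - 1) * s2 / sigma0 ** 2  # chi-square score from the slide
    return chi2, n - 1

# Hypothetical data, assumed to come from a normal population
sample = [98, 102, 95, 110, 105, 99, 101, 97]
chi2, df = chi_square_score(sample, sigma0=5.0)
print(f"chi-square = {chi2:.3f} with df = {df}")
```

The score would then be compared against the $\chi^2$ distribution with the returned degrees of freedom.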

Page 5: Analysis of Variance (ANOVA)
Page 6: Analysis of Variance (ANOVA)

Main Ideas: The sample variance is the best point estimate of the population variance, and the sample standard deviation is typically used to estimate the population standard deviation

We can use two sample variances to test claims about the difference between two population variances

Page 7: Analysis of Variance (ANOVA)

• If two populations are normally distributed with equal variances (i.e. $\sigma_1^2 = \sigma_2^2$), then the following formula describes the F distribution: $F = \frac{s_1^2}{s_2^2}$
• This is an F-score and is a measure of relative standing
• Notice that this distribution compares the two variations in the form of a ratio
• We NEED two different degrees of freedom for the F distribution. In this particular situation, we have:
  • Numerator $df = n_1 - 1$
  • Denominator $df = n_2 - 1$
• Properties of the F Distribution:
  • The F distribution is NOT symmetric like the t-distribution or the Normal distribution
  • The values of F can be zero or positive but cannot be negative
  • The F distribution is different for different degrees of freedom and depends on TWO different degrees of freedom
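The ratio and its two degrees of freedom can be sketched in a few lines of Python. The two samples below are hypothetical, and the helper name `f_score` is chosen here only for illustration.

```python
import statistics

def f_score(sample1, sample2):
    """Return (F, numerator df, denominator df) for F = s1^2 / s2^2."""
    s1_sq = statistics.variance(sample1)  # sample variance of the first sample
    s2_sq = statistics.variance(sample2)  # sample variance of the second sample
    return s1_sq / s2_sq, len(sample1) - 1, len(sample2) - 1

# Hypothetical samples, assumed to come from normal populations
a = [12, 15, 11, 18, 14]
b = [13, 14, 13, 15, 14, 15]
F, df_num, df_den = f_score(a, b)
print(f"F = {F:.3f} with df = ({df_num}, {df_den})")
```

An F-score near 1 is what equal population variances would tend to produce; a much larger value suggests the first population varies more.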

Page 8: Analysis of Variance (ANOVA)
Page 9: Analysis of Variance (ANOVA)

Main Ideas: We can extend the hypothesis testing foundation that we already have to test the claim that three or more population means are all equal
• i.e. $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ vs. $H_1$: At least one mean differs

We can test this null hypothesis by analyzing sample variances

This test (a One-Way ANOVA) is appropriate when we wish to compare three or more population means within a set of quantitative data that is categorized according to one treatment (or factor)
• Treatment (factor): a characteristic allowing us to distinguish between the different populations of interest

We CANNOT simply test two samples at a time, because running many separate pairwise tests inflates the overall chance of a Type I error
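To see why testing two samples at a time is a problem, here is a rough sketch of how quickly the overall error rate grows with the number of pairwise tests. It assumes, purely for illustration, that the pairwise tests are independent, which is not exactly true in practice; the function name is hypothetical.

```python
from math import comb

def familywise_error_rate(k, alpha=0.05):
    """Chance of at least one false rejection across all pairwise tests,
    assuming (for illustration only) that the tests are independent."""
    m = comb(k, 2)                  # number of pairs among k means
    return m, 1 - (1 - alpha) ** m  # P(at least one Type I error)

for k in (3, 4, 5):
    m, fwer = familywise_error_rate(k)
    print(f"k = {k}: {m} pairwise tests, overall error rate about {fwer:.3f}")
```

Under this assumption, even three means already push the overall error rate to about 0.14, well above the nominal 0.05.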

Page 10: Analysis of Variance (ANOVA)

Requirements:

The populations have distributions that are approximately normal
• Loose requirement: only a problem if a population is very far from normal

The populations have the same variance $\sigma^2$
• Loose requirement: the ratio of variances can be as large as 9:1

The samples are SRS of quantitative data

The samples are independent of each other

The different samples are from populations that are categorized in only one way

Page 11: Analysis of Variance (ANOVA)

Test Statistic:

$$F = \frac{MS(\text{Treatment})}{MS(\text{Error})} \approx \frac{\text{Variance between samples}}{\text{Variance within samples}}$$

Note: p-values and critical values are from the F distribution

The F test statistic is very sensitive to sample means, even though it is based on variance

Degrees of Freedom
• Equal Sample Sizes
  Numerator $df = k - 1$
  Denominator $df = k(n - 1)$
• Unequal Sample Sizes
  Numerator $df = k - 1$
  Denominator $df = N - k$
• Notation:
  $k$: number of samples
  $n$: number of values in each sample (i.e. sample size)
  $N$: total number of values in all samples combined

Page 12: Analysis of Variance (ANOVA)

Notice that the One-Way ANOVA is an F test, like the test for comparing variances. Specifically, it is a right-tailed F test.

Conclusion Cautions:
• Rejecting the null hypothesis does NOT tell us that all of the means are different!
• In fact, rejecting the null hypothesis cannot tell us which mean(s) is (are) different
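The One-Way ANOVA test statistic and its degrees of freedom can be sketched from scratch as follows. This is only an illustration with hypothetical data; in the course the values come from software output. The sums of squares used for MS(Treatment) and MS(Error) are the standard textbook definitions, which the slides do not spell out.

```python
import statistics

def one_way_anova(samples):
    """Return (F, numerator df, denominator df) for a list of samples."""
    k = len(samples)                  # number of samples
    N = sum(len(s) for s in samples)  # total number of values
    grand_mean = sum(sum(s) for s in samples) / N
    # Variation between samples: sample means around the grand mean
    ss_treatment = sum(
        len(s) * (statistics.mean(s) - grand_mean) ** 2 for s in samples
    )
    # Variation within samples: values around their own sample mean
    ss_error = sum((len(s) - 1) * statistics.variance(s) for s in samples)
    df_num, df_den = k - 1, N - k     # N - k equals k(n - 1) for equal sizes
    F = (ss_treatment / df_num) / (ss_error / df_den)
    return F, df_num, df_den

# Hypothetical data: three samples of three values each
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
F, df_num, df_den = one_way_anova(groups)
print(f"F = {F:.2f} with df = ({df_num}, {df_den})")
```

Because the test is right-tailed, a large F (sample means far apart relative to the spread within samples) is what leads to a small p-value.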

Page 13: Analysis of Variance (ANOVA)

Use the performance IQ scores listed in Table 12-1 and a significance level of $\alpha = 0.05$ to test the claim that the three samples come from populations with means that are all equal.

Page 14: Analysis of Variance (ANOVA)

Here are summary statistics from the collected data:

Page 15: Analysis of Variance (ANOVA)

Requirement Check:

1. The three samples appear to come from populations that are approximately normal.

2. The three samples have standard deviations that are not dramatically different.

3. We can treat the samples as simple random samples.

4. The samples are independent of each other and the IQ scores are not matched in any way.

5. The three samples are categorized according to a single factor: low lead, medium lead, and high lead.

Page 16: Analysis of Variance (ANOVA)

The hypotheses are:

$H_0: \mu_1 = \mu_2 = \mu_3$

$H_1$: At least one of the means is different from the others.

The significance level is $\alpha = 0.05$.

Page 17: Analysis of Variance (ANOVA)
Page 18: Analysis of Variance (ANOVA)

From StatCrunch results, the p-value is 0.020 when rounded.

Because the p-value is less than the significance level of $\alpha = 0.05$, we can reject the null hypothesis.

There is sufficient evidence to conclude that the three samples come from populations whose means are not all equal.

We cannot conclude formally that any particular mean is different from the others, but it appears that greater blood lead levels are associated with lower performance IQ scores.

Page 19: Analysis of Variance (ANOVA)

Larger values of the test statistic result in smaller P-values, so the ANOVA test is right-tailed.

Assuming that the populations have the same variance $\sigma^2$ (as required for the test), the F test statistic is the ratio of these two estimates of $\sigma^2$:

1) variation between samples (based on variation among sample means)

2) variation within samples (based on the sample variances)

Page 20: Analysis of Variance (ANOVA)
Page 21: Analysis of Variance (ANOVA)

Multiple Comparison Tests: these tests should be used to identify where the difference(s) in the means lie if the null hypothesis in the One-Way ANOVA is rejected. Multiple comparison tests use pairs of means to identify which means are different while still accounting for the multiple testing problem mentioned previously, by making adjustments that keep the overall significance level under control
• Examples: Duncan, SNK, Scheffe, Dunnett, LSD, Bonferroni, and Tukey Tests

In this course we will utilize the Tukey Test!
• The Tukey Test provides associated p-values for the comparison of each pair of means

This test will allow you to identify whether any two of the $k$ means differ

The Null Hypothesis in this test assumes the equality of the two means being compared

Page 22: Analysis of Variance (ANOVA)

• The average MPGs for 2000-2010 vehicles from four car manufacturers are compared. We would like to see if there is a difference in average MPG. (Notice that we are testing to see if there is a difference in the means of a quantitative variable, MPG, across the four levels of a single factor/treatment, the manufacturer)

• After deeming a One-Way ANOVA appropriate for this research question and checking the requirements for this statistical procedure, we find that at least one of the mean MPGs differs due to a low p-value (for instance, 0.0003). (Refer back to the One-Way ANOVA section and the hypotheses for this test)

Page 23: Analysis of Variance (ANOVA)

• Since the One-Way ANOVA revealed a difference but cannot tell us, specifically, where the difference in mean MPGs is, we decide to perform a Tukey Test to answer our research question in full. Where is the difference? Which manufacturer has a higher/lower mean MPG?

• The Tukey Test compares all $k$ means, two at a time. (Note: This can be done here because a Tukey Test does control the overall significance level for pairwise comparisons)
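As a small illustration of what "all $k$ means, two at a time" involves for the four manufacturers, the sketch below lists every pairwise null hypothesis. The labels A to D are placeholders, since the slides do not name the manufacturers, and the snippet only enumerates the comparisons; it does not compute Tukey p-values.

```python
from itertools import combinations

# Placeholder labels: the slides do not name the four manufacturers
manufacturers = ["A", "B", "C", "D"]

# Every unordered pair of manufacturers gets its own comparison
pairs = list(combinations(manufacturers, 2))
for first, second in pairs:
    print(f"H0: mean MPG of {first} = mean MPG of {second}")
print(len(pairs), "pairwise comparisons in total")
```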

Page 24: Analysis of Variance (ANOVA)
Page 25: Analysis of Variance (ANOVA)

We introduce the method of two-way analysis of variance, which is used with data partitioned into categories according to two factors.

The methods of this section require that we begin by testing for an interaction between the two factors.

Then we test whether the row or column factors have effects.

Page 26: Analysis of Variance (ANOVA)

Main Ideas:

• We can extend the One-Way ANOVA to test the claim that three or more population means are all equal when the data is categorized in TWO ways (not just one)

• This test (a Two-Way ANOVA) is appropriate when we wish to compare three or more population means within a set of quantitative data that is categorized according to two treatments (or factors)

We CANNOT simply test the effect of the two factors by utilizing two One-Way ANOVAs because the One-Way ANOVA test would ignore the possible interaction between the two factors involved

Page 27: Analysis of Variance (ANOVA)

Interaction: there is an interaction between two factors if the effect of one of the factors changes for different categories of the other factor (like a combination effect or a synergy effect)

• Interaction plots can be used to visually assess if an interaction effect is present

• We must test for an interaction effect first!

Page 28: Analysis of Variance (ANOVA)

1. For each cell, the populations have distributions that are approximately normal

2. The populations have the same variance $\sigma^2$

3. The samples are SRS of quantitative data

4. The samples are independent of each other

5. The samples are from populations that are categorized in two ways

6. All of the cells have the same number of sample values (i.e. a balanced design)
   • Not a general requirement for Two-Way ANOVA, but we won't have unbalanced designs

Page 29: Analysis of Variance (ANOVA)

* Notice that there are up to three tests being performed during a Two-Way ANOVA *

First Test: The Test for an Interaction Effect

Hypotheses:
• $H_0$: There is no Interaction Effect
  $H_1$: There is an Interaction Effect

Test Statistic:
• $F = \frac{MS(\text{Interaction})}{MS(\text{Error})}$
  *Note: P-values and critical values are from the F distribution

Conclusion:
• If the null hypothesis of 'No Interaction Effect' is rejected, then there is a significant interaction effect and we CANNOT proceed to test for main effects. So, if we reject $H_0$, we must STOP.

Page 30: Analysis of Variance (ANOVA)

Second and Third Tests: The Test for Main (Row/Column Factor) Effects

Hypotheses:
• $H_0$: There is no Row/Column Factor Effect
  $H_1$: There is a Row/Column Factor Effect

Test Statistic:
• $F = \frac{MS(\text{Row})}{MS(\text{Error})}$ and $F = \frac{MS(\text{Column})}{MS(\text{Error})}$
• Note: P-values and critical values are from the F distribution

Notice that the Two-Way ANOVA is a two-step procedure that performs one to three separate F tests
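The three F statistics can be sketched from scratch for a balanced design. The sums of squares and the degrees of freedom used here, $(r-1)(c-1)$ for the interaction and $rc(n-1)$ for the error, are the standard balanced-design formulas and are not stated on the slides; the data and the function name are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def two_way_anova(cells):
    """cells[i][j] holds the n values for row i, column j (balanced design).
    Returns {effect: (F, (numerator df, denominator df))}."""
    r, c, n = len(cells), len(cells[0]), len(cells[0][0])
    cell_means = [[mean(cell) for cell in row] for row in cells]
    row_means = [mean(row) for row in cell_means]
    col_means = [mean(col) for col in zip(*cell_means)]
    grand = mean(row_means)

    ss_row = c * n * sum((m - grand) ** 2 for m in row_means)
    ss_col = r * n * sum((m - grand) ** 2 for m in col_means)
    ss_int = n * sum(
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(r) for j in range(c)
    )
    ss_err = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(r) for j in range(c) for x in cells[i][j]
    )

    df_row, df_col = r - 1, c - 1
    df_int, df_err = df_row * df_col, r * c * (n - 1)
    ms_err = ss_err / df_err  # MS(Error), the shared denominator
    return {
        "interaction": (ss_int / df_int / ms_err, (df_int, df_err)),
        "row": (ss_row / df_row / ms_err, (df_row, df_err)),
        "column": (ss_col / df_col / ms_err, (df_col, df_err)),
    }

# Hypothetical 2 x 2 design with two values per cell
cells = [[[1, 3], [5, 7]], [[2, 4], [10, 12]]]
result = two_way_anova(cells)
for effect, (F, df) in result.items():
    print(f"{effect}: F = {F:.2f}, df = {df}")
```

All three tests share MS(Error) as the denominator, which is why the procedure needs more than one value per cell to estimate the within-cell variation.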

Page 31: Analysis of Variance (ANOVA)

The data in the table are categorized with two factors:

1. Gender: Male or Female

2. Blood Lead Level: Low, Medium, or High

The subcategories are called cells, and the response variable is IQ score.

Page 32: Analysis of Variance (ANOVA)

Let's explore the IQ data in the table by calculating the mean for each cell and constructing an interaction graph.

Page 33: Analysis of Variance (ANOVA)

An interaction effect is suggested if the line segments are far from being parallel.

No interaction effect is suggested if the line segments are approximately parallel.

For the IQ scores, it appears there is an interaction effect:
• Females with high lead exposure appear to have lower IQ scores, while males with high lead exposure appear to have high IQ scores.

Page 34: Analysis of Variance (ANOVA)

Step 1 (Interaction Effect): test the null hypothesis that there is no interaction

Step 2 (Row/Column Effects): if we conclude there is no interaction effect, proceed with these two hypothesis tests
• Row Factor: no effects from row
• Column Factor: no effects from column

All tests use the F distribution.
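The two-step procedure can be written as a small decision helper that takes p-values read from software output. The function name and its return format are illustrative assumptions; the p-values in the usage lines are the ones reported for the IQ example in this section.

```python
def two_way_decisions(p_interaction, p_row, p_column, alpha=0.05):
    """Apply the two-step Two-Way ANOVA procedure to three p-values."""
    if p_interaction <= alpha:
        # Significant interaction: stop, do not test the main effects
        return ["Reject H0: there is an interaction effect; stop here."]
    results = ["Fail to reject H0: no interaction effect."]
    for name, p in (("row", p_row), ("column", p_column)):
        if p <= alpha:
            results.append(f"Reject H0: there is a {name} factor effect.")
        else:
            results.append(f"Fail to reject H0: no {name} factor effect.")
    return results

# p-values reported for the IQ example in this section
for line in two_way_decisions(0.655, 0.791, 0.906):
    print(line)
```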

Page 35: Analysis of Variance (ANOVA)
Page 36: Analysis of Variance (ANOVA)
Page 37: Analysis of Variance (ANOVA)

Given the performance IQ scores in the table at the beginning of this section, use two-way ANOVA to test for an interaction effect, an effect from the row factor of gender, and an effect from the column factor of blood lead level. Use a 0.05 level of significance.

Page 38: Analysis of Variance (ANOVA)

Requirement Check:

1. For each cell, the sample values appear to be from a normally distributed population.

2. The variances of the cells are 95.3, 146.7, 130.8, 812.7, 142.3, and 143.8, which are considerably different from each other. We might have some reservations that the population variances are equal, but for the purposes of this example, we will assume the requirement is met.

3. The samples are simple random samples.

4. The samples are independent of each other; the subjects are not matched in any way.

5. The sample values are categorized in two ways (gender and blood lead level).

6. All the cells have the same number (five) of sample values.

Page 39: Analysis of Variance (ANOVA)

The StatCrunch output is displayed below:

Page 40: Analysis of Variance (ANOVA)

Step 1: Test that there is no interaction between the two factors. The test statistic is F = 0.43 and the P-value is 0.655, so we fail to reject the null hypothesis. It does not appear that the performance IQ scores are affected by an interaction between gender and blood lead level. There does not appear to be an interaction effect, so we proceed to test for row and column effects.

Page 41: Analysis of Variance (ANOVA)

Step 2: We now test:

$H_0$: There is no Row Factor (gender) Effect
$H_1$: There is a Row Factor (gender) Effect

For the row factor, F = 0.07 and the P-value is 0.791. We fail to reject the null hypothesis; there is no evidence that IQ scores are affected by the gender of the subject.

$H_0$: There is no Column Factor (blood lead level) Effect
$H_1$: There is a Column Factor (blood lead level) Effect

For the column factor, F = 0.10 and the P-value is 0.906. We fail to reject the null hypothesis; there is no evidence that IQ scores are affected by the level of lead exposure.

Page 42: Analysis of Variance (ANOVA)

Interpretation: Based on the sample data, we conclude that IQ scores do not appear to be affected by gender or blood lead level.

Caution:
• Two-way analysis of variance is not one-way analysis of variance done twice.
• Be sure to test for an interaction between the two factors.

Page 43: Analysis of Variance (ANOVA)

To better understand the method of two-way analysis of variance, let's repeat Example 1 after adding 30 points to each of the performance IQ scores of the females only. That is, in Table 12-3, add 30 points to each of the listed scores for females.

Page 44: Analysis of Variance (ANOVA)

Step 1:

• Interaction Effect: The display shows a p-value of 0.655 for an interaction effect. Because that p-value is not less than or equal to 0.05, we fail to reject the null hypothesis of no interaction effect. There does not appear to be an interaction effect.

Page 45: Analysis of Variance (ANOVA)

Step 2:

• Row Effect: The display shows a p-value less than 0.0001 for the row variable of gender, so we reject the null hypothesis of no effect from the factor of gender. In this case, the gender of the subject does appear to have an effect on performance IQ scores.

• Column Effect: The display shows a p-value of 0.906 for the column variable of blood lead level, so we fail to reject the null hypothesis of no effect from the factor of blood lead level. The blood lead level does not appear to have an effect on performance IQ scores.

Page 46: Analysis of Variance (ANOVA)

Interpretation:

After adding 30 points to each score of the female subjects, we conclude that there is an effect due to the gender of the subject, but there is no apparent effect from an interaction or from the blood lead level.

Page 47: Analysis of Variance (ANOVA)

If our sample data consist of only one observation per cell, there is no variation within individual cells and sample variances cannot be calculated for individual cells.

If it seems reasonable to assume there is no interaction between the two factors, make that assumption and test separately:

$H_0$: There is no Row/Column Factor Effect
$H_1$: There is a Row/Column Factor Effect

(The mechanics of the tests are the same as presented earlier.)

Page 48: Analysis of Variance (ANOVA)

If we use only the first entry from each cell in Table 12-3, we get the StatCrunch results shown below. Use a 0.05 significance level to test for an effect from the row factor of gender and also test for an effect from the column factor of blood lead level. Assume that there is no effect from an interaction between gender and blood lead level.

Page 49: Analysis of Variance (ANOVA)

• Row Factor:
  • We first use the results from the StatCrunch display to test the null hypothesis of no effect from the row factor of gender (male, female). The test statistic (0.02) is not significant, because the corresponding p-value is 0.901. We fail to reject the null hypothesis. It appears that performance IQ scores are not affected by the gender of the subject.

• Column Factor:
  • We now use the StatCrunch display to test the null hypothesis of no effect from the column factor of blood lead level (low, medium, high). The test statistic (1.16) is not significant because the corresponding p-value is 0.463. We fail to reject the null hypothesis, so it appears that the performance IQ scores are not affected by the blood lead level.

Page 50: Analysis of Variance (ANOVA)

• Complete Practice Problems 7