
Lecture Slides

Elementary Statistics, Tenth Edition

and the Triola Statistics Series

by Mario F. Triola

Copyright © 2007 Pearson Education, Inc., publishing as Pearson Addison-Wesley.

Chapter 12

Analysis of Variance

12-1 Overview

12-2 One-Way ANOVA

12-3 Two-Way ANOVA


Section 12-1 & 12-2: Overview and One-Way ANOVA


Created by Erin Hodgess, Houston, Texas. Revised to accompany the 10th Edition by Jim Zimmer, Chattanooga State, Chattanooga, TN.

Key Concept

This section introduces the method of one-way analysis of variance, which is used for tests of hypotheses that three or more population means are all equal.


Overview

Analysis of variance (ANOVA) is a method for testing the hypothesis that three or more population means are equal.

For example:

H0: µ1 = µ2 = µ3 = . . . = µk

H1: At least one mean is different

ANOVA Methods Require the F-Distribution

1. The F-distribution is not symmetric; it is skewed to the right.

2. The values of F can be 0 or positive; they cannot be negative.

3. There is a different F-distribution for each pair of degrees of freedom for the numerator and denominator.

Critical values of F are given in Table A-5.

Figure 12-1: the F distribution

One-Way ANOVA

An Approach to Understanding ANOVA

1. Understand that a small P-value (such as 0.05 or less) leads to rejection of the null hypothesis of equal means. With a large P-value (such as greater than 0.05), fail to reject the null hypothesis of equal means.


2. Develop an understanding of the underlying rationale by studying the examples in this section.

One-Way ANOVA

An Approach to Understanding ANOVA

3. Become acquainted with the nature of the SS (sum of squares) and MS (mean square) values and their role in determining the F test statistic, but use statistical software packages or a calculator for finding those values.


Definition

Treatment (or factor)

A treatment (or factor) is a property or characteristic that allows us to distinguish the different populations from one another.

Use computer software or a TI-83/84 calculator for ANOVA calculations if possible.

One-Way ANOVA

Requirements

1. The populations have approximately normal distributions.

2. The populations have the same variance σ² (or standard deviation σ).

3. The samples are simple random samples.


4. The samples are independent of each other.

5. The different samples are from populations that are categorized in only one way.

Procedure for testing H0: µ1 = µ2 = µ3 = . . .

1. Use STATDISK, Minitab, Excel, or a TI-83/84 calculator to obtain results.

2. Identify the P-value from the display.

3. Form a conclusion based on these criteria:

If P-value ≤ α, reject the null hypothesis of equal means.

If P-value > α, fail to reject the null hypothesis of equal means.
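The same three-step decision can also be carried out outside of STATDISK, Minitab, Excel, or the TI-83/84. Below is a minimal sketch in Python using scipy's f_oneway; the three sample lists are made-up illustration data, not values from the text.

```python
# Minimal sketch of the one-way ANOVA procedure in Python (scipy).
# The three sample lists are hypothetical illustration data, not Table 12-1.
from scipy import stats

sample1 = [5.2, 4.8, 6.1, 5.5, 5.9]
sample2 = [6.3, 5.7, 6.8, 6.0, 6.2]
sample3 = [5.0, 5.4, 4.7, 5.8, 5.1]

f_stat, p_value = stats.f_oneway(sample1, sample2, sample3)
alpha = 0.05

print(f"F = {f_stat:.4f}, P-value = {p_value:.4f}")
if p_value <= alpha:
    print("Reject H0: at least one population mean appears to differ.")
else:
    print("Fail to reject H0: no evidence that the means differ.")
```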

Procedure for testing H0: µ1 = µ2 = µ3 = . . .

Caution when interpreting ANOVA results:

When we conclude that there is sufficient evidence to reject the claim of equal population means, we cannot conclude from ANOVA that any particular mean is different from the others.

There are several other tests that can be used to identify the specific means that are different; those procedures are called multiple comparison procedures, and they are discussed later in this section.

Example: Weights of Poplar Trees

Do the samples come from populations with different means?

Example: Weights of Poplar Trees

Do the samples come from populations with different means?

For a significance level of α = 0.05, use STATDISK, Minitab, Excel, or a TI-83/84 calculator to test the claim that the four samples come from populations with means that are not all the same.

H0: µ1 = µ2 = µ3 = µ4

H1: At least one of the means is different from the others.

Example: Weights of Poplar Trees

Do the samples come from populations with different means? (STATDISK and Minitab displays)

Example: Weights of Poplar Trees

Do the samples come from populations with different means? (Excel and TI-83/84 Plus displays)

Example: Weights of Poplar Trees

Do the samples come from populations with different means?

H0: µ1 = µ2 = µ3 = µ4

H1: At least one of the means is different from the others.

The displays all show a P-value of approximately 0.007.

Because the P-value is less than the significance level of α = 0.05, we reject the null hypothesis of equal means.

There is sufficient evidence to support the claim that the four population means are not all the same. We conclude that those weights come from populations having means that are not all the same.

ANOVA Fundamental Concepts

Estimate the common value of σ²:

1. The variance between samples (also called variation due to treatment) is an estimate of the common population variance σ² that is based on the variability among the sample means.

2. The variance within samples (also called variation due to error) is an estimate of the common population variance σ² based on the sample variances.

ANOVA Fundamental Concepts

Test Statistic for One-Way ANOVA

F = (variance between samples) / (variance within samples)

An excessively large F test statistic is evidence against equal population means.

Relationships Between the F Test Statistic and P-Value

Figure 12-2


Calculations with Equal Sample Sizes

Variance between samples = n·sx̄², where sx̄² = variance of the sample means

Variance within samples = sp², where sp² = pooled variance (the mean of the sample variances)

Example: Sample Calculations

Use Table 12-2 to calculate the variance between samples, the variance within samples, and the F test statistic.

Example: Sample Calculations

1. Find the variance between samples = n·sx̄².

For the sample means 5.5, 6.0, and 6.0, the variance of the sample means is sx̄² = 0.0833,

so n·sx̄² = 4 × 0.0833 = 0.3332.

2. Estimate the variance within samples by calculating the mean of the sample variances:

sp² = (3.0 + 2.0 + 2.0) / 3 = 2.3333

3. Evaluate the F test statistic:

F = (variance between samples) / (variance within samples) = 0.3332 / 2.3333 = 0.1428
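As a quick check of the arithmetic above, here is a short sketch that simply reproduces the sample means and variances quoted on the slide:

```python
# Reproduce the equal-sample-size calculation shown above.
import statistics

means = [5.5, 6.0, 6.0]        # sample means from Table 12-2
variances = [3.0, 2.0, 2.0]    # sample variances from Table 12-2
n = 4                          # common sample size

s_xbar_sq = statistics.variance(means)     # variance of the sample means, about 0.0833
between = n * s_xbar_sq                    # about 0.333
within = sum(variances) / len(variances)   # pooled variance, about 2.3333
print(between, within, between / within)   # F is about 0.143
```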

Critical Value of F

Right-tailed test

Degrees of freedom with k samples of the same size n:

numerator df = k – 1

denominator df = k(n – 1)
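Instead of Table A-5, a statistics package can return the right-tailed critical value directly; here is a small sketch using scipy (the values of k, n, and α are illustrative):

```python
# Right-tailed critical F value for k samples of the same size n.
from scipy import stats

k, n, alpha = 3, 4, 0.05
df_num = k - 1          # numerator degrees of freedom
df_den = k * (n - 1)    # denominator degrees of freedom
f_crit = stats.f.ppf(1 - alpha, df_num, df_den)
print(f"Critical value F({df_num}, {df_den}) = {f_crit:.4f}")  # about 4.26
```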

Calculations with Unequal Sample Sizes

F = (variance between samples) / (variance within samples)

  = [Σ ni(x̄i – x̄)² / (k – 1)] / [Σ (ni – 1)si² / Σ (ni – 1)]

where x̄ = mean of all sample scores combined

k = number of population means being compared

ni = number of values in the ith sample

x̄i = mean of values in the ith sample

si² = variance of values in the ith sample

Key Components of the ANOVA Method

SS(total), or total sum of squares, is a measure of the total variation (around x̄) in all the sample data combined.

Formula 12-1:  SS(total) = Σ(x – x̄)²

Key Components of the ANOVA Method

SS(treatment), also referred to as SS(factor), SS(between groups), or SS(between samples), is a measure of the variation between the sample means.

Formula 12-2:  SS(treatment) = n1(x̄1 – x̄)² + n2(x̄2 – x̄)² + . . . + nk(x̄k – x̄)² = Σ ni(x̄i – x̄)²

Key Components of the ANOVA Method

SS(error), also referred to as SS(within groups) or SS(within samples), is a sum of squares representing the variability that is assumed to be common to all the populations being considered.

Formula 12-3:  SS(error) = (n1 – 1)s1² + (n2 – 1)s2² + (n3 – 1)s3² + . . . + (nk – 1)sk² = Σ(ni – 1)si²

Key Components of the ANOVA Method

Given the previous expressions for SS(total), SS(treatment), and SS(error), the following relationship will always hold.

Formula 12-4:  SS(total) = SS(treatment) + SS(error)

Mean Squares (MS)

MS(treatment) is a mean square for treatment, obtained as follows:

Formula 12-5:  MS(treatment) = SS(treatment) / (k – 1)

MS(error) is a mean square for error, obtained as follows:

Formula 12-6:  MS(error) = SS(error) / (N – k)

Mean Squares (MS)

MS(total) is a mean square for the total variation, obtained as follows:

Formula 12-7:  MS(total) = SS(total) / (N – 1)

Test Statistic for ANOVA with Unequal Sample Sizes

Formula 12-8:  F = MS(treatment) / MS(error)

Numerator df = k – 1

Denominator df = N – k
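Formulas 12-1 through 12-8 translate directly into a short calculation. The sketch below uses three hypothetical samples of unequal sizes; only the formulas themselves come from this section.

```python
# Sketch of SS(treatment), SS(error), SS(total), the MS values, and F
# for unequal sample sizes (Formulas 12-1 through 12-8).
from scipy import stats

samples = [                      # hypothetical illustration data
    [5.2, 4.8, 6.1, 5.5],
    [6.3, 5.7, 6.8, 6.0, 6.2],
    [5.0, 5.4, 4.7],
]

k = len(samples)                                  # number of samples
N = sum(len(s) for s in samples)                  # total number of values
grand_mean = sum(x for s in samples for x in s) / N

ss_treatment = sum(len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in samples)
ss_error = sum(sum((x - sum(s) / len(s)) ** 2 for x in s) for s in samples)
ss_total = sum((x - grand_mean) ** 2 for s in samples for x in s)

ms_treatment = ss_treatment / (k - 1)             # Formula 12-5
ms_error = ss_error / (N - k)                     # Formula 12-6
f_stat = ms_treatment / ms_error                  # Formula 12-8
p_value = stats.f.sf(f_stat, k - 1, N - k)

print(round(ss_total, 4), round(ss_treatment + ss_error, 4))  # Formula 12-4: these agree
print(round(f_stat, 4), round(p_value, 4))
```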

Example: Weights of Poplar Trees

Table 12-3 has a format often used in computer displays.


Identifying Means That Are Different

After conducting an analysis of variance test, we might conclude that there is sufficient evidence to reject a claim of equal population means, but we cannot conclude from ANOVA that any particular mean is different from the others.

Identifying Means That Are Different

Informal procedures to identify the specific means that are different

1. Use the same scale for constructing boxplots of the data sets to see if one or more of the data sets are very different from the others.

2. Construct confidence interval estimates of the means from the data sets, then compare those confidence intervals to see if one or more of them do not overlap with the others.

Bonferroni Multiple Comparison Test

Step 1. Do a separate t test for each pair of samples, but make the adjustments described in the following steps.

Step 2. For an estimate of the variance σ² that is common to all of the involved populations, use the value of MS(error).

Bonferroni Multiple Comparison Test

Step 2 (cont.) Using the value of MS(error), calculate the value of the test statistic, as shown below. (This example shows the comparison for Sample 1 and Sample 4.)

t = (x̄1 – x̄4) / √[MS(error) · (1/n1 + 1/n4)]

Bonferroni Multiple Comparison Test

Step 2 (cont.) Change the subscripts and use another pair of samples until all of the different possible pairs of samples have been tested.

Step 3. After calculating the value of the test statistic t for a particular pair of samples, find either the critical t value or the P-value, but make the following adjustment:

Bonferroni Multiple Comparison Test

Step 3 (cont.) P-value: Use the test statistic t with df = N – k, where N is the total number of sample values and k is the number of samples. Find the P-value the usual way, but adjust the P-value by multiplying it by the number of different possible pairings of two samples.

(For example, with four samples, there are six different possible pairings, so adjust the P-value by multiplying it by 6.)

Bonferroni Multiple Comparison Test

Step 3 (cont.) Critical value: When finding the critical value, adjust the significance level α by dividing it by the number of different possible pairings of two samples.

(For example, with four samples, there are six different possible pairings, so adjust the significance level α by dividing it by 6.)
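The steps above can be collected into a small helper. This is only a sketch; the function name and its arguments are hypothetical and are not part of the text.

```python
# Hypothetical helper for one Bonferroni pairwise comparison.
# ms_error and df_error come from the one-way ANOVA results.
from scipy import stats

def bonferroni_pair(mean_i, n_i, mean_j, n_j, ms_error, df_error, num_pairs):
    """Return the t statistic and the Bonferroni-adjusted two-tailed P-value."""
    se = (ms_error * (1 / n_i + 1 / n_j)) ** 0.5
    t = (mean_i - mean_j) / se
    p_two_tailed = 2 * stats.t.sf(abs(t), df_error)
    return t, min(1.0, p_two_tailed * num_pairs)  # multiply by number of pairings
```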

Example: Weights of Poplar Trees

Using the Bonferroni Test

Using the data in Table 12-1, we concluded that there is sufficient evidence to warrant rejection of the claim of equal means.

The Bonferroni test requires a separate t test for each different possible pair of samples.

H0: µ1 = µ2    H0: µ1 = µ3    H0: µ1 = µ4

H0: µ2 = µ3    H0: µ2 = µ4    H0: µ3 = µ4

Example: Weights of Poplar Trees

Using the Bonferroni Test

We begin with testing H0: µ1 = µ4.

From Table 12-1: x̄1 = 0.184, n1 = 5, x̄4 = 1.334, n4 = 5

t = (x̄1 – x̄4) / √[MS(error) · (1/n1 + 1/n4)] = (0.184 – 1.334) / √[0.2723275 · (1/5 + 1/5)] = –3.484

Example: Weights of Poplar Trees

Using the Bonferroni Test

Test statistic = – 3.484

df = N – k = 20 – 4 = 16

Two-tailed P-value is 0.003065, but adjust it by multiplying by 6 (the number of different possible pairs of samples) to get P-value = 0.01839.

Because the adjusted P-value is less than α = 0.05, we reject the null hypothesis.

It appears that Samples 1 and 4 have significantly different means.
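The numbers quoted above can be checked with a few lines; this sketch uses the slide's values for x̄1, x̄4, n1, n4, and MS(error).

```python
# Check the Bonferroni comparison of Samples 1 and 4.
from scipy import stats

x1, n1, x4, n4 = 0.184, 5, 1.334, 5
ms_error, df, num_pairs = 0.2723275, 16, 6

t = (x1 - x4) / (ms_error * (1 / n1 + 1 / n4)) ** 0.5     # about -3.484
p = 2 * stats.t.sf(abs(t), df)                            # about 0.0031
print(round(t, 3), round(p, 6), round(p * num_pairs, 5))  # adjusted P-value about 0.0184
```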

Example: Weights of Poplar Trees

SPSS Bonferroni Results

Recap

In this section we have discussed:

One-way analysis of variance (ANOVA)

Calculations with Equal Sample Sizes

Calculations with Unequal Sample Sizes

Identifying Means That Are Different

Bonferroni Multiple Comparison Test

Section 12-3: Two-Way ANOVA

Created by Erin Hodgess, Houston, Texas. Revised to accompany the 10th Edition by Jim Zimmer, Chattanooga State, Chattanooga, TN.

Key Concepts

The analysis of variance procedure introduced in Section 12-2 is referred to as one-way analysis of variance because the data are categorized into groups according to a single factor (or treatment).

In this section we introduce the method of two-way analysis of variance, which is used with data partitioned into categories according to two factors.

Two-Way Analysis of Variance

Two-Way ANOVA involves two factors.

The data are partitioned into subcategories called cells.

Example: Poplar Tree Weights


Definition

There is an interaction between two factors if the effect of one of the factors changes for different categories of the other factor.

Example: Poplar Tree Weights

Exploring Data

Calculate the mean for each cell.


Example: Poplar Tree Weights

Exploring Data

Display the means on a graph. If a graph such as Figure 12-3 results in line segments that are approximately parallel, we have evidence that there is not an interaction between the row and column variables.

Example: Poplar Tree Weights

Minitab


Requirements

1. For each cell, the sample values come from a population with a distribution that is approximately normal.

2. The populations have the same variance σ².

3. The samples are simple random samples.

4. The samples are independent of each other.

5. The sample values are categorized two ways.

6. All of the cells have the same number of sample values.

Two-Way ANOVA calculations are quite involved, so we will assume that a software package is being used.

Minitab, Excel, TI-83/84, or STATDISK can be used.
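For readers working in Python rather than the packages above, a two-way ANOVA table with an interaction term can also be produced with statsmodels. This is a minimal sketch; the column names and data below are hypothetical stand-ins, not the poplar tree data from the text.

```python
# Minimal two-way ANOVA sketch (two factors plus interaction) with statsmodels.
# The weight values and factor labels are hypothetical illustration data.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

data = pd.DataFrame({
    "weight":    [0.4, 0.6, 0.5, 1.2, 1.5, 1.1,
                  0.7, 0.9, 0.8, 1.8, 2.0, 1.6],
    "site":      ["1"] * 6 + ["2"] * 6,
    "treatment": ["none"] * 3 + ["fertilizer"] * 3 + ["none"] * 3 + ["fertilizer"] * 3,
})

model = ols("weight ~ C(site) * C(treatment)", data=data).fit()
table = sm.stats.anova_lm(model, typ=2)  # rows for site, treatment, interaction, residual
print(table)
```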


Figure 12-4: Procedure for Two-Way ANOVA

Example: Poplar Tree Weights

Using the Minitab display:

Step 1: Test for interaction between the two factors:

H0: There are no interaction effects.

H1: There are interaction effects.

F = MS(interaction) / MS(error) = 0.05721 / 0.33521 = 0.17

The P-value is shown as 0.915, so we fail to reject the null hypothesis of no interaction between the two factors.

Example: Poplar Tree Weights

Using the Minitab display:

Step 2: Check for Row/Column Effects:

H0: There are no effects from the row factors. H1: There are row effects.

H0: There are no effects from the column factors. H1: There are column effects.

F = MS(site) / MS(error) = 0.27225 / 0.33521 = 0.81

The P-value is shown as 0.374, so we fail to reject the null hypothesis of no effects from site.

Special Case: One Observation per Cell and No Interaction

If our sample data consist of only one observation per cell, we lose MS(interaction), SS(interaction), and df(interaction).

If it seems reasonable to assume that there is no interaction between the two factors, make that assumption and then proceed as before to test the following two hypotheses separately:


H0: There are no effects from the row factors.

H0: There are no effects from the column factors.
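With one observation per cell, the same software approach works if the interaction term is simply left out of the model. Below is a hedged sketch under that assumption; the single observations per cell and the factor labels are hypothetical.

```python
# One observation per cell: fit an additive model with no interaction term,
# then test the row and column factors separately.  Data are hypothetical.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

data = pd.DataFrame({
    "weight":    [0.5, 1.2, 0.9, 1.4, 0.8, 1.1, 1.3, 1.0],
    "site":      ["1"] * 4 + ["2"] * 4,
    "treatment": ["none", "fertilizer", "irrigation", "both"] * 2,
})

model = ols("weight ~ C(site) + C(treatment)", data=data).fit()
print(sm.stats.anova_lm(model, typ=2))  # row (site) and column (treatment) effects only
```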

Special Case: One Observation per Cell and No Interaction

Minitab


Special Case: One Observation per Cell and No Interaction

Row factor:

F = MS(site) / MS(error) = 0.156800 / 0.562300 = 0.28

The P-value in the Minitab display is 0.634.

Fail to reject the null hypothesis.


Special Case: One Observation per Cell and No Interaction

Column factor:

F = MS(treatment) / MS(error) = 0.412217 / 0.562300 = 0.73

The P-value in the Minitab display is 0.598.

Fail to reject the null hypothesis.


Recap

In this section we have discussed:

Two-way analysis of variance (ANOVA)

Special case: One observation per cell and no interaction