analysis of variance


Upload: lizina

Post on 06-Feb-2016



TRANSCRIPT

Page 1: Analysis of Variance

Analysis of Variance

Introduction

Page 2: Analysis of Variance

Analysis of Variance

The Analysis of Variance is abbreviated as ANOVA

Used for hypothesis testing in simple regression, multiple regression, and comparison of means

Page 3: Analysis of Variance

Sources

There is variation whenever the data values are not all identical

This variation can come from different sources, such as the model or the factor

There is always left-over variation that can't be explained by any of the other sources. This source is called the error

Page 4: Analysis of Variance

Variation

Variation is the sum of squares of the deviations of the values from the mean of those values

As long as the values are not identical, there will be variation

Abbreviated as SS for Sum of Squares
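
The definition of variation can be sketched in a few lines of Python. The function name `sum_of_squares` and both data sets are made up for illustration and do not come from the slides.

```python
def sum_of_squares(values):
    """Variation (SS): sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

# Hypothetical data, not from the slides.
identical = [5, 5, 5, 5]   # all values equal, so no variation
varied = [2, 4, 4, 6]      # values differ, so SS is positive

ss_identical = sum_of_squares(identical)
ss_varied = sum_of_squares(varied)
print(ss_identical, ss_varied)
```

The identical values give SS = 0, while the varied values (mean 4, deviations -2, 0, 0, 2) give SS = 8.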

Page 5: Analysis of Variance

Degrees of Freedom

The degrees of freedom are the number of values that are free to vary once certain parameters have been established

Usually, this is one less than the sample size, but in general, it's the number of values minus the number of parameters being estimated

Abbreviated as df
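
One way to see why a single estimated mean costs one degree of freedom: the deviations from the mean always sum to zero, so the last deviation is fixed by the others. The sketch below uses a hypothetical five-value sample that is not from the slides.

```python
values = [2, 4, 4, 6, 9]        # hypothetical sample
n = len(values)
mean = sum(values) / n          # the one parameter being estimated
deviations = [v - mean for v in values]

# The deviations sum to zero, so the last one is determined by the first n - 1.
last_from_others = -sum(deviations[:-1])
df = n - 1                      # values minus parameters estimated

print(deviations[-1], last_from_others, df)
```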

Page 6: Analysis of Variance

Variance

The sample variance is the average squared deviation from the mean

Found by dividing the variation by the degrees of freedom

Variance = Variation / df

Abbreviated as MS for Mean of the Squares

MS = SS / df
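
As a rough check of MS = SS / df, the sketch below computes the mean square by hand and compares it with `statistics.variance` from Python's standard library, which also divides by n - 1. The sample is hypothetical.

```python
import statistics

values = [2, 4, 4, 6, 9]                      # hypothetical sample
mean = sum(values) / len(values)
ss = sum((v - mean) ** 2 for v in values)     # variation
df = len(values) - 1                          # one parameter (the mean) estimated
ms = ss / df                                  # variance = variation / df

print(ss, df, ms, statistics.variance(values))
```

Here SS = 28 and df = 4, so MS = 7, matching the library's sample variance.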

Page 7: Analysis of Variance

F

F is the F test statistic

There will be an F test statistic for each source except for the error and total

F is the ratio of two sample variances

The MS column contains variances

The F test statistic for each source is the MS for that row divided by the MS of the error row

Page 8: Analysis of Variance

F

F requires a pair of degrees of freedom, one for the numerator and one for the denominator

The numerator df is the df for the source

The denominator df is the df for the error row

F is always a right tail test
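
The rules for F and its pair of degrees of freedom can be sketched as a small helper. The name `f_statistic` and the input numbers are illustrative assumptions, not part of the slides.

```python
def f_statistic(ss_source, df_source, ss_error, df_error):
    """Return (F, numerator df, denominator df) for one source row."""
    ms_source = ss_source / df_source   # variance for the source row
    ms_error = ss_error / df_error      # variance for the error row
    return ms_source / ms_error, df_source, df_error

# Hypothetical SS and df values, not from the slides.
f, df_num, df_den = f_statistic(40.0, 2, 60.0, 12)
print(f, df_num, df_den)
```

With these inputs MS(source) = 20 and MS(error) = 5, so F = 4 with 2 numerator and 12 denominator degrees of freedom.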

Page 9: Analysis of Variance

The ANOVA Table

The ANOVA table is composed of rows; each row represents one source of variation

For each source of variation:

The variation is in the SS column

The degrees of freedom are in the df column

The variance is in the MS column

The MS value is found by dividing the SS by the df

Page 10: Analysis of Variance

ANOVA Table

The complete ANOVA table can be generated by most statistical packages and spreadsheets

We'll concentrate on understanding how the table works rather than the formulas for the variations

Page 11: Analysis of Variance

The ANOVA Table

Source       SS (variation)   df   MS (variance)   F
Explained*
Error
Total

* The explained variation has different names depending on the particular type of ANOVA problem

Page 12: Analysis of Variance

Example 1

Source       SS      df   MS      F
Explained    18.9     3
Error        72.0    16
Total

The Sum of Squares and Degrees of Freedom are given. Complete the table.

Page 13: Analysis of Variance

Example 1 – Find Totals

Source       SS      df   MS      F
Explained    18.9     3
Error        72.0    16
Total        90.9    19

Add the SS and df columns to get the totals.

Page 14: Analysis of Variance

Example 1 – Find MS

Source       SS        df     MS      F
Explained    18.9  ÷    3  =  6.30
Error        72.0  ÷   16  =  4.50
Total        90.9  ÷   19  =  4.78

Divide SS by df to get MS.

Page 15: Analysis of Variance

Example 1 – Find F

Source       SS      df   MS      F
Explained    18.9     3   6.30    1.40
Error        72.0    16   4.50
Total        90.9    19   4.78

F = 6.30 / 4.50 = 1.40
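
The three steps of Example 1 (totals, then MS, then F) can be sketched as one function. The name `complete_anova` and the dictionary layout are assumptions made for illustration; only the input numbers come from the example.

```python
def complete_anova(ss_explained, df_explained, ss_error, df_error):
    """Fill in the Total row, the MS column, and F from SS and df."""
    rows = {
        "Explained": {"SS": ss_explained, "df": df_explained},
        "Error": {"SS": ss_error, "df": df_error},
    }
    # Totals: add the SS column and the df column.
    rows["Total"] = {"SS": ss_explained + ss_error,
                     "df": df_explained + df_error}
    # MS = SS / df for every row.
    for row in rows.values():
        row["MS"] = row["SS"] / row["df"]
    # F = MS of the source row divided by MS of the error row.
    rows["Explained"]["F"] = rows["Explained"]["MS"] / rows["Error"]["MS"]
    return rows

table = complete_anova(18.9, 3, 72.0, 16)   # inputs from Example 1
for source, row in table.items():
    f_text = f"{row['F']:.2f}" if "F" in row else ""
    print(f"{source:<10}{row['SS']:>7.1f}{row['df']:>4}{row['MS']:>7.2f}{f_text:>6}")
```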

Page 16: Analysis of Variance

Notes about the ANOVA

The MS(Total) isn't actually part of the ANOVA table, but it represents the sample variance of the response variable, so it's useful to find

The total df is one less than the sample size

You would either need to find a critical F value or the p-value to finish the hypothesis test
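
The slides leave the p-value to software. As a minimal sketch using only the standard library, the right-tail probability P(F > f) can be approximated by numerically integrating a beta density, using the standard identity that this probability is the regularized incomplete beta function at x = df2 / (df2 + df1·f) with parameters df2/2 and df1/2. The function name `f_right_tail_p`, the midpoint rule, and the step count are choices made here, not something from the slides; a statistical package would normally be used instead, and the approximation assumes the denominator df is at least 2.

```python
import math

def f_right_tail_p(f, df_num, df_den, steps=20000):
    """Approximate P(F > f) by midpoint-rule integration of a beta density."""
    a, b = df_num / 2, df_den / 2
    x = df_den / (df_den + df_num * f)   # upper limit of the integral
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    h = x / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h                # midpoint of each slice, avoids the endpoints
        total += u ** (b - 1) * (1 - u) ** (a - 1)
    return total * h / math.exp(log_beta)

# Example 1: F = 1.40 with 3 numerator and 16 denominator df.
p_value = f_right_tail_p(1.40, 3, 16)
print(p_value)
```

The resulting p-value would then be compared with the chosen significance level to finish the test.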

Page 17: Analysis of Variance

Example 2

Source       SS      df   MS      F
Explained    106.6        21.32   2.60
Error                26
Total

Complete the table

Page 18: Analysis of Variance

Example 2 – Step 1

Source       SS      df   MS      F
Explained    106.6    5   21.32   2.60
Error                26    8.20
Total

SS / df = MS, so 106.6 / df = 21.32. Solving for df gives df = 5.

F = MS(Source) / MS(Error), so 2.60 = 21.32 / MS. Solving gives MS = 8.20.

Page 19: Analysis of Variance

Example 2 – Step 2

Source       SS      df   MS      F
Explained    106.6    5   21.32   2.60
Error        213.2   26    8.20
Total                31

SS / df = MS, so SS / 26 = 8.20. Solving for SS gives SS = 213.2.

The total df is the sum of the other df, so 5 + 26 = 31.

Page 20: Analysis of Variance

Example 2 – Step 3

Source       SS      df   MS      F
Explained    106.6    5   21.32   2.60
Error        213.2   26    8.20
Total        319.8   31

Find the total SS by adding the other SS values: 106.6 + 213.2 = 319.8

Page 21: Analysis of Variance

Example 2 – Step 4

Source       SS      df   MS      F
Explained    106.6    5   21.32   2.60
Error        213.2   26    8.20
Total        319.8   31   10.32

Find the MS(Total) by dividing SS by df. 319.8 / 31 = 10.32

Page 22: Analysis of Variance

Example 2 – Notes

Since the total df is 31, the sample size was 32

Since the sample variance was 10.32 and the standard deviation is the square root of the variance, the sample standard deviation is 3.21
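
The back-solving in Example 2 can be retraced in code. The variable names are illustrative; the four starting numbers are the ones given in the example.

```python
import math

# Given in Example 2.
ss_explained, ms_explained, f_stat, df_error = 106.6, 21.32, 2.60, 26

df_explained = round(ss_explained / ms_explained)  # SS / df = MS, solved for df
ms_error = ms_explained / f_stat                   # F = MS(Explained) / MS(Error)
ss_error = ms_error * df_error                     # SS = MS * df
ss_total = ss_explained + ss_error
df_total = df_explained + df_error
ms_total = ss_total / df_total                     # sample variance of the response

sample_size = df_total + 1                         # total df is n - 1
std_dev = math.sqrt(ms_total)                      # SD is the square root of variance

print(df_explained, round(ms_error, 2), round(ss_error, 1),
      round(ss_total, 1), df_total, round(ms_total, 2),
      sample_size, round(std_dev, 2))
```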

Page 23: Analysis of Variance

Example 3

Source       SS      df   MS      F
Explained    56.7
Error                14   13.50
Total

The sample size is n = 20. Work this one out on your own!

Page 24: Analysis of Variance

Example 3 - Solution

Source       SS      df   MS      F
Explained    56.7     5   11.34   0.84
Error        189.0   14   13.50
Total        245.7   19   12.93

How did you do?
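
One possible route to the Example 3 solution, written as a sketch to check your own work against. The order of steps and the variable names are choices made here; the slides give only the starting values and the finished table.

```python
# Given in Example 3.
n, ss_explained, df_error, ms_error = 20, 56.7, 14, 13.50

df_total = n - 1                          # total df is one less than the sample size
df_explained = df_total - df_error        # the df column adds up to the total
ss_error = ms_error * df_error            # SS = MS * df
ms_explained = ss_explained / df_explained
f_stat = ms_explained / ms_error          # F = MS(Explained) / MS(Error)
ss_total = ss_explained + ss_error
ms_total = ss_total / df_total

print(df_explained, round(ms_explained, 2), round(f_stat, 2),
      round(ss_error, 1), round(ss_total, 1), df_total, round(ms_total, 2))
```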