Chapter Ten The Analysis Of Variance

Upload: branden-watts

Post on 21-Dec-2015


ANOVA Definitions

> Factor: The characteristic that differentiates the treatments or populations from one another.

> Levels (Treatments): The different treatments or populations being compared. The number of levels is denoted I.

Randomized Experiment

Randomizing the order of sample observations tends to balance out the effect of any known or unknown nuisance variable that may influence the observed response.

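Mean Square for Treatments

\[ \mathrm{MSTr} = \frac{J}{I-1}\left[(\bar{X}_{1\cdot}-\bar{X}_{\cdot\cdot})^2 + \dots + (\bar{X}_{I\cdot}-\bar{X}_{\cdot\cdot})^2\right] \]

For I levels, with J sample observations at each level. Here $\bar{X}_{i\cdot}$ is the sample mean of level i and $\bar{X}_{\cdot\cdot}$ is the grand mean.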

Mean Square for Error

\[ \mathrm{MSE} = \frac{S_1^2 + S_2^2 + \dots + S_I^2}{I} \]

Test Statistic for Single Factor ANOVA

\[ F = \frac{\mathrm{MSTr}}{\mathrm{MSE}} \]

With $\nu_1 = I - 1$ & $\nu_2 = I(J-1)$ degrees of freedom.
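The mean squares and the F statistic for a balanced design can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `one_way_f` and the three small groups of data are hypothetical, chosen so the arithmetic is easy to check by hand.

```python
from statistics import mean, variance

def one_way_f(groups):
    """Return (MSTr, MSE, F, df1, df2) for a balanced single-factor design."""
    I = len(groups)            # number of levels
    J = len(groups[0])         # observations per level
    if any(len(g) != J for g in groups):
        raise ValueError("balanced design expected: every level needs J observations")
    level_means = [mean(g) for g in groups]
    grand_mean = mean(level_means)   # valid because every level has J observations
    mstr = J * sum((m - grand_mean) ** 2 for m in level_means) / (I - 1)
    mse = sum(variance(g) for g in groups) / I   # average of the sample variances
    return mstr, mse, mstr / mse, I - 1, I * (J - 1)

# Hypothetical data: I = 3 levels, J = 3 observations each
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
mstr, mse, f_stat, df1, df2 = one_way_f(groups)
print(f"MSTr = {mstr:.3f}, MSE = {mse:.3f}, F = {f_stat:.3f}, df = ({df1}, {df2})")
```

The resulting F is then compared with the upper-tail critical value of the F distribution with the printed degrees of freedom.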

ANOVA on Single-Factor Experiment: Several (I) Means

All Normal (Same $\sigma^2$)

Null Hypothesis: $H_0: \mu_1 = \mu_2 = \dots = \mu_I$

Test Statistic: $F = \mathrm{MSTr} / \mathrm{MSE}$

Alternative Hypothesis: $H_a$: (at least two means are not equal)

Reject Region (upper tailed): $F \ge F_{\alpha,\, I-1,\, I(J-1)}$

Exp. (IJ – 1) DOF

ANOVA on Single-Factor Experiment

Experiments were conducted to study whether commercial processing of various foods changes the concentration of essential elements for human consumption. One such experiment was to study the concentration of zinc in green beans. A batch of green beans was divided into 4 groups. The 4 groups were then randomly assigned to be measured (10) times each for zinc as follows: group 1 measured Raw; group 2 measured before Blanching; group 3 measured after Blanching; and group 4 measured after the final processing step. Ten independent measurements were taken from the 4 groups (treatments), yielding the following data:

Zinc Concentration
         Group 1   Group 2   Group 3   Group 4
Mean:    2.01      2.58      2.10      3.05
S:       0.25      0.50      0.30      1.00

Test this hypothesis for significance at the 5% level. Measurements of this type are known to be Normal.
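As a rough sketch of how the zinc example could be worked from its summary statistics, the Python below applies the balanced-design formulas for MSTr, MSE and F. The critical value is not computed; it is an assumed table lookup (F at .05 with 3 and 36 degrees of freedom is roughly 2.87) and should be checked against an F table before relying on it.

```python
# Summary statistics from the zinc example: I = 4 groups, J = 10 measurements each
means = [2.01, 2.58, 2.10, 3.05]
std_devs = [0.25, 0.50, 0.30, 1.00]
J = 10
I = len(means)

grand_mean = sum(means) / I
mstr = J * sum((m - grand_mean) ** 2 for m in means) / (I - 1)
mse = sum(s ** 2 for s in std_devs) / I      # average of the sample variances
f_stat = mstr / mse

# Assumption: critical value read from an F table, F(.05; 3, 36) ~ 2.87
F_CRIT = 2.87
reject = f_stat >= F_CRIT

print(f"MSTr = {mstr:.4f}, MSE = {mse:.4f}, F = {f_stat:.2f}")
print("Reject H0" if reject else "Fail to reject H0")
```

Under these assumptions the F statistic is well above the critical value, which points toward rejecting equal mean zinc concentration across the four processing stages.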

ANOVA on Single-Factor Experiment Example

The coded values for the measure of elasticity (nt/m$^2$) in plastic, prepared by two different processes A & B, for samples of (6) drawn randomly from each of the two processes are as follows:

          Group A   Group B
Mean:     7.28      8.02
$S^2$:    0.48      0.71

Do the data present sufficient evidence to indicate a difference in mean elasticity for the two processes at a level of significance of $\alpha = .05$? Measurements of this type are found to follow a Normal pdf.
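With only two levels, the single-factor F test is equivalent to the pooled two-sample t test, since F equals t squared. The sketch below checks this numerically on the elasticity summary data. The critical value is an assumed table lookup: F(.05; 1, 10) is about 4.96, the square of t(.025; 10) = 2.228.

```python
from math import sqrt

# Summary statistics for processes A and B (J = 6 specimens each)
mean_a, mean_b = 7.28, 8.02
var_a, var_b = 0.48, 0.71
J, I = 6, 2

grand_mean = (mean_a + mean_b) / I
mstr = J * ((mean_a - grand_mean) ** 2 + (mean_b - grand_mean) ** 2) / (I - 1)
mse = (var_a + var_b) / I
f_stat = mstr / mse

# Pooled two-sample t statistic on the same data (pooled variance equals MSE here)
t_stat = (mean_b - mean_a) / sqrt(mse * (1 / J + 1 / J))

# Assumption: critical value from an F table, F(.05; 1, 10) ~ 4.96
F_CRIT = 4.96
reject = f_stat >= F_CRIT

print(f"F = {f_stat:.3f}, t^2 = {t_stat ** 2:.3f}")
print("Reject H0" if reject else "Fail to reject H0")
```

Here F falls below the assumed critical value, so these summary data would not show a significant difference in mean elasticity at the .05 level.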

ANOVA on Single-Factor Experiment: Several (I) Variances (Equal Samples J)

All Normal

Null Hypothesis: $H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_I^2$

Test Statistic (“Bartlett’s Test”):

\[ \chi^2 = (2.3026)\,\frac{Q}{h} \]

Alternative Hypothesis: $H_a$: (at least 2 variances are not equal)

Reject Region (upper tailed test): $\chi^2 \ge \chi^2_{\alpha,\, I-1}$

\[ Q = I(J-1)\log_{10}(\mathrm{MSE}) - (J-1)\left[\log_{10}(S_1^2) + \dots + \log_{10}(S_I^2)\right] \]

\[ h = 1 + \frac{1}{3(I-1)}\left[\frac{I}{J-1} - \frac{1}{I(J-1)}\right] \]

The logarithms are base 10; the factor 2.3026 = ln 10 converts them to natural logarithms.

ANOVA on Single-Factor Experiment: Several (I) Variances (Equal Samples J) Example

A study is designed to investigate the sulfur content of (5) major coal seams. Eight core samples are taken at randomly selected points within each seam. The measured response is the S% content. Before performing a Hypothesis Test on the data to detect any differences that might exist in the average sulfur content for these (5) seams, you are required to test the condition that the (5) seams all have the same population variance at a level of significance of .05. The summary statistics on the sulfur content of the (5) major coal seams follow:

          Seam 1   Seam 2   Seam 3   Seam 4   Seam 5
Mean:     1.66     1.17     1.46     0.88     1.189
$S^2$:    .175     .144     .115     .123     .074
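A minimal Python sketch of Bartlett's test for equal sample sizes, applied to the coal-seam variances, might look like the following. The function name `bartlett_equal_n` is hypothetical, and the chi-square critical value (9.488 for .05 with 4 degrees of freedom) is an assumed table lookup, not something the code derives.

```python
from math import log10

def bartlett_equal_n(variances, J):
    """Bartlett chi-square statistic for I sample variances, each from J observations."""
    I = len(variances)
    mse = sum(variances) / I
    q = I * (J - 1) * log10(mse) - (J - 1) * sum(log10(v) for v in variances)
    h = 1 + (I / (J - 1) - 1 / (I * (J - 1))) / (3 * (I - 1))
    return 2.3026 * q / h

# Sample variances of sulfur content for the 5 seams, 8 core samples per seam
seam_variances = [0.175, 0.144, 0.115, 0.123, 0.074]
chi2 = bartlett_equal_n(seam_variances, J=8)

# Assumption: critical value from a chi-square table, chi2(.05; I - 1 = 4) = 9.488
CHI2_CRIT = 9.488
reject = chi2 >= CHI2_CRIT

print(f"Bartlett chi-square = {chi2:.3f}")
print("Reject equal variances" if reject else "Equal variances is reasonable")
```

The statistic comes out far below the assumed critical value, so the equal-variance condition looks reasonable for the five seams.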

ANOVA on Single-Factor Experiment: Several (I) Variances (Equal Samples J) Example

Use Bartlett’s Hypothesis Test to determine whether it is reasonable to assume homogeneity of variances for the (4) treatment groups in the study of whether commercial processing of various foods changes the concentration of essential elements for human consumption. Use $\alpha = .05$.

Rough rule of thumb: If the largest s is not much more than two times the smallest, it is reasonable to assume equal variances.
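The sketch below applies both checks to the zinc data (group standard deviations 0.25, 0.50, 0.30, 1.00 with 10 measurements per group). It is illustrative only: the helper name is hypothetical, the cutoff of exactly 2 for the rule of thumb is an assumption (the rule itself says "not much more than two"), and the chi-square critical value 7.815 for .05 with 3 degrees of freedom is an assumed table lookup.

```python
from math import log10

def bartlett_equal_n(variances, J):
    """Bartlett chi-square statistic for I sample variances, each from J observations."""
    I = len(variances)
    mse = sum(variances) / I
    q = I * (J - 1) * log10(mse) - (J - 1) * sum(log10(v) for v in variances)
    h = 1 + (I / (J - 1) - 1 / (I * (J - 1))) / (3 * (I - 1))
    return 2.3026 * q / h

std_devs = [0.25, 0.50, 0.30, 1.00]   # zinc example, J = 10 per group
J = 10

# Rule of thumb: compare the largest and smallest sample standard deviations
ratio = max(std_devs) / min(std_devs)
rule_of_thumb_ok = ratio <= 2.0        # assumed cutoff

# Bartlett's test on the corresponding variances
chi2 = bartlett_equal_n([s ** 2 for s in std_devs], J)
CHI2_CRIT = 7.815                      # assumption: chi2(.05; 3) from a table
reject = chi2 >= CHI2_CRIT

print(f"largest s / smallest s = {ratio:.1f}")
print(f"Bartlett chi-square = {chi2:.2f}, reject equal variances: {reject}")
```

On these numbers both checks agree: the ratio is 4, and the Bartlett statistic exceeds the assumed critical value, so homogeneity of variances looks doubtful for the four treatment groups.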

ANOVA Multiple Comparisons

Procedures for identifying which $\mu_i$’s significantly differ when $H_0$ is rejected:

> Tukey
> Bonferroni
> Duncan
> Fisher LSD
> Newman-Keuls

Tukey’s T Method (Equal Samples)

1. Select $\alpha$ & find $Q_{\alpha,\, I,\, I(J-1)}$ from Studentized Range Distribution Table A.10 on pg. 736. (m = I)
2. Determine $w = Q_{\alpha,\, I,\, I(J-1)} \sqrt{\mathrm{MSE}/J}$
3. List the sample means $\bar{x}_i$ in increasing order & underline those pairs that differ by less than w. Any pair of sample means not underscored by the same line corresponds to a pair of population or treatment means that are judged significantly different.

Examples of Tukey’s Method

Summary Results: w = 5.37
x1     x5     x2     x3     x4
9.8    10.8   15.4   17.6   21.6

Summary Results: w = 0.40
x5     x3     x2     x4     x1
6.1    6.3    6.8    7.3    7.5

Summary Results: w = 0.40
x5     x3     x2     x4     x1
6.1    6.3    7.15   7.3    7.5
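The underlining step can be sketched in Python for the three summary results above. The function name `underline_groups` and its output format are hypothetical. It returns each maximal run of ordered means whose largest and smallest members differ by less than w, which is one way to decide where the underlines go.

```python
def underline_groups(means, w):
    """Return the maximal runs of ordered means whose range is less than w.

    Each run corresponds to one underline; two means that never share a run
    are judged significantly different.
    """
    order = sorted(means, key=means.get)        # labels in increasing order of mean
    values = [means[label] for label in order]
    groups, last_end = [], -1
    for start in range(len(values)):
        end = start
        while end + 1 < len(values) and values[end + 1] - values[start] < w:
            end += 1
        if end > start and end > last_end:      # skip runs nested inside an earlier one
            groups.append(order[start:end + 1])
            last_end = end
    return groups

examples = [
    (5.37, {"x1": 9.8, "x5": 10.8, "x2": 15.4, "x3": 17.6, "x4": 21.6}),
    (0.40, {"x5": 6.1, "x3": 6.3, "x2": 6.8, "x4": 7.3, "x1": 7.5}),
    (0.40, {"x5": 6.1, "x3": 6.3, "x2": 7.15, "x4": 7.3, "x1": 7.5}),
]
results = [underline_groups(means, w) for w, means in examples]
for (w, _), groups in zip(examples, results):
    print(f"w = {w}: underlines = {groups}")
```

In the second example x2 shares no underline with any other mean, so it is judged different from all of them; in the third, moving x2 to 7.15 places it under the same line as x4 and x1.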

ANOVA Multiple Comparison Tukey’s Method

A product development engineer is interested in maximizing the tensile strength of a new synthetic fiber. Previous experience indicates that the strength is affected by the % of cotton in the fiber. The engineer suspects that increasing the cotton content will increase the strength, at least initially. He decides to test (5) specimens at (5) levels of cotton content. Summary data follows:

Cotton %:     15     20     25     30     35
Mean (psi):   9.8    15.4   17.6   21.6   10.8
s:            3.35   3.13   2.07   2.61   2.86

The Null Hypothesis $H_0$ is rejected because the F statistic falls in the Reject Region. The % of cotton in the fiber significantly affects the mean tensile strength. Now use Tukey’s T method to find significant differences among the means. Use $\alpha = .05$.
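A sketch of Tukey's T method on the cotton data follows. The studentized range value Q = 4.23 (for .05 with 5 means and 20 error degrees of freedom) is an assumed table lookup; it does reproduce the w = 5.37 quoted in the earlier summary results. Variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

cotton = [15, 20, 25, 30, 35]                       # cotton % levels
means = dict(zip(cotton, [9.8, 15.4, 17.6, 21.6, 10.8]))
std_devs = [3.35, 3.13, 2.07, 2.61, 2.86]
I, J = 5, 5                                         # 5 levels, 5 specimens each

mse = sum(s ** 2 for s in std_devs) / I             # average of the sample variances

# Assumption: Q(.05; I = 5, I(J-1) = 20) = 4.23 from a studentized range table
Q = 4.23
w = Q * sqrt(mse / J)

# Pairs of levels whose sample means differ by more than w
significant = [(a, b) for a, b in combinations(cotton, 2)
               if abs(means[a] - means[b]) > w]

print(f"MSE = {mse:.4f}, w = {w:.2f}")
for a, b in significant:
    print(f"{a}% vs {b}%: |difference| = {abs(means[a] - means[b]):.1f}")
```

Pairs not listed (for example 20% vs 25%) differ by less than w and would share an underline.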

ANOVA Multiple Comparison Tukey’s Method

An experiment is developed to measure the effect that teaching methods have on students’ performance. The following table summarizes the numerical grades on a standard arithmetic test given to 45 students divided randomly into (5) equal-sized groups. Groups 1 & 2 were taught by the current method. Groups 3, 4, & 5 were taught together for a number of days; on each day group 3 students were praised publicly for their previous work while group 4 students were criticized publicly. Group 5 students, while hearing the praise and criticism of groups 3 & 4, were ignored.

Group:    1       2       3       4       5
Mean:     19.67   18.33   27.44   23.44   16.11
$s^2$:    17.72   12.75   6.05    9.55    13.104

Test the null hypothesis that there is no difference in the mean grades produced by these teaching methods using $\alpha = .05$. Then use Tukey’s T method to compare & illustrate the difference in the teaching methods.

Least Significant Difference Method (Equal Samples)

1. Select $\alpha$ & find $t_{\alpha/2,\, I(J-1)}$
2. Determine $\mathrm{LSD} = t_{\alpha/2,\, I(J-1)} \sqrt{2\,\mathrm{MSE}/J}$
3. Compare the observed difference between each pair of averages to the corresponding LSD. If $|\bar{x}_i - \bar{x}_j| > \mathrm{LSD}$, we conclude that the population means $\mu_i$ and $\mu_j$ differ.

Example: Least Significant Difference Method

A manufacturer of paper used for making grocery bags is interested in improving the tensile strength of the product. Product engineering thinks that the tensile strength is a function of the hardwood concentration in the pulp and that the range of hardwood concentrations of practical interest is between 5 and 20%. You decide to investigate (4) levels of hardwood concentration. Six specimens at each of the (4) concentration levels are prepared and tested on a tensile tester in random order. The summary data from this experiment are shown in the following table:

Hardwood Concentration:   5%      10%     15%     20%
Mean (psi):               10.00   15.67   17.00   21.17
$S^2$:                    8.00    7.87    3.20    6.97

Test the null hypothesis that there is no difference in the mean tensile strength produced by these (4) concentration levels using $\alpha = .01$. Then use the LSD method at $\alpha = .05$ to compare & illustrate the difference at each level of concentration.
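One way to work the hardwood example from its summary data is sketched below: an F test followed by LSD comparisons. Both critical values are assumed table lookups, F(.01; 3, 20) = 4.94 and t(.025; 20) = 2.086, and the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

levels = ["5%", "10%", "15%", "20%"]
means = [10.00, 15.67, 17.00, 21.17]
variances = [8.00, 7.87, 3.20, 6.97]
I, J = 4, 6                                   # 4 levels, 6 specimens each

grand_mean = sum(means) / I
mstr = J * sum((m - grand_mean) ** 2 for m in means) / (I - 1)
mse = sum(variances) / I
f_stat = mstr / mse

# Assumptions: both critical values are read from tables
F_CRIT = 4.94      # F(.01; I - 1 = 3, I(J - 1) = 20)
T_CRIT = 2.086     # t(.025; 20), for the LSD at alpha = .05

lsd = T_CRIT * sqrt(2 * mse / J)
not_different = [(levels[a], levels[b]) for a, b in combinations(range(I), 2)
                 if abs(means[a] - means[b]) <= lsd]

print(f"F = {f_stat:.2f} (critical {F_CRIT}), LSD = {lsd:.2f}")
print("Pairs NOT judged different:", not_different)
```

Under these assumptions the F test rejects equal means, and the only pair within one LSD of each other is 10% vs 15%.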

Example: Least Significant Difference Method

The effective life of insulating fluids at an accelerated load of 50 m/sec$^2$ is being studied. Test data have been obtained for (4) types of fluids. The summary results for (7) trials on each fluid are as follows:

Life (in hours) at 50 m/sec$^2$
Fluid Type:   1       2       3       4
Mean:         18.65   17.95   20.95   18.82
$S^2$:        3.81    3.44    3.53    2.42

Is there any indication that the fluids differ at a significance level of .05? Which fluid or fluids would you select if the objective is long life? Use the Least Significant Difference method with an alpha of .05 to support your conclusion.

$\beta$-Error for Single Factor ANOVA F-Test

Non-centrality parameter:

\[ \lambda = \frac{J \sum_i (\mu_i - \bar{\mu})^2}{\sigma^2} \]

For Non-central F distribution, with Degrees of Freedom: $\nu_1 = I - 1$, $\nu_2 = I(J-1)$

$\beta$-Error for Single Factor ANOVA

1) Find the value of $\sigma^2$ (Experience)
2) Find the values of $(\mu_i - \bar{\mu})$
3) Compute $\Phi^2$ (used in place of the non-centrality parameter) using:

\[ \Phi^2 = \frac{J}{I} \cdot \frac{\sum_i (\mu_i - \bar{\mu})^2}{\sigma^2} \]

4) Use Power Curves (pg. 422) to look up the power value: $\beta$ = 1 – Power
> Use the appropriate set of curves for $\nu_1$
> $\Phi$ (with $\alpha$) is on the horizontal axis
> Move up to the curve associated with $\nu_2$
> Find the power value on the vertical axis

$\beta$-Error ANOVA Example

A product development engineer is interested in maximizing the tensile strength of a new synthetic fiber. Previous experience indicates that the strength is affected by the % of cotton in the fiber. The engineer suspects that increasing the cotton content will increase the strength, at least initially. He decides to test (5) specimens at (5) levels of cotton content. Summary data follows:

Cotton %:     15      20     25     30     35
Mean (psi):   9.8     15.4   17.6   21.6   10.8
$s^2$:        11.22   9.80   4.28   6.81   8.18

What is the $\beta$-error if the engineer is interested in rejecting the null hypothesis if the five treatment means are as follows: $\mu_{15} = 11$, $\mu_{20} = 12$, $\mu_{25} = 15$, $\mu_{30} = 18$, $\mu_{35} = 19$?

Historically, the standard deviation of tensile strength is usually equal to 3 psi. Assume $\alpha = .01$ for this test.
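The sketch below computes $\Phi$ for this example and then, instead of reading a power curve, estimates the power by Monte Carlo simulation. That simulation is a swapped-in technique, not the chart lookup the slides describe. The critical value F(.01; 4, 20) = 4.43 is an assumed table lookup, and the seed and replication count are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, variance

mu = [11, 12, 15, 18, 19]     # hypothesized treatment means
sigma = 3.0                   # historical standard deviation (psi)
I, J = len(mu), 5

mu_bar = sum(mu) / I
phi_sq = (J / I) * sum((m - mu_bar) ** 2 for m in mu) / sigma ** 2
phi = sqrt(phi_sq)

def f_statistic(groups):
    """Balanced single-factor F = MSTr / MSE."""
    level_means = [mean(g) for g in groups]
    grand = mean(level_means)
    mstr = J * sum((m - grand) ** 2 for m in level_means) / (I - 1)
    mse = sum(variance(g) for g in groups) / I
    return mstr / mse

# Assumption: critical value from an F table, F(.01; 4, 20) = 4.43
F_CRIT = 4.43

rng = random.Random(0)        # fixed seed so the estimate is repeatable
reps = 2000
rejections = sum(
    f_statistic([[rng.gauss(m, sigma) for _ in range(J)] for m in mu]) >= F_CRIT
    for _ in range(reps)
)
power = rejections / reps
beta = 1 - power

print(f"Phi^2 = {phi_sq:.3f}, Phi = {phi:.2f}, nu1 = {I - 1}, nu2 = {I * (J - 1)}")
print(f"Simulated power ~ {power:.2f}, beta ~ {beta:.2f}")
```

The simulated value is only an estimate with sampling error; it serves as a cross-check on whatever is read from the power curves at the printed $\Phi$, $\nu_1$ and $\nu_2$.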

$\beta$-Error ANOVA Example

Suppose that (5) means are being compared in a completely randomized experiment with $\alpha = .01$. The design engineer would like to know how many samples to take if it is important to reject the Null Hypothesis with probability at least 0.90 if $\sum_i (\mu_i - \bar{\mu})^2 = 25$ & the population variance is known to be 5.0.

$\beta$-Error ANOVA Example

Suppose that (4) Normal populations have common variance $\sigma^2 = 25$ and means $\mu_1 = 50$, $\mu_2 = 60$, $\mu_3 = 50$, and $\mu_4 = 60$. How many observations should be taken on each population so that the probability of rejecting the hypothesis of equality of means is at least 0.90? Use $\alpha = 0.05$.
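Sample-size questions like the two above are usually answered by trial: pick J, compute $\Phi$ and $\nu_2$, read the power from the curves, and increase J until the power reaches the target. The sketch below only tabulates the lookup values for a few candidate J; it does not compute power. The function name `phi_squared` and the range of J are illustrative.

```python
from math import sqrt

def phi_squared(J, I, sum_sq_dev, sigma_sq):
    """Phi^2 = (J / I) * sum((mu_i - mu_bar)^2) / sigma^2."""
    return (J / I) * sum_sq_dev / sigma_sq

# Example with 5 means: sum of squared deviations = 25, variance = 5.0
I5, sum_sq_dev_5, sigma_sq_5 = 5, 25.0, 5.0

# Example with 4 means: mu = 50, 60, 50, 60 and variance = 25
mu = [50, 60, 50, 60]
mu_bar = sum(mu) / len(mu)
I4, sigma_sq_4 = len(mu), 25.0
sum_sq_dev_4 = sum((m - mu_bar) ** 2 for m in mu)

print(" J | Phi (5 means)  nu2 | Phi (4 means)  nu2")
for J in range(2, 9):
    phi5 = sqrt(phi_squared(J, I5, sum_sq_dev_5, sigma_sq_5))
    phi4 = sqrt(phi_squared(J, I4, sum_sq_dev_4, sigma_sq_4))
    print(f"{J:2d} | {phi5:13.2f} {I5 * (J - 1):4d} | {phi4:13.2f} {I4 * (J - 1):4d}")
```

In both examples $\Phi^2$ works out to J, so the two searches differ only in $\alpha$, $\nu_1$ and $\nu_2$ when the curves are read.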

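Single-Factor ANOVA (Unequal Sample Sizes $J_i$)

\[ F = \frac{\mathrm{MSTr}}{\mathrm{MSE}} \]

With $\nu_1 = I - 1$ & $\nu_2 = N - I$, where $N = \sum_i J_i$ is the total number of observations.

Where: $\mathrm{MSTr} = \dfrac{\mathrm{SSTr}}{I - 1}$

And $\mathrm{MSE} = \dfrac{\mathrm{SSE}}{N - I}$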

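ANOVA Definitions

Sum of Squares Treatment:

\[ \mathrm{SSTr} = \sum_i J_i (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})^2 \]

Sum of Squares Error:

\[ \mathrm{SSE} = \sum_i \sum_j (x_{ij} - \bar{x}_{i\cdot})^2 \]

Sum of Squares Total:

\[ \mathrm{SST} = \mathrm{SSTr} + \mathrm{SSE} \]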
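The sums of squares for unequal sample sizes can be sketched in Python as follows. The data are hypothetical and chosen only to make the identity SST = SSTr + SSE easy to verify; the function name is illustrative.

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (SSTr, SSE, SST) for a single-factor design with unequal group sizes."""
    all_obs = [x for g in groups for x in g]
    grand_mean = mean(all_obs)                       # weighted by group size
    sstr = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    sst = sum((x - grand_mean) ** 2 for x in all_obs)
    return sstr, sse, sst

# Hypothetical unbalanced data: group sizes 3, 2 and 4
groups = [[4, 5, 6], [7, 9], [1, 2, 3, 2]]
sstr, sse, sst = sums_of_squares(groups)

I = len(groups)
N = sum(len(g) for g in groups)
f_stat = (sstr / (I - 1)) / (sse / (N - I))

print(f"SSTr = {sstr:.2f}, SSE = {sse:.2f}, SST = {sst:.2f}")
print(f"F = {f_stat:.2f} with df = ({I - 1}, {N - I})")
```

Note that the grand mean here is the mean of all N observations, not the unweighted mean of the group means; the two coincide only when every group has the same size.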

Example of Unbalanced Design

Twenty-seven coins discovered in Cyprus were grouped into (4) classes, corresponding to (4) different coinages during the reign of King Manuel I Comnenus (1143-1180). Archaeologists are interested in whether there were significant differences in the Ag content of coins minted early and late in King Manuel’s reign. Test the $H_0$ at $\alpha = .01$. Summary data for testing the Ag content of early coins (group 1) to later coins (group 4) follows:

Group   $J_i$   Mean
1       9       6.74%
2       7       8.24%
3       4       4.88%
4       7       5.61%

SSE = 11.02, SSTr = 37.75
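A sketch of the F test for the coin data, using the sums of squares given in the summary table, is shown below. The critical value F(.01; 3, 23) of about 4.76 is an assumed table lookup. As a sanity check the code also recomputes SSTr from the rounded group means, which lands close to, but not exactly on, the tabled 37.75.

```python
sizes = [9, 7, 4, 7]                 # J_i for the four coinages
means = [6.74, 8.24, 4.88, 5.61]     # mean Ag content (%)
SSE, SSTr = 11.02, 37.75             # as given in the summary data

I, N = len(sizes), sum(sizes)
mstr = SSTr / (I - 1)
mse = SSE / (N - I)
f_stat = mstr / mse

# Cross-check SSTr from the rounded means (differs slightly due to rounding)
grand_mean = sum(n * m for n, m in zip(sizes, means)) / N
sstr_check = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))

# Assumption: critical value from an F table, F(.01; 3, 23) ~ 4.76
F_CRIT = 4.76
reject = f_stat >= F_CRIT

print(f"MSTr = {mstr:.3f}, MSE = {mse:.4f}, F = {f_stat:.2f}, df = ({I - 1}, {N - I})")
print(f"SSTr recomputed from means = {sstr_check:.2f}")
print("Reject H0" if reject else "Fail to reject H0")
```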

Multiple Comparisons (Unequal Samples)

Tukey’s method modified:

1. Select $\alpha$ & find $Q_{\alpha,\, I,\, N-I}$ from Studentized Range Distribution Table A.10 on pg. 736. (m = I)

2. Determine

\[ w_{ij} = Q_{\alpha,\, I,\, N-I} \sqrt{\frac{\mathrm{MSE}}{2}\left(\frac{1}{J_i} + \frac{1}{J_j}\right)} \]

This uses the average of the pair $1/J_i$ and $1/J_j$ instead of $1/J$.

3. List the sample means $\bar{x}_i$ in increasing order & underline those pairs that differ by less than $w_{ij}$.

Example of Multiple Comparison (Unequal Sample Sizes)

Use Tukey’s modified T method at $\alpha = .01$ to compare & illustrate the difference in the means of Ag percentage in coins found on Cyprus.

Group   $J_i$   Mean
1       9       6.74%
2       7       8.24%
3       4       4.88%
4       7       5.61%

SSE = 11.02, SSTr = 37.75
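The modified Tukey comparison for the coin data could be sketched as below. The studentized range value is the main assumption: tables often do not list 23 error degrees of freedom, so Q = 4.9 is a rough value between neighboring table entries for $\alpha = .01$ with 4 means, and should be replaced with a properly looked-up value. Variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

sizes = {1: 9, 2: 7, 3: 4, 4: 7}                  # J_i per coinage group
means = {1: 6.74, 2: 8.24, 3: 4.88, 4: 5.61}      # mean Ag content (%)
SSE = 11.02

I, N = len(sizes), sum(sizes.values())
mse = SSE / (N - I)

# Assumption: Q(.01; I = 4, N - I = 23) is roughly 4.9 (approximate table value)
Q = 4.9

def w_ij(i, j):
    """Tukey yardstick for groups i and j with unequal sample sizes."""
    return Q * sqrt(mse / 2 * (1 / sizes[i] + 1 / sizes[j]))

significant = [(i, j) for i, j in combinations(sizes, 2)
               if abs(means[i] - means[j]) > w_ij(i, j)]

for i, j in combinations(sizes, 2):
    diff = abs(means[i] - means[j])
    flag = "different" if (i, j) in significant else "not different"
    print(f"groups {i} vs {j}: |diff| = {diff:.2f}, w = {w_ij(i, j):.2f} -> {flag}")
```

Because each pair has its own $w_{ij}$, the underlining display has to be checked pair by pair rather than against a single yardstick.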