
Stat 529 (Winter 2011)

The Spock analysis, model checks and robustness

Reading: Sections 5.3–5.5.

• Introduction to the Spock dataset (handout)

– The exploratory data analysis

– Performing the basic one-way ANOVA

• Robustness considerations

– Checking assumptions – revisiting the ANOVA model

– Fitted values and residuals

– Producing residual plots

– Residual plots for the Spock ANOVA analysis

– The all-in-one graph from the ANOVA command

• Comparing different models using the F test

• Diagnostic plots for the finally chosen model

• Completing the inference

The Spock dataset

• See the handout for a description of the data, question of

interest, and exploratory data analysis.

• A discussion of the summaries:

Performing the basic one-way ANOVA

• Stat → ANOVA → One-Way.

– Response: Percentage of women.

– Factor: Judge.

– Click Store residuals and Store fits.

– Click OK.

• RESI1 are the residuals.

– Rename this to residuals.

• FITS1 are the fitted values.

– Rename this to fitted values.

(We will need the residuals and fitted values to diagnose the

fit of the model graphically.)

ANOVA output

One-way ANOVA: Percentage of women versus Judge

Source  DF      SS     MS     F      P
Judge    6  1927.1  321.2  6.72  0.000
Error   39  1864.4   47.8
Total   45  3791.5

S = 6.914 R-Sq = 50.83% R-Sq(adj) = 43.26%

Individual 95% CIs For Mean Based on Pooled StDev

Level N Mean StDev --------+---------+---------+---------+-

A 5 34.120 11.942 (-------*------)

B 6 33.617 6.582 (------*------)

C 9 29.100 4.593 (----*-----)

D 2 27.000 3.818 (------------*-----------)

E 6 26.967 9.010 (------*------)

F 9 26.800 5.969 (-----*----)

Spock 9 14.622 5.039 (-----*-----)

--------+---------+---------+---------+-

16.0 24.0 32.0 40.0

Pooled StDev = 6.914
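The sums of squares in this table can be checked from the group summaries alone, without the raw data. The following is a minimal Python sketch, not part of the Minitab workflow: it takes the N, Mean and StDev columns printed above and rebuilds the between-group and within-group sums of squares. The variable names are illustrative, and small differences from the Minitab figures are expected because the printed summaries are rounded.

```python
import math

# (n, mean, sd) for each judge, copied from the Minitab output above.
groups = {
    "A": (5, 34.120, 11.942),
    "B": (6, 33.617, 6.582),
    "C": (9, 29.100, 4.593),
    "D": (2, 27.000, 3.818),
    "E": (6, 26.967, 9.010),
    "F": (9, 26.800, 5.969),
    "Spock": (9, 14.622, 5.039),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between-group SS: weighted squared distance of each group mean from the grand mean.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
# Within-group (error) SS: (n_i - 1) * s_i^2 summed over the groups.
ss_within = sum((n - 1) * s ** 2 for n, _, s in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within
pooled_sd = math.sqrt(ms_within)

print(f"Judge  DF={df_between}  SS={ss_between:.1f}  MS={ms_between:.1f}  F={f_stat:.2f}")
print(f"Error  DF={df_within}  SS={ss_within:.1f}  MS={ms_within:.1f}")
print(f"Pooled StDev = {pooled_sd:.3f}")
```

The pooled standard deviation is the square root of the error mean square, which is why S and Pooled StDev agree in the output.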

Comments on the ANOVA output

Robustness considerations

Taken from the textbook, Section 5.5.1:

• Normality is not crucial as long as the experiment is balanced

and there are no long-tailed or highly skewed distributions.

• Independence within and across groups is critical. If independence

is lacking, different analyses should be attempted.

• The assumption of equal standard deviations is crucial

(e.g., see Display 5.13).

• The tools are not resistant to severely outlying observations.
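A quick first look at the equal-spread assumption is to compare the largest and smallest sample standard deviations. The sketch below is only a rough numerical check, not a formal test, and it assumes the StDev column from the ANOVA output above. Sample standard deviations from very small groups (judge D has only two venires) are noisy, so the ratio should be read alongside the residual plots.

```python
# Sample standard deviations per judge, copied from the ANOVA output.
sds = {
    "A": 11.942,
    "B": 6.582,
    "C": 4.593,
    "D": 3.818,
    "E": 9.010,
    "F": 5.969,
    "Spock": 5.039,
}

largest = max(sds, key=sds.get)
smallest = min(sds, key=sds.get)
ratio = sds[largest] / sds[smallest]

print(f"largest SD: judge {largest} ({sds[largest]})")
print(f"smallest SD: judge {smallest} ({sds[smallest]})")
print(f"ratio of largest to smallest SD: {ratio:.2f}")
```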

Checking assumptions – revisiting the ANOVA model

• Remember the additive model for our data:

$Y_{ij} = \mu_i + \varepsilon_{ij}$;  $i = 1, \ldots, I$,  $j = 1, \ldots, n_i$.

• One way to check that the model fits well is to check the

assumptions made for the errors, $\varepsilon_{ij}$. We usually assume:

1. Errors have mean zero and constant error variance $\sigma^2$.

2. The errors are (usually) normally distributed.

3. The errors are independent across i and j.

• We will estimate the errors using the residuals.

Fitted values and residuals

• For any model (reduced or full) that we consider, let $\hat{\mu}_i$ be

the estimate of the mean in the ith population.

• Then the fitted value for case j in sample i is:

$\hat{Y}_{ij} = \hat{\mu}_i$

• The residual for individual j in sample i is:

$e_{ij} = Y_{ij} - \hat{Y}_{ij} = Y_{ij} - \hat{\mu}_i.$

• Example: In a model in which the mean is different for each

population:
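With a separate mean for each population, the estimate of each mean is the sample average of that group, so every fitted value is its group average and every residual is the observation minus its group average. The Python sketch below illustrates this with made-up toy data (hypothetical values, not the Spock data) and, for contrast, also computes the residuals under a reduced model with one common mean.

```python
from statistics import mean

# Hypothetical toy data (NOT the Spock data): responses in three groups.
data = {
    "g1": [10.0, 12.0, 14.0],
    "g2": [20.0, 22.0],
    "g3": [30.0, 33.0, 36.0, 37.0],
}

# Full model: a separate mean per group, estimated by the group average.
full_fits = {g: mean(ys) for g, ys in data.items()}
full_resid = {g: [y - full_fits[g] for y in ys] for g, ys in data.items()}

# Reduced model: one common mean, estimated by the grand average.
all_ys = [y for ys in data.values() for y in ys]
grand = mean(all_ys)
reduced_resid = {g: [y - grand for y in ys] for g, ys in data.items()}

for g in data:
    print(g, "fitted:", full_fits[g], "residuals:", full_resid[g])
print("grand mean:", round(grand, 3))
```

Under the full model the residuals sum to zero within each group. Under the reduced model they sum to zero only overall, which is one way a poorly fitting model shows up in a plot of residuals by group.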

Properties of the residuals

• If the model fits well, the residuals have the following

properties.

1. Residuals are centered around zero with constant spread.

2. The residuals are normally distributed about zero.

3. There should be no obvious patterns in residuals across i

and j. There should certainly be no relationships between

the residuals and the fitted values in a well fitting model.

Some example residual plots

• Plot the residuals versus the fitted values:

– Check for appropriateness of the fit.

– Do we need to transform the response?

– Check for constancy of the variance of errors.

– Look for outliers.

• Plot the residuals versus the population identifier.

– Check adequacy of fit for each population.

– Curvature may indicate the need to transform.

• Normal Q-Q plot of residuals.

– Check that normality is reasonable for the residuals.

• Residuals versus time or collection order.

– Check for systematic problems in the residuals

(e.g., serial correlation).
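To make the normal Q-Q plot concrete, the sketch below computes the points such a plot displays: the ordered residuals paired with standard normal quantiles. It is a minimal illustration under stated assumptions. The plotting positions (k - 0.5)/n are one common convention and statistical packages may use a slightly different one, and the residuals are hypothetical values, not the Spock residuals.

```python
from statistics import NormalDist


def normal_qq_points(residuals):
    """Return (theoretical quantile, ordered residual) pairs for a normal Q-Q plot."""
    ordered = sorted(residuals)
    n = len(ordered)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    # Plotting positions (k - 0.5) / n: an assumed convention, see the note above.
    return [
        (std_normal.inv_cdf((k - 0.5) / n), r)
        for k, r in enumerate(ordered, start=1)
    ]


# Hypothetical residuals, for illustration only.
residuals = [-4.0, -2.0, -1.0, -1.0, 0.0, 1.0, 2.0, 2.0, 3.0]
points = normal_qq_points(residuals)

for q, r in points:
    print(f"{q:7.3f}  {r:5.1f}")
```

If normality is reasonable, the pairs fall close to a straight line when plotted. Marked curvature or a few points far off the line suggest skewness, long tails or outliers.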

Producing residual plots for the Spock analysis

• The next four slides show a number of residual plots:

– The 1st plot was created using Graph → Scatterplot.

– The 2nd plot used Graph → Individual value plots.

– The 3rd plot is from Graph → Boxplot.

– The last figure is a Graph → Probability plot.

• I added reference lines at y = 0 as needed.

• We would need some time variable (e.g. day the venire was

compiled) to check for serial dependence in the residuals.

Residual plots for the Spock ANOVA analysis

Residual plots, continued

Comments on the residual diagnostic plots

The all-in-one graph from the ANOVA command

• The ANOVA command can produce graphs of its own.

– In Stat → ANOVA → One-Way select Graphs and

then Four in One.

• Not very customizable.

– Advice for a good analysis: do not rely on these graphs;

make your own!

Comparing different models using the F test

• Here are some models we could consider for the Spock dataset:

1. A single population mean is shared by all the judges.

2. One mean for Spock’s judge, and another mean for all the

other judges.

3. Each judge has its own mean.

• Let us compare these models using F tests.

A mean for each judge

One-way ANOVA: Percentage of women versus Judge

Source  DF      SS     MS     F      P
Judge    6  1927.1  321.2  6.72  0.000
Error   39  1864.4   47.8
Total   45  3791.5

S = 6.914 R-Sq = 50.83% R-Sq(adj) = 43.26%

Individual 95% CIs For Mean Based on Pooled StDev

Level N Mean StDev --------+---------+---------+---------+-

A 5 34.120 11.942 (-------*------)

B 6 33.617 6.582 (------*------)

C 9 29.100 4.593 (----*-----)

D 2 27.000 3.818 (------------*-----------)

E 6 26.967 9.010 (------*------)

F 9 26.800 5.969 (-----*----)

Spock 9 14.622 5.039 (-----*-----)

--------+---------+---------+---------+-

16.0 24.0 32.0 40.0

A mean for Spock’s judge and a mean for all

the other judges

One-way ANOVA: Percentage of women versus Is Spock

Source    DF      SS      MS      F      P
Is Spock   1  1600.6  1600.6  32.15  0.000
Error     44  2190.9    49.8
Total     45  3791.5

S = 7.056 R-Sq = 42.22% R-Sq(adj) = 40.90%

Level N Mean StDev ----+---------+---------+---------+-----

No 37 29.492 7.431 (---*---)

Yes 9 14.622 5.039 (-------*-------)

----+---------+---------+---------+-----

12.0 18.0 24.0 30.0

The F tests (exercise!)

Compare models 1 versus 2, 1 versus 3, and 2 versus 3.

• 1 versus 2:

Fobs = 32.15

The p-value ≤ 0.001 (reject H0).

Conclusion:

• 1 versus 3:

Fobs = 6.72

The p-value ≤ 0.001 (reject H0).

Conclusion:

• 2 versus 3:

$F_{\text{obs}} = \dfrac{(2190.9 - 1864.4)/(44 - 39)}{47.8} = \dfrac{326.5/5}{47.8} = 1.37.$

The p-value is 1 − 0.743 = 0.257 (fail to reject H0).

Conclusion:

The F tests, continued

• We used the following MINITAB output:

F distribution with 5 DF in numerator and 39 DF in denominator

x P(X<=x)

1.37 0.743405
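All three comparisons follow the same recipe: the drop in error sum of squares from the reduced to the full model, divided by the drop in error degrees of freedom, over the error mean square of the full model. The Python sketch below applies that recipe to the error lines of the ANOVA tables above. The helper names `extra_ss_f` and `f_cdf` are hypothetical, and the F distribution function is a simple numerical integration written for this illustration (it assumes at least 2 numerator degrees of freedom), not the routine Minitab uses.

```python
import math


def extra_ss_f(sse_reduced, df_reduced, sse_full, df_full):
    """F statistic comparing a reduced model with a full model."""
    extra_df = df_reduced - df_full
    return ((sse_reduced - sse_full) / extra_df) / (sse_full / df_full)


def f_cdf(x, d1, d2, steps=20000):
    """P(F <= x) for an F(d1, d2) variable, via Simpson's rule.

    Uses the incomplete beta integral; assumes d1 >= 2 and an even `steps`.
    """
    a, b = d1 / 2.0, d2 / 2.0
    z = d1 * x / (d1 * x + d2)
    h = z / steps

    def g(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    total = g(0.0) + g(z)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (total * h / 3.0) / math.exp(log_beta)


# (error SS, error DF) for each model, read from the ANOVA tables above.
model1 = (3791.5, 45)  # one common mean: the Total line
model2 = (2190.9, 44)  # Spock's judge versus all other judges
model3 = (1864.4, 39)  # a separate mean for each judge

f_12 = extra_ss_f(*model1, *model2)
f_13 = extra_ss_f(*model1, *model3)
f_23 = extra_ss_f(*model2, *model3)
p_23 = 1.0 - f_cdf(f_23, 5, 39)

print(f"1 vs 2: F = {f_12:.2f}")
print(f"1 vs 3: F = {f_13:.2f}")
print(f"2 vs 3: F = {f_23:.2f}, p-value = {p_23:.3f}")
```

The first two statistics match the F columns of the two ANOVA tables because each of those tables already compares its model with the single-mean model.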

Diagnostic plots for Model 2

• Now we check the model assumptions using diagnostic plots

based on the residuals of model 2.

– Important: The model you choose determines the estimated

mean, $\hat{\mu}_i$, that will be used in calculating the residuals.

• Comments on the fit of model 2:

Diagnostic plots for Model 2

Diagnostic plots for Model 2, continued

Completing the inference

• Do not use this part of the ANOVA output as a final inference

for comparing the group means:

Level N Mean StDev ----+---------+---------+---------+-----

No 37 29.492 7.431 (---*---)

Yes 9 14.622 5.039 (-------*-------)

----+---------+---------+---------+-----

12.0 18.0 24.0 30.0

• What is an appropriate inference to use instead?

(side issue: should we pool variances?)
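One candidate, offered here only as a sketch and not as the answer the slides intend, is an interval or test for the difference between the two group means. The side issue then becomes which standard error to use. The Python snippet below computes the difference and both standard errors (pooled and unpooled) from the summary lines above. Turning either into a confidence interval also needs a t multiplier with suitable degrees of freedom, which is left out here.

```python
import math

# Summary statistics from the "Is Spock" output above.
n_no, mean_no, sd_no = 37, 29.492, 7.431    # all other judges
n_yes, mean_yes, sd_yes = 9, 14.622, 5.039  # Spock's judge

diff = mean_no - mean_yes

# Pooled: assumes both groups share one standard deviation.
pooled_var = ((n_no - 1) * sd_no**2 + (n_yes - 1) * sd_yes**2) / (n_no + n_yes - 2)
pooled_sd = math.sqrt(pooled_var)
se_pooled = pooled_sd * math.sqrt(1 / n_no + 1 / n_yes)

# Unpooled: each group keeps its own standard deviation.
se_unpooled = math.sqrt(sd_no**2 / n_no + sd_yes**2 / n_yes)

print(f"difference in means (other judges minus Spock's judge): {diff:.2f}")
print(f"pooled SD = {pooled_sd:.3f}, pooled SE = {se_pooled:.3f}")
print(f"unpooled SE = {se_unpooled:.3f}")
```

The pooled standard deviation reproduces S = 7.056 from the model 2 output up to rounding. The two standard errors differ noticeably here, which is why the pooling question matters for the final inference.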
