sjs sdi_61 design of statistical investigations stephen senn 6. orthogonal designs randomised blocks...

SJS SDI_6 1

Design of Statistical Investigations

Stephen Senn

6. Orthogonal Designs

Randomised Blocks II

SJS SDI_6 2

Exp_5 Alternative Analyses

• We will now show three alternative analyses of example Exp_5

• The first two of these are equivalent to the analysis already done
– The first is equivalent only because there are just two treatments
– It is not equivalent for three or more treatments

SJS SDI_6 3

Matched Pairs t-test

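• Reduce the data to a single difference d
– per patient
– between treatments

• Analyse these differences using a t-test for a single sample

$$\bar{d} = \frac{1}{r}\sum_{j=1}^{r} d_j, \qquad \hat{\sigma}_d^2 = \frac{\sum_{j=1}^{r} (d_j - \bar{d})^2}{r - 1}, \qquad t = \frac{\bar{d} - \mu_d}{\hat{\sigma}_d / \sqrt{r}}$$

where r is the number of patients and \mu_d is the population mean difference.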

SJS SDI_6 4

Matched Pairs t-test (cont)

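• Under H0, the population mean difference \mu_d is zero

• Hence a significance test may be based on the statistic

$$t = \frac{\bar{d}}{\hat{\sigma}_d / \sqrt{r}}$$

which follows a t distribution with r - 1 degrees of freedom under H0.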

SJS SDI_6 5

Exp_5 Matched Pairs t-test

SEQ      Patient  Formoterol  Salbutamol  Difference
forsal      1        310         270          40
salfor      2        385         370          15
salfor      3        400         310          90
forsal      4        310         260          50
salfor      5        410         380          30
forsal      6        370         300          70
forsal      7        410         390          20
salfor      9        320         290          30
forsal     10        250         210          40
forsal     11        380         350          30
salfor     12        340         260          80
salfor     13        220          90         130
forsal     14        330         365         -35

SJS SDI_6 6

Exp_5 Calculations

Differences: 40, 15, 90, 50, 30, 70, 20, 30, 40, 30, 80, 130, -35

Statistic   Value
Mean        45.38462
Variance    1647.756
SD          40.59257
r           13
SE          11.25835
t           4.03
DF          12
P-value     0.0017
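The calculations above can be checked with a short script. This is a minimal sketch in Python rather than the S-Plus used in these slides; the variable names are illustrative, and the P-value is omitted because the standard library has no t distribution.

```python
import math

# Differences (formoterol minus salbutamol) for the 13 patients of Exp_5
d = [40, 15, 90, 50, 30, 70, 20, 30, 40, 30, 80, 130, -35]

r = len(d)                                            # number of patients
mean_d = sum(d) / r                                   # mean difference
var_d = sum((x - mean_d) ** 2 for x in d) / (r - 1)   # sample variance
sd_d = math.sqrt(var_d)                               # standard deviation
se_d = sd_d / math.sqrt(r)                            # standard error of the mean
t_stat = mean_d / se_d                                # t statistic under H0
df = r - 1                                            # degrees of freedom

print(f"mean={mean_d:.5f} var={var_d:.3f} sd={sd_d:.5f}")
print(f"se={se_d:.5f} t={t_stat:.4f} df={df}")
```

The t statistic agrees with the value reported by the paired t-test in S-Plus.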

SJS SDI_6 7

Exp_5 Matched Pairs Using S-Plus

#First split data into two columns
#Matched by patient
i<-sort.list(patient)
data2<-cbind(patient=patient[i],pef=pef[i],treat=treat[i])
pefTREAT<-split(data2[,2],data2[,3])

#Perform matched pairs t-test
t.test(pefTREAT$"2", pefTREAT$"1", alternative="two.sided", mu=0, paired=T, conf.level=.95)

SJS SDI_6 8

Exp_5 S-Plus Output

Paired t-Test

data: pefTREAT$"2" and pefTREAT$"1"
t = 4.0312, df = 12, p-value = 0.0017
alternative hypothesis: true mean of differences is not equal to 0
95 percent confidence interval:
 20.85477 69.91446
sample estimates:
 mean of x - y
 45.38462

SJS SDI_6 9

Exp_5 using a Linear Model Approach

> fit3 <- lm(pef ~ patient + treat)
> summary(fit3, corr = F)

Call: lm(formula = pef ~ patient + treat)
Residuals:
    Min     1Q     Median    3Q   Max
 -42.31 -11.15 1.554e-015 11.15 42.31

Coefficients:
                Value Std. Error t value Pr(>|t|)
(Intercept)  207.3077    21.0624  9.8425   0.0000
   patient1   60.0000    28.7033  2.0904   0.0585
  patient11  135.0000    28.7033  4.7033   0.0005
etc.
      treat   45.3846    11.2584  4.0312   0.0017

SJS SDI_6 10

[Figure: diagnostic plots for the linear model fit3. Panels: treat against pef; histogram of residuals; residuals against fitted values; normal quantile plot of residuals (empirical against theoretical).]

SJS SDI_6 11

Exp_5 A Non-Parametric Approach

• Wilcoxon signed ranks test

• Calculate difference

• Ignore sign

• Rank

• Re-assign sign

• Calculate sum of negative (or positive) ranks

SJS SDI_6 12

Exp_5 Signed Rank Calculations

Patient  Difference  Absolute Difference  Rank Abs Diff  Signed Rank
   1         40              40                 7             7 *
   2         15              15                 1             1
   3         90              90                12            12
   4         50              50                 9             9
   5         30              30                 3             3 *
   6         70              70                10            10
   7         20              20                 2             2
   9         30              30                 3             4 *
  10         40              40                 7             8 *
  11         30              30                 3             5 *
  12         80              80                11            11
  13        130             130                13            13
  14        -35              35                 6            -6

* = tie arbitrarily broken

SJS SDI_6 13

Exp_5 Hypothesis Test

• Suppose H0 is true

• Then any difference is positive or negative with probability P = 1/2

• There are 2^13 = 8192 different possible patterns of - and +

• How many produce a sum of negative ranks as low as that seen?

SJS SDI_6 14

Possible Assignments of Negative Ranks With Equal or Lower Score

• No negative ranks: 1 case
• 1, 2, 3, 4, 5, 6 only: 6 cases
• 1 + (2, 3, 4, 5): 4 cases
• 2 + (3, 4): 2 cases
• 1 + 2 + 3: 1 case
• Total = 1 + 6 + 4 + 2 + 1 = 14 cases
• 14/2^13 = 0.00171
• Two-sided P-value = 2 x 0.00171 = 0.0034
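The count of 14 can be confirmed by brute force. The sketch below, in Python rather than S-Plus, enumerates every subset of the ranks 1 to 13 that could carry a negative sign and counts those whose sum does not exceed the observed value of 6. It assumes, as the hand calculation does, that ties have been broken so the ranks are the distinct integers 1 to 13; the variable names are illustrative.

```python
from itertools import combinations

n = 13          # number of patients (non-zero differences)
observed = 6    # observed sum of negative ranks
ranks = range(1, n + 1)

# Under H0 every subset of ranks is equally likely to be the negative set,
# so count the subsets with a rank sum as low as, or lower than, that observed.
count = 0
for k in range(n + 1):
    for subset in combinations(ranks, k):
        if sum(subset) <= observed:
            count += 1

one_sided = count / 2 ** n
two_sided = 2 * one_sided

print(count, round(one_sided, 5), round(two_sided, 4))
```

With only 8192 sign patterns, full enumeration is cheap; for larger samples a recursion on the rank-sum distribution would be the usual choice.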

SJS SDI_6 15

Exp_5 SPlus Output

> wilcox.test(pefTREAT$"1", pefTREAT$"2", alternative = "two.sided", mu = 0, paired = T, exact = T, correct = T)

Wilcoxon signed-rank test

cannot compute exact p-value with ties in: wil.sign.rank(dff, alternative, exact, correct)

data: pefTREAT$"1" and pefTREAT$"2"
signed-rank normal statistic with correction Z = -2.7297, p-value = 0.0063
alternative hypothesis: true mu is not equal to 0

SJS SDI_6 16

Why the Discrepancy?

• P = 0.0034 by hand, 0.0063 from SPlus
• SPlus is using an asymptotic approximation
• StatXact gives an accurate calculation

StatXact output
Exact Inference:
One-sided p-value: Pr { Test Statistic .GE. Observed } = 0.0017
                   Pr { Test Statistic .EQ. Observed } = 0.0005
Two-sided p-value: Pr { | Test Statistic - Mean | .GE. | Observed - Mean | = 0.0034
Two-sided p-value: 2*One-Sided = 0.0034
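The asymptotic value can be reproduced by hand. The sketch below, in Python rather than S-Plus, assumes the approximation is the usual normal one with midranks for ties, the tie-corrected variance and a continuity correction of 0.5; the S-Plus internals are not shown in these slides, so this is an inference from the fact that the figures match.

```python
import math
from collections import Counter

# Differences (formoterol minus salbutamol) for the 13 patients of Exp_5
d = [40, 15, 90, 50, 30, 70, 20, 30, 40, 30, 80, 130, -35]
n = len(d)

# Midranks of the absolute differences: tied values share their average rank
counts = Counter(abs(x) for x in d)
rank_of = {}
next_rank = 1
for value in sorted(counts):
    ties = counts[value]
    rank_of[value] = next_rank + (ties - 1) / 2
    next_rank += ties

neg_sum = sum(rank_of[abs(x)] for x in d if x < 0)   # sum of negative ranks

# Null mean and tie-corrected null variance of the signed-rank statistic
mean = n * (n + 1) / 4
var = n * (n + 1) * (2 * n + 1) / 24 - sum(t ** 3 - t for t in counts.values()) / 48

z = (neg_sum - mean + 0.5) / math.sqrt(var)          # continuity-corrected
p_two_sided = math.erfc(abs(z) / math.sqrt(2))       # 2 * (1 - Phi(|z|))

print(round(z, 4), round(p_two_sided, 4))
```

With only 13 patients the normal approximation is rough in the tail, which is why it gives nearly twice the exact P-value.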

SJS SDI_6 17

Orthogonal (see Marriott, A Dictionary of Statistical Terms)

• Mathematical meaning is perpendicular
– as in co-ordinate axes

• Statistical variates are orthogonal if independent

• Experimental design is orthogonal if certain variates or linear combinations are independent
– rectangular arrays are orthogonal

SJS SDI_6 18

Randomised Blocks and Orthogonality

• Randomised blocks are examples of orthogonal designs
– Rectangular arrays
– Balanced

• Consequences
– Treatment sum of squares does not change as blocks are fitted
– Design is efficient

SJS SDI_6 19

Orthogonality and Regression

Illustration of orthogonality using Exp_5

Input design matrix corresponding to treatment only

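ORIGIN = 1

$$X_1 = \begin{pmatrix} 1 & 1 & 1 & 1 & \cdots & 1 & 1 \\ 0 & 1 & 0 & 1 & \cdots & 0 & 1 \end{pmatrix}^{T} \qquad \text{(26 rows: intercept column and treatment indicator)}$$

Variance-covariance matrix

$$(X_1^T X_1)^{-1} = \begin{pmatrix} 0.077 & -0.077 \\ -0.077 & 0.154 \end{pmatrix}$$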

SJS SDI_6 20

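Orthogonality and Regression 2

Create design matrix with dummy variables for patients also

X2:
for i in 1..13
  for j in 1..12
    temp[i, j] <- 0
    temp[i, j] <- 1 if i = j
    temp[i+13, j] <- 0
    temp[i+13, j] <- 1 if i = j
X2 := augment(X1, temp)

Variance-covariance matrix (part): rows and columns 1 to 3 of (X2^T X2)^{-1}

$$\left[(X_2^T X_2)^{-1}\right]_{1..3,\,1..3} = \begin{pmatrix} 0.538 & -0.077 & -0.5 \\ -0.077 & 0.154 & 0 \\ -0.5 & 0 & 1 \end{pmatrix}$$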

SJS SDI_6 21

Orthogonality and Regression 3

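$$\operatorname{var}(\hat{\tau}_1 - \hat{\tau}_2) = \sigma^2\left(\frac{1}{r_1} + \frac{1}{r_2}\right), \qquad r_1 = r_2 = r \;\Rightarrow\; \operatorname{var}(\hat{\tau}_1 - \hat{\tau}_2) = \sigma^2\left(\frac{1}{r} + \frac{1}{r}\right) = \frac{2\sigma^2}{r}$$

Here we have

$$\operatorname{var}(\hat{\tau}_1 - \hat{\tau}_2) = \frac{2\sigma^2}{13} = 0.154\,\sigma^2$$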

SJS SDI_6 22

Orthogonality

• Addition of patients has not increased variance multiplier for treatments

• Variance is as it would be had patients not been included

• “patient” and “treat” are orthogonal

• The factors are balanced
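The orthogonality claim can be verified numerically. The following is a minimal sketch in Python (the slides use a matrix worksheet), with exact fractions so that no rounding intervenes. The row ordering is an assumption consistent with the design matrices above: patient i occupies rows i and i + 13, treatments alternate 0, 1 down the rows, and patient 13 is the reference level. The helper names invert and gram are illustrative.

```python
from fractions import Fraction

def gram(x):
    """Return X'X for a design matrix given as a list of rows."""
    p = len(x[0])
    return [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]

def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination in exact arithmetic."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

n_pat = 13
# Treatment only: intercept column plus alternating 0/1 treatment indicator
x1 = [[1, k % 2] for k in range(2 * n_pat)]
# Add dummy variables for patients 1 to 12 (patient 13 is the reference)
x2 = [row + [int(k % n_pat == j) for j in range(n_pat - 1)]
      for k, row in enumerate(x1)]

v1 = invert(gram(x1))
v2 = invert(gram(x2))

print(float(v1[1][1]), float(v2[1][1]))   # treatment multiplier, both models
print([[round(float(v), 3) for v in row[:3]] for row in v2[:3]])
```

The treatment entry is 2/13 in both models, and the covariance between the treatment and patient coefficients is exactly zero.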

SJS SDI_6 23

Exp_6 Another Example of a Two-Way Layout

• Classic attempt (Cushny & Peebles, 1905) to investigate optical isomers

• Subjects: eleven patients in an insane asylum in Kalamazoo

• Outcome: hours of sleep gained

• Data subsequently analysed by Student (1908) in famous t-test paper.

SJS SDI_6 24

The Cushny and Peebles Data

Patient  Control (mean)  B (mean)  C (mean)  D (mean)
   1          0.6           1.3       2.5       2.1
   2          3.0           1.4       3.8       4.4
   3          4.7           4.5       5.8       4.7
   4          5.5           4.3       5.6       4.8
   5          6.2           6.1       6.1       6.7
   6          3.2           6.6       7.6       8.3
   7          2.5           6.2       8.0       8.2
   8          2.8           3.6       4.4       4.3
   9          1.1           1.1       5.7       5.8
  10          2.9           4.9       6.3       6.4

B = L-Hyoscyamine HBr

C = L-Hyoscine HBr

D = R-Hyoscine HBr

SJS SDI_6 25

Features

• Treatments provide one dimension

• Patients provide the other

• Patients are the “blocks”

• Control is no treatment

• Main interest is difference between optical isomers

• Other treatment is positive control

SJS SDI_6 26

Cushny and Peebles Data

[Figure: hours of sleep gained (Y axis, 0 to 8) against patient number (X axis, 1 to 11), with a separate symbol for each treatment: Control, L-Hyoscyamine HBr, L-Hyoscine HBr, R-Hyoscine HBr.]

SJS SDI_6 27

Plotting Points

• Important to show three dimensions of data
– Outcome
– Treatment
– Block (patients)

• This has been done here by using
– Patient as a “pseudo-dimension” X
– Outcome as the Y dimension
– Treatment by colour and symbols

SJS SDI_6 28

Points

• Clear difference between treatments and control

• Some suggestion of difference to active control

• Little suggestion of difference between isomers

SJS SDI_6 29

Cushny and Peebles Data

Comparison of Two Isomers with Respect to Sleep (hours)

[Figure: scatter plot of R-Hyoscine HBr (Y axis, 0 to 6) against L-Hyoscine HBr (X axis, 0 to 6).]

SJS SDI_6 30

Exp_6 SPlus Analysis

#Analysis of first 10 patients
#Cushny and Peebles data
patient<-factor(rep(c("1","2","3","4","5","6","7","8","9","10"),4))
treat<-factor(c(rep("A",10),rep("B",10),rep("C",10),rep("D",10)))
sleep<-c(0.6,3.0,4.7,5.5,6.2,3.2,2.5,2.8,1.1,2.9,
         1.3,1.4,4.5,4.3,6.1,6.6,6.2,3.6,1.1,4.9,
         2.5,3.8,5.8,5.6,6.1,7.6,8.0,4.4,5.7,6.3,
         2.1,4.4,4.7,4.8,6.7,8.3,8.2,4.3,5.8,6.4)
fit1<-aov(sleep~patient+treat)
summary(fit1)

SJS SDI_6 31

Exp_6 SPlus Output

summary(fit1)
          Df Sum of Sq  Mean Sq  F Value        Pr(F)
  patient  9    89.500  9.94444  7.38815 0.0000231769
    treat  3    40.838 13.61267 10.11342 0.0001225074
Residuals 27    36.342  1.34600
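Because the layout is balanced, the sums of squares in this table can be computed directly from the row and column means. The sketch below does so in Python rather than S-Plus; the variable names are illustrative, and the P-values are omitted because the standard library has no F distribution.

```python
import math

# Hours of sleep gained, patients 1 to 10, by treatment (Cushny and Peebles)
sleep = {
    "A": [0.6, 3.0, 4.7, 5.5, 6.2, 3.2, 2.5, 2.8, 1.1, 2.9],
    "B": [1.3, 1.4, 4.5, 4.3, 6.1, 6.6, 6.2, 3.6, 1.1, 4.9],
    "C": [2.5, 3.8, 5.8, 5.6, 6.1, 7.6, 8.0, 4.4, 5.7, 6.3],
    "D": [2.1, 4.4, 4.7, 4.8, 6.7, 8.3, 8.2, 4.3, 5.8, 6.4],
}
t = len(sleep)        # number of treatments
r = 10                # number of patients (blocks)

grand = sum(sum(v) for v in sleep.values()) / (r * t)
treat_means = {k: sum(v) / r for k, v in sleep.items()}
patient_means = [sum(sleep[k][j] for k in sleep) / t for j in range(r)]

# Orthogonal decomposition of the total sum of squares
ss_total = sum((y - grand) ** 2 for v in sleep.values() for y in v)
ss_patient = t * sum((m - grand) ** 2 for m in patient_means)
ss_treat = r * sum((m - grand) ** 2 for m in treat_means.values())
ss_resid = ss_total - ss_patient - ss_treat

df_resid = (r - 1) * (t - 1)
ms_resid = ss_resid / df_resid
f_treat = (ss_treat / (t - 1)) / ms_resid
se_diff = math.sqrt(2 * ms_resid / r)   # SE of a difference of two treatment means

print(round(ss_patient, 3), round(ss_treat, 3), round(ss_resid, 3))
print(round(ms_resid, 4), round(f_treat, 4), round(se_diff, 3))
```

The subtraction used for the residual sum of squares is valid only because patient and treat are orthogonal; in an unbalanced layout the two sums of squares would depend on the order of fitting.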

SJS SDI_6 32

[Figure: diagnostic plots. Panels: treat against lsleep; histogram of residuals; residuals against fitted values; normal quantile plot of residuals (empirical against theoretical).]

SJS SDI_6 33

Calculation of Standard Errors

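$$\operatorname{var}(\hat{\tau}_l - \hat{\tau}_m) = \sigma^2\left(\frac{1}{r} + \frac{1}{r}\right) = \frac{2\sigma^2}{r}, \qquad l \neq m$$

Here we have $\hat{\sigma}^2 = 1.34600$, so

$$\widehat{\operatorname{var}}(\hat{\tau}_l - \hat{\tau}_m) = \frac{2 \times 1.3460}{10} = 0.2692, \qquad \widehat{SE}(\hat{\tau}_l - \hat{\tau}_m) = \sqrt{0.2692} = 0.519$$

Note that this applies as a consequence of orthogonality

The multiplier 2/r is the same whether or not we fit patient in addition to treat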

SJS SDI_6 34

> multicomp(fit1, focus = "treat", error.type = "cwe", method = "lsd")

95 % non-simultaneous confidence intervals for specified linear combinations, by the Fisher LSD method

critical point: 2.0518
response variable: sleep

intervals excluding 0 are flagged by '****'

     Estimate Std.Error Lower Bound Upper Bound
A-B     -0.75     0.519       -1.81       0.315
A-C     -2.33     0.519       -3.39      -1.270 ****
A-D     -2.32     0.519       -3.38      -1.260 ****
B-C     -1.58     0.519       -2.64      -0.515 ****
B-D     -1.57     0.519       -2.63      -0.505 ****
C-D      0.01     0.519       -1.05       1.070
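Each interval is simply estimate plus or minus critical point times standard error. The sketch below, in Python rather than S-Plus, rebuilds the table from the treatment means; the means are computed from the data table, while the standard error (0.519) and critical point (2.0518) are taken from the output above rather than recomputed. The variable names are illustrative.

```python
from itertools import combinations

# Treatment means computed from the Cushny and Peebles data (10 patients each)
means = {"A": 3.25, "B": 4.00, "C": 5.58, "D": 5.57}
se = 0.519          # standard error of a difference, from the output above
critical = 2.0518   # critical point reported by multicomp (t on 27 DF)

intervals = {}
flagged = []
for a, b in combinations(means, 2):
    label = f"{a}-{b}"
    estimate = means[a] - means[b]
    lower = estimate - critical * se
    upper = estimate + critical * se
    intervals[label] = (lower, upper)
    if lower > 0 or upper < 0:          # interval excludes zero
        flagged.append(label)
    print(f"{label} {estimate:6.2f} {lower:6.2f} {upper:6.3f}")

print("intervals excluding 0:", flagged)
```

All six intervals have the same width because every pairwise contrast shares the same standard error, which is another consequence of the balanced, orthogonal layout.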

SJS SDI_6 35

Questions

• Perform a matched pairs analysis comparing B and C and ignoring data from A and D.

• Compare it to the pair-wise contrast for B & C obtained above.

• Why are the results not the same?

• What are the advantages and disadvantages of these approaches?
