bayesian hypothesis testing

21
Clinical trial monitoring with Bayesian hypothesis testing John D. Cook Valen E. Johnson August 6, 2008

Upload: john-cook

Post on 27-Jan-2015

129 views

Category:

Documents


4 download

DESCRIPTION

 

TRANSCRIPT

Page 1: Bayesian hypothesis testing

Clinical trial monitoring withBayesian hypothesis testing

John D. CookValen E. Johnson

August 6, 2008

Page 2: Bayesian hypothesis testing

Estimation and Testing

I Bayesians typically approach a clinical trial as an estimationproblem, not a test.

I Possible explanation: poor operating characteristics . . .

I Unless you choose your alternative prior well.

Page 3: Bayesian hypothesis testing

Local prior operating characteristics

I Point null hypothesis versus alternative prior that assignspositive probability to the null

I When simulating from the alternative, Bayes factor in favor ofalternative grows like en.

I When simulating from the null, Bayes factor in favor of nullgrows like n1/2.

I Hard to ever reject the null.

Page 4: Bayesian hypothesis testing

Inverse moment priors (iMOM)

0.0 0.2 0.4 0.6 0.8 1.0

π1(θ) ∝ (θ − θ0)−ν−1 exp(−λ(θ − θ0)−2k

)[θ > θ0]

Page 5: Bayesian hypothesis testing

iMOM Convergence rates

When simulating from alternative

p limn→∞

n−1 log BFn(1|0) = c > 0.

(Well known result.)

When simulating from null,

p limn→∞

n−k/(k+1) log BFn(1|0) = c < 0.

(New result.)

Page 6: Bayesian hypothesis testing

Thall-Simon method

I Historical standard: θS ∼ Beta(aS , bS). Parameters aS andbS large.

I Experimental treatment: θE ∼ Beta(aE , bE ) a priori, aE andbE small.

I Stop for inferiority if P(θE < δ + θS | data) is large.

I Stop for superiority if P(θE > θS | data) is large.

I Operating characteristics degrade without δ > 0.

I Inconsistent in limit: both stopping rules could apply.

Page 7: Bayesian hypothesis testing

Thall-Simon plot

0.0 0.2 0.4 0.6 0.8 1.0

Beta(60, 140) historical, Beta(12, 18) experimental

Page 8: Bayesian hypothesis testing

Comparing Bayes factor with Thall-Simon

Historical response 20%, alternative 30%. Fifty patients maximum.

Bayes factor design:

I H0: θ = 0.2

I H1: iMOM prior with mode 0.3.

I Stop for inferiority if P(H0 | data) > 0.9.

I Stop for superiority if P(H1 | data) > 0.9.

Page 9: Bayesian hypothesis testing

Comparing Bayes factor with Thall-Simon, cont.

Thall-Simon design:

I θS ∼ Beta(200,800)

I θE ∼ Beta(0.6, 1.4) a priori

I Stop for inferiority if P(θS > 0.1 + θE | data) > 0.976.

I Stop for superiority if P(θE > θS | data) > 0.99.

Calibrated to match probability of stopping for wrong reason atnull and alternative.

Page 10: Bayesian hypothesis testing

Stopping for inferiority

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

true response probability

prob

abili

ty o

f con

clud

ing

infe

riorit

y● ●

●●

● ● ● ● ● ● ● ● ● ● ● ●

● Thall−SimonBayes factor

Page 11: Bayesian hypothesis testing

Stopping for superiority

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

true response probability

prob

abili

ty o

f con

clud

ing

supe

riorit

y

● ● ●●

● ● ● ● ● ● ● ● ● ● ●

● Thall−SimonBayes factor

Page 12: Bayesian hypothesis testing

Thall-Wooten time-to-event method

I Analogous to Thall-Simon method for binary outcomes.

I t | θ ∼ exponential with mean θ, θ ∼ inverse gamma

I Stop for inferiority if P(θS + 0.1 > θE | data) large . . .

I Stop for superiority if P(θE > θS | data) large

Page 13: Bayesian hypothesis testing

Comparing Bayes factor and Thall-Wooten method

Standard treatment 6 months PFS, alternative 8 months,maximum 50 patients

Bayes factor design:

I H0: θ = 6

I H1: iMOM prior with mode 8.

I Stop for inferiority if P(H0 | data) > 0.9.

I Stop for superiority if P(H1 | data) > 0.9.

Page 14: Bayesian hypothesis testing

Comparing Bayes factor and Thall-Wooten method, cont.

Thall-Wooten design:

I θS ∼ Inverse Gamma (20,1200)

I θE ∼ Inverse Gamma(3, 12) a priori

I Stop for inferiority if P(θS + 2 > θE | data) > 0.976.

I Stop for superiority if P(θE > θS | data) > 0.93.

Calibrated to match probability of stopping for wrong reason atnull and alternative.

Page 15: Bayesian hypothesis testing

Stopping for inferiority

2 4 6 8 10 12

0.0

0.2

0.4

0.6

0.8

1.0

true mean survival time

prob

abili

ty o

f ear

ly s

topp

ing

for

infe

riorit

y● ● ● ● ● ●

●●

● ● ● ● ● ●

● Thall−WootenBayes factor

Page 16: Bayesian hypothesis testing

Stopping for superiority

2 4 6 8 10 12

0.0

0.2

0.4

0.6

0.8

1.0

true mean survival time

prob

abili

ty o

f ear

ly s

topp

ing

for

supe

riorit

y

● ● ● ● ● ● ●●

●● ● ● ● ●

● Thall−WootenBayes factor

Page 17: Bayesian hypothesis testing

Comparison with Simon two-stage design

Simon two-stage design to test null response rate 0.20 versusalternative rate 0.40.

Reject 95% of the time under null, 20% under alternative.

Maximum of 43 patients: 13 in first stage, 30 in second stage.

Page 18: Bayesian hypothesis testing

Comparison with Simon two-stage design:rejection probability

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

true response probability

prob

abili

ty o

f rej

ectin

g tr

eatm

ent

● ● ● ●

●● ● ● ● ● ● ● ● ● ●

● SimonBayes factor

Page 19: Bayesian hypothesis testing

Comparison with Simon two-stage design:patients used

0.0 0.2 0.4 0.6 0.8 1.0

010

2030

40

true response probability

patie

nts

●●

●●

● ● ● ● ● ● ● ● ● ●

● SimonBayes factorNaive Simon

Page 20: Bayesian hypothesis testing

References

I Valen E. Johnson, John D. Cook. Bayesian Design ofSingle-Arm Phase II Clinical Trials with ContinuousMonitoring. Clinical Trials 2009; 6(3):217-26.

I Software: http://biostatistics.mdanderson.org

I http://www.JohnDCook.com

Page 21: Bayesian hypothesis testing