bayesian hypothesis testing
DESCRIPTION
TRANSCRIPT
Clinical trial monitoring withBayesian hypothesis testing
John D. CookValen E. Johnson
August 6, 2008
Estimation and Testing
I Bayesians typically approach a clinical trial as an estimationproblem, not a test.
I Possible explanation: poor operating characteristics . . .
I Unless you choose your alternative prior well.
Local prior operating characteristics
I Point null hypothesis versus alternative prior that assignspositive probability to the null
I When simulating from the alternative, Bayes factor in favor ofalternative grows like en.
I When simulating from the null, Bayes factor in favor of nullgrows like n1/2.
I Hard to ever reject the null.
Inverse moment priors (iMOM)
0.0 0.2 0.4 0.6 0.8 1.0
π1(θ) ∝ (θ − θ0)−ν−1 exp(−λ(θ − θ0)−2k
)[θ > θ0]
iMOM Convergence rates
When simulating from alternative
p limn→∞
n−1 log BFn(1|0) = c > 0.
(Well known result.)
When simulating from null,
p limn→∞
n−k/(k+1) log BFn(1|0) = c < 0.
(New result.)
Thall-Simon method
I Historical standard: θS ∼ Beta(aS , bS). Parameters aS andbS large.
I Experimental treatment: θE ∼ Beta(aE , bE ) a priori, aE andbE small.
I Stop for inferiority if P(θE < δ + θS | data) is large.
I Stop for superiority if P(θE > θS | data) is large.
I Operating characteristics degrade without δ > 0.
I Inconsistent in limit: both stopping rules could apply.
Thall-Simon plot
0.0 0.2 0.4 0.6 0.8 1.0
Beta(60, 140) historical, Beta(12, 18) experimental
Comparing Bayes factor with Thall-Simon
Historical response 20%, alternative 30%. Fifty patients maximum.
Bayes factor design:
I H0: θ = 0.2
I H1: iMOM prior with mode 0.3.
I Stop for inferiority if P(H0 | data) > 0.9.
I Stop for superiority if P(H1 | data) > 0.9.
Comparing Bayes factor with Thall-Simon, cont.
Thall-Simon design:
I θS ∼ Beta(200,800)
I θE ∼ Beta(0.6, 1.4) a priori
I Stop for inferiority if P(θS > 0.1 + θE | data) > 0.976.
I Stop for superiority if P(θE > θS | data) > 0.99.
Calibrated to match probability of stopping for wrong reason atnull and alternative.
Stopping for inferiority
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
true response probability
prob
abili
ty o
f con
clud
ing
infe
riorit
y● ●
●
●
●
●
●
●●
● ● ● ● ● ● ● ● ● ● ● ●
● Thall−SimonBayes factor
Stopping for superiority
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
true response probability
prob
abili
ty o
f con
clud
ing
supe
riorit
y
● ● ●●
●
●
●
●
●
●
● ● ● ● ● ● ● ● ● ● ●
● Thall−SimonBayes factor
Thall-Wooten time-to-event method
I Analogous to Thall-Simon method for binary outcomes.
I t | θ ∼ exponential with mean θ, θ ∼ inverse gamma
I Stop for inferiority if P(θS + 0.1 > θE | data) large . . .
I Stop for superiority if P(θE > θS | data) large
Comparing Bayes factor and Thall-Wooten method
Standard treatment 6 months PFS, alternative 8 months,maximum 50 patients
Bayes factor design:
I H0: θ = 6
I H1: iMOM prior with mode 8.
I Stop for inferiority if P(H0 | data) > 0.9.
I Stop for superiority if P(H1 | data) > 0.9.
Comparing Bayes factor and Thall-Wooten method, cont.
Thall-Wooten design:
I θS ∼ Inverse Gamma (20,1200)
I θE ∼ Inverse Gamma(3, 12) a priori
I Stop for inferiority if P(θS + 2 > θE | data) > 0.976.
I Stop for superiority if P(θE > θS | data) > 0.93.
Calibrated to match probability of stopping for wrong reason atnull and alternative.
Stopping for inferiority
2 4 6 8 10 12
0.0
0.2
0.4
0.6
0.8
1.0
true mean survival time
prob
abili
ty o
f ear
ly s
topp
ing
for
infe
riorit
y● ● ● ● ● ●
●
●
●
●
●
●
●
●●
● ● ● ● ● ●
● Thall−WootenBayes factor
Stopping for superiority
2 4 6 8 10 12
0.0
0.2
0.4
0.6
0.8
1.0
true mean survival time
prob
abili
ty o
f ear
ly s
topp
ing
for
supe
riorit
y
● ● ● ● ● ● ●●
●
●
●
●
●
●
●
●● ● ● ● ●
● Thall−WootenBayes factor
Comparison with Simon two-stage design
Simon two-stage design to test null response rate 0.20 versusalternative rate 0.40.
Reject 95% of the time under null, 20% under alternative.
Maximum of 43 patients: 13 in first stage, 30 in second stage.
Comparison with Simon two-stage design:rejection probability
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
true response probability
prob
abili
ty o
f rej
ectin
g tr
eatm
ent
● ● ● ●
●
●
●
●
●
●
●● ● ● ● ● ● ● ● ● ●
● SimonBayes factor
Comparison with Simon two-stage design:patients used
0.0 0.2 0.4 0.6 0.8 1.0
010
2030
40
true response probability
patie
nts
●●
●
●
●
●
●
●
●
●●
● ● ● ● ● ● ● ● ● ●
● SimonBayes factorNaive Simon
References
I Valen E. Johnson, John D. Cook. Bayesian Design ofSingle-Arm Phase II Clinical Trials with ContinuousMonitoring. Clinical Trials 2009; 6(3):217-26.
I Software: http://biostatistics.mdanderson.org
I http://www.JohnDCook.com