Department of Radiology, MRI Division
Computational Radiology Laboratory
An NCRR National Resource Center
Brigham and Women’s Hospital, Boston, Massachusetts, USA
A teaching affiliate of Harvard Medical School
STAPLE: Simultaneous Truth and Performance Level Estimation
An algorithm for the evaluation of image segmentations.
Simon K. Warfield, Kelly Zou, William M. Wells
Slide 2
Validation of Image Segmentation
• Comparison to digital and physical phantoms:
  – Excellent for testing the anatomy, noise and artifacts that are modeled.
  – Typically lacks the range of variability encountered in practice.
• Comparison to expert performance; to other algorithms:
  – What is the appropriate measure for such comparisons?
• Our new approach:
  – Simultaneous estimation of hidden “ground truth” and expert performance.
  – Enables comparison between and to experts.
  – Can be easily applied to clinical data exhibiting a range of normal and pathological variability.
Slide 3
How to judge segmentations of the peripheral zone?
[Figure panels: 0.5T MR of prostate; peripheral zone and segmentations.]
Slide 4
Algorithm
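• Complete data model:
  – Binary ground truth $T_i$ for each voxel $i$.
  – Expert $j$ makes segmentation decisions $D_{ij}$.
  – Expert performance characterized by sensitivity $p$ and specificity $q$.
• We observe expert decisions $\mathbf{D}$. If we knew ground truth $\mathbf{T}$, we could construct maximum likelihood estimates for each expert’s sensitivity (true positive fraction) and specificity (true negative fraction) from the complete-data density $f(\mathbf{D}, \mathbf{T} \mid \mathbf{p}, \mathbf{q})$:

$$
(\hat{\mathbf{p}}, \hat{\mathbf{q}}) = \arg\max_{\mathbf{p},\mathbf{q}} \ln f(\mathbf{D}, \mathbf{T} \mid \mathbf{p}, \mathbf{q})
$$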
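When the ground truth is known, the maximum likelihood estimates for a single expert reduce to the true positive and true negative fractions. A minimal sketch in plain Python, assuming binary lists; the function name and the toy data are illustrative and do not come from the slides:

```python
def ml_sensitivity_specificity(decisions, truth):
    """Return (p, q) for one expert, given binary decisions and known binary truth."""
    n_pos = sum(truth)
    n_neg = len(truth) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("truth must contain both classes")
    # True positives: expert says 1 where truth is 1.
    tp = sum(1 for d, t in zip(decisions, truth) if d == 1 and t == 1)
    # True negatives: expert says 0 where truth is 0.
    tn = sum(1 for d, t in zip(decisions, truth) if d == 0 and t == 0)
    return tp / n_pos, tn / n_neg


# Hypothetical toy data: 4 foreground voxels, 6 background voxels.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
decisions = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
p, q = ml_sensitivity_specificity(decisions, truth)
print(p, q)  # sensitivity 3/4, specificity 5/6
```

In practice the truth is hidden, which is why the following slides replace it with a probabilistic estimate.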
Slide 5
Expectation-Maximization
• Since we don’t know ground truth $\mathbf{T}$, treat $\mathbf{T}$ as a random variable, and solve for the parameters that maximize:

$$
Q(\theta \mid \hat{\theta}) = E_{g(\mathbf{T} \mid \mathbf{D}, \hat{\theta})}\left[\ln f(\mathbf{D}, \mathbf{T} \mid \theta)\right]
$$

• Parameter values $\theta_j = [p_j \; q_j]^T$ that maximize the conditional expectation of the log-likelihood function are found by iterating two steps:
  – Estimate hidden ground truth given a previous estimate of the expert quality parameters.
  – Estimate expert performance parameters based on how the expert decisions compare to the current estimate of the ground truth.
Slide 6
To Solve for Expert Parameters:
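$$
\begin{aligned}
(\hat{\mathbf{p}}, \hat{\mathbf{q}})
&= \arg\max_{\mathbf{p},\mathbf{q}} E_{g(\mathbf{T} \mid \mathbf{D}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})}\left[\ln f(\mathbf{D}, \mathbf{T} \mid \mathbf{p}, \mathbf{q})\right] \\
&= \arg\max_{\mathbf{p},\mathbf{q}} E_{g(\mathbf{T} \mid \mathbf{D}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})}\left[\ln f(\mathbf{D} \mid \mathbf{T}, \mathbf{p}, \mathbf{q})\, f(\mathbf{T})\right]
\end{aligned}
$$

where $\hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o}$ are the previous estimates of the parameters of each expert.

$$
\begin{aligned}
g(\mathbf{T} \mid \mathbf{D}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})
&= \frac{g(\mathbf{D} \mid \mathbf{T}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})\, g(\mathbf{T})}{\sum_{\mathbf{T}} g(\mathbf{D} \mid \mathbf{T}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})\, g(\mathbf{T})} \\
&= \frac{\prod_i \left[\prod_j g(D_{ij} \mid T_i, \hat{p}_j^{o}, \hat{q}_j^{o})\right] g(T_i)}{\sum_{\mathbf{T}} \prod_i \left[\prod_j g(D_{ij} \mid T_i, \hat{p}_j^{o}, \hat{q}_j^{o})\right] g(T_i)}
\end{aligned}
$$

where $i$ indexes over voxels and $j$ over experts. For each voxel $i$:

$$
g(T_i \mid \mathbf{D}_i, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})
= \frac{\prod_j g(D_{ij} \mid T_i, \hat{p}_j^{o}, \hat{q}_j^{o})\, g(T_i)}{\sum_{T_i} \prod_j g(D_{ij} \mid T_i, \hat{p}_j^{o}, \hat{q}_j^{o})\, g(T_i)}
$$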
Slide 7
True Segmentation Estimate
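$$
W_i \equiv g(T_i = 1 \mid \mathbf{D}_i, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})
= \frac{g(T_i = 1)\,\alpha}{g(T_i = 1)\,\alpha + \left(1 - g(T_i = 1)\right)\beta}
$$

$$
\alpha = \prod_{j : D_{ij} = 1} \hat{p}_j \prod_{j : D_{ij} = 0} (1 - \hat{p}_j),
\qquad
\beta = \prod_{j : D_{ij} = 0} \hat{q}_j \prod_{j : D_{ij} = 1} (1 - \hat{q}_j)
$$

$g(T_i = 1)$ = prior probability that ground truth is 1.
$W_i$ = probability that ground truth is 1.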
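The weight for a single voxel can be computed directly from the expert decisions at that voxel. The following is a minimal sketch under the slide's definitions of α and β; the function name `voxel_weight` and the example numbers are assumptions made for illustration:

```python
def voxel_weight(decisions, p, q, prior):
    """Posterior probability that the true label is 1 at one voxel.

    decisions: binary decisions of each expert at this voxel
    p, q:      previous sensitivity / specificity estimates, one per expert
    prior:     g(T_i = 1)
    """
    alpha = 1.0  # likelihood of the decisions if the truth is 1
    beta = 1.0   # likelihood of the decisions if the truth is 0
    for d, pj, qj in zip(decisions, p, q):
        if d == 1:
            alpha *= pj
            beta *= 1.0 - qj
        else:
            alpha *= 1.0 - pj
            beta *= qj
    return prior * alpha / (prior * alpha + (1.0 - prior) * beta)


# Two of three equally reliable experts mark the voxel as foreground.
rates = [0.9, 0.9, 0.9]
w_flat = voxel_weight([1, 1, 0], rates, rates, prior=0.5)
w_low = voxel_weight([1, 1, 0], rates, rates, prior=0.1)
print(w_flat, w_low)  # about 0.9 and 0.5
```

The second call shows how a low prior can offset a two-to-one vote, which is why the choice of $g(T_i = 1)$ matters in the experiments.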
Slide 8
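Expert Performance Estimate

$$
\begin{aligned}
(\hat{\mathbf{p}}, \hat{\mathbf{q}})
&= \arg\max_{\mathbf{p},\mathbf{q}} E_{g(\mathbf{T} \mid \mathbf{D}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})}\left[\ln f(\mathbf{D} \mid \mathbf{T}, \mathbf{p}, \mathbf{q})\, f(\mathbf{T})\right] \\
&= \arg\max_{\mathbf{p},\mathbf{q}} E_{g(\mathbf{T} \mid \mathbf{D}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})}\left[\ln \prod_{i}\prod_{j} f(D_{ij} \mid T_i, p_j, q_j) + \ln f(\mathbf{T})\right] \\
&= \arg\max_{\mathbf{p},\mathbf{q}} E_{g(\mathbf{T} \mid \mathbf{D}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})}\left[\sum_{j}\sum_{i} \ln f(D_{ij} \mid T_i, p_j, q_j)\right]
\end{aligned}
$$

$$
\begin{aligned}
(\hat{p}_j, \hat{q}_j)
&= \arg\max_{p_j, q_j} E_{g(\mathbf{T} \mid \mathbf{D}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})}\left[\sum_{i} \ln f(D_{ij} \mid T_i, p_j, q_j)\right] \\
&= \arg\max_{p_j, q_j} \sum_{i} \left[ W_i \ln f(D_{ij} \mid T_i = 1, p_j, q_j) + (1 - W_i) \ln f(D_{ij} \mid T_i = 0, p_j, q_j) \right] \\
&= \arg\max_{p_j, q_j} \sum_{i : D_{ij} = 1} W_i \ln p_j + \sum_{i : D_{ij} = 1} (1 - W_i) \ln (1 - q_j) \\
&\qquad\qquad + \sum_{i : D_{ij} = 0} W_i \ln (1 - p_j) + \sum_{i : D_{ij} = 0} (1 - W_i) \ln q_j
\end{aligned}
$$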
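The slide does not show the final maximization step. A brief sketch of how it can be carried out, on the assumption that the weights $W_i$ are held fixed: only two of the four sums depend on $p_j$, so the derivative with respect to $p_j$ is set to zero, and the same argument applies to $q_j$.

```latex
\begin{aligned}
\frac{\partial}{\partial p_j}\left[\sum_{i : D_{ij} = 1} W_i \ln p_j + \sum_{i : D_{ij} = 0} W_i \ln (1 - p_j)\right]
&= \frac{\sum_{i : D_{ij} = 1} W_i}{p_j} - \frac{\sum_{i : D_{ij} = 0} W_i}{1 - p_j} = 0 \\
\Rightarrow\quad \hat{p}_j
&= \frac{\sum_{i : D_{ij} = 1} W_i}{\sum_{i : D_{ij} = 1} W_i + \sum_{i : D_{ij} = 0} W_i} \\[6pt]
\frac{\partial}{\partial q_j}\left[\sum_{i : D_{ij} = 0} (1 - W_i) \ln q_j + \sum_{i : D_{ij} = 1} (1 - W_i) \ln (1 - q_j)\right]
&= \frac{\sum_{i : D_{ij} = 0} (1 - W_i)}{q_j} - \frac{\sum_{i : D_{ij} = 1} (1 - W_i)}{1 - q_j} = 0 \\
\Rightarrow\quad \hat{q}_j
&= \frac{\sum_{i : D_{ij} = 0} (1 - W_i)}{\sum_{i : D_{ij} = 1} (1 - W_i) + \sum_{i : D_{ij} = 0} (1 - W_i)}
\end{aligned}
```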
Slide 9
Expert Performance Estimators
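$$
\hat{p}_j = \frac{\sum_{i : D_{ij} = 1} W_i}{\sum_{i : D_{ij} = 1} W_i + \sum_{i : D_{ij} = 0} W_i}
$$

$$
\hat{q}_j = \frac{\sum_{i : D_{ij} = 0} (1 - W_i)}{\sum_{i : D_{ij} = 1} (1 - W_i) + \sum_{i : D_{ij} = 0} (1 - W_i)}
$$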
p (sensitivity, true positive fraction) : ratio of expert identified class 1 to total class 1 in the image.
q (specificity, true negative fraction) : ratio of expert identified class 0 to total class 0 in the image.
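Putting the two steps together, the iteration can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' implementation: the function name `staple`, the stopping rule, and the clipping constant `eps` (added to avoid a zero denominator when an estimate reaches exactly 0 or 1) are all assumptions, as is the toy data.

```python
def staple(D, prior, p0=0.9, q0=0.9, max_iter=100, tol=1e-8, eps=1e-6):
    """Alternate the W_i update and the p_j, q_j update until the estimates settle.

    D[i][j] is the binary decision of expert j at voxel i; prior is g(T_i = 1).
    """
    n_vox, n_exp = len(D), len(D[0])
    p, q = [p0] * n_exp, [q0] * n_exp
    W = [prior] * n_vox
    for _ in range(max_iter):
        # Estimate the hidden truth from the previous expert parameters.
        for i in range(n_vox):
            a, b = prior, 1.0 - prior
            for j in range(n_exp):
                if D[i][j] == 1:
                    a *= p[j]
                    b *= 1.0 - q[j]
                else:
                    a *= 1.0 - p[j]
                    b *= q[j]
            W[i] = a / (a + b)
        # Estimate expert parameters from the current truth estimate.
        # The two denominator sums together cover every voxel, so they
        # equal sum(W) and sum(1 - W) respectively.
        sum_w = sum(W)
        sum_not_w = n_vox - sum_w
        new_p, new_q = [], []
        for j in range(n_exp):
            pj = sum(W[i] for i in range(n_vox) if D[i][j] == 1) / sum_w
            qj = sum(1.0 - W[i] for i in range(n_vox) if D[i][j] == 0) / sum_not_w
            new_p.append(min(max(pj, eps), 1.0 - eps))  # assumption: clip for stability
            new_q.append(min(max(qj, eps), 1.0 - eps))
        change = max(abs(x - y) for x, y in zip(new_p + new_q, p + q))
        p, q = new_p, new_q
        if change < tol:
            break
    return W, p, q


# Hypothetical data: experts 1 and 2 agree everywhere; expert 3 misses
# voxel 3 and adds a false positive at voxel 9.
D = [[1, 1, 1]] * 3 + [[1, 1, 0]] + [[0, 0, 0]] * 5 + [[0, 0, 1]]
W, p, q = staple(D, prior=0.4)
print([round(w, 3) for w in W])
print([round(x, 3) for x in p], [round(x, 3) for x in q])
```

On this toy input the estimate for voxel 3 follows the two agreeing experts, and the third expert ends up with lower sensitivity and specificity than the other two.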
Slide 10
Results
• Synthetic expert segmentations of known ground truth, specified performance parameters.
• Prostate peripheral zone segmentation evaluation.
• Brain tumor segmentation evaluation.
• Knee femoral cartilage segmentation evaluation.
Slide 11
Synthetic Experts
• Several experiments with known ground truth and known performance parameters.
• Goal:
  – Determine if STAPLE accurately identifies known ground truth.
  – Determine if STAPLE accurately determines known expert performance parameters.
  – Understand sensitivity of STAPLE with respect to changes in prior hyper-parameters; requirements for number of observations to enable good estimation; convergence characteristics.
Slide 12
Synthetic Experts
10 segmentations by experts with p=0.95, q=0.90

STAPLE p,q estimates:

  mean p        0.950104
  std. dev p    0.001201
  mean q        0.900035
  std. dev q    0.001685

[Figure panels: four segmentations of ten shown; STAPLE ground truth.]
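Synthetic experts of this kind can be generated by flipping each voxel of a known truth image with probabilities set by the chosen sensitivity and specificity. A minimal sketch, in which the image size, the foreground fraction, and the random seed are hypothetical because the slides do not state them:

```python
import random


def simulate_expert(truth, p, q, rng):
    """Draw one synthetic segmentation with sensitivity p and specificity q."""
    decisions = []
    for t in truth:
        if t == 1:
            # A foreground voxel is labelled correctly with probability p.
            decisions.append(1 if rng.random() < p else 0)
        else:
            # A background voxel is labelled correctly with probability q.
            decisions.append(0 if rng.random() < q else 1)
    return decisions


rng = random.Random(0)                 # fixed seed, for repeatability only
truth = [1] * 2000 + [0] * 8000        # hypothetical flattened truth image
experts = [simulate_expert(truth, 0.95, 0.90, rng) for _ in range(10)]

# Empirical rates of each synthetic expert against the known truth.
emp_p = [sum(d for d, t in zip(e, truth) if t == 1) / 2000 for e in experts]
emp_q = [sum(1 - d for d, t in zip(e, truth) if t == 0) / 8000 for e in experts]
print(emp_p)
print(emp_q)
```

The empirical rates scatter around the specified values, so an estimator can be checked against both the known truth and the known parameters.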
Slide 13
Synthetic Experts
Three segmentations differing by horizontal displacement.

Initialize STAPLE with pj=qj=0.90; two experiments with different global priors.

STAPLE results:

            g(Ti=1) = 0.12    g(Ti=1) = 0.50
  p1,q1     1.0, 1.0          0.66, 1.0
  p2,q2     .88, .99          0.66, 1.0
  p3,q3     .88, .99          0.66, 1.0
Slide 14
Prostate Peripheral Zone
  Expert    1       2       3       4       5
  pj        .879    .991    .937    .918    .895
  qj        .998    .994    .999    .999    .999
  Dice      .913    .951    .967    .955    .944

[Figure panels: frequency of selection by experts; STAPLE truth estimate.]
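The Dice row can be read as an overlap score between an expert segmentation and a binary reference. The slide does not say which reference was used; the sketch below assumes the STAPLE estimate thresholded at 0.5, and the weights and expert decisions are made-up values for illustration:

```python
def dice(a, b):
    """Dice similarity coefficient of two binary label lists."""
    intersection = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    # Two empty segmentations are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical STAPLE weights W_i for six voxels.
W = [0.99, 0.97, 0.80, 0.30, 0.02, 0.01]
reference = [1 if w >= 0.5 else 0 for w in W]  # assumption: threshold at 0.5
expert = [1, 1, 0, 0, 0, 0]
score = dice(expert, reference)
print(score)  # 2 * 2 / (2 + 3) = 0.8
```

Unlike the pair (pj, qj), Dice collapses performance into a single number and ignores the true negatives.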
Slide 15
Tumor Segmentation Evaluation
[Figure panels: MR image; tumor region; experts; STAPLE.]

            1         2         3         auto
  pj        0.8951    0.9993    0.9986    0.9063
  qj        0.9999    0.9857    0.9982    0.9990
Slide 17
Conclusion
• Key advantages of STAPLE
  – Estimates “true” segmentation.
  – Assesses expert performance.
• Principled mechanism which enables
  – Comparison of different experts,
  – Comparison of algorithm and experts.
• Extensions
  – Non-stationary prior probability g(Ti=1).
  – Neighborhood model (MRF) for coherent spatial structure of ground truth.
  – Incorporate multiple observations by experts.
  – Priors for expert sensitivity, specificity.
Slide 18
Acknowledgements
Data for this study was provided by:
• Peter M. Black.
• Ferenc A. Jolesz.
• Ron Kikinis.
• Lawrence Panych.
• Martha Shenton.
• Clare Tempany.
• Carl Winalski.
• Michael Kaus.

This study was supported by:
• The Whitaker Foundation
• Center for the Integration of Medicine and Innovative Technology
• NIH P41 RR13218, P01 CA67165, R01 RR11747, R01 CA86879, R33 CA99015, R21 CA89449.
Slide 19
Relaxing the voxel independence assumption: MRF model of local coherency.
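$$
\begin{aligned}
g(\mathbf{T} \mid \mathbf{D}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})
&= \frac{g(\mathbf{D} \mid \mathbf{T}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})\, g(\mathbf{T})}{\sum_{\mathbf{T}} g(\mathbf{D} \mid \mathbf{T}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})\, g(\mathbf{T})} \\
&= \frac{\prod_i \left[\prod_j g(D_{ij} \mid T_i, \hat{p}_j^{o}, \hat{q}_j^{o})\right] g(T_i)\, g(T_i \mid T_{\partial i})}{\sum_{\mathbf{T}} \prod_i \left[\prod_j g(D_{ij} \mid T_i, \hat{p}_j^{o}, \hat{q}_j^{o})\right] g(T_i)\, g(T_i \mid T_{\partial i})}
\end{aligned}
$$

where $i$ indexes over voxels and $j$ over experts. For each voxel $i$:

$$
g(T_i \mid \mathbf{D}_i, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o})
= \frac{\prod_j g(D_{ij} \mid T_i, \hat{p}_j^{o}, \hat{q}_j^{o})\, g(T_i)\, g(T_i \mid T_{\partial i})}{\sum_{T_i} \prod_j g(D_{ij} \mid T_i, \hat{p}_j^{o}, \hat{q}_j^{o})\, g(T_i)\, g(T_i \mid T_{\partial i})}
$$

where $g(T_i \mid T_{\partial i})$ is the prior probability of $T_i$ given the true segmentation of the neighbors of voxel $i$.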
Slide 20
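MAP estimation with MRF prior

$$
g(T_i \mid \mathbf{D}_i, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o}) \propto \prod_j g(D_{ij} \mid T_i, \hat{p}_j^{o}, \hat{q}_j^{o})\, g(T_i)\, g(T_i \mid T_{\partial i})
$$

Up to an additive constant and a positive rescaling of the coupling weights:

$$
\log g(\mathbf{T} \mid \mathbf{D}, \hat{\mathbf{p}}^{o}, \hat{\mathbf{q}}^{o}) \propto \sum_i \log\left(\prod_j g(D_{ij} \mid T_i, \hat{p}_j^{o}, \hat{q}_j^{o})\, g(T_i)\right) + \sum_k \sum_l \beta_{kl} \left(T_k T_l + (1 - T_k)(1 - T_l)\right)
$$

where $\beta_{kl} > 0$ iff voxels $k, l$ are neighbors.

Greig et al. 1989: solve for $T_i \; \forall i$ with Ford-Fulkerson, maximizing

$$
\sum_i \lambda_i T_i + \frac{1}{2} \sum_k \sum_l \beta_{kl} \left(T_k T_l + (1 - T_k)(1 - T_l)\right)
$$

where $\lambda_i$ is the log odds of $T_i = 1$ given the expert decisions, with the prior $g(T_i)$ included:

$$
\begin{aligned}
\lambda_i
&= \log \frac{g(\mathbf{D}_i \mid T_i = 1, \mathbf{p}, \mathbf{q})\, g(T_i = 1)}{g(\mathbf{D}_i \mid T_i = 0, \mathbf{p}, \mathbf{q})\, g(T_i = 0)} \\
&= \log \frac{\prod_{j : D_{ij} = 1} \hat{p}_j \prod_{j : D_{ij} = 0} (1 - \hat{p}_j)\; g(T_i = 1)}{\prod_{j : D_{ij} = 0} \hat{q}_j \prod_{j : D_{ij} = 1} (1 - \hat{q}_j)\; g(T_i = 0)} \\
&= \log\left(W_i / (1 - W_i)\right).
\end{aligned}
$$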
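To make the MAP objective concrete, the sketch below evaluates it by exhaustive search on a tiny one-dimensional chain of voxels. Brute force is swapped in here purely for illustration; it is not the Ford-Fulkerson max-flow solution cited on the slide and is only feasible for a handful of voxels. The function name, the weights `W`, and the coupling value 1.0 are assumptions.

```python
import math
from itertools import product


def map_labeling_bruteforce(W, beta):
    """Maximize sum_i lambda_i T_i + (1/2) sum_k sum_l beta_kl (T_k T_l + (1-T_k)(1-T_l)).

    W:    per-voxel weights W_i, each strictly between 0 and 1
    beta: dict mapping an unordered neighbor pair (k, l) to beta_kl > 0
    """
    lam = [math.log(w / (1.0 - w)) for w in W]  # lambda_i = log(W_i / (1 - W_i))
    best, best_score = None, -math.inf
    for T in product((0, 1), repeat=len(W)):
        score = sum(l * t for l, t in zip(lam, T))
        # Listing each unordered pair once equals half the double sum
        # over ordered pairs when beta is symmetric.
        score += sum(b * (T[k] * T[l] + (1 - T[k]) * (1 - T[l]))
                     for (k, l), b in beta.items())
        if score > best_score:
            best, best_score = T, score
    return list(best)


# Hypothetical weights: voxel 2 is weakly "background" amid confident foreground.
W = [0.9, 0.9, 0.4, 0.9, 0.1, 0.1]
chain = {(i, i + 1): 1.0 for i in range(len(W) - 1)}  # neighbors on a 1-D chain
unsmoothed = map_labeling_bruteforce(W, {})           # no coupling: threshold at 0.5
smoothed = map_labeling_bruteforce(W, chain)
print(unsmoothed, smoothed)
```

With no coupling the result equals thresholding $W_i$ at 0.5; with the coupling term the isolated weak voxel takes the label of its neighbors, which is the local coherency the MRF prior is meant to encode.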
Slide 21
Synthetic Experts
Only three segmentations by different quality experts. With MRF prior.

STAPLE p,q estimates:

  Expert    True p, q       STAPLE estimate p, q
  1         0.95, 0.95      0.9505, 0.9494
  2         0.95, 0.90      0.9511, 0.8987
  3         0.90, 0.90      0.9000, 0.8987

[Figure panels: the three expert segmentations; STAPLE ground truth.]