plugin-complete mi tutorial
MICCAI 2009 Tutorial
Information theoretic similarity measures for image registration and segmentation
Sunday - 20th September
14:00-14:30: William M Wells III
Alignment by maximization of mutual information
This talk will summarize the historical emergence of the mutual information (MI) approach to image registration. Subsequently, it will describe how it was developed, implemented and evaluated, primarily from the perspective of the MIT / Harvard Medical School group.
• D. Hill, Studholme, C., and Hawkes, D. Voxel Similarity Measures for Automated Image Registration. VBC 1994.
• Collignon, A., Vandermeulen, D., Suetens, P., and Marchal, G. 3d multi-modality medical image registration using feature space clustering. CVRMED 1995.
• Viola, P. and Wells, W. Alignment by maximization of mutual information. In Proceedings of the 5th International Conference of Computer Vision, 1995.
• Viola, P. Alignment by maximization of Mutual Information. MIT PhD Thesis, 1995.
• Wells WM, Viola P, Atsumi H, Nakajima S, Kikinis R. Multi-Modal Volume Registration by Maximization of Mutual Information. Medical Image Analysis, 1(1):35-51, 1996.
• Viola P, Wells WM. Alignment by maximization of mutual information. International Journal of Computer Vision. 1997;24:137-154.
• West JB, Fitzpatrick JM, et al. Comparison and evaluation of retrospective intermodality image registration techniques. JCAT 1997.
• C. Studholme, D.L.G. Hill, D.J. Hawkes, An Overlap Invariant Entropy Measure of 3D Medical Image Alignment, Pattern Recognition, Vol. 32(1), Jan 1999, pp 71-86.
14:30-15:00: Frederik Maes
Multimodality image registration by maximization of mutual information
This talk will present the concept of MI for multimodality image registration from the Leuven perspective. It will discuss implementation issues such as histogram binning, interpolation and optimization, validation of robustness and accuracy, and also some limitations of the MI criterion in real world applications.
• A. Collignon, F. Maes, D. Delaere, D. Vandermeulen, P. Suetens, G. Marchal, Automated multi-modality image registration based on information theory, Proc. IPMI 1995
• F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, P. Suetens, Multimodality image registration by maximization of mutual information, IEEE TMI, 16(2):187-198, 1997
• West JB, Fitzpatrick JM, et al. Comparison and evaluation of retrospective intermodality image registration techniques. JCAT 1997.
• F. Maes, Segmentation and registration of multimodal medical images: from theory, implementation and validation to a useful tool in clinical practice, K.U.Leuven PhD Thesis, 1998
• F. Maes, D. Vandermeulen, P. Suetens, Comparative evaluation of multiresolution optimization strategies for multimodality image registration by maximization of mutual information, Medical image analysis, 3(4):373-386, 1999
• F. Maes, D. Vandermeulen, P. Suetens, Medical image registration using mutual information, Proc IEEE - special issue on emerging medical imaging technology, 91(10):1699-1722, 2003
15:00-15:30: Josien Pluim
Aspects of mutual information-based image registration
After the introduction of mutual information for medical image registration and the promising first results, the focus shifted towards implementation issues (such as preprocessing, interpolation artifacts and related solutions) and the suitability of information measures in general for image registration.
• J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Mutual information matching in multiresolution contexts, Image Vis. Comput., 19(1-2), pp 45-52, 2001
• J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Interpolation artefacts in mutual information based image registration, Comput. Vis. Image Underst., 77(2), pp 211-232, 2000
• J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, f-Information measures in medical image registration, IEEE Trans. Med. Imaging, 23(12), pp 1508-1516, 2004
• S. Klein, M. Staring, J.P.W. Pluim, Evaluation of optimisation methods for nonrigid medical image registration using mutual information and B-splines, IEEE Trans. Image Process., 16(12), pp 2879-2890, 2007
15:30-15:45: Break
15:45-16:15: William M Wells III
Probabilistic and information-theoretic approaches to registration
This talk will describe more recent approaches to pair-wise and group-wise registration that are based on generative models of images and information theory, with an emphasis on explicit and implicit modeling assumptions and the interconnections among the methods. Topics will include optimality of the MI criteria, the inclusion of controlled amounts of domain-specific information about image intensities, and use of the EM algorithm for registration and model estimation.
• A. Roche, G. Malandain, and N. Ayache. Unifying maximum likelihood approaches in medical image registration. International Journal of Imaging Systems and Technology, 11(1):71–80, 2000
• Zollei L, Fisher J, Wells W. A Unified Statistical and Information Theoretic Framework for Multi-modal Image Registration. Image Processing in Medical Imaging 2003, Ambleside, UK, 2003.
• Erik Learned-Miller, (2005) Data Driven Image Models through Continuous Joint Alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).
• Zollei L, Learned-Miller E, Grimson WEL, Wells W. Efficient Population Registration of 3D Data. Proc ICCV 2005, Computer Vision for Biomedical Image Applications, Beijing, China 2005
• Zöllei L, Wells W. Multi-modal Image Registration Using Dirichlet-encoded Prior Information. Third International Workshop on Biomedical Image Registration, Utrecht, 2006.
• Zollei L, Jenkinson M, Timoner S, Wells W. A Marginalized MAP Approach and EM Optimization for Pair-Wise Registration. Proc. IPMI, Kerkrade Netherlands, 2007.
16:15-16:45: Frederik Maes
Incorporating local context in mutual information based registration: spatial and voxel label information
MI based registration maximizes the statistical correlation between different images without assuming a specific intensity relationship between them. While this has been shown to be a major benefit over other more informed methods in case of affine registration applications, MI of voxel intensities may not be optimally suited for non-rigid registration due to ambiguity in the local intensity information. Inclusion of local context can help to make MI based non-rigid registration more robust. Two different strategies are presented here: one based on including local spatial information (‘conditional MI’) and one based on voxel label information. Voxel label information assumes a prior segmentation of at least one of the images to be registered. If one of the images is an atlas, non-rigid atlas registration and atlas-based segmentation can be combined in a unified framework based on the Expectation-Maximization algorithm.
• D. Loeckx, P. Slagmolen, F. Maes, D. Vandermeulen, P. Suetens, Nonrigid image registration using conditional mutual information, Proc. IPMI 2007
• D. Loeckx, P. Slagmolen, F. Maes, D. Vandermeulen, P. Suetens, Nonrigid image registration using conditional mutual information, IEEE TMI, 2009 (in press)
• E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, An information theoretic approach for non-rigid image registration using voxel class probabilities, Medical image analysis, 10(3):413-431, 2006
• E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, A unified framework for atlas based brain image segmentation and registration, Proc. WBIR 2006
• Ashburner, J., Friston, K.: Unified segmentation. NeuroImage 26 (2005) 839–851
• Pohl, K., Fisher, J., Grimson, W., Kikinis, R., Wells, W.: A bayesian model for joint segmentation and registration. NeuroImage 31 (2006) 228–239
16:45-17:15: Josien Pluim
Incorporating spatial information in mutual information-based registration
This part will continue the theme of including additional information in mutual information-based image registration. Examples covered are the combination of gradient and mutual information, higher-order mutual information and multifeature mutual information.
• J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Image registration by maximization of combined mutual information and gradient information, IEEE Trans. Med. Imaging, 19(8), pp 809-814, 2000
• D. Rueckert, M. J. Clarkson, D. L. G. Hill, D. J. Hawkes, Non-rigid registration using higher-order mutual information, in Medical Imaging: Image Processing, K. M. Hanson, Ed. 2000, vol. 3979 of Proc. SPIE, pp. 438–447, SPIE Press, Bellingham, WA.
• H. F. Neemuchwala, A. Hero, P. Carson, Image matching using alpha-entropy measures and entropic graphs, Signal Processing, vol. 85, no. 2, pp. 277–296, 2005.
• M. Staring, U.A. van der Heide, S. Klein, M.A. Viergever, J.P.W. Pluim, Registration of cervical MRI using multifeature mutual information, IEEE Trans. Med. Imaging, in press
17:15-17:30: Concluding remarks
1
Alignment by Maximization of Mutual Information
William Wells
Associate Professor of Radiology
Surgical Planning Laboratory
Harvard Medical School and Brigham and Women’s Hospital
Affiliated Faculty: Harvard – MIT Division of Health Sciences and Technology
Research Scientist – MIT CSAIL
2
Summary
• Historical Emergence of MI registration approach
• Development, implementation, and evaluation
• MIT / Harvard Medical School perspective
3
Antecedents
• Voxel Similarity Measures for Automated Image Registration. VBC 1994
– Hill D., Studholme, C., and Hawkes, D.
– Meeting at Mayo Clinic, Oct 4 – 7 1994
– 3rd order moments (and other) measures
– MOVIE by Colin Studholme…
4
Joint Scatter of MRI and CT
Movie prepared by Colin Studholme
5
Early Entropy / MI Registration
• Minimum Entropy and Registration:
– Collignon A., Vandermeulen, D., Suetens, P., and Marchal, G. 3d multi-modality medical image registration using feature space clustering. CVRMED April 1995.
• Maximum Mutual Information Registration
– Viola, P. and Wells, W. Alignment by maximization of mutual information. In Proceedings of the 5th International Conference of Computer Vision, June 20 – 23, 1995.
– Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P, Marchal G, Automated multi-modality image registration based on information theory. IPMI June 26, 1995.
– Viola, P. Alignment by maximization of Mutual Information. MIT PhD Thesis, June 1995.
6
More MI Registration (Journal Articles)
• Wells WM, Viola P, Atsumi H, Nakajima S, Kikinis R. Multi-Modal Volume Registration by Maximization of Mutual Information. Medical Image Analysis, 1(1):35-51, 1996. (1761*)
• Viola P, Wells WM. Alignment by maximization of mutual information. International Journal of Computer Vision. 1997;24:137-154. (1011*)
• Maes F, Collignon A,Vandermeulen D, Marchal G, Suetens P. Multimodality image registration by maximization of mutual information, IEEE TMI, 16(2):187-198, 1997 (2005*)
* Google scholar citation counts sept, 2009
7
Medical Image Registration
[Diagram: registration loop. Blocks: medical image data sets; transform (move around); compare with objective function; optimization algorithm. Labels: initial value, motion parameters, score.]
8
Notation
• Images: u(x), v(x)
• Transformation (deformation model): T(x)
• arg max notation: $\arg\max_x f(x)$
– “the value of x that maximizes f”
– similar for min
9
Minimum Joint Entropy Registration
• find the transformation that minimizes the joint entropy of images u and v under transformations T
$$\hat{T} = \arg\min_T H[u(x), v(T(x))]$$
10
Entropy of Images
• histogram the joint data
• calculate the entropy of the histogram
– (normalize the histogram)
$$H[u(x), v(T(x))]$$
11
Entropy of Images…
• histogram the joint data
• estimate (the parameters of) a distribution p on the pairs (u(x), v(T(x)))
• calculate the entropy of that distribution
$$H[p(u(x), v(T(x)))]$$
12
Multimodal Inputs
• correlation fails for multi-modal registration when intensities are different; e.g., MR-CT
• one solution: apply a special intensity transform to the MRI to make it look more like CT; then compute the correlation measure
Petra A. van den Elsen. Multimodality Matching of Brain Images. PhD thesis, Utrecht University, The Netherlands, 1992.
Petra A. van den Elsen. Retrospective fusion of CT and MR brain images using mathematical operators. In Applications of Computer Vision in Medical Image Processing, Spring Symposium Series. AAAI, March 1994.
13
MR-CT situation
[Figure: joint intensity plot with axes CT and MR; regions labeled air, bone, white matter, gray matter, CSF, fat]
14
Histograms
Data: 1.4, 3.4, 1.0, 3.0, 1.3, 1.6, 1.0, 10.0, 1.2, …
Bins or buckets: 0 1 2 3 4 5 6 7 8 9 10
Counts: 6 in bin 1, 3 in bin 3, 1 in bin 10, 0 in all other bins
Rel. frequency: 6/10 (bin 1), 3/10 (bin 3), 1/10 (bin 10)
15
Histogram Joint Intensity of Images
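Images (3 x 3 intensities, listed row by row):
U: 1 2 1 / 1 2 1 / 1 2 1
V: 2 2 2 / 3 3 3 / 2 2 2
Joint intensities: (1,2) (2,2) (1,2) (1,3) (2,3) (1,3) (1,2) (2,2) (1,2)
Joint histogram, relative freq. (axes X = U, Y = V):
V = 3: 2/9 at U = 1, 1/9 at U = 2
V = 2: 4/9 at U = 1, 2/9 at U = 2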
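The joint-intensity example above can be reproduced in a few lines. This is only a sketch: the function name `joint_relative_frequencies` is made up for illustration, and the images are stored as flat lists read row by row.

```python
from collections import Counter
from fractions import Fraction

def joint_relative_frequencies(u, v):
    """Histogram the intensity pairs (u_i, v_i) and normalize by the pixel count."""
    if len(u) != len(v):
        raise ValueError("images must have the same number of pixels")
    counts = Counter(zip(u, v))          # joint histogram of intensity pairs
    n = len(u)
    return {pair: Fraction(c, n) for pair, c in counts.items()}

# The 3 x 3 example images U and V from the slide, flattened row by row.
U = [1, 2, 1, 1, 2, 1, 1, 2, 1]
V = [2, 2, 2, 3, 3, 3, 2, 2, 2]
freq = joint_relative_frequencies(U, V)
for pair in sorted(freq):
    print(pair, freq[pair])
```

Exact fractions are used so that the printed values can be compared directly with the 4/9, 2/9, 2/9 and 1/9 entries of the slide.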
16
MRI & CT pairs
17
Joint histogram: MRI & CT registered
[Figure: joint histograms with axes MRI and CT; the (1 1) entry is suppressed for clarity]
18
Joint histogram: MRI & CT; slightly off
[Figure: joint histograms with axes MRI and CT. Panels: Correct registration; Slight mis-registration]
19
Joint histogram: MRI & CT; significantly off
[Figure: joint histograms with axes MRI and CT. Panels: Correct registration; Significant mis-registration]
20
Entropy
• entropy is a measure of the uncertainty or randomness in a random variable
• it is the minimum average length of a message that describes the result of the experiment characterized by p(x)
$$H[p(x)] \doteq E_x\left[\log \frac{1}{p(x)}\right] = -\sum_x p(x) \log p(x)$$
21
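entropy:
• predictable coin: no uncertainty, lowest entropy
• biased coin: moderate uncertainty, moderate entropy
• fair coin: most uncertain, highest entropy
[Figure: bar charts of p over the outcomes H and T for the three coins; vertical axis from 0 to 1 with .5 marked]
$$H[p(x)] = -\sum_x p(x) \log p(x)$$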
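As a quick numerical check of the coin example, the entropy formula can be evaluated directly. This is a minimal sketch: the 0.9 / 0.1 split for the biased coin is an assumed value (the slide gives none), and entropies are reported in bits.

```python
import math

def entropy(p, base=2.0):
    """Shannon entropy -sum p log p of a discrete distribution (0 log 0 taken as 0)."""
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(-q * math.log(q, base) for q in p if q > 0.0)

predictable = entropy([1.0, 0.0])   # always heads
biased = entropy([0.9, 0.1])        # assumed bias, for illustration only
fair = entropy([0.5, 0.5])
print(predictable, biased, fair)
```

The ordering predictable < biased < fair matches the slide: the fair coin is the most uncertain and reaches the maximum of one bit.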
22
Examples of joint intensity distributions
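[Figure: two joint intensity distributions over axes U and V]
• four occupied cells, each with probability 1/4: $H = \log_2(4)$
• six occupied cells, each with probability 1/6: $H = \log_2(6)$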
23
Maximum Mutual Information Registration
find the transformation that maximizes the mutual information of images u and v under transformations T
$$\hat{T} = \arg\max_T I[u(x), v(T(x))]$$
mutual information definition:
$$I[x, y] \doteq H[x] + H[y] - H[x, y]$$
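To make the criterion concrete, here is a small histogram-based sketch of I = H[x] + H[y] - H[x, y] together with a brute-force search over transformations. Everything in it is illustrative: the signals are 1-D, the transformation family is a circular shift, and the names `mutual_information` and `best_shift` are invented for this example; it is not the estimator or optimizer used in the work described here.

```python
import math
from collections import Counter

def entropy_of_counts(counts):
    """Entropy in bits of the normalized histogram stored in a Counter."""
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mutual_information(u, v):
    """I[u, v] = H[u] + H[v] - H[u, v], estimated from histograms of paired samples."""
    if len(u) != len(v):
        raise ValueError("need one v sample per u sample")
    return (entropy_of_counts(Counter(u))
            + entropy_of_counts(Counter(v))
            - entropy_of_counts(Counter(zip(u, v))))

def best_shift(u, v, shifts):
    """Return the circular shift t that maximizes I[u(x), v(x + t)]."""
    n = len(v)
    return max(shifts, key=lambda t: mutual_information(
        u, [v[(i + t) % n] for i in range(n)]))

# Toy "multimodal" pair: v is u with remapped intensities, displaced by 2 samples.
u_sig = [0, 0, 1, 3, 3, 2, 1, 0]
remap = {0: 9, 1: 4, 2: 7, 3: 1}
v_sig = [remap[u_sig[(i - 2) % len(u_sig)]] for i in range(len(u_sig))]
print(best_shift(u_sig, v_sig, range(len(u_sig))))
```

Although no intensity in `v_sig` equals its counterpart in `u_sig`, the shift that restores the one-to-one intensity relationship gives the largest MI, which is the reason the criterion suits multimodal data.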
24
Registration of Video and 3D Model
Paul Viola MIT PhD Thesis 1996
25
MR-CT Registration
• 1996
• EMMA
• Stochastic Gradient Descent
26
2D-3D Rigid-body Registration of X-Ray Fluoroscopy and CT
• Motivating applications: Image Guided Surgery (IGS)
– 3D Roadmapping (Neuro catheter procedure)
– Orthopedics (total hip replacement, revision surgery, spine procedures, metastatic bone cancer)
L. Zöllei: "2D-3D Rigid-Body Registration of X-Ray Fluoroscopy and CT Images", Masters Thesis, MIT AI Lab, August 2001.
L. Zöllei, E. Grimson, A. Norbash, W. Wells: "2D-3D Rigid Registration of X-Ray Fluoroscopy and CT Images Using Mutual Information and Sparsely Sampled Histogram Estimators", IEEE CVPR, 2001.
27
2D-3D Rigid Registration
Problem: find T
28
Gage Before
Provided by L. Zollei
Skull: Before Registration
29
Provided by L. Zollei
Skull: After Registration
30
Plastic Pelvis: Before Registration
Provided by L. Zollei
31
Plastic Pelvis: After Registration
Provided by L. Zollei
32
MI-based Audio / Video Fusion
Learning Joint Statistical Models for Audio-Visual Fusion and Segregation
John Fisher et al.
NIPS 2000
33
Video – Audio Joint Statistics
34
Video – Audio MI
35
MI Tracking with Graphics Hardware
• MI registration
• apparent surface normals to video intensity
• gradient descent with “differentiated histogram”
Wells W, Halle M, Kikinis R, Viola P. Alignment and Tracking using Graphics Hardware. Image Understanding Workshop, 1995.
36
MI Tracking with Graphics Hardware
37
MI Tracking with Graphics Hardware
• SUN ffb graphics board
– colored lights -> surface normals calculated in hardware
• SUN SPARC ULTRA
• 4 Hz iteration rate (1995)
38
Normalised Mutual Information
• analyzed effect of changes in image overlap
• showed improved behavior in synthetic and clinical images over range of fields of view
C. Studholme, D.L.G. Hill, D.J. Hawkes, An Overlap Invariant Entropy Measure of 3D Medical Image Alignment, Pattern Recognition, Vol. 32(1), Jan 1999, pp 71-86.
$$\mathrm{NMI}(u, v) \doteq \frac{H(u) + H(v)}{H(u, v)}$$
39
EMMA entropy estimator
$$H[x] \doteq -E_x[\log(p(x))] \approx -\frac{1}{|A|} \sum_{x_i \in A} \log(p(x_i))$$
; A: sample of data; weak law of large numbers
$$p(x) \approx \frac{1}{|B|} \sum_{x_j \in B} K(x - x_j)$$
; B: sample of data; K: kernel; Parzen density estimation
$$H[x] \approx -\frac{1}{|A|} \sum_{x_i \in A} \log\left(\frac{1}{|B|} \sum_{x_j \in B} K(x_i - x_j)\right)$$
; EMMA entropy estimator
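The estimator above translates almost line by line into code. The sketch below assumes a 1-D Gaussian kernel with a hand-picked width `sigma`; both the kernel choice and the default width are assumptions made for this illustration, and the result is in nats.

```python
import math

def gaussian_kernel(d, sigma):
    """1-D Gaussian Parzen kernel K(d) with standard deviation sigma."""
    return math.exp(-0.5 * (d / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def emma_entropy(sample_a, sample_b, sigma=1.0):
    """H[x] ~ -(1/|A|) sum_{xi in A} log((1/|B|) sum_{xj in B} K(xi - xj))."""
    total = 0.0
    for xi in sample_a:
        # Parzen density estimate at xi, built from sample B.
        density = sum(gaussian_kernel(xi - xj, sigma) for xj in sample_b) / len(sample_b)
        total += math.log(density)
    return -total / len(sample_a)

tight = emma_entropy([0.0, 0.1], [0.0, 0.1])
spread = emma_entropy([0.0, 10.0], [0.0, 10.0])
print(tight, spread)   # the spread-out sample has the larger entropy estimate
```

In a registration setting, A and B would be small random subsamples of the intensity data, redrawn at every iteration.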
40
EMMA and Stochastic Gradient Descent
• Gradient descent – closed form derivative
• Samples A and B redrawn at each iteration
• |A| and |B| typically = 50 (very small subsamples)
• lightweight iterations
• noisy estimates of gradient of entropy, MI
• many iterations (~20K)
• tolerant of local minima: bounces out of them
$$H[x] \approx -\frac{1}{|A|} \sum_{x_i \in A} \log\left(\frac{1}{|B|} \sum_{x_j \in B} K(x_i - x_j)\right)$$
41
areas of non-overlap of images
• one strategy: restrict calculation to region of overlap
• MIT strategy:
– sample all of u(x)
– if T(x) falls outside valid part of v(x), substitute the value zero for the intensity v(T(x))
– v(x) surrounded by a “sea of black”
– resistant to “scalloping” of objective function
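The "sea of black" strategy can be sketched as follows. The helper `sample_pairs` and the integer translation are hypothetical and chosen so that no interpolation is needed; a real implementation would interpolate v at non-grid positions.

```python
def sample_pairs(u, v, transform):
    """Pair every pixel of u with v at the transformed position, using 0 outside v."""
    pairs = []
    for y, row in enumerate(u):
        for x, u_val in enumerate(row):
            tx, ty = transform(x, y)
            inside = 0 <= ty < len(v) and 0 <= tx < len(v[0])
            # Outside the valid part of v, substitute intensity zero.
            pairs.append((u_val, v[ty][tx] if inside else 0))
    return pairs

def shift_right(x, y):
    """Toy transformation T: translate by one pixel along x."""
    return x + 1, y

u = [[1, 2], [3, 4]]
v = [[5, 6], [7, 8]]
pairs = sample_pairs(u, v, shift_right)
print(pairs)
```

Because every pixel of u always contributes a pair, the number of samples entering the histogram does not change with T, unlike the restrict-to-overlap strategy.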
42
Web Course
• MIT Open Courseware
• HST582: Biomedical Signal and Image Processing
• Lectures
• MATLAB
– registration with RIRE data
• http://ocw.mit.edu/OcwWeb/Health-Sciences-and-Technology/HST-582JSpring-2007/CourseHome/
43
software systems
• ITK: image processing C++ libraries
– funded by NIH/NLM
– well documented
• 3D Slicer
– incorporates ITK, has GUI, graphics
• FLIRT / FSL
– collection of independent C programs
– fMRI analysis
• SPM / AIR
– MATLAB, standardized defaults
44
References
• Hill D, Studholme, C, and Hawkes, D. Voxel Similarity Measures for Automated Image Registration. VBC 1994.
• Collignon A., Vandermeulen, D., Suetens, P., and Marchal, G. 3d multi-modality medical image registration using feature space clustering. CVRMED April 1995.
• Viola, P. and Wells, W.. Alignment by maximization of mutual information. In Proceedings of the 5th International Conference of Computer Vision, June 20 – 23, 1995.
• Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P, Marchal G, Automated multi-modality image registration based on information theory. IPMI June 26, 1995.
• Viola, P. Alignment by maximization of Mutual Information. MIT PhD Thesis, June 1995.
• Wells WM, Viola P, Atsumi H, Nakajima S, Kikinis R. Multi-Modal Volume Registration by Maximization of Mutual Information. Medical Image Analysis, 1(1):35-51, 1996.
• Viola P, Wells WM. Alignment by maximization of mutual information. International Journal of Computer Vision. 1997;24:137-154.
45
References
• Maes F, Collignon A,Vandermeulen D, Marchal G, Suetens P. Multimodality image registration by maximization of mutual information, IEEE TMI, 16(2):187-198, 1997.
• Petra A. van den Elsen. Multimodality Matching of Brain Images. PhD thesis, Utrecht University, The Netherlands, 1992.
• Petra A. van den Elsen. Retrospective fusion of ct and mr brain images using mathematical operators. In Applications of Computer Vision in Medical Image Processing, Spring Symposium Series. AAAI, March 1994.
• L . Zöllei: "2D-3D Rigid-Body Registration of X-Ray Fluoroscopy and CT Images", Masters Thesis, MIT AI Lab, August 2001.
• L . Zöllei, E. Grimson, A. Norbash, W. Wells: "2D-3D Rigid Registration of X-Ray Fluoroscopy and CT Images Using Mutual Information and Sparsely Sampled Histogram Estimators", IEEE CVPR, 2001.
• Wells W, Halle M, Kikinis R, Viola P. Alignment and Tracking using Graphics Hardware. Image Understanding Workshop, 1995.
• C. Studholme, D.L.G.Hill, D.J. Hawkes, An Overlap Invariant Entropy Measure of 3D Medical Image Alignment, Pattern Recognition, Vol. 32(1), Jan 1999, pp 71-86.
Multimodality image registration by maximization of mutual information
Frederik Maes
K.U. Leuven Dept. of Electrical Engineering (ESAT/PSI)
UZ Gasthuisberg Medical Imaging Research Center
Leuven, Belgium
Outline
MI from a Leuven perspective: (Collignon CVRMed 1995) Collignon IPMI 1995, Maes IEEE TMI 1997, Maes Proc IEEE 2003
• Motivation & inspiration
• Concept & interpretation
• Implementation
• Initial validation
• Some applications
The motivation: stereotactic neurosurgery planning
Prospective marker-based registration
The position of each image slice in 3D space is determined from the locations of the markers in the image.
Different images are registered based on their relative position in 3D space.
The problem: retrospective multimodality registration
MR/CT
MR/PET
MR/MR
Retrospective registration strategies
• For reviews, see e.g.: van den Elsen 1993, Maintz 1998, Zitova 2003
• Internal landmarks, e.g. Hill 1991, Rohr 1997
• Surface based registration, e.g. Borgefors 1988, Pelizzari 1989, Besl & McKay 1992
– requires segmentation
– difficult to automate and introduces inaccuracies
• Voxel based registration
– uses the intensity information directly, without need for segmentation
Voxel-based registration
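[Diagram: voxel p in image I1 is mapped by the transformation Tα to position q in image I2]
q = T(p), a = I1(p), b = I2(q)
[Figure panels: Registered (correct T); Not registered (incorrect T)]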
Intensity-based voxel similarity measures
• SSD
• Correlation
• Deterministic / stochastic sign change (Venot 1984)
• Cross-correlation after intensity remapping (van den Elsen 1994)
• Cross-correlation of edges/ridges (van den Elsen 1995, Maintz 1996)
Joint histogram: same modality
[Diagram: voxel p in I1 is mapped by Tα to q in I2; q = T(p), a = I1(p), b = I2(q)]
[Figure: joint histograms p(a,b) over axes a (I1) and b (I2). Panels: Registered (correct T); Not registered (incorrect T)]
Unimodal intensities a and b of corresponding voxels p and q of registered images I1 and I2 are likely to be similar
p(a,b) is clustered around the diagonal when registered
p(a,b) shows significant non-zero off-diagonal elements in case of misregistration
Joint histogram: different modalities
[Diagram: voxel p in I1 is mapped by Tα to q in I2; q = T(p), a = I1(p), b = I2(q)]
[Figure: joint histograms p(a,b) over axes a (I1) and b (I2). Panels: Registered (correct T); Not registered (incorrect T)]
Multimodal relationship between a and b is strongly data dependent
Caveat: only voxels in the region of overlap of both images are considered
p(a,b) depends on T through varying correspondence (p,q) and through varying region of overlap
Histogram-based voxel similarity measures
• Variance of intensity ratios (Woods 1993)
• Third order moment (Hill 1994)
• N-th order moment (Studholme IPMI 1995)
• Entropy (Collignon SPIE 1994)
Joint histogram & statistical dependence
[Figure: two joint histograms over axes I1 and I2 with the column I2 = b marked; p(a|I2=b) is more clustered in one panel and more dispersed in the other]
p(a|I2 = b) = likelihood of observing intensity a in I1 given that the intensity of the corresponding voxel in I2 is b
The more clustered p(a|I2=b), the less uncertainty there is about I1 given I2=b, thus the more information the knowledge of one value (I2=b) contains about the other (I1)
If p(a|I2=b) = p(a), knowledge of I2=b does not contain information about a
Information theory: mutual information
= a special case of the Kullback-Leibler divergence between two probabilities P and Q:
= (kind of) ‘distance’ (in bits) between the joint probability and the product of the marginals
= measure of the statistical dependence of two random variables:
A and B independent: p(a,b) = p(a).p(b), so I(A,B) = 0
A and B one-to-one related: p(a,b) = p(a) = p(b), so I(A,B) = H(A) = H(B) = entropy
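The formulas that accompany these statements are not present in the transcript. For reference, the standard textbook definitions (as in Cover & Thomas, cited in the reference list) are given below; this is a reconstruction, not a copy of the slide, and base-2 logarithms are assumed so that values are in bits.

```latex
D(P \,\|\, Q) = \sum_{x} P(x) \log_2 \frac{P(x)}{Q(x)}
\qquad
I(A,B) = D\big(p(a,b) \,\|\, p(a)\,p(b)\big)
       = \sum_{a,b} p(a,b) \log_2 \frac{p(a,b)}{p(a)\,p(b)}
```

With $p(a,b) = p(a)\,p(b)$ every logarithm vanishes, which gives $I(A,B) = 0$ for independent variables.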
Mutual information: interpretation
= amount of information that one variable contains about another
marginal entropy
conditional entropy
joint entropy
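The three labels above annotate formulas that did not survive extraction. The standard identities they refer to are sketched here as a reconstruction (notation chosen to match the surrounding slides, not copied from them):

```latex
\begin{aligned}
H(A)   &= -\sum_{a} p(a) \log_2 p(a)            && \text{(marginal entropy)} \\
H(A|B) &= -\sum_{a,b} p(a,b) \log_2 p(a|b)      && \text{(conditional entropy)} \\
H(A,B) &= -\sum_{a,b} p(a,b) \log_2 p(a,b)      && \text{(joint entropy)} \\
I(A,B) &= H(A) - H(A|B) = H(B) - H(B|A) = H(A) + H(B) - H(A,B)
\end{aligned}
```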
The mutual information registration criterion
Collignon CVRMED/IPMI 1995, Viola ICCV 1995:
“Mutual information is maximal at registration”
p,a q,b
Tα
A B
(α = registration parameters)
Example
Original CT MR Resampled CT
Example
I(CT,MR) = 0.52 I(CT,MR) = 0.86
Example
Interpretation
“Find as much of the complexity in the separate datasets (maximizing HA + HB) such that at the same time they explain each other well (minimizing HAB).”
HA + HB = number of bits required to optimally encode A and B separately
HAB = number of bits required to optimally encode A and B combined
IAB = (HA + HB) - HAB ≥ 0 because A and B contain redundant information
“Information redundancy is maximal at registration”
IAB(α) = HA(α) + HB(α) - HAB(α) (α = registration parameters)
MI versus joint entropy
• Minimization of joint entropy by itself does not work
• Indeed: HAB is minimal (zero) when the images do not overlap…
• The marginal entropies vary with varying image overlap
• Inclusion of the marginal entropies in MI is essential in order to assure that the region of overlap at the registration solution contains information
max (HA(α) + HB(α) - HAB(α)) ≠ min HAB(α)
MI versus normalized MI
• In case the overlap between the images is small at registration, maximizing HA+HB may prevail over minimizing HAB , leading to solutions that prefer larger overlap instead of better correspondence
• Normalization:
– Normalized mutual information (Studholme 1999)
– Entropy correlation coefficient (Maes 1997)
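The formulas for the two normalizations are missing from the transcript. As usually defined in the cited papers, and written here in the HA, HB, HAB notation of the surrounding slides (a reconstruction, not the slide itself):

```latex
\mathrm{NMI}(A,B) = \frac{H_A + H_B}{H_{AB}}
\qquad
\mathrm{ECC}(A,B) = \frac{2\, I_{AB}}{H_A + H_B} = 2 - \frac{2}{\mathrm{NMI}(A,B)}
```

The second equality follows from $I_{AB} = H_A + H_B - H_{AB}$, so the two measures are monotonically related and share the same optimum.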
MI as a measure of overlap
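[Figure: binary images I1 and I2, their joint probability p(i1,i2) and the marginal p2(i2)]
p(i1,i2): I2 = 0 row: 4/36 (I1 = 0), 5/36 (I1 = 1); I2 = 1 row: 5/36 (I1 = 0), 22/36 (I1 = 1)
Marginals: 1/4, 3/4
$$H_1 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$$
$$H_2 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$$
$$H_{12} = -\left(\tfrac{4}{36}\log_2\tfrac{4}{36} + 2\cdot\tfrac{5}{36}\log_2\tfrac{5}{36} + \tfrac{22}{36}\log_2\tfrac{22}{36}\right) = 1.577$$
$$I = H_1 + H_2 - H_{12} = 0.045$$
$$I = \tfrac{4}{36}\log_2\frac{4/36}{1/4 \cdot 1/4} + 2\cdot\tfrac{5}{36}\log_2\frac{5/36}{1/4 \cdot 3/4} + \tfrac{22}{36}\log_2\frac{22/36}{3/4 \cdot 3/4} = 0.045$$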
MI as a measure of overlap
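[Figure: binary images I1 and I2, their joint probability p(i1,i2) and the marginal p2(i2)]
p(i1,i2): I2 = 0 row: 6/36 (I1 = 0), 3/36 (I1 = 1); I2 = 1 row: 3/36 (I1 = 0), 24/36 (I1 = 1)
Marginals: 1/4, 3/4
$$H_1 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$$
$$H_2 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$$
$$H_{12} = -\left(\tfrac{6}{36}\log_2\tfrac{6}{36} + 2\cdot\tfrac{3}{36}\log_2\tfrac{3}{36} + \tfrac{24}{36}\log_2\tfrac{24}{36}\right) = 1.418$$
$$I = H_1 + H_2 - H_{12} = 0.204$$
$$I = \tfrac{6}{36}\log_2\frac{6/36}{1/4 \cdot 1/4} + 2\cdot\tfrac{3}{36}\log_2\frac{3/36}{1/4 \cdot 3/4} + \tfrac{24}{36}\log_2\frac{24/36}{3/4 \cdot 3/4} = 0.204$$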
MI as a measure of overlap
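[Figure: binary images I1 and I2, their joint probability p(i1,i2) and the marginal p2(i2)]
p(i1,i2): I2 = 0 row: 9/36 (I1 = 0), 0/36 (I1 = 1); I2 = 1 row: 0/36 (I1 = 0), 27/36 (I1 = 1)
Marginals: 1/4, 3/4
$$H_1 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$$
$$H_2 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$$
$$H_{12} = -\left(\tfrac{9}{36}\log_2\tfrac{9}{36} + 2\cdot\tfrac{0}{36}\log_2\tfrac{0}{36} + \tfrac{27}{36}\log_2\tfrac{27}{36}\right) = 0.811 \quad (0 \log 0 = 0)$$
$$I = H_1 + H_2 - H_{12} = 0.811$$
$$I = \tfrac{9}{36}\log_2\frac{9/36}{1/4 \cdot 1/4} + \tfrac{27}{36}\log_2\frac{27/36}{3/4 \cdot 3/4} = 0.811$$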
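The three overlap examples can be checked numerically. The sketch below evaluates the sum over p log(p / (p1 p2)) in bits for each 2 x 2 joint table; the helper names `mi_from_joint` and `table` are invented for this illustration.

```python
import math

def mi_from_joint(joint):
    """MI in bits from a 2-D joint probability table (rows: I2, columns: I1)."""
    p_rows = [sum(row) for row in joint]
    p_cols = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for r, row in enumerate(joint):
        for c, p in enumerate(row):
            if p > 0.0:                       # 0 log 0 is taken as 0
                mi += p * math.log2(p / (p_rows[r] * p_cols[c]))
    return mi

def table(p00, p01, p11):
    """Symmetric 2 x 2 joint table with entries given in 36ths."""
    return [[p00 / 36, p01 / 36], [p01 / 36, p11 / 36]]

values = [mi_from_joint(table(*t)) for t in [(4, 5, 22), (6, 3, 24), (9, 0, 27)]]
print([round(v, 3) for v in values])
```

MI grows as the off-diagonal mass shrinks, and in the last case, where the two binary images coincide exactly, it equals the marginal entropy.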
Impact of spatial correlation
Limiting assumptions
• Both images share information…
CT PET
Limiting assumptions
• Nature of relationship between image intensities is spatially stationary…
Limiting assumptions
• Joint probability density can be estimated reliably …
>< small region of overlap
>< low resolution
>< interpolation artifacts
>< histogram size and binning strategy
>< image degradations
>< ...
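Implementation
[Pipeline diagram, Floating Image (A) and Reference Image (B): Sampling; Transformation; Interpolation; Joint histogram; Binning; I(α); Optimization; result α*]
p, a = A(p); q = Tα(p), b = B(q); each sample pair adds +1 to the joint histogram at (a, b)
• Sampling: sub/super, multi-resolution
• Transformation: rigid/affine, non-rigid
• Interpolation: NN, TRI, PV
• Binning: 256 x 256
• Optimization: non-gradient-based, gradient-based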
Sub sampling
• Start with few samples initially to speed up the criterion evaluation• Add more samples as the registration proceeds to improve accuracy
• In practice: not more than 2 levels (coarse and fine), as additional levels only increase computation time (Maes 1999)
Super sampling
• Resampling one of the images at a finer grid as a pre-processing step may be useful to increase accuracy
• Can avoid interpolation artifacts due to grid-aligning transformations
Joint probability estimation
[Diagram: voxel p in I1 with a = I1(p) is mapped by T to q in I2; q generally falls between grid points, so b = I2(q) = ?]
Intensity interpolation
[Diagram: q lies between its grid neighbours q1, q2, q3, q4, with weights w1, w2, w3, w4]
• Nearest neighbour (order 0): q = T(p), a = A(p), b = B(q1); h(a,b) += 1
• Linear (order 1): q = T(p), a = A(p), bi = B(qi); b = Σ wi bi, Σ wi = 1; h(a,b) += 1
• Cubic, B-spline, ... (higher order): similar to linear, but using more neighbours
Histogram binning
• If the histogram is large, it will only be sparsely filled
– small changes in T have significant impact on many bins in H
– many local optima in MI
– improve robustness by intensity binning
• Linear intensity remapping: e.g. to range [0-255]
– converts original intensities into ‘iso-intensity’ objects
• Parzen windowing: distribute each sample over multiple bins
• Partial volume distribution
– avoids intensity interpolation, but treats image values as labels
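Linear intensity remapping is simple to illustrate. In the sketch below, the function name `rebin_linear` and the CT-like sample values are made up for the example; it maps the observed intensity range onto a fixed number of histogram bins.

```python
def rebin_linear(intensities, n_bins=256):
    """Linearly map intensities from [min, max] onto bin indices 0 .. n_bins - 1."""
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        return [0 for _ in intensities]      # constant image: everything in bin 0
    scale = (n_bins - 1) / (hi - lo)
    return [int(round((x - lo) * scale)) for x in intensities]

ct_values = [-1000, -200, 0, 40, 1200, 3000]   # made-up CT-like intensities
bins_256 = rebin_linear(ct_values)              # e.g. to range [0-255]
bins_8 = rebin_linear(ct_values, n_bins=8)      # much coarser binning
print(bins_256, bins_8)
```

With only 8 bins, the values 0 and 40 fall into the same bin: coarser binning fills the histogram more densely at the cost of merging nearby intensities into one "iso-intensity" label.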
Partial volume distribution interpolation
[Diagram: q lies between its grid neighbours q1, q2, q3, q4 with weights w1, w2, w3, w4; the joint histogram entries (a,b1), (a,b2), (a,b3), (a,b4) are incremented by +w1, +w2, +w3, +w4]
q = T(p), a = A(p), bi = B(qi)
h(a,bi) += wi, Σ wi = 1
Fractions wi vary smoothly with q, so the histogram and MI vary smoothly with T, and MI is a.e. differentiable w.r.t. T
Collignon 1995, Maes 1997, Chen & Varshney 2003
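A 2-D sketch of the partial volume update h(a, bi) += wi is given below, using bilinear weights for the four grid neighbours. The function name `pv_update` is hypothetical, and the code assumes that q lies inside the grid of image B; it is meant to illustrate the idea, not to reproduce the published 3-D implementation.

```python
import math
from collections import defaultdict

def pv_update(hist, a, image_b, qx, qy):
    """Add one sample to the joint histogram using partial volume distribution.

    (qx, qy) is the transformed position q = T(p) in image B. No intensity is
    interpolated; the weights wi of the four neighbours qi go to bins (a, B(qi)).
    """
    x0, y0 = math.floor(qx), math.floor(qy)
    fx, fy = qx - x0, qy - y0
    neighbours = [
        (x0,     y0,     (1 - fx) * (1 - fy)),
        (x0 + 1, y0,     fx * (1 - fy)),
        (x0,     y0 + 1, (1 - fx) * fy),
        (x0 + 1, y0 + 1, fx * fy),
    ]
    for x, y, w in neighbours:
        if w > 0.0:                          # skip zero-weight neighbours
            hist[(a, image_b[y][x])] += w    # h(a, bi) += wi

image_b = [[10, 20], [30, 40]]
hist = defaultdict(float)
pv_update(hist, a=7, image_b=image_b, qx=0.25, qy=0.5)
print(dict(hist))
```

Only intensities that actually occur in B receive histogram mass, and the weights change continuously as q moves, which is what makes the resulting MI vary smoothly with T.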
Behavior of MI
MR: 1x1x1 mm, CT: 1x1x1.5 mm
[Plot: MI traces for in-slice rotation around registered position, [-180,+180] degrees, for NN, TRI and PV interpolation; MI axis from 0.3 to 1]
[Plots: MI traces for in-slice rotation around registered position, [-0.5,+0.5] degrees. Panels: NN (MI axis 0.881 to 0.883); TRI (0.963 to 0.965); PV (0.875 to 0.877)]
Influence of interpolation on optimization behavior
[Plot: I(α*) - I(α) (0 to 2 x 10^-4) versus |α - α*| (mm, degrees; 0 to 0.3) for NN, TRI and PV]
Same registration experiment, using different interpolation methods and starting from different initial parameter values
α = registration parameters
α* = optimal value for a particular interpolation type
Gradient of MI
[Diagram: PV interpolation; a displacement δq/δα of q between its neighbours q1, q2, q3, q4 changes the weights w1, w2, w3, w4 by δw1, δw2, δw3, δw4] (Maes 1999)
Different interpolation and binning schemes lead to other gradient expressions, e.g. Thevenaz 2000, Hermosillo 2002, Mattes 2003
Optimization strategies
• Non-gradient: Powell, Simplex
• Gradient: Steepest descent, Conjugate gradient, Quasi-Newton, Levenberg-Marquardt
Early validation
Multiresolution voxel similarity measures for MR-PET registration. Studholme C., D.L.G. Hill, and D.J. Hawkes, IPMI 1995
Retrospective Registration Evaluation Project (RREP), J.M. Fitzpatrick et al., 1996
• comparative validation of retrospective registration techniques
• for CT/MR and PET/MR matching of the brain
• using the stereotactic registration solution as the gold standard
• blind study: images were edited to remove markers
• study demonstrated the subvoxel accuracy of the MI matching criterion
Initial RREP results
West 1997, Maes 2003
CT PET-FDG emissionPET transmissionAligned by acquisitionMatched using MMI
Application: thorax tumor staging from PET and CT
[Figure labels: detection in PET; localisation in CT]
Vansteenkiste 1998
Application: prostate radiotherapy planning from CT and MR
Debois 1999
Application: geometric accuracy of spiral CT imaging
[Diagram labels: hardware phantom, known geometry; ESP (anthropomorphic spine phantom); CT image; geometrical model; ideal image; matched comparison]
Model-to-image registration
Histogram dispersion
[Figure: image intensity histograms, Before and After, for cortical wall 1.5 mm, cortical wall 1.0 mm and cortical wall 0.5 mm]
Conclusion
• histogram-based instead of intensity-based
• robust against image degradations (noise, artifacts, local distortions)
• no limitations imposed on the data
• theoretically well founded (information theory)
• no segmentation required
• no need for user intervention
• completely automated
• ‘easy’ to implement
• same algorithm applicable in a variety of applications
• very broad applicability (see Pluim 2003 for a survey)
Impact on the field
In 2000 recognized by IEEE as “a landmark in the profession, with enduring importance and influence far beyond its peers”.
In 2005 recognized by ISI as one of the 10 most cited papers of the last decade published in Engineering (September 2009: >1400 citations, ISI Web of Science)
Commercial implementation
Some software tools
• AIR (UCLA, Loni): http://www.loni.ucla.edu/Software/AIR
• DROP (TU Muenchen): http://www.mrf-registration.net
• Elastix (Image Sciences Institute, Utrecht): http://elastix.isi.uu.nl
• FSL/FLIRT/FNIRT (FMRIB, Oxford): http://www.fmrib.ox.ac.uk/fsl
• IRTK (Imperial College London): http://www.doc.ic.ac.uk/~dr/software/
• Slicer (BWH): http://www.slicer.org
• SPM (University College London): http://fil.ion.ucl.ac.uk/spm
• ...
References
• Besl P.J., McKay N.D. A method for registration of 3-D shapes, IEEE PAMI, 14(2):239-256, 1992
• Borgefors G., Hierarchical chamfer matching: a parametric edge matching algorithm, IEEE PAMI, 10(6):849-865, 1988
• Chen HM, Varshney PK. Mutual information-based CT-MR brain image registration using generalized partial volume joint histogram estimation. IEEE Trans. Med. Img., 22(9): 1111-1119, 2003
• Collignon, A., Vandermeulen, D., Suetens, P., and Marchal, G.. 3d multi-modality medical image registration using feature space clustering. CVRMED April 1995.
• Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P, Marchal G, Automated multi-modality image registration based on information theory. IPMI June 26, 1995.
• Cover T.M., and J.A. Thomas, Elements of Information Theory, 1991
• Debois M, Oyen R, Maes F, et al., The contribution of magnetic resonance imaging to the three-dimensional treatment planning of localized prostate cancer, INTERNATIONAL JOURNAL OF RADIATION ONCOLOGY BIOLOGY PHYSICS, 45(4): 857-865, 1999
• Hermosillo G, Chefd'Hotel C, Faugeras O, Variational methods for multimodal image matching, INTERNATIONAL JOURNAL OF COMPUTER VISION, 50(3): 329-343, 2002
• Hill D, Studholme, C., and Hawkes, D.. Voxel Similarity Measures for Automated Image Registration. VBC October 1994
• Maes F, Collignon A,Vandermeulen D, Marchal G, Suetens P. Multimodality image registration by maximization of mutual information, IEEE Trans. Med. Img., 16(2):187-198, 1997
• Maes F. Segmentation and Registration of Multimodal Medical Images: from Theory, Implementation and Validation to a Useful Tool in Clinical Practice. PhD thesis, KU Leuven, 1998.
• F.Maes, D. Vandermeulen, and P. Suetens. Comparative evaluation of multiresolution optimization strategies for multimodality image registration by maximization of mutual information. Medical Image Analysis, 3(4):373–386, 1999.
• Maes F, Vandermeulen D, Suetens P. Medical image registration using mutual information, Proc. IEEE 91(10): 1699-1722, 2003
• Maintz J.B.A., P.A. van den Elsen, M.A. Viergever, Evaluation of ridge seeking operators for multimodality medical image matching, IEEE Trans. PAMI, 1996.
• Maintz JBA, Viergever MA. A survey of medical image registration, Medical Image Analysis, 2(1):1-37, 1998
• Mattes D, Haynor DR, Vesselle H, et al., PET-CT image registration in the chest using free-form deformations, IEEE Trans. Med. Img., 22(1): 120-128, 2003
• Pelizzari C.A., G.T.Y. Chen, D.R. Spelbring, R.R. Weichselbaum, C-T. Chen, Accurate Three-Dimensional Registration of CT, PET, and/or MR Images of the brain, Journal of Computer Assisted Tomography, 13(1):20-26, 1989
• Pluim JPW, Maintz JBA, Viergever MA, Mutual-information-based registration of medical images: A survey, IEEE Trans. Med. Img., 22(8): 986-1004, 2003
• Rohr K. On 3D differential operators for detecting point landmarks. Image and Vision Computing, 15(3):219-233, 1997
• Studholme C., D.L.G. Hill, and D.J. Hawkes, Multiresolution voxel similarity measures for MR-PET registration, IPMI 1995
• Studholme C, D.L.G.Hill, D.J. Hawkes, An Overlap Invariant Entropy Measure of 3D Medical Image Alignment, Pattern Recognition, 32(1), Jan 1999, pp 71-86.
• Thevenaz P, Unser M, Optimization of mutual information for multiresolution image registration, IEEE Trans. Image processing, 9(12): 2083-2099, 2000
• van den Elsen P.A., J.B.A. Maintz, E.-J.D. Pol, M.A. Viergever, Automatic Registration of CT and MR Brain Images Using Correlation of Geometrical Features, IEEE TMI, 14(2): 384-396, 1995
• van den Elsen P.A., E.J.D. Pol, T.S. Sumanaweera, P.F. Hemler, S. Napel, J.R. Adler, Grey value correlation techniques used for automatic matching of CT and MR brain and spine images, SPIE 1994
• van den Elsen, P. A., Pol, E. J. D., and Viergever, M. A. Medical image matching– a review with classification. IEEE Engineering in medicine and biology, 12(1), 26–39, 1993
• Vansteenkiste JF, Stroobants SG, Dupont PJ, et al., FDG-PET scan in potentially operable non-small cell lung cancer: do anatometabolic PET-CT fusion images improve the localisation of regional lymph node metastases?, EUROPEAN JOURNAL OF NUCLEAR MEDICINE, 25(11):1495-1501, 1998
• Venot A., J.F. Lebruchec, and J.C. Roucayrol, A New Class of Similarity Measures for Robust Image Registration, Computer Vision, Graphics, and Image Processing, vol. 28, pp. 176-184, 1984.
• Viola, P. and Wells, W.. Alignment by maximization of mutual information. In Proceedings of the 5th International Conference of Computer Vision, June 20 – 23, 1995.
• Viola P, Wells WM. Alignment by maximization of mutual information. International Journal of Computer Vision. 24:137-154, 1997
• Wells WM, Viola P, Atsumi H, Nakajima S, Kikinis R. Multi-Modal Volume Registration by Maximization of Mutual Information. Medical Image Analysis, 1(1):35-51, 1996.
• West JB, Fitzpatrick JM, et al. Comparison and evaluation of retrospective intermodality image registration techniques. JCAT 1997.
• Woods R.P., Mazziotta J.C., Cherry S.R. MRI-PET registration with automated algorithm, Journal of Computer Assisted Tomography, 17(4):536-546, 1993
• Zitova B, Flusser J. Image registration methods: a survey. Image and Vision Computing, 21(11):977-1000, 2003
Aspects of mutual information-based image registration
Josien Pluim
Image Sciences Institute
University Medical Center Utrecht
The Netherlands
Outline
Image registration involves
• a similarity measure – f-information measures
• interpolation – interpolation artefacts
• optimization – acceleration of optimization
f-Information measures
Reference: Pluim 2004
f-Divergence measures
Distance between two probability distributions.
Definition:
f(P \| Q) = \sum_i q_i \, f\left( \frac{p_i}{q_i} \right)
Example: Kullback-Leibler distance:
\sum_i p_i \log \frac{p_i}{q_i}
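The definition can be checked numerically. The following is a minimal sketch, assuming discrete distributions given as plain lists; the function names and the example values of p and q are made up for illustration. It evaluates the general f-divergence with the generator f(x) = x log x and compares it with the direct Kullback-Leibler sum.

```python
import math

def f_divergence(p, q, f):
    """f(P||Q) = sum_i q_i * f(p_i / q_i) for discrete distributions p and q."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q) if qi > 0)

def kl_generator(x):
    # f(x) = x log x, with the usual convention 0 log 0 = 0
    return x * math.log(x) if x > 0 else 0.0

def kl_distance(p, q):
    """Direct Kullback-Leibler distance sum_i p_i log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up example distributions
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
kl_via_f = f_divergence(p, q, kl_generator)
kl_direct = kl_distance(p, q)
```

Both routes give the same number because q_i · (p_i/q_i) log(p_i/q_i) = p_i log(p_i/q_i).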
f-Information measures
Subclass of f-divergence, measure of dependence.
Divergence between joint probability pij and joint probability in case of independence pi pj.
Definition:
f(P \| P_1 \times P_2) = \sum_{i,j} p_i p_j \, f\left( \frac{p_{ij}}{p_i p_j} \right)
Example: mutual information
\sum_{i,j} p_{ij} \log \frac{p_{ij}}{p_i p_j}
Choice of f
Varying the function f (subject to certain requirements) yields various measures.
Example f:
I_\alpha(x) = \frac{x^\alpha - \alpha x + \alpha - 1}{\alpha(\alpha - 1)}, \quad \alpha \neq 0, \ \alpha \neq 1
[Figure: plot of I0.2]
Choice of f
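Resulting Iα-information:
I_\alpha(P \| P_1 \times P_2) = \frac{1}{\alpha(\alpha - 1)} \left( \sum_{i,j} \frac{p_{ij}^\alpha}{(p_i p_j)^{\alpha - 1}} - 1 \right), \quad \alpha \neq 0, \ \alpha \neq 1
For α → 1, I_\alpha(P \| P_1 \times P_2) equals mutual information.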
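The limit α → 1 can be illustrated numerically. This is a sketch under stated assumptions: the joint distribution is a small made-up 2x2 table, and the function names are hypothetical. It evaluates the Iα-information for α close to 1 and compares it with mutual information.

```python
import math

def marginals(joint):
    rows = [sum(r) for r in joint]
    cols = [sum(c) for c in zip(*joint)]
    return rows, cols

def mutual_information(joint):
    pi, pj = marginals(joint)
    return sum(pij * math.log(pij / (pi[i] * pj[j]))
               for i, row in enumerate(joint)
               for j, pij in enumerate(row) if pij > 0)

def i_alpha_information(joint, alpha):
    """(sum_ij p_ij^alpha / (p_i p_j)^(alpha-1) - 1) / (alpha (alpha - 1))."""
    pi, pj = marginals(joint)
    s = sum(pij ** alpha / (pi[i] * pj[j]) ** (alpha - 1)
            for i, row in enumerate(joint)
            for j, pij in enumerate(row) if pij > 0)
    return (s - 1.0) / (alpha * (alpha - 1.0))

# Made-up joint probability table p_ij
joint = [[0.30, 0.10],
         [0.05, 0.55]]
mi = mutual_information(joint)
near_one = i_alpha_information(joint, 1.0001)   # alpha slightly above 1
```

For independent variables (p_ij = p_i p_j) the sum equals 1 and the measure is zero for any admissible α.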
Other f-information measures
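V-information:
V(P \| P_1 \times P_2) = \sum_{i,j} | p_{ij} - p_i p_j |
Matusita information:
M_\alpha(P \| P_1 \times P_2) = \sum_{i,j} | p_{ij}^\alpha - (p_i p_j)^\alpha |^{1/\alpha}
χα-information:
\chi^\alpha(P \| P_1 \times P_2) = \sum_{i,j} \frac{| p_{ij} - p_i p_j |^\alpha}{(p_i p_j)^{\alpha - 1}}
Rényi information:
R_\alpha(P \| P_1 \times P_2) = \frac{1}{\alpha - 1} \log \sum_{i,j} \frac{p_{ij}^\alpha}{(p_i p_j)^{\alpha - 1}}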
Examples
MR-CT, head, out-of-plane rotation, -60 to 60 degrees
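[Figure: similarity measures as a function of rotation angle. Panels: MI, Iα, Mα, V. α values shown in the panel labels: 0.2, 0.5, 0.8 and 0.2, 0.5, 2.0, 3.0]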
Evaluation
Evaluation on MR-CT and MR-PET registration: head, rigid, RIRE data.
• Accuracy (screw marker-based gold standard).
• Robustness (convergence from many starting positions).
Conclusions
• Choice of α = 1 seems the best option for function smoothness
• Functions for small and large α are more difficult to optimize
• Some measures achieved better accuracy than MI (Iα, Rα, Mα; α ∈ {0.2, 0.5}).
Interpolation artefacts
References: Maes 1998, Pluim 2000
Interpolation artefacts
Examples, MR-CT, head, axial translation
[Figure panels: partial volume interpolation; linear interpolation]
Interpolation
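[Diagram: transformed point T(x) between its four grid neighbours y1, y2, y3, y4, with interpolation weights w1, w2, w3, w4]
Partial volume interpolation
h( I(x), J(yi) ) += wi , ∀ i
Linear interpolation
J(T(x)) = Σi wi · J(yi)
h( I(x), J(T(x)) ) += 1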
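The two histogram update rules can be compared in a minimal sketch. The assumptions are loud ones: a 1D signal with two neighbours per sample instead of four, a pure translation as the transform, and rounding of the interpolated intensity to an integer bin. All names and data are made up for illustration.

```python
import math
from collections import Counter

def joint_histogram_1d(fixed, moving, shift, mode):
    """Joint histogram of fixed[x] and moving at x + shift (1D, two neighbours).

    mode="linear": interpolate an intensity, then add 1 to a single bin.
    mode="pv": add the interpolation weights to the bins of the neighbours.
    """
    hist = Counter()
    for x, a in enumerate(fixed):
        pos = x + shift
        lo = math.floor(pos)          # left grid neighbour
        w_hi = pos - lo               # weight of the right neighbour
        w_lo = 1.0 - w_hi
        if lo < 0 or lo + 1 >= len(moving):
            continue                  # skip samples without two neighbours
        if mode == "linear":
            value = w_lo * moving[lo] + w_hi * moving[lo + 1]
            hist[(a, round(value))] += 1.0
        else:
            hist[(a, moving[lo])] += w_lo
            hist[(a, moving[lo + 1])] += w_hi
    return hist

fixed = [0, 0, 10, 10, 0, 0]
moving = [0, 0, 10, 10, 0, 0]
h_linear = joint_histogram_1d(fixed, moving, 0.5, "linear")
h_pv = joint_histogram_1d(fixed, moving, 0.5, "pv")
```

In this toy case linear interpolation creates a new intensity (5) that is in neither image, whereas partial volume interpolation only spreads fractional counts over existing intensities.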
Interpolation artefacts
When do they occur?
For images with equal grids.
Problems occur when interpolation is not required for every transformation, i.e. when the grids align.
Linear interpolation
Local minima at grid alignment.
Typical example: MR-CT, head, axial translation
Artefacts occur because linear interpolation smooths the image. Reduction of noise causes a decrease in joint entropy.
For grid-aligning transformations, there is no interpolation, resulting in higher joint entropy and lower MI.
Linear interpolation and noise
Example: MNI brain atlas, MR-T1 and T2
[Figure: registration function versus translation (−20 to 20) at three noise levels: “no” noise, 3 percent noise, 5 percent noise]
Partial volume interpolation
Local maxima at grid alignment.
Typical example: MR-CT, head, axial translation
Artefacts occur because PV interpolation increases the dispersion of the joint histogram. It causes an increase in joint entropy.
For grid-aligning transformations, there is no interpolation, resulting in lower joint entropy and higher MI.
Resampling
Example: MR-CT, head, axial translation
[Figure: partial volume interpolation and linear interpolation, each shown for voxel size 1.5 mm and voxel size 1.53 mm]
Subvoxel accuracy
Interpolation artefacts may impede subvoxel accuracy.
Example: MR-CT, head, in-plane translation
[Figure panels: original resolution; downsampled]
Summary
Interpolation artefacts
• can occur when images have equal voxel size(s),
• are more pronounced for images of low resolution,
• can occur both for partial volume interpolation (local maxima) and linear interpolation (local minima),
• impede subvoxel accuracy.
Related work
Further studies into interpolation artefacts, including other similarity measures and interpolation methods:
Holden 2001, Tsao 2003, Ji 2003, Aljabar 2005, Rohde 2005, Inglada 2007, Thévenaz 2008, Rohde 2009
Proposed solutions
• Resampling
• Initial rotation
• Smoothing images
  Inglada 2007, Rohde 2009
• Blurring of the joint histogram
  Tsao 2003
• Small number of histogram bins
  Ji 2003
• Oversampling
  Ji 2003
Proposed solutions
• Use of a prior probability
  At coarse levels in a pyramid, include the joint pdf from the finest level:
  p(I, J) = \lambda \, p_{current}(I, J) + (1 - \lambda) \, p_{prior}(I, J)
  Likar 2001, Gan 2004
• Higher-order interpolation
  Tsao 2003, Aljabar 2005, Rohde 2009
• Generalized Partial Volume Estimation
  PV interpolation with a B-spline instead of a linear kernel
  Introduced in Chen 2003
  Wei 2004 (Gaussian instead of B-spline)
  Lu 2008 (Hanning windowed sinc instead of B-spline)
Proposed solutions
• Off-grid sampling
  E.g. randomly perturbed grid positions, (x+Δx, y+Δy).
  Likar 2001, Tsao 2003, Seppä 2008
• Random sampling
  Thévenaz 2008, Rohde 2009
• Constant variance interpolation
  Thévenaz 2008a
• Variance correction filter
  Post-interpolation filter to counteract change in variance
  Salvado 2007
Accelerating optimization
Reference: Klein 2007
Some related work
• Various optimization methods
  Maes 1999
• Multiresolution approaches
  Thévenaz 2000, Pluim 2001, Likar 2001
• Look-up tables
  Sarrut 1999, Meihe 1999
• Parallelization / hardware implementations
  Castro-Pareja 2004, Levin 2004, Ino 2005, Vetter 2007, Modat 2009
Optimization
Deformation modelled by B-splines.
Control point displacements: μ = {μ1, μ2, μ3, ...... }
Cost function: F(μ)
Aim: find μ that minimises F(μ)
Gradient descent
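\mu_{k+1} = \mu_k - a_k \, g_k
In components:
(\mu_1, \mu_2, \mu_3, \ldots)_{k+1} = (\mu_1, \mu_2, \mu_3, \ldots)_k - a_k \, (g_1, g_2, g_3, \ldots)_k
with g_1 = \partial F / \partial \mu_1 evaluated at \mu_k, and likewise for the other components.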
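The update rule can be sketched in a few lines. This is a toy illustration only: the quadratic cost function, the constant gain a_k and the function names are assumptions, not the registration cost used in the talk.

```python
def gradient_descent(grad, mu0, gain, iterations):
    """Iterate mu_{k+1} = mu_k - a_k * g_k with a constant gain a_k = gain."""
    mu = list(mu0)
    for _ in range(iterations):
        g = grad(mu)
        mu = [m - gain * gi for m, gi in zip(mu, g)]
    return mu

# Toy cost F(mu) = (mu1 - 1)^2 + 2 * (mu2 + 3)^2, an assumption for illustration.
def toy_gradient(mu):
    return [2.0 * (mu[0] - 1.0), 4.0 * (mu[1] + 3.0)]

mu_hat = gradient_descent(toy_gradient, [0.0, 0.0], gain=0.1, iterations=200)
```

The minimum of the toy cost is at (1, −3), which the iteration approaches geometrically for this gain.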
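Gradient descent
[Figure: optimization path on the cost surface F(μ) over (μ1, μ2)]
Smarter steps
[Figure: optimization path on the cost surface F(μ) over (μ1, μ2)]
Cheaper steps
[Figure: optimization path on the cost surface F(μ) over (μ1, μ2)]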
Comparison
• Gradient descent (reference)
• Quasi-Newton (smarter steps)
• Conjugate gradient (smarter steps)
• Stochastic gradient (cheaper steps)
\mu_{k+1} = \mu_k + a_k \, d_k
Stochastic approach
Stochastic method uses an approximation \tilde{g}_k to g_k obtained by subsampling. Convergence is guaranteed if the bias in the approximation error goes to zero:
E(\tilde{g}_k) \to g_k \quad \text{as} \quad k \to \infty
Therefore take a new set of random samples in every iteration.
Deterministic methods can be used with a single set of regular samples.
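A minimal sketch of the stochastic approach, under assumptions that are not from the talk: a one-parameter toy cost F(μ) = mean_i (μ − s_i)², a decaying gain a_k = 0.5/(k+1), and made-up sample data. The gradient is approximated from a fresh random subset in every iteration, so the approximation is unbiased.

```python
import random

def stochastic_gradient_descent(samples, mu0, n_sub, iterations, seed=0):
    rng = random.Random(seed)
    mu = mu0
    for k in range(iterations):
        subset = rng.sample(samples, n_sub)       # new random samples every iteration
        g_approx = sum(2.0 * (mu - s) for s in subset) / n_sub
        a_k = 0.5 / (k + 1.0)                     # decaying gain, an assumed schedule
        mu = mu - a_k * g_approx
    return mu

samples = [float(i % 7) for i in range(700)]      # toy data, exact mean is 3.0
mu_sgd = stochastic_gradient_descent(samples, mu0=10.0, n_sub=20, iterations=2000)
```

Each step uses 20 of the 700 samples, so a step costs a small fraction of a full gradient evaluation while the estimate still settles near the minimiser of the full cost (the sample mean).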
Experiments
Cardiac CT images, 3D, known deformations
Results
[Figure: registration error e [mm] (axis 0 to 3) versus ‘computation time’ (logarithmic axis, 0.001 to 1000) for gradient descent, quasi-Newton, conjugate gradient and stochastic gradient; shown on two slides, the second with additional markers labelled 2, 4, 8]
Conclusions
• Cheap steps result in more acceleration than smart steps.
• Stochastic methods allow strong subsampling and hence a large reduction in computation time.
References
1. P. Aljabar, J.V. Hajnal, R.G. Boyes, D. Rueckert, Interpolation artefacts in non-rigid registration, MICCAI, LNCS 3750:247-254, Springer, 2005
2. A. Bardera, M. Feixas, I. Boada, Normalized similarity measures for medical image registration, SPIE Medical Imaging: Image Processing, Proc. SPIE 5370:108-118, 2004
3. C.R. Castro-Pareja, J.M. Jagadeesh, R. Shekhar, FAIR: a hardware architecture for real-time 3-D image registration, IEEE Trans. Inf. Technol. Biomed. 7(4):426-434, 2003
4. H. Chen and P.K. Varshney, Mutual information-based CT-MR brain image registration using generalized partial volume joint histogram estimation, IEEE Trans. Med. Imaging 22(9):1111-1119, 2003
5. R. Gan, J. Wu, A.C.S. Chung, S.C.H. Yu, W.M. Wells III, Multiresolution image registration based on Kullback-Leibler distance, MICCAI, LNCS 3216:599-606, Springer, 2004
6. Y. He, A.B. Hamza, H. Krim, A generalized divergence measure for robust image registration, IEEE Trans. Signal Process., 51(5):1211-1220, 2003
7. M. Holden, Registration of 3D serial MR brain images, PhD thesis, University of London, UK, 2001
8. J. Inglada, V. Muron, D. Pichard, T. Feuvrier, Analysis of artifacts in subpixel remote sensing image registration, IEEE Trans. Geosci. Remote Sensing 45(1):254-264, 2007
9. F. Ino, K. Ooyama, K. Hagihara, A data distributed parallel algorithm for nonrigid image registration, Parallel Comput. 31(1):19-43, 2005
10. J.X. Ji, H. Pan, Z.P. Liang, Further analysis of interpolation effects in mutual information-based image registration, IEEE Trans. Med. Imaging 22(9):1131-1140, 2003
11. S. Klein, M. Staring, J.P.W. Pluim, Evaluation of optimization methods for nonrigid medical image registration using mutual information and B-splines, IEEE Trans. Image Process. 16(12):2879-2890, 2007
12. S. Klein, J.P.W. Pluim, M. Staring, M.A. Viergever, Adaptive stochastic gradient descent optimisation for image registration, Int. J. Comput. Vis. 81(3):227-239, 2009
13. D. Levin, D. Dey, P.J Slomka, Acceleration of 3D, nonlinear warping using standard video graphics hardware: implementation and initial validation, Comput. Med. Imaging Graph. 28(8):471-483, 2004
14. B. Likar and F. Pernuš, A hierarchical approach to elastic registration based on mutual information, Image Vis. Comput. 19(1-2):33-44, 2001
15. X. Lu, S. Zhang, H. Su, Y. Chen, Mutual information-based multimodal image registration using a novel joint histogram estimation, Comput. Med. Imaging Graph. 32(3):202-209, 2008
16. F. Maes. Segmentation and registration of multimodal medical images: from theory, implementation and validation to a useful tool in clinical practice, PhD thesis, Catholic University of Leuven, Belgium, 1998.
17. F. Maes, D. Vandermeulen, P. Suetens, Comparative evaluation of multiresolution optimization strategies for multimodality image registration by maximization of mutual information, Med. Image Anal., 3(4):373-386, 1999
18. X. Meihe, R. Srinivasan, W.L. Nowinski, A fast mutual information method for multi-modal registration, IPMI, LNCS 1613: 466-471, Springer, 1999
19. M. Modat, G.R. Ridgway, Z.A. Taylor, D.J. Hawkes, N.C. Fox, S. Ourselin, A parallel-friendly normalized mutual information gradient for free-form registration, SPIE Medical Imaging: Image Processing, Proc. SPIE 7259, 2009
20. J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Interpolation artefacts in mutual information based image registration, Comput. Vis. Image Underst. 77(2):211-232, 2000
21. J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Mutual information matching in multiresolution contexts, Image Vis. Comput., 19(1-2):45-52, 2001
22. J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Mutual-information-based registration of medical images: a survey, IEEE Trans. Med. Imaging, 22(8):986-1004, 2003.
23. J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, f-Information measures in medical image registration, IEEE Trans. Med. Imaging 23(12):1508-1516, 2004
24. G.K. Rohde, A.S. Barnett, P.J. Basser, C. Pierpaoli, Estimating intensity variance due to noise in registered images: Applications to diffusion tensor MRI, NeuroImage 26(3):673-684, 2005
25. G.K. Rohde, A. Aldroubi, D.M. Healy, Interpolation artifacts in sub-pixel image registration, IEEE Trans. Image Process. 18(2):333-345, 2009
26. O. Salvado and D.L. Wilson, Removal of local and biased global maxima in intensity-based registration, Med. Image Anal. 11(2):183-196, 2007
27. D. Sarrut and S. Miguet, Fast 3D image transformations for registration procedures, ICIAP, 446-452, IEEE Computer Society, 1999
28. M. Seppä, Continuous sampling in mutual-information registration, IEEE Trans. Med. Imaging 17(5):823-826, 2008
29. P. Thévenaz and M. Unser, Optimization of mutual information for multiresolution image registration, IEEE Trans. Image Process., 9(12):2083-2099, 2000
30. P. Thévenaz, M. Bierlaire, M. Unser, Halton sampling for image registration based on mutual information, Sampling Theory Signal Image Process. 7(2):141-171, 2008
31. P. Thévenaz, T. Blu, M. Unser, Short basis functions for constant-variance interpolation, SPIE Medical Imaging: Image Processing, Proc. SPIE 6914, 2008a
32. J. Tsao, Interpolation artifacts in multimodality image registration based on maximization of mutual information, IEEE Trans. Med. Imaging 22(7):854-864, 2003
33. C. Vetter, C. Guetter, C. Xu, R. Westermann, Non-rigid multi-modal registration on the GPU, SPIE Medical Imaging: Image Processing, Proc. SPIE 6512, 2007
34. M. Wei and J. Liu, Artifacts reduction in mutual information-based CT-MR image registration, SPIE Medical Imaging: Image Processing, Proc. SPIE 5370:1176-1186, 2004
Probabilistic and information-theoretic approaches to registration
William Wells
Associate Professor of Radiology
Surgical Planning Laboratory
Harvard Medical School and Brigham and Women’s Hospital
Affiliated Faculty: Harvard – MIT Division of Health Sciences and Technology
Research Scientist – MIT CSAIL
plan
• tour of probabilistic and information theoretic registration methods
• focus on generative models of the methods
  – describe models with equations
  – state results (skip derivations)
• framework for motivation and comparison
Outline
• Basic MAP approach to image registration
• Parametric models on image and intensity pairs
• MAP with known model parameters *
• KLD Registration and MI
• MAP with unknown model parameters
  – Prior probabilities on model parameters
  – Joint MAP *
  – Marginalized MAP
    • Weak prior *
    • Informative prior *
    • Strong prior
  – EM Algorithm to obtain estimates
    • Simple iteration *
    • Experimental Results
  – Group-wise Registration *
* Connections to prior work
Formalism from:
A Marginalized MAP Approach and EM Optimization for Pair-Wise Registration
Lilla Zollei, Mark Jenkinson, Samson Timoner, William Wells
IPMI 2007
Estimation
y: data
θ: model parameters
Maximum Likelihood:
\hat{\theta} = \arg\max_\theta \log p(y \mid \theta)
Maximum A Posteriori (MAP):
\hat{\theta} = \arg\max_\theta \log p(\theta \mid y)
= \arg\max_\theta [ \log p(y \mid \theta) + \log p(\theta) ] (using Bayes rule)
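The difference between the two estimators can be seen on a toy problem. This sketch is not from the talk: it assumes coin-flip data, an unnormalized Beta(5, 5) prior on θ, and a simple grid search in place of a real optimizer.

```python
import math

def log_likelihood(theta, heads, tails):
    return heads * math.log(theta) + tails * math.log(1.0 - theta)

def log_prior(theta, a=5.0, b=5.0):
    # Unnormalized Beta(a, b) prior; the constant does not affect the argmax.
    return (a - 1.0) * math.log(theta) + (b - 1.0) * math.log(1.0 - theta)

grid = [i / 1000.0 for i in range(1, 1000)]
heads, tails = 9, 1
theta_ml = max(grid, key=lambda t: log_likelihood(t, heads, tails))
theta_map = max(grid, key=lambda t: log_likelihood(t, heads, tails) + log_prior(t))
```

The ML estimate follows the data alone (9/10), while the MAP estimate is pulled toward the prior (13/18 in closed form for these values).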
Basic MAP Registration
• u, v: images
• T: transform on image
• Maximum A-posteriori Probability
\hat{T} = \arg\max_T \log p(T \mid u, v)
\hat{T} = \arg\max_T \log [ p(u, v \mid T) \, p(T) ]
Probability on Image Pairs
• Kinematic Assumption
p(u, v \mid T) = p(u(x), v(T(x)))
• Independently and Identically Distributed (IID) in space
p(u, v \mid T) = \prod_{x_i} p(u(x_i), v(x_i) \mid T)
• x_i : voxel
Probability on Intensity Pairs
• B(\cdot, \cdot) : Bin index
• Multinomial Distribution
  – One trial
p(u_i, v_i \mid \Theta) = \mathrm{Mult}(B(u_i, v_i); 1, \Theta) = \theta_{B(u_i, v_i)}
0 \le \theta_j \le 1, \quad \sum_j \theta_j = 1
Estimate Multinomial Distribution from Data
• Maximum Likelihood Method:
  – Histogram the data
  – Set the parameters to be normalized histogram counts
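A minimal sketch of this estimate, with assumptions stated up front: the images are flat lists of 8-bit intensities, the bin width and the 4x4 bin layout are arbitrary choices, and the function names are hypothetical.

```python
from collections import Counter

def bin_index(u, v, bin_width, bins_per_axis):
    """B(u, v): map an intensity pair to a single joint-histogram bin index."""
    bu = min(int(u // bin_width), bins_per_axis - 1)
    bv = min(int(v // bin_width), bins_per_axis - 1)
    return bu * bins_per_axis + bv

def estimate_theta(u_image, v_image, bin_width=64, bins_per_axis=4):
    """Maximum likelihood estimate: normalized joint histogram counts."""
    counts = Counter(bin_index(u, v, bin_width, bins_per_axis)
                     for u, v in zip(u_image, v_image))
    n = sum(counts.values())
    return {j: c / n for j, c in counts.items()}

# Made-up intensities of six corresponding voxels
u_image = [10, 20, 200, 210, 130, 10]
v_image = [250, 240, 30, 20, 140, 245]
theta_hat = estimate_theta(u_image, v_image)
```

Bins that receive no data are simply absent from the dictionary, which corresponds to an estimated probability of zero.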
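Joint Image Histogram
[Figure: joint image histogram with u intensities on one axis and v intensities on the other; bins are numbered by bin index (1, 2, 3, …, 11, …); each data pair (u(xi), v(xi)) adds to a bin count, e.g., n9 = 2]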
Probability on Image Pairs…
• n_j(T) : Number of voxel pairs that map to bin j
• Multinomial on sequence of observations
  – (not counts)
p(u, v \mid T, \Theta) = \prod_i \mathrm{Mult}(B(u_i, v_i); 1, \Theta)
= \mathrm{Mult}(\{ B(u_{x_1}, v_{y_1}) \cdots B(u_{x_N}, v_{y_N}) \}; N, \Theta)
= \prod_j \theta_j^{n_j(T)}
MAP Registration: Known Model
\hat{T} = \arg\max_T \log p(T \mid u, v, \Theta)
= \arg\max_T \log [ p(u, v \mid T, \Theta) \, p(T) ]
= \arg\max_T \left[ \sum_j n_j(T) \log \theta_j + \log P(T) \right]
MAP Registration: Known Model…
• Training
  – Estimate Θ from registered images
    • ML: normalized histogram:
\hat{\Theta} = \frac{n(T_0)}{N}
• Registration
  – Simple Objective Function
    • Linear in counts
\hat{T} = \arg\max_T \left[ \sum_{j=1}^{g} n_j(T) \log \theta_j + \log P(T) \right]
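The training and registration steps can be sketched on a 1D toy problem. Several things here are assumptions for illustration: signals are short integer lists, the transform is an integer shift, the prior P(T) is flat, and unseen intensity pairs get a small floor probability so that the logarithm stays finite.

```python
import math
from collections import Counter

def pair_counts(u, v, shift):
    """n_j(T): counts of intensity pairs (u[x], v[x + shift]) over the overlap."""
    return Counter((u[x], v[x + shift]) for x in range(len(u))
                   if 0 <= x + shift < len(v))

def log_likelihood(u, v, shift, theta, floor=1e-6):
    # sum_j n_j(T) log(theta_j); unseen pairs get a small floor probability (an assumption)
    return sum(n * math.log(theta.get(j, floor))
               for j, n in pair_counts(u, v, shift).items())

# Training: estimate theta from a registered pair (normalized histogram).
u_train = [0, 0, 1, 1, 2, 2, 0, 0]
v_train = [5, 5, 9, 9, 7, 7, 5, 5]
counts = pair_counts(u_train, v_train, 0)
theta = {j: n / len(u_train) for j, n in counts.items()}

# Registration: v_new is v_train displaced by two samples.
u_new = u_train
v_new = [5, 5] + v_train[:-2]
best_shift = max(range(-3, 4), key=lambda s: log_likelihood(u_new, v_new, s, theta))
```

The objective is a weighted sum of the counts, with fixed weights log θj. Note that the number of overlapping samples changes with the shift in this toy setup, which a real implementation would have to account for.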
ML: Known Model
T̂ = arg maxTlog [p(u, v|T,Θ)]
* M Leventon, W Grimson, W Wells. Multi-Modal Volume Registration Using Joint Intensity Distributions. MICCAI 98
KLD Registration
\hat{T} = \arg\min_T D[ \hat{p}(u, v; T) \,\|\, p(u, v; T = 0) ]
  \hat{p}(u, v; T): estimated from current images at transform T
  p(u, v; T = 0): estimated from correctly registered images
• D[· || ·] : KL Divergence
  – Compares probability distributions
  – Non-negative, zero for identical distributions, not symmetric
D[ p(x) \,\|\, q(x) ] \doteq \sum_x p(x) \log \frac{p(x)}{q(x)}
KLD Registration Example…
Multi-Modality image registration by minimising Kullback-Leibler distance
ACS Chung, WM Wells, WEL Grimson, A Norbash
MICCAI 2002
• 2D/3D DSA to MRA
Provided by Albert Chung
Before Registration
Provided by Albert Chung
After Registration
Provided by Albert Chung
KLD and MI Registration
KLD registration:
\hat{T} = \arg\min_T D[ \hat{p}(u, v; T) \,\|\, p(u, v; T = 0) ]
MI Registration:
\hat{T} = \arg\max_T D[ \hat{p}(u, v; T) \,\|\, \hat{p}(u) \, \hat{p}(v \mid T) ]
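Both objectives are KL divergences of the current joint distribution, differing only in the reference distribution. A minimal sketch, assuming made-up 2x2 joint distributions stored as dictionaries keyed by intensity pair; the function names are hypothetical.

```python
import math

def kl(p, q):
    """D[p || q] = sum_x p(x) log(p(x) / q(x)) over a shared set of bins."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)

def product_of_marginals(joint):
    pu, pv = {}, {}
    for (u, v), p in joint.items():
        pu[u] = pu.get(u, 0.0) + p
        pv[v] = pv.get(v, 0.0) + p
    return {(u, v): pu[u] * pv[v] for u in pu for v in pv}

# Toy joint distributions over intensity pairs (u, v); values are made up.
p_current = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
p_registered = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}

kld_objective = kl(p_current, p_registered)                    # minimized over T
mi_objective = kl(p_current, product_of_marginals(p_current))  # maximized over T
```

KLD registration needs a model learned from registered data, while the MI objective builds its reference from the current images themselves.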
Strong vs. Weak Models
• Capture / Bias Tradeoff
  – Strong Model, e.g.: ML, MAP, KLD
    • Robust: large capture
    • Model may be inaccurate for new images
      – Less accurate estimate of T
  – Weak Model, e.g.: min Entropy, max MI
    • Less robust: smaller capture
    • More accurate estimate of T
Outline
• Basic MAP approach to image registration
• Parametric models on image and intensity pairs
• MAP with known model parameters *
• KLD Registration and MI
• MAP with unknown model parameters
  – Prior probabilities on model parameters
  – Joint MAP *
  – Marginalized MAP
    • Weak prior *
    • Informative prior *
    • Strong prior
  – EM Algorithm to obtain estimates
    • Simple iteration *
    • Experimental Results
  – Group-wise Registration *
* Connections to prior work
Model Parameters, Θ, Unknown
• Prior: p(Θ|w)
• Joint prior is independent
  – p(T,Θ) = p(T) p(Θ|w)
• Joint Posterior:
p(T,Θ|u, v, w) ∝ p(u, v|T,Θ) p(T) p(Θ|w)
Joint MAP
• Estimate T, Θ jointly
\widehat{(T, \Theta)} = \arg\max_{T, \Theta} p(T, \Theta \mid u, v, w)
• Θ is a nuisance parameter
  – discard estimate of Θ
\hat{T} = \arg\max_T \left[ \max_\Theta p(T, \Theta \mid u, v, w) \right]
Joint MAP *
• Θ is maximized out
\hat{T} = \arg\max_T \left[ \max_\Theta p(T, \Theta \mid u, v, w) \right]
• Joint Maximum Likelihood
  – Connections to entropy and Mutual Information
• Related:
  A. Roche, G. Malandain, and N Ayache. Unifying maximum likelihood approaches in medical image registration. International Journal of Imaging Systems and Technology, 11(7180):71–80, 2000
* L. Zöllei. A Unified Information Theoretic Framework for Pair- and Group-wise Registration of Medical Images. Ph.D. thesis, MIT
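Marginalize Nuisance Parameter
• Maximize out:
\hat{T} = \arg\max_T \left[ \max_\Theta p(T, \Theta \mid u, v, w) \right]
• Alternative: Average or Marginalize:
\hat{T} = \arg\max_T \left[ \int p(T, \Theta \mid u, v, w) \, d\Theta \right] = \arg\max_T p(T \mid u, v, w)
= \arg\max_T \left[ \int p(u, v \mid T, \Theta) \, p(T) \, p(\Theta \mid w) \, d\Theta \right]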
Dirichlet Prior on Θ
p(Θ|w) = Dir(Θ;w)
• Conjugate prior for Multinomial
• Multi-category generalization of Beta
• Parameterized by pseudo-data counts: w
Marginalized MAP
\hat{T} = \arg\max_T \log p(T \mid u, v, w)
= \arg\max_T \left[ \log P(T) + \sum_{j=1}^{g} \log \Gamma(n_j(T) + w_j) \right]
3 Cases on strength of prior
• Weak prior: Laplace Prior
• Informative prior
• (Strong prior: dominates data: model is known)
  – back to where we started
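Laplace Prior: w_j = 1
\hat{T} = \arg\max_T \left[ \log P(T) + \sum_{j=1}^{g} \log \Gamma(n_j(T) + 1) \right]
Use Stirling’s approximation…
\hat{T} \approx \arg\min_T \left[ N \cdot H\left[ \mathrm{Mult}\left( 1, \frac{n(T)}{N} \right) \right] - \log p(T) \right]
Minimum Entropy Registration emerges!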
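The step from the log-Gamma sum to entropy can be checked numerically. This is a sketch with made-up histogram counts, using only the leading terms of Stirling's approximation, log Γ(n + 1) ≈ n log n − n; the function names are hypothetical.

```python
import math

def log_gamma_score(counts):
    """sum_j log Gamma(n_j + 1), the data term under the Laplace prior w_j = 1."""
    return sum(math.lgamma(n + 1) for n in counts)

def entropy(counts):
    n_total = sum(counts)
    return -sum((n / n_total) * math.log(n / n_total) for n in counts if n > 0)

# Two joint histograms with the same number of samples N (made-up counts).
peaked = [9000, 500, 300, 200]        # few dominant bins
dispersed = [2500, 2500, 2500, 2500]  # counts spread out

n_total = sum(peaked)
# Stirling: log Gamma(n + 1) ~ n log n - n, so the score is about
# -N * H + (N log N - N), where the last term does not depend on T.
constant = n_total * math.log(n_total) - n_total
approx_peaked = -n_total * entropy(peaked) + constant
```

Since N is fixed by the number of voxel pairs, maximizing the log-Gamma sum and minimizing N·H rank candidate transforms in nearly the same way.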
Informative Prior
\hat{T} = \arg\max_T \left[ \log P(T) + \sum_{j=1}^{g} \log \Gamma(n_j(T) + w_j) \right]
\approx \arg\min_T \left[ N \cdot c \cdot H\left[ \mathrm{Mult}\left( 1, \frac{n(T) + w}{N + w_0} \right) \right] - \log p(T) \right]
• using \log(\Gamma(x)) \approx x \log(x)
• Minimize entropy of pooled data *
* Mert Sabuncu. Entropy-based Methods for Image Registration. PhD Thesis, Princeton 2006.
Entropic Probability Model
Recap on the posterior distribution:
p(T \mid u, v, w) \propto e^{-N H\left[ \mathrm{Mult}\left( 1, \frac{n(T) + w}{N + w_0} \right) \right]} \, p(T)
Example use of Dirichlet Prior
L . Zöllei, W.M. Wells III: "Multi-modal Image Registration Using Dirichlet-encoded Prior Information", WBIR06
Recap: Marginalized MAP registration
• 3 Cases on strength of prior
  – Weak prior: Laplace Prior
    • Minimize entropy
  – Informative prior
    • Minimize entropy of pooled data
  – (Strong prior: dominates data
    • MAP using known fixed model)
Outline
• Basic MAP approach to image registration
• Parametric models on image and intensity pairs
• MAP with known model parameters *
• KLD Registration and MI
• MAP with unknown model parameters
  – Prior probabilities on model parameters
  – Joint MAP *
  – Marginalized MAP
    • Weak prior *
    • Informative prior *
    • Strong prior *
  – EM Algorithm to obtain estimates
    • Simple iteration *
    • Experimental Results
  – Groupwise Registration
EM Algorithm
• ML parameter estimation
• Observed data
• Hidden data
• Simple iterative algorithm
• Nice convergence properties
A. P. Dempster, N. M. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1):1–38, 1977.
EM Estimator of T | u v w
\hat{T}_{next} \approx \arg\max_T \left[ \sum_{j=1}^{g} n_j(T) \log\left( n_j(\hat{T}_{old}) + w_j - .5 \right) + \log P(T) \right]
• Timoner’s Iteration:
\hat{T}_{next} \approx \arg\max_T \left[ \sum_{j=1}^{g} n_j(T) \log\left( n_j(\hat{T}_{old}) + \epsilon \right) + \log P(T) \right]
• Iterate:
  – (Re) estimate model from current configuration
    • Histogram joint intensities
  – Do MAP registration with fixed model
    • Simple objective function: linear in counts
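The iteration can be sketched on a 1D toy problem. Everything specific here is an assumption for illustration: circular integer shifts as the transform, a flat prior P(T), ε = 0.5, an exhaustive search over a handful of candidate shifts, and made-up signals.

```python
import math
from collections import Counter

def counts_at(u, v, shift):
    """n_j(T): joint histogram of (u[x], v[x + shift]) with circular wrap-around."""
    n = len(u)
    return Counter((u[x], v[(x + shift) % n]) for x in range(n))

def timoner_step(u, v, shift_old, candidates, eps=0.5):
    model = counts_at(u, v, shift_old)            # re-estimate model: histogram
    def objective(shift):                         # linear in the counts n_j(T)
        return sum(n * math.log(model.get(j, 0) + eps)
                   for j, n in counts_at(u, v, shift).items())
    return max(candidates, key=objective)         # flat prior P(T) assumed

u = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
v = [5, 5, 5, 5, 9, 9, 9, 9, 5, 5]   # u with intensities remapped, displaced by one sample
shift = 0
for _ in range(5):
    shift = timoner_step(u, v, shift, candidates=range(-3, 4))
```

In this toy case the estimate moves from the initial shift 0 to the true shift 1 in the first step and stays there. No intensity model is supplied in advance; it is re-estimated from the current alignment each time.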
Experiment *
• Samson Timoner PhD Thesis
• Sequential Intra-Operative MRI
• “Samson’s Iteration”
• Linear Elastic Deformation Energy: E(T)
  – p(T) ∝ exp( – E(T))
• Iterated relaxation of deformation energy
  – Equivalent to Viscous fluid **
* S Timoner. Compact Representations for Fast Nonrigid Registration of Medical Images. PhD Thesis, MIT 2003
** X. Papademetris, E. T. Onat, A. J. Sinusas, D. P. Dione, R. T. Constable, and J. S. Duncan. The active elastic model. In Proceedings of IPMI, volume 0558 of LNCS, pages 36–49. Springer, 2001.
Timoner’s Iteration
• Iterate:
  – (Re) estimate model from current configuration
    • Histogram joint intensities
  – Do MAP registration with fixed model
    • Simple objective function: linear in counts
  – Relax energy of current deformation
Signa SP (GE Medical Systems)
R. Pergolizzi
Intra-Operative Image
Provided by Samson Timoner
Second Intra-Operative Image
Provided by Samson Timoner
Second Image Warped to First Image
Provided by Samson Timoner
Outline
• Basic MAP approach to image registration
• Parametric models on image and intensity pairs
• MAP with known model parameters *
• KLD Registration and MI
• MAP with unknown model parameters
  – Prior probabilities on model parameters
  – Joint MAP *
  – Marginalized MAP
    • Weak prior *
    • Informative prior *
    • Strong prior *
  – EM Algorithm to obtain estimates
    • Simple iteration *
    • Experimental Results
  – Groupwise Registration
Groupwise Registration
[Figure: a set of images, each with its own transform T1, T2, T3, …, TN]
Goal: find “central tendency”:
Groupwise Registration
Consider IID in space, kinematic assumption
But, High dimension density estimation is difficult…
Model for Congealing
Independent (not identical) in space
IID across the images
One dimensional density estimation is easier…
Congealing
Minimize total entropy of PDFs at all voxels
• (different) Multinomial Model at each voxel
• Uninformative prior on model parameters
• (no prior on transforms)
• Marginalize out unknown model parameters
\hat{T} \approx \arg\min_T \sum_j H\left[ \hat{p}_j(I \mid T) \right]
\hat{p}_j(I \mid T): probability distribution on intensity at voxel j, estimated from observed intensities given transform T
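The congealing objective can be sketched as a sum of per-voxel entropies over a stack of images. Assumptions for illustration: 1D binary images, circular shifts as the transforms, and made-up data; the function names are hypothetical.

```python
import math
from collections import Counter

def voxel_entropy(values):
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())

def total_entropy(images):
    """Sum over voxels j of the entropy of the intensities observed at j."""
    return sum(voxel_entropy(column) for column in zip(*images))

def shifted(image, shift):
    n = len(image)
    return [image[(x - shift) % n] for x in range(n)]  # circular shift, toy transform

template = [0, 0, 1, 1, 1, 0, 0, 0]
misaligned = [shifted(template, s) for s in (0, 1, 2, -1)]
aligned = [shifted(img, -s) for img, s in zip(misaligned, (0, 1, 2, -1))]
```

When every image is brought back to the common frame, each voxel sees a single intensity across the stack and the total entropy drops to zero; the misaligned stack has a strictly larger value.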
Congealing…
• Erik G. Miller, (Feb., 2002) Ph.D. Thesis: Learning from One Example in Machine Vision by Sharing Probability Densities. MIT EECS.
• Erik Learned-Miller, (2005) Data Driven Image Models through Continuous Joint Alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).
• Lilla Zollei, Erik Learned-Miller, Eric Grimson and William Wells, (2005) Efficient population registration of 3D data. Workshop on Computer Vision for Biomedical Image Applications: Current Techniques and Future Trends, at the International Conference of Computer Vision (ICCV). (Best Paper Award)
127 Adult MRI
Before and After Congealing
Data set: 127 T1w MRI; [256x256x124] with (0.9375, 0.9375, 1.5) mm3 voxels
Experiment: 3 levels; 12-param. affine; N = 800-1600; iter = 250; time = 6hrs
• Balci S, Golland P, Wells W. Non-Rigid groupwise registration using b-spline deformation model. Insight Journal, http://hdl.handle.net/1926/568, 2007.
30 FBIRN Subjects
[Figure panels: Affine; B-Spline]
Joint Congealing Two Infant Populations
• 17 full term
• 22 pre term
• Group analysis of transform params
  – Significant difference in shape
Summary
• Basic MAP approach to image registration
• Parametric models on image and intensity pairs
• MAP with known model parameters
• KLD Registration and MI
• MAP with unknown model parameters
  – Prior probabilities on model parameters
  – Joint MAP
  – Marginalized MAP
    • Weak prior
    • Informative prior
    • Strong prior
  – EM Algorithm to obtain estimates
    • Simple iteration
    • Experimental Results
  – Group-wise Registration
References
• Zollei L, Jenkinson M, Timoner S, Wells W. A Marginalized MAP Approach and EM Optimization for Pair-Wise Registration. IPMI 2007.
• M Leventon, W Grimson, W Wells. Multi-Modal Volume Registration Using Joint Intensity Distributions. MICCAI 98.
• Chung ACS, Wells W, Grimson W, Norbash A. Multi-Modality image registration by minimising Kullback-Leibler distance. MICCAI 2002.
• L. Zöllei. A Unified Information Theoretic Framework for Pair- and Group-wise Registration of Medical Images. Ph.D. thesis, MIT
• A. Roche, G. Malandain, and N Ayache. Unifying maximum likelihood approaches in medical image registration. International Journal of Imaging Systems and Technology, 11(7180):71–80, 2000
• Mert Sabuncu. Entropy-based Methods for Image Registration. PhD Thesis, Princeton 2006.
• L. Zöllei, W.M. Wells III: "Multi-modal Image Registration Using Dirichlet-encoded Prior Information", WBIR06.
• A. P. Dempster, N. M. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1):1–38, 1977.
• S Timoner. Compact Representations for Fast Nonrigid Registration of Medical Images. PhD Thesis, MIT 2003
• X. Papademetris, E. T. Onat, A. J. Sinusas, D. P. Dione, R. T. Constable, and J. S. Duncan. The active elastic model. In Proceedings of IPMI, volume 0558 of LNCS, pages 36–49. Springer, 2001.
• Erik G. Miller, (Feb., 2002) Ph.D. Thesis: Learning from One Example in Machine Vision by Sharing Probability Densities. MIT EECS.
• Erik Learned-Miller, (2005) Data Driven Image Models through Continuous Joint Alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).
• Lilla Zollei, Erik Learned-Miller, Eric Grimson and William Wells, (2005) Efficient population registration of 3D data. Workshop on Computer Vision for Biomedical Image Applications: Current Techniques and Future Trends, at the International Conference of Computer Vision (ICCV). (Best Paper Award)
Incorporating local context in MI based registration: spatial and voxel label information
Frederik Maes, D. Loeckx
K.U. Leuven Dept. of Electrical Engineering (ESAT/PSI)
UZ Gasthuisberg Medical Imaging Research Center
Leuven, Belgium
Intensity-based non-rigid registration
[Figure: reference image with point p1 and template image with point p2, related by p2 = p1 – u(p1)]
• Find ‘realistic’ deformation field that maximizes ‘similarity’ between both images
Regularization
• Not all deformation fields are acceptable:
[Figure panels: Original; Deformed: valid; Deformed: not valid]
Impose constraints using a suitable deformation model
Deformation model
• Implicit regularization: basis functions
  – Global support: polynomial, Gaussian, thin-plate spline
  – Local support: B-spline, radial basis functions
  – Implicitly smooth at small scale, explicit regularization at larger scale
• Explicit regularization:
  – Smoothness penalty: Jacobian, volume preservation, rigidity
  – Physics-inspired PDE: elastic, viscous fluid, diffeomorphism
• (Biomechanical deformation model)
• (Statistical deformation model)
Example: B-spline deformation model
g(r_R; \mu) = \sum_{ijk} \mu_{ijk} \, \beta^2_{\Delta_x}(x_R - k^x_i) \, \beta^2_{\Delta_y}(y_R - k^y_j) \, \beta^2_{\Delta_z}(z_R - k^z_k)
Rueckert 1999 (B-spline), Meyer 1997 (thin plate spline)
Example: viscous fluid deformation model
• u = deformation field, v = deformation velocity
• F = force field, derived from similarity measure
• λ, μ = material parameters (μ=1, λ=0)
\mu \nabla^2 v + (\lambda + \mu) \nabla(\nabla \cdot v) + F(x, u) = 0
v = \frac{du}{dt} = \frac{\partial u}{\partial t} + \sum_{i=1}^{3} v_i \frac{\partial u}{\partial x_i}
Christensen 1998
Model-based non-rigid registration
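Cost function:
C(u) = -C_{similarity}(u, I) + C_{penalty}(u)
Optimization: minimize C with respect to the deformation field u:
\frac{\partial C}{\partial u_i} = \frac{\partial C_{penalty}}{\partial u_i} + \frac{\partial (-C_{similarity})}{\partial u_i} = 0, \quad \forall i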
MI as similarity measure for NRR?
• Global Histogram
  – Multimodal registration:
    • MMI wants to minimise minor in favour of major peaks in the histogram
    • NRR will reduce smaller image details
  – Non-stationary intensity relationship, e.g. bias:
    • NRR will register bias fields, not image features
  Proposed solution: incorporate voxel label information
• Local Histogram
  – Limited number of samples
    • Statistical power?
  Proposed solution: overlapping subregions
D. Loeckx
Local Mutual Information
• Introduce spatial location as an extra variable
– $p(r,f) \rightarrow p(r,f,\mathbf{x})$: allows for spatially varying p(r,f)
– $p(\mathbf{x})$: spatial label, spatial bins = overlapping regions
• How to compute MI between r, f, x?
– Total correlation (Studholme 2006):
$$C(R,F,X) = H(R) + H(F) + H(X) - H(R,F,X)$$
– Conditional MI = 'MI between R and F when X is known':
$$I(R,F|X) = H(R|X) + H(F|X) - H(R,F|X)$$
Mutual information
[Figure: Venn diagram, H(R) + H(F) – H(R,F) = I(R,F)]
$$I(R,F) = H(R) + H(F) - H(R,F)$$
$$H(X) = -\sum_{x} p(x) \log p(x)$$
R: reference image (histogram)
F: floating image (histogram)
H: entropy
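A minimal sketch of I(R,F) = H(R) + H(F) − H(R,F) computed from a joint histogram of counts. The function names and the two toy histograms are illustrative assumptions; entropies are in nats.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (nats) of a probability array, with 0 log 0 taken as 0."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def mutual_information(joint_hist):
    """I(R,F) = H(R) + H(F) - H(R,F) from a joint histogram of counts."""
    p_rf = np.asarray(joint_hist, dtype=float)
    p_rf = p_rf / p_rf.sum()
    p_r = p_rf.sum(axis=1)   # marginal of the reference image
    p_f = p_rf.sum(axis=0)   # marginal of the floating image
    return entropy(p_r) + entropy(p_f) - entropy(p_rf)

aligned = np.array([[50, 0], [0, 50]])          # one-to-one intensity mapping
independent = np.array([[25, 25], [25, 25]])    # no relationship
print(mutual_information(aligned), mutual_information(independent))
```

The one-to-one histogram gives log 2, the largest value possible with two equally likely bins, while the uniform histogram gives zero.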
Conditional Mutual Information
[Figure: Venn diagram of H(R), H(F) and H(X); H(R|X) + H(F|X) – H(R,F|X) = I(R,F|X)]
$$I(R,F|X) = H(R|X) + H(F|X) - H(R,F|X)$$
Locally, if I know p(r), can I better predict p(f)?
$$I(R,F|X) = \sum_{\mathbf{x}} p(\mathbf{x}) \sum_{r} \sum_{f} p(r,f|\mathbf{x}) \log\left(\frac{p(r,f|\mathbf{x})}{p(r|\mathbf{x})\, p(f|\mathbf{x})}\right) = \sum_{\mathbf{x}} p(\mathbf{x})\, I(R,F|X=\mathbf{x})$$
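A minimal sketch of conditional MI as the p(x)-weighted sum of the MI in each spatial bin, I(R,F|X) = Σ_x p(x) I(R,F|X=x). The array layout `h_s[x, r, f]` and the toy data are assumptions for illustration: the intensity relationship is inverted between the two spatial bins, a crude stand-in for a non-stationary intensity relationship.

```python
import numpy as np

def mi_from_counts(h):
    """MI (nats) of a 2D joint histogram of counts."""
    p = h / h.sum()
    p_r = p.sum(axis=1, keepdims=True)
    p_f = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / (p_r * p_f)[nz])))

def conditional_mi(h_s):
    """I(R,F|X) for h_s[x, r, f], one joint histogram per spatial bin x."""
    h_s = np.asarray(h_s, dtype=float)
    p_x = h_s.sum(axis=(1, 2)) / h_s.sum()
    return float(sum(px * mi_from_counts(h) for px, h in zip(p_x, h_s) if px > 0))

h_s = np.array([[[50, 0], [0, 50]],     # spatial bin 0: r and f agree
                [[0, 50], [50, 0]]])    # spatial bin 1: relationship inverted
global_mi = mi_from_counts(h_s.sum(axis=0).astype(float))
local_mi = conditional_mi(h_s)
print(global_mi, local_mi)
```

Pooling both bins into one global histogram hides the relationship (MI of zero), whereas conditioning on the spatial bin recovers log 2.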
Conditional Parzen window
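Joint histogram with Parzen window kernels $w_r$ and $w_f$:
$$H(r,f;\boldsymbol{\mu}) = \sum_{\mathbf{x}_R \in R} w_r\big(I_R(\mathbf{x}_R) - r\big)\, w_f\big(I_F(\mathbf{g}(\mathbf{x}_R;\boldsymbol{\mu})) - f\big)$$
$$p(r,f;\boldsymbol{\mu}) = \frac{H(r,f;\boldsymbol{\mu})}{\sum_{r,f} H(r,f;\boldsymbol{\mu})}$$
$$\frac{\partial H(r,f;\boldsymbol{\mu})}{\partial \mu_{ijk}} = \sum_{\mathbf{x}_R \in R} w_r\big(I_R(\mathbf{x}_R) - r\big)\, \frac{\partial w_f(\xi)}{\partial \xi}\, \frac{\partial \xi}{\partial \mu_{ijk}}, \qquad \xi = I_F(\mathbf{g}(\mathbf{x}_R;\boldsymbol{\mu})) - f$$
Spatially conditional histogram, with an extra spatial kernel $w_{\mathbf{x}}$:
$$H_s(r,f,\mathbf{x};\boldsymbol{\mu}) = \sum_{\mathbf{x}_R \in R} w_r\big(I_R(\mathbf{x}_R) - r\big)\, w_f\big(I_F(\mathbf{g}(\mathbf{x}_R;\boldsymbol{\mu})) - f\big) \cdot w_{\mathbf{x}}(\mathbf{x}_R - \mathbf{x})$$
$$p(r,f|\mathbf{x};\boldsymbol{\mu}) = \frac{p(r,f,\mathbf{x};\boldsymbol{\mu})}{p(\mathbf{x};\boldsymbol{\mu})} = \frac{H_s(r,f,\mathbf{x};\boldsymbol{\mu})}{\sum_{r,f} H_s(r,f,\mathbf{x};\boldsymbol{\mu})}$$
$$p(\mathbf{x};\boldsymbol{\mu}) = \frac{\sum_{r,f} H_s(r,f,\mathbf{x};\boldsymbol{\mu})}{\sum_{r,f,\boldsymbol{\chi}} H_s(r,f,\boldsymbol{\chi};\boldsymbol{\mu})}$$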
Spatial bins
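• Same concept of ‘spatial resolution’
⇒ Use same settings for mesh knots and spacing
⇒ Local transformation guided by local joint histogram
$$\mathbf{g}(\mathbf{x}_R; \boldsymbol{\mu}) = \sum_{ijk} \boldsymbol{\mu}_{ijk}\, \beta^2_{\Delta_x}(x_{R,x} - k_{x,i})\, \beta^2_{\Delta_y}(x_{R,y} - k_{y,j})\, \beta^2_{\Delta_z}(x_{R,z} - k_{z,k})$$
$$w_{\mathbf{x}}(\mathbf{x}_R - \mathbf{x}) = \beta^2_{\Delta_x}(x_{R,x} - x_x)\, \beta^2_{\Delta_y}(x_{R,y} - x_y)\, \beta^2_{\Delta_z}(x_{R,z} - x_z)$$
$$H_s(r,f,\mathbf{x};\boldsymbol{\mu}) = \sum_{\mathbf{x}_R \in R} w_r\big(I_R(\mathbf{x}_R) - r\big)\, w_f\big(I_F(\mathbf{g}(\mathbf{x}_R;\boldsymbol{\mu})) - f\big) \cdot w_{\mathbf{x}}(\mathbf{x}_R - \mathbf{x})$$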
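A minimal 1D sketch of the spatially weighted joint histogram H_s(r,f,x), in which each voxel contributes to overlapping spatial bins with quadratic B-spline weights w_x. As a simplifying assumption, the intensity kernels w_r and w_f are replaced by nearest-bin counting, and the images are already given as integer bin indices; the function names and bin layout are hypothetical.

```python
import numpy as np

def bspline2(t):
    """Quadratic B-spline kernel with support |t| < 1.5."""
    t = np.abs(np.asarray(t, dtype=float))
    out = np.zeros_like(t)
    inner = t <= 0.5
    outer = (t > 0.5) & (t < 1.5)
    out[inner] = 0.75 - t[inner] ** 2
    out[outer] = 0.5 * (1.5 - t[outer]) ** 2
    return out

def spatial_joint_histogram(ref, flo, n_bins, centers, spacing):
    """H_s[x, r, f] for 1D images holding integer intensity-bin indices."""
    pos = np.arange(ref.size, dtype=float)
    w_x = bspline2((pos[None, :] - centers[:, None]) / spacing)  # (bins, voxels)
    h_s = np.zeros((centers.size, n_bins, n_bins))
    for s in range(centers.size):
        np.add.at(h_s[s], (ref, flo), w_x[s])   # weighted vote per voxel
    return h_s

rng = np.random.default_rng(0)
ref = rng.integers(0, 4, size=64)
flo = rng.integers(0, 4, size=64)
centers = np.arange(-1, 6) * 16.0               # spatial bins on a 16-voxel grid
h_s = spatial_joint_histogram(ref, flo, 4, centers, 16.0)

plain = np.zeros((4, 4))
np.add.at(plain, (ref, flo), 1.0)               # ordinary global joint histogram
print(np.allclose(h_s.sum(axis=0), plain))
```

Since the spatial weights of each voxel sum to one, adding the local histograms over all spatial bins reproduces the global joint histogram; the local histograms simply redistribute the same samples over overlapping regions.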
Toy experiment
• 200 2D image pairs
– ‘CT’, ‘MR’
– 256x256 pixels
– I = 0, 200, 400, noise σ = 50
– Mesh spacing 32 voxels
– 32 bins, PW, PV
• Initial transformation
– μ uniform, < 30 pixels
• Validation
– Intensity difference
– Warping index
– ROI: 10% outside polygon
[Figure panels: CT (original); MR (warped); PV, conditional MI; PW, conditional MI; PV, global MI; PW, global MI]
Theoretical foundation
[Figure: images R and F with their joint histograms H. Global MI (whole image): local optimum. Conditional MI (central region): global optimum.]
Clinical CT/MR
• Data
– Radiotherapy
  • Colorectal cancer
  • Delineations
– 3 CT/MR pairs
– Manual delineations
  • rectum, mesorectum
• Settings
– PW, multiresolution, 64 bins
• Validation
– Dice similarity (DSC)
– Centroid distance (cD)
[Figure: MR and CT with delineations. Black: original (ground truth); White 1: global MI; White 2: conditional MI]
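A minimal sketch of the two validation measures named on this slide, the Dice similarity coefficient (DSC) and the centroid distance (cD), for binary delineation masks. The function names, the `voxel_size` parameter and the toy masks are assumptions for illustration.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient: 2 |A and B| / (|A| + |B|)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def centroid_distance(a, b, voxel_size=1.0):
    """Euclidean distance between the centroids of two binary masks."""
    ca = np.argwhere(a).mean(axis=0) * voxel_size
    cb = np.argwhere(b).mean(axis=0) * voxel_size
    return float(np.linalg.norm(ca - cb))

truth = np.zeros((8, 8), dtype=bool)
truth[2:6, 2:6] = True                 # manual delineation (ground truth)
warped = np.zeros((8, 8), dtype=bool)
warped[2:6, 3:7] = True                # same square, shifted by one pixel
print(dice(truth, warped), centroid_distance(truth, warped))
```

For the one-pixel shift, 12 of the 16 pixels overlap, so the DSC is 0.75 and the centroid distance is 1 pixel.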
Incorporating voxel label information
E. D’Agostino
A viscous fluid model for NRR
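• u = deformation field, v = deformation velocity
• F = force field, derived from similarity measure
• λ, μ = material parameters (μ=1, λ=0)
$$\mu \nabla^2 \mathbf{v} + (\lambda + \mu)\, \nabla(\nabla \cdot \mathbf{v}) + \mathbf{F}(\mathbf{x}, \mathbf{u}) = 0$$
$$\mathbf{v} = \frac{d\mathbf{u}}{dt} = \frac{\partial \mathbf{u}}{\partial t} + \sum_{i=1}^{3} v_i \frac{\partial \mathbf{u}}{\partial x_i}$$
Christensen 1998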
Approximate solution
• Solve for v by spatial convolution of F with Gaussian kernel
• Integrate over time to solve for u
• Recompute F(x,u) and iterate until convergence
• Regridding to enforce Jacobian of u to be positive
• Main parameter: σ (width of spatial smoothing kernel)
$$\mathbf{F} = \nabla(-MI) \;\Rightarrow\; \mathbf{v} = \psi_\sigma *_{space} \mathbf{F}$$
$$\mathbf{R}^{(k)} = \mathbf{v}^{(k)} - \sum_{i=1}^{3} v_i^{(k)} \frac{\partial \mathbf{u}^{(k)}}{\partial x_i} \;\Rightarrow\; \mathbf{u}^{(k+1)} = \mathbf{u}^{(k)} + \mathbf{R}^{(k)} \Delta t$$
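A minimal 1D sketch of one iteration of the approximate solution: the velocity v is obtained by convolving the force F with a Gaussian kernel, and the deformation is advanced with u ← u + R Δt, where R = v − v ∂u/∂x. The point force used here is a hypothetical stand-in for a force derived from the similarity measure, and regridding is not included.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at 3 sigma."""
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()

def fluid_step(u, force, sigma, dt):
    """One iteration: v = G_sigma * F, R = v - v * du/dx, u <- u + R * dt."""
    v = np.convolve(force, gaussian_kernel(sigma), mode="same")
    r = v - v * np.gradient(u)
    return u + dt * r

force = np.zeros(64)
force[32] = 1.0                                   # hypothetical point force
u = fluid_step(np.zeros(64), force, sigma=2.0, dt=0.5)
print(int(np.argmax(u)), u.sum())
```

The kernel width σ controls how far a local force spreads into the deformation field, which is why the slide calls it the main parameter.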
A force field for NRR using MMI
$$\mathbf{F} = \nabla(-MI) \;\Rightarrow\; \mathbf{v} = \psi_\sigma *_{space} \mathbf{F}$$
(ψh = Parzen window kernel)
Hermosillo 2002
D’Agostino 2003
Motivation for label information
[Figure panels: Patient image; Atlas; Deformed atlas; Lesions]
Intensity-based matching is confused when intensities are ambiguous
Need more specific information
Introducing label information
Model-based tissue classification
From iso-intensity objects... ... to more relevant anatomical objects
Van Leemput 1999
Matching class labels instead of intensities
[Figure: template image class labels; reference image class labels]
Assume segmentations are available for both images, as probabilistic tissue maps
i, j(i,T) = indices of corresponding voxels, T = deformation field
k = class label, e.g. WM, GM, CSF, OTHER
cik, cjk = class probabilities
Actual correspondence of class labels
= fuzzy overlap of class labels in images R and T (voxel-wise averaged)
Assessing correspondence of class labels
= mutual information of labels k1 in image R with labels k2 in image T
But:
this does not exploit the fact that correspondence of labels k1 and k2 is known
Ideal correspondence of class labels
= fuzzy overlap of class labels in images R and T (voxel-wise averaged), assuming T ideally to be identical to R
Assessing correspondence of class labels
= Kullback-Leibler divergence between actual and ideal overlap of labels k1 in image R with labels k2 in image T
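A minimal sketch of the class-class measure described in words on these slides. The slide equations did not survive extraction, so the definitions below are assumptions: the actual overlap is taken as the voxel-wise average of c_R[i,k1]·c_T[j(i),k2], the ideal overlap as the same expression with T replaced by R, and the divergence direction as D(actual ‖ ideal). All names and data are hypothetical.

```python
import numpy as np

def label_overlap(c_a, c_b):
    """Fuzzy overlap p(k1,k2): voxel-wise average of c_a[i,k1] * c_b[i,k2]."""
    return c_a.T @ c_b / c_a.shape[0]

def kl_divergence(p, q, eps=1e-12):
    """D(p || q) = sum p log(p / q), with 0 log 0 taken as 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    nz = p > 0
    return float(np.sum(p[nz] * np.log(p[nz] / np.maximum(q[nz], eps))))

rng = np.random.default_rng(0)
c_ref = rng.dirichlet(np.ones(4), size=500)   # probabilistic maps: WM, GM, CSF, OTHER
c_tmpl = np.roll(c_ref, 7, axis=0)            # misaligned template labels (assumed)

p_ideal = label_overlap(c_ref, c_ref)
d_aligned = kl_divergence(label_overlap(c_ref, c_ref), p_ideal)
d_shifted = kl_divergence(label_overlap(c_ref, c_tmpl), p_ideal)
print(d_aligned, d_shifted)
```

Under these assumptions the divergence is zero when the template labels coincide with the reference labels and positive otherwise, which is the behaviour a registration criterion of this kind needs.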
Force field for class-class matching
Partial volume interpolation: wi,jn(i) = PV weights
Force field for class-class matching
D = 0.038 D = 0.00022
Matching class labels to intensities
[Figure: reference image intensities; template image class labels]
Assume segmentations are available for only one of the images, as probabilistic tissue maps
i, j(i,T) = indices of corresponding voxels, T = deformation field
k = class label, e.g. WM, GM, CSF, OTHER
cjk = class probability
r = intensity value
Class-to-intensity matching
= minimizes the conditional entropy of IR given the labels CT
If a Gaussian mixture model is adopted for IR, this is equivalent to minimizing the intensity variance within each class
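A minimal sketch of the link between conditional entropy and within-class variance: with one Gaussian per class, H(I_R | C_T) = Σ_k π_k · ½ log(2πe σ_k²), so lowering the class variances lowers the entropy. The function name, the one-hot label maps and the toy intensities are assumptions for illustration.

```python
import numpy as np

def class_conditional_entropy(intensity, class_prob):
    """H(I|C) with one Gaussian per class; class_prob[i, k] are soft labels."""
    w = class_prob.sum(axis=0)                       # soft voxel count per class
    pi = w / w.sum()                                 # class weights
    mean = (class_prob * intensity[:, None]).sum(axis=0) / w
    var = (class_prob * (intensity[:, None] - mean) ** 2).sum(axis=0) / w
    h = float(np.sum(pi * 0.5 * np.log(2 * np.pi * np.e * var)))
    return h, var

intensity = np.array([0.0, 1.0, 0.0, 1.0, 10.0, 11.0, 10.0, 11.0])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
aligned = np.eye(2)[labels]                          # labels match the intensities
shifted = np.eye(2)[np.roll(labels, 2)]              # labels displaced by 2 voxels

h_aligned, var_aligned = class_conditional_entropy(intensity, aligned)
h_shifted, var_shifted = class_conditional_entropy(intensity, shifted)
print(var_aligned, var_shifted, h_aligned < h_shifted)
```

With aligned labels each class has variance 0.25; displacing the labels mixes the two intensity populations, raising each class variance to 25.25 and the conditional entropy with it.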
Joint class-intensity histogram
Partial volume interpolation:
[Figure: a reference voxel i with intensity r falls between four template voxels j, each with class probabilities (pWM, pGM, pCSF, pO) and PV weights w1 to w4; the histogram entries for WM, GM, CSF and Other at intensity r are incremented by wWM, wGM, wCSF and wO]
wk = Σj wj pkj
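A minimal sketch of the partial volume update w_k = Σ_j w_j p_kj for one reference voxel: the PV weights of the neighbouring template voxels are distributed over the classes according to their class probabilities and added to the histogram column of the reference intensity. The array layout, the function name and the numbers are assumptions for illustration.

```python
import numpy as np

def pv_class_update(hist, r_bin, pv_weights, class_probs):
    """Add one reference voxel to the joint class-intensity histogram.

    hist[k, r]: class k versus intensity bin r.
    pv_weights[j]: PV weight of neighbouring template voxel j.
    class_probs[j, k]: probability of class k in template voxel j.
    """
    w_k = pv_weights @ class_probs        # w_k = sum_j w_j * p_kj
    hist[:, r_bin] += w_k
    return w_k

hist = np.zeros((4, 8))                   # WM, GM, CSF, Other x 8 intensity bins
w = np.array([0.4, 0.3, 0.2, 0.1])        # hypothetical PV weights (sum to 1)
p = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.5, 0.5, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]])
w_k = pv_class_update(hist, 3, w, p)
print(w_k)
```

Each reference voxel adds a total mass of one to the histogram, split here as 0.55, 0.35, 0.05 and 0.05 over WM, GM, CSF and Other, and no interpolated intensity or image gradient is needed.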
Force field for class-intensity matching
(PV interpolation)
Summarizing
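II (intensity-to-intensity matching): $\mathbf{F} = \nabla(-MI(I_T, I_R))$
• Use of Parzen estimator
• Continuous histogram model
• Dependent on image gradient
IC (class-to-intensity matching): $\mathbf{F} = \nabla(-MI(C_T, I_R))$
CC (class-to-class matching): $\mathbf{F} = \nabla D(p_{ideal}, p_{real})$
• Use of PV interpolation
• Discrete histogram model
• No image gradient!
Viscous fluid regularizer:
$$\mu \Delta \mathbf{v} + (\lambda + \mu)\, \nabla(\nabla \cdot \mathbf{v}) = \mathbf{F}(\mathbf{x}, \mathbf{u})$$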
Experiment 1: Atlas to study image matching
[Figure: template image (Brainweb atlas); n reference images (10 normal brains)]
Preprocessing: rigid registration to the atlas + skull-stripping
Resolution: [91 109 91] voxels, voxel size: [2 2 2] mm
Validation: overlap of WM/GM/CSF after registration
NRR using II, IC, CC
Experiment 1: Atlas to study image matching
Overlap values   WM     GM     CSF
II               76.4   73.2   41.5
IC               78.9   78.9   55.2
CC               81.5   82.8   63.8
Experiment 2: Recovering simulated deformations
[Figure: template image (Brainweb atlas); reference image (artificially deformed atlas)]
Resolution: [91 109 91] voxels, voxel size: [2 2 2] mm
Validation: RMSE, overlap of WM/GM/CSF after registration
NRR using II, IC, CC
Experiment 2: Recovering simulated deformations
         II     IC     CC
RMSE     0.94   0.68   0.37

Overlap  WM     GM     CSF
II       86.3   81.9   56.1
IC       87.1   86.9   74.7
CC       93.8   93.7   87.5
Experiment 3: Inter-subject registration
[Figure: template image (subject 1); reference image (subject 2)]
NRR using II, IC, CC
Resolution: [91 109 91] voxels, voxel size: [2 2 2] mm
Validation: overlap of WM/GM/CSF after registration
Experiment 3: Inter-subject registration
Overlap  WM     GM     CSF
II       77.7   78.4   48.1
IC       79.1   79.9   53.8
CC       81.9   83.8   64.0
Conclusion: incorporating label information improves registration result, especially when label information is available for both images (CC)
But: requires segmentation …
Combined segmentation and registration
PROBLEM:
• Atlas-guided tissue class segmentation requires atlas registration…
• Tissue class-guided registration requires tissue segmentation…
SOLUTION:
• Merge atlas registration and tissue segmentation in a single algorithm
RELATED WORK: Wyatt 2003, Chen MICCAI 2004, Ashburner 2005, Pohl 2006, D’Agostino WBIR 2006
Joint segmentation and registration
Model-based tissue classification using the EM algorithm:
Parameters: $\theta = \{\mu_k, \sigma_k, k = 1, \ldots, K, C\}$
[Equation labels: posterior; atlas prior (fixed); Gaussian mixture model (with bias field correction)]
Van Leemput 1999
Joint segmentation and registration
Extended EM algorithm (EEM): include registration
Parameters: $\theta = \{\mu_k, \sigma_k, k = 1, \ldots, K, C, T\}$
[Equation labels: posterior; atlas prior (deformable); Gaussian mixture model (with bias field correction)]
Joint segmentation and registration
E-step:
M-step (EM + EEM):
Joint segmentation and registration
• M-step (EEM):
(trilinear interpolation of the atlas)
Joint segmentation and registration: EEM
This defines a force field in each voxel
Optimize T in each iteration of the EM algorithm using (a few iterations of) the viscous fluid regularizer NRR
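A minimal 1D sketch of the plain EM tissue classifier that the extended algorithm builds on, assuming a fixed atlas prior and one Gaussian per class. The slide equations are not available in this transcript, so the update rules below are the standard Gaussian mixture ones; bias field correction and the deformation update of the EEM M-step are deliberately left out, and all names and data are hypothetical.

```python
import numpy as np

def em_tissue(y, prior, mu, sigma, n_iter=20):
    """EM for a Gaussian mixture with a voxel-wise atlas prior prior[i, k]."""
    for _ in range(n_iter):
        # E-step: posterior proportional to atlas prior times Gaussian likelihood
        lik = np.exp(-0.5 * ((y[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        post = prior * lik
        post /= post.sum(axis=1, keepdims=True)
        # M-step: weighted mean and standard deviation per class
        w = post.sum(axis=0)
        mu = (post * y[:, None]).sum(axis=0) / w
        sigma = np.sqrt((post * (y[:, None] - mu) ** 2).sum(axis=0) / w)
    return post, mu, sigma

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(10.0, 1.0, 200)])
prior = np.full((y.size, 2), 0.5)                 # flat prior (assumed)
post, mu, sigma = em_tissue(y, prior, np.array([2.0, 8.0]), np.array([3.0, 3.0]))
print(mu, sigma)
```

In the extended scheme described on the slides, the prior would no longer be fixed: after each EM iteration the atlas is deformed by a few iterations of the viscous fluid registration, so that `prior` changes along with the class parameters.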
Example: EM vs EEM
EM (affine) EEM (non-rigid)
References
• J. Ashburner and K.J. Friston. Unified segmentation. NeuroImage, 26:839-851, 2005
• X. Chen, M. Brady and D. Rueckert. Simultaneous segmentation and registration for medical image. Medical Image Computing and Computer-Assisted Intervention (MICCAI'04), volume 3216 of Lecture Notes in Computer Science, pages 663-670, Saint-Malo, France, September 2004. Springer-Verlag, Berlin.
• Christensen, G., Rabbitt, R., Miller, M., 1996b. Deformable templates using large deformation kinetics. IEEE Transactions on Image Processing 5 (10), 1435–1447.
• E. D'Agostino, F. Maes, D. Vandermeulen, and P. Suetens. A viscous fluid model for multimodal non-rigid image registration using mutual information. Medical Image Analysis, 7(4):565-575, 2003.
• E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, Non-rigid atlas-to-image registration by minimization of class-conditional image entropy, Lecture notes in computer science, vol. 3216, pp. 745-753, 2004 (MICCAI 2004)
• E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, An information theoretic approach for non-rigid image registration using voxel class probabilities, Medical image analysis, vol. 10, no. 3, pp. 413-431, 2006
• E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, A unified framework for atlas based brain image segmentation and registration, Lecture notes in computer science, vol. 4057, pp. 136-143, 2006 (WBIR 2006)
• G. Hermosillo, C. Chef d'Hotel, and O. Faugeras. Variational methods for multimodal image matching. International Journal of Computer Vision, 50(3):329-343, 2002.
• D. Loeckx, P. Slagmolen, F. Maes, D. Vandermeulen, P. Suetens, Nonrigid image registration using conditional mutual information, LNCS, vol. 4584, pp. 725-737, 2007 (IPMI 2007)
• D. Loeckx, P. Slagmolen, F. Maes, D. Vandermeulen, P. Suetens, Nonrigid image registration using conditional mutual information, IEEE transactions on medical imaging, 2009 (in press)
• Meyer C.R., J.L. Boes, B. Kim, P.H. Bland, et al.: Demonstration of accuracy and clinical versatility of mutual information for automatic multimodality image fusion using affine and thin plate spline warped geometric deformations. Medical Image Analysis 1(3):195-206, 1997.
• Pohl KA, Fisher J, Grimson WEL, et al., A Bayesian model for joint segmentation and registration, NEUROIMAGE, 31(1):228-239, 2006
• D. Rueckert, L.I. Sonoda, C. Hayes, D. Hill, M.O. Leach, and D.J. Hawkes. Nonrigid registration using free-form deformations: application to breast MR images. IEEE Transactions on Medical Imaging, 18(8):712-721, 1999.
• Studholme C, Drapaca C, Iordanova B, et al., Deformation-based mapping of volume change from serial brain MRI in the presence of local tissue contrast change, IEEE TRANSACTIONS ON MEDICAL IMAGING, 25(5):626-639, 2006
• K. Van Leemput, F. Maes, D. Vandermeulen, and P. Suetens. Automated model-based tissue classification of MR images of the brain. IEEE Trans. Med. Img., 18(10):897-908, 1999.
• Paul P. Wyatt and J. Alison Noble. MAP MRF joint segmentation and registration of medical images. Medical Image Analysis, 7(4):539-552, 2003
Incorporating spatial information in mutual information-based registration
Josien Pluim
Image Sciences Institute, University Medical Center Utrecht, The Netherlands
Outline
• Brief overview of suggested methods
• α-Entropy and entropic graphs
• An example application: multifeature mutual information for registration of cervical MR images
Lack of spatial information
In theory, the mutual information of two images does not take spatial information into account.
In practice, it plays a very slight role, through interpolation and through blurring in multiscale optimization.
However, explicitly incorporating spatial information may lead to much better registration results.
Spatial information: labelling
• Studholme 1996
• Knops 2004
• D’Agostino 2006
Combining MI with a labelling of (one of) the images.
For instance, Studholme suggests
$$MI(I, J, L_J) = H(I) + H(J, L_J) - H(I, J, L_J)$$
Spatial information: local intensity
• Rueckert 2000
Higher-order mutual information:
Use probabilities of neighbouring intensities (i,j) in images (co-occurrence matrix).
Joint entropy requires 4D histogram.
$$H_2(X) = -\sum_{i} \sum_{j} p(i,j) \log p(i,j)$$
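A minimal sketch of the second-order entropy H_2(X) = −Σ_i Σ_j p(i,j) log p(i,j) of a single image, estimated from a co-occurrence matrix of neighbouring intensities. Pairing each pixel with its right-hand neighbour is an assumption made for illustration; the function name and test images are hypothetical.

```python
import numpy as np

def cooccurrence_entropy(img, n_bins):
    """H_2 (nats) over horizontally adjacent pixel pairs of an integer image."""
    left = img[:, :-1].ravel()
    right = img[:, 1:].ravel()
    counts = np.zeros((n_bins, n_bins))
    np.add.at(counts, (left, right), 1.0)      # co-occurrence matrix
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

flat = np.zeros((4, 4), dtype=int)             # constant image
checker = np.indices((4, 4)).sum(axis=0) % 2   # checkerboard of 0s and 1s
print(cooccurrence_entropy(flat, 2), cooccurrence_entropy(checker, 2))
```

For two images the same construction pairs neighbouring intensities in both, which is why the joint entropy needs a 4D histogram.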
Spatial information: local intensity
• Bardera 2006
Extends Rueckert’s idea to groups of 3 neighbours, positioned on random lines through the image volume.
• Russakoff 2004
Extends Rueckert’s idea to 3x3 neighbourhood. Vector per pixel.
Image courtesy: Russakoff
Spatial information: local structure
• Pluim 2000
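Combination of gradient and intensity information.
$$G(I,J) = \sum_{(x,\,Tx) \in (I \cap J)} w\big(\alpha_{x,Tx}(\sigma)\big)\, \min\big(|\nabla I_x(\sigma)|,\, |\nabla J_{Tx}(\sigma)|\big)$$
$$MI_{new}(I,J) = G(I,J)\; MI(I,J)$$
[Figure: plot of the weighting function w, with axis ticks at 1 and π]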
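A minimal 2D sketch of the gradient term G(I,J), which sums, over corresponding pixels, an angle weight times the smaller of the two gradient magnitudes. Three assumptions are made here: the weighting function is w(α) = (cos 2α + 1)/2, the transformation T is the identity, and gradients are plain finite differences rather than derivatives at a scale σ. Names and test images are hypothetical.

```python
import numpy as np

def gradient_term(img_i, img_j):
    """G(I,J) = sum of w(angle) * min(|grad I|, |grad J|) over all pixels."""
    gi = np.stack(np.gradient(img_i.astype(float)))
    gj = np.stack(np.gradient(img_j.astype(float)))
    mag_i = np.sqrt((gi ** 2).sum(axis=0))
    mag_j = np.sqrt((gj ** 2).sum(axis=0))
    valid = (mag_i > 0) & (mag_j > 0)              # angle undefined elsewhere
    cos_a = (gi * gj).sum(axis=0)[valid] / (mag_i[valid] * mag_j[valid])
    alpha = np.arccos(np.clip(cos_a, -1.0, 1.0))
    w = (np.cos(2.0 * alpha) + 1.0) / 2.0          # assumed weighting function
    return float(np.sum(w * np.minimum(mag_i[valid], mag_j[valid])))

ramp = np.tile(np.arange(8.0), (8, 1))             # intensity increases left to right
print(gradient_term(ramp, ramp),
      gradient_term(ramp, -ramp),
      gradient_term(ramp, ramp.T))
```

Parallel and anti-parallel gradients both receive full weight, which suits multimodal images with inverted contrast, while perpendicular gradients contribute nothing.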
Spatial information: local structure
• Holden 2004
Adds first or second derivative information, similarly to Studholme.
$$MI(I, J, I', J') = H(I, I') + H(J, J') - H(I, I', J, J')$$
• Gan 2008
‘Maximum distance-gradient’ of a voxel is the maximum of the gradients to all other voxels.
Spatial information: local structure
• Tomaževič 2004
Defines a vector of k features for each voxel, with intensity as first feature and the gradient at a single scale as second feature.
• Rodríguez 1998
Jumarie entropy: entropy on partial derivatives
Spatial information: local structure
• Luan 2008
‘Quantitative-qualitative MI’.
Include utility of voxels (regional saliency; local entropy).
$$QMI(I,J) = \sum u(I_x, J_{Tx})\, p(I_x, J_{Tx}) \log \frac{p(I_x, J_{Tx})}{p(I_x)\, p(J_{Tx})}$$
Computing higher-dimensional MI
Problem: so-called ‘Curse of dimensionality’.
Higher-dimensional histogram becomes too sparsely populated for reliable estimation of densities.
A 4D histogram is feasible with a small number of bins.
Computing higher-dimensional MI
Assumption of normal distribution of densities.
Entropy of a normally distributed set of points in $R^d$ with covariance matrix $\Sigma_d$:
$$H(\Sigma_d) = \log\left((2\pi e)^{d/2} \det(\Sigma_d)^{1/2}\right)$$
• Tomaževič 2004
• Russakoff 2004
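A minimal sketch of the Gaussian shortcut: the entropy of a normal distribution depends only on its covariance matrix, H = log((2πe)^{d/2} det(Σ)^{1/2}), so MI between two groups of features follows from three determinants instead of a high-dimensional histogram. The function names and the example covariance are assumptions for illustration.

```python
import numpy as np

def gaussian_entropy(cov):
    """H = 0.5 * (d * log(2 pi e) + log det(cov)), in nats."""
    cov = np.atleast_2d(cov)
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)

def gaussian_mi(cov, d_a):
    """MI between the first d_a dimensions and the rest, for a joint normal."""
    return (gaussian_entropy(cov[:d_a, :d_a])
            + gaussian_entropy(cov[d_a:, d_a:])
            - gaussian_entropy(cov))

rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])       # two features with correlation 0.6
print(gaussian_mi(cov, 1), gaussian_mi(np.eye(2), 1))
```

For two scalar features the result reduces to −½ log(1 − ρ²), and it is zero when the features are uncorrelated. In practice the covariance would be estimated from the feature vectors of the overlapping image region.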
Computing higher-dimensional MI
• Zhang 2005
Computation of entropy of M-dimensional distribution in O(N), with N the number of samples.
For each sample i, define $C_i = B_i^1 B_i^2 \ldots B_i^M$, with $B_i^j$ the bin value of sample i in dimension j.
Then compute the entropy of the set of $C_i$.
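A minimal sketch of the idea of counting only occupied bins: each sample is reduced to its tuple of bin values C_i = (B_i^1, …, B_i^M), and the entropy is computed from the counts of distinct tuples, so the cost grows with the number of samples N rather than with the number of bins. Using a hash map for the counting is an assumption made here, not necessarily the procedure of the cited paper.

```python
from collections import Counter

import numpy as np

def sparse_entropy(bins):
    """Entropy (nats) of an M-dimensional histogram, visiting occupied bins only.

    bins: (N, M) integer array, bins[i, j] = bin value of sample i in dimension j.
    """
    counts = Counter(map(tuple, np.asarray(bins).tolist()))
    n = sum(counts.values())
    p = np.array(list(counts.values()), dtype=float) / n
    return float(-np.sum(p * np.log(p)))

samples = np.array([[0, 0, 0],
                    [0, 0, 0],
                    [1, 2, 3],
                    [1, 2, 3]])
print(sparse_entropy(samples))
```

The four samples occupy two distinct bins with equal frequency, so the entropy is log 2, without ever allocating the full 3D histogram.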
α-Entropy measures and entropic graphs
References: Hero 2002, Neemuchwala 2005, 2007
α-Entropy
α-Entropy:
$$H_\alpha(f) = \frac{1}{1-\alpha} \log \int f^\alpha(z)\, dz$$
Can be estimated with entropic graphs.
α-Entropy through entropic graphs
Given a set of vectors $Z_n = \{z_1, \ldots, z_n\}$ in d-dimensional feature space.
Construct a minimal graph through $z_i$.
$L(Z_n)$ = total edge length of the graph.
Log of normalized $L(Z_n)$ converges to α-entropy as n → ∞.
$$\lim_{n \to \infty} (1-\alpha)^{-1} \log\left(\frac{L(Z_n)}{n^\alpha}\right) = H_\alpha(f) + c$$
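A minimal sketch of an entropic graph estimate using a minimum spanning tree, assuming Euclidean edge lengths raised to the power γ = d(1 − α) and Prim's algorithm in O(n²). The estimate (1 − α)⁻¹ log(L / n^α) equals the α-entropy only up to the constant c. Function names and point sets are hypothetical.

```python
import numpy as np

def mst_length(points, gamma=1.0):
    """Sum of (edge length ** gamma) over the Euclidean minimum spanning tree."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    dist = np.linalg.norm(pts - pts[0], axis=1)    # distance of each point to the tree
    total = 0.0
    for _ in range(n - 1):
        masked = np.where(in_tree, np.inf, dist)
        j = int(np.argmin(masked))                 # closest point outside the tree
        total += masked[j] ** gamma
        in_tree[j] = True
        dist = np.minimum(dist, np.linalg.norm(pts - pts[j], axis=1))
    return total

def alpha_entropy_estimate(points, alpha):
    """(1 - alpha)^-1 * log(L / n^alpha); H_alpha up to an additive constant."""
    pts = np.asarray(points, dtype=float)
    n, d = pts.shape
    length = mst_length(pts, gamma=d * (1.0 - alpha))
    return float(np.log(length / n ** alpha) / (1.0 - alpha))

rng = np.random.default_rng(0)
uniform = rng.uniform(0.0, 1.0, size=(300, 2))
clustered = rng.normal(0.5, 0.05, size=(300, 2))
h_uniform = alpha_entropy_estimate(uniform, alpha=0.5)
h_clustered = alpha_entropy_estimate(clustered, alpha=0.5)
print(h_uniform > h_clustered)
```

Spread-out points need a longer tree than tightly clustered ones, so the uniform sample gets the higher entropy estimate, and no density estimate or histogram is required.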
Example
Uniformly distributed points: high graph length, high entropy
Image courtesy: Neemuchwala
Example
Normally distributed points: lower graph length, lower entropy
Image courtesy: Neemuchwala
Convergence
[Figure panels: total graph length; total normalized graph length]
Image courtesy: Neemuchwala
Multifeature MI for registration of cervical MRI
Reference: Staring 2009
Registration of cervical MRI
Challenges:
• Intensity inhomogeneity
• Anatomical variation (e.g. bladder filling, deformation)
• Highly anisotropic resolution
Features
Features describing local structure.
Cartesian image structure invariants, at σ=1 and σ=2.
Plus intensity.
Total: d = 15 features.
Invariants (Einstein notation): L, LiLi, LiLijLj, LiLijLjkLk, Lii, LijLji, LijLjkLki
Features: examples
[Figure panels: L; LiLi; LiLijLj; Lii; LijLji; LijLjkLki; LiLijLjkLk]
Computing multifeature MI
We have a feature vector z(xi) for every point xi. Define
$z^r(x_i)$: features in reference image
$z^m(T_\mu(x_i))$: features in transformed moving image
$z^{rm} = \{z^r(x_i),\, z^m(T_\mu(x_i))\}$: joint features
Computing multifeature MI
Entropy estimation using kNN graph.
Length function is distance to k neighbours:
$$L_i^r = \sum_{p=1}^{k} \left\| z^r(x_i) - z^r(x_{ip}) \right\|$$
Similarly for $L_i^m$ and $L_i^{rm}$
Computing multifeature MI
α-MI for reference image R and moving image M is
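$$\alpha MI(\mu) = \frac{1}{\alpha - 1} \log\left(\frac{1}{n^\alpha} \sum_{i=1}^{n} \left(\frac{L_i^{rm}(\mu)}{\sqrt{L_i^r\, L_i^m(\mu)}}\right)^{2\gamma}\right)$$
μ : deformation
α : user-defined
γ : d(1-α)
n : number of samples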
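A minimal sketch of the kNN-based α-MI estimate, (α − 1)⁻¹ log(n^(−α) Σ_i (L_i^rm / √(L_i^r L_i^m))^(2γ)), with brute-force neighbour search. Taking d as the dimension of one feature vector and using random features in place of image features are assumptions for illustration; a practical implementation would use an approximate nearest-neighbour structure.

```python
import numpy as np

def knn_length(z, k):
    """L_i = summed distance from sample i to its k nearest neighbours."""
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)               # exclude the sample itself
    return np.sort(dist, axis=1)[:, :k].sum(axis=1)

def alpha_mi(z_r, z_m, alpha=0.99, k=5):
    """kNN estimate of alpha-MI between feature sets z_r and z_m, each (n, d)."""
    n, d = z_r.shape
    gamma = d * (1.0 - alpha)
    l_r = knn_length(z_r, k)
    l_m = knn_length(z_m, k)
    l_rm = knn_length(np.hstack([z_r, z_m]), k)  # joint feature space
    ratio = l_rm / np.sqrt(l_r * l_m)
    return float(np.log(np.sum(ratio ** (2.0 * gamma)) / n ** alpha) / (alpha - 1.0))

rng = np.random.default_rng(0)
z = rng.normal(size=(200, 3))                    # hypothetical reference features
z_other = rng.normal(size=(200, 3))              # unrelated moving features
dependent = alpha_mi(z, z)
independent = alpha_mi(z, z_other)
print(dependent > independent)
```

When the moving features depend on the reference features, joint neighbours stay close relative to the marginal neighbours, the ratio shrinks and the estimate increases. The value is only meaningful for comparing transformations at fixed n, k and α, and duplicate feature vectors (zero neighbour distances) would need special handling.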
Computing multifeature MI
α = 0.99
n = 5000
k = 5
PCA on features to reduce computation time. First 6 PCs a good trade-off.
Experiments
Data: follow-up MR T2 for radiotherapy treatment. 36 image pairs of 19 patients.
Evaluation: manual delineations of CTV (clinical target volume), bladder and rectum. Deformed delineations compared to manual ones.
MI vs α-MI: Dice overlap
MI vs α-MI: distance error
Distances between segmentation surfaces.
Mollweide projection, like map of the globe.
[Figure: Mollweide map of surface distance errors, with orientation labels uterus, anus, left, rectum, right, bladder]
[Figure panels: MI and α-MI; CTV (median error) and bladder (median error)]
Conclusions
• Multifeature α-MI outperforms standard MI
• Better overlap, smaller distance errors
• Downside is computation time: 28 vs 1 minute.
References
1. E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, An information theoretic approach for non-rigid image registration using voxel class probabilities, Med. Image Anal. 10(3):413-431, 2006
2. A. Bardera, M. Feixas, I. Boada, M. Sbert, High-dimensional normalized mutual information for image registration using random lines, WBIR, LNCS 4057:264-271, Springer, 2006
3. R. Gan, A.C.S. Chung, S. Liao, Maximum distance-gradient for robust image registration, Med. Image Anal. 12(4):452-468, 2008
4. A.O. Hero, B. Ma, O. Michel, J. Gorman, Applications of entropic spanning graphs, IEEE Signal Proc. Magazine, 19(5):85-95, 2002
5. M. Holden, L.D. Griffin, N. Saeed, D.L.G. Hill, Multi-channel mutual information using scale space, MICCAI, LNCS 3216:797-804, Springer, 2004
6. Z.F. Knops, J.B.A. Maintz, M.A. Viergever, J.P.W. Pluim, Registration using segment intensity remapping and mutual information, MICCAI, LNCS 3216:805-812, Springer, 2004
7. H. Luan, F. Qi, Z. Xue, L. Chen, D. Shen, Multimodality image registration by maximization of quantitative-qualitative measure of mutual information, Pattern Recognit. 41(1):285-298, 2008
8. H. Neemuchwala, A. Hero, P. Carson, Image matching using alpha-entropy measures and entropic graphs, Signal Process. 85(2):277-296, 2005
9. H. Neemuchwala, A. Hero, S. Zabuawala, P. Carson, Image registration methods in high dimensional space, Int. J. Imaging Syst. Technol., 16(5):130-145, 2007
10. J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Image registration by maximization of combined mutual information and gradient information, IEEE Trans. Med. Imaging 19(8):809-814, 2000
11. C.E. Rodríguez-Carranza and M.H. Loew, A weighted and deterministic entropy measure for image registration using mutual information, SPIE Medical Imaging: Image Processing, Proc SPIE 3338:155-166, 1998
12. D. Rueckert, M.J. Clarkson, D.L.G. Hill, D.J. Hawkes, Non-rigid registration using higher-order mutual information, SPIE Medical Imaging: Image Processing, Proc. SPIE 3979:438-447, 2000
13. D.B. Russakoff, C. Tomasi, T. Rohlfing, C.R. Maurer, Jr., Image similarity using mutual information of regions, ECCV, LNCS 3023:596-607, Springer, 2004
14. M. Staring, U.A. van der Heide, S. Klein, M.A. Viergever, J.P.W. Pluim, Registration of cervical MRI using multifeature mutual information, IEEE Trans. Med. Imaging 28(9):1412-1421, 2009
15. C. Studholme, D.L.G. Hill, D.J. Hawkes, Incorporating connected region labelling into automated image registration using mutual information, MMBIA:23-31, 1996
16. D. Tomaževič, B. Likar, F. Pernuš, Multi-feature mutual information, SPIE Medical Imaging: Image Processing. Proc. SPIE 5370:143-154, 2004
17. J. Zhang and A. Rangarajan, Multimodality image registration using an extensible information metric and high dimensional histogramming, IPMI, LNCS 3565:725-737, Springer, 2005