plugin-complete mi tutorial


Upload: arunava-chakravarty

Post on 31-Jul-2015


MICCAI 2009 Tutorial
Information theoretic similarity measures for image registration and segmentation
Sunday - 20th September

14:00-14:30: William M Wells III
Alignment by maximization of mutual information

This talk will summarize the historical emergence of the mutual information (MI) approach to image registration. Subsequently, it will describe how it was developed, implemented and evaluated, primarily from the perspective of the MIT / Harvard Medical School group.

• D. Hill, Studholme, C., and Hawkes, D. Voxel Similarity Measures for Automated Image Registration. VBC 1994.
• Collignon, A., Vandermeulen, D., Suetens, P., and Marchal, G. 3d multi-modality medical image registration using feature space clustering. CVRMED 1995.
• Viola, P. and Wells, W. Alignment by maximization of mutual information. In Proceedings of the 5th International Conference of Computer Vision, 1995.
• Viola, P. Alignment by maximization of Mutual Information. MIT PhD Thesis, 1995.
• Wells WM, Viola P, Atsumi H, Nakajima S, Kikinis R. Multi-Modal Volume Registration by Maximization of Mutual Information. Medical Image Analysis, 1(1):35-51, 1996.
• Viola P, Wells WM. Alignment by maximization of mutual information. International Journal of Computer Vision. 1997;24:137-154.
• West JB, Fitzpatrick JM, et al. Comparison and evaluation of retrospective intermodality image registration techniques. JCAT 1997.
• C. Studholme, D.L.G. Hill, D.J. Hawkes, An Overlap Invariant Entropy Measure of 3D Medical Image Alignment, Pattern Recognition, Vol. 32(1), Jan 1999, pp 71-86.

14:30-15:00: Frederik Maes
Multimodality image registration by maximization of mutual information

This talk will present the concept of MI for multimodality image registration from the Leuven perspective. It will discuss implementation issues such as histogram binning, interpolation and optimization, validation of robustness and accuracy, and also some limitations of the MI criterion in real world applications.

• A. Collignon, F. Maes, D. Delaere, D. Vandermeulen, P. Suetens, G. Marchal, Automated multi-modality image registration based on information theory, Proc. IPMI 1995
• F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, P. Suetens, Multimodality image registration by maximization of mutual information, IEEE TMI, 16(2):187-198, 1997
• West JB, Fitzpatrick JM, et al. Comparison and evaluation of retrospective intermodality image registration techniques. JCAT 1997.
• F. Maes, Segmentation and registration of multimodal medical images: from theory, implementation and validation to a useful tool in clinical practice, K.U.Leuven PhD Thesis, 1998
• F. Maes, D. Vandermeulen, P. Suetens, Comparative evaluation of multiresolution optimization strategies for multimodality image registration by maximization of mutual information, Medical image analysis, 3(4):373-386, 1999
• F. Maes, D. Vandermeulen, P. Suetens, Medical image registration using mutual information, Proc IEEE - special issue on emerging medical imaging technology, 91(10):1699-1722, 2003

15:00-15:30: Josien Pluim
Aspects of mutual information-based image registration

After the introduction of mutual information for medical image registration and the promising first results, the focus shifted towards implementation issues (such as preprocessing, interpolation artifacts and related solutions) and the suitability of information measures in general for image registration.

• J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Mutual information matching in multiresolution contexts, Image Vis. Comput., 19(1-2), pp 45-52, 2001
• J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Interpolation artefacts in mutual information based image registration, Comput. Vis. Image Underst., 77(2), pp 211-232, 2000
• J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, f-Information measures in medical image registration, IEEE Trans. Med. Imaging, 23(12), pp 1508-1516, 2004
• S. Klein, M. Staring, J.P.W. Pluim, Evaluation of optimisation methods for nonrigid medical image registration using mutual information and B-splines, IEEE Trans. Image Process., 16(12), pp 2879-2890, 2007

15:30-15:45: Break

15:45-16:15: William M Wells III
Probabilistic and information-theoretic approaches to registration

This talk will describe more recent approaches to pair-wise and group-wise registration that are based on generative models of images and information theory, with an emphasis on explicit and implicit modeling assumptions and the interconnections among the methods. Topics will include optimality of the MI criteria, the inclusion of controlled amounts of domain-specific information about image intensities, and use of the EM algorithm for registration and model estimation.

• A. Roche, G. Malandain, and N. Ayache. Unifying maximum likelihood approaches in medical image registration. International Journal of Imaging Systems and Technology, 11:71–80, 2000
• Zöllei L, Fisher J, Wells W. A Unified Statistical and Information Theoretic Framework for Multi-modal Image Registration. Image Processing in Medical Imaging 2003, Ambleside, UK, 2003.
• Erik Learned-Miller, (2005) Data Driven Image Models through Continuous Joint Alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).
• Zöllei L, Learned-Miller E, Grimson WEL, Wells W. Efficient Population Registration of 3D Data. Proc ICCV 2005, Computer Vision for Biomedical Image Applications, Beijing, China 2005
• Zöllei L, Wells W. Multi-modal Image Registration Using Dirichlet-encoded Prior Information. Third International Workshop on Biomedical Image Registration, Utrecht, 2006.
• Zöllei L, Jenkinson M, Timoner S, Wells W. A Marginalized MAP Approach and EM Optimization for Pair-Wise Registration. Proc. IPMI, Kerkrade Netherlands, 2007.


16:15-16:45: Frederik Maes
Incorporating local context in mutual information based registration: spatial and voxel label information

MI based registration maximizes the statistical correlation between different images without assuming a specific intensity relationship between them. While this has been shown to be a major benefit over other more informed methods in case of affine registration applications, MI of voxel intensities may not be optimally suited for non-rigid registration due to ambiguity in the local intensity information. Inclusion of local context can help to make MI based non-rigid registration more robust. Two different strategies are presented here: one based on including local spatial information (‘conditional MI’) and one based on voxel label information. Voxel label information assumes a prior segmentation of at least one of the images to be registered. If one of the images is an atlas, non-rigid atlas registration and atlas-based segmentation can be combined in a unified framework based on the Expectation-Maximization algorithm.

• D. Loeckx, P. Slagmolen, F. Maes, D. Vandermeulen, P. Suetens, Nonrigid image registration using conditional mutual information, Proc. IPMI 2007
• D. Loeckx, P. Slagmolen, F. Maes, D. Vandermeulen, P. Suetens, Nonrigid image registration using conditional mutual information, IEEE TMI, 2009 (in press)
• E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, An information theoretic approach for non-rigid image registration using voxel class probabilities, Medical image analysis, 10(3):413-431, 2006
• E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, A unified framework for atlas based brain image segmentation and registration, Proc. WBIR 2006
• Ashburner, J., Friston, K.: Unified segmentation. NeuroImage 26 (2005) 839–851
• Pohl, K., Fisher, J., Grimson, W., Kikinis, R., Wells, W.: A bayesian model for joint segmentation and registration. NeuroImage 31 (2006) 228–239

16:45-17:15: Josien Pluim
Incorporating spatial information in mutual information-based registration

This part will continue the theme of including additional information in mutual information-based image registration. Examples covered are the combination of gradient and mutual information, higher-order mutual information and multifeature mutual information.

• J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Image registration by maximization of combined mutual information and gradient information, IEEE Trans. Med. Imaging, 19(8), pp 809-814, 2000
• D. Rueckert, M. J. Clarkson, D. L. G. Hill, D. J. Hawkes, Non-rigid registration using higher-order mutual information, in Medical Imaging: Image Processing, K. M. Hanson, Ed. 2000, vol. 3979 of Proc. SPIE, pp. 438–447, SPIE Press, Bellingham, WA.
• H. F. Neemuchwala, A. Hero, P. Carson, Image matching using alpha-entropy measures and entropic graphs, Signal Processing, vol. 85, no. 2, pp. 277–296, 2005.
• M. Staring, U.A. van der Heide, S. Klein, M.A. Viergever, J.P.W. Pluim, Registration of cervical MRI using multifeature mutual information, IEEE Trans. Med. Imaging, in press

17:15-17:30: Concluding remarks


1

Alignment by Maximization of Mutual Information

William Wells
Associate Professor of Radiology
Surgical Planning Laboratory
Harvard Medical School and Brigham and Women’s Hospital

Affiliated Faculty: Harvard – MIT Division of Health Sciences and Technology

Research Scientist – MIT CSAIL

2

Summary

• Historical Emergence of MI registration approach

• Development, implementation, and evaluation

• MIT / Harvard Medical School perspective

3

Antecedents

• Voxel Similarity Measures for Automated Image Registration. VBC 1994
– Hill D., Studholme, C., and Hawkes, D.
– Meeting at Mayo Clinic, Oct 4 – 7 1994
– 3rd order moments (and other) measures
– MOVIE by Colin Studholme…

4

Joint Scatter of MRI and CT

Movie prepared by Colin Studholme

5

Early Entropy / MI Registration

• Minimum Entropy and Registration:
– Collignon A., Vandermeulen, D., Suetens, P., and Marchal, G. 3d multi-modality medical image registration using feature space clustering. CVRMED April 1995.

• Maximum Mutual Information Registration:
– Viola, P. and Wells, W. Alignment by maximization of mutual information. In Proceedings of the 5th International Conference of Computer Vision, June 20 – 23, 1995.
– Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P, Marchal G, Automated multi-modality image registration based on information theory. IPMI June 26, 1995.
– Viola, P. Alignment by maximization of Mutual Information. MIT PhD Thesis, June 1995.

6

More MI Registration (Journal Articles)

• Wells WM, Viola P, Atsumi H, Nakajima S, Kikinis R. Multi-Modal Volume Registration by Maximization of Mutual Information. Medical Image Analysis, 1(1):35-51, 1996. (1761*)

• Viola P, Wells WM. Alignment by maximization of mutual information. International Journal of Computer Vision. 1997;24:137-154. (1011*)

• Maes F, Collignon A, Vandermeulen D, Marchal G, Suetens P. Multimodality image registration by maximization of mutual information, IEEE TMI, 16(2):187-198, 1997 (2005*)

* Google Scholar citation counts, Sept. 2009


7

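Medical Image Registration

[Diagram: the medical image data sets are transformed (moved around) and compared with an objective function, which produces a score; an optimization algorithm, starting from an initial value, updates the motion parameters of the transform.]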

8

Notation

• Images: u(x), v(x)
• Transformation (deformation model): T(x)
• arg max notation: $\arg\max_x f(x)$
– “the value of x that maximizes f”
– similar for min

9

Minimum Joint Entropy Registration

• find the transformation that minimizes the joint entropy of images u and v under transformations T

$\hat{T} = \arg\min_T H[u(x), v(T(x))]$

10

Entropy of Images

• histogram the joint data
• calculate the entropy of the histogram
– (normalize the histogram)

$H[u(x), v(T(x))]$

11

Entropy of Images…

• histogram the joint data
• estimate (the parameters of) a distribution p on the pairs (u(x), v(T(x)))
• calculate the entropy of that distribution

$H[p(u(x), v(T(x)))]$

12

Multimodal Inputs

• correlation fails for multimodal registration when the intensities are different, e.g. MR-CT
• one solution: apply a special intensity transform to the MRI to make it look more like CT; then compute the correlation measure

Petra A. van den Elsen. Multimodality Matching of Brain Images. PhD thesis, Utrecht University, The Netherlands, 1992.

Petra A. van den Elsen. Retrospective fusion of CT and MR brain images using mathematical operators. In Applications of Computer Vision in Medical Image Processing, Spring Symposium Series. AAAI, March 1994.


13

MR-CT situation

[Figure: tissue classes in the joint intensity space with axes CT and MR; labeled regions: air, bone, white matter, gray matter, CSF, fat.]

14

Histograms

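[Figure: a small data set (values 1.4, 3.4, 1.0, 3.0, 1.3, 1.6, 1.0, 10.0, 1.2, ...) is sorted into bins or buckets numbered 0 to 10. The counts per bin are divided by the number of data points to give the relative frequencies, which are 6/10, 3/10 and 1/10 in the occupied bins.]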

15

Histogram Joint Intensity of Images

Images (3 x 3 intensities):

U:        V:
1 2 1     2 2 2
1 2 1     3 3 3
1 2 1     2 2 2

Joint intensities (U, V): (1,2) (2,2) (1,2) (1,3) (2,3) (1,3) (1,2) (2,2) (1,2)

Joint histogram, as relative frequencies:

        U = 1   U = 2
V = 3    2/9     1/9
V = 2    4/9     2/9
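The counting step on this slide can be reproduced in a few lines. The sketch below is illustrative only: the function names `joint_histogram` and `relative_frequencies` are made up for this example, and the two images are the 3 x 3 arrays from the slide, flattened row by row.

```python
from collections import Counter
from fractions import Fraction

# The 3x3 example images from the slide, flattened row by row.
U = [1, 2, 1, 1, 2, 1, 1, 2, 1]
V = [2, 2, 2, 3, 3, 3, 2, 2, 2]

def joint_histogram(u, v):
    """Count how often each intensity pair (u_i, v_i) occurs."""
    return Counter(zip(u, v))

def relative_frequencies(counts):
    """Divide every count by the total number of samples."""
    total = sum(counts.values())
    return {pair: Fraction(n, total) for pair, n in counts.items()}

counts = joint_histogram(U, V)
freqs = relative_frequencies(counts)
for pair in sorted(freqs):
    print(pair, counts[pair], freqs[pair])
```

Exact fractions are used here only so that the result can be compared directly with the 4/9, 2/9, 2/9, 1/9 entries of the slide; a practical implementation would more likely use floating point arrays.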

16

MRI & CT pairs

17

Joint histogram: MRI & CT registered

[Figure: joint histograms with axes MRI and CT; the (1, 1) entry is suppressed for clarity.]

18

Joint histogram: MRI & CT; slightly off

[Figure: two joint histograms with axes MRI and CT, one for correct registration and one for slight mis-registration.]


19

Joint histogram: MRI & CT; significantly off

[Figure: two joint histograms with axes MRI and CT, one for correct registration and one for significant mis-registration.]

20

Entropy

• entropy is a measure of the uncertainty or randomness in a random variable

• it is the minimum expected length of a message that describes the result of the experiment characterized by p(x)

$H[p(x)] \doteq E_x\left[\log \frac{1}{p(x)}\right] = -\sum_x p(x) \log p(x)$

21

entropy:

[Figure: bar charts of p (from 0 to 1) over the outcomes H and T for three coins.]

• predictable coin: no uncertainty, lowest entropy
• biased coin: moderate uncertainty, moderate entropy
• fair coin: most uncertain, highest entropy

$H[p(x)] = -\sum_x p(x) \log p(x)$
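The three coins can be checked numerically. In this minimal sketch the base-2 logarithm is chosen so that the result is in bits, and the 0.9 / 0.1 probabilities for the biased coin are an assumption for illustration; the slide does not give numbers.

```python
import math

def entropy(probs):
    """Shannon entropy -sum p log2 p in bits; outcomes with p = 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

predictable = entropy([1.0, 0.0])  # outcome is certain
biased = entropy([0.9, 0.1])       # assumed bias, for illustration only
fair = entropy([0.5, 0.5])         # both outcomes equally likely

print(predictable, biased, fair)
```

The ordering predictable < biased < fair matches the slide: entropy is lowest when the outcome is certain and highest when both outcomes are equally likely.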

22

Examples of joint intensity distributions

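[Figure: two joint intensity distributions over (U, V). In the first, four cells each have probability 1/4, so $H = \log_2 4$. In the second, six cells each have probability 1/6, so $H = \log_2 6$.]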

23

Maximum Mutual Information Registration

find the transformation that maximizes the mutual information of images u and v under transformations T

$\hat{T} = \arg\max_T I[u(x), v(T(x))]$

mutual information definition:

$I[x, y] \doteq H[x] + H[y] - H[x, y]$
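The definition can be evaluated directly from histograms of paired samples. The following is a sketch under simple assumptions: a plain histogram (plug-in) estimate, base-2 logarithms, and the small U and V images from the earlier joint-histogram slide as test data. The function names are chosen for this example.

```python
import math
from collections import Counter

def entropy(samples):
    """Plug-in entropy estimate (bits) from the histogram of the samples."""
    counts = Counter(samples)
    total = len(samples)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def mutual_information(x, y):
    """I[x, y] = H[x] + H[y] - H[x, y] for paired samples x_i, y_i."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

U = [1, 2, 1, 1, 2, 1, 1, 2, 1]
V = [2, 2, 2, 3, 3, 3, 2, 2, 2]

print(mutual_information(U, V))  # U varies along columns, V along rows
print(mutual_information(U, U))  # an image compared with itself
```

For this particular pair the joint relative frequencies equal the product of the marginals, so the estimated mutual information is zero, whereas an image paired with itself gives I = H[U].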

24

Registration of Video and 3D Model

Paul Viola MIT PhD Thesis 1995


25

MR-CT Registration

• 1996
• EMMA
• Stochastic Gradient Descent

26

2D-3D Rigid-body Registration of X-Ray Fluoroscopy and CT

• Motivating applications: Image Guided Surgery (IGS)
– 3D Roadmapping (Neuro catheter procedure)
– Orthopedics (total hip replacement, revision surgery, spine procedures, metastatic bone cancer)

L. Zöllei: "2D-3D Rigid-Body Registration of X-Ray Fluoroscopy and CT Images", Masters Thesis, MIT AI Lab, August 2001.

L. Zöllei, E. Grimson, A. Norbash, W. Wells: "2D-3D Rigid Registration of X-Ray Fluoroscopy and CT Images Using Mutual Information and Sparsely Sampled Histogram Estimators", IEEE CVPR, 2001.

27

2D-3D Rigid Registration

Problem: find T

28

Skull: Before Registration

[Figure label: Gage, before]

Provided by L. Zollei

29

Provided by L. Zollei

Skull: After Registration

30

Plastic Pelvis: Before Registration

Provided by L. Zollei


31

Plastic Pelvis: After Registration

Provided by L. Zollei

32

MI-based Audio / Video Fusion

Learning Joint Statistical Models for Audio-Visual Fusion and Segregation

John Fisher et al.

NIPS 2000

33

Video – Audio Joint Statistics

34

Video – Audio MI

35

MI Tracking with Graphics Hardware

• MI registration
• apparent surface normals to video intensity
• gradient descent with “differentiated histogram”

Wells W, Halle M, Kikinis R, Viola P. Alignment and Tracking using Graphics Hardware. Image Understanding Workshop, 1995.

36

MI Tracking with Graphics Hardware


37

MI Tracking with Graphics Hardware

• SUN ffb graphics board
– colored lights -> surface normals calculated in hardware
• SUN SPARC ULTRA
• 4 Hz iteration rate (1995)

38

Normalised Mutual Information

• analyzed effect of changes in image overlap
• showed improved behavior in synthetic and clinical images over range of fields of view

C. Studholme, D.L.G. Hill, D.J. Hawkes, An Overlap Invariant Entropy Measure of 3D Medical Image Alignment, Pattern Recognition, Vol. 32(1), Jan 1999, pp 71-86.

$NMI(u, v) \doteq \frac{H(u) + H(v)}{H(u, v)}$
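A minimal sketch of the ratio above, assuming plug-in histogram entropies over paired samples taken from the region of overlap; the function names and the small test arrays are illustrative and do not come from the cited paper.

```python
import math
from collections import Counter

def entropy(samples):
    """Plug-in entropy estimate (bits) from the histogram of the samples."""
    counts = Counter(samples)
    total = len(samples)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def normalized_mutual_information(u, v):
    """NMI(u, v) = (H(u) + H(v)) / H(u, v) for paired intensity samples."""
    joint = entropy(list(zip(u, v)))
    if joint == 0.0:
        # Both images are constant over the samples; the ratio is undefined.
        raise ValueError("joint entropy is zero; NMI is undefined")
    return (entropy(u) + entropy(v)) / joint

u = [1, 2, 1, 1, 2, 1, 1, 2, 1]
v = [2, 2, 2, 3, 3, 3, 2, 2, 2]

print(normalized_mutual_information(u, v))  # statistically independent samples
print(normalized_mutual_information(u, u))  # identical samples
```

With this definition the value ranges from 1 for independent intensities to 2 for intensities that are related one-to-one.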

39

EMMA entropy estimator

$H[x] \doteq -E_x[\log p(x)] \approx -\frac{1}{|A|} \sum_{x_i \in A} \log p(x_i)$
(A: sample of data; weak law of large numbers)

$p(x) \approx \frac{1}{|B|} \sum_{x_j \in B} K(x - x_j)$
(B: sample of data; K: kernel; Parzen density estimation)

$H[x] \approx -\frac{1}{|A|} \sum_{x_i \in A} \log\left(\frac{1}{|B|} \sum_{x_j \in B} K(x_i - x_j)\right)$
(EMMA entropy estimator)
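The estimator can be written out for scalar samples as follows. This is a sketch, not the original implementation: the Gaussian kernel, its width `sigma`, and the function names are assumptions made for the example, and the result is in nats because the natural logarithm is used.

```python
import math
import random

def gaussian_kernel(d, sigma):
    """One-dimensional Gaussian density evaluated at offset d."""
    return math.exp(-0.5 * (d / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def emma_entropy(sample_a, sample_b, sigma):
    """-1/|A| * sum over x_i in A of log(1/|B| * sum over x_j in B of K(x_i - x_j))."""
    total = 0.0
    for xi in sample_a:
        # Parzen density estimate at x_i, built from sample B.
        density = sum(gaussian_kernel(xi - xj, sigma) for xj in sample_b) / len(sample_b)
        total += math.log(density)
    return -total / len(sample_a)

rng = random.Random(0)
narrow = [rng.gauss(0.0, 0.5) for _ in range(100)]
wide = [rng.gauss(0.0, 4.0) for _ in range(100)]

# Two disjoint subsamples of 50 points each play the roles of A and B.
h_narrow = emma_entropy(narrow[:50], narrow[50:], sigma=1.0)
h_wide = emma_entropy(wide[:50], wide[50:], sigma=1.0)
print(h_narrow, h_wide)
```

The more widely spread data should give the larger estimate. With subsamples this small the individual values are noisy, which is the regime the stochastic optimization on the next slide is designed to tolerate.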

40

EMMA and Stochastic Gradient Descent

• Gradient descent – closed form derivative
• Samples A and B redrawn at each iteration
• |A| and |B| typically = 50 (very small subsamples)
• lightweight iterations
• noisy estimates of gradient of entropy, MI
• many iterations (~20K)
• tolerant of local minima: bounces out of them

$H[x] \approx -\frac{1}{|A|} \sum_{x_i \in A} \log\left(\frac{1}{|B|} \sum_{x_j \in B} K(x_i - x_j)\right)$

41

areas of non-overlap of images

• one strategy: restrict calculation to region of overlap

• MIT strategy:
– sample all of u(x)
– if T(x) falls outside valid part of v(x), substitute the value zero for the intensity v(T(x))
– v(x) surrounded by a “sea of black”
– resistant to “scalloping” of objective function
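The zero-substitution rule is easy to state in code. The sketch below uses one-dimensional images and an integer translation T(x) = x + shift purely for illustration; both helper names are hypothetical.

```python
def sample_with_zero_fill(image, index):
    """Return image[index], or 0 when the index falls outside the image."""
    if 0 <= index < len(image):
        return image[index]
    return 0

def paired_samples(u, v, shift):
    """Pair every u(x) with v(T(x)) for T(x) = x + shift, never dropping a sample of u."""
    return [(u[x], sample_with_zero_fill(v, x + shift)) for x in range(len(u))]

print(paired_samples([1, 2, 3], [4, 5, 6], 1))
print(paired_samples([1, 2, 3], [4, 5, 6], -1))
```

Because every sample of u is always used, the number of intensity pairs does not change with the transformation, unlike the first strategy, which restricts the calculation to the region of overlap.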

42

Web Course

• MIT Open Courseware
• HST582: Biomedical Signal and Image Processing
• Lectures
• MATLAB
– registration with RIRE data
• http://ocw.mit.edu/OcwWeb/Health-Sciences-and-Technology/HST-582JSpring-2007/CourseHome/


43

software systems

• ITK: image processing C++ libraries
– funded by NIH/NLM
– well documented

• 3D Slicer
– incorporates ITK, has GUI, graphics

• FLIRT / FSL
– collection of independent C programs
– fMRI analysis

• SPM / AIR
– MATLAB, standardized defaults

44

References

• Hill D, Studholme, C, and Hawkes, D. Voxel Similarity Measures for Automated Image Registration. VBC 1994.

• Collignon A., Vandermeulen, D., Suetens, P., and Marchal, G. 3d multi-modality medical image registration using feature space clustering. CVRMED April 1995.

• Viola, P. and Wells, W.. Alignment by maximization of mutual information. In Proceedings of the 5th International Conference of Computer Vision, June 20 – 23, 1995.

• Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P, Marchal G, Automated multi-modality image registration based on information theory. IPMI June 26, 1995.

• Viola, P. Alignment by maximization of Mutual Information. MIT PhD Thesis, June 1995.

• Wells WM, Viola P, Atsumi H, Nakajima S, Kikinis R. Multi-Modal Volume Registration by Maximization of Mutual Information. Medical Image Analysis, 1(1):35-51, 1996.

• Viola P, Wells WM. Alignment by maximization of mutual information. International Journal of Computer Vision. 1997;24:137-154.

45

References

• Maes F, Collignon A, Vandermeulen D, Marchal G, Suetens P. Multimodality image registration by maximization of mutual information, IEEE TMI, 16(2):187-198, 1997.

• Petra A. van den Elsen. Multimodality Matching of Brain Images. PhD thesis, Utrecht University, The Netherlands, 1992.

• Petra A. van den Elsen. Retrospective fusion of ct and mr brain images using mathematical operators. In Applications of Computer Vision in Medical Image Processing, Spring Symposium Series. AAAI, March 1994.

• L. Zöllei: "2D-3D Rigid-Body Registration of X-Ray Fluoroscopy and CT Images", Masters Thesis, MIT AI Lab, August 2001.

• L. Zöllei, E. Grimson, A. Norbash, W. Wells: "2D-3D Rigid Registration of X-Ray Fluoroscopy and CT Images Using Mutual Information and Sparsely Sampled Histogram Estimators", IEEE CVPR, 2001.

• Wells W, Halle M, Kikinis R, Viola P. Alignment and Tracking using Graphics Hardware. Image Understanding Workshop, 1995.

• C. Studholme, D.L.G.Hill, D.J. Hawkes, An Overlap Invariant Entropy Measure of 3D Medical Image Alignment, Pattern Recognition, Vol. 32(1), Jan 1999, pp 71-86.


Multimodality image registration by maximization of mutual information

Frederik Maes

K.U. Leuven Dept. of Electrical Engineering (ESAT/PSI)

UZ Gasthuisberg Medical Imaging Research Center
Leuven, Belgium

Outline

MI from a Leuven perspective: (Collignon CVRMed 1995) Collignon IPMI 1995, Maes IEEE TMI 1997, Maes Proc IEEE 2003

• Motivation & inspiration
• Concept & interpretation
• Implementation
• Initial validation
• Some applications

The motivation: stereotactic neurosurgery planning

Prospective marker-based registration

The position of each image slice in 3D space is determined from the locations of the markers in the image.

Different images are registered based on their relative position in 3D space.

The problem: retrospective multimodality registration

MR/CT

MR/PET

MR/MR

Retrospective registration strategies

• For reviews, see e.g. van den Elsen 1993, Maintz 1998, Zitova 2003

• Internal landmarks: e.g. Hill 1991, Rohr 1997

• Surface based registration: e.g. Borgefors 1988, Pelizzari 1989, Besl & McKay 1992
– requires segmentation
– difficult to automate and introduces inaccuracies

• Voxel based registration
– uses the intensity information directly, without need for segmentation


Voxel-based registration

[Figure: voxel p in image I1 is mapped by T to voxel q in image I2, shown for registered (correct T) and not registered (incorrect T) images.]

q = T(p), a = I1(p), b = I2(q)

Intensity-based voxel similarity measures

• SSD
• Correlation
• Deterministic / stochastic sign change (Venot 1984)
• Cross-correlation after intensity remapping (van den Elsen 1994)
• Cross-correlation of edges/ridges (van den Elsen 1995, Maintz 1996)

Joint histogram: same modality

[Figure: images I1 and I2 with corresponding voxels p and q, and the joint histogram p(a,b) with axes I1 (a) and I2 (b), shown for registered (correct T) and not registered (incorrect T) images.]

q = T(p), a = I1(p), b = I2(q)

• Unimodal: intensities a and b of corresponding voxels p and q of registered images I1 and I2 are likely to be similar
• p(a,b) is clustered around the diagonal when registered
• p(a,b) shows significant non-zero off-diagonal elements in case of misregistration

Joint histogram: different modalities

[Figure: images I1 and I2 with corresponding voxels p and q related by Tα, and the joint histogram p(a,b) with axes I1 (a) and I2 (b), shown for registered (correct T) and not registered (incorrect T) images.]

q = T(p), a = I1(p), b = I2(q)

• Multimodal: relationship between a and b is strongly data dependent
• Caveat: only voxels in the region of overlap of both images are considered
• p(a,b) depends on T through varying correspondence (p,q) and through varying region of overlap

Histogram-based voxel similarity measures

• Variance of intensity ratios (Woods 1993)

• Third order moment (Hill 1994)

• N-th order moment (Studholme IPMI 1995)

• Entropy (Collignon SPIE 1994)

Joint histogram & statistical dependence

[Figure: two joint histograms with axes I1 and I2; along the line I2 = b, the conditional distribution p(a|I2=b) is more dispersed in one and more clustered in the other.]

• p(a|I2 = b) = likelihood of observing intensity a in I1 given that the intensity of the corresponding voxel in I2 is b
• The more clustered p(a|I2=b), the less uncertainty there is about I1 given I2=b, thus the more information the knowledge of one value (I2=b) contains about the other (I1)
• If p(a|I2=b) = p(a), knowledge of I2=b does not contain information about a


Information theory: mutual information

= a special case of the Kullback-Leibler divergence between two probabilities P and Q:

= (kind of) ‘distance’ (in bits) between the joint probability and the product of the marginals

= measure of the statistical dependence of two random variables:

• A and B independent: p(a,b) = p(a)·p(b), so I(A,B) = 0
• A and B one-to-one related: p(a,b) = p(a) = p(b), so I(A,B) = H(A) = H(B) = entropy
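Written out in standard notation (assumed here, with base-2 logarithms so that the result is in bits), the Kullback-Leibler form of mutual information between the joint probability and the product of the marginals is:

$$
I(A,B) = \sum_{a,b} p(a,b) \log_2 \frac{p(a,b)}{p(a)\,p(b)} = D_{KL}\big(p(a,b) \,\|\, p(a)\,p(b)\big)
$$

Every term vanishes when p(a,b) = p(a) p(b), which gives the independent case I(A,B) = 0. When the variables are one-to-one related, p(a,b) = p(a) = p(b) on the occupied cells, each ratio reduces to 1/p(a), and the sum becomes H(A).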

Mutual information: interpretation

= amount of information that one variable contains about another

marginal entropy

conditional entropy

joint entropy
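The three labels correspond to the standard equivalent expressions for mutual information, given here in textbook notation as an assumption about what the slide displayed:

$$
I(A,B) = H(A) - H(A|B) = H(B) - H(B|A) = H(A) + H(B) - H(A,B)
$$

Here H(A) and H(B) are the marginal entropies, H(A|B) and H(B|A) the conditional entropies, and H(A,B) the joint entropy. The first two forms read as the reduction in uncertainty about one variable once the other is known.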

The mutual information registration criterion

Collignon CVRMED/IPMI 1995, Viola ICCV 1995:

“Mutual information is maximal at registration”

[Figure: voxel p with intensity a in image A corresponds to voxel q with intensity b in image B.]

(α = registration parameters)

Example

[Figure: original CT, MR, and resampled CT; the image pairs are annotated with I(CT,MR) = 0.52 and I(CT,MR) = 0.86.]


Interpretation

“Find as much of the complexity in the separate datasets (maximizing HA + HB) such that at the same time they explain each other well (minimizing HAB).”

• HA + HB = number of bits required to optimally encode A and B separately
• HAB = number of bits required to optimally encode A and B combined
• IAB = (HA + HB) - HAB ≥ 0 because A and B contain redundant information

“Information redundancy is maximal at registration”

IAB(α) = HA(α) + HB(α) - HAB(α)
(α = registration parameters)

MI versus joint entropy

• Minimization of joint entropy by itself does not work
• Indeed: HAB is minimal (zero) when the images do not overlap…
• The marginal entropies vary with varying image overlap
• Inclusion of the marginal entropies in MI is essential in order to assure that the region of overlap at the registration solution contains information

max (HA(α) + HB(α) - HAB(α)) ≠ min HAB(α)

MI versus normalized MI

• In case the overlap between the images is small at registration, maximizing HA+HB may prevail over minimizing HAB , leading to solutions that prefer larger overlap instead of better correspondence

• Normalization:

– Normalized mutual information (Studholme 1999)
– Entropy correlation coefficient (Maes 1997)
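In the H_A, H_B, H_AB notation of the preceding slides, the two normalized measures and the relation between them can be written as below. The NMI form follows Studholme 1999; the ECC form is the commonly used definition and is stated here as an assumption about the slide's missing formula.

$$
NMI = \frac{H_A + H_B}{H_{AB}}, \qquad
ECC = \frac{2\,I_{AB}}{H_A + H_B} = \frac{2\,(H_A + H_B - H_{AB})}{H_A + H_B} = 2 - \frac{2}{NMI}
$$

Since ECC is a monotonically increasing function of NMI, both measures are maximized by the same registration parameters.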

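MI as a measure of overlap

[Figure: binary images I1 and I2.]

p(i1,i2)    I1 = 0    I1 = 1    p2(i2)
I2 = 0       4/36      5/36      1/4
I2 = 1       5/36     22/36      3/4

$H_1 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$

$H_2 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$

$H_{12} = -\left(\tfrac{4}{36}\log_2\tfrac{4}{36} + 2\cdot\tfrac{5}{36}\log_2\tfrac{5}{36} + \tfrac{22}{36}\log_2\tfrac{22}{36}\right) = 1.577$

$I = H_1 + H_2 - H_{12} = 0.045$

$I = \tfrac{4}{36}\log_2\frac{4/36}{1/4 \cdot 1/4} + 2\cdot\tfrac{5}{36}\log_2\frac{5/36}{3/4 \cdot 1/4} + \tfrac{22}{36}\log_2\frac{22/36}{3/4 \cdot 3/4} = 0.045$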

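MI as a measure of overlap

[Figure: binary images I1 and I2.]

p(i1,i2)    I1 = 0    I1 = 1    p2(i2)
I2 = 0       6/36      3/36      1/4
I2 = 1       3/36     24/36      3/4

$H_1 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$

$H_2 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$

$H_{12} = -\left(\tfrac{6}{36}\log_2\tfrac{6}{36} + 2\cdot\tfrac{3}{36}\log_2\tfrac{3}{36} + \tfrac{24}{36}\log_2\tfrac{24}{36}\right) = 1.418$

$I = H_1 + H_2 - H_{12} = 0.204$

$I = \tfrac{6}{36}\log_2\frac{6/36}{1/4 \cdot 1/4} + 2\cdot\tfrac{3}{36}\log_2\frac{3/36}{3/4 \cdot 1/4} + \tfrac{24}{36}\log_2\frac{24/36}{3/4 \cdot 3/4} = 0.204$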

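MI as a measure of overlap

[Figure: binary images I1 and I2.]

p(i1,i2)    I1 = 0    I1 = 1    p2(i2)
I2 = 0       9/36      0/36      1/4
I2 = 1       0/36     27/36      3/4

$H_1 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$

$H_2 = -\left(\tfrac{1}{4}\log_2\tfrac{1}{4} + \tfrac{3}{4}\log_2\tfrac{3}{4}\right) = 0.811$

$H_{12} = -\left(\tfrac{9}{36}\log_2\tfrac{9}{36} + 2\cdot\tfrac{0}{36}\log_2\tfrac{0}{36} + \tfrac{27}{36}\log_2\tfrac{27}{36}\right) = 0.811$ (with $0 \log 0 = 0$)

$I = H_1 + H_2 - H_{12} = 0.811$

$I = \tfrac{9}{36}\log_2\frac{9/36}{1/4 \cdot 1/4} + 2\cdot\tfrac{0}{36}\log_2\frac{0/36}{3/4 \cdot 1/4} + \tfrac{27}{36}\log_2\frac{27/36}{3/4 \cdot 3/4} = 0.811$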
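The three overlap examples can be recomputed from their joint probability tables. This is a small sketch with made-up function names; each table is entered as rows for I2 = 0 and I2 = 1 and columns for I1 = 0 and I1 = 1, as on the slides.

```python
import math

def entropy(probs):
    """Shannon entropy in bits, using the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def overlap_measures(joint):
    """Return (H1, H2, H12, I) for a 2D table of joint probabilities."""
    p_rows = [sum(row) for row in joint]        # marginal over columns
    p_cols = [sum(col) for col in zip(*joint)]  # marginal over rows
    h_rows, h_cols = entropy(p_rows), entropy(p_cols)
    h_joint = entropy(p for row in joint for p in row)
    return h_cols, h_rows, h_joint, h_rows + h_cols - h_joint

tables = [
    [[4 / 36, 5 / 36], [5 / 36, 22 / 36]],
    [[6 / 36, 3 / 36], [3 / 36, 24 / 36]],
    [[9 / 36, 0 / 36], [0 / 36, 27 / 36]],
]

results = [overlap_measures(t) for t in tables]
for h1, h2, h12, mi in results:
    print(f"H1={h1:.3f} H2={h2:.3f} H12={h12:.3f} I={mi:.3f}")
```

The marginals are identical in all three cases, so only the joint entropy changes: as the off-diagonal mass shrinks from 10/36 to 6/36 to 0, mutual information rises from about 0.045 to 0.204 to 0.811 bits.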


Impact of spatial correlation

Limiting assumptions

• Both images share information…

CT PET

Limiting assumptions

• Nature of relationship between image intensities is spatially stationary…

Limiting assumptions

• Joint probability density can be estimated reliably …
>< small region of overlap
>< low resolution
>< interpolation artifacts
>< histogram size and binning strategy
>< image degradations
>< ...

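Implementation

[Diagram: registration pipeline. Samples p are taken from the floating image (A), with a = A(p); the transformation gives q = Tα(p) in the reference image (B), and interpolation gives b = B(q); each pair (a, b) adds +1 to the joint histogram, from which I(α) is computed; the optimization updates α until the optimum α∗ is found.]

• Sampling: sub/super, multi-resolution
• Transformation: rigid/affine, non-rigid
• Interpolation: NN, TRI, PV
• Joint histogram binning: 256 x 256
• Optimization: non-gradient-based, gradient-based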
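The stages of this pipeline can be put together in a toy example. Everything below is a simplification chosen for illustration: one-dimensional images, an integer translation as the only registration parameter, no interpolation or binning, MI computed over the region of overlap, and an exhaustive search standing in for a real optimizer. The function names and test signals are hypothetical.

```python
import math
from collections import Counter

def entropy(samples):
    """Plug-in entropy estimate (bits) from the histogram of the samples."""
    counts = Counter(samples)
    total = len(samples)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def mi_for_shift(floating, reference, shift):
    """MI over the region of overlap for the transformation q = p + shift."""
    pairs = [(floating[p], reference[p + shift])
             for p in range(len(floating))
             if 0 <= p + shift < len(reference)]
    if not pairs:
        return 0.0  # no overlap, nothing to compare
    a_vals, b_vals = zip(*pairs)
    return entropy(a_vals) + entropy(b_vals) - entropy(pairs)

def register(floating, reference, max_shift):
    """Exhaustive search for the shift with maximal MI."""
    return max(range(-max_shift, max_shift + 1),
               key=lambda s: mi_for_shift(floating, reference, s))

# Floating image, and a reference that shows the same structure with
# different intensities (0 -> 5, 1 -> 9, 2 -> 7), displaced by one sample.
floating = [0, 0, 0, 1, 1, 1, 2, 2, 2]
reference = [5, 5, 5, 5, 9, 9, 9, 7, 7, 7]

best_shift = register(floating, reference, max_shift=2)
print(best_shift, mi_for_shift(floating, reference, best_shift))
```

The search recovers the displacement even though no intensity value is shared between the two signals, which is the property that makes the criterion usable across modalities. A practical implementation replaces the exhaustive search with one of the optimizers discussed later in the talk.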

Sub sampling

• Start with few samples initially to speed up the criterion evaluation
• Add more samples as the registration proceeds to improve accuracy

• In practice: not more than 2 levels (coarse and fine), as additional levels only increase computation time (Maes 1999)


Super sampling

• Resampling one of the images at a finer grid as a pre-processing step may be useful to increase accuracy

• Can avoid interpolation artifacts due to grid-aligning transformations

Joint probability estimation

[Figure: voxel p in I1, with a = I1(p), is mapped by T to position q in I2; q generally falls between the grid points q1 to q4, so b = I2(q) has to be estimated.]

Intensity interpolation

• Nearest neighbour (order 0): q = T(p), a = A(p), b = B(q1); h(a,b) += 1
• Linear (order 1): q = T(p), a = A(p), bi = B(qi); b = Σ wi bi, Σ wi = 1; h(a,b) += 1
• Cubic, B-spline, ... (higher order): similar to linear, but using more neighbours

Histogram binning

• If the histogram is large, it will only be sparsely filled
– small changes in T have significant impact on many bins in H
– many local optima in MI
– improve robustness by intensity binning

• Linear intensity remapping: e.g. to range [0-255]
– converts original intensities into ‘iso-intensity’ objects

• Parzen windowing: distribute each sample over multiple bins

• Partial volume distribution
– avoids intensity interpolation, but treats image values as labels
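Linear intensity remapping amounts to rescaling the observed intensity range onto a fixed number of bin indices. The sketch below shows one simple way to do this; the function name `rebin`, the rounding rule, and the sample intensities are assumptions for illustration.

```python
def rebin(intensities, n_bins=256):
    """Map intensities linearly onto the bin indices 0 .. n_bins - 1."""
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        return [0 for _ in intensities]  # constant image: a single bin
    scale = (n_bins - 1) / (hi - lo)
    return [int(round((value - lo) * scale)) for value in intensities]

# Example: 12-bit intensities reduced to 256 bins.
print(rebin([0, 1000, 4095], 256))
```

With 256 bins per image the joint histogram has 256 x 256 cells, so many neighbouring original intensities share a bin and the histogram is less sparsely filled.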

Partial volume distribution interpolation

[Figure: position q between the grid points q1 to q4 with weights w1 to w4; the joint histogram entries (a, b1) to (a, b4) are incremented by +w1 to +w4.]

q = T(p), a = A(p), bi = B(qi)

h(a,bi) += wi, Σ wi = 1

Fractions wi vary smoothly with q ⇒ histogram and MI vary smoothly with T ⇒ MI a.e. differentiable w.r.t. T

Collignon 1995, Maes 1997, Chen & Varshney 2003
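The update rule h(a,bi) += wi can be sketched for a two-dimensional reference image. The bilinear weights, the dictionary-based histogram, and the function name `pv_update` are choices made for this example; it also assumes that q lies strictly inside the grid so that all four neighbours exist.

```python
import math
from collections import defaultdict

def pv_update(hist, a, image_b, qx, qy):
    """Distribute one sample over the four grid neighbours of q = (qx, qy).

    image_b is a list of rows. No new intensity is interpolated: each
    neighbour's own intensity b_i receives the fraction w_i.
    """
    x0, y0 = int(math.floor(qx)), int(math.floor(qy))
    fx, fy = qx - x0, qy - y0
    neighbours = [
        (x0,     y0,     (1 - fx) * (1 - fy)),
        (x0 + 1, y0,     fx * (1 - fy)),
        (x0,     y0 + 1, (1 - fx) * fy),
        (x0 + 1, y0 + 1, fx * fy),
    ]
    for x, y, w in neighbours:
        hist[(a, image_b[y][x])] += w  # h(a, b_i) += w_i

B = [[10, 20],
     [30, 40]]
hist = defaultdict(float)
pv_update(hist, a=5, image_b=B, qx=0.25, qy=0.5)
print(dict(hist))
```

Each sample still contributes a total weight of one, but the contribution moves continuously between bins as q moves, which is why the histogram and MI vary smoothly with T.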

Behavior of MI

MI traces for in-slice rotation around registered position: [-180,+180] degrees

[Figure: MI (0.3 to 1) versus rotation angle (-180 to 180 degrees) for NN, TRI and PV interpolation. MR: 1x1x1 mm, CT: 1x1x1.5 mm.]


MI traces for in-slice rotation around registered position: [-0.5,+0.5] degrees

[Figure: three panels of MI versus rotation angle (-0.5 to 0.5 degrees): NN (MI axis 0.881 to 0.883), TRI (0.963 to 0.965) and PV (0.875 to 0.877); an inset marks this interval within the full [-180, 180] degree range.]

Influence of interpolation on optimization behavior

[Figure: I(α*) - I(α), on a scale of 0 to 2 x 10^-4, plotted against | α - α* | (0 to 0.3, in mm and degrees) for NN, TRI and PV.]

Same registration experiment, using different interpolation methods and starting from different initial parameter values

α = registration parameters
α* = optimal value for a particular interpolation type

Gradient of MI

PV interpolation (Maes 1999)

[Figure: a change in the registration parameters α displaces q by δq/δα, which changes the PV weights w1, ..., w4 by δw1, ..., δw4.]

Different interpolation and binning schemes lead to other gradient expressions, e.g. Thevenaz 2000, Hermosillo 2002, Mattes 2003.

Optimization strategies

• Non-gradient: Powell, Simplex

• Gradient: Steepest descent, Conjugate gradient, Quasi-Newton, Levenberg-Marquardt

Early validation

• Multiresolution voxel similarity measures for MR-PET registration, Studholme C., D.L.G. Hill, and D.J. Hawkes, IPMI 1995

• Retrospective Registration Evaluation Project (RREP), J.M. Fitzpatrick et al., 1996
– comparative validation of retrospective registration techniques
– for CT/MR and PET/MR matching of the brain
– using the stereotactic registration solution as the gold standard
– blind study: images were edited to remove markers
– study demonstrated the subvoxel accuracy of the MI matching criterion

Initial RREP results

West 1997, Maes 2003

Application: thorax tumor staging from PET and CT

[Figure: CT, PET-FDG emission and PET transmission images; aligned by acquisition and matched using MMI. Detection in PET, localisation in CT.]

Vansteenkiste 1998

Application: prostate radiotherapy planning from CT and MR

Debois 1999

Application: geometric accuracy of spiral CT imaging

Model-to-image registration
• hardware phantom of known geometry: ESP (anthropomorphic spine phantom)
• CT image of the phantom; ideal image from the geometrical model
• matched comparison

Histogram dispersion

[Figure: image intensity plots, 'Before' and 'After', for cortical wall 1.5 mm, 1.0 mm and 0.5 mm.]

Conclusion

• histogram-based instead of intensity-based
• robust against image degradations (noise, artifacts, local distortions)
• no limitations imposed on the data
• theoretically well founded (information theory)
• no segmentation required
• no need for user intervention
• completely automated
• ‘easy’ to implement
• same algorithm applicable in a variety of applications
• very broad applicability (see Pluim 2003 for a survey)

Impact on the field

In 2000 recognized by IEEE as “a landmark in the profession, with enduring importance and influence far beyond its peers”.

In 2005 recognized by ISI as one of the 10 most cited papers of the last decade published in Engineering (September 2009: >1400 citations, ISI Web of Science)

Commercial implementation

Some software tools

• AIR (UCLA, Loni): http://www.loni.ucla.edu/Software/AIR
• DROP (TU Muenchen): http://www.mrf-registration.net
• Elastix (Image Sciences Institute, Utrecht): http://elastix.isi.uu.nl
• FSL/FLIRT/FNIRT (FMRIB, Oxford): http://www.fmrib.ox.ac.uk/fsl
• IRTK (Imperial College London): http://www.doc.ic.ac.uk/~dr/software/
• Slicer (BWH): http://www.slicer.org
• SPM (University College London): http://fil.ion.ucl.ac.uk/spm
• ...

References

• Besl P.J., McKay N.D. A method for registration of 3-D shapes, IEEE PAMI, 14(2):239-256, 1992

• Borgefors G., Hierarchical chamfer matching: a parametric edge matching algorithm, IEEE PAMI, 10(6):849-865, 1988

• Chen HM, Varshney PK. Mutual information-based CT-MR brain image registration using generalized partial volume joint histogram estimation. IEEE Trans. Med. Img., 22(9): 1111-1119, 2003

• Collignon, A., Vandermeulen, D., Suetens, P., and Marchal, G.. 3d multi-modality medical image registration using feature space clustering. CVRMED April 1995.

• Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P, Marchal G, Automated multi-modality image registration based on information theory. IPMI June 26, 1995.

• Cover T.M., and J.A. Thomas, Elements of Information Theory, 1991

• Debois M, Oyen R, Maes F, et al., The contribution of magnetic resonance imaging to the three-dimensional treatment planning of localized prostate cancer, International Journal of Radiation Oncology Biology Physics, 45(4): 857-865, 1999

• Hermosillo G, Chefd'Hotel C, Faugeras O, Variational methods for multimodal image matching, INTERNATIONAL JOURNAL OF COMPUTER VISION, 50(3): 329-343, 2002

• Hill D, Studholme, C., and Hawkes, D.. Voxel Similarity Measures for Automated Image Registration. VBC October 1994


• Maes F, Collignon A,Vandermeulen D, Marchal G, Suetens P. Multimodality image registration by maximization of mutual information, IEEE Trans. Med. Img., 16(2):187-198, 1997

• Maes F. Segmentation and Registration of Multimodal Medical Images: from Theory, Implementation and Validation to a Useful Tool in Clinical Practice. PhD thesis, KU Leuven, 1998.

• F.Maes, D. Vandermeulen, and P. Suetens. Comparative evaluation of multiresolution optimization strategies for multimodality image registration by maximization of mutual information. Medical Image Analysis, 3(4):373–386, 1999.

• Maes F, Vandermeulen D, Suetens P. Medical image registration using mutual information, Proc. IEEE 91(10): 1699-1722, 2003

• Maintz J.B.A., P.A. van den Elsen, M.A. Viergever, Evaluation of ridge seeking operators for multimodality medical image matching, IEEE Trans. PAMI, 1996.

• Maintz JBA, Viergever MA. A survey of medical image registration, Medical Image Analysis, 2(1):1-37, 1998

• Mattes D, Haynor DR, Vesselle H, et al., PET-CT image registration in the chest using free-form deformations, IEEE Trans. Med. Img., 22(1): 120-128, 2003

• Pelizzari C.A., G.T.Y. Chen, D.R. Spelbring, R.R. Weichselbaum, C-T. Chen, Accurate Three-Dimensional Registration of CT, PET, and/or MR Images of the brain, Journal of Computer Assisted Tomography, 13(1):20-26, 1989

• Pluim JPW, Maintz JBA, Viergever MA, Mutual-information-based registration of medical images: A survey, IEEE Trans. Med. Img., 22(8): 986-1004, 2003

• Rohr K. On 3D differential operators for detecting point landmarks. Image and Vision Computing, 15(3):219-233, 1997

• Studholme C., D.L.G. Hill, and D.J. Hawkes, Multiresolution voxel similarity measures for MR-PET registration, IPMI 1995

• Studholme C, D.L.G.Hill, D.J. Hawkes, An Overlap Invariant Entropy Measure of 3D Medical Image Alignment, Pattern Recognition, 32(1), Jan 1999, pp 71-86.

• Thevenaz P, Unser M, Optimization of mutual information for multiresolution image registration, IEEE Trans. Image processing, 9(12): 2083-2099, 2000

• van den Elsen P.A., J.B.A. Maintz, E.-J.D. Pol, M.A. Viergever, Automatic Registration of CT and MR Brain Images Using Correlation of Geometrical Features, IEEE TMI, 14(2): 384-396, 1995

• van den Elsen P.A., E.J.D. Pol, T.S. Sumanaweera, P.F. Hemler, S. Napel, J.R. Adler, Grey value correlation techniques used for automatic matching of CT and MR brain and spine images, SPIE 1994

• van den Elsen, P. A., Pol, E. J. D., and Viergever, M. A. Medical image matching– a review with classification. IEEE Engineering in medicine and biology, 12(1), 26–39, 1993


• Vansteenkiste JF, Stroobants SG, Dupont PJ, et al., FDG-PET scan in potentially operable non-small cell lung cancer: do anatometabolic PET-CT fusion images improve the localisation of regional lymph node metastases?, EUROPEAN JOURNAL OF NUCLEAR MEDICINE, 25(11):1495-1501, 1998

• Venot A., J.F. Lebruchec, and J.C. Roucayrol, A New Class of Similarity Measures for Robust Image Registration, Computer Vision, Graphics, and Image Processing, vol. 28, pp. 176-184, 1984.

• Viola, P. and Wells, W.. Alignment by maximization of mutual information. In Proceedings of the 5th International Conference of Computer Vision, June 20 – 23, 1995.

• Viola P, Wells WM. Alignment by maximization of mutual information. International Journal of Computer Vision. 24:137-154, 1997

• Wells WM, Viola P, Atsumi H, Nakajima S, Kikinis R. Multi-Modal Volume Registration by Maximization of Mutual Information. Medical Image Analysis, 1(1):35-51, 1996.

• West JB, Fitzpatrick JM, et al. Comparison and evaluation of retrospective intermodality image registration techniques. JCAT 1997.

• Woods R.P., Mazziotta J.C., Cherry S.R. MRI-PET registration with automated algorithm, Journal of Computer Assisted Tomography, 17(4):536-546, 1993

• Zitova B, Flusser J. Image registration methods: a survey. Image and Vision Computing, 21(11):977-1000, 2003


Aspects of mutual information-based image registration

Josien Pluim

Image Sciences Institute
University Medical Center Utrecht
The Netherlands

Outline

Image registration involves
• a similarity measure – f-information measures
• interpolation – interpolation artefacts
• optimization – acceleration of optimization

f-Information measures

Reference: Pluim 2004

f-Divergence measures

Distance between two probability distributions.

Definition:

\[ f(P \,\|\, Q) = \sum_i q_i \, f\!\left(\frac{p_i}{q_i}\right) \]

Example: Kullback-Leibler distance:

\[ \sum_i p_i \log \frac{p_i}{q_i} \]
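As a brief worked step, the Kullback-Leibler distance follows from the f-divergence definition for the choice f(x) = x log x. This particular f is stated here as the standard choice; the slide itself only gives the resulting expression.

```latex
\[
f(x) = x \log x
\quad\Longrightarrow\quad
\sum_i q_i \, f\!\left(\frac{p_i}{q_i}\right)
= \sum_i q_i \, \frac{p_i}{q_i} \log \frac{p_i}{q_i}
= \sum_i p_i \log \frac{p_i}{q_i}
\]
```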

f-Information measures

Subclass of f-divergence, measure of dependence. Divergence between the joint probability p_ij and the joint probability in case of independence, p_i p_j.

Definition:

\[ f(P \,\|\, P_1 \times P_2) = \sum_{i,j} p_i p_j \, f\!\left(\frac{p_{ij}}{p_i p_j}\right) \]

Example: mutual information

\[ \sum_{i,j} p_{ij} \log \frac{p_{ij}}{p_i p_j} \]

Choice of f

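Varying the function f (subject to certain requirements) yields various measures. Example f:

\[ I_\alpha(x) = \frac{x^\alpha - \alpha x + \alpha - 1}{\alpha(\alpha - 1)}, \qquad \alpha \neq 0, \ \alpha \neq 1 \]

[Figure: graph of the example function, labelled I0.2.]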

Resulting Iα-information:

\[ I_\alpha(P \,\|\, P_1 \times P_2) = \frac{1}{\alpha(\alpha - 1)} \left( \sum_{i,j} \frac{p_{ij}^{\alpha}}{(p_i p_j)^{\alpha - 1}} - 1 \right), \qquad \alpha \neq 0, \ \alpha \neq 1 \]

For α → 1, Iα(P || P1 × P2) equals mutual information.
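The step from the example f to the Iα-information can be written out as a short check. It assumes only that the joint probabilities and the product of the marginals each sum to one.

```latex
\[
\sum_{i,j} p_i p_j \,
\frac{\left(\frac{p_{ij}}{p_i p_j}\right)^{\alpha}
      - \alpha \frac{p_{ij}}{p_i p_j} + \alpha - 1}{\alpha(\alpha - 1)}
= \frac{1}{\alpha(\alpha - 1)}
  \left( \sum_{i,j} \frac{p_{ij}^{\alpha}}{(p_i p_j)^{\alpha - 1}}
         - \alpha \sum_{i,j} p_{ij}
         + (\alpha - 1) \sum_{i,j} p_i p_j \right)
= \frac{1}{\alpha(\alpha - 1)}
  \left( \sum_{i,j} \frac{p_{ij}^{\alpha}}{(p_i p_j)^{\alpha - 1}} - 1 \right)
\]
```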

Other f-information measures

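V-information:

\[ V(P \,\|\, P_1 \times P_2) = \sum_{i,j} \left| p_{ij} - p_i p_j \right| \]

Matusita information:

\[ M_\alpha(P \,\|\, P_1 \times P_2) = \sum_{i,j} \left| p_{ij}^{\alpha} - (p_i p_j)^{\alpha} \right|^{1/\alpha} \]

χα-information:

\[ \chi^{\alpha}(P \,\|\, P_1 \times P_2) = \sum_{i,j} \frac{\left| p_{ij} - p_i p_j \right|^{\alpha}}{(p_i p_j)^{\alpha - 1}} \]

Rényi information:

\[ R_\alpha(P \,\|\, P_1 \times P_2) = \frac{1}{\alpha - 1} \log \sum_{i,j} \frac{p_{ij}^{\alpha}}{(p_i p_j)^{\alpha - 1}} \]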
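A few of these measures can be evaluated from a joint probability table with a handful of lines. The following is a sketch assuming NumPy, α > 0, and a 2D array of joint probabilities (or counts); the function names are hypothetical and bins with p_i p_j = 0 are left out.

```python
import numpy as np

def _terms(p_joint):
    """Return p_ij and p_i * p_j as flat arrays, restricted to p_i * p_j > 0."""
    p = np.asarray(p_joint, dtype=float)
    p = p / p.sum()
    prod = p.sum(axis=1, keepdims=True) @ p.sum(axis=0, keepdims=True)
    mask = prod > 0
    return p[mask], prod[mask]

def mutual_information(p_joint):
    p, prod = _terms(p_joint)
    nz = p > 0                                    # 0 log 0 is taken as 0
    return float(np.sum(p[nz] * np.log(p[nz] / prod[nz])))

def i_alpha(p_joint, alpha):
    p, prod = _terms(p_joint)
    s = np.sum(p ** alpha / prod ** (alpha - 1))
    return float((s - 1.0) / (alpha * (alpha - 1.0)))

def v_information(p_joint):
    p, prod = _terms(p_joint)
    return float(np.sum(np.abs(p - prod)))

def renyi_information(p_joint, alpha):
    p, prod = _terms(p_joint)
    s = np.sum(p ** alpha / prod ** (alpha - 1))
    return float(np.log(s) / (alpha - 1.0))
```

For an independent table all four measures vanish, and for α close to 1 both `i_alpha` and `renyi_information` approach `mutual_information`.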

Examples

MR-CT, head, out-of-plane rotation, -60 to 60 degrees

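[Figure: similarity measure as a function of rotation angle for MI, Iα, Mα and V; panels are labelled with α values 0.2, 0.5, 0.8 and 0.2, 0.5, 2.0, 3.0.]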

Evaluation

Evaluation on MR-CT and MR-PET registration: head, rigid, RIRE data.
• Accuracy (screw marker-based gold standard).
• Robustness (convergence from many starting positions).

Conclusions

• Choice of α = 1 seems the best option for function smoothness
• Functions for small and large α are more difficult to optimize
• Some measures achieved better accuracy than MI (Iα, Rα, Mα; α ∈ {0.2, 0.5}).

Interpolation artefacts

References: Maes 1998, Pluim 2000


Interpolation artefacts

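Examples, MR-CT, head, axial translation

[Figure: registration function for partial volume interpolation and for linear interpolation.]

Interpolation

A point x is mapped to T(x), which lies between the grid points y1, y2, y3, y4 with interpolation weights w1, w2, w3, w4.

Partial volume interpolation
h( I(x), J(yi) ) += wi , ∀ i

Linear interpolation
J(T(x)) = Σi wi · J(yi)
h( I(x), J(T(x)) ) += 1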

Interpolation artefacts

When do they occur?

For images of equal grids. Problems occur when interpolation is not required for every transformation, i.e. when the grids align.

Linear interpolation

Local minima at grid alignment.

Typical example: MR-CT, head, axial translation

Artefacts occur because linear interpolation smooths the image. The resulting reduction of noise causes a decrease in joint entropy. For grid-aligning transformations, there is no interpolation, resulting in higher joint entropy and lower MI.
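The noise reduction caused by linear interpolation can be illustrated numerically. This is a sketch under simplifying assumptions (1D white Gaussian noise, a pure sub-voxel shift, hypothetical function name): for interpolation weights 1 - w and w, the noise variance is scaled by (1 - w)² + w², which equals 1 at grid alignment and 0.5 at a half-voxel shift.

```python
import numpy as np

def interpolated_noise_variance(shift, n=200_000, sigma=1.0, seed=0):
    """Variance of 1D white noise after linear interpolation at a sub-voxel shift."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=n)
    w = shift - np.floor(shift)                          # fractional part in [0, 1)
    resampled = (1.0 - w) * noise[:-1] + w * noise[1:]   # linear interpolation
    return float(resampled.var())
```

Calling the function with shifts 0.0 and 0.5 shows the variance dropping from about 1 to about 0.5, which is the smoothing effect that lowers joint entropy away from grid alignment.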

Linear interpolation and noise

Example: MNI brain atlas, MR-T1 and T2

[Figure: three panels with a horizontal axis from -20 to 20, for “no” noise, 3 percent noise and 5 percent noise.]

Partial volume interpolation

Local maxima at grid alignment.

Typical example: MR-CT, head, axial translation

Artefacts occur because PV interpolation increases the dispersion of the joint histogram, which causes an increase in joint entropy. For grid-aligning transformations, there is no interpolation, resulting in lower joint entropy and higher MI.

Resampling

Example: MR-CT, head, axial translation

[Figure: registration functions for partial volume interpolation and linear interpolation, at voxel size 1.5 mm and at voxel size 1.53 mm.]

Subvoxel accuracy

Interpolation artefacts may impede subvoxel accuracy.

Example: MR-CT, head, in-plane translation

[Figure: registration functions at original resolution and downsampled.]

Summary

Interpolation artefacts
• can occur when images have equal voxel size(s),
• are more pronounced for images of low resolution,
• can occur both for partial volume interpolation (local maxima) and linear interpolation (local minima),
• impede subvoxel accuracy.

Related work

Further studies into interpolation artefacts, including other similarity measures and interpolation methods:

Holden 2001, Tsao 2003, Ji 2003, Aljabar 2005, Rohde 2005, Inglada 2007, Thévenaz 2008, Rohde 2009

Proposed solutions

• Resampling
• Initial rotation
• Smoothing images
Inglada 2007, Rohde 2009

• Blurring of the joint histogram: Tsao 2003

• Small number of histogram bins: Ji 2003

• Oversampling: Ji 2003

• Use of a prior probability: at coarse levels in a pyramid, include the joint pdf from the finest level (Likar 2001, Gan 2004):

\[ p(I,J) = \lambda \, p_{\mathrm{current}}(I,J) + (1 - \lambda) \, p_{\mathrm{prior}}(I,J) \]

• Higher-order interpolation: Tsao 2003, Aljabar 2005, Rohde 2009

• Generalized Partial Volume Estimation: PV interpolation with a B-spline instead of a linear kernel. Introduced in Chen 2003; Wei 2004 (Gaussian instead of B-spline); Lu 2008 (Hanning windowed sinc instead of B-spline)

Proposed solutions

• Off-grid sampling: e.g. randomly perturbed grid positions, (x+Δx, y+Δy). Likar 2001, Tsao 2003, Seppä 2008

• Random sampling: Thévenaz 2008, Rohde 2009

• Constant variance interpolation: Thévenaz 2008a

• Variance correction filter: post-interpolation filter to counteract change in variance. Salvado 2007

Accelerating optimization

Reference: Klein 2007

Some related work

• Various optimization methods: Maes 1999

• Multiresolution approaches: Thévenaz 2000, Pluim 2001, Likar 2001

• Look-up tables: Sarrut 1999, Meihe 1999

• Parallelization / hardware implementations: Castro-Pareja 2004, Levin 2004, Ino 2005, Vetter 2007, Modat 2009

Optimization

Deformation modelled by B-splines.

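Control point displacements: μ = {μ1, μ2, μ3, ...}
Cost function: F(μ)

Aim: find μ that minimises F(μ)

Gradient descent

\[ \mu_{k+1} = \mu_k - a_k \, g_k \]

Written out per component:

\[
\begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ \vdots \end{pmatrix}_{k+1}
=
\begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ \vdots \end{pmatrix}_{k}
- a_k
\begin{pmatrix} g_1 \\ g_2 \\ g_3 \\ \vdots \end{pmatrix}_{k},
\qquad
g_{1,k} = \left. \frac{\partial F}{\partial \mu_1} \right|_{k}
\]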

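Gradient descent

[Figure: optimization path on F(μ), plotted over μ1 and μ2.]

Smarter steps

[Figure: optimization path on F(μ), plotted over μ1 and μ2.]

Cheaper steps

[Figure: optimization path on F(μ), plotted over μ1 and μ2.]

Comparison

• reference: Gradient descent
• smarter steps: Quasi-Newton, Conjugate gradient
• cheaper steps: Stochastic gradient

\[ \mu_{k+1} = \mu_k + a_k \, d_k \]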

Stochastic approach

Stochastic method uses an approximation g̃k to gk, obtained by subsampling. Convergence is guaranteed if the bias in the approximation error goes to zero:

\[ \mathrm{E}(\tilde{g}_k) \to g_k \quad \text{as } k \to \infty \]

Therefore take a new set of random samples in every iteration.

Deterministic methods can be used with a single set of regular samples.
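The idea of cheap, subsampled gradient steps can be sketched on a toy problem. This is not the registration cost function from Klein 2007: the least-squares cost, the decaying step size a_k = 1/(k + 20) and the function name are all assumptions made for illustration.

```python
import numpy as np

def stochastic_gradient_descent(X, y, n_iter=2000, n_samples=10, seed=0):
    """Minimise F(mu) = mean((X @ mu - y)**2) using subsampled gradients."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(X.shape[1])
    for k in range(n_iter):
        # New set of random samples in every iteration.
        idx = rng.choice(len(y), size=n_samples, replace=False)
        residual = X[idx] @ mu - y[idx]
        g_approx = 2.0 * X[idx].T @ residual / n_samples   # approximation of g_k
        a_k = 1.0 / (k + 20.0)                             # assumed decaying step size
        mu = mu - a_k * g_approx                           # mu_{k+1} = mu_k - a_k g~_k
    return mu
```

Each iteration touches only `n_samples` of the data points, so its cost is a small fraction of a full gradient evaluation.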

Experiments

Cardiac CT images, 3D, known deformations

Results

[Figure: registration error e [mm] (axis from 0 to 3) against ‘computation time’ (logarithmic axis from 0.001 to 1000) for gradient descent, quasi-Newton, conjugate gradient and stochastic gradient.]

Conclusions

• Cheap steps result in more acceleration than smart steps.

• Stochastic methods allow strong subsampling and hence a large reduction in computation time.

References

1. P. Aljabar, J.V. Hajnal, R.G. Boyes, D. Rueckert, Interpolation artefacts in non-rigid registration, MICCAI, LNCS 3750:247-254, Springer, 2005

2. A. Bardera, M. Feixas, I. Boada, Normalized similarity measures for medical image registration, SPIE Medical Imaging: Image Processing, Proc. SPIE 5370:108-118, 2004

3. C.R. Castro-Pareja, J.M. Jagadeesh, R. Shekhar, FAIR: a hardware architecture for real-time 3-D image registration, IEEE Trans. Inf. Technol. Biomed. 7(4):426-434, 2003

4. H. Chen and P.K. Varshney, Mutual information-based CT-MR brain image registration using generalized partial volume joint histogram estimation, IEEE Trans. Med. Imaging 22(9):1111-1119, 2003

5. R. Gan, J. Wu, A.C.S. Chung, S.C.H. Yu, W.M. Wells III, Multiresolution image registration based on Kullback-Leibler distance, MICCAI, LNCS 3216:599-606, Springer, 2004

6. Y. He, A.B. Hamza, H. Krim, A generalized divergence measure for robust image registration, IEEE Trans. Signal Process., 51(5):1211-1220, 2003

7. M. Holden, Registration of 3D serial MR brain images, PhD thesis, University of London, UK, 2001

8. J. Inglada, V. Muron, D. Pichard, T. Feuvrier, Analysis of artifacts in subpixel remote sensing image registration, IEEE Trans. Geosci. Remote Sensing 45(1):254-264, 2007

9. F. Ino, K. Ooyama, K. Hagihara, A data distributed parallel algorithm for nonrigid image registration, Parallel Comput. 31(1):19-43, 2005

10. J.X. Ji, H. Pan, Z.P. Liang, Further analysis of interpolation effects in mutual information-based image registration, IEEE Trans. Med. Imaging 22(9):1131-1140, 2003

11. S. Klein, M. Staring, J.P.W. Pluim, Evaluation of optimization methods for nonrigid medical image registration using mutual information and B-splines, IEEE Trans. Image Process. 16(12):2879-2890, 2007

12. S. Klein, J.P.W. Pluim, M. Staring, M.A. Viergever, Adaptive stochastic gradient descent optimisation for image registration, Int. J. Comput. Vis. 81(3):227-239, 2009

13. D. Levin, D. Dey, P.J Slomka, Acceleration of 3D, nonlinear warping using standard video graphics hardware: implementation and initial validation, Comput. Med. Imaging Graph. 28(8):471-483, 2004

14. B. Likar and F. Pernuš, A hierarchical approach to elastic registration based on mutual information, Image Vis. Comput. 19(1-2):33-44, 2001


15. X. Lu, S. Zhang, H. Su, Y. Chen, Mutual information-based multimodal image registration using a novel joint histogram estimation, Comput. Med. Imaging Graph. 32(3):202-209, 2008

16. F. Maes. Segmentation and registration of multimodal medical images: from theory, implementation and validation to a useful tool in clinical practice, PhD thesis, Catholic University of Leuven, Belgium, 1998.

17. F. Maes, D. Vandermeulen, P. Suetens, Comparative evaluation of multiresolution optimization strategies for multimodality image registration by maximization of mutual information, Med. Image Anal., 3(4):373-386, 1999

18. X. Meihe, R. Srinivasan, W.L. Nowinski, A fast mutual information method for multi-modal registration, IPMI, LNCS 1613: 466-471, Springer, 1999

19. M. Modat, G.R. Ridgway, Z.A. Taylor, D.J. Hawkes, N.C. Fox, S. Ourselin, A parallel-friendly normalized mutual information gradient for free-form registration, SPIE Medical Imaging: Image Processing, Proc. SPIE 7259, 2009

20. J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Interpolation artefacts in mutual information based image registration, Comput. Vis. Image Underst. 77(2):211-232, 2000


21. J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Mutual information matching in multiresolution contexts, Image Vis. Comput., 19(1-2):45-52, 2001

22. J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Mutual-information-based registration of medical images: a survey, IEEE Trans. Med. Imaging, 22(8):986-1004, 2003.

23. J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, f-Information measures in medical image registration, IEEE Trans. Med. Imaging 23(12):1508-1516, 2004

24. G.K. Rohde, A.S. Barnett, P.J. Basser, C. Pierpaoli, Estimating intensity variance due to noise in registered images: Applications to diffusion tensor MRI, NeuroImage 26(3):673-684, 2005

25. G.K. Rohde, A. Aldroubi, D.M. Healy, Interpolation artifacts in sub-pixel image registration, IEEE Trans. Image Process. 18(2):333-345, 2009

26. O. Salvado and D.L. Wilson, Removal of local and biased global maxima in intensity-based registration, Med. Image Anal. 11(2):183-196, 2007

27. D. Sarrut and S. Miguet, Fast 3D image transformations for registration procedures, ICIAP, 446-452, IEEE Computer Society, 1999

28. M. Seppä, Continuous sampling in mutual-information registration, IEEE Trans. Med. Imaging 17(5):823-826, 2008

29. P. Thévenaz and M. Unser, Optimization of mutual information for multiresolution image registration, IEEE Trans. Image Process., 9(12):2083-2099, 2000

30. P. Thévenaz, M. Bierlaire, M. Unser, Halton sampling for image registration based on mutual information, Sampling Theory Signal Image Process. 7(2):141-171, 2008

31. P. Thévenaz, T. Blu, M. Unser, Short basis functions for constant-variance interpolation, SPIE Medical Imaging: Image Processing, Proc. SPIE 6914, 2008a

32. J. Tsao, Interpolation artifacts in multimodality image registration based on maximization of mutual information, IEEE Trans. Med. Imaging 22(7):854-864, 2003

33. C. Vetter, C. Guetter, C. Xu, R. Westermann, Non-rigid multi-modal registration on the GPU, SPIE Medical Imaging: Image Processing, Proc. SPIE 6512, 2007

34. M. Wei and J. Liu, Artifacts reduction in mutual information-based CT-MR image registration, SPIE Medical Imaging: Image Processing, Proc. SPIE 5370:1176-1186, 2004

Probabilistic and information-theoretic approaches to registration

William Wells
Associate Professor of Radiology
Surgical Planning Laboratory
Harvard Medical School and Brigham and Women’s Hospital

Affiliated Faculty: Harvard – MIT Division of Health Sciences and Technology

Research Scientist – MIT CSAIL

Plan

• tour of probabilistic and information theoretic registration methods

• focus on generative models of the methods
– describe models with equations
– state results (skip derivations)

• framework for motivation and comparison

Outline
• Basic MAP approach to image registration
• Parametric models on image and intensity pairs
• MAP with known model parameters *
• KLD Registration and MI
• MAP with unknown model parameters
  – Prior probabilities on model parameters
  – Joint MAP *
  – Marginalized MAP
    • Weak prior *
    • Informative prior *
    • Strong prior
  – EM Algorithm to obtain estimates
    • Simple iteration *
    • Experimental Results
  – Group-wise Registration *

* Connections to prior work

Formalism from:

A Marginalized MAP Approach and EM Optimization for Pair-Wise Registration
Lilla Zollei, Mark Jenkinson, Samson Timoner, William Wells
IPMI 2007

Estimation

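y: data
θ: model parameters

Maximum Likelihood:

\[ \hat{\theta} = \arg\max_{\theta} \log p(y \mid \theta) \]

Maximum A Posteriori (MAP):

\[ \hat{\theta} = \arg\max_{\theta} \log p(\theta \mid y) = \arg\max_{\theta} \left[ \log p(y \mid \theta) + \log p(\theta) \right] \]

using Bayes rule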

Basic MAP Registration

• u, v: images
• T: transform on image
• Maximum A-posteriori Probability

\[ \hat{T} = \arg\max_{T} \log p(T \mid u, v) \]

\[ \hat{T} = \arg\max_{T} \log \left[ p(u, v \mid T) \, p(T) \right] \]

Probability on Image Pairs

• Kinematic Assumption

\[ p(u, v \mid T) = p(u(x), v(T(x))) \]

• Independently and Identically Distributed (IID) in space

\[ p(u, v \mid T) = \prod_{x_i} p(u(x_i), v(x_i) \mid T) \]

• x_i : voxel

Probability on Intensity Pairs

• B(·, ·) : Bin index
• Multinomial Distribution
– One trial

\[ p(u_i, v_i \mid \Theta) = \mathrm{Mult}(B(u_i, v_i); 1, \Theta) = \theta_{B(u_i, v_i)} \]

\[ 0 \le \theta_j \le 1, \qquad \sum_j \theta_j = 1 \]

Estimate Multinomial Distribution from Data

• Maximum Likelihood Method:
– Histogram the data
– Set the parameters to be normalized histogram counts
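The two steps above can be sketched in a few lines. The sketch assumes NumPy, that both images have already been converted to integer bin indices, and that the joint bin index is B(u, v) = u · n_bins_v + v; that indexing scheme and the function name are illustrative choices.

```python
import numpy as np

def estimate_theta(u, v, n_bins_u, n_bins_v):
    """ML estimate of the multinomial parameters from binned intensity pairs.

    u and v hold integer bin indices of two registered images.
    """
    u, v = np.ravel(u), np.ravel(v)
    j = u * n_bins_v + v                        # B(u_i, v_i): joint bin index
    counts = np.bincount(j, minlength=n_bins_u * n_bins_v)
    return counts / counts.sum()                # normalized histogram counts
```

For example, the pairs (0,0), (0,1), (1,1), (1,1) with two bins per image give the estimate [0.25, 0.25, 0, 0.5].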

Joint Image Histogram

[Figure: joint histogram with u intensities on one axis and v intensities on the other; bin indices 1 to 11. Data: pairs (u(xi), v(xi)). Bin counts, e.g., n9 = 2.]

Probability on Image Pairs…

• n_j(T) : Number of voxel pairs that map to bin j
• Multinomial on sequence of observations
– (not counts)

\[
p(u, v \mid T, \Theta)
= \prod_i \mathrm{Mult}(B(u_i, v_i); 1, \Theta)
= \mathrm{Mult}(\{B(u_{x_1}, v_{y_1}) \cdots B(u_{x_N}, v_{y_N})\}; N, \Theta)
= \prod_j \theta_j^{\,n_j(T)}
\]

MAP Registration: Known Model

\[
\begin{aligned}
\hat{T} &= \arg\max_{T} \log p(T \mid u, v, \Theta) \\
&= \arg\max_{T} \log \left[ p(u, v \mid T, \Theta) \, p(T) \right] \\
&= \arg\max_{T} \left[ \sum_j n_j(T) \log(\theta_j) + \log P(T) \right]
\end{aligned}
\]

MAP Registration: Known Model…

• Training
– Estimate Θ from registered images
  • ML: normalized histogram:

\[ \hat{\Theta} = \frac{n(T_0)}{N} \]

• Registration
– Simple Objective Function
  • Linear in counts

\[ \hat{T} = \arg\max_{T} \left[ \sum_{j=1}^{g} n_j(T) \log(\theta_j) + \log P(T) \right] \]
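The training and registration steps can be sketched for a toy 1D problem. The setting is an assumption made for illustration only: the transform T is a cyclic integer shift, the prior P(T) is flat and therefore dropped, images hold integer bin indices, and a small `eps` guards against log(0) for bins that were empty during training.

```python
import numpy as np

def log_likelihood(u, v, theta, n_bins_v, eps=1e-12):
    """sum_j n_j log(theta_j) for binned images u, v under a fixed model theta."""
    j = np.ravel(u) * n_bins_v + np.ravel(v)
    counts = np.bincount(j, minlength=theta.size)       # n_j(T)
    return float(np.sum(counts * np.log(theta + eps)))  # linear in the counts

def register_shift(u, v, theta, n_bins_v, shifts):
    """Pick the cyclic shift of v that maximises the fixed-model objective."""
    scores = [log_likelihood(u, np.roll(v, s), theta, n_bins_v) for s in shifts]
    return shifts[int(np.argmax(scores))]
```

Because the model is fixed, each candidate transform only requires re-histogramming the image pair and taking a dot product with the precomputed log θ.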

ML: Known Model

\[ \hat{T} = \arg\max_{T} \log \left[ p(u, v \mid T, \Theta) \right] \]

* M Leventon, W Grimson, W Wells. Multi-Modal Volume Registration Using Joint Intensity Distributions. MICCAI 98

KLD Registration

\[ \hat{T} = \arg\min_{T} D\left[ \hat{p}(u, v; T) \,\|\, p(u, v; T = 0) \right] \]

• p̂(u, v; T): estimated from current images at transform T
• p(u, v; T = 0): estimated from correctly registered images

• D[· || ·] : KL Divergence
– Compares probability distributions
– Non-negative, zero for identical distributions, not symmetric

\[ D[p(x) \,\|\, q(x)] \doteq \sum_x p(x) \log \frac{p(x)}{q(x)} \]
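The divergence itself is short to compute once both distributions are available as histograms. A minimal sketch assuming NumPy, with the conventions that terms with p(x) = 0 contribute zero and that the divergence is infinite when p puts mass where q has none:

```python
import numpy as np

def kl_divergence(p, q):
    """D[p || q] = sum_x p(x) log(p(x) / q(x)), in nats."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    nz = p > 0                        # terms with p(x) = 0 contribute 0
    if np.any(q[nz] == 0):
        return float("inf")           # p puts mass where q has none
    return float(np.sum(p[nz] * np.log(p[nz] / q[nz])))
```

Swapping the arguments generally changes the value, which reflects the lack of symmetry noted above.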

KLD Registration Example…

Multi-Modality image registration by minimising Kullback-Leibler distance
ACS Chung, WM Wells, WEL Grimson, A Norbash
MICCAI 2002

• 2D/3D DSA to MRA

[Figure: Before Registration. Provided by Albert Chung]

[Figure: After Registration. Provided by Albert Chung]

KLD and MI Registration

KLD registration:

\[ \hat{T} = \arg\min_{T} D\left[ \hat{p}(u, v; T) \,\|\, p(u, v; T = 0) \right] \]

MI Registration:

\[ \hat{T} = \arg\max_{T} D\left[ \hat{p}(u, v; T) \,\|\, \hat{p}(u) \, \hat{p}(v \mid T) \right] \]

Strong vs. Weak Models

• Capture / Bias Tradeoff
– Strong Model, e.g.: ML, MAP, KLD
  • Robust: large capture
  • Model may be inaccurate for new images
    – Less accurate estimate of T
– Weak Model, e.g.: min Entropy, max MI
  • Less robust: smaller capture
  • More accurate estimate of T


Model Parameters, Θ, Unknown

• Prior: p(Θ|w)
• Joint prior is independent
– p(T,Θ) = p(T) p(Θ|w)

• Joint Posterior:

\[ p(T, \Theta \mid u, v, w) \propto p(u, v \mid T, \Theta) \, p(T) \, p(\Theta \mid w) \]

Joint MAP

• Estimate T, Θ jointly

\[ \widehat{T\Theta} = \arg\max_{T, \Theta} \, p(T, \Theta \mid u, v, w) \]

• Θ is a nuisance parameter
– discard estimate of Θ

\[ \hat{T} = \arg\max_{T} \left[ \max_{\Theta} \, p(T, \Theta \mid u, v, w) \right] \]

Joint MAP *

• Θ is maximized out

\[ \hat{T} = \arg\max_{T} \left[ \max_{\Theta} \, p(T, \Theta \mid u, v, w) \right] \]

• Joint Maximum Likelihood
– Connections to entropy and Mutual Information

• Related: A. Roche, G. Malandain, and N. Ayache. Unifying maximum likelihood approaches in medical image registration. International Journal of Imaging Systems and Technology, 11(7180):71–80, 2000

* L. Zöllei. A Unified Information Theoretic Framework for Pair- and Group-wise Registration of Medical Images. Ph.D. thesis, MIT

Marginalize Nuisance Parameter

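• Maximize out:

\[ \hat{T} = \arg\max_{T} \left[ \max_{\Theta} \, p(T, \Theta \mid u, v, w) \right] \]

• Alternative: Average or Marginalize:

\[
\begin{aligned}
\hat{T} &= \arg\max_{T} \left[ \int p(T, \Theta \mid u, v, w) \, d\Theta \right]
= \arg\max_{T} \, p(T \mid u, v, w) \\
&= \arg\max_{T} \left[ \int p(u, v \mid T, \Theta) \, p(T) \, p(\Theta \mid w) \, d\Theta \right]
\end{aligned}
\]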

Dirichlet Prior on Θ

\[ p(\Theta \mid w) = \mathrm{Dir}(\Theta; w) \]

• Conjugate prior for Multinomial
• Multi-category generalization of Beta
• Parameterized by pseudo-data counts: w

Marginalized MAP

\[
\begin{aligned}
\hat{T} &= \arg\max_{T} \log p(T \mid u, v, w) \\
&= \arg\max_{T} \left[ \log P(T) + \sum_{j=1}^{g} \log \Gamma\left( n_j(T) + w_j \right) \right]
\end{aligned}
\]
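The marginalized objective depends on the images only through the bin counts, so it can be evaluated directly from a joint histogram. This is a sketch with hypothetical function names that uses `math.lgamma` for log Γ and ignores terms that do not depend on T; the second helper computes N·H of the normalized histogram for comparison.

```python
import math
import numpy as np

def marginalized_log_posterior(counts, w, log_prior_T=0.0):
    """log P(T) + sum_j log Gamma(n_j(T) + w_j), up to T-independent terms."""
    return log_prior_T + sum(math.lgamma(n + wj) for n, wj in zip(counts, w))

def n_times_entropy(counts):
    """N * H of the normalized histogram, in nats."""
    n = np.asarray(counts, dtype=float)
    p = n[n > 0] / n.sum()
    return float(-n.sum() * np.sum(p * np.log(p)))
```

With pseudo-counts w_j = 1, a histogram concentrated in one bin scores higher than a flat histogram with the same total count, in line with the preference for low entropy.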

3 Cases on strength of prior

• Weak prior: Laplace Prior
• Informative prior
• (Strong prior: dominates data: model is known)
– back to where we started

Laplace Prior: wj = 1

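\[ \hat{T} = \arg\max_{T} \left[ \log P(T) + \sum_{j=1}^{g} \log \Gamma\left( n_j(T) + 1 \right) \right] \]

Use Stirling’s approximation…

\[ \hat{T} \approx \arg\min_{T} \left[ N \cdot H\left[ \mathrm{Mult}\left( 1, \frac{n(T)}{N} \right) \right] - \log p(T) \right] \]

Minimum Entropy Registration emerges!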
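The Stirling step can be sketched as follows. It uses the leading terms log Γ(n + 1) = log n! ≈ n log n − n and assumes that the total number of voxel pairs N = Σ_j n_j(T) does not depend on T, so that N log N − N is a constant that drops out of the argmax.

```latex
\[
\sum_{j=1}^{g} \log \Gamma\left( n_j + 1 \right)
\approx \sum_{j} \left( n_j \log n_j - n_j \right)
= N \sum_{j} \frac{n_j}{N} \log \frac{n_j}{N} + N \log N - N
= -N \cdot H\left[ \frac{n}{N} \right] + N \log N - N
\]
```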

Informative Prior

\[
\begin{aligned}
\hat{T} &= \arg\max_{T} \left[ \log P(T) + \sum_{j=1}^{g} \log \Gamma\left( n_j(T) + w_j \right) \right] \\
&\approx \arg\min_{T} \left[ N \cdot c \cdot H\left[ \mathrm{Mult}\left( 1, \frac{n(T) + w}{N + w_0} \right) \right] - \log p(T) \right]
\end{aligned}
\]

• using log(Γ(x)) ≈ x log(x)

• Minimize entropy of pooled data *

* Mert Sabuncu. Entropy-based Methods for Image Registration. PhD Thesis, Princeton 2006.

Entropic Probability Model

Recap on the posterior distribution:

\[ p(T \mid u, v, w) \propto e^{-N H\left[ \mathrm{Mult}\left( 1, \frac{n(T) + w}{N + w_0} \right) \right]} \, p(T) \]

Example use of Dirichlet Prior

L. Zöllei, W.M. Wells III: "Multi-modal Image Registration Using Dirichlet-encoded Prior Information", WBIR06

Recap: Marginalized MAP registration

• 3 Cases on strength of prior
– Weak prior: Laplace Prior
  • Minimize entropy
– Informative prior
  • Minimize entropy of pooled data
– (Strong prior: dominates data
  • MAP using known fixed model)


EM Algorithm

• ML parameter estimation
• Observed data
• Hidden data
• Simple iterative algorithm
• Nice convergence properties

A. P. Dempster, N. M. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1):1–38, 1977.

EM Estimator of T | u v w

• Timoner’s Iteration:

\[ \hat{T}_{\mathrm{next}} \approx \arg\max_{T} \left[ \sum_{j=1}^{g} n_j(T) \log\left( n_j(\hat{T}_{\mathrm{old}}) + w_j - .5 \right) + \log P(T) \right] \]

\[ \hat{T}_{\mathrm{next}} \approx \arg\max_{T} \left[ \sum_{j=1}^{g} n_j(T) \log\left( n_j(\hat{T}_{\mathrm{old}}) + \epsilon \right) + \log P(T) \right] \]

• Iterate:
– (Re) estimate model from current configuration
  • Histogram joint intensities
– Do MAP registration with fixed model
  • Simple objective function: linear in counts
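The alternation between re-estimating the model and registering with a fixed model can be sketched for a toy 1D problem. Everything specific here is an assumption made for illustration: T is a cyclic integer shift, the prior P(T) is flat, images hold integer bin indices, and `eps` plays the role of the small constant ε. As with any such iteration, the result depends on the starting transform.

```python
import numpy as np

def joint_counts(u, v, n_bins_v, size):
    """Histogram of joint bin indices, n_j, for binned images u and v."""
    return np.bincount(np.ravel(u) * n_bins_v + np.ravel(v), minlength=size)

def iterate_registration(u, v, n_bins_u, n_bins_v, shifts, start=0,
                         eps=0.5, max_iter=20):
    """Alternate model re-estimation and fixed-model registration over shifts."""
    size = n_bins_u * n_bins_v
    t_old = start
    for _ in range(max_iter):
        # (Re)estimate the model from the current configuration.
        log_model = np.log(joint_counts(u, np.roll(v, t_old), n_bins_v, size) + eps)
        # Registration with the model held fixed: objective is linear in counts.
        scores = [np.dot(joint_counts(u, np.roll(v, s), n_bins_v, size), log_model)
                  for s in shifts]
        t_new = shifts[int(np.argmax(scores))]
        if t_new == t_old:            # fixed point reached
            break
        t_old = t_new
    return t_old
```

Each pass only needs joint histograms and a dot product, which is what makes the fixed-model step cheap.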

Experiment *

• Samson Timoner PhD Thesis
• Sequential Intra-Operative MRI
• “Samson’s Iteration”
• Linear Elastic Deformation Energy: E(T)
– p(T) ∝ exp( – E(T))
• Iterated relaxation of deformation energy
– Equivalent to Viscous fluid **

* S Timoner. Compact Representations for Fast Nonrigid Registration of Medical Images. PhD Thesis, MIT 2003

** X. Papademetris, E. T. Onat, A. J. Sinusas, D. P. Dione, R. T. Constable, and J. S. Duncan. The active elastic model. In Proceedings of IPMI, volume 0558 of LNCS, pages 36–49. Springer, 2001.

Timoner’s Iteration

• Iterate:
– (Re) estimate model from current configuration
  • Histogram joint intensities
– Do MAP registration with fixed model
  • Simple objective function: linear in counts
– Relax energy of current deformation

[Figure: Signa SP (GE Medical Systems). R. Pergolizzi]

[Figure: Intra-Operative Image. Provided by Samson Timoner]

[Figure: Second Intra-Operative Image. Provided by Samson Timoner]

[Figure: Second Image Warped to First Image. Provided by Samson Timoner]


Groupwise Registration

Goal: find “central tendency”:

[Figure: a group of images with transforms T1, T2, T3, T4, T5, T6, T7, ..., TN.]

Groupwise Registration

Consider IID in space, kinematic assumption

But, High dimension density estimation is difficult…

Model for Congealing

Independent (not identical) in space
IID across the images

One dimensional density estimation is easier…

Congealing

Minimize total entropy of PDFs at all voxels

\[ \hat{T} \approx \arg\min_{T} \sum_j H\left[ \hat{p}_j(I \mid T) \right] \]

p̂_j(I|T): probability distribution on intensity at voxel j, estimated from observed intensities given transform T

• (different) Multinomial Model at each voxel
• Uninformative prior on model parameters
• (no prior on transforms)
• Marginalize out unknown model parameters
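The congealing objective, the sum of per-voxel entropies across the image stack, can be sketched as follows. The sketch assumes NumPy, a stack of already transformed images flattened to shape (n_images, n_voxels) with integer bin indices, and a hypothetical function name; the search over transforms is not shown.

```python
import numpy as np

def total_voxelwise_entropy(stack, n_bins):
    """Sum over voxels j of the entropy of the intensity distribution across images."""
    stack = np.asarray(stack)
    total = 0.0
    for j in range(stack.shape[1]):
        counts = np.bincount(stack[:, j], minlength=n_bins)   # histogram at voxel j
        p = counts[counts > 0] / stack.shape[0]
        total -= float(np.sum(p * np.log(p)))
    return total
```

Only one-dimensional histograms are needed, one per voxel, which is the practical advantage over estimating a single high-dimensional density.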

Congealing…

• Erik G. Miller, (Feb., 2002) Ph.D. Thesis: Learning from One Example in Machine Vision by Sharing Probability Densities. MIT EECS.

• Erik Learned-Miller, (2005) Data Driven Image Models through Continuous Joint Alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).

• Lilla Zollei, Erik Learned-Miller, Eric Grimson and William Wells, (2005) Efficient population registration of 3D data. Workshop on Computer Vision for Biomedical Image Applications: Current Techniques and Future Trends, at the International Conference of Computer Vision (ICCV). (Best Paper Award)


127 Adult MRI


Before and After Congealing

Data set: 127 T1w MRI; [256x256x124] with (0.9375, 0.9375, 1.5) mm³ voxels
Experiment: 3 levels; 12-param. affine; N = 800-1600; iter = 250; time = 6 hrs


• Balci S, Golland P, Wells W. Non-Rigid groupwise registration using b-spline deformation model. Insight Journal, http://hdl.handle.net/1926/568, 2007.

30 FBIRN Subjects

Affine

B-Spline


Joint Congealing Two Infant Populations

• 17 full term
• 22 pre term
• Group analysis of transform params
– Significant difference in shape


Summary
• Basic MAP approach to image registration
• Parametric models on image and intensity pairs
• MAP with known model parameters
• KLD Registration and MI
• MAP with unknown model parameters
– Prior probabilities on model parameters
– Joint MAP
– Marginalized MAP
  • Weak prior
  • Informative prior
  • Strong prior
– EM Algorithm to obtain estimates
  • Simple iteration
  • Experimental Results
– Group-wise Registration


References

• Zollei L, Jenkinson M, Timoner S, Wells W. A Marginalized MAP Approach and EM Optimization for Pair-Wise Registration. IPMI 2007.

• M Leventon, W Grimson, W Wells. Multi-Modal Volume Registration Using Joint Intensity Distributions. MICCAI 98.

• Chung ACS, Wells W, Grimson W, Norbash A. Multi-Modality image registration by minimising Kullback-Leibler distance. MICCAI 2002.

• L. Zöllei. A Unified Information Theoretic Framework for Pair- and Group-wise Registration of Medical Images. Ph.D. thesis, MIT

• A. Roche, G. Malandain, and N. Ayache. Unifying maximum likelihood approaches in medical image registration. International Journal of Imaging Systems and Technology, 11:71–80, 2000

• Mert Sabuncu. Entropy-based Methods for Image Registration. PhD Thesis, Princeton 2006.

• L. Zöllei, W.M. Wells III: "Multi-modal Image Registration Using Dirichlet-encoded Prior Information", WBIR06.


References

• A. P. Dempster, N. M. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1):1–38, 1977.

• S Timoner. Compact Representations for Fast Nonrigid Registration of Medical Images. PhD Thesis, MIT 2003

• X. Papademetris, E. T. Onat, A. J. Sinusas, D. P. Dione, R. T. Constable, and J. S. Duncan. The active elastic model. In Proceedings of IPMI, volume 0558 of LNCS, pages 36–49. Springer, 2001.

• Erik G. Miller, (Feb., 2002) Ph.D. Thesis: Learning from One Example in Machine Vision by Sharing Probability Densities. MIT EECS.

• Erik Learned-Miller, (2005) Data Driven Image Models through Continuous Joint Alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).

• Lilla Zollei, Erik Learned-Miller, Eric Grimson and William Wells, (2005) Efficient population registration of 3D data. Workshop on Computer Vision for Biomedical Image Applications: Current Techniques and Future Trends, at the International Conference of Computer Vision (ICCV). (Best Paper Award)


Incorporating local context in MI based registration: spatial and voxel label information

Frederik Maes, D. Loeckx

K.U. Leuven Dept. of Electrical Engineering (ESAT/PSI)

UZ Gasthuisberg Medical Imaging Research Center
Leuven, Belgium

Intensity-based non-rigid registration

p2 = p1 – u(p1)

p1 p2

Reference Template

• Find ‘realistic’ deformation field that maximizes ‘similarity’ between both images

Regularization

Original Deformed: valid Deformed: not valid

• Not all deformation fields are acceptable:

Impose constraints using a suitable deformation model

Deformation model

• Implicit regularization: basis functions
– Global support: polynomial, Gaussian, thin-plate spline
– Local support: B-spline, radial basis functions
– Implicitly smooth at small scale, explicit regularization at larger scale

• Explicit regularization:
– Smoothness penalty: Jacobian, volume preservation, rigidity
– Physics-inspired PDE: elastic, viscous fluid, diffeomorphism

• (Biomechanical deformation model)
• (Statistical deformation model)

Example: B-spline deformation model

Rueckert 1999 (B-spline), Meyer 1997 (thin plate spline)

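$\mathbf{g}(\mathbf{r}_R; \boldsymbol{\mu}) = \sum_{ijk} \boldsymbol{\mu}_{ijk}\, \beta^2_{\Delta_x}(x_R - k_x^i)\, \beta^2_{\Delta_y}(y_R - k_y^j)\, \beta^2_{\Delta_z}(z_R - k_z^k)$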
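A one-dimensional sketch of a second-order B-spline deformation, assuming uniformly spaced knots at i·spacing and the standard centred quadratic B-spline basis. `bspline2` and `deform_1d` are illustrative names; the 3D model on the slide is the tensor product of three such factors.

```python
def bspline2(t):
    """Centred quadratic (second-order) B-spline basis function."""
    a = abs(t)
    if a <= 0.5:
        return 0.75 - a * a
    if a <= 1.5:
        return 0.5 * (a - 1.5) ** 2
    return 0.0

def deform_1d(x, coeffs, spacing):
    """Displacement g(x; mu) = sum_i mu_i * beta2((x - k_i) / spacing).

    Knots are assumed to sit at k_i = i * spacing (an assumption of this sketch).
    """
    return sum(mu * bspline2((x - i * spacing) / spacing)
               for i, mu in enumerate(coeffs))

# Only the three knots nearest to x contribute: local support.
print(deform_1d(14.3, [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0], 4.0))
```

Because the basis functions sum to one, equal coefficients at all knots give a pure translation in the interior of the mesh, and changing one coefficient only changes the deformation within 1.5 knot spacings of that knot.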

Example: viscous fluid deformation model

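• u = deformation field, v = deformation velocity
• F = force field, derived from similarity measure
• λ, μ = material parameters (μ=1, λ=0)

$\mu \nabla^2 v + (\lambda + \mu) \nabla(\nabla \cdot v) + F(x,u) = 0$

$v = \frac{du}{dt} = \frac{\partial u}{\partial t} + \sum_{i=1}^{3} v_i \frac{\partial u}{\partial x_i}$

Christensen 1998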


Model-based non-rigid registration

Cost function:

C(u) = -Csimilarity(u,I) + Cpenalty (u)

Optimization: minimize C wrt deformation field u:

$\frac{\partial C(u)}{\partial u_i} = -\frac{\partial C_{similarity}}{\partial u_i} + \frac{\partial C_{penalty}}{\partial u_i} = 0, \quad \forall i$

MI as similarity measure for NRR?

• Global Histogram
– Multimodal registration:
  • MMI wants to minimise minor in favour of major peaks in the histogram
  • NRR will reduce smaller image details
– Non-stationary intensity relationship, e.g. bias:
  • NRR will register bias fields, not image features

Proposed solution: incorporate voxel label information

• Local Histogram
– Limited number of samples
  • Statistical power?

Proposed solution: overlapping subregions

Spatially conditional MI

D. Loeckx

Local Mutual Information

• Introduce spatial location as an extra variable
– $p(r,f) \rightarrow p(r,f,x)$: allows for spatially varying p(r,f)
– $p(x)$: spatial label, spatial bins, overlapping regions

• How to compute MI between r, f, x?
– Total correlation (Studholme 2006): $C(R,F,X) = H(R) + H(F) + H(X) - H(R,F,X)$
– Conditional MI: $I(R,F|X) = H(R|X) + H(F|X) - H(R,F|X)$ = MI between R and F when X is known

Mutual information

$I(R,F) = H(R) + H(F) - H(R,F)$

$H(X) = -\sum_x p(x) \log p(x)$

R: Reference image (histogram)
F: Floating image (histogram)
H: Entropy

[Figure: Venn diagram, H(R) + H(F) – H(R,F) = I(R,F)]
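A small sketch of the histogram-based estimate, assuming both images have already been binned and are given as paired lists of bin indices over the overlap region. Counting pairs with `Counter` stands in for the joint histogram; no interpolation or Parzen windowing is modelled here.

```python
import math
from collections import Counter

def entropy_from_counts(counts):
    """Shannon entropy (nats) of a histogram given as a Counter."""
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values() if c > 0)

def mutual_information(r, f):
    """I(R,F) = H(R) + H(F) - H(R,F) from paired, already-binned intensities."""
    return (entropy_from_counts(Counter(r))
            + entropy_from_counts(Counter(f))
            - entropy_from_counts(Counter(zip(r, f))))

# A one-to-one intensity mapping gives I = H(R); unrelated intensities give 0.
print(mutual_information([0, 0, 1, 1], [5, 5, 9, 9]))
print(mutual_information([0, 0, 1, 1], [5, 9, 5, 9]))
```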

Conditional Mutual Information

$I(R,F|X) = H(R|X) + H(F|X) - H(R,F|X)$

[Figure: Venn diagram of H(R), H(F) and H(X), H(R|X) + H(F|X) – H(R,F|X) = I(R,F|X)]

Locally, if I know p(r), can I better predict p(f)?

$I(R,F|X) = \sum_x p(x) \sum_r \sum_f p(r,f|x) \log \left( \frac{p(r,f|x)}{p(r|x)\, p(f|x)} \right) = \sum_x p(x)\, I(R,F|x)$
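A sketch of conditional MI as the p(x)-weighted sum of the MI inside each spatial bin. It assumes hard, non-overlapping spatial labels, which is a simplification of the overlapping regions discussed in this talk. The toy data mimic a non-stationary intensity relationship: the mapping between r and f flips between the two regions.

```python
import math
from collections import Counter, defaultdict

def entropy(symbols):
    """Plug-in Shannon entropy (nats) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log(c / n) for c in Counter(symbols).values())

def mutual_information(r, f):
    return entropy(r) + entropy(f) - entropy(list(zip(r, f)))

def conditional_mutual_information(r, f, x):
    """I(R,F|X) = sum_x p(x) I(R,F | X = x), with x a hard spatial label."""
    groups = defaultdict(list)
    for ri, fi, xi in zip(r, f, x):
        groups[xi].append((ri, fi))
    n = len(x)
    total = 0.0
    for pairs in groups.values():
        rs, fs = zip(*pairs)
        total += (len(pairs) / n) * mutual_information(rs, fs)
    return total

x = [0, 0, 0, 0, 1, 1, 1, 1]          # spatial bin of each sample
r = [0, 0, 1, 1, 0, 0, 1, 1]          # reference intensity bins
f = [5, 5, 9, 9, 9, 9, 5, 5]          # floating intensity bins, mapping flips in bin 1
print(mutual_information(r, f), conditional_mutual_information(r, f, x))
```

In this example the global joint histogram mixes the two mappings and the global MI is zero, while the conditional MI recovers the full log 2 of shared information.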


Conditional Parzen window

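Joint histogram with Parzen windows:

$H(r,f;\boldsymbol{\mu}) = \sum_{\mathbf{x}_R \in R} w_r\big(I_R(\mathbf{x}_R) - r\big) \cdot w_f\big(I_F(\mathbf{g}(\mathbf{x}_R;\boldsymbol{\mu})) - f\big)$

$p(r,f;\boldsymbol{\mu}) = \frac{H(r,f;\boldsymbol{\mu})}{\sum_{r,f} H(r,f;\boldsymbol{\mu})}$

$\frac{\partial H(r,f;\boldsymbol{\mu})}{\partial \mu_{ijk}} = \sum_{\mathbf{x}_R \in R} w_r\big(I_R(\mathbf{x}_R) - r\big) \cdot \frac{\partial w_f(\xi)}{\partial \xi} \frac{\partial \xi}{\partial \mu_{ijk}}, \quad \xi = I_F(\mathbf{g}(\mathbf{x}_R;\boldsymbol{\mu})) - f$

Conditional version: weight every contribution by a spatial window $w_{\mathbf{x}}(\mathbf{x}_R - \mathbf{x}_s)$, in the histogram $H(r,f,\mathbf{x}_s;\boldsymbol{\mu})$ as well as in its derivative $\partial H(r,f,\mathbf{x}_s;\boldsymbol{\mu})$:

$H(r,f,\mathbf{x}_s;\boldsymbol{\mu}) = \sum_{\mathbf{x}_R \in R} w_{\mathbf{x}}(\mathbf{x}_R - \mathbf{x}_s) \cdot w_r\big(I_R(\mathbf{x}_R) - r\big) \cdot w_f\big(I_F(\mathbf{g}(\mathbf{x}_R;\boldsymbol{\mu})) - f\big)$

$p(r,f|\mathbf{x}_s;\boldsymbol{\mu}) = \frac{p(r,f,\mathbf{x}_s;\boldsymbol{\mu})}{p(\mathbf{x}_s;\boldsymbol{\mu})} = \frac{H(r,f,\mathbf{x}_s;\boldsymbol{\mu})}{\sum_{r,f} H(r,f,\mathbf{x}_s;\boldsymbol{\mu})}$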

Spatial bins

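• Same concept of ‘spatial resolution’
⇒ Use same settings for mesh knots and spacing
⇒ Local transformation guided by local joint histogram

$\mathbf{g}(\mathbf{x}_R;\boldsymbol{\mu}) = \sum_{ijk} \boldsymbol{\mu}_{ijk}\, \beta^2_{\Delta_x}(x_{R,x} - k_{x,i})\, \beta^2_{\Delta_y}(x_{R,y} - k_{y,j})\, \beta^2_{\Delta_z}(x_{R,z} - k_{z,k})$

$w_{\mathbf{x}}(\mathbf{x}_R - \mathbf{x}_s) = \beta^2_{\Delta_x}(x_{R,x} - x_{s,x})\, \beta^2_{\Delta_y}(x_{R,y} - x_{s,y})\, \beta^2_{\Delta_z}(x_{R,z} - x_{s,z})$

$H(r,f,\mathbf{x}_s;\boldsymbol{\mu}) = \sum_{\mathbf{x}_R \in R} w_{\mathbf{x}}(\mathbf{x}_R - \mathbf{x}_s) \cdot w_r\big(I_R(\mathbf{x}_R) - r\big) \cdot w_f\big(I_F(\mathbf{g}(\mathbf{x}_R;\boldsymbol{\mu})) - f\big)$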

Toy experiment

• 200 2D image pairs
– ‘CT’, ‘MR’
– 256x256 pixels
– I = 0, 200, 400, noise σ = 50
– Mesh spacing 32 voxels
– 32 bins, PW, PV

• Initial transformation
– μ uniform, < 30 pixels

• Validation
– Intensity difference
– Warping index
– ROI: 10% outside polygon

[Figure: CT (original) and MR (warped); results for PV conditional MI, PW conditional MI, PV global MI and PW global MI]

Toy experiment

[Figure: CT (original) and MR (warped); results for PV conditional MI, PW conditional MI, PV global MI and PW global MI]

Theoretical foundation

[Figure: reference image R and floating image F with the corresponding joint histograms. Global MI (whole image): local optimum. Conditional MI (central region): global optimum.]

Clinical CT/MR

• Data
– Radiotherapy
  • Colorectal cancer
  • Delineations MR CT
– 3 CT/MR pairs
– Manual delineations
  • rectum, mesorectum

• Settings
– PW, multiresolution, 64 bins

• Validation
– Dice similarity (DSC)
– Centroid distance (cD)

Black: original (ground truth)
White 1: global MI
White 2: conditional MI
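The two validation measures can be sketched as follows, assuming each delineation is given as a set of voxel index tuples. The optional `spacing` argument (voxel size per axis) is an assumption of this sketch, added so that the centroid distance can be reported in mm.

```python
import math

def dice(a, b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    return 2.0 * len(a & b) / (len(a) + len(b))

def centroid(voxels):
    pts = list(voxels)
    dims = len(pts[0])
    return tuple(sum(p[d] for p in pts) / len(pts) for d in range(dims))

def centroid_distance(a, b, spacing=None):
    """Euclidean distance between the centroids of two delineations."""
    ca, cb = centroid(a), centroid(b)
    spacing = spacing or (1.0,) * len(ca)
    return math.sqrt(sum(((p - q) * s) ** 2 for p, q, s in zip(ca, cb, spacing)))

manual = {(0, 0), (0, 1)}
warped = {(0, 1), (0, 2)}
print(dice(manual, warped), centroid_distance(manual, warped))
```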


Clinical CT/MR

Black: original (ground truth)
White 1: global MI
White 2: conditional MI

Incorporating voxel label information

E. D’Agostino

A viscous fluid model for NRR

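• u = deformation field, v = deformation velocity
• F = force field, derived from similarity measure
• λ, μ = material parameters (μ=1, λ=0)

$\mu \nabla^2 v + (\lambda + \mu) \nabla(\nabla \cdot v) + F(x,u) = 0$

$v = \frac{du}{dt} = \frac{\partial u}{\partial t} + \sum_{i=1}^{3} v_i \frac{\partial u}{\partial x_i}$

Christensen 1998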

Approximate solution

• Solve for v by spatial convolution of F with Gaussian kernel
• Integrate over time to solve for u
• Recompute F(x,u) and iterate until convergence
• Regridding to enforce Jacobian of u to be positive
• Main parameter: σ (width of spatial smoothing kernel)

$F = -\nabla MI \;\Rightarrow\; v = \psi_\sigma *_{space} F$

$R^{(k)} = v^{(k)} - \sum_{i=1}^{3} v_i^{(k)} \left[ \frac{\partial u^{(k)}}{\partial x_i} \right] \;\Rightarrow\; u^{(k+1)} = u^{(k)} + R^{(k)} \Delta t$
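A one-dimensional sketch of a single iteration of this approximate scheme: smooth the force with a Gaussian to obtain the velocity, then take one explicit time step for the deformation. The force field is simply given as input here, whereas in the actual method it is recomputed from the similarity measure at each iteration. Regridding is not included, and the edge handling (replicated samples) is an assumption of the sketch.

```python
import math

def gaussian_kernel(sigma):
    """Normalised Gaussian kernel, truncated at 3 sigma."""
    radius = max(1, int(3 * sigma))
    k = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def smooth(signal, sigma):
    """Convolve with the Gaussian kernel, replicating edge samples."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    n = len(signal)
    return [sum(k[j + r] * signal[min(max(i + j, 0), n - 1)]
                for j in range(-r, r + 1)) for i in range(n)]

def fluid_step(u, force, sigma, dt):
    """v = G_sigma * F;  R = v - v du/dx;  u <- u + R dt  (1D, one step)."""
    v = smooth(force, sigma)
    n = len(u)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        du = (u[hi] - u[lo]) / (hi - lo)      # finite-difference du/dx
        out.append(u[i] + (v[i] - v[i] * du) * dt)
    return out

force = [0.0] * 10 + [1.0] + [0.0] * 10      # point force in the middle
u1 = fluid_step([0.0] * 21, force, sigma=2.0, dt=0.5)
print(max(u1), u1.index(max(u1)))
```

A larger σ spreads a local force over a wider neighbourhood, which is why σ acts as the main regularization parameter.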

A force field for NRR using MMI

$F = -\nabla MI \;\Rightarrow\; v = \psi_\sigma *_{space} F$

(ψh = Parzen window kernel)

Hermosillo 2002

D’Agostino 2003

Motivation for label information

[Figure: patient image, atlas, deformed atlas; lesions]

Intensity-based matching is confused when intensities are ambiguous
Need more specific information


Introducing label information

Model-based tissue classification

From iso-intensity objects... ... to more relevant anatomical objects

Van Leemput 1999

Matching class labels instead of intensities

[Figure: template image class labels; reference image class labels]

Assume segmentations are available for both images, as probabilistic tissue maps

i, j(i,T) = indices of corresponding voxels, T = deformation field
k = class label, e.g. WM, GM, CSF, OTHER
c_ik, c_jk = class probability

Actual correspondence of class labels

= fuzzy overlap of class labels in images R and T (voxel-wise averaged)

Assessing correspondence of class labels

= mutual information of labels k1 in image R with labels k2 in image T

But:

this does not exploit the fact that correspondence of labels k1 and k2 is known

Ideal correspondence of class labels

= fuzzy overlap of class labels in images R and T (voxel-wise averaged), assuming T ideally to be identical to R

Assessing correspondence of class labels

= Kullback-Leibler divergence between actual and ideal overlap of labels k1 in image R with labels k2 in image T
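A sketch of the class-to-class measure described on these slides. The formulas themselves are not reproduced in the text, so the details below are assumptions: the fuzzy overlap is taken as the voxel-wise average of products of class probabilities, the ideal overlap uses the reference labels twice, and the divergence is taken in the direction D(ideal ‖ actual).

```python
import math

def label_overlap(c_a, c_b):
    """Fuzzy overlap p(k1,k2) = mean over voxels i of c_a[i][k1] * c_b[i][k2]."""
    n, n_classes = len(c_a), len(c_a[0])
    return [[sum(c_a[i][k1] * c_b[i][k2] for i in range(n)) / n
             for k2 in range(n_classes)] for k1 in range(n_classes)]

def kl_divergence(p, q, eps=1e-12):
    """D(p || q) over matching matrix entries; eps guards against log(0)."""
    return sum(pv * math.log((pv + eps) / (qv + eps))
               for p_row, q_row in zip(p, q)
               for pv, qv in zip(p_row, q_row) if pv > 0)

# Two classes, four voxels; the template labels are assumed to be already
# resampled at the corresponding positions j(i, T).
c_ref = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
c_tmpl_bad = list(reversed(c_ref))            # a poor correspondence

p_ideal = label_overlap(c_ref, c_ref)         # T assumed identical to R
d_good = kl_divergence(p_ideal, label_overlap(c_ref, c_ref))
d_bad = kl_divergence(p_ideal, label_overlap(c_ref, c_tmpl_bad))
print(d_good, d_bad)
```

The divergence is zero when the template labels coincide with the reference labels and grows as the overlap matrix drifts away from the ideal one.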


Force field for class-class matching

Partial volume interpolation: w_{i,jn(i)} = PV weights

Force field for class-class matching

D = 0.038 D = 0.00022

Matching class labels to intensities

[Figure: reference image intensities; template image class labels]

Assume segmentations are available for only one of the images, as probabilistic tissue maps

i, j(i,T) = indices of corresponding voxels, T = deformation field
k = class label, e.g. WM, GM, CSF, OTHER
c_jk = class probability
r = intensity value

Class-to-intensity matching

= minimizes the conditional entropy of IR given the labels CT

If a Gaussian mixture model is adopted for IR, this is equivalent to minimizing the intensity variance within each class

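Joint class-intensity histogram

Partial Volume interpolation:

[Figure: reference voxel i maps to position j in the template. The four template neighbours T1 to T4 carry class probabilities (pWM, pGM, pCSF, pO) and PV weights w1 to w4. The histogram entries for WM, GM, CSF and Other are incremented by +wWM, +wGM, +wCSF and +wO.]

w_k = Σ_j w_j p_kj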
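A sketch of the class-intensity bookkeeping: the PV weights of the neighbours are distributed over the classes according to w_k = Σ_j w_j p_kj, and the class-to-intensity criterion is the conditional entropy of the reference intensity given the template class. The histogram layout `hist[k][r]` (class k, intensity bin r) is a choice made for this sketch.

```python
import math

def pv_class_weights(pv_weights, neighbour_probs):
    """w_k = sum_j w_j * p_kj over the neighbours j of the mapped position."""
    n_classes = len(neighbour_probs[0])
    return [sum(w * p[k] for w, p in zip(pv_weights, neighbour_probs))
            for k in range(n_classes)]

def conditional_entropy(hist):
    """H(I_R | C_T) from a joint histogram hist[k][r]."""
    total = sum(sum(row) for row in hist)
    h = 0.0
    for row in hist:
        class_mass = sum(row)
        for v in row:
            if v > 0:
                h -= (v / total) * math.log(v / class_mass)
    return h

# Four neighbours with (pWM, pGM, pCSF, pO) and equal PV weights.
probs = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
         [0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
w = pv_class_weights([0.25] * 4, probs)
print(w)   # these increments are added to hist[k][r] for the voxel's bin r
```

A histogram in which every class maps to a single intensity bin has zero conditional entropy, which is the situation the registration tries to approach.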


Force field for class-intensity matching

(PV interpolation)

Summarizing

II: $F = -\nabla MI(I_T, I_R)$
• Use of Parzen estimator
• Continuous histogram model
• Dependent on image gradient

CC: $F = \nabla D(p_{ideal}, p_{real})$
IC: $F = -\nabla MI(C_T, I_R)$
• Use of PV interpolation
• Discrete histogram model
• No image gradient!

Viscous fluid regularizer:

$\mu \Delta v + (\lambda + \mu) \nabla(\nabla \cdot v) = F(x,u)$

Experiment 1: Atlas to study image matching

template image (Brainweb atlas)

n reference images

(10 normal brains)

Preprocessing: rigid registration to the atlas + skull-stripping
Resolution: [91 109 91] voxels, voxel size: [2 2 2] mm
Validation: overlap of WM/GM/CSF after registration

NRR using II, IC, CC

Experiment 1: Atlas to study image matching

II IC CC

Overlap values   WM     GM     CSF
II               76.4   73.2   41.5
IC               78.9   78.9   55.2
CC               81.5   82.8   63.8

Experiment 2: Recovering simulated deformations

template image (Brainweb atlas)

reference image (artificially deformed atlas)

Resolution: [91 109 91] voxels, voxel size: [2 2 2] mm
Validation: RMSE, overlap of WM/GM/CSF after registration

NRR using II, IC, CC


Experiment 2: Recovering simulated deformations

         II     IC     CC
RMSE     0.94   0.68   0.37

Overlap   WM     GM     CSF
II        86.3   81.9   56.1
IC        87.1   86.9   74.7
CC        93.8   93.7   87.5

Experiment 3: Inter-subject registration

template image(subject 1)

reference image (subject 2)

NRR using II, IC, CC

Resolution: [91 109 91] voxels, voxel size: [2 2 2] mm
Validation: overlap of WM/GM/CSF after registration

Experiment 3: Inter-subject registration

Overlap   WM     GM     CSF
II        77.7   78.4   48.1
IC        79.1   79.9   53.8
CC        81.9   83.8   64.0

Conclusion: incorporating label information improves registration result, especially when label information is available for both images (CC)

But: requires segmentation …

Combined segmentation and registration

PROBLEM:
• Atlas-guided tissue class segmentation requires atlas registration…
• Tissue class-guided registration requires tissue segmentation…

SOLUTION:
• Merge atlas registration and tissue segmentation in a single algorithm

RELATED WORK:
Wyatt 2003, Chen MICCAI 2004, Ashburner 2005, Pohl 2006, D’Agostino WBIR 2006

Joint segmentation and registration

Model-based tissue classification using the EM algorithm:

Parameters: $\theta = \{\mu_k, \sigma_k, k = 1, \ldots, K, C\}$

posterior; atlas prior (fixed); Gaussian mixture model (with bias field correction)

Van Leemput 1999

Joint segmentation and registration

Extended EM algorithm (EEM): include registration

Parameters: $\theta = \{\mu_k, \sigma_k, k = 1, \ldots, K, C, T\}$

posterior; atlas prior (deformable); Gaussian mixture model (with bias field correction)


Joint segmentation and registration

E-step:

M-step (EM + EEM):

Joint segmentation and registration

• M-step (EEM):

(trilinear interpolation of the atlas)

Joint segmentation and registration: EEM

This defines a force field in each voxel

Optimize T in each iteration of the EM algorithm using (a few iterations of) the viscous fluid regularizer NRR

Example: EM vs EEM

EM (affine) EEM (non-rigid)

References

• J. Ashburner and K.J. Friston. Unified segmentation. NeuroImage, 26(2005), 839-851, 2005

• X. Chen and M. Brady and D. Rueckert. Simultaneous segmentation and registration for medical image. Medical Image Computing and Computer-Assisted Intervention (MICCAI'04), volume 3216 of Lecture Notes in Computer Science, pages 663-670, Saint-Malo, France, September 2004. Springer-Verlag, Berlin.

• Christensen, G., Rabbitt, R., Miller, M., 1996b. Deformable templates using large deformation kinetics. IEEE Transactions on Image Processing 5 (10), 1435–1447.

• E. D'Agostino, F. Maes, D. Vandermeulen, and P. Suetens. A viscous fluid model for multimodal non-rigid image registration using mutual information. Medical Image Analysis, 7(4):565-575, 2003.

• E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, Non-rigid atlas-to-image registration by minimization of class-conditional image entropy, Lecture notes in computer science, vol. 3216, pp. 745-753, 2004 (MICCAI 2004)

• E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, An information theoretic approach for non-rigid image registration using voxel class probabilities, Medical image analysis, vol. 10, no. 3, pp. 413-431, 2006

• E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, A unified framework for atlas based brain image segmentation and registration, Lecture notes in computer science, vol. 4057, pp. 136-143, 2006 (WBIR 2006)

• G. Hermosillo, C. Chef d'Hotel, and O. Faugeras. Variational methods for multimodal image matching. International Journal of Computer Vision, 50(3):329-343, 2002.

References

• D. Loeckx, P. Slagmolen, F. Maes, D. Vandermeulen, P. Suetens, Nonrigid image registration using conditional mutual information, LNCS, vol. 4584, pp. 725-737, 2007 (IPMI 2007)

• D. Loeckx, P. Slagmolen, F. Maes, D. Vandermeulen, P. Suetens, Nonrigid image registration using conditional mutual information, IEEE transactions on medical imaging, 2009 (in press)

• Meyer C.R., J.L. Boes, B. Kim, P.H. Bland, et al.: Demonstration of accuracy and clinical versatility of mutual information for automatic multimodality image fusion using affine and thin plate spline warped geometric deformations. Medical Image Analysis 1(3):195-206, 1997.

• Pohl KA, Fisher J, Grimson WEL, et al., A Bayesian model for joint segmentation and registration, NEUROIMAGE, 31(1):228-239, 2006

• D. Rueckert, L.I. Sonoda, C. Hayes, D. Hill, M.O. Leach, and D.J. Hawkes. Nonrigid registration using free-form deformations: application to breast MR images. IEEE Transactions on Medical Imaging, 18(8):712-721, 1999.

• Studholme C, Drapaca C, Iordanova B, et al., Deformation-based mapping of volume change from serial brain MRI in the presence of local tissue contrast change, IEEE TRANSACTIONS ON MEDICAL IMAGING, 25(5):626-639, 2006

• K. Van Leemput, F. Maes, D. Vandermeulen, and P. Suetens. Automated model-based tissue classification of MR images of the brain. IEEE Trans. Med. Img., 18(10):897-908, 1999.

• Paul P. Wyatt and J. Alison Noble. MAP MRF joint segmentation and registration of medical images. Medical Image Analysis, 7(4):539-552, 2003


Incorporating spatial information in mutual information-based registration

Josien Pluim

Image Sciences Institute
University Medical Center Utrecht
The Netherlands

Outline

• Brief overview of suggested methods
• α-Entropy and entropic graphs
• An example application: multifeature mutual information for registration of cervical MR images

Lack of spatial information

In theory, the mutual information of two images does not take spatial information into account.

In practice, it plays a very slight role, through interpolation and through blurring in multiscale optimization.

However, explicitly incorporating spatial information may lead to much better registration results.

Spatial information: labelling

• Studholme 1996
• Knops 2004
• D’Agostino 2006

Combining MI with a labelling of (one of) the images.
For instance, Studholme suggests

$MI(I, J, L_J) = H(I) + H(J, L_J) - H(I, J, L_J)$

Spatial information: local intensity

• Rueckert 2000

Higher-order mutual information:
Use probabilities of neighbouring intensities (i,j) in images (co-occurrence matrix).

$H_2(X) = -\sum_i \sum_j p(i,j) \log p(i,j)$

Joint entropy requires 4D histogram.


Spatial information: local intensity

• Bardera 2006

Extends Rueckert’s idea to groups of 3 neighbours, positioned on random lines through the image volume.

• Russakoff 2004

Extends Rueckert’s idea to 3x3 neighbourhood.
Vector per pixel.

Image courtesy: Russakoff

Spatial information: local structure

• Pluim 2000

Combination of gradient and intensity information.

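$G(I,J) = \sum_{(x,Tx) \in (I \cap J)} w\big(\alpha_{x,Tx}(\sigma)\big) \min\big(|\nabla I_x(\sigma)|, |\nabla J_{Tx}(\sigma)|\big)$

$MI_{new}(I,J) = G(I,J)\, MI(I,J)$

[Figure: weighting function w as a function of the angle α between the gradients, from 0 to π, with maximum 1]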
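A sketch of the gradient term G(I,J), assuming the gradient vectors of both images at corresponding positions are already available (for example from Gaussian derivative filters at scale σ). The weighting w(α) = (cos 2α + 1)/2 is the form given in Pluim 2000 as far as I recall it; treat it as an assumption here. It favours both parallel and anti-parallel gradients, which suits multimodal images where contrast can be inverted.

```python
import math

def angle_weight(alpha):
    """Assumed weighting: 1 for angles 0 and pi, 0 for pi/2."""
    return (math.cos(2.0 * alpha) + 1.0) / 2.0

def gradient_term(grads_i, grads_j):
    """G(I,J) = sum over samples of w(angle) * min(|grad I|, |grad J|)."""
    total = 0.0
    for g, h in zip(grads_i, grads_j):
        ng = math.sqrt(sum(a * a for a in g))
        nh = math.sqrt(sum(b * b for b in h))
        if ng == 0.0 or nh == 0.0:
            continue                      # no edge in one image: no contribution
        cos_a = sum(a * b for a, b in zip(g, h)) / (ng * nh)
        cos_a = max(-1.0, min(1.0, cos_a))
        total += angle_weight(math.acos(cos_a)) * min(ng, nh)
    return total

print(gradient_term([(1.0, 0.0)], [(2.0, 0.0)]))    # parallel edges
print(gradient_term([(1.0, 0.0)], [(0.0, 3.0)]))    # perpendicular edges
print(gradient_term([(1.0, 0.0)], [(-2.0, 0.0)]))   # inverted contrast
```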

Spatial information: local structure

• Holden 2004

Adds first or second derivative information, similarly to Studholme.

• Gan 2008

‘Maximum distance-gradient’ of a voxel is the maximum of the gradients to all other voxels.

$MI(I, J, I', J') = H(I, I') + H(J, J') - H(I, I', J, J')$

Spatial information: local structure

• Tomaževič 2004

Defines a vector of k features for each voxel, with intensity as first feature and the gradient at a single scale as second feature.

• Rodríguez 1998

Jumarie entropy: entropy on partial derivatives

Spatial information: local structure

• Luan 2008

‘Quantitative-qualitative MI’.
Include utility of voxels (regional saliency; local entropy).

$QMI(I,J) = \sum u(I_x, J_{Tx})\, p(I_x, J_{Tx}) \log \frac{p(I_x, J_{Tx})}{p(I_x)\, p(J_{Tx})}$

Computing higher-dimensional MI

Problem: so-called ‘Curse of dimensionality’.

Higher-dimensional histogram becomes too sparsely populated for reliable estimation of densities.

A 4D histogram is feasible with a small number of bins.


Computing higher-dimensional MI

Assumption of normal distribution of densities.
Entropy of a normally distributed set of points in $R^d$ with covariance matrix $\Sigma_d$:

$H(\Sigma_d) = \log\left( (2\pi e)^{d/2} \det(\Sigma_d)^{1/2} \right)$

• Tomaževič 2004
• Russakoff 2004
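A sketch of the Gaussian shortcut: once a covariance matrix of the feature vectors is available, entropy and MI follow in closed form, with no histogram. The determinant routine and the block split in `gaussian_mi` are written for this example. The first `d_a` features are assumed to belong to one image and the remaining ones to the other.

```python
import math

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def gaussian_entropy(cov):
    """H = log((2 pi e)^(d/2) det(cov)^(1/2)), in nats."""
    d = len(cov)
    return 0.5 * d * math.log(2.0 * math.pi * math.e) + 0.5 * math.log(det(cov))

def gaussian_mi(cov, d_a):
    """MI between the first d_a features and the rest, for a joint Gaussian."""
    cov_a = [row[:d_a] for row in cov[:d_a]]
    cov_b = [row[d_a:] for row in cov[d_a:]]
    return gaussian_entropy(cov_a) + gaussian_entropy(cov_b) - gaussian_entropy(cov)

print(gaussian_mi([[1.0, 0.8], [0.8, 1.0]], 1))   # correlated features
print(gaussian_mi([[1.0, 0.0], [0.0, 1.0]], 1))   # uncorrelated features
```

For two scalar features with correlation ρ this reduces to −½ log(1 − ρ²), so the measure only sees linear dependence, which is the price of the normality assumption.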

Computing higher-dimensional MI

• Zhang 2005

Computation of entropy of M-dimensional distribution in O(N), with N the number of samples.
For each sample i, define $C_i = B_i^1 B_i^2 \ldots B_i^M$, with $B_i^j$ the bin value of sample i in dimension j.
Then compute the entropy of set $C_i$.
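A sketch of this idea, assuming equal-width bins over a common range [lo, hi) in every dimension. The tuple of per-dimension bin indices plays the role of the concatenated code C_i, and only codes that actually occur are stored, so memory and time grow with N rather than with the number of cells in the M-dimensional histogram.

```python
import math
from collections import Counter

def sparse_joint_entropy(samples, n_bins, lo, hi):
    """Entropy (nats) of M-dimensional samples from the codes C_i only."""
    width = (hi - lo) / n_bins
    codes = Counter(
        tuple(min(int((v - lo) / width), n_bins - 1) for v in s)
        for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in codes.values())

samples = [(0.1, 0.1, 0.1), (0.1, 0.1, 0.1), (0.9, 0.9, 0.9), (0.9, 0.9, 0.9)]
print(sparse_joint_entropy(samples, n_bins=4, lo=0.0, hi=1.0))
```

Note that this avoids the storage problem of a dense histogram but not the estimation problem: with few samples per occupied cell the plug-in entropy is still biased.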

α-Entropy measures and entropic graphs

References: Hero 2002, Neemuchwala 2005, 2007

α-Entropy

α-Entropy:

$H_\alpha(f) = \frac{1}{1-\alpha} \log \int f^\alpha(z)\, dz$

Can be estimated with entropic graphs.
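For a discrete distribution the integral becomes a sum, which gives a quick way to get a feel for the role of α. The sketch below is the discrete (Rényi) analogue and is meant only as an illustration; the special case for α = 1 returns the Shannon entropy, which is the limit of H_α as α → 1.

```python
import math

def renyi_entropy(p, alpha):
    """H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha), in nats."""
    if abs(alpha - 1.0) < 1e-12:
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return math.log(sum(pi ** alpha for pi in p if pi > 0)) / (1.0 - alpha)

p = [0.5, 0.25, 0.25]
for alpha in (0.5, 0.99, 1.0, 2.0):
    print(alpha, renyi_entropy(p, alpha))
```

Choosing α close to 1 keeps the measure close to Shannon entropy while still allowing the graph-based estimator.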

α-Entropy through entropic graphs

Given a set of vectors $Z_n = \{z_1, \ldots, z_n\}$ in d-dimensional feature space.

Construct a minimal graph through $z_i$.
$L(Z_n)$ = total edge length of the graph.
Log of normalized $L(Z_n)$ converges to α-entropy as n → ∞.

$\lim_{n \to \infty} \left( \frac{1}{1-\alpha} \log \frac{L(Z_n)}{n^\alpha} \right) = H_\alpha(f) + c$
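A sketch of the graph-based estimate using a minimum spanning tree as the minimal graph, built with a plain O(n²) Prim algorithm. Following the entropic-graph literature, edge lengths are raised to the power γ = d(1 − α); that exponent, and the omission of the constant c, are assumptions of this sketch, so only differences between estimates are meaningful.

```python
import math
import random

def mst_length(points, gamma):
    """Sum of |edge|**gamma over a Euclidean minimum spanning tree (Prim)."""
    n = len(points)
    best = [math.inf] * n
    best[0] = 0.0
    in_tree = [False] * n
    total = 0.0
    for _ in range(n):
        i = min((j for j in range(n) if not in_tree[j]), key=best.__getitem__)
        in_tree[i] = True
        total += best[i] ** gamma
        for j in range(n):
            if not in_tree[j]:
                dist = math.dist(points[i], points[j])
                if dist < best[j]:
                    best[j] = dist
    return total

def alpha_entropy_estimate(points, alpha):
    """log(L(Z_n) / n**alpha) / (1 - alpha), i.e. H_alpha up to a constant."""
    n, d = len(points), len(points[0])
    gamma = d * (1.0 - alpha)
    return math.log(mst_length(points, gamma) / n ** alpha) / (1.0 - alpha)

rng = random.Random(0)
spread = [(rng.random(), rng.random()) for _ in range(30)]
compact = [(0.5 * x, 0.5 * y) for x, y in spread]   # same cloud, half the size
print(alpha_entropy_estimate(spread, 0.5), alpha_entropy_estimate(compact, 0.5))
```

Shrinking the point cloud by a factor s lowers the estimate by d·log(1/s), which matches the behaviour of differential entropy under scaling: shorter graph, lower entropy.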

Example

Uniformly distributed points: high graph length, high entropy

Image courtesy: Neemuchwala


Example

Normally distributed points: lower graph length, lower entropy

Image courtesy: Neemuchwala

Convergence

Total normalized graph length

Image courtesy: Neemuchwala

Total graph length

Multifeature MI for registration of cervical MRI

Reference: Staring 2009

Registration of cervical MRI

Challenges:
• Intensity inhomogeneity
• Anatomical variation (e.g. bladder filling, deformation)
• Highly anisotropic resolution

Features

Features describing local structure.
Cartesian image structure invariants, at σ=1 and σ=2.
Plus intensity.
Total: d = 15 features.

Invariants (Einstein notation):
$L$, $L_i L_i$, $L_i L_{ij} L_j$, $L_{ii}$, $L_{ij} L_{ji}$, $L_{ij} L_{jk} L_{ki}$, $L_i L_{ij} L_{jk} L_k$

Features: examples

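[Figure: example feature images for $L$, $L_i L_i$, $L_i L_{ij} L_j$, $L_{ii}$, $L_{ij} L_{ji}$, $L_{ij} L_{jk} L_{ki}$ and $L_i L_{ij} L_{jk} L_k$]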


Computing multifeature MI

We have a feature vector $z(x_i)$ for every point $x_i$. Define

$z^r(x_i)$: features in reference image
$z^m(T_\mu(x_i))$: features in transformed moving image
$z^{rm} = \{ z^r, z^m(T_\mu(x_i)) \}$: joint features

Computing multifeature MI

Entropy estimation using kNN graph.

Length function is distance to k neighbours:

$L_i^r = \sum_{p=1}^{k} \left\| z^r(x_i) - z^r(x_{ip}) \right\|$

Similarly for $L_i^m$ and $L_i^{rm}$

Computing multifeature MI

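α-MI for reference image R and moving image M is

$\alpha\text{-}MI(\mu) = \frac{1}{\alpha - 1} \log \frac{1}{n^\alpha} \sum_{i=1}^{n} \left( \frac{L_i^{rm}(\mu)}{\sqrt{L_i^r\, L_i^m(\mu)}} \right)^{2\gamma}$

μ : deformation
α : user-defined
γ : d(1-α)
n : number of samples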
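A brute-force sketch of the kNN estimate of α-MI. It assumes that the k neighbours of sample i are found in the joint feature space and that the same neighbours are used for all three length functions; the slides do not spell this out, so treat it as an assumption. The features are assumed to be already sampled at corresponding positions, and all feature vectors are assumed distinct so that no length is zero.

```python
import math

def alpha_mi(z_r, z_m, alpha=0.99, k=5):
    """kNN-graph estimate of alpha-MI between two lists of feature vectors."""
    n, d = len(z_r), len(z_r[0])
    gamma = d * (1.0 - alpha)
    z_rm = [tuple(a) + tuple(b) for a, b in zip(z_r, z_m)]   # joint features
    total = 0.0
    for i in range(n):
        # k nearest neighbours of sample i in the joint feature space
        nbrs = sorted((j for j in range(n) if j != i),
                      key=lambda j: math.dist(z_rm[i], z_rm[j]))[:k]
        l_rm = sum(math.dist(z_rm[i], z_rm[j]) for j in nbrs)
        l_r = sum(math.dist(z_r[i], z_r[j]) for j in nbrs)
        l_m = sum(math.dist(z_m[i], z_m[j]) for j in nbrs)
        total += (l_rm / math.sqrt(l_r * l_m)) ** (2.0 * gamma)
    return math.log(total / n ** alpha) / (alpha - 1.0)

# One scalar feature per sample; all values are distinct.
z = [[float(i * i % 17) + 0.01 * i] for i in range(12)]
matched = alpha_mi(z, z, alpha=0.5, k=3)
scrambled = alpha_mi(z, list(reversed(z)), alpha=0.5, k=3)
print(matched, scrambled)
```

The estimate is largest when the two feature sets vary together and drops when the correspondence is scrambled. Its absolute value depends on n and d, so only comparisons across transformations are meaningful. A practical implementation would replace the brute-force neighbour search with a kd-tree or an approximate kNN search.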

Computing multifeature MI

α = 0.99
n = 5000
k = 5

PCA on features to reduce computation time.
First 6 PCs a good trade-off.

Experiments

Data:
Follow-up MR T2 for radiotherapy treatment. 36 image pairs of 19 patients.

Evaluation:
Manual delineations of CTV (clinical target volume), bladder and rectum.
Deformed delineations compared to manual ones.

MI vs α-MI: Dice overlap


MI vs α-MI: distance error

Distances between segmentation surfaces.
Mollweide projection, like map of the globe.

[Figure: Mollweide map of the surface with orientation labels: uterus (top), anus (bottom), left, rectum, right, bladder, left]

MI vs α-MI: distance error

[Figure: median distance error maps for CTV and bladder, MI (top) vs α-MI (bottom)]

Conclusions

• Multifeature α-MI outperforms standard MI
• Better overlap, smaller distance errors
• Downside is computation time: 28 minutes vs 1 minute.

References

1. E. D'Agostino, F. Maes, D. Vandermeulen, P. Suetens, An information theoretic approach for non-rigid image registration using voxel class probabilities, Med. Image Anal. 10(3):413-431, 2006

2. A. Bardera, M. Feixas, I. Boada, M. Sbert, High-dimensional normalized mutual information for image registration using random lines, WBIR, LNCS 4057:264-271, Springer, 2006

3. R. Gan, A.C.S. Chung, S. Liao, Maximum distance-gradient for robust image registration, Med. Image Anal. 12(4):452-468, 2008

4. A.O. Hero, B. Ma, O. Michel, J. Gorman, Applications of entropic spanning graphs, IEEE Signal Proc. Magazine, 19(5):85-95, 2002

5. M. Holden, L.D. Griffin, N. Saeed, D.L.G. Hill, Multi-channel mutual information using scale space, MICCAI, LNCS 3216:797-804, Springer, 2004

6. Z.F. Knops, J.B.A. Maintz, M.A. Viergever, J.P.W. Pluim, Registration using segment intensity remapping and mutual information, MICCAI, LNCS 3216:805-812, Springer, 2004

7. H. Luan, F. Qi, Z. Xue, L. Chen, D. Shen, Multimodality image registration by maximization of quantitative-qualitative measure of mutual information, Pattern Recognit. 41(1):285-298, 2008

References

8. H. Neemuchwala, A. Hero, P. Carson, Image matching using alpha-entropy measures and entropic graphs, Signal Process. 85(2):277-296, 2005

9. H. Neemuchwala, A. Hero, S. Zabuawala, P. Carson, Image registration methods in high dimensional space, Int. J. Imaging Syst. Technol., 16(5):130-145, 2007

10. J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Image registration by maximization of combined mutual information and gradient information, IEEE Trans. Med. Imaging 19(8):809-814, 2000

11. C.E. Rodríguez-Carranza and M.H. Loew, A weighted and deterministic entropy measure for image registration using mutual information, SPIE Medical Imaging: Image Processing, Proc SPIE 3338:155-166, 1998

12. D. Rueckert, M.J. Clarkson, D.L.G. Hill, D.J. Hawkes, Non-rigid registration using higher-order mutual information, SPIE Medical Imaging: Image Processing, Proc. SPIE 3979:438-447, 2000

13. D.B. Russakoff, C. Tomasi, T. Rohlfing, C.R. Maurer, Jr., Image similarity using mutual information of regions, ECCV, LNCS 3023:596-607, Springer, 2004

14. M. Staring, U.A. van der Heide, S. Klein, M.A. Viergever, J.P.W. Pluim, Registration of cervical MRI using multifeature mutual information, IEEE Trans. Med. Imaging 28(9):1412-1421, 2009

References

15. C. Studholme, D.L.G. Hill, D.J. Hawkes, Incorporating connected region labelling into automated image registration using mutual information, MMBIA:23-31, 1996

16. D. Tomaževič, B. Likar, F. Pernuš, Multi-feature mutual information, SPIE Medical Imaging: Image Processing. Proc. SPIE 5370:143-154, 2004

17. J. Zhang and A. Rangarajan, Multimodality image registration using an extensible information metric and high dimensional histogramming, IPMI, LNCS 3565:725-737, Springer, 2005