Synopsis

This poster shows results from our development of an extended MATLAB image processing toolbox, which implements some useful optical image processing and mosaicking algorithms found in the literature. We surveyed and selected algorithms from the field which showed promise in application to the underwater environment. We then extended these algorithms to explicitly deal with the unique constraints of underwater imagery in building our toolbox. The algorithms implemented include:
1. Contrast limited adaptive histogram specification (CLAHS) to deal with the inherent non-uniform lighting in underwater imagery
2. Fourier based methods for scale, rotation, and translation recovery which provide robustness against dissimilar image regions
3. Local normalized correlation for image registration to handle the unstructured environment of the seafloor
4. Multiresolution pyramidal blending of images to form a composite seamless mosaic without blurring or loss of detail near image borders
Keeping in theme with the global view of CenSSIS, “Diverse Problems, Similar Solutions,” many of the algorithms are useful to the rest of the CenSSIS community. Take a look at the normalized correlation section of the poster to see some recent applications of our algorithm to medical imaging.
Contrast Limited Adaptive Histogram Specification
Figure 1: (top left) Original imagery collected by the AUV ABE at a geological site of interest on the East Pacific Rise. (top right) Adaptive histogram equalized imagery. While this technique compensates very well for the nonuniform lighting pattern, it cannot (as seen in the lower right corner of the image) compensate for parts of the image where the light intensity levels are of the order of the sensitivity of the camera. (bottom left) Histogram of the original imagery. (bottom right) Histogram of the AHE imagery.
The propagation of light underwater suffers from rapid attenuation and extreme scattering. These effects, in combination with the limited camera-to-light separation available on most underwater imaging platforms, place severe limitations on underwater imagery. To deal with the lighting artifacts of nonuniform illumination and low contrast in underwater imagery, we utilize the classical techniques associated with contrast limited adaptive histogram equalization (CLAHE) (Zuiderveld 1994). With this technique the image is broken up into sub-regions. The optimal gray scale distribution is calculated for each of these sub-regions, based upon its histogram and a previously determined transfer function, which is based upon the desired histogram of the sub-region. Then, each pixel of the image is adjusted based upon interpolation between the manipulated histograms of the neighboring sub-regions. Our extensive work with underwater imagery has suggested that a Rayleigh distribution is the model best suited to underwater imagery.
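The per-sub-region equalization step can be sketched in a few lines. The following Python fragment is a simplified illustration, not the toolbox code: it performs clipped histogram equalization tile by tile, assumes a float image in [0, 1) whose dimensions divide evenly into the tile grid, and omits two parts of full CLAHE, the bilinear interpolation of each pixel between neighboring tile mappings and the substitution of a desired (e.g. Rayleigh) target distribution for the uniform one used here.

```python
import numpy as np

def tile_mapping(tile, clip_limit=0.01, nbins=256):
    # Clipped histogram -> CDF lookup table for one sub-region.
    hist, _ = np.histogram(tile, bins=nbins, range=(0.0, 1.0))
    clip = max(int(clip_limit * tile.size), 1)
    excess = np.sum(np.maximum(hist - clip, 0))
    hist = np.minimum(hist, clip) + excess // nbins  # redistribute clipped counts
    cdf = np.cumsum(hist).astype(float)
    return cdf / cdf[-1]

def clahe(img, tiles=(4, 4), clip_limit=0.01, nbins=256):
    # Simplified CLAHE: per-tile clipped equalization, no inter-tile interpolation.
    out = np.zeros_like(img, dtype=float)
    th, tw = img.shape[0] // tiles[0], img.shape[1] // tiles[1]
    for i in range(tiles[0]):
        for j in range(tiles[1]):
            r, c = slice(i * th, (i + 1) * th), slice(j * tw, (j + 1) * tw)
            cdf = tile_mapping(img[r, c], clip_limit, nbins)
            idx = np.clip((img[r, c] * nbins).astype(int), 0, nbins - 1)
            out[r, c] = cdf[idx]
    return out
```

Without the interpolation step the tile borders remain visible; it is included here only to show the clip-and-redistribute logic that limits contrast amplification.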
Contact Info

We’d be happy to supply you with a copy of our MATLAB image processing toolkit. We also have extended underwater imagery data sets which are available for research work. Please contact either of us:
Dr. Hanumant Singh, (508) 289-3270, [email protected]
R. Eustice, (508) 289-3269, [email protected]
References

Brown, L. G. (1992). “A Survey of Image Registration Techniques.” ACM Computing Surveys 24(4): 325–376.

Burt, P. J. and E. H. Adelson (1983). “A Multiresolution Spline with Application to Image Mosaics.” ACM Transactions on Graphics 2(4): 217–236.

Eustice, R., O. Pizarro, et al. (2002). UWIT: Underwater Imaging Toolbox for Optical Image Processing and Mosaicking in MATLAB. Proceedings of the Third Underwater Technology Symposium, 2002, Tokyo, Japan. (to be presented)

Irani, M. and P. Anandan (1996). Robust Multi-Sensor Image Alignment. Sixth International Conference on Computer Vision, 1998.

Mandelbaum, R., G. Salgian, et al. (1999). Correlation-Based Estimation of Ego-Motion and Structure from Motion and Stereo. Proceedings of the Seventh IEEE International Conference on Computer Vision, 1999, Kerkyra, Greece.

Reddy, B. S. and B. N. Chatterji (1996). “An FFT-Based Technique for Translation, Rotation, and Scale-Invariant Image Registration.” IEEE Transactions on Image Processing 5(8): 1266–1271.

Zuiderveld, K. (1994). Contrast Limited Adaptive Histogram Equalization. Graphics Gems IV. P. Heckbert, ed. Boston, Academic Press: 474–485.
Fourier Based Image Translation, Scale, and Rotation Recovery

Many image processing problems involve the fundamental task of registering a pair of images. Methods include: 1) correlation methods, which use pixel values directly; 2) fast Fourier transform methods, which use frequency domain information; and 3) feature based methods, which use low-level features such as edges and corners. This particular algorithm is based upon Fourier domain methods for scale, rotation, and translation recovery, making use of the phase shift property of Fourier transforms (Reddy 1996).
Let $f_2(x,y)$ be a shifted version of the image $f_1(x,y)$:
$$f_2(x,y) = f_1(x - x_0,\, y - y_0)$$
By the shifting property of the Fourier transform:
$$F_2(\omega_1, \omega_2) = e^{-j(\omega_1 x_0 + \omega_2 y_0)}\, F_1(\omega_1, \omega_2)$$
The translational offsets $(x_0, y_0)$ can be recovered by locating the impulse associated with the inverse transform of the cross-power spectrum of the two images:
$$\mathcal{F}^{-1}\!\left\{ \frac{F_1(\omega_1,\omega_2) \cdot F_2^*(\omega_1,\omega_2)}{\left| F_1(\omega_1,\omega_2) \cdot F_2^*(\omega_1,\omega_2) \right|} \right\} = \mathcal{F}^{-1}\!\left\{ e^{j(\omega_1 x_0 + \omega_2 y_0)} \right\} = \delta(x + x_0,\, y + y_0)$$
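The translation-recovery step can be sketched directly with NumPy's FFT (an illustrative fragment, not our MATLAB toolbox code; it assumes the shift is circular):

```python
import numpy as np

def phase_correlation(f1, f2):
    # Locate the impulse of the inverse-transformed cross-power spectrum.
    # Assumes f2[y, x] = f1[y - y0, x - x0] with circular wraparound.
    F1, F2 = np.fft.fft2(f1), np.fft.fft2(f2)
    cross = F2 * np.conj(F1)
    cross /= np.abs(cross) + 1e-12            # keep phase, discard magnitude
    corr = np.real(np.fft.ifft2(cross))       # impulse at (y0, x0)
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Map wrapped peak locations to signed offsets.
    return tuple(int(p - s) if p > s // 2 else int(p)
                 for p, s in zip(peak, corr.shape))
```

For a pair of images differing only by a cyclic shift the returned offsets are exact; for real imagery the inputs are windowed first (as in Figure 4) to suppress wraparound artifacts.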
This same property can be exploited for images which are rotated and scaled by representing them in a coordinate system where scale and rotation appear as shifts. For example, when $f_2(x,y)$ is a rotated version of $f_1(x,y)$:
$$f_2(x,y) = f_1(x\cos\theta_0 + y\sin\theta_0,\, -x\sin\theta_0 + y\cos\theta_0)$$
Their Fourier transforms are related by:
$$F_2(\omega_1,\omega_2) = F_1(\omega_1\cos\theta_0 + \omega_2\sin\theta_0,\, -\omega_1\sin\theta_0 + \omega_2\cos\theta_0)$$
Looking at the magnitudes of the Fourier transforms and converting to polar coordinates, where
$$\rho = \sqrt{\omega_1^2 + \omega_2^2}, \qquad \theta = \arctan(\omega_2/\omega_1),$$
we see that the rotation can be written as a shift:
$$M_2(\rho, \theta) = M_1(\rho,\, \theta - \theta_0)$$
Similarly, when two images are related by a scale factor $a$, their Fourier transforms are related by:
$$F_2(\omega_1, \omega_2) = \frac{1}{a^2}\, F_1(\omega_1/a,\, \omega_2/a)$$
Taking the logarithm of the frequency results in the scale appearing as a shift:
$$F_2(\log\omega_1, \log\omega_2) = \frac{1}{a^2}\, F_1(\log\omega_1 - \log a,\, \log\omega_2 - \log a)$$
Combining all of these properties, we see that the magnitudes of two translated, scaled, and rotated images are related by:
$$M_2(\rho, \theta) = M_1(\rho/a,\, \theta - \theta_0)$$
After taking the log of the radius, rotation and scale are now both represented as shifts:
$$M_2(\log\rho, \theta) = M_1(\log\rho - \log a,\, \theta - \theta_0)$$
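Resampling the magnitude spectrum onto a $(\theta, \log\rho)$ grid can be sketched as follows. This is a minimal NumPy illustration using nearest-neighbor sampling for brevity; it is not the toolbox implementation, which filters and restricts to an annulus of frequencies as in Figure 4.

```python
import numpy as np

def log_polar_magnitude(img, n_theta=180, n_rho=128):
    # Magnitude spectrum resampled on a (theta, log rho) grid, so that
    # rotation and scale of the input become shifts along the two axes.
    M = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    cy, cx = (M.shape[0] - 1) / 2.0, (M.shape[1] - 1) / 2.0
    rho = np.exp(np.linspace(0.0, np.log(min(cy, cx)), n_rho))
    theta = np.linspace(0.0, np.pi, n_theta, endpoint=False)  # spectrum is symmetric
    ys = cy + rho[None, :] * np.sin(theta)[:, None]
    xs = cx + rho[None, :] * np.cos(theta)[:, None]
    ys = np.clip(np.round(ys), 0, M.shape[0] - 1).astype(int)
    xs = np.clip(np.round(xs), 0, M.shape[1] - 1).astype(int)
    return M[ys, xs]  # rows shift with rotation, columns with log scale
```

Phase correlation applied to two such log-polar magnitude maps then recovers $(\theta_0, \log a)$ as a 2-D shift.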
[Figure 4 panels: filtered input image; magnitude of spectrum of the filtered input image; log-polar magnitude of spectrum (theta [deg] versus log(rho), rho in pixels).]
Figure 5: (top) Control image. (middle) Input image. (bottom) Registered image.
Figure 4: (left top) The underwater imagery is first adaptive histogram equalized, windowed, and then Laplacian of Gaussian filtered. (above) Fourier magnitude spectrum of the filtered image. (left bottom) Log-polar of the annulus of frequencies of the filtered image.
Figure 6: (top) Control image. (middle) Input image. (bottom) Registered image.
Figure 2: (left) Original imagery of the whale remains imaged by the DSV Alvin off San Diego at a depth of 1700 m. Limitations in power led to overall intensity levels that are quite low even though the image is uniformly illuminated. (right) Adaptive histogram equalized imagery.
Figure 3: (left) Original color imagery at the base of a hydrothermal vent in the Guaymas Basin at a depth of 2200 m. (right) Adaptive histogram equalized imagery, as applied to the Y component of the imagery in YIQ color space.
Multiresolution Pyramidal Based Blending

Due to the rapid attenuation of light underwater, the only way to get a large scale view of the seafloor is to build up a mosaic from smaller local images, such as in Figure 7. The mosaic technique is used to construct an image with a far larger field of view and level of resolution than could be obtained with a single photograph.
Once the mosaic is generated, a technical problem in image representation is joining image borders so that the edge between them is not visible. The two images to be joined may be considered as two surfaces, where the image intensity I(x,y) is viewed as the elevation above the (x,y) plane. The problem, then, is how to gently distort the images near their common border so that the seam is smooth.
We implement a multiresolution pyramidal blending approach where the two images are decomposed into different band-pass frequency components, merged on those levels, and then reassembled into a single seamless composite image (Burt 1983). The idea is that with this technique the transition zone between band-pass image components can be appropriately chosen to match the scale of features in that band-pass component.
First, a Gaussian pyramid is constructed for each image, where the base level in the pyramid, $G_0$, is the original image. Each successive level is a low-pass filtered and down-sampled by a factor of two version of the previous level, i.e.
$$G_l(i,j) = \sum_{m}\sum_{n} w(m,n)\, G_{l-1}(2i+m,\, 2j+n)$$
for an appropriately chosen kernel $w(m,n)$. Next, the different band-pass components are formed by generating the Laplacian pyramid. The Laplacian pyramid is generated from the Gaussian pyramid by expanding the image at the next higher level in the pyramid to the resolution of the current level and then subtracting them, i.e.
$$L_l = G_l - \mathrm{EXPAND}(G_{l+1})$$
This results in each level of the Laplacian pyramid containing a separate band-pass component of the original image. The two Laplacian pyramids are then merged at each level of the pyramid, and the resulting new seamless image is constructed from the different pyramid levels via
$$S = \sum_{l=0}^{N} L_{l,l}$$
where $N$ is the number of pyramid levels and the notation $L_{l,l}$ implies expansion of the level $L_l$, $l$ times, to the resolution of $G_0$.
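The whole pipeline can be sketched in NumPy. Note the simplifications in this illustration: a 2x2 box filter stands in for the separable Gaussian kernel w(m,n), nearest-neighbor upsampling stands in for a proper EXPAND, the merge weights come from a mask reduced down its own pyramid, and image dimensions are assumed divisible by 2^levels.

```python
import numpy as np

def reduce2(img):
    # Crude REDUCE: 2x2 box filter plus downsample (stand-in for w(m,n)).
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                   + img[0::2, 1::2] + img[1::2, 1::2])

def expand2(img):
    # Crude EXPAND: nearest-neighbor upsampling by two in each axis.
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(img, levels):
    gauss = [img]
    for _ in range(levels):
        gauss.append(reduce2(gauss[-1]))
    lap = [gauss[l] - expand2(gauss[l + 1]) for l in range(levels)]
    lap.append(gauss[-1])  # top level keeps the low-pass residual
    return lap

def blend(img_a, img_b, mask, levels=4):
    # Burt-Adelson style blend: merge band-pass components level by level.
    la = laplacian_pyramid(img_a, levels)
    lb = laplacian_pyramid(img_b, levels)
    m = [mask]
    for _ in range(levels):
        m.append(reduce2(m[-1]))  # mask softens at coarser levels
    merged = [mi * a + (1 - mi) * b for mi, a, b in zip(m, la, lb)]
    out = merged[-1]
    for lvl in reversed(merged[:-1]):
        out = expand2(out) + lvl  # collapse the merged pyramid
    return out
```

Because each Laplacian level is merged with a mask blurred to that level's scale, the transition zone automatically widens for low frequencies and narrows for fine detail, which is exactly the property motivating the multiresolution approach.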
Local Normalized Correlation

Normalized correlation is a practical measure of similarity (Brown 1992). Normalized correlation of two signals is invariant to local changes in mean and contrast. When two signals are linearly related, their normalized correlation is 1. When the two signals are not linearly related, but do contain similar spatial variations, normalized correlation will still yield a value close to unity (Irani 1996).
The lack of rich features in underwater imagery precludes indirect feature based methods, and experimental evidence suggests that direct correlation based methods yield good results. We employ a dense local normalized correlation to determine correspondence between images. The shape of the local normalized correlation surfaces will be concave and have a prominent peak at the correct displacement. We fit a quadratic surface near the surface peak and analytically check for concavity (Mandelbaum 1999) as a method of outlier rejection.
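The similarity measure itself is small; here is a sketch of zero-mean normalized correlation for a single patch pair (a hypothetical helper for illustration, not the toolbox API):

```python
import numpy as np

def normalized_correlation(patch_a, patch_b):
    # Zero-mean normalized correlation: invariant to local changes
    # in mean (offset) and contrast (gain) of either patch.
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0
```

A dense variant evaluates this score for every displacement in a search window around each base point, producing the correlation surface to which the quadratic is then fit.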
Figure 7: A sequence of images which was mosaicked together in a globally consistent manner utilizing all available cross-linked image pair correspondences. To obtain good registration, especially along edges, we must compensate for radial distortion. The similarity metric used for point correspondence was local normalized correlation.
Figure 8: George Chen, from MGH, asked us to look into the registration of different frames of CT scan imagery to track patient organ motion. Above are the first-pass results of applying our standard local normalized correlation algorithm to determine point correspondences. (top) The green dots represent the initial base points, which were selected at soft tissue boundaries by applying an edge detection filter to the image. (bottom) The yellow dots represent the corresponding match for each base point. The red vector at each point represents the motion magnitude of that point between frames.
Figure 9: (top) A two image mosaic with seam. The top image is overlaid over the bottom image. (bottom) The final blended result.