Multi-View Image Registration in the Presence
of Occlusion
Ph.D. Synopsis
Submitted To:
Gujarat Technological University
For The Award
Of
Ph. D. Degree
In
Computer Engineering
By:
Darshana Mistry
Enrolment No: 119997107004, Reg. No:6014
(Computer Engineering)
Supervisor:
Dr. Asim Banerjee,
Professor,
Dhirubhai Ambani Institute
of Information and
Communication,
Gandhinagar,
India.
Co-Supervisor:
Dr. Shishir Shah,
Professor,
Computer Science
Department,
University of Houston,
USA.
Index
1. Abstract ............................................................................ 3
2. Brief description on the state of the art of the research topic and Literature Review ... 4
3. Definition of the Problem ........................................................... 5
4. Scope of the Work ................................................................... 6
5. Original contribution by the thesis ................................................. 7
6. Methodology of Research, Results / Comparisons ...................................... 7
7. Achievements with respect to objectives ............................................. 16
8. Conclusion .......................................................................... 17
9. Copies of papers published and a list of all publications arising from the thesis ... 17
10. References ......................................................................... 21
Multi-View Image Registration in the Presence of Occlusion
1. Abstract
Image registration (IR) is the process of aligning two or more images of the same scene taken at different times, from different sensors, or from different viewpoints. It is a fundamental step in image analysis systems in which the final information is obtained by combining various data sources, as in image fusion, change detection, and multichannel restoration. In multi-view analysis, images of the scene are taken from different viewpoints; the aim is to obtain a larger 2D view or a 3D representation of the scanned scene. It is used in remote sensing (mosaicking of images of a surveyed area), computer vision (shape recovery from stereo), object tracking, etc. One of the main challenges in multi-view image registration is occlusion: scene feature points visible in one view may be occluded in another view by objects in the scene. This doctoral work investigates the registration of multi-view images in the presence of occlusion.
There are three basic steps in image registration: i) feature detection, ii) feature matching, and iii) transform model estimation. Features are distinctive parts of the image scene such as points, lines, edges, corners, blobs, and T-junctions. Detected feature points must be invariant to rotation, scaling, translation, and illumination change. Methods used for feature finding include SURF (Speeded Up Robust Features), SIFT (Scale Invariant Feature Transform), ORB (Oriented FAST and Rotated BRIEF), BRISK (Binary Robust Invariant Scalable Keypoints), FAST+BRIEF (key points detected by FAST with a descriptor based on BRIEF, Binary Robust Independent Elementary Features), Harris+NCC (feature points detected with the Harris corner detector and matched using NCC), and AKAZE (Accelerated KAZE). Pablo, et al., 2012, introduced the KAZE feature, which is more robust to noise and blur than SIFT and SURF; it describes 2D features in a nonlinear scale space. AKAZE is scale and rotation invariant, requires little storage, and provides good results in the presence of noise and blur. False matching points are detected and removed by the RANSAC (Random Sample Consensus) algorithm. The transformation between two views is a projective transformation and is applied through the homography transformation matrix, so homography estimation is central to estimating the image transform. Occlusion is applied to the images as structured noise (blocks/patterns of different sizes). AKAZE provides good results in
the presence of structured/Gaussian noise compared to the other methods because it uses Fast Explicit Diffusion (FED) and a modified Local Difference Binary (M-LDB) descriptor, both of which give good results in nonlinear scale space.
2. Brief description on the state of the art of the research topic and Literature Review
Image registration is widely used in remote sensing, medical imaging, computer vision, etc. (Zitova, et al., 2003). In general, its applications can be divided into four main groups according to the manner of image acquisition: different times (multi-temporal analysis), different sensors (multimodal analysis), different viewpoints (multi-view analysis), and scene-to-model registration.
There are three basic steps in image registration: 1) feature detection, 2) feature matching, and 3) transformation model estimation. Key points and feature descriptors are computed by SIFT (David Lowe, et al., 2006), SURF (Herbert Bay, et al., 2008), SIFT and its variants (Jian Wu, et al., 2013), FAST (Edward Rosten, et al., 2006), BRIEF (Michael Calonder, et al., 2010), ORB (Ethan Rublee, et al., 2011), BRISK (Stefan Leutenegger, et al., 2011), AKAZE (Pablo Alcantarilla, et al., 2013), etc. Luo Juan, et al., 2009, compared SIFT, SURF, and PCA-SIFT on different data sets and concluded that SURF handles illumination changes better than the others. Utsav, et al., 2014, performed satellite image registration using SURF for feature detection, removed false matching pairs using Euclidean distance, and transformed images using an affine transformation. SURF gives results comparable to SIFT but at a much lower computational cost, and hence is faster than SIFT and its variants. Manjusha, et al., 2011, matched the location of maximum match using pixel-based, Fourier-based, and mutual information methods. Michael, et al., 2015, registered aerial images using SIFT for landmark detection, RANSAC to eliminate outlier points, and a homography matrix; he noted that multi-view image registration in the presence of occlusion is very difficult because sufficient ground area does not appear in the image due to occlusion or lack of contrast.
The literature survey shows that the main challenge in multi-view image registration in the presence of occlusion is that corresponding matching pairs cannot be found between the sensed and reference images.
To study the effect of occlusion, Gaussian and structured/block noise are added to the images. Key points and feature descriptors are found using SIFT, SURF, BRISK, FAST+BRIEF (key points from FAST, descriptors based on BRIEF), ORB, Harris corner+NCC (features detected by the Harris corner detector and matched by NCC), and AKAZE. After finding the corresponding matching points of the sensed and reference images, the homography transformation matrix is estimated using the RANSAC estimator, which removes corresponding outlier points, and the perspective image is generated from the homography transformation matrix.
Fig. 1. Flow of the proposed doctoral work for multi-view image registration in the presence of occlusion (sensed image and reference image in, registered image out).
3. Definition of the Problem

Based on the literature survey, the development of an algorithm for multi-view image registration in the presence of occlusion was selected as the problem to address in this doctoral work, because feature points are not easily matched in the occluded parts of the images. Multi-view image registration is performed under different levels of Gaussian noise and structured noise. This is useful for generating 3D images from 2D images, object tracking, image mosaicing, etc.
4. Scope of the Work

The main goal of this research is to develop algorithm(s) to register multi-view images in the presence of occlusion (Gaussian and structured noise):
- Feature key points and descriptors are found using SURF, SIFT, ORB, FAST+BRIEF, BRISK, Harris+NCC, and AKAZE.
- Features are matched using the Brute-Force/FLANN algorithm.
- RANSAC (Random Sample Consensus) is used to find the best homography transformation matrix.
- A perspective image of the first (sensed) image is generated using the homography transformation matrix.
- The Frobenius norm difference between the data set's homography transformation matrix and each method's estimate is compared for multi-view images in the presence of occlusion.

Occlusion is represented in the image as:
1. Gaussian noise is added to the image by varying the standard deviation, with the mean held constant at zero.
2. Structured noise is added to the image, and different patterns are generated in the image.
The Graffiti multi-view data set (http://www.robots.ox.ac.uk/~vgg/data/data-aff.html) is used.
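The two noise models above can be sketched with NumPy; `add_gaussian_noise` and `add_block_noise` are hypothetical helper names for illustration, not functions from the thesis code:

```python
import numpy as np

def add_gaussian_noise(img, sd, mean=0.0, seed=0):
    """Add zero-mean Gaussian noise with the given standard deviation (SD)."""
    rng = np.random.default_rng(seed)
    noisy = img.astype(np.float64) + rng.normal(mean, sd, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_block_noise(img, top, left, height, width, value=0):
    """Occlude a rectangular block of the image (structured noise)."""
    noisy = img.copy()
    noisy[top:top + height, left:left + width] = value
    return noisy

img = np.full((480, 640), 128, dtype=np.uint8)   # stand-in for a Graffiti image
g = add_gaussian_noise(img, sd=50)
b = add_block_noise(img, top=100, left=100, height=200, width=200)
```

Varying `sd` reproduces the SD sweep used in the experiments, and varying the block position/size reproduces the different structured-noise patterns.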
Fig. 1. Different levels of Gaussian noise (SD = 10, 50, 100, 150, 200, 250, 300, 350, 500, 600) added to image1 of the Graffiti multi-view data set.
Fig. 2. Different levels of structured noise in image1 of the Graffiti multi-view data set ((a) to (j)). Structured noise is also generated in different areas of image1 and of image2 to image5 of the Graffiti multi-view data set ((k) to (o)).
5. Original contribution by the thesis

The presented research work addresses multi-view image registration in the presence of occlusion. The novel contributions and achievements of this thesis are:
- Generation of an occlusion data set by adding Gaussian and structured noise to images.
- A simple yet powerful technique for multi-view image registration under occlusion.
- Techniques for registration under different views, scales, rotations, lighting conditions, and noise (Gaussian and structured).
- Fast extraction and matching of feature points.

Results are reported for three cases:
a. Noise added only to the sensed image (i.e. image1), with the reference images noise-free.
b. Noise added only to the reference images (i.e. image2 to image5), with the sensed image noise-free.
c. Noise added to both the sensed and reference images.
6. Methodology of Research, Results / Comparisons
Features are found in the sensed and reference images using SURF, SIFT, ORB, FAST+BRIEF, BRISK, Harris+NCC, and AKAZE. There are two basic steps in finding feature points in an image: 1) finding key points and 2) computing descriptors. In SURF, the image is first converted into an integral image, and a box filter is convolved with the integral image.
In SURF, the Hessian matrix is approximated (based on the Laplacian of Gaussian), and key points are detected from the determinant of the Hessian matrix. In scale space, the filter size is doubled at every octave instead of doubling the image size, so SURF key points are scale independent. For rotation invariance, Haar wavelet responses in the x and y directions are computed within a circular neighbourhood of radius 6s around the interest point. For descriptor extraction, a square region of size 20s is constructed, centred at the interest point and oriented along the orientation selected in the 6s region. This region is split into 4x4 square sub-regions, and the Haar wavelet responses dx and dy are summed over each sub-region, so each sub-region contributes a four-dimensional descriptor vector (Σdx, Σdy, Σ|dx|, Σ|dy|). Features are matched based on the Euclidean distance between the descriptor vectors of the sensed and reference images using the brute-force method.
In SIFT, scale-space extrema are detected by the Difference of Gaussians (DoG); the scale space of an image is its convolution with a variable-scale Gaussian. The filter size is fixed and the image resolution is halved at every octave for scale-invariant feature detection. Key points are selected by their stability. The key point descriptor allows for significant shifts in gradient positions by creating orientation histograms over 4x4 sample regions, with the length of each arrow corresponding to the magnitude of the histogram entry. The total size of the descriptor is 128 values.
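The 4x4 grid of 8-bin orientation histograms (4 x 4 x 8 = 128 values) can be sketched as follows; this is a simplified illustration without Gaussian weighting or trilinear interpolation, and the function name is hypothetical:

```python
import numpy as np

def sift_like_descriptor(mag, ori, bins=8):
    """Build 8-bin orientation histograms over a 4x4 grid of sub-regions
    of a 16x16 patch of gradient magnitudes/orientations -> 128-D vector."""
    hists = []
    for i in range(4):
        for j in range(4):
            sub = (slice(4 * i, 4 * i + 4), slice(4 * j, 4 * j + 4))
            # magnitude-weighted histogram of gradient orientations
            h, _ = np.histogram(ori[sub], bins=bins, range=(0, 2 * np.pi),
                                weights=mag[sub])
            hists.append(h)
    v = np.concatenate(hists)          # 16 sub-regions x 8 bins = 128
    return v / (np.linalg.norm(v) + 1e-12)

rng = np.random.default_rng(2)
d = sift_like_descriptor(rng.random((16, 16)), rng.random((16, 16)) * 2 * np.pi)
```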
![Page 8: Ph.D. Synopsis · combination of various data sources like in image fusion, change detection, and multichannel restoration. In multi-view analysis, images are taken of the scene from](https://reader030.vdocuments.us/reader030/viewer/2022040211/5e6bfee55439ce71207a88a1/html5/thumbnails/8.jpg)
8
The FAST method finds features with an accelerated segment test over a circle of sixteen pixels around the corner candidate, but it does not work well when fewer than 12 pixels of the circle pass the test, and it does not produce multi-scale features. The BRIEF descriptor performs similarly to SIFT but is sensitive to in-plane rotation. ORB uses an oriented FAST detector and rotated BRIEF to overcome the limitations of FAST and BRIEF, but it does not provide good results across multiple scales.
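A minimal sketch of a BRIEF-style binary descriptor and the Hamming distance used to match such descriptors (the random sampling pattern here is an assumption for illustration; ORB uses a learned pattern, and real implementations pack the bits into bytes):

```python
import numpy as np

def brief_like(patch, pairs):
    """Binary descriptor: bit i is 1 iff intensity at p_i < intensity at q_i."""
    bits = [1 if patch[p] < patch[q] else 0 for p, q in pairs]
    return np.array(bits, dtype=np.uint8)

def hamming(a, b):
    """Hamming distance: number of differing bits (matching metric for
    BRIEF/ORB/BRISK descriptors)."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(3)
# 256 random test-point pairs inside a 31x31 patch
pairs = [(tuple(rng.integers(0, 31, 2)), tuple(rng.integers(0, 31, 2)))
         for _ in range(256)]
patch = rng.integers(0, 256, (31, 31))
d1 = brief_like(patch, pairs)
d2 = brief_like(patch.T, pairs)   # a transformed patch generally gives a different code
```

Binary descriptors like this are why ORB, BRISK, and AKAZE matching is fast: Hamming distance is a bit count rather than a floating-point norm.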
BRISK uses a novel scale-space FAST-based detector in combination with a bit-string descriptor built from intensity comparisons retrieved by dedicated sampling of each key point's neighbourhood.
A corner feature represents a variation of the gradient in the image. Harris corner detection uses the Sobel operator, and the smallest eigenvalue of the structure tensor is used for corner detection.
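The smallest-eigenvalue corner response from Sobel gradients can be sketched as follows (naive loops over a small patch, for illustration only; real detectors compute this per pixel over a local window):

```python
import numpy as np

def sobel(img):
    """3x3 Sobel gradients (valid region only), no external dependencies."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            win = img[i:i + 3, j:j + 3]
            gx[i, j] = (win * kx).sum()
            gy[i, j] = (win * ky).sum()
    return gx, gy

def min_eig_response(img):
    """Corner response = smaller eigenvalue of the 2x2 structure tensor
    accumulated over the whole (small) patch."""
    gx, gy = sobel(img.astype(float))
    m = np.array([[(gx * gx).sum(), (gx * gy).sum()],
                  [(gx * gy).sum(), (gy * gy).sum()]])
    return np.linalg.eigvalsh(m)[0]   # eigvalsh returns ascending eigenvalues

flat = np.full((9, 9), 10.0)                         # uniform patch: no gradient
corner = np.zeros((9, 9)); corner[4:, 4:] = 255.0    # L-shaped corner patch
```

A flat patch gives a zero response, while a true corner (gradient variation in two directions) gives a large smallest eigenvalue; a straight edge would give a rank-1 tensor and a near-zero response.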
The KAZE method performs 2D multi-scale feature detection and description in a nonlinear scale space built by Additive Operator Splitting (AOS) and variable-conductance diffusion. It provides better results under noise and blur than SIFT, SURF, and other methods; it is more expensive than SURF but better than SIFT. AKAZE (Accelerated KAZE) uses Fast Explicit Diffusion (FED) embedded in a pyramidal framework to dramatically speed up feature detection in nonlinear scale spaces, together with a Modified Local Difference Binary (M-LDB) descriptor that is highly efficient, exploits gradient information from the nonlinear scale space, is scale and rotation invariant, and has low storage requirements.
Based on the corresponding matching feature points, the homography matrix is found using RANSAC (Random Sample Consensus), which uses the direct linear transform and removes false matching pairs. The sensed image is warped with the homography matrix, and the warped sensed image and the reference image are blended using alpha-beta blending. To check the proposed homography matrix, it is compared with the homography matrix of the Graffiti data set (http://www.robots.ox.ac.uk/~vgg/data/data-aff.html), as shown in Table 1. OpenCV 2.4.9 (an image processing library) is configured with Microsoft Visual Studio 2013 using CMake 2.8 (a cross-platform build tool) on Windows 10.
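The direct linear transform (DLT) that RANSAC runs on each minimal sample of four correspondences can be sketched in plain NumPy (this is a sketch of the estimation step, not OpenCV's `findHomography`):

```python
import numpy as np

def dlt_homography(src, dst):
    """Direct Linear Transform: estimate the 3x3 homography H such that
    dst ~ H @ src, from >= 4 point correspondences (x, y)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # two linear constraints on the 9 entries of H per correspondence
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.array(rows, float))
    h = vt[-1]                     # null-space vector = flattened H
    return h.reshape(3, 3) / h[8]  # normalise so H[2, 2] = 1

def apply_h(H, pt):
    """Map a point through H with the homogeneous divide."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return np.array([x / w, y / w])

# Sanity check with a known transform: pure translation by (10, 20)
src = [(0, 0), (1, 0), (0, 1), (1, 1)]
dst = [(10, 20), (11, 20), (10, 21), (11, 21)]
H = dlt_homography(src, dst)
```

RANSAC repeats this on random 4-point samples, keeps the H with the most inliers, and the final H is used to warp the sensed image onto the reference image.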
Table 1. Comparison of the data set's homography matrix and each method's estimated homography matrix (sensed image: Image1; reference image: Image2).

Homography transformation matrix from the data set (same reference for all methods):
0.879769, 0.324543, -39.431; -0.18390, 0.938472, 153.159; 0.0001964, -0.000016, 1

Estimated homography transformation matrix and Frobenius norm difference per method:
SURF:       0.87917, 0.31560, -40.199; -0.18304, 0.93333, 154.46; 0.000196, -0.000017242, 1   (FN difference: 1.4567)
SIFT:       0.8797, 0.324543, -39.43; -0.1839, 0.938472, 153.16; 0.0001964, -0.0000160, 1     (FN difference: 0.2322)
ORB:        0.8797, 0.324543, -39.43; -0.1839, 0.9385, 153.15784; 0.0001964, -0.000016, 1     (FN difference: 1.3803)
BRISK:      0.9104, 0.3322, -49.162; -0.1714, 0.9655, 147.440; 0.000232, 7.44E-06, 1          (FN difference: 2.7305)
BRIEF:      0.92141, 0.2953, -45.410; -0.1634, 0.9299, 152.394; 0.000279, -7.87E-05, 1        (FN difference: 0.8634)
AKAZE:      0.88303, 0.3154, -40.746; -0.1828, 0.9398, 152.9; 0.000199, -1.49E-05, 1          (FN difference: 0.0867)
Harris+NCC: -0.6706, 0.21908, -58.487; -0.15334, -0.6299, 90.7334; 0.000125, -5.092e-05, -0.627 (FN difference: 0.9516)
Fig. 2. Comparison of the Graffiti multi-view data set's homography transformation matrix with those estimated by the different feature detection methods.
The Frobenius norm, sometimes also called the Euclidean norm (a term unfortunately also used for the vector 2-norm), of an m x n matrix A is defined as the square root of the sum of the absolute squares of its elements:

||A||_F = sqrt( Σ_i Σ_j |a_ij|^2 )
The Frobenius norm difference between the data set's homography transformation matrix and each feature finding method's homography transformation matrix is reported in Table 1 and Fig. 2 for the case without added noise. If the difference is near zero, the estimated homography transformation matrix is similar to the data set's homography transformation matrix, and that method is good for image registration. AKAZE provides the best result: its FN difference is smaller than that of the other methods.
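The comparison criterion can be sketched as follows, using the AKAZE row of Table 1; note that the reported FN values may involve additional normalisation, so this sketch only illustrates the idea:

```python
import numpy as np

def fn_difference(h_dataset, h_estimated):
    """Frobenius norm of the difference between two homography matrices."""
    return float(np.linalg.norm(np.asarray(h_dataset) - np.asarray(h_estimated), 'fro'))

h_ref = np.array([[0.879769, 0.324543, -39.431],
                  [-0.18390, 0.938472, 153.159],
                  [0.0001964, -0.000016, 1.0]])
h_est = np.array([[0.88303, 0.3154, -40.746],    # AKAZE estimate from Table 1
                  [-0.1828, 0.9398, 152.9],
                  [0.000199, -1.49e-05, 1.0]])
diff = fn_difference(h_ref, h_est)
registrable = diff <= 10    # threshold used later in this work
```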
For occluded multi-view image registration, Gaussian noise (mean zero, with varying standard deviation) and structured noise are added to different images.
Frobenius norm difference between the data set's homography transformation matrix and each feature detection method's estimate (no added noise, data of Fig. 2):

              SURF      SIFT      ORB       BRISK     FAST+BRIEF  AKAZE    Harris corner+NCC
img1 to img2  1.4567    0.2322    1.3803    2.7305    0.8634      0.0867   0.9516
img1 to img3  1.7923    3.6402    2.2611    83.0214   20.2316     0.5715   16.2073
img1 to img4  58.7529   388.0239  585.3934  543.192   423.5331    2.9907   434.6938
img1 to img5  147.6418  104.6867  63.7113   278.9678  40.3463     6.5684   235.8676
img1 to img6  287.596   173.5486  272.5032  712.1027  585.0277    68.4733  6.3434
Gaussian noise comparison results:

Table 2. Homography transformation matrix comparison, with Gaussian noise added to the sensed image (image1) and image2 as the noise-free reference; the SD value listed per method is the noise level applied to image1.

SURF, Img1 (SD=100):   0.00437, -0.08342, 43.4580; 0.05564, -1.09774, 572.991; 9.71E-05, -0.00192, 1         (FN difference: 416.4776)
SIFT, Img1 (SD=350):   -0.50842, -0.0507, 301.8041; -0.29967, -0.0312, 178.242; -0.0017, -0.00017, 1         (FN difference: 192.349)
ORB, Img1 (SD=370):    0.87717, 0.3120, -39.4250; -0.18245, 0.9335, 153.09; 0.000195, -2.318e-05, 1          (FN difference: 392.9627)
BRISK, Img1 (SD=40):   -6.04023, 8.526373, 190.572; -3.3182, 4.677737, 105.787; -0.03215, 0.045464, 1        (FN difference: 60.1318)
BRIEF, Img1 (SD=50):   0.82965, -0.91072, 259.21; 0.2242, -1.16815, 496.688; 0.000466, -0.00234, 1           (FN difference: 402.0998)
AKAZE, Img1 (SD=540):  -124.165, 115.842, -9112.396; -72.8067, 6.07424, 10343.44; -0.23585, 0.140151, 0.9999 (FN difference: 13627.9447)
Harris+NCC, Img1 (SD=150): image registration failed
Fig. 3. (a) to (e): FN difference of image1 registered to image2, image3, image4, image5, and image6 respectively, as a function of the Gaussian noise standard deviation (AKAZE method). If the FN difference is greater than 10, image registration is considered failed/difficult.
Fig. 4. Gaussian noise level at which image registration fails/becomes difficult for the different feature finding methods (noise added to image1 only).
Fig. 5. Gaussian noise level at which image registration fails/becomes difficult for the different feature finding methods (noise added to all images).
Figure 3 shows the level of Gaussian noise at which image registration fails/becomes difficult. Based on Figure 3, the graph in Figure 4 was generated for the AKAZE method; the other methods are compared in the same way in Figure 4 for the Graffiti data set. From the practical Frobenius norm (FN) difference data and manual checking of the warped images, registration of image1 (the Gaussian-noise image) with image2 to image6 is considered failed/difficult when the FN difference exceeds the threshold of 10. In Table 2 and Figure 3, Gaussian noise is added to image1 only, and key points are detected and matched by the different methods. In Figure 5, Gaussian noise is added to all images. Based on the FN difference and the warped image, the level at which image registration fails/becomes difficult is determined. AKAZE provides good results compared to the other methods in Figures 4 and 5.

Gaussian noise level (SD) at which registration fails/becomes difficult, noise added to image1 (data of Fig. 4):

                         SURF  SIFT  ORB  BRISK  BRIEF  AKAZE  Harris+HoG
image1 to img2           100   350   370  40     50     540    150
image1 to img3           100   20    180  10     30     415    120
image1 to img4           5     5     20   5      5      200    10
image1 to img5           5     5     5    5      5      10     5
image1 to img6           5     5     5    5      5      5      5

Gaussian noise level (SD) at which registration fails/becomes difficult, noise added to all images (data of Fig. 5):

              SURF  SIFT  ORB  BRISK  BRIEF  AKAZE
img1 to img2  130   110   210  30     200    270
img1 to img3  120   20    140  20     80     260
img1 to img4  5     5     5    5      5      210
img1 to img5  5     5     5    5      5      35
img1 to img6  5     5     5    5      5      5
Structured noise comparison results:

Table 3. Homography transformation matrix comparison (structured noise in image1).
Structured noise added to the sensed image1:

(a) Image1, structured noise (250 to 350 block size); Frobenius norm difference:
      AKAZE    SURF     SIFT     ORB      BRISK    BRIEF    Harris+HoG
Img2  0.7838   0.9459   0.8235   0.7736   1.2644   1.1553   46.7107
Img3  0.3145   1.9153   9.7172   2.4093   28.8768  20.2316  32.8552
Img4  2.0739   317.842  524.027  3178.23  530.86   240.997  1431.71
Img5  1.288    240.258  114.395  269.348  281.312  613.213  114.352
Img6  102.193  98.2396  103.657  63.5861  2.3158   203.378  266.477

(b) Image1, structured noise (250 to 450 block size); Frobenius norm difference:
      AKAZE    SURF     SIFT     ORB      BRISK    BRIEF    Harris+HoG
Img2  0.2216   2.3319   0.7378   0.4206   11207.6  189.541  55.3778
Img3  0.4986   0.5718   318.907  2.1557   304.431  20.2316  50.7625
Img4  2.845    229.115  488.517  458.236  412.129  330.561  541.799
Img5  393.495  287.688  211.566  11342    67.6257  255.308  1344.2
Img6  318.274  104.061  102.542  72.5764  321.991  168.361  112.634
(c) Image1, structured noise (150 to 550 block size); Frobenius norm difference:
      AKAZE    SURF     SIFT     ORB      BRISK    BRIEF    Harris+HoG
Img2  0.4384   5.4584   1.122    0.5971   340.074  775.164  26.0983
Img3  0.2185   6.6906   8.3632   325.854  543.438  373.097  32.579
Img4  0.47     229.015  487.577  450.138  610.084  625.007  1631.41
Img5  2562.99  234.677  171.261  142.839  52.4374  309.954  175.637
Img6  276.072  118.837  96.4783  265.813  285.916  397.821  1828.01

(d) Image1, structured noise (50 to 650 block size); Frobenius norm difference:
      AKAZE    SURF     SIFT     ORB      BRISK    BRIEF    Harris+HoG
Img2  347.539  15.8095  0.238    361.431  685.859  369.037  96.8614
Img3  528.503  414.911  41.7409  13.7578  545.314  181.262  883.395
Img4  440.641  229.066  413.331  288.012  610.1    278.169  135.822
Img5  544.154  285.252  112.7    317.65   51.1359  687.727  1173.11
Img6  394.937  8.8291   210.328  329.039  322.82   228.883  201.462

(e) Image1, Pattern1; Frobenius norm difference:
      AKAZE    SURF     SIFT     ORB      BRISK    BRIEF    Harris+HoG
Img2  0.6321   3.5866   0.6678   0.1389   685.933  1.5495   51.6238
Img3  2.7199   1.2867   262.422  208.946  546.781  27.6354  52.7299
Img4  5.5264   170.508  295.138  494.753  610.028  405.463  1059.5
Img5  73.3148  240.232  114.458  48.0047  51.2985  351.355  31.0364
Img6  79.1124  275.3    84.1031  270.062  324.449  273.328  140.08
(f) Image1, Pattern2; Frobenius norm difference:
      AKAZE    SURF     SIFT     ORB      BRISK    BRIEF    Harris+HoG
Img2  0.3023   3.8596   0.2998   1.8367   685.798  404.097  144.388
Img3  1.4888   287.411  47.8186  121.2    545.929  727.553  48.9259
Img4  502.254  94.5823  152.026  523.04   610.092  519.576  296.484
Img5  517.216  287.473  116.355  66.079   51.6598  914.936  487.511
Img6  205.682  276.918  131.841  108.145  324.451  120.973  1135.11

(g) Image1, Pattern3; Frobenius norm difference:
      AKAZE    SURF     SIFT     ORB      BRISK    BRIEF    Harris+HoG
Img2  0.857    505.692  0.862    361.448  685.831  12.1203  431.426
Img3  4.0476   434.835  930.069  120.811  545.324  118.978  153.373
Img4  73.7101  380.815  258.764  405.376  610.097  504.34   1559.2
Img5  22.0295  287.643  111.755  314.373  51.6134  390.933  1199.8
Img6  194.846  120.442  111.52   107.803  324.444  402.093  124.339

(h) Image1, Pattern4 (periodic stationary); Frobenius norm difference:
      AKAZE    SURF     SIFT     ORB      BRISK    BRIEF    Harris+HoG
Img2  0.5509   1.7922   0.2072   1.4435   685.785  16.4868  51.4254
Img3  2.1044   107.844  71.1856  515.775  545.314  336.637  51.934
Img4  373.93   467.942  326.23   458.495  610.094  4.7421   1160.45
Img5  553.621  287.68   192.873  518.576  51.0769  248.914  341.501
Img6  669.678  292.199  93.3779  275.016  324.454  151.648  862.714
Structured noise is added to image1 as shown in Table 3. AKAZE provides good results compared to the other methods. The graphs show how changes in block size and pattern affect image registration for each method.
Fig. 6. Image1 (sensed image) with structured noise; image2 to image6 (reference images) without noise. Results are generated with AKAZE.
(i) Image1, Pattern5 (periodic stationary); Frobenius norm difference:
      AKAZE    SURF     SIFT     ORB      BRISK    BRIEF    Harris+HoG
Img2  0.7432   1.5181   0.7716   8.4816   1.2751   44.9296  285.387
Img3  4.3305   109.975  1.7098   14.6876  545.316  386.948  2319.28
Img4  369.24   467.994  304.019  503.417  610.095  435.031  6.9256
Img5  547.004  287.634  121.298  546.704  51.1745  597.891  55.1183
Img6  214.06   291.355  282.795  265.691  324.261  128.192  30.6119
Structured noise added to image1 (sensed image); image2 to image6 (reference images) without noise; Frobenius norm differences using AKAZE. Columns are the structured-noise configurations (block-size ranges and patterns):

      blocksize  blocksize  blocksize  blocksize   pattern1  pattern2  pattern3  pattern4  pattern5  block1
      250-350    250-450    50-550     50-600/650
      (800x640)
img2  0.7838     0.2216     0.4384     347.539     0.6321    0.3023    0.857     0.5509    0.7432    1.6217
img3  0.3145     0.4986     0.2185     528.503     2.7199    1.4888    4.0476    2.1044    4.3305    1.4579
img4  2.0739     2.845      0.47       440.641     5.5264    502.254   73.7101   373.93    369.24    2.4967
img5  1.288      393.495    2562.99    544.154     73.3148   517.216   22.0295   553.621   547.004   890.466
img6  102.193    318.274    276.072    394.937     79.1124   205.682   194.846   669.678   214.06    315.15
Fig. 7. Image1 (sensed image) is without noise; structured noise is added to image2 to image6 (reference images). FN comparison result of the AKAZE method.
Structured noise is added to img2 to img6 (reference images); image1 (sensed image) is without noise. Frobenius norm differences, generated using AKAZE.
Columns: (a) blocksize 250 to 350 (i,j) (800x640); (b) blocksize 250 to 450 (i,j); (c) blocksize 50 to 550 (i,j); (d) blocksize 50 to 600/650 (i,j); (e) pattern1; (f) pattern2; (g) pattern3; (h) pattern4; (i) pattern5; (j) block1.

       (a)       (b)       (c)       (d)       (e)       (f)       (g)       (h)       (i)       (j)
img2   0.3959    0.6095    4.2098    667.278   0.5772    0.1073    0.111     0.3877    2.1129    0.8321
img3   0.4742    0.7433    44.8584   34.6319   0.1534    0.8284    0.7677    0.0349    0.9056    0.6872
img4   0.7832    0.5641    258.367   571.849   0.6675    483.054   63.9039   5.9698    4.8196    1.1126
img5   364.098   643.399   544.413   547.196   326.554   403.298   3538.24   644.642   205.107   7.6661
img6   40.1158   5.7294    324.774   315.242   193.576   238.003   66.2365   91.4454   93.4409   143.507
Structured noise is added to all images. Frobenius norm differences, generated using AKAZE.
Columns: (a) blocksize 250 to 350 (i,j) (800x640); (b) blocksize 250 to 450 (i,j); (c) blocksize 50 to 550 (i,j); (d) blocksize 50 to 600/650 (i,j); (e) pattern1; (f) pattern2; (g) pattern3; (h) pattern4; (i) pattern5; (j) block1.

       (a)       (b)       (c)       (d)       (e)       (f)       (g)       (h)       (i)       (j)
img2   0.6652    0.7139    5.2335    5852.1    1.9636    1.0753    132.5     1.2817    1.8241    1.1694
img3   1.7108    0.5872    623.33    156.63    0.4409    40.141    514.77    0.03      1.3823    0.7444
img4   0.2595    1.4459    511.03    81.893    38.87     407.42    582.15    304.16    495.98    442.62
img5   150       374.38    159.23    443.9     310.01    157.35    662.32    1097.5    312.64    427.05
img6   126.72    66.444    202.49    337.56    191.32    107.44    70.546    97.385    20.08     230.12
Fig. 8. Structured noise is added to all images. FN comparison result of the AKAZE method.
Figures 6, 7 and 8 show how structured noise affects the results. Three cases are checked for the effect of structured noise:
1. The sensed image (image1) is with structured noise; the reference images (image2 to image6) are without noise (results in figure 6).
2. The sensed image (image1) is without noise; the reference images (image2 to image6) are with noise (results in figure 7).
3. Structured noise is present in all images (results in figure 8).
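The block-occlusion cases above can be sketched as follows. This is a minimal illustration assuming a grayscale image; the block position, size, and fill value are hypothetical, not the exact block generator or patterns used in the experiments.

```python
import numpy as np

def add_block_noise(image, top_left, block_size, value=0):
    """Occlude a rectangular region of the image with a constant block.

    image      : 2-D (grayscale) numpy array
    top_left   : (row, col) of the block's upper-left corner
    block_size : (height, width) of the block
    value      : intensity written into the occluded region
    """
    occluded = image.copy()
    r, c = top_left
    h, w = block_size
    occluded[r:r + h, c:c + w] = value
    return occluded

# Illustrative 800 x 640 sensed image with a 100 x 100 block placed at (250, 350).
sensed = np.random.randint(0, 256, (800, 640), dtype=np.uint8)
noisy = add_block_noise(sensed, (250, 350), (100, 100))
```

Varying `top_left` and `block_size`, and repeating the block at several locations, produces the different block sizes and patterns compared in the charts above.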
Table 4. Homography transformation matrix comparison (structured noise in all images as block): sensed Image1 vs. reference Image2, with rows for Img2 to Img6. (The matrix entries appear as images in the original and are not reproduced here.)
7. Achievements with respect to objectives
Multi-view images are registered in the presence and absence of occlusion.
Gaussian noise and structured noise are added to the images to create occlusion.
The images may differ in scale, rotation, translation, illumination, noise, etc.
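The Gaussian-noise occlusion can be sketched as below; the test image and the standard-deviation sweep are illustrative assumptions, showing only how the noise level would be varied when probing at what point registration fails.

```python
import numpy as np

def add_gaussian_noise(image, sigma, seed=None):
    """Add zero-mean Gaussian noise of standard deviation `sigma`, clipped to [0, 255]."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Illustrative flat gray image; sweep the noise level as done when testing
# at which standard deviation image registration becomes difficult or fails.
image = np.full((64, 64), 128, dtype=np.uint8)
noisy_versions = {sigma: add_gaussian_noise(image, sigma, seed=0)
                  for sigma in (5, 15, 30, 60)}
```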
Blocks are added in different areas of the sensed and reference images (Frobenius norm difference per method):

              AKAZE    SURF     SIFT     ORB      BRISK    BRIEF    Harris+HoG
img2, block1  1.1694   11.469   0.6667   34.948   572.45   28.37    53.409
img3, block1  0.7444   2.672    3.5597   0.8764   672.61   170.48   83.008
img4, block1  442.62   434.02   423.81   455.35   484.8    317.95   86.274
img5, block1  427.05   410.97   171.69   508.93   435.72   542.67   2722.2
img6, block1  230.12   41.356   141.51   158.93   151.99   298.33   696.49
The Frobenius norm is measured between the dataset's homography transformation matrix and each method's estimated homography transformation matrix.
Results are compared for three cases:
o Noise is added to the sensed image (image1); the reference images are without noise.
o Noise is added to the reference images (image2 to image6); the sensed image is without noise.
o Noise is added to both the sensed and reference images.
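The Frobenius-norm comparison itself is a one-line computation. A minimal sketch, with hypothetical stand-ins for the dataset's homography and a method's estimate:

```python
import numpy as np

def frobenius_norm_difference(H_true, H_est):
    """Frobenius norm of the difference between two 3x3 homography matrices."""
    return np.linalg.norm(np.asarray(H_true) - np.asarray(H_est), ord='fro')

# Hypothetical ground-truth and estimated homographies.
H_true = np.array([[1.00, 0.00, 10.0],
                   [0.00, 1.00, 20.0],
                   [0.00, 0.00, 1.0]])
H_est = np.array([[1.01, 0.00, 10.5],
                  [0.00, 0.99, 19.5],
                  [0.00, 0.00, 1.0]])
fn = frobenius_norm_difference(H_true, H_est)
```

Since a homography is defined only up to scale, both matrices should be normalized (e.g. so that H[2][2] = 1, as above) before taking the difference; otherwise the scale ambiguity inflates the norm.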
8. Conclusion
Keypoints and feature descriptors are found using SURF, SIFT, ORB, FAST+BRIEF, BRISK, Harris+NCC, and AKAZE. From these feature points, corresponding points between the sensed and reference images are found using brute-force or FLANN matching. RANSAC is then used to estimate the homography transformation matrix from the inlier points.
Gaussian and structured noise are added to the images. For Gaussian noise, the standard deviation is varied, and the results show at what noise level image registration becomes difficult or fails. For structured noise, blocks of different sizes are added to the images and different patterns are generated. AKAZE gives good results in the presence of Gaussian/structured noise compared to the other feature-detection methods.
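The estimation step that RANSAC relies on can be sketched with a minimal direct linear transform (DLT): given four or more correspondences, the homography is the null vector of a stacked constraint matrix. This is a simplified NumPy sketch with made-up points, not the OpenCV implementation used in the experiments:

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 homography H with dst ~ H @ src via the DLT.

    src, dst : (N, 2) arrays of corresponding points, N >= 4.
    """
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on the 9 entries of H.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the right singular vector of A with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so H[2, 2] = 1

# Check against a known homography applied to the four corners of a square.
H_true = np.array([[1.2, 0.1, 5.0],
                   [-0.05, 0.9, 10.0],
                   [1e-4, 2e-4, 1.0]])
src = np.array([[0, 0], [100, 0], [100, 100], [0, 100]], dtype=np.float64)
proj = (H_true @ np.hstack([src, np.ones((4, 1))]).T).T
dst = proj[:, :2] / proj[:, 2:]
H_est = estimate_homography(src, dst)
```

In practice, cv2.findHomography with the RANSAC flag performs this estimation repeatedly on minimal samples and returns the model supported by the largest inlier set.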
9. Copies of papers published and a list of all publications arising from the thesis
The publication details for this work are given below for reference.
1. Darshana Mistry, Asim Banerjee, “Review: Image Registration”, International Journal
of Graphics & Image Processing, Volume 2, Issue 1, Pages 18-22, February 2012.
2. Darshana Mistry, Asim Banerjee, “Image Similarity based on Joint Histogram”,
International Conference on Advances in Engineering and Technology, SKN
Sinhghad Institute of Technology and Science. 2013.
3. Darshana Mistry, Asim Banerjee, Aditya Tatu, “Image Similarity based on Intensity
using Mutual Information”, International Journal of Computer Science and
Engineering Research and Development, Volume 3, Issue 2, Pages 1-8, 2013.
4. Darshana Mistry, Asim Banerjee, “Comparison of Feature Detection and Matching
Approaches: SIFT and SURF”, Global Research and Development Journal of
Engineering, Volume 2, Issue 4, Pages 7-13, 2017.
10. References
Barbara Zitova, Jan Flusser, “Image Registration methods: a survey”, Image and Vision Computing 21, pp. 977-1000, 2003.
Michael Linger, Ardeshir Goshtasby, “Aerial Image Registration for Tracking”, IEEE Transaction on
Geoscience and Remote Sensing, Volume 53, Issue 4, pp. 2137-2145, April 2015.
M.V. Wyavahare, P.M. Patil, H.K. Abhyankar, “Image Registration Techniques: An Overview”, International Journal of Signal Processing, Image Processing and Pattern Recognition, Vol. 2, No. 3, 2009.
Hui Lin; Peijun Du; Weichang Zhao; Lianpeng Zhang; Huasheng Sun, “Image registration based on corner
detection and affine transformation”, International conference on Image and Signal Processing(CISP),pp. 2184-
2188,2010
David Lowe, “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110, 2004.
Herbert Bay, Andreas E., Tinne T. and Luc Van G., “Speeded up Robust Feature (SURF)”, Journal of Computer
vision and image understanding, Volume 110, Issue 3, pp. 346-359, 2008.
Herbert Bay, Tinne T., and Luc Van G., “SURF: Speeded Up Robust Features”, Computer Vision - ECCV, pp. 404-417, 2006.
Pablo F. Alcantarilla, Adrien Bartoli, and Andrew J. Davison, “KAZE Features”, Springer European Conference on Computer Vision (ECCV) 2012, Part VI, pp. 214-227, 2012.
Pablo F. Alcantarilla, Jesus Nuevo, and Adrien Bartoli, “Fast Explicit Diffusion for Accelerated Features in Nonlinear Scale Spaces”, British Machine Vision Conference (BMVC), September 2013.
Rublee, E., Rabaud, V., Konolige, K., & Bradski, G. (2011, November). ORB: An efficient alternative to SIFT
or SURF. In Computer Vision (ICCV), 2011 IEEE International Conference on (pp. 2564-2571). IEEE.
Leutenegger, S., Chli, M., & Siegwart, R. Y. (2011, November). BRISK: Binary robust invariant scalable
keypoints. In Computer Vision (ICCV), 2011 IEEE International Conference on (pp. 2548-2555). IEEE.
Calonder, M., Lepetit, V., Strecha, C., & Fua, P. (2010). Brief: Binary robust independent elementary
features. Computer Vision–ECCV 2010, 778-792.
Luo Juan, Oubong Gwun, “A Comparison of SIFT, PCA-SIFT and SURF”, International Journal of Image Processing (IJIP), Volume 3, Issue 4, pp. 143-152, 2009.
Jian Wu, Zhiming Cui, Victor S. Sheng, Pengpeng Zhao, Dongliang Su, Shengrong Gong, “A Comparative
Study of SIFT and its Variants”, Measurement Science Review, Volume 13, Issue 3, pp. 122-131, 2013.
Rosten, E., & Drummond, T. (2006). Machine learning for high-speed corner detection. Computer vision–
ECCV 2006, 430-443.
Difference between SIFT and SURF, “https://www.quora.com/Image-Processing/Difference-between-SURF-and-SIFT-where-and-when-to-use-this-algo” accessed on 23/11/2015.
Utsav Shah, Darshana Mistry and Asim Banerjee, “Image Registration of Multi-View Satellite Images Using
Best Feature Points Detection and Matching Methods from SURF, SIFT and PCA-SIFT”, Journal of Emerging
Technologies and Innovative Research, Volume 1, Issue 1, pp. 8-18, 2014.
HKHUST video data, “https://www.youtube.com/watch=OOUOPnLbjkI” accessed on 6/5/15
Multi-View Image Dataset, “http://www.robots.ox.ac.uk/~vgg/data/data-aff.html” accessed on 7/7/16
Z. Zhang, G. Medioni and S.B. Kang, “Camera Calibration”, chapter 2, Emerging Topics in Computer Vision,
Prentice Hall Professional Technical Reference, pp. 4-43,2004.
C. Shu, “Geometric Model of Camera”, Comp 4900C, winter 2008.
C. Shu, “Epipolar Geometry”, Comp 4900C, winter 2008.
E. Malis, M. Vargas, “Deeper understanding of the homography decomposition for vision-based control”,
Research Report (RR) 6303, INRIA, 2007.
“Decomposition of Homography Transformation”, https://gist.github.com/inspirit/740979
“Camera Calibration”, https://prateekvjoshi.com/2014/05/31/understanding-camera-calibration/
B. Frank, C. Stachniss, G. Grisetti, K. Arras, and W. Burgard, “Robotics 2: Camera Calibration”.
“Decompose Homography into Rotation matrix & Translation vector”, E. Zatepyakin,
https://gist.github.com/inspirit/740979
“Homography”, Chapter 5, http://shodhganga.inflibnet.ac.in/bitstream/10603/28874/11/11_chapter%205.pdf
David Kriegman, “Homography Estimation”, Computer Vision I,CSE 252A, Winter 2007
Darshana Mistry, Asim Banerjee, “Review:Image Registration”, International Journal of Graphics and Image
Processing, Volume 2, Issue 1, February 2012.
Ruhina Karani, Tanuja Sarode, “Image Registration using Discrete Cosine Transformation and Normalized
Cross Correlation”, International conference & Workshop on Recent Trends in Technology, 2012.
Manjusha Deshmukh, Udhav Bhosle, “A Survey of Image Registration”, International Journal of Image
Processing (IJIP), Volume 5, Issue 3,pp 245-269, 2011.
Taejung Kim and Yong-Jo Im, “Automatic Satellite Image Registration by Combination of Matching and Random Sample Consensus”, IEEE Transactions on Geoscience and Remote Sensing, Volume 41, Issue 5, pp. 1111-1117, May 2003.
H.B. Kekre, Tanuja Sarode, Ruhina Karnai, “2D Satellite Image Registration Using Transform Based and
Correlation Based Methods”, International Journal of Advanced Computer Science and Applications, Volume
3,Issue 5, pp.66-72, 2011
Rong Zhang, “Automatic Computation of a Homography by RANSAC Algorithm”, ECE661 Computer Vision
Homework 4.
Konstantinos G. Derpanis, “Overview of the RANSAC Algorithm”,2010.
Paul Heckbert, Fundamentals of Texture Mapping and Image Warping, Master's thesis, University of California, 1989.
“Homography Transformation”, http://www.corrmap.com/features/homography_transformation.php, accessed on 28th September 2015.
“Difference between Fundamental, Essential and Homography matrices”, http://stackoverflow.com/questions/16088301/difference-between-fundamental-essential-and-homography-matrices, accessed on 16th March 2017.
“Frobenius norm”, http://mathworld.wolfram.com/FrobeniusNorm.html.