

Multi-View Image Registration in the Presence of Occlusion

Ph.D. Synopsis

Submitted to Gujarat Technological University for the award of the Ph.D. degree in Computer Engineering

By: Darshana Mistry
Enrolment No: 119997107004, Reg. No: 6014 (Computer Engineering)

Supervisor: Dr. Asim Banerjee, Professor, Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar, India.

Co-Supervisor: Dr. Shishir Shah, Professor, Computer Science Department, University of Houston, USA.


Index

1. Abstract .................................................................................................... 3
2. Brief description on the state of the art of the research topic and Literature Review ....... 4
3. Definition of the Problem ................................................................................ 5
4. Scope of the Work ........................................................................................ 6
5. Original contribution by the thesis ...................................................................... 7
6. Methodology of Research, Results / Comparisons ........................................................... 7
7. Achievements with respect to objectives ................................................................. 16
8. Conclusion .............................................................................................. 17
9. Copies of papers published and a list of all publications arising from the thesis ....................... 17
10. References ............................................................................................. 21


Multi-View Image Registration in the Presence of Occlusion

1. Abstract

Image Registration (IR) is the process of aligning two or more images of the same scene taken at different times, from different sensors, or from different viewpoints. Image registration is a fundamental step in all image analysis systems in which the final information is gained from the combination of various data sources, as in image fusion, change detection, and multichannel restoration. In multi-view analysis, images of the scene are taken from different viewpoints. The aim of this methodology is to gain a larger 2D view or a 3D representation of the scanned scene. It is used in remote sensing (mosaicking of images of a surveyed area), computer vision (shape recovery from stereo), object tracking, etc. One of the main challenges in multi-view image registration is occlusion: feature points of the scene visible in one view may be occluded in another view by objects in the scene. The issues of registering multi-view images in the presence of occlusion are investigated in this doctoral work.

There are three basic steps in image registration: i) feature detection, ii) feature matching, and iii) transform model estimation. Features are distinctive parts of the image scene such as points, lines, edges, corners, blobs, T-junctions, etc. It is necessary that detected feature points be invariant to rotation, scaling, translation, and illumination change. Methods used for feature finding include SURF (Speeded Up Robust Features), SIFT (Scale Invariant Feature Transform), ORB (Oriented FAST and Rotated BRIEF), BRISK (Binary Robust Invariant Scalable Keypoints), FAST+BRIEF (key points detected by FAST, descriptors based on BRIEF (Binary Robust Independent Elementary Features)), Harris+NCC (feature points detected with the Harris corner detector and matched using NCC), and AKAZE (Accelerated KAZE). Alcantarilla et al. (2012) introduced the KAZE feature, which is more robust to noise and blur than SIFT and SURF; it describes 2D features in a nonlinear scale space. AKAZE is scale and rotation invariant and has low storage requirements, and it provides good results in the presence of image noise and blur. False matching points are detected and removed by the RANSAC (Random Sample Consensus) algorithm. A homography is used to estimate the image transform: the transformation between two views is a projective transformation, applied through a homography transformation matrix. Occlusion is simulated through structured noise (blocks/patterns of different sizes added to the images). AKAZE provides good results in the presence of structured/Gaussian noise compared to the other methods because it uses Fast Explicit Diffusion (FED) and a Modified Local Difference Binary descriptor, which perform well in nonlinear scale space.

2. Brief description on the state of the art of the research topic and Literature Review

Image registration is widely used in remote sensing, medical imaging, computer vision, etc. (Zitova et al., 2003). In general, its applications can be divided into four main groups according to the manner of image acquisition: different times (multi-temporal analysis), different sensors (multimodal analysis), different viewpoints (multi-view analysis), and scene-to-model registration.

There are three basic steps in image registration: 1) feature detection, 2) feature matching, and 3) transformation model estimation. Key points and feature descriptors are computed by SIFT (David Lowe, 2004), SURF (Herbert Bay et al., 2008), SIFT and its variants (Jian Wu et al., 2013), FAST (Edward Rosten et al., 2006), BRIEF (Michael Calonder et al., 2010), ORB (Ethan Rublee et al., 2011), BRISK (Stefan Leutenegger et al., 2011), AKAZE (Pablo Alcantarilla et al., 2013), etc. Luo Juan et al. (2009) compared SIFT, SURF, and PCA-SIFT on different data sets and concluded that SURF handles illumination changes better than the others. Utsav et al. (2014) explained satellite image registration using SURF for feature detection, removal of false matching pairs using Euclidean distance, and image transformation using an affine transformation. SURF gives results comparable to SIFT but at a much lower computational cost, and hence is faster than SIFT and its variants. Manjusha et al. (2011) matched the location of maximum match using pixel-based, Fourier-based, and mutual-information methods. Michael et al. (2015) registered aerial images using SIFT for landmark detection, RANSAC to eliminate outlier points, and a homography matrix. He mentioned that multi-view image registration in the presence of occlusion is very critical because sufficient ground areas do not appear in the image due to occlusion or lack of contrast.

From the literature survey, the main challenge in multi-view image registration in the presence of occlusion is that corresponding matching pairs are not found between the sensed and reference images.

To study the effect of occlusion, Gaussian and structured/block noise are added to the images. Key points and feature descriptors are found using SIFT, SURF, BRISK, FAST+BRIEF (key points from FAST, descriptors based on BRIEF), ORB, Harris corner+NCC (features detected by the Harris corner detector and matched by NCC), and AKAZE. After finding corresponding matching points between the sensed and reference images, the homography transformation matrix is estimated using the RANSAC estimator, which removes corresponding outlier points, and a perspective image is generated based on the homography transformation matrix.

Fig. 1. Flow of the proposed doctoral work for multi-view image registration in the presence of occlusion: the sensed image and the reference image are aligned to produce the registered image.
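The projective mapping at the heart of this flow can be sketched in a few lines of NumPy. This is an illustrative sketch, not the thesis implementation: `apply_homography` and the sample translation matrix are hypothetical names/values chosen for demonstration.

```python
import numpy as np

def apply_homography(H, pts):
    """Map Nx2 image points through a 3x3 homography (projective transform)."""
    pts = np.asarray(pts, dtype=float)
    ones = np.ones((pts.shape[0], 1))
    homog = np.hstack([pts, ones])        # lift to homogeneous coordinates
    mapped = homog @ H.T                  # apply the homography
    return mapped[:, :2] / mapped[:, 2:3] # divide by w to return to 2D

# A pure-translation homography: shift every point by (+5, -3)
H = np.array([[1.0, 0.0, 5.0],
              [0.0, 1.0, -3.0],
              [0.0, 0.0, 1.0]])
shifted = apply_homography(H, [[2.0, 2.0]])
```

In general the bottom row of H is not (0, 0, 1), which is what makes the mapping projective rather than affine; the division by w then matters.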

3. Definition of the Problem

Based on the literature survey, the development of an algorithm for multi-view image registration in the presence of occlusion was selected as the problem to address in this doctoral work, because feature points are not easily matched in the occluded parts of the images. Multi-view image registration is performed at different levels of Gaussian noise and structured noise. This is useful for generating 3D images from 2D images, object tracking, image mosaicking, etc.

4. Scope of the Work

The main goal of my research is to develop algorithm(s) to register multi-view images in the presence of occlusion (Gaussian and structured noise):

- Feature key points and descriptors are found using SURF, SIFT, ORB, FAST+BRIEF, BRISK, Harris+NCC, and AKAZE.
- Features are matched using the Brute Force/FLANN algorithm.
- RANSAC (Random Sample Consensus) is used to find the best homography transformation matrix.
- A perspective image is generated from the first (sensed) image using the homography transformation matrix.
- The Frobenius norm difference between the dataset's homography transformation matrix and each method's matrix is compared for multi-view images in the presence of occlusions.

Occlusion is represented in an image as follows:


1. Gaussian noise is added to the image by varying the standard deviation while keeping the mean constant at zero.
2. Structured noise is added to the image, and different patterns are generated in the image.

The Graffiti multi-view data set (http://www.robots.ox.ac.uk/~vgg/data/data-aff.html) is used.
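The two noise models above can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: `add_gaussian_noise` and `add_block_noise` are illustrative names, and the block occluder is filled with a constant value, which is one simple way to realise structured noise.

```python
import numpy as np

def add_gaussian_noise(img, sd, mean=0.0, seed=0):
    """Add zero-mean Gaussian noise of a given standard deviation to an 8-bit image."""
    rng = np.random.default_rng(seed)
    noisy = img.astype(float) + rng.normal(mean, sd, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_block_noise(img, top, left, size, value=0):
    """Occlude a square block (structured noise) starting at (top, left)."""
    out = img.copy()
    out[top:top + size, left:left + size] = value
    return out
```

Sweeping `sd` (e.g. 10 to 600) reproduces the kind of Gaussian-noise levels used below, while varying `top`, `left`, and `size` produces different block patterns.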

Fig. 1. Different levels of Gaussian noise added to image1 of the Graffiti multi-view data set: (a) SD=10, (b) SD=50, (c) SD=100, (d) SD=150, (e) SD=200, (f) SD=250, (g) SD=300, (h) SD=350, (i) SD=500, (j) SD=600.

Fig. 2. Panels (a) to (j): different levels of structured noise in image1 of the Graffiti multi-view data set. The remaining panels show structured noise generated in different areas of image1 and of image2 to image5 of the Graffiti multi-view data set.

5. Original contribution by the thesis

The presented research work addresses multi-view image registration in the presence of occlusion. The novel contributions and achievements of this thesis are:

- Generation of an occlusion dataset by adding Gaussian and structured noise to images.
- A simple yet powerful technique for multi-view occluded image registration.


- Techniques for registration under different views, scales, rotations, lighting conditions, and noise (Gaussian and structured).
- Fast extraction and matching of feature points.
- Results are reported for three cases:
  a. noise added only to the sensed image (image1), with the reference images noise-free;
  b. noise added only to the reference images (image2 to image5), with the sensed image noise-free;
  c. noise added to both the sensed and reference images.

6. Methodology of Research, Results / Comparisons

Features are found in the sensed and reference images using SURF, SIFT, ORB, FAST+BRIEF, BRISK, Harris+NCC, and AKAZE. There are two basic steps in finding feature points in an image: 1) finding key points and 2) computing descriptors. First, the image is converted into an integral image, and a box filter is convolved with the integral image.

In SURF, the Hessian matrix is approximated from the Laplacian of Gaussian, and key points are detected from the determinant of the Hessian matrix. In scale space, the filter size is doubled in every octave instead of doubling the image size, so SURF key points are scale invariant. For rotation invariance, Haar wavelet responses in the x and y directions are computed within a circular neighbourhood of radius 6s around the interest point. To extract the descriptor, a square region of size 20s is constructed, centred on the interest point and oriented along the orientation selected from the 6s region. This region is split into 4x4 square sub-regions. The wavelet responses dx and dy are summed over each sub-region, and each sub-region contributes a four-dimensional descriptor vector (∑dx, ∑dy, ∑|dx|, ∑|dy|). Features are matched based on the Euclidean distance between descriptor vectors of the sensed and reference images using the brute-force method.
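The sub-region vector and the brute-force matching step can both be sketched in NumPy. This is an illustrative sketch, not OpenCV's SURF: `surf_subregion_vector` and `match_bruteforce` are hypothetical names, and the dx/dy arrays stand in for Haar wavelet responses.

```python
import numpy as np

def surf_subregion_vector(dx, dy):
    """Four-dimensional SURF-style sub-region vector: (sum dx, sum dy, sum|dx|, sum|dy|)."""
    return np.array([dx.sum(), dy.sum(), np.abs(dx).sum(), np.abs(dy).sum()])

def match_bruteforce(desc1, desc2):
    """For each descriptor in desc1, return the index of the nearest descriptor in
    desc2 under Euclidean distance (exhaustive brute-force search)."""
    dists = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=2)
    return dists.argmin(axis=1)
```

A full SURF descriptor concatenates one such 4-vector per sub-region (4x4 sub-regions, giving 64 dimensions); matching then compares these 64-dimensional vectors exhaustively.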

In SIFT, scale-space extrema are detected by the Difference of Gaussians (DoG). The scale space of an image is its convolution with a variable-scale Gaussian; the filter size is fixed and the image is downsampled at every octave for scale-invariant feature detection. Key points are selected based on their stability. The key point descriptor allows for significant shifts in gradient positions by creating orientation histograms over 4x4 sample regions, with the length of each arrow corresponding to the magnitude of the histogram entry. The total descriptor size is 128 dimensions.
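The orientation histograms behind the SIFT descriptor can be sketched as follows. This is a simplified illustration, assuming hard binning (real SIFT interpolates between bins); `orientation_histogram` is a hypothetical name.

```python
import numpy as np

def orientation_histogram(angles_deg, magnitudes, bins=8):
    """Magnitude-weighted gradient-orientation histogram for one sample region."""
    hist = np.zeros(bins)
    width = 360.0 / bins
    for angle, mag in zip(angles_deg, magnitudes):
        hist[int(angle % 360 // width)] += mag  # hard-assign each gradient to a bin
    return hist
```

Concatenating one 8-bin histogram for each of the 4x4 sample regions yields the 4 x 4 x 8 = 128-dimensional SIFT descriptor.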


The FAST method finds features using an accelerated segment test over a circle of sixteen pixels around the corner candidate. However, it does not work well when the contiguous arc of brighter/darker pixels is shorter than 12, and it does not produce multi-scale features. The BRIEF descriptor performs similarly to SIFT, but it is sensitive to in-plane rotation. ORB uses an oriented FAST detector and rotated BRIEF to overcome the limitations of FAST and BRIEF, but it does not perform well across multiple scales.

BRISK uses a novel scale-space FAST-based detector in combination with a bit-string descriptor obtained from intensity comparisons retrieved by dedicated sampling of each key point's neighbourhood.

A corner feature represents a variation of the gradient in the image. Harris corner detection uses the Sobel operator, and the smallest eigenvalue is used for corner detection.

The KAZE method performs 2D multi-scale feature detection and description in a nonlinear scale space built by Additive Operator Splitting (AOS) and variable conductance diffusion. It provides better results under noise and blur than SIFT, SURF, and other methods; it is more expensive than SURF but better than SIFT. AKAZE (Accelerated KAZE) uses Fast Explicit Diffusion (FED) embedded in a pyramidal framework to dramatically speed up feature detection in nonlinear scale spaces, together with a Modified Local Difference Binary (M-LDB) descriptor that is highly efficient, exploits gradient information from the nonlinear scale space, is scale and rotation invariant, and has low storage requirements.

Based on the corresponding matching feature points, the homography matrix is estimated using RANSAC (Random Sample Consensus), which uses the direct linear transform and removes false matching pairs. The sensed image is warped based on the homography matrix, and the warped sensed image and the reference image are blended using alpha-beta blending. To check the proposed homography matrix, it is compared with the homography matrix of the Graffiti dataset (http://www.robots.ox.ac.uk/~vgg/data/data-aff.html), as shown in Table 1. OpenCV 2.4.9 (an image processing library) is configured with Microsoft Visual Studio 2013 using CMake 2.8 (a cross-platform build tool) on Windows 10.
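The final blending step can be sketched without OpenCV. This is a minimal NumPy sketch of weighted (alpha-beta) blending in the style of OpenCV's addWeighted; `alpha_blend` is an illustrative name, and the images are assumed to be the same size and already aligned by the warp.

```python
import numpy as np

def alpha_blend(img1, img2, alpha=0.5):
    """Blend two equal-sized 8-bit images: alpha*img1 + (1 - alpha)*img2."""
    out = alpha * img1.astype(float) + (1.0 - alpha) * img2.astype(float)
    return np.clip(out, 0, 255).astype(np.uint8)
```

With alpha = 0.5 both images contribute equally, which makes misalignments in the overlap region easy to spot when inspecting the registered result.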


Table 1. Comparison of the dataset's homography matrix and each method's estimated homography matrix (sensed image1 to reference image2). The dataset's homography transformation matrix is:

0.879769, 0.324543, -39.431;
-0.18390, 0.938472, 153.159;
0.0001964, -0.000016, 1

Method | Estimated homography transformation matrix | Frobenius norm difference
SURF | 0.87917, 0.31560, -40.199; -0.18304, 0.93333, 154.46; 0.000196, -0.000017242, 1 | 1.4567
SIFT | 0.8797, 0.324543, -39.43; -0.1839, 0.938472, 153.16; 0.0001964, -0.0000160, 1 | 0.2322
ORB | 0.8797, 0.324543, -39.43; -0.1839, 0.9385, 153.15784; 0.0001964, -0.000016, 1 | 1.3803
BRISK | 0.9104, 0.3322, -49.162; -0.1714, 0.9655, 147.440; 0.000232, 7.44E-06, 1 | 2.7305
BRIEF | 0.92141, 0.2953, -45.410; -0.1634, 0.9299, 152.394; 0.000279, -7.87E-05, 1 | 0.8634
AKAZE | 0.88303, 0.3154, -40.746; -0.1828, 0.9398, 152.9; 0.000199, -1.49E-05, 1 | 0.0867
Harris+NCC | -0.6706, 0.21908, -58.487; -0.15334, -0.6299, 90.7334; 0.000125, -5.092e-05, -0.627 | 0.9516


Fig. 2. Graph comparing the Graffiti multi-view dataset's homography transformation matrix with those estimated by the different feature key point detection methods.

The Frobenius norm, sometimes also called the Euclidean norm (a term unfortunately also used for the vector 2-norm), of an m x n matrix A is defined as the square root of the sum of the absolute squares of its elements:

||A||_F = sqrt( sum_{i=1..m} sum_{j=1..n} |a_ij|^2 )

The Frobenius norm difference between the dataset's homography transformation matrix and each feature finding method's estimated homography transformation matrix is measured in Table 1 and Fig. 2 for the case without any added noise. If the difference is near zero, the estimated homography transformation matrix is similar to the dataset's matrix, and that method is good for image registration. AKAZE provides the best result: its FN difference is smaller than that of the other methods.
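The comparison metric used throughout the tables can be sketched directly from the definition above; `fn_difference` is an illustrative name, not from the thesis.

```python
import numpy as np

def fn_difference(h_dataset, h_estimated):
    """Frobenius norm of the difference between two 3x3 homography matrices."""
    diff = np.asarray(h_dataset, dtype=float) - np.asarray(h_estimated, dtype=float)
    return np.linalg.norm(diff, ord='fro')  # sqrt of sum of squared entries
```

A value near zero means the estimated homography agrees with the ground-truth homography shipped with the dataset; large values flag failed registrations.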

For occluded multi-view image registration, Gaussian noise (with mean zero and varying standard deviation) and structured noise are added to the different images.

Data underlying Fig. 2: FN difference between the dataset's homography transformation matrix and that of each feature detection method.

Pair | SURF | SIFT | ORB | BRISK | FAST+BRIEF | AKAZE | Harris corner+NCC
img1 to img2 | 1.4567 | 0.2322 | 1.3803 | 2.7305 | 0.8634 | 0.0867 | 0.9516
img1 to img3 | 1.7923 | 3.6402 | 2.2611 | 83.0214 | 20.2316 | 0.5715 | 16.2073
img1 to img4 | 58.7529 | 388.0239 | 585.3934 | 543.192 | 423.5331 | 2.9907 | 434.6938
img1 to img5 | 147.6418 | 104.6867 | 63.7113 | 278.9678 | 40.3463 | 6.5684 | 235.8676
img1 to img6 | 287.596 | 173.5486 | 272.5032 | 712.1027 | 585.0277 | 68.4733 | 6.3434


Gaussian Noise comparison result:

Table 2. Homography transformation matrix comparison, with Gaussian noise added to sensed image1 at the standard deviation (SD) level where each method fails.

Method | Gaussian noise in image1 | Estimated homography transformation matrix | FN difference
SURF | SD=100 | 0.00437, -0.08342, 43.4580; 0.05564, -1.09774, 572.991; 9.71E-05, -0.00192, 1 | 416.4776
SIFT | SD=350 | -0.50842, -0.0507, 301.8041; -0.29967, -0.0312, 178.242; -0.0017, -0.00017, 1 | 192.349
ORB | SD=370 | 0.87717, 0.3120, -39.4250; -0.18245, 0.9335, 153.09; 0.000195, -2.318e-05, 1 | 392.9627
BRISK | SD=40 | -6.04023, 8.526373, 190.572; -3.3182, 4.677737, 105.787; -0.03215, 0.045464, 1 | 60.1318
BRIEF | SD=50 | 0.82965, -0.91072, 259.21; 0.2242, -1.16815, 496.688; 0.000466, -0.00234, 1 | 402.0998
AKAZE | SD=540 | -124.165, 115.842, -9112.396; -72.8067, 6.07424, 10343.44; -0.23585, 0.140151, 0.9999 | 13627.9447
Harris+NCC | SD=150 | image registration failed | -



Fig. 3. Graphs (a) to (e) show the FN difference of image1 registered to image2, image3, image4, image5, and image6, respectively, at different levels of Gaussian noise. If the FN difference is greater than 10, image registration is considered failed/difficult.



Fig. 4. Level of Gaussian noise at which image registration fails/becomes difficult for the different feature finding methods (noise added to image1 only).

Fig. 5. Level of Gaussian noise at which image registration fails/becomes difficult for the different feature finding methods (noise added to all images).

Figure 3 shows the level of Gaussian noise at which image registration fails or becomes difficult. Based on Figure 3, the graph in Figure 4 was generated, showing for each method the noise level at which registration fails for the Graffiti dataset. From the measured Frobenius norm (FN) differences and manual inspection of the warped images, registration is considered failed/difficult when the FN difference exceeds a threshold of 10.
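The rule used to derive Figures 4 and 5 from the FN-vs-noise curves can be sketched as a small helper; `failure_level` is an illustrative name, not from the thesis, and the sample values below are made up.

```python
def failure_level(sd_levels, fn_diffs, threshold=10.0):
    """Highest Gaussian-noise standard deviation at which registration still
    succeeds (FN difference <= threshold). Returns None if registration fails
    already at the lowest tested level."""
    ok = [sd for sd, fn in zip(sd_levels, fn_diffs) if fn <= threshold]
    return max(ok) if ok else None

# Hypothetical sweep: registration degrades as the noise level grows
level = failure_level([0, 15, 30, 45], [0.1, 2.0, 8.0, 50.0])  # -> 30
```

Applying this per method and per image pair yields exactly the kind of "last good SD" values tabulated below.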

Data for Fig. 4: level of Gaussian noise (SD) at which image registration fails/becomes difficult (Gaussian noise added to image1 only).

Pair | SURF | SIFT | ORB | BRISK | BRIEF | AKAZE | Harris+NCC
image1 to img2 | 100 | 350 | 370 | 40 | 50 | 540 | 150
image1 to img3 | 100 | 20 | 180 | 10 | 30 | 415 | 120
image1 to img4 | 5 | 5 | 20 | 5 | 5 | 200 | 10
image1 to img5 | 5 | 5 | 5 | 5 | 5 | 10 | 5
image1 to img6 | 5 | 5 | 5 | 5 | 5 | 5 | 5

Data for Fig. 5: level of Gaussian noise (SD) at which image registration fails/becomes difficult (Gaussian noise added to all images).

Pair | SURF | SIFT | ORB | BRISK | BRIEF | AKAZE
img1 to img2 | 130 | 110 | 210 | 30 | 200 | 270
img1 to img3 | 120 | 20 | 140 | 20 | 80 | 260
img1 to img4 | 5 | 5 | 5 | 5 | 5 | 210
img1 to img5 | 5 | 5 | 5 | 5 | 5 | 35
img1 to img6 | 5 | 5 | 5 | 5 | 5 | 5


When the FN difference exceeds the threshold of 10, registration of image1 (the Gaussian-noise image) with image2 to image6 is considered failed/difficult. In Table 2 and Figure 3, Gaussian noise is added to image1 only; key points are then detected and matched by the different methods. In Figure 5, Gaussian noise is added to all images. Based on the Frobenius norm (FN) difference and inspection of the warped image, these graphs show the level at which image registration fails/becomes difficult. AKAZE provides better results than the other methods in Figures 4 and 5.

Structured noise comparison result:

Table 3. Homography transformation matrix comparison (Structured noise in image1)

Structured noise added to sensed image1; FN difference per method:

(a) Image1, structured noise (250 to 350 block size):
Image | AKAZE | SURF | SIFT | ORB | BRISK | BRIEF | Harris+NCC
Img2 | 0.7838 | 0.9459 | 0.8235 | 0.7736 | 1.2644 | 1.1553 | 46.7107
Img3 | 0.3145 | 1.9153 | 9.7172 | 2.4093 | 28.8768 | 20.2316 | 32.8552
Img4 | 2.0739 | 317.842 | 524.027 | 3178.23 | 530.86 | 240.997 | 1431.71
Img5 | 1.288 | 240.258 | 114.395 | 269.348 | 281.312 | 613.213 | 114.352
Img6 | 102.193 | 98.2396 | 103.657 | 63.5861 | 2.3158 | 203.378 | 266.477

(b) Image1, structured noise (250 to 450 block size):
Image | AKAZE | SURF | SIFT | ORB | BRISK | BRIEF | Harris+NCC
Img2 | 0.2216 | 2.3319 | 0.7378 | 0.4206 | 11207.6 | 189.541 | 55.3778
Img3 | 0.4986 | 0.5718 | 318.907 | 2.1557 | 304.431 | 20.2316 | 50.7625
Img4 | 2.845 | 229.115 | 488.517 | 458.236 | 412.129 | 330.561 | 541.799
Img5 | 393.495 | 287.688 | 211.566 | 11342 | 67.6257 | 255.308 | 1344.2
Img6 | 318.274 | 104.061 | 102.542 | 72.5764 | 321.991 | 168.361 | 112.634


(c) Image1, structured noise (150 to 550 block size):
Image | AKAZE | SURF | SIFT | ORB | BRISK | BRIEF | Harris+NCC
Img2 | 0.4384 | 5.4584 | 1.122 | 0.5971 | 340.074 | 775.164 | 26.0983
Img3 | 0.2185 | 6.6906 | 8.3632 | 325.854 | 543.438 | 373.097 | 32.579
Img4 | 0.47 | 229.015 | 487.577 | 450.138 | 610.084 | 625.007 | 1631.41
Img5 | 2562.99 | 234.677 | 171.261 | 142.839 | 52.4374 | 309.954 | 175.637
Img6 | 276.072 | 118.837 | 96.4783 | 265.813 | 285.916 | 397.821 | 1828.01

(d) Image1, structured noise (50 to 650 block size):
Image | AKAZE | SURF | SIFT | ORB | BRISK | BRIEF | Harris+NCC
Img2 | 347.539 | 15.8095 | 0.238 | 361.431 | 685.859 | 369.037 | 96.8614
Img3 | 528.503 | 414.911 | 41.7409 | 13.7578 | 545.314 | 181.262 | 883.395
Img4 | 440.641 | 229.066 | 413.331 | 288.012 | 610.1 | 278.169 | 135.822
Img5 | 544.154 | 285.252 | 112.7 | 317.65 | 51.1359 | 687.727 | 1173.11
Img6 | 394.937 | 8.8291 | 210.328 | 329.039 | 322.82 | 228.883 | 201.462

(e) Image1, Pattern1:
Image | AKAZE | SURF | SIFT | ORB | BRISK | BRIEF | Harris+NCC
Img2 | 0.6321 | 3.5866 | 0.6678 | 0.1389 | 685.933 | 1.5495 | 51.6238
Img3 | 2.7199 | 1.2867 | 262.422 | 208.946 | 546.781 | 27.6354 | 52.7299
Img4 | 5.5264 | 170.508 | 295.138 | 494.753 | 610.028 | 405.463 | 1059.5
Img5 | 73.3148 | 240.232 | 114.458 | 48.0047 | 51.2985 | 351.355 | 31.0364
Img6 | 79.1124 | 275.3 | 84.1031 | 270.062 | 324.449 | 273.328 | 140.08


(f) Image1, Pattern2:
Image | AKAZE | SURF | SIFT | ORB | BRISK | BRIEF | Harris+NCC
Img2 | 0.3023 | 3.8596 | 0.2998 | 1.8367 | 685.798 | 404.097 | 144.388
Img3 | 1.4888 | 287.411 | 47.8186 | 121.2 | 545.929 | 727.553 | 48.9259
Img4 | 502.254 | 94.5823 | 152.026 | 523.04 | 610.092 | 519.576 | 296.484
Img5 | 517.216 | 287.473 | 116.355 | 66.079 | 51.6598 | 914.936 | 487.511
Img6 | 205.682 | 276.918 | 131.841 | 108.145 | 324.451 | 120.973 | 1135.11

(g) Image1, Pattern3:
Image | AKAZE | SURF | SIFT | ORB | BRISK | BRIEF | Harris+NCC
Img2 | 0.857 | 505.692 | 0.862 | 361.448 | 685.831 | 12.1203 | 431.426
Img3 | 4.0476 | 434.835 | 930.069 | 120.811 | 545.324 | 118.978 | 153.373
Img4 | 73.7101 | 380.815 | 258.764 | 405.376 | 610.097 | 504.34 | 1559.2
Img5 | 22.0295 | 287.643 | 111.755 | 314.373 | 51.6134 | 390.933 | 1199.8
Img6 | 194.846 | 120.442 | 111.52 | 107.803 | 324.444 | 402.093 | 124.339

(h) Image1, Pattern4 (periodic stationary):
Image | AKAZE | SURF | SIFT | ORB | BRISK | BRIEF | Harris+NCC
Img2 | 0.5509 | 1.7922 | 0.2072 | 1.4435 | 685.785 | 16.4868 | 51.4254
Img3 | 2.1044 | 107.844 | 71.1856 | 515.775 | 545.314 | 336.637 | 51.934
Img4 | 373.93 | 467.942 | 326.23 | 458.495 | 610.094 | 4.7421 | 1160.45
Img5 | 553.621 | 287.68 | 192.873 | 518.576 | 51.0769 | 248.914 | 341.501
Img6 | 669.678 | 292.199 | 93.3779 | 275.016 | 324.454 | 151.648 | 862.714


Structured noise is added to image1 as in Table 3. AKAZE provides good results compared to the other methods. The graphs show how changing the block size and the pattern affects image registration for each method.

Fig. 6. Image1 (sensed image) with structured noise; image2 to image6 (reference images) without noise. Results generated with AKAZE.

(i) Image1, Pattern5 (periodic stationary):
Image | AKAZE | SURF | SIFT | ORB | BRISK | BRIEF | Harris+NCC
Img2 | 0.7432 | 1.5181 | 0.7716 | 8.4816 | 1.2751 | 44.9296 | 285.387
Img3 | 4.3305 | 109.975 | 1.7098 | 14.6876 | 545.316 | 386.948 | 2319.28
Img4 | 369.24 | 467.994 | 304.019 | 503.417 | 610.095 | 435.031 | 6.9256
Img5 | 547.004 | 287.634 | 121.298 | 546.704 | 51.1745 | 597.891 | 55.1183
Img6 | 214.06 | 291.355 | 282.795 | 265.691 | 324.261 | 128.192 | 30.6119

Data for Fig. 6: FN difference using AKAZE; structured noise added to image1 (sensed image), image2 to image6 (reference images) without noise.

Image | block 250-350 (800x640) | block 250-450 | block 50-550 | block 50-600/650 | pattern1 | pattern2 | pattern3 | pattern4 | pattern5 | block1
img2 | 0.7838 | 0.2216 | 0.4384 | 347.539 | 0.6321 | 0.3023 | 0.857 | 0.5509 | 0.7432 | 1.6217
img3 | 0.3145 | 0.4986 | 0.2185 | 528.503 | 2.7199 | 1.4888 | 4.0476 | 2.1044 | 4.3305 | 1.4579
img4 | 2.0739 | 2.845 | 0.47 | 440.641 | 5.5264 | 502.254 | 73.7101 | 373.93 | 369.24 | 2.4967
img5 | 1.288 | 393.495 | 2562.99 | 544.154 | 73.3148 | 517.216 | 22.0295 | 553.621 | 547.004 | 890.466
img6 | 102.193 | 318.274 | 276.072 | 394.937 | 79.1124 | 205.682 | 194.846 | 669.678 | 214.06 | 315.15


Fig. 7. Image1 (sensed image) without noise; structured noise added to image2 to image6 (reference images). FN comparison results for the AKAZE method.

blocksize-

250 to350(I,j)(

800 x640)

blocksize-

250 to450(I,j)

blocksize-50

to550(I,j)

blocksize-50to

600/650(I,j)

pattern1

pattern2

pattern3

pattern4

pattern5

block1

img2 0.3959 0.6095 4.2098 667.278 0.5772 0.1073 0.111 0.3877 2.1129 0.8321

img3 0.4742 0.7433 44.8584 34.6319 0.1534 0.8284 0.7677 0.0349 0.9056 0.6872

img4 0.7832 0.5641 258.367 571.849 0.6675 483.054 63.9039 5.9698 4.8196 1.1126

img5 364.098 643.399 544.413 547.196 326.554 403.298 3538.24 644.642 205.107 7.6661

img6 40.1158 5.7294 324.774 315.242 193.576 238.003 66.2365 91.4454 93.4409 143.507

0

500

1000

1500

2000

2500

3000

3500

4000

Fro

be

nio

us

no

rm d

iffe

ren

ce

Structured noise is added to img2 to img6(refence images), image1(sense image) is without noise. Results are generated using

AKAZE

Frobenius norm difference with AKAZE when structured noise is added to all images:

Noise setting                      img2      img3      img4      img5      img6
blocksize 250-350 (i,j) (800x640)  0.6652    1.7108    0.2595    150       126.72
blocksize 250-450 (i,j)            0.7139    0.5872    1.4459    374.38    66.444
blocksize 50-550 (i,j)             5.2335    623.33    511.03    159.23    202.49
blocksize 50-600/650 (i,j)         5852.1    156.63    81.893    443.9     337.56
pattern1                           1.9636    0.4409    38.87     310.01    191.32
pattern2                           1.0753    40.141    407.42    157.35    107.44
pattern3                           132.5     514.77    582.15    662.32    70.546
pattern4                           1.2817    0.03      304.16    1097.5    97.385
pattern5                           1.8241    1.3823    495.98    312.64    20.08
block1                             1.1694    0.7444    442.62    427.05    230.12


Fig. 8. Structured noise is added to all images. FN comparison result of the AKAZE method.

Figures 6, 7, and 8 show how structured noise affects the results. Three cases are examined:

1. The sensed image (image1) contains structured noise; the reference images (image2 to image6) are without noise (results in Figure 6).

2. The sensed image (image1) is without noise; the reference images (image2 to image6) contain noise (results in Figure 7).

3. Structured noise is present in all images (results in Figure 8).
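For concreteness, the occlusion creation used in these three cases can be sketched as follows. This is an illustrative NumPy sketch under our own assumptions, not the thesis code: the function names, the default stripe pattern, and all parameter choices are hypothetical.

```python
import numpy as np

def add_block_noise(image, top_left, block_size, pattern=None):
    """Occlude a rectangular block of a grayscale image with structured
    noise. If no pattern is supplied, a periodic stripe pattern is used."""
    noisy = image.copy()
    i, j = top_left
    h, w = block_size
    if pattern is None:
        # Periodic (stationary) stripes: 4 dark rows, then 4 bright rows.
        rows = np.arange(h).reshape(-1, 1)
        pattern = np.broadcast_to(((rows // 4) % 2 * 255).astype(image.dtype),
                                  (h, w))
    noisy[i:i + h, j:j + w] = pattern
    return noisy

def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(image.dtype)
```

Sweeping `sigma` (for Gaussian noise) or the block position, size, and pattern (for structured noise) reproduces the kind of experiment grid reported in the figures.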

Table 4. Homography transformation matrix comparison (structured noise added to all images as a block); the sensed image (Image1) is compared against the reference images (Img2 to Img6).

7. Achievements with respect to objectives

 Multi-view images are registered both in the presence and in the absence of occlusion.

 Gaussian noise and structured noise are added to the images to create occlusion.

 The images may differ in scale, rotation, translation, illumination, noise, etc.

Frobenius norm difference per feature method when a block of structured noise is added in different areas of the sensed and reference images:

              AKAZE     SURF      SIFT      ORB       BRISK     BRIEF     Harris+HoG
img2, block1  1.1694    11.469    0.6667    34.948    572.45    28.37     53.409
img3, block1  0.7444    2.672     3.5597    0.8764    672.61    170.48    83.008
img4, block1  442.62    434.02    423.81    455.35    484.8     317.95    86.274
img5, block1  427.05    410.97    171.69    508.93    435.72    542.67    2722.2
img6, block1  230.12    41.356    141.51    158.93    151.99    298.33    696.49


 The Frobenius norm of the difference between the dataset's ground-truth homography transformation matrix and each method's estimated homography transformation matrix is measured.

 Results are compared for three cases:

o Noise is added to the sensed image (image1); the reference images are without noise.

o Noise is added to the reference images (image2 to image6); the sensed image is without noise.

o Noise is added to both the sensed and the reference images.
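The Frobenius-norm comparison above can be sketched in a few lines of plain Python. One caveat: a homography is only defined up to scale, so the sketch normalizes each matrix so that its bottom-right entry equals 1 before differencing; the synopsis does not spell out its normalization, so that step is our assumption.

```python
import math

def frobenius_diff(H_est, H_ref):
    """Frobenius norm of the difference between two 3x3 homography
    matrices, each scale-normalized so its (3,3) entry equals 1."""
    def normalize(H):
        s = H[2][2]
        return [[v / s for v in row] for row in H]
    A, B = normalize(H_est), normalize(H_ref)
    return math.sqrt(sum((A[i][j] - B[i][j]) ** 2
                         for i in range(3) for j in range(3)))

# Example: identity vs. a pure translation by (3, 4) gives sqrt(3^2 + 4^2) = 5.
```

A small value therefore indicates that the estimated homography is close to the ground truth, while the large values in the tables (hundreds and above) indicate registration failure.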

8. Conclusion

Feature key points and descriptors are found using SURF, SIFT, ORB, FAST+BRIEF, BRISK, Harris+NCC, and AKAZE. From these feature points, corresponding points between the sensed and reference images are found using brute-force/FLANN matching, and RANSAC is then used to estimate the homography transformation matrix from the inlier points.

Gaussian and structured noise are added to the images. For Gaussian noise, the standard deviation is varied, and the results indicate at what noise level image registration becomes difficult or fails. For structured noise, blocks of different sizes are added to the images and different patterns are generated. AKAZE gives better results in the presence of Gaussian/structured noise than the other feature detection methods.

9. Copies of papers published and a list of all publications arising from the thesis

The publications arising from this work are listed below.

1. Darshana Mistry, Asim Banerjee, “Review: Image Registration”, International Journal of Graphics & Image Processing, Volume 2, Issue 1, pp. 18-22, February 2012.

2. Darshana Mistry, Asim Banerjee, “Image Similarity based on Joint Histogram”, International Conference on Advances in Engineering and Technology, SKN Sinhghad Institute of Technology and Science, 2013.

3. Darshana Mistry, Asim Banerjee, Aditya Tatu, “Image Similarity based on Intensity using Mutual Information”, International Journal of Computer Science and Engineering Research and Development, Volume 3, Issue 2, pp. 1-8, 2013.

4. Darshana Mistry, Asim Banerjee, “Comparison of Feature Detection and Matching Approaches: SIFT and SURF”, Global Research and Development Journal of Engineering, Volume 2, Issue 4, pp. 7-13, 2017.

10. References

Barbara Zitova, Jan Flusser, “Image Registration Methods: A Survey”, Image and Vision Computing, Volume 21, pp. 977-1000, 2003.

Michael Linger, Ardeshir Goshtasby, “Aerial Image Registration for Tracking”, IEEE Transactions on Geoscience and Remote Sensing, Volume 53, Issue 4, pp. 2137-2145, April 2015.

M. V. Wyavahare, P. M. Patil, H. K. Abhyankar, “Image Registration Techniques: An Overview”, International Journal of Signal Processing, Image Processing and Pattern Recognition, Volume 2, No. 3, 2009.

Hui Lin, Peijun Du, Weichang Zhao, Lianpeng Zhang, Huasheng Sun, “Image Registration Based on Corner Detection and Affine Transformation”, International Congress on Image and Signal Processing (CISP), pp. 2184-2188, 2010.

David Lowe, “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, Volume 60, Issue 2, pp. 91-110, 2004.

Herbert Bay, Andreas Ess, Tinne Tuytelaars, Luc Van Gool, “Speeded-Up Robust Features (SURF)”, Computer Vision and Image Understanding, Volume 110, Issue 3, pp. 346-359, 2008.

Herbert Bay, Tinne Tuytelaars, Luc Van Gool, “SURF: Speeded Up Robust Features”, Computer Vision - ECCV 2006, pp. 404-417, 2006.

Pablo F. Alcantarilla, Adrien Bartoli, Andrew J. Davison, “KAZE Features”, European Conference on Computer Vision (ECCV) 2012, Part VI, pp. 214-227, 2012.

Pablo F. Alcantarilla, Jesus Nuevo, Adrien Bartoli, “Fast Explicit Diffusion for Accelerated Features in Nonlinear Scale Spaces”, British Machine Vision Conference (BMVC), September 2013.

Rublee, E., Rabaud, V., Konolige, K., Bradski, G., “ORB: An Efficient Alternative to SIFT or SURF”, IEEE International Conference on Computer Vision (ICCV), pp. 2564-2571, 2011.

Leutenegger, S., Chli, M., Siegwart, R. Y., “BRISK: Binary Robust Invariant Scalable Keypoints”, IEEE International Conference on Computer Vision (ICCV), pp. 2548-2555, 2011.

Calonder, M., Lepetit, V., Strecha, C., Fua, P., “BRIEF: Binary Robust Independent Elementary Features”, Computer Vision - ECCV 2010, pp. 778-792, 2010.

Luo Juan, Oubong Gwun, “A Comparison of SIFT, PCA-SIFT and SURF”, International Journal of Image Processing (IJIP), Volume 3, Issue 4, pp. 143-152, 2009.

Jian Wu, Zhiming Cui, Victor S. Sheng, Pengpeng Zhao, Dongliang Su, Shengrong Gong, “A Comparative Study of SIFT and its Variants”, Measurement Science Review, Volume 13, Issue 3, pp. 122-131, 2013.

Rosten, E., Drummond, T., “Machine Learning for High-Speed Corner Detection”, Computer Vision - ECCV 2006, pp. 430-443, 2006.

“Difference between SIFT and SURF”, https://www.quora.com/Image-Processing/Difference-between-SURF-and-SIFT-where-and-when-to-use-this-algo, accessed on 23/11/2015.


Utsav Shah, Darshana Mistry, Asim Banerjee, “Image Registration of Multi-View Satellite Images Using Best Feature Points Detection and Matching Methods from SURF, SIFT and PCA-SIFT”, Journal of Emerging Technologies and Innovative Research, Volume 1, Issue 1, pp. 8-18, 2014.

HKUST video data, https://www.youtube.com/watch=OOUOPnLbjkI, accessed on 6/5/15.

Multi-View Image Dataset, http://www.robots.ox.ac.uk/~vgg/data/data-aff.html, accessed on 7/7/16.

Z. Zhang, G. Medioni, S. B. Kang, “Camera Calibration”, Chapter 2 in Emerging Topics in Computer Vision, Prentice Hall Professional Technical Reference, pp. 4-43, 2004.

C. Shu, “Geometric Model of Camera”, COMP 4900C, Winter 2008.

C. Shu, “Epipolar Geometry”, COMP 4900C, Winter 2008.

E. Malis, M. Vargas, “Deeper Understanding of the Homography Decomposition for Vision-Based Control”, Research Report (RR) 6303, INRIA, 2007.

“Decomposition of Homography Transformation”, https://gist.github.com/inspirit/740979

“Camera Calibration”, https://prateekvjoshi.com/2014/05/31/understanding-camera-calibration/

B. Frank, C. Stachniss, G. Grisetti, K. Arras, W. Burgard, “Robotics 2: Camera Calibration”.

E. Zatepyakin, “Decompose Homography into Rotation Matrix & Translation Vector”, https://gist.github.com/inspirit/740979

“Homography”, Chapter 5, http://shodhganga.inflibnet.ac.in/bitstream/10603/28874/11/11_chapter%205.pdf

David Kriegman, “Homography Estimation”, Computer Vision I, CSE 252A, Winter 2007.

Darshana Mistry, Asim Banerjee, “Review: Image Registration”, International Journal of Graphics and Image Processing, Volume 2, Issue 1, February 2012.

Ruhina Karani, Tanuja Sarode, “Image Registration using Discrete Cosine Transformation and Normalized Cross Correlation”, International Conference & Workshop on Recent Trends in Technology, 2012.

Manjusha Deshmukh, Udhav Bhosle, “A Survey of Image Registration”, International Journal of Image Processing (IJIP), Volume 5, Issue 3, pp. 245-269, 2011.

Taejung Kim, Yong-Jo Im, “Automatic Satellite Image Registration by Combination of Matching and Random Sample Consensus”, IEEE Transactions on Geoscience and Remote Sensing, Volume 41, Issue 5, pp. 1111-1117, May 2003.

H. B. Kekre, Tanuja Sarode, Ruhina Karani, “2D Satellite Image Registration Using Transform Based and Correlation Based Methods”, International Journal of Advanced Computer Science and Applications, Volume 3, Issue 5, pp. 66-72, 2011.

Rong Zhang, “Automatic Computation of a Homography by RANSAC Algorithm”, ECE661 Computer Vision, Homework 4.

Konstantinos G. Derpanis, “Overview of the RANSAC Algorithm”, 2010.

Paul Heckbert, “Fundamentals of Texture Mapping and Image Warping”, Master's thesis, University of California, Berkeley, 1989.

“Homography Transformation”, http://www.corrmap.com/features/homography_transformation.php, accessed on 28th September 2015.


“Difference between Fundamental, Essential and Homography Matrices”, http://stackoverflow.com/questions/16088301/difference-between-fundamental-essential-and-homography-matrices, accessed on 16th March 2017.

“Frobenius Norm”, http://mathworld.wolfram.com/FrobeniusNorm.html