IMPLEMENTATION OF DIAMOND SEARCH (DS) ALGORITHM FOR
MOTION ESTIMATION USING MATLAB SOFTWARE
SIOW CHAN HOEL
This report is submitted in partial fulfillment of the requirements for the award of
Bachelor of Electronic Engineering (Telecommunication Electronics) With Honours
Faculty of Electronic and Computer Engineering
Universiti Teknikal Malaysia Melaka
APRIL 2010
UNIVERSITI TEKNIKAL MALAYSIA MELAKA
FAKULTI KEJURUTERAAN ELEKTRONIK DAN KEJURUTERAAN KOMPUTER
REPORT STATUS VERIFICATION FORM
BACHELOR'S DEGREE PROJECT II
Project Title : Implementation of Diamond Search (DS) Algorithm for
Motion Estimation using MATLAB software
Academic Session : 2009/2010
I, SIOW CHAN HOEL, agree to allow this Bachelor's Degree Project Report to be kept in the
Library subject to the following conditions of use:
1. The report is the property of Universiti Teknikal Malaysia Melaka.
2. The Library is permitted to make copies for study purposes only.
3. The Library is permitted to make copies of this report as exchange material between
institutions of higher learning.
4. Please tick ( √ ) :
CONFIDENTIAL*
(Contains information of a security classification or of importance to
Malaysia as set out in the OFFICIAL SECRETS ACT 1972)
RESTRICTED*
(Contains restricted information as determined by the
organisation/body where the research was carried out)
UNRESTRICTED
Verified by:
…………………………………………
(AUTHOR'S SIGNATURE)
…………………………………………
(SUPERVISOR'S STAMP AND SIGNATURE)
Permanent Address: No. 8,
Jalan Udang Gantung 8,
Taman Megah Kepong
52100 K.L.
Date: April 2010 Date: April 2010
“I hereby declare that this report is the result of my own work except for quotes as
cited in the references.”
Signature :
Author :
Date :
“I hereby declare that I have read this report and in my opinion this report is
sufficient in terms of scope and quality for the award of Bachelor of Electronic
Engineering (Computer Engineering) With Honours.”
Signature :
Supervisor’s Name :
Date :
ACKNOWLEDGEMENT
I would like to express my gratitude and thanks to my supervisor, Mr. Redzuan
Bin Abdul Manap, for his support and patience throughout the duration of the project.
His encouragement and guidance are truly appreciated; without them, this project
would not have been possible. I have learnt a lot under his guidance. In addition, I
would like to thank my friends Kong Khee Kien and Wong Cheong Lun, who always
helped me when I faced problems in this project. I am also grateful to all my
friends who helped me and gave me their opinions during the implementation of this project.
ABSTRACT
To achieve a high compression ratio in video coding, a technique known as
Block Matching Motion Estimation has been widely adopted in various coding
standards. This technique is conventionally implemented by exhaustively testing
all the candidate blocks within the search window. This type of implementation,
called the Full Search (FS) algorithm, gives the optimum solution. However, it
requires a substantial amount of computation. To overcome this drawback, many
fast Block Matching Algorithms (BMAs) have been proposed and developed. These
algorithms exploit different search patterns and strategies in order to find the
optimum motion vector with a minimal number of search points. The objectives of
this project are to develop and implement the Diamond Search (DS) algorithm in
MATLAB and to compare the results obtained with those of the FS algorithm and
other common fast BMAs. Finally, a functional MATLAB program code is produced.
ABSTRAK
This project aims to achieve a high video compression ratio in video coding.
A technique known as Block Matching Motion Estimation has conventionally used
an algorithm known as Full Search to test every block within the search window.
Although this algorithm gives optimal video quality, it creates a heavy
processing load and consequently slows down the processing. To overcome this
processing problem, many algorithms have been studied and developed. Various
patterns and strategies have been exploited in these algorithms to find the
optimal motion vector with a minimum number of search points. The main aim of
this project is to develop and implement the Diamond Search algorithm in
MATLAB. The performance of the algorithm is analysed and compared with that of
the Full Search algorithm and other algorithms. Finally, a functional program
code is produced.
TABLE OF CONTENTS
CHAPTER TITLE PAGE
ACKNOWLEDGEMENT vi
ABSTRACT vii
ABSTRAK viii
TABLE OF CONTENTS ix
LIST OF TABLES xii
LIST OF FIGURES xiii
LIST OF ABBREVIATION xv
LIST OF APPENDIX xvii
I INTRODUCTION
1.1 PROJECT INTRODUCTION 1
1.2 PROBLEM STATEMENTS 2
1.3 OBJECTIVES 2
1.4 SCOPES OF WORK 3
1.5 THESIS STRUCTURE 3
II LITERATURE REVIEW
2.1 VIDEO CODING 5
2.2 MOTION ESTIMATION 6
2.3 BLOCK MATCHING ALGORITHM (BMA) 7
2.3.1 DIAMOND SEARCH ALGORITHM 10
2.3.2 FOUR STEP SEARCH ALGORITHM 13
2.3.3 NEW THREE STEP SEARCH ALGORITHM 17
2.4 VIDEO SEQUENCE FORMAT 19
III METHODOLOGY
3.1 METHODOLOGY 20
3.1.1 PROJECT PLANNING 20
3.1.2 LITERATURE REVIEW 21
3.1.3 VIDEO UPLOADING USING MATLAB 21
3.1.4 FRAME EXTRACTION 22
3.1.5 BLOCK CONSTRUCTION 22
3.1.6 IMPLEMENTATION OF DIAMOND SEARCH ALGORITHM 23
3.1.7 RECONSTRUCTION OF PREDICTED FRAME 24
3.1.8 PERFORMANCE ANALYSIS 25
3.1.8.1 PERFORMANCE ANALYSIS PARAMETER 25
3.1.9 PRESENTATION OF RESULT 26
IV RESULT AND DISCUSSION
4.1 INTRODUCTION 27
4.2 FIRST STAGE RESULT 28
4.3 SECOND STAGE RESULT 29
4.4 THIRD STAGE RESULTS 37
4.5 PREDICTED FRAME 45
4.6 DISCUSSION 47
LIST OF TABLES
NO TITLE PAGE
3.1 Video sequence being used 23
4.1 Average points for single frame simulation 28
4.2 Average PSNR for single frame simulation 28
4.3 Elapsed time for single frame simulation 29
4.4 Average points for 30 frames simulation 29
4.5 Average PSNR for 30 frames simulation 30
4.6 Elapsed time for 30 frames simulation 30
4.7 Average points for 100 frames simulation 37
4.8 Average PSNR for 100 frames simulation 37
4.9 Elapsed time for 100 frames simulation 38
LIST OF FIGURES
NO TITLE PAGE
2.1 Block based motion estimation 9
2.2 An appropriate search pattern support: circular area with a radius of 2 pixels 10
2.3 LDSP 10
2.4 SDSP 10
2.5 Three possible cases 11
2.6 Search path example 12
2.7 Search patterns of the FSS. 14
2.8 Two large motion search paths of four step search algorithm. 15
2.9 Two small search paths of four step search algorithm. 16
2.10 Example of search pattern of NTSS 18
3.1 Flow chart of Diamond Search algorithm 24
4.1 Average Point for Akiyo Video (30 frames) 31
4.2 Average PSNR for Akiyo Video (30 frames) 31
4.3 Average Point for Salesman Video (30 frames) 32
4.4 Average PSNR for Salesman Video (30 frames) 32
4.5 Average Point for Foreman Video (30 frames) 33
4.6 Average PSNR for Foreman Video (30 frames) 33
4.7 Average Point for Coastguard Video (30 frames) 34
4.8 Average PSNR for Coastguard Video (30 frames) 34
4.9 Average Point for News Video (30 frames) 35
4.10 Average PSNR for News Video (30 frames) 35
4.11 Average Point for Tennis Video (30 frames) 36
4.12 Average PSNR for Tennis Video (30 frames) 36
4.13 Average Point for Akiyo Video (100 frames) 38
4.14 Average PSNR for Akiyo Video (100 frames) 39
4.15 Average Point for Salesman Video (100 frames) 39
4.16 Average PSNR for Salesman Video (100 frames) 40
4.17 Average Point for Foreman Video (100 frames) 40
4.18 Average PSNR for Foreman Video (100 frames) 41
4.19 Average Point for Coastguard Video (100 frames) 41
4.20 Average PSNR for Coastguard Video (100 frames) 42
4.21 Average Point for News Video (100 frames) 42
4.22 Average PSNR for News Video (100 frames) 43
4.23 Average Point for Tennis Video (100 frames) 43
4.24 Average PSNR for Tennis Video (100 frames) 44
4.25 Original image 45
4.26 FS predicted frame 45
4.27 TSS predicted frame 45
4.28 FSS predicted frame 45
4.29 NTSS predicted frame 46
4.30 DS predicted frame 46
LIST OF ABBREVIATION
BDM - Block Distortion Measure
BMA - Block Matching Algorithm
CIF - Common Intermediate Format
DS - Diamond Search
FS - Full Search
FSS - Four Step Search
LDSP - Large Diamond Search Pattern
MAD - Mean Absolute Difference
MATLAB - Matrix Laboratory
MBD - Minimum Block Distortion
ME - Motion Estimation
MPEG - Moving Picture Experts Group
MSE - Mean Squared Error
NTSS - New Three Step Search
PSNR - Peak Signal-To-Noise Ratio
QCIF - Quarter Common Intermediate Format
SDSP - Small Diamond Search Pattern
CHAPTER I
INTRODUCTION
1.1 PROJECT INTRODUCTION
A technique known as Block Matching Motion Estimation has been widely
adopted in various coding standards to achieve a high compression ratio in video
coding. This technique is conventionally implemented by exhaustively testing all the
candidate blocks within the search window. This type of implementation, called the
Full Search (FS) algorithm, gives the optimum solution. However, it requires a
substantial amount of computation. To overcome this drawback, many fast Block
Matching Algorithms (BMAs) have been proposed and developed. These algorithms
exploit different search patterns and strategies in order to find the optimum motion
vector with a minimal number of search points.
One of these fast BMAs, which is to be implemented in this project, is the
Diamond Search (DS) algorithm. The student is required to implement the
algorithm in MATLAB and then compare its performance with that of the FS
algorithm and other fast BMAs in terms of peak signal-to-noise ratio (PSNR),
number of required search points and computational complexity.
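As an illustration of the PSNR measure named above, the following minimal sketch computes it from the mean squared error between an original frame and a predicted frame. It is written in Python rather than MATLAB purely for illustration. The function name `psnr`, the nested-list frame representation and the 8-bit peak value of 255 are assumptions and are not taken from the project code.

```python
import math

def psnr(original, predicted, peak=255):
    """Return the PSNR in dB between two equally sized greyscale frames.

    Frames are nested lists of pixel values; peak=255 assumes 8-bit video.
    """
    rows, cols = len(original), len(original[0])
    # Mean squared error (MSE) over every pixel position.
    mse = sum(
        (original[r][c] - predicted[r][c]) ** 2
        for r in range(rows)
        for c in range(cols)
    ) / (rows * cols)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / mse)

# Hypothetical 2 x 2 frames that differ by one grey level in two pixels.
original = [[100, 100], [100, 100]]
predicted = [[101, 99], [100, 100]]
print(round(psnr(original, predicted), 2))
```

A higher PSNR means that the predicted frame is closer to the original, which is why it can serve as the quality measure when comparing search algorithms.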
1.2 PROBLEM STATEMENTS
In recent years, several video compression standards have been proposed for
different applications, such as CCITT H.261, MPEG-1 and MPEG-2. Video data
generally constitutes most multimedia data, so efficient coding of video is
important for the effective use of limited bandwidth and storage. Temporal
correlation between successive image frames enables a high degree of compression,
and motion estimation is an important tool for exploiting this correlation. Block-
based motion estimation with non-overlapping rectangular blocks is used in many
video coding standards. In this approach, each image frame is divided into
non-overlapping blocks and, for each block, the best match is searched for within a
pre-defined search range by testing all possible positions.
Although this FS method provides optimal quality, it suffers from a heavy
computational load. The FS method matches every possible displaced candidate
block within the search area of the reference frame in order to find the block with
minimum distortion. The FS algorithm therefore has to evaluate a large number of
search points for each block, and the computation may become too complex.
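The exhaustive search described above can be sketched as follows. This is a minimal illustration in Python, not the project's MATLAB code. The names `sad` and `full_search`, the use of the sum of absolute differences as the distortion measure, and the convention that the motion vector points from the block in the current frame to its match in the reference frame are all assumptions made for the example.

```python
def sad(cur, ref, cy, cx, ry, rx, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cy, cx) and the n x n block of `ref` at (ry, rx)."""
    return sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(n)
        for j in range(n)
    )

def full_search(cur, ref, by, bx, n, p):
    """Test every displacement in [-p, p] x [-p, p] for the block at (by, bx).

    Returns (motion vector, minimum cost, number of points checked).
    """
    h, w = len(ref), len(ref[0])
    best_mv, best_cost, points = (0, 0), None, 0
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            ry, rx = by + dy, bx + dx
            # Skip candidates that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + n > h or rx + n > w:
                continue
            cost = sad(cur, ref, by, bx, ry, rx, n)
            points += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost, points

# Hypothetical 8 x 8 frames: a 2 x 2 object moves down 1 and right 2 pixels.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
patch = [[10, 20], [30, 40]]
for i in range(2):
    for j in range(2):
        ref[2 + i][2 + j] = patch[i][j]  # position in the previous frame
        cur[3 + i][4 + j] = patch[i][j]  # position in the current frame

mv, cost, points = full_search(cur, ref, by=3, bx=4, n=2, p=2)
print(mv, cost, points)
```

With a search range of p pixels, an interior block requires (2p + 1)^2 distortion evaluations, so 25 points are checked here for p = 2. This count grows quadratically with p, which is the computational load that fast BMAs aim to reduce.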
1.3 OBJECTIVES
The main objective of this project is to implement one of the fast BMAs,
namely the DS algorithm, to overcome the problem encountered with the FS algorithm.
In addition, the project aims:
a) To develop and implement the DS algorithm in MATLAB.
b) To compare and analyze the performance of the DS algorithm against the FS
algorithm as well as other common fast BMAs.
c) To produce a functional MATLAB program code.
1.4 SCOPES OF WORK
The scope of work in this project is:
a) Data and theory acquisition on image processing, motion estimation, BMAs
and the Diamond Search algorithm.
b) Implementation of the DS algorithm in MATLAB.
c) Performance comparison of the algorithm with other available BMAs.
1.5 THESIS STRUCTURE
Chapter 1 Introduction
This chapter gives a general description of the project idea, clarifies the scope of
the project, and reviews the problem statement that motivates the project and hence
its objectives.
Chapter 2 Literature Review
This chapter covers the study of the conventional video coding algorithm and of
the algorithm used in this project. The algorithms described are the Full Search
algorithm, the Diamond Search algorithm, the New Three Step Search algorithm
and the Four Step Search algorithm.
Chapter 3 Methodology
This chapter presents the project planning. The project is divided into nine steps,
and each step is described.
Chapter 4 Result and Discussion
This chapter presents the results obtained and discusses them. The results are
analyzed and then compared with the results of other algorithms.
Chapter 5 Conclusion and Suggestion
This chapter gives an overall comment on the project and suggestions for
improving it.
CHAPTER II
LITERATURE REVIEW
2.1 VIDEO CODING
Video compression is the reduction of the amount of data used to carry visual
images. In video transmission, the important requirement is that the video is
transmitted quickly while its quality remains good.
Video is a sequence of images that are played at a given rate. Between two
consecutive frames, many pixels may remain unchanged; these pixels are redundant
and can be eliminated to allow faster data transmission. By identifying the pixel
differences between the two frames, the video can be reconstructed at the receiver
when the transmitter sends only the differences.
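The idea of sending only the differences can be illustrated with a short sketch. It is written in Python for illustration only; the function names `frame_difference` and `reconstruct` and the small example frames are hypothetical.

```python
def frame_difference(previous, current):
    """Pixel-wise difference that the transmitter would send."""
    return [
        [c - p for p, c in zip(prow, crow)]
        for prow, crow in zip(previous, current)
    ]

def reconstruct(previous, difference):
    """Receiver side: rebuild the current frame from the previous frame."""
    return [
        [p + d for p, d in zip(prow, drow)]
        for prow, drow in zip(previous, difference)
    ]

# Hypothetical 2 x 3 frames in which a single bright pixel moves one column.
previous = [[10, 10, 10], [10, 50, 10]]
current = [[10, 10, 10], [10, 10, 50]]

diff = frame_difference(previous, current)
changed = sum(1 for row in diff for d in row if d != 0)
print(diff, changed)
```

Only two of the six difference values are non-zero in this example, so most of the frame carries no new information and can be coded very cheaply.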
Nowadays, most video is digital. File size is an important concern
because digital video files tend to take up a lot of storage space on the hard drive.
Compressing the video makes it easier to store.
Digital video can be compressed without affecting the perceived quality of the
final product because compression affects only the parts of the video that humans
may not really detect.
Compressed video effectively reduces the bandwidth required, so its
applications include video transmission via terrestrial broadcast, cable TV
or satellite TV services [1].
Video compression typically operates on square-shaped groups of neighboring
pixels, often called macroblocks. These blocks of pixels are compared from one
frame to the next, and the video compression codec (encode/decode scheme) sends
only the differences within those blocks.
2.2 MOTION ESTIMATION
Motion estimation is the process of determining motion vectors that describe
the transformation from one two-dimensional image to another, usually between
adjacent frames in a video sequence. The idea of motion-estimation-based video
compression is to save bits by sending encoded difference images, which
inherently have less energy and can be compressed more highly than a full frame.
The motion in the current frame is estimated with respect to a previous frame.
In video compression, motion information is used to find the best-matching block
in the reference frame so that a low-energy residue can be calculated.
The aim is to obtain motion vectors, which may relate to the whole image or to
specific parts of it, such as rectangular blocks, arbitrarily shaped patches or even
individual pixels. This technique eliminates the temporal redundancy that arises
from the high correlation between consecutive frames.
Motion estimation analyzes previous and next frames to identify blocks that
have not changed or have moved. The motion estimation module creates a model of
the current frame by modifying the reference frames so that the model closely
matches the current frame. This estimated current frame is then used for motion
compensation, and the compensated residual image is encoded and transmitted.
Applications of motion estimation include scan rate conversion, where it is used
to generate temporally interpolated frames, as well as motion-compensated
de-interlacing, video stabilization and motion tracking.
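The following minimal sketch illustrates how a predicted frame can be built from a reference frame and one motion vector per block, and how the energy of the residual drops when the motion is compensated. It is written in Python for illustration only. The function names, the dictionary of motion vectors keyed by block origin, and the 4 x 4 example frames are assumptions and do not come from the project code.

```python
def motion_compensate(ref, vectors, n):
    """Build a predicted frame by copying, for each n x n block, the block of
    `ref` displaced by that block's motion vector.

    `vectors` maps a block's top-left corner (by, bx) to its vector (dy, dx).
    """
    h, w = len(ref), len(ref[0])
    predicted = [[0] * w for _ in range(h)]
    for (by, bx), (dy, dx) in vectors.items():
        for i in range(n):
            for j in range(n):
                predicted[by + i][bx + j] = ref[by + dy + i][bx + dx + j]
    return predicted

def residual_energy(cur, predicted):
    """Sum of squared differences between a frame and its prediction."""
    return sum(
        (c - p) ** 2
        for crow, prow in zip(cur, predicted)
        for c, p in zip(crow, prow)
    )

# Hypothetical frames: the top-left block of `cur` matches `ref` one column over.
ref = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
cur = [[2, 3, 3, 4], [6, 7, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
vectors = {(0, 0): (0, 1), (0, 2): (0, 0), (2, 0): (0, 0), (2, 2): (0, 0)}

predicted = motion_compensate(ref, vectors, n=2)
print(residual_energy(cur, ref), residual_energy(cur, predicted))
```

In this example the residual energy falls from 4 (plain frame difference) to 0 (motion-compensated difference), which is the sense in which the compensated residual is cheaper to encode.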
2.3 BLOCK MATCHING ALGORITHM (BMA)
Successive video frames may contain the same objects (stationary or moving).
Motion estimation examines the movement of objects in an image sequence to
obtain motion vectors representing the estimated motion. Motion compensation uses
the knowledge of object motion so obtained to achieve data compression.
In real video scenes, motion can be a complex combination of translation and
rotation. Such motion is difficult to estimate and may require a large amount of
processing. Translational motion, however, is easily estimated and has been used
successfully for motion-compensated coding.
Block-matching estimation assumes that objects are rigid and move
translationally for at least a few frames, and it neglects the occlusion of one
object by another and the uncovering of background.
BMA is a block-based search technique. The idea behind it is to divide the
current frame into a matrix of macroblocks, each of which is then compared with
the corresponding block and its adjacent neighbors in the previous frame to create
a vector that describes the movement of the macroblock from one location to
another in the previous frame.
This movement, calculated for all the macroblocks that make up a frame,
constitutes the motion estimated in the current frame. A search window is placed
around each of these equally sized blocks, and the displacement of the
best-matched block in the previous frame is taken as the motion vector of the
block in the current frame. The macroblock is usually taken as a square with a
side of 16 pixels.
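The division of a frame into 16 x 16 macroblocks and the comparison of two blocks can be sketched as follows. The sketch is in Python for illustration only. The function names are hypothetical, the mean absolute difference (MAD) is used as one possible block distortion measure, and the frame size of 176 x 144 pixels is the standard QCIF luminance resolution, assumed here as an example.

```python
def macroblock_origins(height, width, n=16):
    """Top-left corners of the non-overlapping n x n blocks that fit in a
    frame of the given size."""
    return [
        (y, x)
        for y in range(0, height - n + 1, n)
        for x in range(0, width - n + 1, n)
    ]

def mad(cur, ref, cy, cx, ry, rx, n=16):
    """Mean absolute difference between the n x n block of `cur` at (cy, cx)
    and the n x n block of `ref` at (ry, rx)."""
    total = sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(n)
        for j in range(n)
    )
    return total / (n * n)

# A QCIF frame (144 rows x 176 columns) splits into 9 x 11 macroblocks.
origins = macroblock_origins(144, 176)
print(len(origins))

# Hypothetical 2 x 2 blocks, compared with n=2 to keep the example small.
cur_small = [[1, 2], [3, 4]]
ref_small = [[1, 1], [1, 1]]
print(mad(cur_small, ref_small, 0, 0, 0, 0, n=2))
```

A block matching algorithm evaluates a measure of this kind at each candidate position and keeps the displacement with the smallest value as the motion vector of the macroblock.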