SURVEILLANCE WITHIN THE DEPARTMENT
THROUGH
IMAGE PROCESSING
Ahmad IJAZ AMIN
&
Ufuk INCE
Undergraduate Project Report
submitted in partial fulfillment of
the requirements for the
degree of
Bachelor of Science (B.S.)
In
Electrical and Electronic Engineering Department
Eastern Mediterranean University
T.R.N.C
June 2008
Approval of the Electrical and Electronic Engineering Department
______________________________
Assoc. Prof. Dr. Aykut HOCANIN
Chairman
This is to certify that we have read this thesis and that in our opinion it is fully adequate, in
scope and quality, as an Undergraduate Project.
Assoc. Prof. Dr. Erhan INCE
_________________________________ ______________________________
…….…………………….. …….……………………..
Co-Supervisor Supervisor
Members of the examining committee
Name Signature
1. Assoc. Prof. Dr. Erhan INCE …………………………………….
2. Asst. Prof. Dr. Hasan DEMIREL …………………………………….
3. Prof. Dr. Dervis DENIZ ..…………………………………...
4. Assoc. Prof. Dr. Mustafa UYGUROGLU …………………………………….
5. Assoc. Prof. Dr. Huseyin OZKARAMANLI ….…………………………………
Date: …….……………………………..
SURVEILLANCE WITHIN THE DEPARTMENT
THROUGH
IMAGE PROCESSING
By
Ahmad IJAZ AMIN
&
Ufuk INCE
Electrical and Electronic Engineering Department
Eastern Mediterranean University
Supervisor: Assoc. Prof. Dr. Erhan INCE
Keywords: Background Subtraction, Foreground Objects, Dilation, Connected
component analysis, Connectivity, Median Filtering, Structuring Elements
ABSTRACT
This project was carried out for our department's security. The classroom section on the second floor will be monitored; a special wireless camera was bought from the USA especially for this project. Video acquired from the camera will be transferred to the security workstation, where the frames extracted from the video sequence will be processed to grab those images in which a person has been detected in our region of interest at unusual times. Background subtraction and connected component analysis were the main techniques used in our project. After retrieving the coordinates of the person's location, our program will examine whether the person is inside or outside our region of interest; if the person has been located inside our region of interest, then the image is written to another file by the program. Later, this video file can be viewed to see only the detected persons rather than viewing the whole video. The same process is also used for detecting multiple persons. The report concludes with the results of our image processing techniques, which show the effectiveness of our project.
ACKNOWLEDGMENTS
We would like to deeply thank our Project Supervisor, Assoc. Prof. Dr. Erhan INCE, for his full support and feedback throughout this project. He provided us with all the useful assistance. Special thanks to the Chair of the department, Assoc. Prof. Dr. Aykut HOCANIN, who arranged the necessary reimbursement for our wireless camera; without his permission we would not have been able to install the camera in the department. Thanks to all those people who discussed our project with us and helped us in making the movie file on which we worked. Special appreciation also goes to Ali CAKMAK for his quick service in mounting the camera on the wall according to our plan.
TABLE OF CONTENTS
LIST OF FIGURES ………………………………………………………………...…...v
INTRODUCTION …………………………………………………………………..…..1
MATLAB WITH IMAGE ACQUISITION DEVICE ...................................................2
Camera Used in our project ………………………………………………….....3
Approach used in our Project …………………………………………………..4
MATLAB ……………………………………………………………………...…4
Assumptions ..........................................................................................................4
METHODOLOGY …………………………………………………………………...…5
Median Filtering …………………………………………………………..……..6
Background Subtraction ………………………………………………………..7
Image Formats ………………………………………………………………..…………8
Truecolor image ………………………………………………….……….8
Grayscale image …………………………………………………….…….9
Dilation ……………………………………………………………………….…10
Understanding Structuring Elements ……………………………………11
Morphological dilation of binary image …………………………...……11
Morphological dilation of grayscale image ……………………………..12
Processing pixels at image borders (Padding Behavior) ………….……..12
Connected Component Analysis ………………………………………………………13
Equivalence class resolution ………………………………….…………16
Connectivity ……………………………………………..………………17
RESULTS ………………………………………………………………...…………….23
CONCLUSION …………………………………………………………...……………29
REFERENCES ................................................................................................................30
LIST OF FIGURES
Figure 1.1 Matlab interfacing .............................................................................................2
Figure 2 Wireless Camera used in our project ..................................................................3
Figure 3.1 Noise added to Background Image ...................................................................6
Figure 3.2 Noise removed through Median Filtering .........................................................6
Figure 4.1 Background Image ……………………………………………………………8
Figure 4.2 Background Image with Person ........................................................................8
Figure 4.3 Result of Background Subtraction ……………………………………………8
Figure 5.1 Color image ………………………………………………………………….10
Figure 5.2 Gray image …………………………………………….…………………….10
Figure 6 Morphological dilation of binary image ………………………………………11
Figure 7 Morphological dilation of grayscale image …………………………………...12
Figure 8.1 4-connected .....................................................................................................12
Figure 8.2 8-connected…………………………...………………………………………12
Figure 9.1 Input image matrix …………………………………………………………..17
Figure 9.2 4 connected labeled image …………………………………………………..18
Figure 9.3 8 connected labeled image …………………………………………………..18
Figure 10.1 Showing 4 connectivity …………………………………………………….19
Figure 10.2 Showing 8 connectivity …………………………………………………….19
Figure 11.1 Binary image ……………………………………………………………….21
Figure 11.2 Depicting coordinates ……………………………………………………...21
Figure 12.1 Background Image ……………………………….………………………………....23
Figure 12.2 Current Image ………………………………………...………….…………...23
Figure 13.1 Background image with person …………………………….……………..…...24
Figure 13.2 Result of Background Subtraction ……………………………………………...…..24
Figure 14.1 Grayscale Image (1) ………………………..…………………………………25
Figure 14.2 Image before Dilation (1) ………………………………………………...……25
Figure 14.3 Image after Dilation (1) …………………………………………..…………...25
Figure 15.1 Grayscale Image (2)…………………………………………………………………26
Figure 15.2 Image before Dilation (2) …………………………………………...…………26
Figure 15.3 Image after Dilation (2) ………………...……………………………………..26
Figure 16.1 Single person Boxed inside our region ……………………………….………...27
Figure 16.2 Multiple persons Boxed inside our region ………………………………….…..27
INTRODUCTION
Background subtraction is a technique for segmenting out objects of interest in a scene for applications such as surveillance. This technique involves comparing an observed image with a static image known as the background image. The areas of the image plane where there is a significant difference between the observed and background images indicate the location of the objects of interest. This project uses the simple technique of subtracting the observed image from the estimated background image and thresholding the result to generate the objects of interest. Foreground extraction is done by connected component analysis, which operates on binary images. Retrieving the coordinates after component analysis and processing the images tells us the location of our objects/persons.
The purpose of this report is to articulate the techniques involved in our project for the department's security and how it will catch a person who tries to break into our region of interest. The techniques we have used in our project are quite important in this field of surveillance. This report will explain the image processing concepts used in this project and will provide step-by-step methods of our techniques and corresponding results.
The remaining parts of this report will discuss (1) interfacing MATLAB with our wireless camera, (2) the methodology, and (3) the results of our experiments. The MATLAB interfacing section will talk briefly about MATLAB, the camera which was used in our project, and its specifications. The approach used in our project and the assumptions we have made will also be described in this section. The methodology section will explain the techniques used in this project in detail, with images shown so that readers have a visual explanation. Finally, the results section will show the images obtained after performing each image processing technique.
MATLAB INTERFACING
MATLAB with Image Acquisition Device
As is known from the introduction, in this project we will be processing images taken by a wireless camera, and this wireless camera is mounted in the Department of Electrical and Electronic Engineering. Video acquired from the camera will be transferred to the security workstation, where the frames extracted from the video sequence will be processed through the software MATLAB. Thus the classroom area in the department will be monitored for vandalism and unauthorized entry.
The hardware and software parts of the project include 3 main components, which are as follows:

Image acquisition setup: a wireless camera which will be connected to the PC.

Processor: a PC where the frames acquired from the wireless camera will be processed.

Image analysis: done in our project through MATLAB, to analyze the contents of the images acquired through the wireless camera.

The following figure demonstrates what has been said:
Figure 1.1 Matlab interfacing
To carry out this project we bought a modern IR wireless camera from the USA. It transmits images wirelessly to the receiver, which is connected to the PC. A description of the camera is given below.
Camera Used in our project
Name of the Camera: vc36
Figure 2. Wireless Camera used in our project
*High powered (1500 ft. range)
*2.4 GHz wireless weatherproof video/audio security camera; installed IR allows 60 ft. viewing in total darkness.
*4-channel receiver and included software allows for direct USB connection to computer.
*36 high power IR illuminators and Sony 1/3” CCD, 430 LOR
*Features auto scan mode for sequencing up to 4 cameras.
Set Includes:
Outdoor Camera and Mount
(2) 12 VDC Power Adaptors
USB 2.0 Connector Cable
VersaCam Recording Software CD
4 Channel USB/RCA Receiver
Approach used in our Project
There are many human detection techniques that have been developed and are still being actively researched. An extensive number of papers have been written on human detection techniques, but the most suitable technique is still to be discovered. Due to different constraints, people prefer different human detection techniques in their projects. We have selected the method usually known as background subtraction, which is a technique for segmenting out objects of interest in a scene for applications such as surveillance. This technique involves comparing an observed image with a static image known as the background image. The areas of the image plane where there is a significant difference between the observed and background images indicate the location of the objects of interest. The name "background subtraction" comes from the simple technique of subtracting the observed image from the estimated image and thresholding the result to generate the objects of interest.
MATLAB
For our project we used MATLAB as the programming language which according
to [1] is a widely used tool in the electrical engineering community. It can be used for
simple mathematical manipulations with matrices, for understanding and teaching basic
mathematical and engineering concepts. Image Processing Toolbox in MATLAB is being
used by over 4000 companies and universities worldwide across a broad spectrum of
disciplines. The program environment has an interactive command window that allows
users to test and experiment with the code line by line. Users can also save their codes
into an M-file and run the program. The MATLAB Help Navigator is also very useful.
Further information about this sophisticated software can be retrieved from the official site of MATLAB, www.mathworks.com.
Assumptions
Two major assumptions are important in this project: the location should be known beforehand, and the background of the video must be static, containing no person or object that would later be removed.
METHODOLOGY
As mentioned earlier, the camera is mounted in the department and is operational for detecting trespassing. The video (recorded video) will be recorded as frames, and these frames
will be processed later on to find out specifically those frames from the complete video
(recorded video) where any person was detected inside our predefined region; our
predefined region is the region protecting the classroom section on the second floor of the
department. The main purpose here is to process the video and perform background
subtraction on every frame. After applying background subtraction to the frames and thresholding the resultant frames, connected component analysis will tell us whether the present frame contains any person. If we detect a person in the frame, then we will apply further
techniques on that frame to find out where exactly that person is located in the frame. As
soon as the exact coordinates of the person are known then we will analyze if the person
has crossed or entered our predefined region or not. If the person is detected inside our
predefined region, the respective frame will be added to another file (detected video).
This file will only contain frames in which persons were detected in our pre-defined region. The same criterion is used for detecting multiple people.
Initially the recorded video will have 5 to 8 background frames and we are
averaging these frames because of change in lighting, to get a more accurate background
frame. Then we will apply median filtering on this average background frame to denoise
it. Median filtering is one of those techniques which we have used in our project and will
be explained in detail later in this report. We also apply median filtering to all other frames of the recorded video file. After we have completed this process, we will start
doing background subtraction in the loop which will run according to the number of
frames contained in the recorded video file.
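As a rough illustration, a minimal MATLAB sketch of this averaging and filtering step is given below. The file name recorded.avi, the use of aviread, and the variable names are assumptions made for the example and are not necessarily those used in our actual program.

    mov = aviread('recorded.avi');           % read the recorded video (hypothetical file name)
    nBack = 5;                               % number of person-free background frames to average
    acc = zeros(size(mov(1).cdata));         % double-precision accumulator
    for k = 1:nBack
        acc = acc + double(mov(k).cdata);
    end
    background = uint8(acc / nBack);         % averaged background frame
    for p = 1:3                              % median-filter each color plane to denoise it
        background(:,:,p) = medfilt2(background(:,:,p), [3 3]);
    end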
Foreground extraction is done by connected component analysis which is done on
binary images, so after background subtraction we will convert our frames to binary
frames and apply connected component analysis. Detailed explanation about connected
component analysis would come later in this report.
Median Filtering
Syntax
B = medfilt2(A, [m n])
Description
Median filtering according to [2] is a nonlinear operation often used in image
processing to reduce "salt and pepper" noise. A median filter is more effective than
convolution when the goal is to simultaneously reduce noise and preserve edges. B =
medfilt2 (A, [m n]) performs median filtering of the matrix A in two dimensions. Each
output pixel contains the median value in the m-by-n neighborhood around the
corresponding pixel in the input image. medfilt2 pads the image with 0's on the edges, so
the median values for the points within [m n]/2 of the edges might appear distorted. B =
medfilt2 (A) performs median filtering of the matrix A using the default 3-by-3
neighborhood (www.mathworks.com).
We have used our background image to show the effect of median filtering in two dimensions. To make it easier for the reader to understand median filtering, we have added noise to the background image in figure 3.1 and then, in the second figure, removed that noise using median filtering.
Figure 3.1 Noise added to background image
Figure 3.2 Noise removed through median filtering
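The noise-and-denoise demonstration of figures 3.1 and 3.2 can be reproduced with a short sketch like the one below; the salt-and-pepper density of 0.05 and the 3-by-3 neighborhood are illustrative choices, and background is assumed to hold the averaged background frame from the methodology section.

    grayBack = rgb2gray(background);                   % work on a single intensity plane
    noisy = imnoise(grayBack, 'salt & pepper', 0.05);  % figure 3.1: noise added
    clean = medfilt2(noisy, [3 3]);                    % figure 3.2: noise removed
    figure, imshow(noisy), figure, imshow(clean)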
Background Subtraction
One of the most important concepts in our project is background subtraction.
Background subtraction takes each frame in the video and subtracts it from a static background that is known prior to the extraction process. The resultant image contains the extracted foreground objects/persons, which will undergo further image processing to find out their exact locations. The most crucial reason this approach was adopted is its practicality and suitability. Background subtraction algorithms are
generally less complicated than other methods. Below is the syntax showing how
imsubtract works to subtract two images from each other.
Syntax
Z = imsubtract (X,Y)
Description
Z = imsubtract (X,Y) subtracts each element in array Y from the corresponding
element in array X and returns the difference in the corresponding element of the output
array Z. X and Y are real, nonsparse numeric arrays of the same size and class, or Y is a
double scalar. The array returned, Z, has the same size and class as X unless X is logical,
in which case Z is double [3] (www.mathworks.com).
Due to changes in lighting, instead of considering one background frame as the reference from which to subtract all other incoming frames, we have taken the first 5 frames without any person, added those 5 frames together, and found the average by dividing by 5. This resultant frame will be considered our background frame, from which we will subtract all other incoming frames to find the foreground objects. Subtraction of frames is done in RGB format, also called truecolor, as explained below.
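A minimal sketch of the per-frame subtraction described above might look as follows; the frame index k and the variable names are illustrative.

    current = mov(k).cdata;                     % k-th RGB frame of the recorded video
    diffRGB = imsubtract(current, background);  % foreground pixels stay bright
    % For uint8 inputs, negative differences are clipped to 0 by imsubtract.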
Image Formats
We have used different image formats in our project, as explained below.
Truecolor image
[4] A truecolor image is an image in which each pixel is specified by three values
— one each for the red, blue, and green components of the pixel's color. MATLAB stores
truecolor images as an m-by-n-by-3 data array that defines red, green, and blue color
components for each individual pixel. Truecolor images do not use a colormap. The color
of each pixel is determined by the combination of the red, green, and blue intensities
stored in each color plane at the pixel's location (www.mathworks.com).
Figures below demonstrate the effect of subtracting two frames from each other in
RGB format. It will be easy for a reader to understand the concept of background
subtraction after looking at the following figures.
Figure 4.1 Background Image Figure 4.2 Background Image with the Person
Figure 4.3 Result of Background Subtraction
As can be seen from the figures above, in figure 4.3, after the background subtraction, what is left is only the person, because that is the only object which was added to the background image in figure 4.2; after subtracting the background from this image we have extracted only the foreground object (the person), as shown in figure 4.3.
Grayscale image
Following the subtraction process, we have converted our resultant frame to a gray image because we want to apply thresholding: pixel values below a certain value are set to zero, because they are not related to our foreground objects, and pixel values above that value are set to the maximum value, which is 255 in grayscale (for the class uint8), since such values most probably belong to our object. Below is a short description of gray-scale images and why we use them to store our images.
A grayscale (or graylevel) image, according to [5], is simply one in which the only colors
are shades of gray. The reason for differentiating such images from any other sort of color image
is that less information needs to be provided for each pixel. In fact a `gray' color is one in which
the red, green and blue components all have equal intensity in RGB space, and so it is only
necessary to specify a single intensity value for each pixel, as opposed to the three intensities
needed to specify each pixel in a full color image. Often, the grayscale intensity is stored as an 8-
bit integer giving 256 possible different shades of gray from black to white. If the levels are evenly spaced then the difference between successive gray levels is significantly better than the gray-level resolving power of the human eye (www.mathworks.com).
Grayscale images are very common, in part because much of today's display and image
capture hardware can only support 8-bit images. In addition, grayscale images are entirely
sufficient for many tasks and so there is no need to use more complicated and harder-to-process
color images.
Figure 5.1 Color image Figure 5.2 Gray image
Figure 5.1 on the left shows true RGB image and its corresponding gray image is shown
in the opposite figure 5.2.
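The conversion and thresholding step described above can be sketched as follows; the threshold value 40 is an assumed illustrative value, not the one tuned for our program, and here we produce a logical mask rather than a 0/255 uint8 image (the two are interchangeable for the later steps).

    grayDiff = rgb2gray(diffRGB);     % grayscale version of the subtraction result
    threshold = 40;                   % assumed value, chosen by experiment
    binaryFG = grayDiff > threshold;  % logical mask: 1 = likely foreground pixel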
Dilation
[6] Morphology is a broad set of image processing operations that process images
based on shapes. Morphological operations apply a structuring element to an input image,
creating an output image of the same size. In a morphological operation, the value of
each pixel in the output image is based on a comparison of the corresponding pixel in the
input image with its neighbors. By choosing the size and shape of the neighborhood, we
can construct a morphological operation that is sensitive to specific shapes in the input
image (www.mathworks.com).
Dilation adds pixels to the boundaries of objects in an image; the number of pixels
added or removed from the objects in an image depends on the size and shape of the
structuring element used to process the image. In the morphological dilation operation,
the state of any given pixel in the output image is determined by applying a rule to the
corresponding pixel and its neighbors in the input image. The rule for dilation is that the
value of the output pixel is the maximum value of all the pixels in the input pixel's
neighborhood. In a binary image, if any of the pixels is set to the value 1, the output pixel
is set to 1 (www.mathworks.com).
Understanding Structuring Elements
An essential part of the dilation operation is the structuring element used to probe
the input image. A structuring element is a matrix consisting of only 0's and 1's that can
have any arbitrary shape and size. The pixels with values of 1 define the neighborhood.
The center pixel of the structuring element, called the origin, identifies the pixel of
interest -- the pixel being processed. The pixels in the structuring element containing 1's
define the neighborhood of the structuring element. These pixels are also considered in
dilation.
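In MATLAB a structuring element is created with strel; a minimal sketch, assuming a simple 3-by-3 square neighborhood (the element actually used in our program may differ):

    se = strel('square', 3);   % 3-by-3 square of 1's with the origin at its center
    nhood = getnhood(se);      % the underlying 0/1 neighborhood matrix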
Morphological dilation of binary image
Figure 6 Morphological dilation of binary image (input image and output image)
Figure 7 illustrates this processing for a grayscale image. The figure shows the
processing of a particular pixel in the input image. Note how the function applies the rule
to the input pixel's neighborhood and uses the highest value of all the pixels in the
neighborhood as the value of the corresponding pixel in the output image.
Morphological dilation of grayscale image
Figure 7 Morphological dilation of grayscale image (input image and output image)
Processing pixels at image borders (Padding Behavior)
Morphological functions position the origin of the structuring element, its center
element, over the pixel of interest in the input image. For pixels at the edge of an image,
parts of the neighborhood defined by the structuring element can extend past the border
of the image. To process border pixels, the morphological functions assign a value to
these undefined pixels, as if the functions had padded the image with additional rows and
columns. The following describes the padding rules for dilation for binary and grayscale
images (www.mathworks.com).
Pixels beyond the image border are assigned the minimum value afforded by the data
type.
For binary images, these pixels are assumed to be set to 0. For grayscale images, the
minimum value for uint8 images is 0 (www.mathworks.com).
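A short sketch of dilating the thresholded binary mask, assuming the 3-by-3 square structuring element above; the actual element size used in our program may differ.

    dilatedFG = imdilate(binaryFG, se);  % each output pixel = maximum over its neighborhood
    % For a binary image this sets a pixel to 1 if any neighbor under se is 1;
    % pixels beyond the border are treated as 0, as described above.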
Connected Component Analysis
The most fundamental concept in our project is based on connected component
analysis as explained in [7].
Syntax
L = bwlabel (BW,n)
[L,num] = bwlabel (BW,n)
Description
L = bwlabel (BW,n) returns a matrix L, of the same size as BW, containing labels for the
connected objects in BW. n can have a value of either 4 or 8, where 4 specifies 4-
connected objects and 8 specifies 8-connected objects; if the argument is omitted, it
defaults to 8. Connected components labeling scans a binary image and groups its pixels
into components based on pixel connectivity, i.e. all pixels in a connected component share
similar pixel intensity values and are in some way connected with each other
(www.mathworks.com).
The following explanation is adapted from the MATLAB site, where connected component analysis is explained in an excellent and simple manner by Steve Eddins, who manages the Image & Geospatial development team at The MathWorks.
Figure 8.1 4-connected Figure 8.2 8-connected
Two types of connectivity are used in connected component analysis:
4-connected, where each pixel has four neighbors and this corresponds to figure 8.1
8-connected, where each pixel has eight neighbors and this corresponds to figure 8.2
Consider a binary image of logical values, in which pixels equal to 1 correspond to foreground and pixels equal to 0 correspond to background. The whole image BW will be scanned along the columns, changing rows. When the scan encounters a foreground pixel, it looks at that pixel's neighbors that have already been encountered in the scan to check whether any of those foreground pixels has received a temporary label. Here is the first foreground pixel encountered, shown with its already-scanned neighbors highlighted in color:
Because we have just found a foreground pixel in BW and it is the first pixel we have found (shown in the small square), we label the corresponding pixel in the output matrix L as 1 (also shown in a small square). As shown below, the second foreground pixel is found as we move along the rows within the column.
Notice above that this second encountered pixel's neighbor in BW has already received a temporary label of 1 in the output matrix L, so this pixel in the output matrix L will also receive the same label on a temporary basis. Notice below that the foreground pixel in row 4, column 3 also gets the temporary label 1 in the output L, because its neighbor in BW has that same label in L.
Now when the scan gets to the pixel at row 2, column 4, none of that pixel's scanned neighbors have been labeled, so that pixel gets assigned a new temporary label of 2.
The very next pixel, on row 3, column 4, is where things start to get more conceptual.
One of this pixel's scanned neighbors has already been assigned a label of 1, but another
of the neighbors has been assigned a label of 2. So the algorithm picks one of the labels
arbitrarily,
and then records the fact that temporary label 1 and temporary label 2 actually refer to the
same object.
This situation happens again at row 4, column 8, as shown below:
So the pair of labels 3 and 4 forms an equivalence and goes into the equivalence table.
When the first pass is done, you have this matrix of labels:
And you have an equivalence table containing these pairs:
1 <--> 2 means labels 1 and 2 actually refer to the same object, which is object 1.
3 <--> 4 means labels 3 and 4 actually refer to the same object, which is object 2.
Equivalence class resolution
This is the process of determining subsets of the temporary labels actually
referring to the same object. From this we would compute that temporary labels 1 and 2
map to final label 1 (MATLAB does this automatically with a certain command), and temporary labels 3 and 4 map to final label 2. Then a second pass is made over the output matrix to relabel the pixels according to this mapping, as shown below.
By now it should be clear how connected component analysis works; it is a very
useful technique for segmenting out objects of interest in a binary image. We have used
bwlabel for applying connected component analysis in our project and it supports 2-D
inputs only. The objects in the label matrix are differentiated through their different labels, and the label matrix is of the same size as the input matrix or
image. Connected component analysis uses pixel connectivity (described earlier briefly)
to determine where the boundaries between objects are in an image.
Connectivity
To describe pixel connectivity more clearly, we have prepared an input matrix (figure 9.1) and found its output label matrix, first using 4-connectivity (figure 9.2) and then using 8-connectivity (figure 9.3).
Figure 9.1 Input image matrix
Figure 9.2 4 connected labeled image
Figure 9.3 8 connected labeled image
Notice in the figures above that the foreground pixels, which correspond to our objects, are set to 1 and the background pixels are set to 0. Figure 9.1, the input matrix, has some on pixels inside a black square; compare it with figure 9.2 and figure 9.3. In figure 9.2 you will notice these pixels have received label 2, corresponding to a second object, while in figure 9.3 the same box has label 1. This difference is due to pixel connectivity: connectivity defines which pixels are connected to other pixels. A set of pixels in a binary image that forms a connected group is called an object or a connected
component.
If we are using 4-connectivity, pixels are connected if their edges touch. This
means that a pair of adjoining pixels is part of the same object only if they are both on
and are connected along the horizontal or vertical direction.
Figure 10.1 Showing 4 connectivity
If we are using 8-connectivity, pixels are connected if their edges or corners
touch. This means that if two adjoining pixels are on, they are part of the same object,
regardless of whether they are connected along the horizontal, vertical, or diagonal
direction.
Figure 10.2 Showing 8 connectivity
So the reason the black square in figure 9.2 was labeled 2 is that, with the connectivity defined as 4, the pixel in the 4th row and 3rd column was not taken into consideration; based on this connectivity the black square gets a new label. If we look at figure 9.3, the black square has no new label and its label is 1, because with 8-connectivity the pixel in the 4th row and 3rd column is considered part of the same object, and so the square shares label 1.
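The difference between the two connectivities can be checked directly in MATLAB with a small example matrix; the matrix below is illustrative, not the exact one shown in figure 9.1.

    BW = logical([1 1 0 0 0;
                  1 1 0 0 0;
                  0 0 1 0 0;
                  0 0 0 1 1;
                  0 0 0 1 1]);
    L4 = bwlabel(BW, 4)   % diagonal neighbors become separate objects (3 labels)
    L8 = bwlabel(BW, 8)   % diagonal neighbors merge into one object (1 label)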
After the subtraction process and the connected component analysis, the command [concomplabel, count] = bwlabel(grdiffim2, 8) will tell us the number of objects found in the image and the labels the different objects/persons have received. So "count" gives the number of objects/persons in the image and "concomplabel" holds the corresponding labels for the objects. To check whether someone has entered the image, we simply check whether count is zero or nonzero.
The value of count, as mentioned, tells us the number of persons or objects that have entered the image. If count is equal to zero after the background subtraction and connected component analysis, no one has entered our image; if, on the other hand, count is nonzero, then someone has entered our region and we start further analysis on the image where we have found the object. Based on the value of count, we can run our loop that many times to capture all the objects. After finding one or more objects in the image, we sum the pixels belonging to each labeled object (this is done for every object in the image) and put the objects in descending order according to these sums. The reason we order them in descending order is so that objects of substantial size are considered first for further processing.
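A minimal sketch of this counting and size-ordering step is given below, assuming grdiffim2 is the dilated binary frame; the variable names other than those mentioned in the text are illustrative.

    [concomplabel, count] = bwlabel(grdiffim2, 8);        % label the binary frame
    if count > 0
        objSize = zeros(1, count);                        % pixel count of each labeled object
        for k = 1:count
            objSize(k) = sum(concomplabel(:) == k);
        end
        [sortedSizes, order] = sort(objSize, 'descend');  % largest objects first
    end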
In conclusion of this explanation, after we perform connected component analysis on our binary image, we can find the different objects based on their different label numbers. MATLAB has a command to find the coordinates of objects labeled by any method; in our project we used the bwlabel command to label the objects. For example, if we use the following command after noticing that count is not equal to zero, [r,c] = find(L == 2); it will give the rows and columns of foreground object 2, i.e. the object which is labeled 2. Obviously, if there is no such object 2, the result will be empty. The same command can be used to find the row and column coordinates of any object. So, based on the value of count, we will calculate the rows and columns of the different objects. Subsequently we use these values to box our image: minr and minc represent the minimum row and minimum column respectively, and maxr and maxc represent the maximum row and maximum column respectively. Below is the figure depicting these values.
Figure 11.1 Binary image Figure 11.2 Depicting coordinates
In the above figure, h is the height and w is the width of the box; the row index changes vertically along the box and the column index changes horizontally along the box. So as we go along the box horizontally, c increases and ultimately becomes (c+w) at the end of the box, and as we go along the box vertically, r increases and ultimately becomes (r+h) at the end of the box. How the rows and columns change along the lines is shown in the above figure.
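A sketch of retrieving the bounding-box values for one labeled object; object 2 is used here because it is the example mentioned in the text.

    [r, c] = find(concomplabel == 2);   % rows and columns of the object labeled 2
    minr = min(r);  maxr = max(r);
    minc = min(c);  maxc = max(c);
    height = maxr - minr;               % h in figure 11.2
    width  = maxc - minc;               % w in figure 11.2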
As soon as we have found these values for the objects/persons, we send them to our function Drawbox, which draws a box around the object for which we have found the maximum row, maximum column, minimum row and minimum column. The values sent to the box function are minr, minc, width and height. Different colors can be substituted for each line of the box drawn around the person or object. Please note that the box around the person is drawn only if he steps into our predefined region, which is the region to be protected from intruders.
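Our Drawbox function is not reproduced here; the sketch below is a hypothetical version that paints a red rectangle onto an RGB frame, written as an assumption about how such a helper might look rather than as our exact implementation.

    function frame = Drawbox(frame, minr, minc, width, height)
    % Hypothetical helper: draw a red box on a uint8 RGB frame.
    maxr = minr + height;  maxc = minc + width;
    frame(minr:maxr, [minc maxc], 1) = 255;   % vertical sides in red
    frame(minr:maxr, [minc maxc], 2:3) = 0;
    frame([minr maxr], minc:maxc, 1) = 255;   % horizontal sides in red
    frame([minr maxr], minc:maxc, 2:3) = 0;
    end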
An interesting question might arise here: how is the box drawn when the person steps inside the region, and how does it disappear as soon as he is out of the region? The answer is as simple as drawing the box around the person. Having retrieved the rows and columns of the objects, we put in a condition that if our maxr (shown in the figure above) is within our region, for which we already know the coordinates, then we
send the image to get boxed; otherwise we do not box it. This condition is implemented with a simple if statement. We have created a movie file to store the images which are found within our predefined region. This movie file will only show us those images where a person entered our region, at night for instance, whether with bad intentions or for any other purpose.
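A rough sketch of the region test and of collecting the boxed frames into the detected movie is given below; the region boundary regionRow, the use of im2frame and movie2avi, and the file name detected.avi are assumptions, since our actual region coordinates and file-writing code may differ.

    regionRow = 180;                          % assumed lower boundary of the protected region
    detected = struct('cdata', {}, 'colormap', {});
    % ... inside the per-frame loop, after finding minr, minc, maxr, maxc:
    if maxr >= regionRow                      % the person has stepped into our region
        current = Drawbox(current, minr, minc, width, height);
        detected(end+1) = im2frame(current);  % append the boxed frame to the detected movie
    end
    % ... after the loop:
    movie2avi(detected, 'detected.avi');      % write the movie of detected frames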
RESULTS
Our work has been evaluated by detecting humans in the region predefined earlier, which covers the classroom section on the second floor of the department. The camera bought for this purpose is quite modern and wireless, and it was fixed in the department where we intended to block intruders from trespassing. Our software can correctly detect and track foreground objects, and they are put in a separate file if detected inside our region.
The following are the results that were obtained; detailed explanation has been excluded from this section because we have covered all the necessary details in the methodology part. This section only shows the part of the results which demonstrates the effectiveness of our work on this project and how we have gone through the steps of image processing to capture persons and place those images, with the person or persons detected in our predefined region, in a separate file. The resulting video file of images detected within our region by the background subtraction meets our expectations. The objective was to track the person accurately, specifically the foreground objects. Because the algorithm uses a static model of the background to perform the extraction, the environment becomes a vital factor. For this project we had the luxury of choosing the background environment, and through experiment it was concluded that a few conditions work better. Most essentially, an even light source is preferred. If instead the background has strong and concentrated light, not only would shadows create accuracy problems, but extreme changes in lighting intensity would also create noise in unexpected areas. In addition, a background with similar color and intensity values would also help to distinguish the foreground object more accurately. Under these conditions, the foreground extraction process proves to have much less noise.
As promised above, here are the results of our project, with explanations for each figure.
Figure 12.1 Background Image Figure 12.2 Current Image
By now the reader will understand that figure 12.1 corresponds to our background image; this image is the average of around 5 background images. The background image is static, does not change over time, and is taken just once initially, and all of the subtraction is done with respect to this image. We want the image involved in our subtraction process to be an average of all the background images, and this is done because of lighting differences.
Figure 12.2 corresponds to our current image, though it is empty. After subtracting the background image from this current image, we will get an entirely black image.
Figure 13.1 Background image with person
Figure 13.2 Result of Background Subtraction
Figure 13.1 is the current image with a person standing inside our region; when we apply background subtraction to this image, the result is shown in figure 13.2. You can see from figure 13.2 that we see only the foreground object/person, and the other parts are black, corresponding to the background.
Figure 14.1 Grayscale Image (1)
Figure 14.2 Image before Dilation (1) Figure 14.3 Image after Dilation (1)
Figure 14.1 is the gray image of the corresponding image shown in figure 13.2
Figure 14.2 is the result of figure 14.1 after we have applied thresholding.
Figure 14.3 is the result of performing dilation on the image shown in figure 14.2
A more prominent result of dilation is shown below. In figure 14.3 above, the dilation has no serious effect, and even without performing dilation we could capture this person; but in some situations dilation becomes a main processing technique, and the figures below show how effective dilation can be.
Figure 15.1 Grayscale Image (2)
Figure 15.2 Image before Dilation (2) Figure 15.3 Image after Dilation (2)
If we do not perform dilation on the above figure, the connected component analysis would not be effective at all, because it would lose many parts of the person: it would assign different labels to parts belonging to one particular person that are separated from each other, indicating them as different objects even though they are parts of the same person, and hence we would not get the proper coordinates of the object. So dilation is an important technique in image processing, as shown in the above figures; for an
explanation of dilation, please refer to the previous section of the report, which covers this topic in detail. The main idea in dilation is how the pixels are extended. After this process we label our objects with the help of connected component analysis, and by finding their row and column coordinates we box the objects/persons; the results are shown below for a single person and for multiple persons trapped in the box.
Figure 16.1 Single person boxed inside our region
Figure 16.2 Multiple persons boxed inside our region
CONCLUSION
In this report we have tried our best to explain the techniques used in our project in detail. We have shown the result of every step with images, and how we ultimately trap a person in the box. The techniques we have used in our project are widely used in the surveillance area. Based on the results achieved we can justify the effectiveness of our project, but due to some constraints, such as shadows, a very accurate result could not be achieved.
REFERENCES
[1] Mathworks (1994). The Language of Technical Computing. Retrieved from the
World Wide Web: www.mathworks.com/products/matlab/
[2] Mathworks (1994). Medfilt2. Retrieved from the World Wide Web:
www.mathworks.com/access/helpdesk/help/toolbox/image/medfilt2
[3] Mathworks (1994). imsubtract. Retrieved from the World Wide Web:
www.mathworks.com/access/helpdesk/help/toolbox/image/imsubtract
[4] Wikipedia (2008). Truecolor Images. Retrieved from the World Wide Web:
en.wikipedia.org/wiki/Truecolor
[5] Wikipedia (2008). GrayScale Images. Retrieved from the World Wide Web:
en.wikipedia.org/wiki/Grayscale
[6] Mathworks (1994). Dilation. Retrieved from the World Wide Web:
www.mathworks.com/access/helpdesk/help/toolbox/image/dilation
[7] Mathworks (1994). Connected Component Analysis. Retrieved from the World Wide
Web:
http://blogs.mathworks.com/steve/2007/04/15/connected-component-labeling-part-4/