Eye Controlled Documentation



    PREAMBLE

In the past few years, technology has become more advanced and less expensive.

    With the availability of high speed processors and inexpensive webcams, more and more

    people have become interested in real-time applications that involve image processing.

    One of the promising fields in artificial intelligence is human computer interaction which

    aims to use human features (e.g. face, hands) to interact with the computer. One way to

    achieve that is to capture the desired feature with a webcam and monitor its action in

    order to translate it to some events that communicate with the computer.


    CHAPTER 1

    INTRODUCTION

    1.1 Problem Statement

With a traditional mouse or any wireless mouse, movement of the hand is necessary to select or deselect an object. This is difficult for people with hand disabilities.

    1.2 The Solution

Our system is capable of replacing the traditional mouse with the human face - a new way to interact with the computer. Facial features (the nose tip and eyes) are used to interact with the computer. The nose tip acts as the mouse pointer, and left/right eye blinks fire left/right mouse click events. The only external device the user needs is a webcam that feeds the program with the video stream. People with hand disabilities are thus able to work with the computer.


    1.3 Objectives of Project

Our system is capable of replacing the traditional mouse with the human face - a new way to interact with the computer. Facial features (the nose tip and eyes) are used to interact with the computer. The nose tip acts as the mouse pointer, and left/right eye blinks fire left/right mouse click events.

    1.4 Advantages

The main advantage of this project is that only a simple webcam with moderate resolution is sufficient to capture the live video. People with hand disabilities can use this project effectively. Eye control can enable more efficient interfaces with lower cognitive workload and operator monitoring for control stations. Eye control provides new gaming experiences. Eye tracking can be integrated into the next generation of consumer computers to create an enhanced user experience and very strong product differentiation. Eye control has revolutionized the lives of thousands of people with disabilities. Tobii is working with partners to provide hands-free computers for sterile environments. Thus we have developed a project capable of replacing the mouse with facial features to interact with the computer.

    1.5 Limitations of study

The disadvantage of this project is that involuntary eye blinks also trigger mouse clicks. It may also lead to eye strain due to continuous blinking. Some tests were conducted

    with users wearing glasses, which exposed somewhat of a limitation with the system. In

    some situations, glare from the computer monitor prevented the eyes from being located

    in the monitor analysis phase.

    Users were sometimes able to maneuver their heads and position their eyes in such a way

    that the glare was minimized, resulting in successful location of the eyes, but this is not a

reasonable expectation for severely disabled users who may be operating the system.


    1.6 Literature Survey

    1.6.1 Existing System

    HCI (Human Computer Interface) aims to use human features (e.g. face, hands) to

    interact with the computer. One way to achieve that is to capture the desired feature with

    a webcam and monitor its action in order to translate it to some events that communicate

    with the computer.

    Traditional Mouse

    Wireless Mouse

    Traditional Mouse

    In computing, a mouse (plural mice or mouses) functions as a pointing device by

    detecting two-dimensional motion relative to its supporting surface. Physically, a mouse

    consists of a small case, held under one of the user's hands, with one or more buttons. It

sometimes features other elements, such as a wheel, which allows the user to perform various system-dependent operations, or extra buttons or features that can add more control or

    dimensional input. The mouse's motion typically translates into the motion of a pointer on

    a display, which allows for fine control of a Graphical User Interface.

The name mouse, which originated at the Stanford Research Institute, derives from the resemblance of early models (which had a cord attached to the rear of the device, suggesting the idea of a tail) to the common mouse.


Fig 1.6.1.1 Wired mouse

    Wireless Mouse

A wireless mouse is a computer mouse that communicates with the computer via RF wireless technology instead of a plug-in cord.

    Fig 1.6.1.2 Wireless mouse

Technical features of the mouse are:

    Touch-sensitive top shell

    360 degree enabled clickable scroll ball


    Force-sensing side "squeeze" buttons

    Optical tracking in wired version or laser tracking in wireless version

    Compatible with Macintosh and PCs

    Programmable functions for the four buttons

    Easter egg: when held parallel above a surface, the shape of the light

    projected from the wired USB mouse resembles the head of a mouse

    1.6.2 Proposed System

Our system is capable of replacing the traditional mouse with the human face - a new way to interact with the computer. Facial features (the nose tip and eyes) are used to interact with the computer. The nose tip acts as the mouse pointer, and left/right eye blinks fire left/right mouse click events. The only external device the user needs is a webcam that feeds the program with the video stream. People with hand disabilities are thus able to work with the computer.

    1.7 Organisation of Dissertation

In the past few years, technology has become more advanced and less expensive.

    With the availability of high speed processors and inexpensive webcams, more and more

    people have become interested in real-time applications that involve image processing.

    One of the promising fields in artificial intelligence is human computer interaction which

    aims to use human features (e.g. face, hands) to interact with the computer. One way to

    achieve that is to capture the desired feature with a webcam and monitor its action in

    order to translate it to some events that communicate with the computer.

The current evolution of computer technologies has enhanced various applications in human-computer interaction. Face and gesture recognition is a part of this field, and can be applied in various applications such as robotics, security systems, driver monitoring and image processing.


The human face is a dynamic object and has a high degree of variability. Face detection can be classified into two categories: the feature-based approach and the image-based approach. Techniques in the first category make use of apparent properties of the face such as geometry, skin color, and motion. Although feature-based techniques can achieve high speed in face detection, they suffer from poor reliability under varying lighting conditions. In the second category, the image-based approach takes advantage of current advances in pattern recognition theory. Most image-based approaches apply a window-scanning technique for detecting faces, which requires heavy computation. Therefore, using only the image-based approach is not suitable for real-time applications.

In order to achieve a high-speed and reliable face detection system, we propose a method that combines both the feature-based and image-based approaches to detect the point between the eyes by using the SSR filter [1]. The proposed SSR filter, a rectangle divided into 6 segments, operates by using the concept of the bright-dark relation around the BTE [2] area. We select the BTE [2] as the face representative because it is common to most people and easy to find for a wide range of face orientations. The BTE [2] has dark parts (eyes and eyebrows) on both sides, and comparably bright parts on the upper side (forehead) and lower side (nose and cheekbones).


    CHAPTER 2

    SOFTWARE REQUIREMENTS SPECIFICATION

    2.1 Functional Overview

We use an intermediate representation of the image called the integral image to calculate sums of pixel values in each segment of the SSR [1] filter. First, the SSR [1] filter is scanned over the image and the average gray level of each segment is calculated from the integral image. Then, the bright-dark relations between the segments are tested to see whether the filter's center can be a candidate point for the BTE [2]. Next, the camera is used to find the distance information and the suitable BTE [2] template size. Then, the BTE [2] candidates are evaluated using a BTE [2] template. Finally, the true BTE [2] and the nose tip can be detected.


The coordinates of the nose tip are used as the pointer, and eye blinks fire the left/right mouse click events.
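As a minimal illustration of how the average gray level of a segment can be read from the integral image in constant time, the following Java sketch uses four look-ups per sector; the names (sectorSum, ii) are illustrative and are not taken from the project source.

public final class SectorSum
{
    // ii[y][x] holds the sum of all gray levels in the rectangle from (0,0) to (x,y) inclusive.
    // Sum of the rectangle with corner (x1,y1) inclusive and (x2,y2) exclusive:
    // four look-ups replace a loop over all pixels of the sector.
    static long sectorSum(long[][] ii, int x1, int y1, int x2, int y2)
    {
        long a = (x1 > 0 && y1 > 0) ? ii[y1 - 1][x1 - 1] : 0;
        long b = (y1 > 0) ? ii[y1 - 1][x2 - 1] : 0;
        long c = (x1 > 0) ? ii[y2 - 1][x1 - 1] : 0;
        long d = ii[y2 - 1][x2 - 1];
        return d - b - c + a;
    }

    // average gray level of the sector, as used when the six SSR sectors are compared
    static double sectorAverage(long[][] ii, int x1, int y1, int x2, int y2)
    {
        return (double) sectorSum(ii, x1, y1, x2, y2) / ((x2 - x1) * (y2 - y1));
    }
}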

    2.2 User Characteristics

PC users can input text simply by looking at an on-screen keyboard. When the user gazes at a character for one second, the system, which uses an external camera, detects their line of sight and inputs the appropriate character. This system will be useful to people with disabilities and for a range of medical and social-welfare applications. In the early stages of the system's development, the positions of the outer corners of the eyes and of the eyebrows were used. When the inner corners of the eyes and eyebrows were taken as the coordinates, development of the system advanced by leaps and bounds.

    2.3 Input Requirements

The input data is obtained from external webcams. Data is read from a source and passed in buffers to the processing stage. The input may consist of reading data from a local capture device (such as a webcam or TV capture card), a file on disk or a stream from the network. Media Handlers are registered for each type of file that JMF must be able to handle. To support new file formats, a new Media Handler can be created. The input of the Eye application encompasses various controls such as buttons and menu items. The refresh and stop buttons are used to refresh and stop receiving the video frames from the video input device. The enable interface button is used to deactivate the mouse and act as an interface for interacting with the system. The face detection button is used to detect the face candidates (nose tip and eyes). The menu items, such as File and Help, are used to perform normal operations such as exiting and viewing information about the project.

    2.4 Output Requirements

    The output may take the transformed data stream and pass it to a file on disk, output it to

    the local video display or transmit it over the network. A VideoRenderer outputs the final

    data to the screen, but another kind of renderer could output to different hardware, such as

    a TV out card. Output generally refers to the results generated by the system and it is a

    direct source of information to the user. For many end users, output is the main reason for


    developing the system and the basis on which they evaluate the usefulness of the

    application. The objective of a system finds its shape in terms of the output. The output

    of this project tracks and identifies the human facial features (nose tip and eyes).

    2.5 Software Requirements

Tools: Java 5.0, Java Media Framework 2.1.1e
Platform: Windows 98/2000, Windows XP

    2.6 Hardware Requirements

Processor: Pentium III 700 MHz
CPU: 500 MHz
RAM: 128 MB
Monitor: 17 inch Samsung color monitor
Hard Disk: 40 GB
Keyboard: Standard keyboard with 104 keys
Mouse: Serial mouse
Camera: Web camera

    2.7 Project Cycle

The project mainly involves the following steps:

    Face Detection

    Finding the nose tip

    Motion Detection

    Blink Detection

    2.7.1 Face Detection


Here we propose a real-time face detection algorithm using a six-segmented filter, distance information, and a template matching technique. Since the human face is a dynamic object and has a high degree of variability, we propose a method that combines both the feature-based and image-based approaches to detect the point between the eyes by using the SSR filter [1]. The proposed SSR filter, a rectangle divided into 6 segments, operates by using the concept of the bright-dark relation around the BTE [2] area. We select the BTE [2] as the face representative because it is common to most people and easy to find for a wide range of face orientations. The BTE [2] has dark parts (eyes and eyebrows) on both sides, and comparably bright parts on the upper side (forehead) and lower side (nose and cheekbones).

    2.7.2 Finding the nose tip

Now that we have located the eyes, the final step is to find the nose tip. From the following figure we can see that the blue line defines a perfect square through the pupils and the outside corners of the mouth; the nose tip should fall inside this square, so this square becomes our region of interest (ROI) in finding the nose tip. At first we need to locate the nose bridge, and then we will find the nose tip on that bridge. As mentioned earlier, the nose bridge is brighter than the surrounding features, so we will use this criterion to locate the nose-bridge point (NBP) on each line of the ROI. We will be using an SSR filter to locate the NBP candidates in each ROI line. The width of the filter is set to half of the distance between the eyes, because the nose width is equal to half of the distance between the eyes. After calculating the integral image of the ROI, each line of it will be scanned with this filter; we recall that the nose bridge is brighter than the regions to the left and right of it.

    2.7.3 Motion Detection

To detect motion in a certain region, we subtract the pixels in that region from the same pixels of the previous frame. At a given location (x, y), if the absolute value of the difference is larger than a certain threshold, we consider that there is motion at that pixel.
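A minimal sketch of this frame-differencing step is given below; the pixel arrays are assumed to hold gray levels and the names are illustrative.

// mark the pixels whose gray level changed by more than the threshold since the last frame
static boolean[] detectMotion(int[] current, int[] previous, int threshold)
{
    boolean[] motion = new boolean[current.length];
    for (int i = 0; i < current.length; i++)
    {
        motion[i] = Math.abs(current[i] - previous[i]) > threshold;
    }
    return motion;
}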


    2.7.4 Blink Detection

We apply blink detection in the eyes' ROI before finding the eyes' new exact location. The blink detection process is run only if the eye is not moving, because when a person uses the mouse and wants to click, he moves the pointer to the desired location, stops, and then clicks. It is basically the same when using the face: the user moves the pointer with the tip of the nose, stops, then blinks. To detect a blink we apply motion detection in the eyes' ROI; if the number of motion pixels in the ROI is larger than a certain threshold, we consider that a blink was detected, because if the eye is still and we are detecting motion in the eyes' ROI, that means that the eyelid is moving, which means a blink. In order to avoid detecting multiple blinks where there is only a single blink, the user can set the blink length, so all blinks detected within the period of the first detected blink are omitted.
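A minimal sketch of this blink test is shown below, assuming the detectMotion helper sketched above; the suppression of repeated detections during the user-set blink length is left out, and all names are illustrative.

// a blink is reported when the eye is still but many pixels inside its ROI are moving
static boolean isBlink(int[] eyeRoiCurrent, int[] eyeRoiPrevious,
                       boolean eyeIsMoving, int pixelThreshold, int countThreshold)
{
    if (eyeIsMoving)
    {
        return false;                     // blinks are only checked while the eye is not moving
    }
    boolean[] motion = detectMotion(eyeRoiCurrent, eyeRoiPrevious, pixelThreshold);
    int moving = 0;
    for (boolean m : motion)
    {
        if (m)
        {
            moving++;
        }
    }
    // a still eye with many motion pixels means the eyelid moved, i.e. a blink
    return moving > countThreshold;
}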

    CHAPTER 3

    SYSTEM DESIGN

    3.1 Project Description

The skeleton of the entire process is prepared in this phase.
The scheduling and interactivity of the system are completed during design.
The work structure, including the look and feel, is generated for the system.

    3.1.1 Input Design


Input design is the process of converting user-originated inputs to a computer-based format. The input design of the Eye application encompasses various controls such as buttons and menu items. The refresh and stop buttons are used to refresh and stop receiving the video frames from the video input device. The enable interface button is used to deactivate the mouse and act as an interface for interacting with the system. The face detection button is used to detect the face candidates (nose tip and eyes). The menu items, such as File and Help, are used to perform normal operations such as exiting and viewing information about the project.

    3.1.2 Output Design

    Output design generally refers to the results generated by the system and it

    is a direct source of information to the user. For many end users, output is the main

    reason for developing the system and the basis on which they evaluate the usefulness of

    the application. The objective of a system finds its shape in terms of the output.

    The output of this project tracks and identifies the human facial features (nose tip and

    eyes).

    3.1.3 Modules

    Module 1

    Live Video Capture

    Face Detection

    Find Face Candidates

    Module 2

    Extract Between The Eyes

    Find Nose Tip

    Module 3

    Face tracking

    Module 4

    Front end


to the products or develop new solutions, and to implement reliable, embedded eye control capabilities quickly and efficiently, at a nominal cost and without prior eye control knowledge.

There are portable tablets with eye control input, like the Tobii C12 Eye Tablet, which are designed for demanding environments. They provide services in hospitals and other sterile environments. They also provide services in public displays in shopping malls and other public venues, and in assessments.

There are eye control systems like the Tobii T60 which are easy to use for on-screen research and also support a broad range of studies both in and out of the laboratory. There are wide-screen eye control systems like the Tobii T60XL for large stimulus display; they offer high-quality control over wide-screen gaze angles and allow for studies of detailed stimuli.

There are mobile eye control systems like the Tobii Glasses Eye Tracker for real-world research. They have a discreet and ultra-lightweight design, with system-guided procedures and automated data mapping and aggregation.

    3.4 Utilities

The potential of eye controlled computer interaction is present in nearly every real-time application. The main areas are:

    Computer Interaction

Monitoring and Training
Assessments, Diagnostics and Rehabilitation

    3.4.1 Computer Interaction

Eye tracking adds unique ways to enhance computer user interfaces, both to create truly hands-free computers for specific uses and as a part of the multimodal and natural user interfaces of the next generation of standard computers. Eye control adds unique value to natural computer interfaces and can be used in many ways:


    Close to mind reading of the user in real time.

    Reduces the need to explicitly instruct the computer what to do.

    Replicates natural human communication methods.

    Improves speed and efficiency of the interaction and interface.

    3.4.2 Monitoring and Training

The eyes reveal a person's attention, distraction, vigilance and workload in a very direct and clear way. Eye tracking is used in a range of industries to ensure and improve performance, in real time or as a part of simulations and training.

3.4.3 Assessments, Diagnostics and Rehabilitation

Our eyes directly reflect brain activity and cognition. The study of eye movements and gaze patterns can therefore say a lot about the human brain and human

    behavior. Today eye tracking is being used to measure acuity of toddlers, to rehabilitate

    patients in intensive care, to detect if a person is lying or not, and to understand the

    cognitive level of a non-verbal person.

    3.5 Data Flow Diagram


Fig 3.5.1 Data Flow Diagram 1


Fig 3.6.1 Flowchart for eye controlled interaction (stages: SSR filter; stereo camera to find the distance information; average between-the-eyes template; template matching; candidate point detection; between-the-eyes point detection; eye detection)

CHAPTER 4


    Codecs/Effects

    Codecs and Effects are components that take an input stream, apply a

    transformation to it and output it. Codecs may have different input and output

    formats, while Effects are simple transformations of a single input format to an output

    stream of the same format.

    Renderers

    A renderer is similar to a Codec, but the final output is somewhere other than another

    stream. A Video Renderer outputs the final data to the screen, but another kind of

    renderer could output to different hardware, such as a TV out card.

Muxes / Demuxes

Multiplexers and Demultiplexers are used to combine multiple streams into a single stream or vice-versa, respectively. They are useful for creating and reading a package of audio and video for saving to disk as a single file, or for transmitting over a network.

    4.2 Implementation Details

In our project we overcome the limitations of the mouse by using facial features to interact with the computer: the nose tip is tracked and used as the mouse pointer, and left and right eye blinks fire left and right clicks respectively. Our project consists of the following modules.

    4.2.1 Module1

    Live Video Capture

    The first module is to capture the Live Image using the Web Cam and to detect a face

    in the captured Image Segment.


Face Detection Module

Face detection can be categorized as:

Feature Based Method
Image Based Method

Feature Based Method

This technique makes use of apparent properties of the face such as geometry, skin color, and motion. It lacks robustness against head rotation and scaling, and has problems under varying lighting conditions.

Image Based Method

This technique scans the image of interest with a window that looks for faces at all scales and locations. Face detection implies pattern recognition, and achieves it with simple methods such as template matching. It is not suited for real-time applications.

In order to achieve a high-speed and reliable face detection system, we combine these two techniques by using the Six-Segmented Rectangular filter (SSR filter) shown in figure 4.2.1.1.


    Fig 4.2.1.1 SSR Filter

    The sum of pixels in each sector is denoted as S along with the sector number.

    The use of this filter will be explained in detail in the face detection algorithm.

    Find Face Candidates Module

    The human face is governed by proportions that define the different sizes and

    distances between facial features. We will be using these proportions in our heuristics to

    improve facial features detection and tracking.

    Face detection general steps

We will be using feature-based face detection methods to reduce the area in which we are looking for the face, so we can decrease the execution time. To find face candidates, the SSR [1] filter will be used in the following way: at first we calculate the integral image by making one pass over the video frame, using the recurrences sketched below.
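The equations referred to here appear to have been lost with the figure; the sketch below shows the standard one-pass integral image recurrence that matches this description, with illustrative names and a row-major gray-level array.

// one pass over the frame: ii[y][x] = sum of all gray levels above and to the left of (x,y)
static long[][] integralImage(int[] gray, int width, int height)
{
    long[][] ii = new long[height][width];
    for (int y = 0; y < height; y++)
    {
        long rowSum = 0;                              // running sum of the current row
        for (int x = 0; x < width; x++)
        {
            rowSum += gray[y * width + x];
            ii[y][x] = (y > 0 ? ii[y - 1][x] : 0) + rowSum;
        }
    }
    return ii;
}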


    Fig 4.2.1.2 Finding face candidate

    (x, y) is the location of the filter (upper left corner). The plus sign is the center of the filter

    which is the face candidate.

    We can notice that in this ideal position the eyes fall in sectors S1 and S3, while

    the nose falls in sector S5. Since the eyes and eye brows are darker than the BTE and the

cheek bones, we deduce that:

S1 < S2 && S2 > S3 (5.1)
S1 < S4 && S3 < S6 (5.2)

So in order to find face candidates we place the upper left corner of the SSR filter on each pixel of the image (only on pixels where the filter falls entirely inside the bounds of the image). For each location (x, y) we check equations (5.1) and (5.2); if both conditions are fulfilled, the center of the filter is considered a face candidate (subject to skin pixel thresholds). Eventually the candidates group into clusters.
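A minimal sketch of this candidate test is shown below: the filter's upper left corner is placed at (x, y), the sector sums are read from the integral image with the sectorSum helper sketched in chapter 2, and the bright-dark relations (5.1) and (5.2) are checked. The sector layout and names are illustrative.

static boolean isFaceCandidate(long[][] ii, int x, int y, int w, int h)
{
    int cw = w / 3, ch = h / 2;                               // each sector is a third wide, half high
    long s1 = sectorSum(ii, x,          y,      x + cw,     y + ch);
    long s2 = sectorSum(ii, x + cw,     y,      x + 2 * cw, y + ch);
    long s3 = sectorSum(ii, x + 2 * cw, y,      x + w,      y + ch);
    long s4 = sectorSum(ii, x,          y + ch, x + cw,     y + h);
    long s6 = sectorSum(ii, x + 2 * cw, y + ch, x + w,      y + h);
    return (s1 < s2 && s2 > s3)                               // eyes darker than the BTE      (5.1)
        && (s1 < s4 && s3 < s6);                              // eyes darker than the cheeks   (5.2)
}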

    4.2.2 Module 2

    Extract BTE (Between the Eyes) Module

In order to extract BTE templates we need to locate pupil candidates. The left and right pupil candidates will fall in sectors S1 and S3 respectively. For each of these sectors the


following steps are applied to find the pupils. To find the pixels that belong to a dark area (in our case the pupil), the sector is binarized with a certain threshold, and the clusters of the binarized sector are found.

1. If the thresholding produces only one cluster, as in sector S3 (right eye), there is a large probability that the clusters of the eyebrow and the pupil have been unified into one, so in this case it is best to look for the pupil in the lower half of the sector, because as we already mentioned the upper half is probably the eyebrow. So we calculate the area of the part of the cluster that lies in the lower half of the sector; if it is larger than a specified threshold then the center of the lower part is considered the pupil. If not, the same is applied to the upper half, because it is possible that the cluster that was found is the pupil cluster alone (without the eyebrow cluster) and it fell in the upper half. If we fail to find a pupil in the upper half as well, this sector is omitted and no pupil is found.

2. If there are multiple clusters, as in sector S1 (left eye), we need to find the cluster that is:

The largest: some clusters are the result of video noise and they are small.
The darkest: the pupil is darker than other features in the sector.
The closest to the darkest pixel of the sector: the eye pupil is black and we need to find the cluster that contains the pupil or is the closest to it.

These criteria are applied in the lower half of the sector first, because as already mentioned we need to avoid picking the eyebrow cluster as the pupil. If the chosen cluster is larger than a certain threshold, its center is considered the pupil; if not, we do the same in the upper half. If we fail to find a pupil in the upper half, this sector is omitted and no pupil is found. If we did not find a left or right pupil candidate, the cluster is skipped and no longer considered as a face candidate.
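The selection rules above can be sketched as follows, assuming the clusters of the binarized sector have already been labeled; the Cluster fields and the combined score are purely illustrative and stand in for the project's own connected-components code.

static class Cluster
{
    int area;              // number of pixels in the cluster
    int meanGray;          // average gray level (smaller = darker)
    double distToDarkest;  // distance of the cluster centre to the darkest pixel of the sector
    boolean inLowerHalf;   // true when the cluster centre lies in the lower half of the sector
}

// returns the chosen pupil cluster, or null when the sector has to be omitted
static Cluster pickPupil(java.util.List<Cluster> clusters, int minArea)
{
    // try the lower half first (to avoid the eyebrow), then fall back to the upper half
    for (boolean lower : new boolean[] { true, false })
    {
        Cluster best = null;
        for (Cluster c : clusters)
        {
            if (c.inLowerHalf != lower || c.area < minArea)
            {
                continue;
            }
            if (best == null || score(c) > score(best))
            {
                best = c;
            }
        }
        if (best != null)
        {
            return best;
        }
    }
    return null;
}

// crude combined score standing in for "largest, darkest, closest"
static double score(Cluster c)
{
    return c.area - c.meanGray - c.distToDarkest;
}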

    Find Nose Tip Module

Now that we have located the eyes, the final step is to find the nose tip. From the following

    figure we can see that the blue line defines a perfect square of the pupils and outside


    corners of the mouth; the nose tip should fall inside this square, so this square becomes

    our region of interest in finding the nose tip.

    Fig 4.2.2.1 Location of ROI

So the first step is to extract the ROI [4]; in case the face was rotated, we need to rotate the ROI [4] back to a horizontal alignment of the eyes. The nose tip has a convex shape, so it collects more light than other features in the ROI because it is closer to the light source. Using this idea we tried to locate the nose tip with intensity profiles. In the horizontal intensity profile we add vertically to each line the values of the lines that precede it in the ROI [4]; since the nose bridge is brighter than the surrounding features, the values should accumulate faster at the bridge location; in other words the maximum value of the horizontal profile gives us the x coordinate of the nose tip. In the vertical intensity profile we add horizontally to each column the values of the columns that precede it in the ROI [4]; the same as in the horizontal profile, the values accumulate faster at the nose tip position, so the maximum value gives us the y coordinate of the nose tip. From both the horizontal and vertical profiles we were able to locate the nose tip position, but unfortunately this method did not give accurate results, because there might be several maximum values in a profile that are close to each other, and choosing the correct maximum value that really points out the coordinates of the nose tip is a difficult task.

    So instead of using the intensity profiles alone, we will be applying the following

    method.


    At first we need to locate the nose bridge and then we will find the nose tip on that

    bridge. As earlier mentioned the nose bridge is brighter than surrounding features, so we

    will use this criterion to locate the nose-bridge-point (NBP) on each line of the ROI.

    We will be using an SSR filter to locate the NBP [5] candidates in each ROI [4] line.

The width of the filter is set to half of the distance between the eyes, because we can notice that the nose width is equal to half of the distance between the eyes.

    SSR filter to locate nose bridge candidates

    After calculating the integral image of the ROI [4], each line of it will be scanned

    with this filter; we remember that the nose bridge is brighter than the regions to the left

    and right of it; in other words the center of the SSR filter is considered as an NBP [5]

    candidate if the center sector is brighter than the side sectors:

    S2 > S1 (5.3)

    S2 > S3 (5.4)

In each line we might get several NBP candidates, so the final NBP will be the candidate that has the brightest S2 sector. In order to avoid picking some bright video noise as the NBP [5], we will be using the horizontal intensity profile: instead of applying the SSR [1] filter to a line of the ROI, we apply it to the horizontal profile calculated from the first line to the line that we are dealing with. As already mentioned, the values accumulate faster at the nose bridge location, so by using the horizontal profile we are sure that we are picking the right NBP candidate and not some bright point caused by noise. Of course the results get more accurate as we reach the last line of the ROI, because the accumulation at the nose bridge location becomes more obvious.
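A minimal sketch of this per-line scan is given below: a three-sector filter slides along the accumulated horizontal profile, keeps positions satisfying (5.3) and (5.4), and returns the one with the brightest centre sector. The profile array, the filter width (half the eye distance) and the names are illustrative.

static int findNoseBridgePoint(long[] profile, int filterWidth)
{
    int sector = filterWidth / 3;
    if (sector <= 0)
    {
        return -1;
    }
    int bestX = -1;
    long bestS2 = Long.MIN_VALUE;
    for (int x = 0; x + 3 * sector <= profile.length; x++)
    {
        long s1 = sum(profile, x, x + sector);
        long s2 = sum(profile, x + sector, x + 2 * sector);
        long s3 = sum(profile, x + 2 * sector, x + 3 * sector);
        if (s2 > s1 && s2 > s3 && s2 > bestS2)      // (5.3), (5.4) and "brightest centre wins"
        {
            bestS2 = s2;
            bestX = x + filterWidth / 2;            // the candidate is the centre of the filter
        }
    }
    return bestX;                                   // -1 when no candidate exists on this line
}

static long sum(long[] a, int from, int to)
{
    long s = 0;
    for (int i = from; i < to; i++)
    {
        s += a[i];
    }
    return s;
}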


    Fig 4.2.2.2 Nose bridge detection

The ROI is enlarged only for clearer viewing. Now that we have located the nose bridge, we need to find the nose tip on that bridge. Since each NBP represents the brightest S2 sector on the line it belongs to, and that S2 sector contains the accumulated vertical sum of the intensities in that sector from the first line to the line it belongs to, we will be using this information to locate the nose tip. The nostrils are dark areas, and the portion they add to the accumulated sum in the horizontal profile is smaller than the contribution of other areas. In other words, each NBP adds with its S2 sector a certain amount to the accumulated sum in the horizontal profile, but the NBP at the nostrils' location adds a smaller amount (the S2 sector of the NBP at the nostrils' location is darker than the S2 sectors of other NBPs). So if we calculate the first derivative of the values of the NBPs' S2 sectors (the first derivative of the maximum value of the horizontal profile at each ROI line), we will notice a local minimum at the nostrils' location. By locating this local minimum we take the NBP that corresponds to it as the nostrils' location, and the next step is to look for the nose tip above the nostrils.

Since the nose tip is brighter than other features, it contributes with its S2 sector to the accumulated sum more than other NBPs [5], which means a local maximum in the first derivative; so the location of the nose tip is the location of the NBP that corresponds to the local maximum that lies above the local minimum in the first derivative.
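A compact sketch of this derivative step follows: first differences of the chosen NBPs' S2 values are taken down the ROI, the strongest minimum marks the nostrils, and the maximum above it marks the nose tip line. The array layout and names are illustrative.

// nbpS2[i] is the S2 value of the NBP chosen on ROI line i; returns the ROI line of the nose tip
static int noseTipLine(long[] nbpS2)
{
    int n = nbpS2.length;
    long[] diff = new long[n - 1];
    for (int i = 1; i < n; i++)
    {
        diff[i - 1] = nbpS2[i] - nbpS2[i - 1];      // first derivative of the S2 values
    }
    int nostrils = 0;
    for (int i = 1; i < diff.length; i++)           // the minimum marks the nostrils line
    {
        if (diff[i] < diff[nostrils])
        {
            nostrils = i;
        }
    }
    int tip = 0;
    for (int i = 1; i < nostrils; i++)              // the maximum above the nostrils marks the tip
    {
        if (diff[i] > diff[tip])
        {
            tip = i;
        }
    }
    return tip + 1;                                 // diff[i] refers to line i + 1 of the ROI
}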

    4.2.3 Module 3

    Face Tracking Module


    The nose tip is tracked to use its movement and coordinates as the movement and

    coordinates of the mouse pointer. The eyes are tracked to detect their blinks, where the

    blink becomes the mouse click.

The tracking process is based on predicting the place of the feature in the current frame from its location in previous ones; template matching and some heuristics are applied to locate the feature's new coordinates.
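The matching cost is not spelled out here, but SSD (sum of squared differences) is listed among the abbreviations, so the sketch below uses it: the template is slid over a small search window around the feature's previous position and the location with the smallest SSD wins. Array layout and names are illustrative.

static java.awt.Point matchTemplate(int[] frame, int fw, int fh,
                                    int[] tmpl, int tw, int th,
                                    java.awt.Point previous, int search)
{
    long bestSsd = Long.MAX_VALUE;
    java.awt.Point best = new java.awt.Point(previous);
    for (int y = Math.max(0, previous.y - search); y <= Math.min(fh - th, previous.y + search); y++)
    {
        for (int x = Math.max(0, previous.x - search); x <= Math.min(fw - tw, previous.x + search); x++)
        {
            long ssd = 0;
            for (int j = 0; j < th; j++)
            {
                for (int i = 0; i < tw; i++)
                {
                    long d = frame[(y + j) * fw + (x + i)] - tmpl[j * tw + i];
                    ssd += d * d;                   // accumulate the squared differences
                }
            }
            if (ssd < bestSsd)                      // keep the best (smallest) cost so far
            {
                bestSsd = ssd;
                best.setLocation(x, y);
            }
        }
    }
    return best;                                    // predicted upper-left corner of the feature
}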

    4.2.4 Module 4

    Front End

This lightweight eye controlled computer has a front end which is created using Swing and AWT. Java Swing has many built-in APIs for creating the front end. The front end of the eye controlled computer provides a good user interface.

    Swing is a GUI toolkit for Java. It is one part of the Java Foundation Classes

    (JFC). Swing includes graphical user interface (GUI) widgets such as text boxes, buttons,

    split-panes, and tables. Swing widgets provide more sophisticated GUI components than

    the earlier Abstract Window Toolkit. Since they are written in pure Java, they run the

    same on all platforms, unlike the AWT which is tied to the underlying platform's

    windowing system. Swing supports pluggable look and feel not by using the native

    platform's facilities, but by roughly emulating them. This means you can get any

    supported look and feel on any platform. The disadvantage of lightweight components is

    slower execution. The advantage is uniform behavior on all platforms.
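As a small illustration of the pluggable look and feel (a sketch, not part of the project's front end), the cross-platform look and feel can be selected before any components are created:

import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.UIManager;

public class LookAndFeelDemo
{
    public static void main(String[] args) throws Exception
    {
        // the cross-platform ("Metal") look and feel renders the same on every platform;
        // UIManager.getSystemLookAndFeelClassName() would emulate the native platform instead
        UIManager.setLookAndFeel(UIManager.getCrossPlatformLookAndFeelClassName());
        JFrame frame = new JFrame("Look and feel demo");
        frame.getContentPane().add(new JButton("Enable Interface"));
        frame.pack();
        frame.setVisible(true);
    }
}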

Advantages of Swing:

Swing is a platform-independent, Model-View-Controller GUI framework for Java. It follows a single-threaded programming model, and possesses the following traits:

    Platform independence: Swing is platform independent both in terms of its

    expression (Java) and its implementation (non-native universal rendering of widgets).


    Extensibility: Swing is a highly partitioned architecture which allows for the

    'plugging' of various custom implementations of specified framework interfaces.

    Component-Oriented : Swing is a component-based framework. The distinction

    between objects and components is a fairly subtle point: concisely, a component is a

    well-behaved object with a known/specified characteristic pattern of behavior. Swing

    objects asynchronously fire events, have 'bound' properties, and respond to a well

    known set of commands (specific to the component.) Specifically, Swing components

    are Java Beans components, compliant with the Java Beans Component Architecture

    specifications.

    Text Fields

Text fields are the basic text handling components of the AWT. These components handle a one-dimensional string of text: they let you display text, let the user enter text, allow you to take passwords by masking typed text, and allow you to read the text the user has entered. They are among the most fundamental AWT components.

    Buttons

Buttons provide users with a quick way to start some action - all they have to do is click them. Every user is familiar with buttons, and we have already taken a look behind the scenes at how buttons work in code when discussing event handling. You can give buttons a caption, such as "Click Me!"; when the user does click the button, your code is notified, provided you have registered to handle events from the button.

    Checkboxes

Checkboxes are much like buttons, except that they are dual state, which means they can appear as selected or unselected. When selected, they display a visual indication of some kind, such as a checkmark or an x (it varies by operating system in AWT programming, which is one of the reasons Sun introduced Swing, which can display components with the same look across many operating systems). The user can check a check box to select an option of some kind (such as the items on a sandwich), to enable automatic spell checking, or to enable printing while he is doing something else. You use checkboxes to let the user select nonexclusive options; for example, both automatic spell checking and background printing may be enabled at the same time.

    Radio buttons

You let the user select one of a set of mutually exclusive options using radio buttons. Only one of a set of option buttons can be selected at a time; for example, you can let the user select a printing color or the day of the week. In AWT, radio buttons are actually a type of checkbox, and when selected, they display a round dot, a clicked square or some other indication (again, the visual indication depends on the operating system). You use radio buttons in groups.

    Layouts

We have just added components to applets and applications using the add method, which is actually a method of the default layout manager - the flow layout manager, which is responsible for arranging components in AWT applets by default. The flow layout manager arranges components much like a word processor might arrange words - across the page and then wrapped to the next line as needed, creating what Sun calls a flow of components. You can customize flow layouts to some extent, such as left-, center-, or right-aligning components. However, the limitations of flow layouts are clear, especially if you are counting on components maintaining some position with respect to others, because if the user resizes your applet or application, the components will all move around. On the other hand, there are other AWT layout managers (and quite a few new ones in Swing), and we will cover the AWT grid, border, card, and grid bag layout managers.

You can also size and locate components yourself, using add to display them, like this:

setLayout(null);
TextField text1 = new TextField(20);
text1.setSize(200, 50);
text1.setLocation(20, 20);
add(text1);

This adds a text field of size (200, 50) at location (20, 20) in a container such as an applet window. Therefore, you can add components to containers without any layout manager at all, something that is useful to bear in mind if the AWT layouts frustrate you too much.

One very useful AWT container to use with layouts is the panel component: you can arrange components in a panel and then add the panel, itself, to the layout of an applet or application.
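A small sketch of that idea follows: the controls are grouped in a Panel with its own flow layout, and the panel itself is then placed in the frame's border layout. The button captions echo the project's interface, but the class is illustrative.

import java.awt.BorderLayout;
import java.awt.Button;
import java.awt.FlowLayout;
import java.awt.Frame;
import java.awt.Panel;
import java.awt.TextField;

public class PanelDemo
{
    public static void main(String[] args)
    {
        Frame frame = new Frame("Panel demo");
        frame.setLayout(new BorderLayout());

        Panel controls = new Panel(new FlowLayout(FlowLayout.LEFT));   // the panel arranges the buttons
        controls.add(new Button("Detect Face"));
        controls.add(new Button("Enable Interface"));

        frame.add(controls, BorderLayout.NORTH);                       // the panel, not each button, is laid out
        frame.add(new TextField("status"), BorderLayout.SOUTH);
        frame.setSize(400, 200);
        frame.setVisible(true);
    }
}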

    4.3 IMPLEMENTATION FLOW CHARTS OF CODING

Application.java

    import javax.swing.UIManager;

    import java.awt.*;

    import c.WaitFrame;

    public class Application

    {

    boolean packFrame = false;

    public Application()

    {

    WaitFrame wFrame = new WaitFrame();

    Dimension screenSize = Toolkit.getDefaultToolkit().getScreenSize();


    wFrame.setLocation((screenSize.width - wFrame.getWidth()) / 8, (screenSize.height -

    wFrame.getHeight()) / 8);

    wFrame.setUndecorated(true);

wFrame.getContentPane().setCursor(Cursor.getPredefinedCursor(Cursor.WAIT_CURSOR));

    wFrame.show();

    Frame frame = new Frame(wFrame);

    if (packFrame)

    {

    frame.pack();

    }

    else

    {

    frame.validate();

    }

    Dimension frameSize = frame.getSize();

    if (frameSize.height < screenSize.height)

    {

    frameSize.height = screenSize.height;

    }

    if (frameSize.width < screenSize.width)

    {

    frameSize.width = screenSize.width;

    }

    frame.setLocation((screenSize.width - frameSize.width)/2, (screenSize.height -

    frameSize.height)/2 );

    frame.setVisible(true);

    }

public static void main(String[] args)
{
new Application();
}

    public int findClustersLabels(int x,int y,int width)

    {

    int w,nw,n,ne;

    if(y == 0)

    {

    if(x == 0)

    {

    label++;

    clustersMembers[x][y] = label;

    labels[label] = Integer.MIN_VALUE;

    }

    else

    {

    if(clustersMembers[x-1][y] == 0)

    {

    label++;

    clustersMembers[x][y] = label;

    labels[label] = Integer.MIN_VALUE;

    }

    else

    clustersMembers[x][y] = clustersMembers[x-1][y];

    }

    }

    labels[label+1] = Integer.MAX_VALUE; //a flag that shows where the array ends

    return label;

    }

    //////////////////////////////////////////////////


    FaceDetector.java

    private int[] detectFaces(SSRFilter filter)

    {

    int label = 0;

    for (int i = 0; i < fWidth; i++)

    for (int j = 0; j < fHeight; j++)

    clustersMembers[i][j] = 0;

    ConnectedComponents CC = new ConnectedComponents(labels, clustersMembers);

    Point noseTip = findNoseTip(face);

    if( noseTip != null )

    {

    int coordinates[] = new int[2];


    break;

    }

    else

    break;

    }

    return gradiants;

    }

    //////////////////////////////////////////////////

    private Point findNoseTip(int face[])

    {

    int x1,y1,x2,y2,xLen,yLen,length,step,sX,sY,cX,cY;

    double slope;

    x1 = face[0];

    y1 = face[1];

    x2 = face[2];

    y2 = face[3];

    xLen = x2-x1;

    yLen = y2-y1;

    length = (int)Math.sqrt(Math.pow(xLen,2)+Math.pow(yLen,2));

    step = Math.abs(yLen != 0 ? xLen/yLen : xLen);

    step = ( step < 3 ? 3 : step );

    slope = ( yLen < 0 ? -1 : 1);

    int ROI[] = new int[length*length];

    sX = x1;

    sY = y1;

    for (int y = 0; y < length; y++) //extract face 'Region Of Interest'

    {

    cX = sX;

    cY = sY;

    for (int x = 0; x < length; x++)

    {

    if ( (cX >= 0) && (cX < fWidth) && (cY < fHeight))


    ROI[y * length + x] = grayPixels[cY * fWidth + cX];

    else

    ROI[y * length + x] = ROI[(y-1)*length+x];

    cX++;

    if ( (x + 1) % step == 0)

    cY = (int) (cY + slope);

    }

    if ( (y + 1) % step == 0)

    sX = (int) (sX - slope);

    sY++;

    }

    Point candidates[];

    if( candidates != null )

    {

    int gradiants[] = calculateGradiants(candidates, length);

    int uMin1 = Integer.MAX_VALUE,uMin2 = Integer.MAX_VALUE,

    lMin1 = Integer.MAX_VALUE,lMin2 = Integer.MAX_VALUE;

    int lMinGrad,lMinIndex,uMinGrad,uMinIndex,minGrad,minIndex;

    int uInd1=0,uInd2=0,lInd1=0,lInd2=0,gLength;

    gLength = gradiants.length;

for(int i=0 ; i<gLength/2 ; i++)

    if( gradiants[i] < lMin1 )

    {

    lMin1 = gradiants[i];

    lInd1 = i;

    }

for(int i=3*gLength/4 ; i<gLength ; i++)
if( gradiants[i] < lMin2 )
{
lMin2 = gradiants[i];
lInd2 = i;
}
if( (double)lMin1/(double)lMin2 >= 0.5 )

    {

    lMinGrad = lMin1;

    lMinIndex = lInd1;

    }

    else

    {

    lMinGrad = lMin2;

    lMinIndex = lInd2;

    }

    if( (double)uMin1/(double)uMin2 >= 0.5 )

    {

    uMinGrad = uMin2;

    uMinIndex = uInd2;

    }

    else

    {

    uMinGrad = uMin1;

    uMinIndex = uInd1;

    }

    if( (double)uMinGrad/(double)lMinGrad >= 0.5 )

    {

    minGrad = lMinGrad;


    minIndex = lMinIndex;

    }

    else

    {

    minGrad = uMinGrad;

    minIndex = uMinIndex;

    }

    int start;

    if( minIndex >= gLength/2)

    {

    if (minIndex < 3 * gLength / 4)

    start = gLength / 2;

    else

    start = 3 * gLength / 4;

    }

    else

    start = 0;

    int max = 0,index = 0;

for( int i=start ; i<gLength ; i++ )
if( (gradiants[i] > max) && (gradiants[i] != Integer.MAX_VALUE) )

    {

    max = gradiants[i];

    index = i;

    }

    if( candidates[index] == null )

    return null;

    Point noseTip = new Point((int)candidates[index].getX(),index);

    slope = (double)yLen / (double)xLen;

    double angle = Math.atan(slope);

    double x = Math.cos(angle)*noseTip.getX()-Math.sin(angle)*noseTip.getY();

    double y = Math.sin(angle)*noseTip.getX()+Math.cos(angle)*noseTip.getY();

    x += face[0];

    y += face[1];

    noseTip.setLocation(x,y);


    return noseTip;

    }

    return null;

    }

    CHAPTER 5


    TESTING

Testing is a process used to help identify the correctness, completeness and quality of developed computer software. With that in mind, testing can never completely establish the correctness of computer software; it can only find the faults in what we have built in our project.

Testing helps in verifying and validating that the software is working as it is intended to work. This involves using static and dynamic methodologies to test the application.

Software spends more time being maintained than being developed. Things that were obvious in development are easily obscured. Fixing bugs is easier than finding them and making sure nothing else breaks. The real cost of software is programmer time and effort, not length of code, licenses or required hardware.

There are mainly four types of testing: acceptance testing, integration testing, performance testing and functional testing.

5.1 Scope

Recent improvements in eye control technology in terms of accuracy permit, on the one hand, the design of more elaborate and sophisticated tasks for the eye, such as zooming or scrolling, and on the other, proposing gaze interaction as an alternative to other interaction techniques such as switches or head pointing, which most of the time require more physical effort from the user. This implies two interesting reflections for researchers.

The eye is not a mouse, and we should design eye controlled methods to carry out specific tasks. This involves analyzing eye movements, detecting the eye's potential for selected actions, and improving interface design to incorporate new actions in the most natural and effective way for the eye.

Evaluating the eye as an interaction tool. Although the eye seems to be a quicker way of communication, it is necessary to carry out systematic comparative studies with other interaction techniques to confirm this. This would permit showing the cases where


the eye performs better, detecting eye interaction defects and proposing methods to compensate for these problems.

    5.2 Functional Testing

This testing involves live video capture, to test whether the live video is captured in the captured image segment. The steps involve installing the webcam and initializing the JMF.

The result is the detection of the captured live video and of all the devices connected to the system which perform eye controlled computer interaction.

    5.3 Integration Testing

This testing involves face detection, to test whether the face is detected. Here we click on the detect face button to detect the face and its facial features.

The result is the detection of the face candidates and facial features like the nose tip, eyebrows and eyes. They are detected by using the SSR filter.

    5.4 Performance Testing

This testing involves face tracking, to track the face movements while changing the location of the face. Here, movement of the captured face is necessary.

The facial features are expected to be detected while the location of the face changes. In the actual results, however, the facial features are not exactly detected.

    5.5 Acceptance Testing


This is the final step, in which the accepted input is used for interaction through the enable interface button. Here we click the enable interface button, after which the mouse movements are replaced with the facial features to interact with the computer.

The result is that the mouse movements are replaced with the facial features after clicking on the enable interface button.

    CHAPTER 6

    RESULTS

    6.1 SNAPSHOTS OF PROJECT

    6.1.1 Login GUI


    6.1.2 Detecting the Face


    6.1.3 Face Tracking


    6.1.4 Enabled Visage


6.2 OUTPUT FOR DIFFERENT SAMPLE TEST DATA


Fig 6.2.1: Sample frames from sessions testing alternate positions of the camera. The system still works accurately with the camera placed well below the user's face, as well as with the camera rotated as much as about 45 degrees.


    CHAPTER 7

    CONCLUSION

    7.1 Conclusion

We have successfully implemented the project. Thus we have developed a project capable of replacing the mouse with facial features to interact with the computer.

The main advantage of this project is that only a simple webcam with moderate resolution is sufficient to capture the live video. People with hand disabilities can use this project effectively.

The drawback of this project is that involuntary eye blinks also trigger mouse clicks. It may also lead to eye strain due to continuous blinking.

The system proposed in this project provides a binary switch input alternative for people with disabilities. However, some significant improvements and contributions were made over such predecessor systems.

    7.2 Future Enhancement

Involuntary eye blinks could be detected and prevented from firing click events.

The sensitivity of nose movement is so high that even a small motion induces a large mouse movement. Higher frame rates and finer camera resolutions could lead to more robust eye detection that is less restrictive to any user, while increased processing power could be used to enhance the tracking algorithm to more accurately follow the user's eye and recover more gracefully when it is lost.

The ease of use and potential for rapid input that this system provides could be used to enhance productivity by incorporating it to generate input for tasks in general programs.


    APPENDIX

ACRONYMS AND ABBREVIATIONS

BTE: Between the Eyes


HCI: Human Computer Interaction
JMF: Java Media Framework
SSD: Sum of Squared Differences
SSR: Six-Segmented Rectangular Filter
NBP: Nose Bridge Point
GUI: Graphical User Interface
AWT: Abstract Window Toolkit
JAR: Java Archive File

    BIBLIOGRAPHY

    REFERENCES


[1] E. Hjelmas and B. K. Low (2001), Face Detection: A Survey, Computer Vision and Image Understanding.
[2] S. Kawato and J. Ohya (October 2000), Two-step Approach for Real-Time Eye Tracking with a New Filtering Technique, IEEE Int. Conf. on Systems, Man & Cybernetics, Nashville, Tennessee, USA.

    URLs:

http://www.wikipedia.org/
http://www.sourceforge.net/