
Characters Extraction for Traffic Sign Destination Boards in Video and Still Images
Qiu Peng
2010.9.30
Master Thesis, Computer Engineering, Nr: E3986D



  • DEGREE PROJECT
    Computer Engineering

    Programme: Masters Programme in Computer Engineering - Applied Artificial Intelligence
    Reg number: E3986D
    Extent: 15 ECTS
    Name of student: Qiu Peng
    Year-Month-Day: 2010.9.30
    Supervisor: Hasan Fleyeh
    Company/Department: Computer Science
    Supervisor at the Company/Department: Hasan Fleyeh

    Title: Recognition of characters on the destination board
    Keywords: RGB, HSV, extract character, SVM

    Abstract

    Traffic control signs and destination boards on roadways offer significant information for
    drivers. Regulatory signs govern things like speed and turns; warning signs alert drivers to
    conditions ahead to help them avoid accidents; destination signs show distances and directions
    to various locations; service signs display the locations of hospitals, petrol stations, rest
    areas, and so on. Because these signs are so important, and because there is always a certain
    distance between them and the driver, drivers should get this information clearly and easily
    even in bad weather or other difficult situations. The idea is to develop software that
    collects images from a camera mounted at the front of a moving car, extracts the important
    information, and finally shows it to the driver. For example, when a frame contains a
    destination board with text such as "Linkoping 50", the software should extract every
    character of "Linkoping 50" and compare each one with the known character data in the
    database; if an extracted character matches, say, "k" in the database, it contributes to
    recognizing the destination name, which is then shown to the driver. In this project C++ is
    used to write the code for this software.

  • ACKNOWLEDGMENT

    First, I would like to thank my advisor, Mr. Hasan Fleyeh. The decision he made last fall to
    take me as one of his graduate assistants gave me the opportunity to do the research I am
    interested in. I am grateful for his support and advice ever since. He creates a wonderful
    and dynamic environment for me to learn in and gives me the freedom to explore interesting
    problems in the fields of Computer Vision and Digital Image Processing.

  • TABLE OF CONTENTS

    1. Chapter 1 Introduction .......... 3
      1.1 The background .......... 4
      1.2 Application of road sign recognition system .......... 4
      1.3 Aim .......... 5
      1.4 Contents arranged .......... 5
    2. Chapter 2 Image processing theory .......... 7
      2.1 Image acquisition .......... 8
      2.2 The HSV color model .......... 8
        2.2.1 Theory details .......... 8
        2.2.2 HSV color model definition .......... 9
      2.3 Image segmentation .......... 10
      2.4 Shadow and highlight invariant color segmentation .......... 10
        2.4.1 Theory details .......... 11
      2.5 The noise problem .......... 11
        2.5.1 Problem with noise filters .......... 11
    3. Chapter 3 Support vector machine .......... 15
      3.1 Introduction .......... 16
      3.2 Machine learning .......... 16
      3.3 Statistical learning theory .......... 16
      3.4 Support vector machine .......... 17
      3.5 Two situations .......... 18
        3.5.1 Linear separable problem .......... 18
        3.5.2 Non-linear separable problem .......... 20
      3.6 Kernel function .......... 21
      3.7 Use kernel function to solve non-linear problem .......... 21
    4. Chapter 4 The implementation .......... 22
      4.1 Real time traffic signs recognition flowchart .......... 24
      4.2 Application components .......... 25
        4.2.1 Implementation of background extraction module .......... 25
        4.2.2 Apply shadow and highlight invariant segmentation algorithm .......... 29
        4.2.3 Algorithm implementation .......... 30
        4.2.4 Extraction area implementation .......... 31
        4.2.5 Second time image processing module .......... 34
        4.2.6 Character extraction module implementation .......... 37
        4.2.7 Training and testing with SVM .......... 42
    5. Chapter 5 Analysis and result .......... 47
      5.1 Analysis of the application .......... 48
        5.1.1 Analysis of the character extraction part .......... 48
        5.1.2 Analysis of HSV color model image .......... 48
        5.1.3 Analysis of the character extraction algorithm .......... 48
        5.1.4 Analysis of the noise filter algorithm .......... 50
        5.1.5 Result .......... 51
        5.1.6 Character recognition .......... 66
        5.1.7 SVM applied here .......... 66
        5.1.8 Test with linear function .......... 68
        5.1.9 Test with polynomial function .......... 70
        5.1.10 Test with RBF function .......... 72
        5.1.11 Test with sigmoid .......... 74
    6. Chapter 6 Conclusion and future works .......... 76
      6.1 Conclusion .......... 77
      6.2 Future works .......... 77

  • Qiu Peng, Masters, September 2010, E3986D
    Dalarna University, Röda vägen 3, S-781 88 Borlänge, Sweden
    Tel: +46(0)23 7780000, Fax: +46(0)23 778080, http://www.du.se


    Chapter 1 Introduction


    1.1 The background

    We drive in an environment full of traffic signs and city names. These signs play an
    important role in regulating traffic and warn drivers to refrain from certain actions for
    their own safety and the safety of their passengers.

    Road signs use colors, shapes, and markings to communicate messages to drivers on the road.
    Without such information the motion of traffic would be disorderly and unpredictable. It is
    crucial for drivers to identify road signs at the right time and in the right place, but at
    times when everything is expected to be perfect, from others of course, we tend to forget the
    inherent imperfection of mankind. Noticing these safety precaution signs on the road greatly
    depends on the physical and mental health of the drivers. Their visual perception ability can
    be affected by stress, tension, and physical illness, and sometimes the problem is simply a
    lack of knowledge about road signs. According to a recent poll conducted by the motoring
    website New Car Net, one in three motorists fails to recognize even the most basic road signs.
    It is because of these reasons that autonomous, robust, real-time road sign recognition
    systems have gained interest over the last two decades. The very first paper appeared in 1984
    and aimed at testing various computer vision methods for the detection of objects in outdoor
    scenes. Since then many research groups and companies have been interested and have conducted
    research in the field. Computer vision has been applied to a wide variety of intelligent
    transport systems (ITS) [1] such as traffic monitoring systems, traffic-related parameter
    estimation, and intelligent vehicles, and an important part of intelligent vehicles is the
    detection and recognition of road signs. A robust, real-time, automatic road sign detection
    and recognition system can really support and disburden drivers by giving information at the
    right time; it can increase driving efficiency, save lives, and provide driving comfort.

    1.2 Application of Road Sign Recognition System

    Road Sign Recognition is a field of applied computer vision research concerned with the
    automatic detection and classification of road signs in traffic scene images acquired from a
    moving car. The result of this research effort will be a subsystem of a Driver Support System
    (DSS). The aim is to provide the DSS with the ability to understand its neighboring
    environment and so permit advanced driver support such as collision prediction and avoidance.

    Employing computer vision technology in smart vehicle design calls for consideration of all
    its advantages and disadvantages. Firstly, a vision subsystem incorporated into the DSS may
    exploit all the information processed by human drivers without any requirement for new
    traffic infrastructure devices (a very hard and expensive task). Smart cars equipped with
    vision-based systems will be able to adapt themselves to operate in different countries (with
    often quite dissimilar traffic devices).

    As the integration of various technologies in the field of traffic engineering (ITS) has been
    introduced, the convenience of using computer vision has become more obvious. We may observe
    this trend, e.g., in the proceedings of the annual IEEE International Conference on
    Intelligent Vehicles (IVS): more than 50% of the papers are focused on image processing and
    computer vision methods.


    Obviously, there also exist disadvantages of the vision-based approach. Smart vehicles will
    operate in real traffic conditions on the road, so the algorithms must be robust enough to
    give good results even under adverse illumination and weather conditions. Although this
    system property may seem easy to solve, it is the real challenge for algorithm developers.
    For example, Fridtjof Stein, main project manager of the Cleopatra project (Clusters of
    embedded parallel time-critical applications), said that "reliable optical detection is the
    biggest hurdle the project must overcome".

    Absolute system reliability cannot be assured, and the system will not be "fail-safe",
    because of the definition of an individual transportation system. The aim is to provide a
    level of safety similar to or higher than that of human drivers. For example, the system
    could assist drivers with signs they did not recognize before passing them. Specifically,
    speed limit sign recognition could present the driver with the current speed limit as well
    as give an alert if the car is driven faster than that limit.

    In the future, autonomous vehicles will have to be controlled by automatic road sign
    recognition. As with any vehicle, an autonomous vehicle driving on public roads must obey
    the rules of the road. Many of these rules are conveyed through road signs, so an autonomous
    vehicle must be able to detect and recognize signs and change its behavior accordingly.

    1.3 Aim

    The aim of this research project is to present an Intelligent Road Sign Recognition System
    based on a state-of-the-art technique, the Support Vector Machine, together with image
    processing skills. The project is an extension of an already known system that can recognize
    traffic signs. The application extracts every character on the destination board and then
    outputs the city name.

    1.4 Contents arranged

    Chapter 1 Image processing
      Image acquisition (this part introduces how and what types of images were captured)
      HSV color model (this part introduces the move from the RGB color model to the HSV color
        model, and the advantage HSV has over RGB in the segmentation field)
      Shadow and Highlight Invariant Color Segmentation Algorithm (this part shows how the color
        is extracted and can be distinguished from other colors)
      Character extraction (shows how the characters are extracted)
      Character Normalization (normalizes the extracted characters to 30*30 pixels)

    Chapter 2 SVM theory
      Introduction (introduces the SVM theory)
      Machine learning (shows the origin of SVM)
      Statistical Learning Theory (another part of the origin of SVM)
      Support Vector Machine (what SVM is and how it works)
      Two situations (the linear and non-linear problems)
      Kernel Function (introduces the kernel formula)

    Chapter 3 Implementation (shows the steps of how the theory works with real-life problems)

    Chapter 4 Analysis (analyzes the application based on these theories, and shows how the
      application works and the results we get from it)


    Chapter 2 Theoretical background


    2.1 Image acquisition

    Image acquisition is the first step of Traffic Sign Recognition. An input image can either
    be taken from the live stream of a camera mounted on the vehicle or taken from a video for
    experimental purposes. The video format accepted by the OpenCV platform should be AVI, and
    each frame of the video is an RGB image. The dimension of the captured image is set to
    400 x 600 pixels by my application. The figure below shows an example.

    Figure 2.1 Sample image from video stream

    2.2 The HSV color model

    1. The image acquired by the camera is in RGB format, which is greatly sensitive to chromatic
       variation of the daylight. The coordinates of the three colors are highly correlated; as a
       result, any variation in the ambient light intensity affects the RGB system by shifting
       the cluster of colors towards the white or the black corners, making it hard to recognize
       the object.

    2. HSV is an ideal color model for the recognition problem since it decouples the chromatic
       and achromatic notions of light. This method is also preferable because the Hue feature is
       invariant to shadows and highlights.

    3. HSV represents colors in a way similar to how the human eye senses color.

    2.2.1 Theory details

    Every color in this space is represented by three components:

    1. Hue (H): the apparent light color (determined by the dominant wavelength).
    2. Saturation (S): the purity of the light.
    3. Value (V): the total light across all frequencies.

    The HSV model is illustrated as a conical object. The cone is usually represented in the three

    The HSV model is illustrated as a conical object. The cone is usually represented in the three


    dimensional form. The hue is represented by the circular part of the cone. The saturation is

    calculated using the radius of the cone and value is the height of the cone. Advantage of the

    conical model is that it is able to represent the HSV color space in a single object.

    2.2.2 HSV color model definition

    Figure 2.2 HSV color model

    The hue

    Red, Green, and Blue (RGB) are the three primary colors used by computer monitors. 180
    degrees away from a primary, none of that primary is mixed in; these colors are the
    complementary hues, i.e., Cyan, Magenta, and Yellow. The next-level colors, between the
    secondary and primary colors, are called the tertiary hues. This process continues, creating
    a solid ring of colors around the primaries. This definition describes just one dimension of
    color, hue. Hue is more specifically described by the dominant wavelength. Hue describes a
    dimension of color readily experienced by the eye; hence it is the dimension of color
    interpreted by the human brain.

    The value

    Value is the brightness of the color; it ranges from 0 to 100% and varies with color
    saturation. When the value is 0, the color is completely black. In terms of a spectral
    definition of color, value describes the overall intensity or strength of the light. If hue
    can be thought of as a dimension going around a wheel, then value is a linear axis running
    through the middle of the wheel, as shown in the figure above.

    The saturation

    Saturation refers to the dominance of hue in the color. On the outer edge of the hue wheel
    are the 'pure' hues. Toward the center of the wheel, the hue describing the color dominates
    less and less; exactly in the center, no hue dominates. Colors directly on the central axis
    are considered de-saturated. These de-saturated colors constitute the gray scale, ranging
    from 0 to 100% and running from white to black with all of the intermediate grays in between,


    perpendicular to the Value axis. In terms of a spectral definition of color, saturation is
    the ratio of the dominant wavelength to the other wavelengths of the color. White light is
    white because it contains an even balance of all wavelengths.

    Below are two example images, one in the RGB color field and one in the HSV color field.

    Figure 2.3 RGB and HSV image
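    The conversion underlying these figures can be sketched as follows. This is the standard
    RGB-to-HSV conversion formula rather than the project's exact code; hue and saturation are
    rescaled to [0, 255], matching the normalization used later in Section 2.4.1.

    ```cpp
    #include <algorithm>
    #include <cmath>

    struct Hsv { double h, s, v; };  // each channel normalized to [0, 255]

    // Convert one RGB pixel (components in [0, 255]) to HSV.
    Hsv rgbToHsv(double r, double g, double b) {
        double maxc = std::max({r, g, b});
        double minc = std::min({r, g, b});
        double delta = maxc - minc;

        double v = maxc;                                // value = brightest channel
        double s = (maxc > 0.0) ? delta / maxc : 0.0;   // saturation in [0, 1]

        double h = 0.0;                                 // hue in degrees [0, 360)
        if (delta > 0.0) {
            if (maxc == r)      h = 60.0 * std::fmod((g - b) / delta, 6.0);
            else if (maxc == g) h = 60.0 * ((b - r) / delta + 2.0);
            else                h = 60.0 * ((r - g) / delta + 4.0);
            if (h < 0.0) h += 360.0;
        }
        return { h * 255.0 / 360.0, s * 255.0, v };     // rescale H and S to [0, 255]
    }
    ```

    For example, pure blue (0, 0, 255) has hue 240 degrees, which maps to 170 on the [0, 255]
    scale, while any gray pixel has zero saturation, i.e. it is achromatic.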

    2.3 Image segmentation

    1. Image segmentation is a process by which specific objects in the image are distinguished
       from the background. Based on the color information, candidate traffic signs need to be
       separated from the rest of the image.

    2. By segmenting the image into a binary image, only two types of pixels are left to be
       processed: white and black. In this way the complexity of the image processing for
       Traffic Sign Recognition is reduced.

    3. The processing time is improved too, because only two intensity levels are used for
       processing the image.
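    As a minimal illustration of the idea, the sketch below maps a grayscale image onto the two
    intensity levels. The threshold value is an assumption for illustration only; the actual
    segmentation in this project is color-based, as described in Section 2.4.

    ```cpp
    #include <vector>

    // Segment a grayscale image (values 0-255) into a binary image:
    // pixels at or above the threshold become white (255), the rest black (0).
    std::vector<unsigned char> binarize(const std::vector<unsigned char>& gray,
                                        unsigned char threshold) {
        std::vector<unsigned char> out(gray.size());
        for (std::size_t i = 0; i < gray.size(); ++i)
            out[i] = (gray[i] >= threshold) ? 255 : 0;
        return out;
    }
    ```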

    2.4 Shadow and Highlight Invariant Color Segmentation Algorithm

    Most of the time, the weather conditions cause big problems when extracting traffic signs;
    for example, strong sunshine can make some of the colors of traffic signs go missing.
    The figures below show this.

    Figure 2.4 original image


    Figure 2.5 color segmentation with better algorithm

    2.4.1 Theory details

    The color segmentation algorithm is carried out on RGB images taken by a digital camera
    mounted on a moving car. The images are converted to HSV color space, and the hue,
    saturation, and value are normalized into [0, 255]. The HSV color space is chosen because
    the Hue feature is invariant to shadows and highlights.

    While the normalized Hue is used as a priori knowledge for the algorithm, the normalized
    Saturation and Value are used to identify and avoid the achromatic subspaces of the HSV
    color space. When the hue of a pixel in the input image is within the color range specified
    in the figure below, and the pixel is not in the achromatic area, the corresponding pixel in
    the output image is set to white. The output image is then divided into a number of
    16x16-pixel sub-images which are used to calculate the seeds for the region growing
    algorithm: a seed is initiated if the number of white pixels in a sub-image is above a
    certain threshold level. The region growing algorithm is then applied to find all the
    objects in the output image which are big enough to initiate at least one seed. Noise and
    other small objects are rejected by the region growing algorithm. This has the advantage
    that no further filtering is needed to delete these objects, and the remaining objects are
    only the ones which can be used for recognition.
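    The per-pixel test described above can be sketched as follows. The achromatic cut-offs sMin
    and vMin, and any hue range passed in, are illustrative assumptions; the thesis's tuned
    thresholds are not reproduced here.

    ```cpp
    #include <vector>

    // One HSV pixel, all channels normalized to [0, 255] as in Section 2.4.1.
    struct HsvPixel { unsigned char h, s, v; };

    // Shadow/highlight-invariant segmentation sketch: a pixel becomes white (true)
    // when its hue lies in the target range AND it is not achromatic, i.e. its
    // saturation and value are both above a cut-off.
    std::vector<bool> segmentByHue(const std::vector<HsvPixel>& img,
                                   unsigned char hMin, unsigned char hMax,
                                   unsigned char sMin = 40, unsigned char vMin = 40) {
        std::vector<bool> mask(img.size());
        for (std::size_t i = 0; i < img.size(); ++i) {
            const HsvPixel& p = img[i];
            bool chromatic = p.s >= sMin && p.v >= vMin;   // reject gray/black/white
            mask[i] = chromatic && p.h >= hMin && p.h <= hMax;
        }
        return mask;
    }
    ```

    The resulting mask would then be divided into 16x16 blocks to initiate seeds for the region
    growing step described above.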

    2.5 The extraction of every character (traffic sign)

    2.5.1 Character extraction algorithm

    Studying traffic sign boards gives us some very important observations. Every traffic board
    in the world has a background, and the city name and other information are placed on that
    background. The background is painted because its color is chosen very carefully to be
    totally different from the surrounding environment, so that it is easily seen and draws
    people's attention to the information. If there were no such background, with only
    characters floating in the air, people would very easily ignore them.

    The algorithm is built on this observation, combined with the background color used in
    Sweden: light blue.

    First, the application gets an image from the video stream. Then the HSV algorithm is
    applied to process the image: the light blue color is extracted, the light blue area is
    turned white, and the rest is turned black.


    Next, the image is rescanned pixel by pixel from four directions: from top to bottom, from
    bottom to top, from left to right, and from right to left. On each scan, as soon as the
    pointer meets a white pixel the scan breaks, and the position where the white pixel was met
    is saved. After the rescans we have four position values: the topmost, bottommost, leftmost,
    and rightmost white pixels. These points determine the rectangle that will be the area used
    to extract the characters: the topmost and leftmost values give the upper-left bound, and
    the rightmost and bottommost values give the lower-right bound. This fixes the area from
    which to extract.
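    The four-direction scan amounts to finding the tightest box around all white pixels. A
    single-pass sketch that produces the same bounds, assuming the binary image is stored as
    rows of booleans, could look like this:

    ```cpp
    #include <vector>

    struct Box { int top, bottom, left, right; };  // inclusive bounds; -1 if no white pixel

    // Return the tightest rectangle enclosing every white (true) pixel;
    // this rectangle is the candidate board area used for character extraction.
    Box boundingBox(const std::vector<std::vector<bool>>& img) {
        Box b{-1, -1, -1, -1};
        int rows = static_cast<int>(img.size());
        int cols = rows ? static_cast<int>(img[0].size()) : 0;
        for (int y = 0; y < rows; ++y)
            for (int x = 0; x < cols; ++x)
                if (img[y][x]) {
                    if (b.top == -1) b.top = y;               // first white row from the top
                    b.bottom = y;                             // last white row seen so far
                    if (b.left == -1 || x < b.left) b.left = x;
                    if (x > b.right) b.right = x;
                }
        return b;
    }
    ```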

    Commonly, two colors of characters are written on the board: white and black.

    First, for the white characters, the HSV algorithm is applied to turn the white color of the
    image absolutely white and the rest black. The image is then rescanned pixel by pixel inside
    the area, from the top-left bound to the bottom-right bound. When a white pixel is found on
    a line, a matrix (bounding rectangle) is started and the scan continues; when a line is
    reached on which no white pixel can be found, the matrix is closed and stored in an array.

    Second, for the black characters, the HSV algorithm is applied to turn the black color of
    the image absolutely white and the rest black, and the same scan is performed inside the
    area; each resulting matrix is stored in the same array as the white characters.

    Next, every matrix is popped out and its position values are calculated. Using these values,
    each row is scanned from left to right, top to bottom. If any color change can be found, a
    new matrix is started to hold it and is pushed onto the bottom of the array; if none can be
    found, the matrix is a single character and is put into another array that only keeps
    characters after separation.

    When every character has been separated, we obtain a final array that keeps every character.
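    The start-a-matrix / stop-a-matrix scan described above can be sketched as a column split on
    a binary strip: a character begins at the first column containing a white pixel and ends
    just before the next entirely black column. This is a simplified one-axis version of the
    thesis's two-pass row/column scan, given here only to make the idea concrete.

    ```cpp
    #include <utility>
    #include <vector>

    // Split a binary text strip into per-character column spans (inclusive).
    std::vector<std::pair<int, int>> splitColumns(const std::vector<std::vector<bool>>& strip) {
        int rows = static_cast<int>(strip.size());
        int cols = rows ? static_cast<int>(strip[0].size()) : 0;
        std::vector<std::pair<int, int>> spans;
        int start = -1;                                  // -1 means no open character
        for (int x = 0; x < cols; ++x) {
            bool hasWhite = false;
            for (int y = 0; y < rows; ++y)
                if (strip[y][x]) { hasWhite = true; break; }
            if (hasWhite && start == -1) start = x;      // a character begins
            if (!hasWhite && start != -1) {              // blank gap: close the character
                spans.push_back({start, x - 1});
                start = -1;
            }
        }
        if (start != -1) spans.push_back({start, cols - 1});
        return spans;
    }
    ```

    Each resulting span would then be cropped and normalized to 30*30 pixels before being passed
    to the classifier.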


    The figures below show this:

    Figure 2.6 original image


    Figure 2.7 image process result


    Chapter 3 Support vector machine


    3.1 Introduction

    Support vector machines are widely used for pattern classification because of their good
    generalization ability compared with conventional classifiers. In support vector machines
    the input space is mapped to a higher-dimensional space called the Feature Space. The aim is
    to find an optimal hyperplane in this higher-dimensional feature space that can separate the
    data in the best way possible. Since the training of a support vector machine is formulated
    as a quadratic optimization problem with the number of variables equal to the number of
    training data, a globally optimal solution can be achieved. Among the training data, the
    instances necessary for the construction of the decision function are the ones closest to
    the class boundary; these are called the Support Vectors.

    3.2 Machine Learning

    Being a broad field of Artificial Intelligence, Machine Learning is concerned with the
    development of algorithms and techniques that allow computers to learn. It has a wide
    spectrum of applications including object recognition, medical diagnosis, speech and
    handwriting recognition, robot locomotion, computer vision, and many more. To be more
    specific, the goal of machine learning is to emulate the learning and adaptation abilities
    of living species in computers; more deeply, to program computers to use past experience to
    solve a given problem. Machine learning underwent a great deal of advancement in the late
    eighties and nineties with the active research done in the fields of Artificial Intelligence
    and Neural Networks. These advancements in machine learning will lead researchers to an
    understanding of the learning behavior of humans and animals. Systems like the I-Swarm
    robots, which imitate the behavior of ant colonies to perform tasks that are much too
    difficult and unsafe for humans to perform, and the success of the DARPA Grand Challenge
    have shown the achievements and upcoming challenges in this field. Learning can be
    categorized into various types, some as follows:

    Supervised learning
    • Learning from examples.
    • Learning by taking advice.

    Unsupervised learning
    • Competitive learning.
    • Clustering.
    • Reinforcement learning.

    In the context of object recognition, machine learning aims at finding a pattern of
    similarity or structure in a data set that will lead to generalization of the learning
    system and consequently identification of unknown data.

    3.3 Statistical Learning Theory

    Support vector algorithms are considered the first practical spin-off of statistical
    learning theory. Therefore, it is important to have a little insight into statistical
    learning theory before going into the details of the Support Vector Machine. Statistical
    learning theory addresses the fundamental issue of how to control the generalization ability
    of a neural network in mathematical terms. Since SVM is a set of supervised learning
    algorithms, statistical learning theory is reviewed only in that context.


    There are three basic components, interrelated with each other, in a supervised learning
    model: the environment, the supervisor (teacher), and the learning machine. The feasibility
    of the system depends on how much information the training set, generated by the joint
    probability distribution function of the environment and supervisor R(x, d), contains for
    the learning system to achieve good generalization. The supervised learning problem can be
    viewed as an approximation problem.

    3.4 Support Vector Machine

    The Support Vector Machine is a linear classifier built on the roots of statistical learning
    theory and the very powerful kernel function, and it is increasingly used for solving
    classification and regression problems. It is a linear machine closely related to classical
    Neural Networks; in fact, a support vector machine with a sigmoid kernel function acts as a
    two-layer feed-forward neural network. SVM is based on the concept of decision planes that
    define the decision boundaries. Perhaps the easiest way to explain the main idea of a
    support vector machine is the scenario of separating patterns that arises in the context of
    pattern classification. In that case the role of the support vector machine is to draw a
    decision surface, called a Hyperplane, such that the distance between the closest samples
    and the hyperplane is maximized. This distance between the closest samples and the
    hyperplane is known as the Margin, and the closest samples with respect to which we
    calculate the margin are called the Support Vectors.

    Finding a hyperplane with maximum margin is very important. It helps prevent the data
    over-fitting problem and enables the system to classify unknown samples from the testing set
    which come close to the hyperplane. A hyperplane with maximum margin is called the Optimal
    Hyperplane.

    Any classification task consists of data instances divided into two sets:

    • Training set: used to train the system.
    • Testing set: used to test the learning of the system.

    Each instance in the training set has one "target value", called the Class Label, along with
    several "attributes", called Features. The task of selecting the most suitable features for
    learning and testing is called Feature Selection. It is these features that help the
    learning system define the hyperplane.
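    To make the notions of hyperplane and margin concrete, the following C++ sketch evaluates
    the decision function sign(w . x + b) and a sample's distance to the hyperplane. The weight
    vector w and bias b would come from training, so any values passed here are illustrative
    placeholders, not a trained model.

    ```cpp
    #include <cmath>
    #include <vector>

    // Inner product of two equally sized vectors.
    double dot(const std::vector<double>& a, const std::vector<double>& b) {
        double s = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
        return s;
    }

    // Sign of the decision surface w.x + b: class +1 on one side, -1 on the other.
    int classify(const std::vector<double>& w, double b, const std::vector<double>& x) {
        return (dot(w, x) + b >= 0.0) ? +1 : -1;
    }

    // Geometric distance from sample x to the hyperplane; the smallest such
    // distance over the training set is the margin the SVM maximizes.
    double margin(const std::vector<double>& w, double b, const std::vector<double>& x) {
        return std::abs(dot(w, x) + b) / std::sqrt(dot(w, w));
    }
    ```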


    3.5 Two situations

    3.5.1 Linear separable problem

    In this problem, all the data can be linearly separated, as the figure below shows.

    Figure 3.1 Linear separation model

    The SVM can easily find straight lines that separate them. According to the image, consider
    a finite training set drawn from the input space:

        {(x_i, d_i)}, i = 1, ..., N                                                 (3.1)

    generated through a probability distribution function, where x_i represents a data instance
    from the input space X and d_i represents the corresponding output label from { -1, +1 }.

    Optimal Margin Hyperplane:

    Figure 3.2 optimal margin hyperplane

    In neural terms a hyperplane separating a linearly separable data is represented by following

    equation:


    w^T x + b = 0    (3.2)

where w is the weight vector orthogonal to the hyperplane (decision surface), controlling the

angular orientation of the hyperplane, and b is the bias, controlling the displacement of the

hyperplane relative to the origin.
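A minimal Python sketch (the names are mine, not from the thesis) of how equation (3.2) acts as a decision surface: the sign of w.x + b tells which side of the hyperplane a sample falls on.

```python
def classify(w, b, x):
    """Evaluate the hyperplane equation w.x + b and return the side
    of the decision surface the sample x falls on (+1 or -1)."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return +1 if activation >= 0 else -1

# Hypothetical hyperplane: w orthogonal to the surface, b shifting it.
w, b = [1.0, -1.0], 0.0
print(classify(w, b, [2.0, 1.0]))  # falls on the +1 side
print(classify(w, b, [1.0, 2.0]))  # falls on the -1 side
```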

The figure below illustrates this:

Figure 3.3 calculating the separation line

The formal equation:

Figure 3.4 equation of the separation line

To emphasize the effect of choosing the decision surface with maximum margin, consider two

hyperplanes whose orientations allow one to have a greater margin than the other.


Training sets

Figure 3.5 separation line with training sets

Testing sets

Figure 3.6 separation line with testing sets

From this we can easily see that the data points in green are the points that come inside the

margin but are still distinguished by the hyperplane, whereas the data points in blue are the

ones that are not recognized by the hyperplane.[6]

We can conclude from the example figures that some data instances came very close to the

hyperplane. The hyperplane on the left, the one with the greater margin, was able to classify

them because of its flexibility, but the hyperplane with the small margin, the one on the right,

was not able to classify some of the data instances, as they lie on the hyperplane. Such a

flexible hyperplane is called the Optimal Hyperplane, giving optimal results on both the

training and the testing set.

3.5.2 Non-linearly separable problem
Most real-world problems are non-linear.[11][12] These kinds of problems require a

non-linear dividing line to separate the instances into two classes, such as the one shown in

the figure below:


Figure 3.7 non-linear separation model

This is the point where a more advanced technique is required, and this is where the concept

of the kernel comes in handy. Rather than fitting a non-linear curve to the data set, the

support vector machine uses a kernel function to map the data into a different space where a

linear hyperplane can be used as the dividing line.

This higher-dimensional mapping space is called the Feature Space. The concept of kernel

mapping is very important and powerful: it allows SVM models to perform separations even

on data sets with very complex boundaries by using N-dimensional hyperplanes.

3.6 Kernel function
A kernel defines the function that maps the classes from a space in which they are not

linearly separable to another space in which they are. Based on the kernel function, we can

train on the samples to obtain a template, feed in input data, and finally obtain the result.

3.7 Using a kernel function to solve a non-linear problem
A Support Vector Machine (SVM) performs classification by constructing an N-dimensional

hyperplane that optimally separates the data into two categories. SVM models are closely

related to neural networks; in fact, an SVM model using a sigmoid kernel function is

equivalent to a two-layer perceptron neural network. SVM models

are a close cousin to classical multilayer perceptron neural networks. Using a kernel function,

SVMs provide an alternative training method for polynomial, radial basis function, and multi-

layer perceptron classifiers in which the weights of the network are found by solving a

quadratic programming problem with linear constraints, rather than by solving a non-convex,

unconstrained minimization problem as in standard neural network training. In the parlance of

the SVM literature, a predictor variable is called an attribute, and a transformed attribute that

is used to define the hyperplane is called a feature. The task of choosing the most suitable

representation is known as feature selection. A set of features that describes one case (i.e., a

row of predictor values) is called a vector. So the goal of SVM modeling is to find the optimal

hyperplane that separates clusters of vectors in such a way that cases with one category of the


target variable are on one side of the plane and cases with the other category are on the other

side of the plane. The vectors near the hyperplane are the support vectors. The figure below

presents an overview of the SVM process.

    Figure 3.8 A Two-Dimensional example

    If all analyses consisted of two-category target variables with two predictor variables, and the

    cluster of points could be divided by a straight line, life would be easy. Unfortunately, this is

    not generally the case, so SVM must deal with (a) more than two predictor variables, (b)

    separating the points with non-linear curves, (c) handling the cases where clusters cannot be

    completely separated, and (d) handling classifications with more than two categories.

The kernel mapping functions mentioned in the last chapter can be used, and in principle

there are probably an infinite number of candidates. But a few kernel functions have been

found to work well for a wide variety of applications. The default and recommended kernel

function is the Radial Basis Function (RBF).
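As a sketch of the RBF kernel's behavior (the gamma value below is an arbitrary choice, not one used in the thesis):

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Radial Basis Function kernel: K(x, y) = exp(-gamma * ||x - y||^2).
    It implicitly maps samples into a space where a linear hyperplane
    can separate them, without computing the mapping explicitly."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)

print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))         # identical points -> 1.0
print(rbf_kernel([0.0, 0.0], [3.0, 4.0]) < 0.01)  # distant points -> near 0
```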


    Chapter 4 The Implementation


4.1 Real-time traffic sign recognition flowchart
The system is based upon four main steps (including sub-steps), plus one more step of

"Tracking" for a faster search by predicting the next search region. The flow chart in the

figure depicts the final design of the real-time traffic sign recognition system:

Figure 4.1 flow chart of the project


4.2 Application components
The image shown below is the structure of the character recognition system.

Figure 4.2 the system processing flowchart

4.2.1 Implementation of the background extraction module

In this part, the job is to extract the traffic sign background using HSV color segmentation.

When the characters are extracted directly, the environment introduces a lot of noise; this

noise is very difficult to remove, or removing it takes a lot of CPU processing resources.

Extracting the traffic sign background color first produces less noise and makes it easy to

define a region in which characters are extracted in gray mode only.

The reason for extracting the background color is that the destination board background plays

a very important role in the character extraction part. So for every image, first extract the

background as shown below, then do the image segmentation.

Because the boards are designed to attract people's attention, it is very easy to extract the

light blue board background from the image. The method and implementation are shown in

section 4.2.4.

To segment every board image, three methods are given here, and they work with almost

every destination board. Use an array to save the matrices, then pop each matrix, check it,

and separate it. If the scan count for a matrix is 2, record the result; if not, continue popping,

checking, and separating until the count is 2.


    Figure 4.3 Original image

    Figure 4.4 Extract the traffic sign background color use color segmentation

The color segmentation algorithm is shown below.

Calculation formula:


HSV is defined mathematically by transformations between the r, g, and b coordinates. Let r,

g, b ∈ [0, 1] be the red, green, and blue coordinates in RGB color space, let max be the

greatest of r, g, and b, and let min be the least of r, g, and b. To find the hue angle h ∈ [0, 360] for HSV, compute the following equation:

    h = 0,                                   if max = min
    h = (60 × (g − b)/(max − min)) mod 360,  if max = r
    h = 60 × (b − r)/(max − min) + 120,      if max = g
    h = 60 × (r − g)/(max − min) + 240,      if max = b    (4.2)

Figure 4.5 R image

Figure 4.6 G image


Figure 4.7 B image

Figure 4.8 RGB and H image

Figure 4.9 RGB and S image

Figure 4.10 RGB and V image

After converting from the RGB color model to the HSV color model, the color range to

extract can easily be found according to the HSV color model.
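The hue computation of equation (4.2) can be sketched in Python as follows; the cross-check uses the standard library's `colorsys`, which returns hue in [0, 1).

```python
import colorsys

def hue(r, g, b):
    """Hue angle in [0, 360) computed from r, g, b in [0, 1],
    following equation (4.2)."""
    mx, mn = max(r, g, b), min(r, g, b)
    if mx == mn:
        return 0.0                              # achromatic: hue undefined
    d = mx - mn
    if mx == r:
        return (60.0 * (g - b) / d) % 360.0
    if mx == g:
        return 60.0 * (b - r) / d + 120.0
    return 60.0 * (r - g) / d + 240.0           # mx == b

print(hue(0.0, 0.0, 1.0))  # pure blue -> 240.0
# Cross-check against the standard library:
print(abs(hue(0.2, 0.4, 0.6) - colorsys.rgb_to_hsv(0.2, 0.4, 0.6)[0] * 360) < 1e-6)
```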


Figure 4.11 HSV color model for finding the color range

4.2.2 Applying the Shadow and Highlight Invariant Color Segmentation Algorithm
The Swedish National Road Administration defined the colors used for the signs in the

CMYK color space. These values are converted into normalized Hue and normalized

Saturation as shown in the table below.

The color segmentation algorithm is carried out on RGB images taken with a digital camera

mounted on a moving car. The images are converted to HSV color space, and the hue,

saturation, and value are normalized into [0, 255]. The HSV color space is chosen because

the Hue feature is invariant to shadows and highlights.

While the normalized Hue is used as a priori knowledge in the algorithm, the normalized

Saturation and Value are used to identify and avoid the achromatic subspaces of the HSV

color space.

When the hue value of a pixel in the input image is within the color range specified in the

table below, and the pixel is not in the achromatic area, the corresponding value in the

output image is set to white.

    Table 4.1 color space relationship with different conditions


    Table 4.2 specified color table

4.2.3 Algorithm Implementation
Step 1. Convert the RGB image into HSV color space.

Step 2. Normalize the grey level of every pixel in the H image from [0, 360] to [0, 255].

Step 3. Normalize the grey level of every pixel in the S image from [0, 1] to [0, 255].

Step 4. Normalize the grey level of every pixel in the V image from [0, 1] to [0, 255].

Step 5. For all pixels in the H image:

    If (H pixel value > 240) OR (H pixel value >= 0 AND H pixel value < 10) Then H pixel value = 255

Step 6. For all pixels in the S image:

    If the corresponding S pixel value < 40 Then H pixel value = 0

Step 7. For all pixels in the V image:

    If the corresponding V pixel value < 30 OR V pixel value > 230 Then H pixel value = 0
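The per-pixel logic of steps 5-7 can be sketched as follows. This is a sketch under my reading of the steps: hues outside the specified range are assumed to map to 0, which the steps do not state explicitly.

```python
def segment_pixel(h, s, v):
    """Apply steps 5-7 to one pixel whose h, s, v values are already
    normalized to [0, 255]; returns the output H value."""
    # Step 5: hue inside the specified (wrap-around) color range -> 255.
    out = 255 if (h > 240 or 0 <= h < 10) else 0
    # Step 6: low saturation means the pixel is achromatic -> reject.
    if s < 40:
        out = 0
    # Step 7: very dark or very bright pixels are achromatic -> reject.
    if v < 30 or v > 230:
        out = 0
    return out

print(segment_pixel(250, 100, 100))  # in range, chromatic -> 255
print(segment_pixel(250, 20, 100))   # low saturation -> 0
print(segment_pixel(250, 100, 240))  # too bright -> 0
```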

Application result:

Figure 4.12 the improved algorithm applied

Figure 4.13 result with the improved algorithm


4.2.5 Extraction area implementation
The destination board may be placed in a very complex environment, and the surroundings

may be white, the same color as the characters on the board; this happens especially often in

Sweden. Extracting the characters directly under such conditions gives too much noise and

may cause the extraction to fail, so the background of the destination board becomes very

important.

Blue objects attract human sight, and the algorithm depends on this. First extract the blue

background color and determine its area, then record the coordinates of the blue area, and

then scan inside that area. In this way the characters can easily be found with less noise.


See the figures below:

Figure 4.14 the image after extracting the blue background


Figure 4.15 the extraction area found

In this step, keep the four coordinates in an array: top, bottom, left, right.
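A minimal sketch of recording those four coordinates from a binary background mask (pure Python, with nested lists standing in for the image):

```python
def bounding_box(mask):
    """Find the top, bottom, left, and right coordinates of the
    foreground (1) pixels in a binary mask (a list of rows)."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, px in enumerate(row) if px]
    # Order matches the text: top, bottom, left, right.
    return min(rows), max(rows), min(cols), max(cols)

mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(bounding_box(mask))  # (1, 2, 1, 3)
```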


4.2.4 Second-Time Image Processing Module
In this part, because using the HSV color mode to extract the characters directly may give a

lot of noise, which causes big trouble for the character extraction module, a gray image is

used here to extract the characters.

    Figure 4.16 Gray image


Extract black characters

(define a threshold: pixels on one side of it are set to black, the rest to white)

Figure 4.17 the black characters found


Extract white characters

(define a threshold: pixels on one side of it are set to black, the rest to white)

Figure 4.17 the white characters found
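Both extraction steps are plain global thresholding; a sketch follows (the threshold direction for each case is my reading of the parentheticals above, with 0 = black and 255 = white):

```python
def extract_black(gray, t):
    """Pixels darker than the threshold become black (0), others white (255)."""
    return [[0 if px < t else 255 for px in row] for row in gray]

def extract_white(gray, t):
    """Pixels brighter than the threshold become black (0), others white
    (255), so white characters are marked for the next module."""
    return [[0 if px > t else 255 for px in row] for row in gray]

gray = [[10, 200], [90, 250]]
print(extract_black(gray, 100))  # [[0, 255], [0, 255]]
print(extract_white(gray, 100))  # [[255, 0], [255, 0]]
```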

From these two images we can still see a lot of noise, which causes trouble for the character

extraction, especially in Figure 4.13. However, in the last section a region was already

defined, so now we only need to apply the four coordinates to obtain the region shown as the

red-line area in the image.

The character extraction module then only needs to scan inside the red-line area, which

leaves only a little noise and reduces the calculation time and CPU resources.


4.2.5 Character extraction module implementation
1. The first category: the texts on the image are well arranged, row by row or line by line.

Figure 4.14 destination board suitable for the height scan

Step 1.

(1) Scan every pixel from left to right and from top to bottom.

Figure 4.15 scan the image from L to R    Figure 4.16 scan the image from T to B

(2) Whenever the color changes, increase a counter.

(3) For this image, the width scan counts 3 and the height scan counts 1.

(4) The width count is greater than the height count.

Step 2.

(1) Use the width scan to separate the image and find the matrices; see the figure below:

Figure 4.17 image after separation

(2) Put all the matrices into the array one by one, as shown below:

Figure 4.18 prepare the array


Figure 4.19 how the image was put into the array

Figure 4.20 put the image rects into the container

Step 3. Take the first matrix from the array and redo step 1 and step 2; see the figures below:

Figure 4.21 get the first matrix

Figure 4.22 redo step 1 and step 2 and get new matrices

Step 4. Put these new matrices into the array again, as shown below:

Figure 4.23 put these matrices into the array again

Figure 4.24 third round of scanning and image separation

Step 5. Take the matrices out one by one and redo step 1. When the scan count equals 1, the

image cannot be separated any more; otherwise apply step 2 to separate it and put the pieces

into the array again. When no image can be separated any more, collect the result.
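The recursive scan-and-split procedure of steps 1-5 can be sketched as follows (pure Python on a binary nested-list image; a simplification of the thesis's method, with no merge threshold):

```python
def split_runs(flags):
    """Turn per-row (or per-column) 'contains ink' flags into
    (start, end) index ranges, one range per run of ink."""
    ranges, start = [], None
    for i, f in enumerate(list(flags) + [0]):
        if f and start is None:
            start = i
        elif not f and start is not None:
            ranges.append((start, i))
            start = None
    return ranges

def segment(img):
    """Recursively split a binary image (1 = ink) along the axis whose
    scan gives more counts, until no sub-image can be split further."""
    rows = split_runs([1 if any(r) else 0 for r in img])
    cols = split_runs([1 if any(r[c] for r in img) else 0
                       for c in range(len(img[0]))])
    if len(rows) <= 1 and len(cols) <= 1:
        return [img]                   # cannot be separated any more
    pieces = []
    if len(rows) >= len(cols):         # more row runs: horizontal strips
        for a, b in rows:
            pieces += segment(img[a:b])
    else:                              # more column runs: vertical strips
        for a, b in cols:
            pieces += segment([r[a:b] for r in img])
    return pieces

print(len(segment([[1, 1, 0, 1],
                   [1, 0, 0, 1]])))    # two character blocks
```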


Shown below:

Figure 4.25 results

2. The second category: texts on the image together with crosses and arrows that interfere

with the text separation.

Figure 4.26 destination board suitable for the width scan

Step 1.

(1) Scan from left to right and from top to bottom.

Figure 4.27 width scan    Figure 4.28 height scan

(2) Whenever the color changes, increase a counter.

(3) For this image, the height scan counts 3 and the width scan counts 1.

(4) The height count is greater than the width count.

Step 2.

(1) Use the height scan to separate the image; see the figure below:

Figure 4.29 image after separation

Step 3.

(1) An image like this cannot give the right city name after processing.

(2) Add a threshold: if the gap between two color regions is smaller than a given number,

(3) do not separate them.


Figure 4.30 the threshold

Figure 4.31 after merging the bounds

Step 4.

(1) Use the method introduced in the last section to get the result:

Figure 4.32 result

3. The third category: the image is not a normal image but somewhat strange, so some

thresholds should be defined in the image to help separate it and output the correct city name.

All the texts are linked together and in different colors, so the scanning method can only find

one big chunk. The image is shown below:

Step 1.

(1) Width or height scan, but with a threshold:

Figure 4.33 image not well arranged

Step 2.

(1) To deal with these kinds of images, a threshold should be applied at the beginning.

(2) For example, for one row, if the number of black pixels is less than the threshold, set

them all white. See the figures below:


Figure 4.34 find the threshold

Figure 4.35 calculate according to the threshold

Step 3.

(1) Apply the method shown in part 1; it should then be very easy to find the city name.

Figure 4.36 final output of the figure shown in 4.9


4.2.6 Training and testing with SVM

In this part, I train on the images coming from the image processing module and test them

with the application linked to an SVM library downloaded from the internet, using four

different kernel functions. The results are shown in chapter 5.

The figure shows how an image is processed into a feature vector.

But first of all, building a character transfer table is very important, because this SVM library

only accepts digits for its calculations, not characters.

The table is shown below:

 0 → noises    1 → a    2 → b    3 → c    4 → d    5 → e    6 → f
 7 → g         8 → h    9 → i   10 → j   11 → k   12 → l   13 → m
14 → n        15 → o   16 → p   17 → q   18 → r   19 → s   20 → t
21 → u        22 → v   24 → w   25 → x   26 → y   27 → z


Figure 4.39 image transferred to a feature vector

Three steps:

1. Prepare the input data for training (samples).

2. Train on these data and get a result file.

3. Predict, based on the result file, and get the output.
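Step 1 amounts to writing one text line per sample. The thesis does not name the library, but LIBSVM-style tools read the sparse "label index:value" text format, which can be produced like this (a sketch; label 1 stands for the character a in the transfer table above):

```python
def to_libsvm_line(label, pixels):
    """Encode one binarized character image (flattened to a pixel list)
    as a line in the sparse 'label index:value' text format used by
    LIBSVM-style tools. Indices are 1-based; zero features are omitted."""
    feats = " ".join(f"{i + 1}:{v}" for i, v in enumerate(pixels) if v)
    return f"{label} {feats}"

# A tiny 5-pixel example instead of a full 30*30 image:
print(to_libsvm_line(1, [0, 1, 1, 0, 1]))  # 1 2:1 3:1 5:1
```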

Step 1. Training data

If the problem is linear, we can apply the linear function. Because the data here is not

linearly separable, a kernel function is applied to separate it.

(4.3) Table of the kernel functions (RBF is the most common method and is applied here)

Figure 4.40 training data based on the kernel function

The images are transferred into feature data for training, as the figure below shows:


    Figure 4.41 training data

The template after processing the training data:

    Figure 4.42 the template


    Figure 4.43 template file summary

Step 2. Input data

The images are transferred into feature data for testing, as the figure below shows:

Figure 4.44 testing data

Step 3. Finally, get the output file that shows the result:

Figure 4.45 output (0 means noise, 1 means the character a)



    Chapter 5 Analysis and result


5.1 Analysis of the application

5.1.1 Analysis of the character extraction part

5.1.2 Analysis of the HSV color model image

1. Advantages of converting to HSV
The HSV color space is quite similar to the way in which humans perceive color. The other

models define color in relation to the primary colors, whereas the colors used in HSV are

clearly defined by human perception, which is not always the case with RGB or CMYK.

Hue plays the central role in the color detection because it is invariant to variations in

lighting conditions: it is scale invariant, shift invariant, and invariant under saturation changes.

The HSV model has been very helpful in resolving the problems of shadows and highlights

and the chromatic variation of daylight. For example, a faded image is considered one with

low saturation, and the saturation value of that color can be tuned according to the weather

conditions. The model is therefore able to preserve the maximum image information.

2. Problems with hue in the HSV color space
The hue coordinate is unstable, and small changes in RGB can cause strong variations in hue.

It suffers from three problems, as stated by Fleyeh:

• When the intensity is very low or very high, the hue is meaningless.

• When the saturation is very low, the hue is meaningless.

• When the saturation is less than the threshold value, the hue becomes unstable.

5.1.3 Analysis of the character extraction algorithm

Due to the special environment, especially in Sweden, the characters on the destination board

are normally white; but in the Swedish winter everything is white, so using the HSV method

directly to extract the white characters gives a lot of noise. In this application, all the

extracted images are of size 30*30 and are then normalized to fill the image. Here a problem

arises: for the character l, the filled image is a black block, but a small black spot (noise),

once filled, is also a black block, which causes a recognition problem.

Keeping the noise as small as possible is therefore very important. Even after extracting the

blue background to determine the area that should be extracted, there is still noise, so region

growing is used here to filter the image a second time.
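The second-pass filter can be sketched as follows (pure Python; the 4-connectivity and the `min_size` parameter are my choices, not stated in the thesis): grow each black region by flood fill and erase any region too small to be a character.

```python
from collections import deque

def filter_small_regions(img, min_size):
    """Grow each black (1) region with a flood fill and erase regions
    smaller than min_size, leaving only character-sized blobs."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if img[sy][sx] and not seen[sy][sx]:
                region, q = [], deque([(sy, sx)])
                seen[sy][sx] = True
                while q:                      # grow the region
                    y, x = q.popleft()
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(region) < min_size:    # too small: it is noise
                    for y, x in region:
                        img[y][x] = 0
    return img

img = [[1, 0, 0],
       [0, 0, 1],
       [0, 1, 1]]
print(filter_small_regions(img, 2))  # [[0, 0, 0], [0, 0, 1], [0, 1, 1]]
```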

The extraction is divided into two parts. The first part extracts the area containing the full

city name, as shown below:


Figure 5.1 first part extraction

After getting this area, put the regions into an array.

The second part takes them out of the array and uses recursion to split off every character.

The processing time of the first part is linear, so it is easy to calculate and estimate. The time

consumption is shown below:

Figure 5.2 first-part processing time growth

In the second part, nobody can be sure of the structure of the city names in advance, so

recursion finishes the splitting, and the time spent is non-linear. If there are too many

characters inside a city name, or an arrow, a cross, or something similar, processing

consumes a lot of time. The time consumption is shown below:


Figure 5.3 second-part processing time growth

5.1.4 Analysis of the noise filter algorithm

A noise filter algorithm has been applied here, and it truly filters out a lot of noise. Its time

consumption is linear and very easy to control. For one testing image, the difference can be

seen in the figure below:

Figure 5.4 noise filter

The red columns represent the noise present in the image; from the figure it is easy to see

that when this filter is applied, the noise is reduced a lot. Because the time consumption is

linear, users can decide to filter the image more times and with different sizes, which can

give better results.


5.1.5 Result

1. Image result

The algorithm used in this part can separate every character with good accuracy, but for

some images a threshold still needs to be defined, because of the distance of the destination

board from the camera when the image is captured; when the destination board is far away,

the threshold may give a wrong result.

So under complex destination board conditions, for the whole video stream there may be

only one or two frames from which the application can get a perfect result.

Ten images are tested here; the results are below:


    Image1:

    Figure 5.1 image processing

    Figure 5.2 character result


    Image2:

    Figure 5.3 image processing

    Figure 5.4 character result


    Image3:

    Figure 5.5 image processing

    Figure 5.6 character result


    Image4:

    Figure 5.7 image processing

    Figure 5.8 character result


    Image5:

    Figure 5.9 image processing

    Figure 5.10 character result


    Image6:

    Figure 5.11 image processing

    Figure 5.12 character result


    Image 7:

    Figure 5.13 image processing

    Figure 5.14 character result


    Image 8:

    Figure 5.15 image processing

    Figure 5.16 character result


    Image 9:

    Figure 5.17 image processing

    Figure 5.18 character result


    Image 10:

    Figure 5.19 image processing

    Figure 5.20 character result


    2. Result analysis

    This character application runs on:

    CPU:

    Operating system: Windows 7, 64-bit

    Processing time of the algorithm: because every image is different and the amount of data the
    application processes varies, some images, and some parts of the video stream, take a long
    time to process, but the average processing time of the application's core calculation is
    4.8 ms, and the algorithm's time complexity is O(n log n).

    Image Nr    Time      Total    Extracted    Failed    Rate
    Image 1     0.578s    54       54           5         0.90
    Image 2     0.328s    72       72           11        0.84
    Image 3     0.429s    24       24           0         1.00
    Image 4     0.414s    11       5            6         0.54
    Image 5     0.371s    51       51           12        0.76


    Image 6     0.489s    15       7            8         0.46
    Image 7     0.398s    18       11           7         0.61
    Image 8     0.542s    9        9            0         1.00
    Image 9     0.372s    39       15           24        0.38
    Image 10    0.401s    25       19           6         0.76

    The accuracy rate of character extraction:

    Overall, for normal destination boards, the extraction accuracy rate reaches about 70%. For
    complex destination boards, the extraction depends on the threshold. Because this application
    uses a threshold to make sure that every character, as well as the arrows and crosses among
    them, can be extracted, in some special cases it fails to extract them or gives an inaccurate
    result.
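The threshold-based splitting described above can be illustrated as a column-projection scan. This is a minimal sketch, not the thesis code; the helper name and the exact threshold rule are assumptions for illustration:

```python
# Sketch of threshold-based splitting by vertical projection (hypothetical
# helper, not the thesis code): a column whose black-pixel count reaches
# `threshold` is treated as part of a character; runs of such columns
# become character segments, and low-count gaps split them.

def segment_columns(binary_rows, threshold=1):
    """binary_rows: equal-length rows of 0/1 values, 1 = black pixel.
    Returns (start, end) column ranges, end exclusive."""
    width = len(binary_rows[0])
    # black-pixel count per column (the "width scan")
    counts = [sum(row[x] for row in binary_rows) for x in range(width)]
    segments, start = [], None
    for x, c in enumerate(counts):
        if c >= threshold and start is None:
            start = x                    # a character begins
        elif c < threshold and start is not None:
            segments.append((start, x))  # a gap ends the character
            start = None
    if start is not None:
        segments.append((start, width))
    return segments

# Two 3-column "characters" separated by one empty column.
img = [
    [1, 1, 1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 0, 1],
]
print(segment_columns(img))  # → [(0, 3), (4, 7)]
```

The same scan applied to rows gives the height split. The failure cases occur when a gap's count is close to a character column's count, so no single threshold separates them cleanly.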

    The figures below show such cases:


    Figure 5.21 the sloped background

    For the image shown, it is very difficult for the application to extract ENKOPING. This is
    because the splitting is based on the black-pixel counts from the top of the board down to the
    characters of ENKOPING: the structure is different, but the total black-pixel counts are very
    close, so the application cannot extract ENKOPING.

    Figure 5.22 the I and l character problem

    From the figure shown here, one can easily see that, because the application extracts each
    character and resizes it to fill the full region (30×30 pixels in this application), the
    characters I and l become almost completely black squares. The same happens with some noise
    blobs extracted from the image. This causes a big problem for the character-recognition part.
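Why a narrow glyph turns into a solid square can be shown with a tiny nearest-neighbour resize. This is an illustration only, not the thesis code:

```python
# Illustration (not the thesis code) of why stretching a character crop to a
# fixed 30x30 box destroys narrow glyphs: a 1-pixel-wide "I" fills the whole
# square with black, indistinguishable from a noise blob.

def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of a 2-D list of 0/1 pixels."""
    h, w = len(img), len(img[0])
    return [[img[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]

letter_i = [[1] for _ in range(10)]  # a 10x1 crop: just a vertical bar
square = resize_nearest(letter_i, 30, 30)
print(all(p == 1 for row in square for p in row))  # → True: solid black
```

Padding the crop to a square before resizing, instead of stretching it to fill the region, would preserve the bar shape of I and l.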


    Figure 5.23 the problem the threshold causes

    In the image above, one can easily take the character to be an O, but in fact it is not an O:
    it is the Swedish character Ö, an O with two dots above it, and the two dots are missing. This
    problem is caused by the splitting threshold we chose. There should exist a balance that
    handles both cases well; if there is none, the system produces errors.

    Figure 5.24 the light problems

    Because of the environment lighting (this problem may not be produced by the processing
    application itself, but it should still be mentioned here), the image shown is dark. The grey
    level of the white characters differs from the background, so they are no problem to extract,
    but the grey level of the black characters is close to the background level, so extracting
    them is very difficult. This leads to two situations: either the extracted character is so
    fuzzy that even a human with good judgement has difficulty recognizing it, or it becomes a
    totally white


    chunk, and no one can tell what it was. (Figure 5.17, before this section, shows how the image
    looked during processing.)

    Figure 2.25 special character combination

    From the image above, one can easily see that some unusual spellings and special characters,
    such as V and A placed next to each other, can cause problems. This application has no method
    to solve this kind of problem, but training such pairs as a special combination and then
    giving a two-character output for them may solve it.

    5.1.6 Character recognition

    5.1.7 SVM applied here

    SVM was applied in this application because:

    1. For empirical risk minimization, SVM is better than a neural network.
    2. For a neural network it is very difficult to decide the number of hidden layers, but for
       SVM the kernel functions are given, with no further changes needed.

    Figure 5.25 characters for testing from the image

    Every character image shown before is scanned from top to bottom and left to right. Each
    pixel is given an index number; if the pixel value is 0, a 0 is written after the number, and
    if it is 255, a 1 is written.
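This pixel-numbering scheme matches the LIBSVM input format; a minimal sketch follows (the helper name and the row-major scan order are assumptions, since the exact scan convention is not fully specified):

```python
# Sketch (assumed row-major scan and helper name, not the thesis code) of
# encoding a binary character image as a LIBSVM-style training line:
# each pixel gets a 1-based index, value 0 stays 0 and value 255 becomes 1.

def to_libsvm_line(label, pixels):
    """pixels: 2-D list of grey values (0 or 255)."""
    feats = []
    i = 1
    for row in pixels:               # top to bottom
        for p in row:                # left to right
            feats.append("%d:%d" % (i, 1 if p == 255 else 0))
            i += 1
    return str(label) + " " + " ".join(feats)

tiny = [[255, 0], [0, 255]]          # 2x2 stand-in for a 30x30 crop
print(to_libsvm_line(1, tiny))       # → 1 1:1 2:0 3:0 4:1
```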

    Figure 5.26 training sets

    A template (model) file is then generated from the training sets and used to predict on the
    testing sets.


    As the figure below shows:

    Figure 5.27 testing sets

    Then we get the result; here, the label '1' corresponds to the character 'p'.

    The accuracy rate is tested with all four kernel functions.
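The four kernels compared in the tables below can be written as plain functions in LIBSVM's formulation. The gamma, coef0 and degree values here are illustrative defaults, not the settings used in the thesis:

```python
# The four LIBSVM kernel functions as plain Python (parameter values are
# illustrative defaults, not the thesis settings).
import math

def linear(x, y):
    return sum(a * b for a, b in zip(x, y))

def polynomial(x, y, gamma=0.5, coef0=0.0, degree=3):
    return (gamma * linear(x, y) + coef0) ** degree

def rbf(x, y, gamma=0.5):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def sigmoid(x, y, gamma=0.5, coef0=0.0):
    return math.tanh(gamma * linear(x, y) + coef0)

x, y = [1.0, 0.0], [1.0, 1.0]
print(linear(x, y))   # → 1.0
print(rbf(x, x))      # → 1.0 (identical vectors map to 1 under RBF)
```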


    5.1.8 Test with linear function

    Image Nr    Correctness (270 training images)    Correctness (570 training images)
    Image 1     32%                                  27%
    Image 2     35%                                  38%
    Image 3     46%                                  43%
    Image 4     34%                                  36%
    Image 5     41%                                  32%
    Image 6     43%                                  44%


    Image 7     41%                                  41%
    Image 8     32%                                  33%
    Image 9     27%                                  19%
    Image 10    33%                                  33%


    5.1.9 Test with polynomial function

    Image Nr    Correctness (270 training images)    Correctness (570 training images)
    Image 1     68%                                  72%
    Image 2     62%                                  64%
    Image 3     72%                                  77%
    Image 4     69%                                  71%
    Image 5     73%                                  73%
    Image 6     73%                                  78%


    Image 7     75%                                  62%
    Image 8     77%                                  79%
    Image 9     69%                                  72%
    Image 10    64%                                  62%


    5.1.10 Test with RBF function

    Image Nr    Correctness (270 training images)    Correctness (570 training images)
    Image 1     72%                                  77%
    Image 2     74%                                  75%
    Image 3     76%                                  79%
    Image 4     72%                                  75%
    Image 5     80%                                  80%
    Image 6     73%                                  73%


    Image 7     75%                                  78%
    Image 8     79%                                  82%
    Image 9     72%                                  75%
    Image 10    77%                                  79%


    5.1.11 Test with sigmoid function

    Image Nr    Correctness (270 training images)    Correctness (570 training images)
    Image 1     62%                                  64%
    Image 2     67%                                  69%
    Image 3     64%                                  65%
    Image 4     63%                                  55%
    Image 5     62%                                  54%
    Image 6     63%                                  71%


    Image 7     67%                                  62%
    Image 8     70%                                  70%
    Image 9     69%                                  75%
    Image 10    77%                                  78%

    The results show that, because the character images are small and not very clear, the accuracy
    is not very high. Image recognition is a non-linear problem, so training with the linear
    kernel and then predicting gives almost entirely wrong output. Comparing the other three
    kernel functions, RBF, polynomial and sigmoid, RBF gives the best output here. Regarding the
    number of training images, the results show that training with more sample vectors can
    sometimes give lower accuracy: when the kernel function maps the non-linear space into a
    high-dimensional space and tries to find a linearly separable space, too many sample vectors
    can make it very difficult to find an accurate separating space, so the accuracy can be lower
    than with fewer training vectors.

    So, for each problem type, finding the best number of training vectors for the kernel
    function in use will give very good results.
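The selection step described above can be sketched as a simple grid evaluation over kernel and training-set size on a held-out set. The helper is hypothetical and the numbers are illustrative only, loosely echoing the tables above:

```python
# Skeleton (hypothetical helper, illustrative numbers only) of the selection
# step described above: evaluate each (kernel, training-set size) pair on a
# held-out set and keep the best-scoring pair.

def select_best(accuracy):
    """accuracy: dict mapping (kernel, n_train) -> held-out accuracy."""
    return max(accuracy, key=accuracy.get)

# Numbers loosely echoing the tables: more samples do not always help.
acc = {
    ("linear", 270): 0.36,  ("linear", 570): 0.35,
    ("poly", 270): 0.70,    ("poly", 570): 0.71,
    ("rbf", 270): 0.75,     ("rbf", 570): 0.77,
    ("sigmoid", 270): 0.66, ("sigmoid", 570): 0.66,
}
print(select_best(acc))  # → ('rbf', 570)
```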


    Chapter 6 Conclusion and future work


    6.1 Conclusion

    To conclude this thesis report, the real-time city-name recognition system presented here has
    been a success. The aim of producing the correct city name as output, which is the necessary
    parameter for the real-time environment, has been accomplished. The average accuracy achieved
    by the application is about 60%.

    1. Image segmentation was found to be the most critical task of the whole project. This is
    because of the illumination conditions, especially highlights and shadows. There is always a
    need to tune the segmentation parameters during processing. However, the image segmentation
    done in this project was satisfactory.

    2. Noise filtering with the multiple median filter has been quite efficient, as it reduces the
    number of objects appointed as candidate traffic signs for recognition. The multiple median
    filter is harmless to the internal information carried by the traffic sign, which can be used
    for classification. It is therefore well suited to a real-time traffic sign recognition
    system.

    3. Character extraction is also an important part of this thesis. Using width and height scans
    to extract the characters from the image only works with the normal images shown before. If
    the image is very complex, a threshold must be added, or only the useful frames of the video
    stream can be processed, not all of them.

    4. In the character extraction part, because of noise, some images, especially those with a
    white frame inside the blue background, can fail completely if the board is sloped: every
    character extraction fails, and a big black chunk is output instead.

    5. The support vector machine is very powerful for classifying characters and gives a high
    recognition rate. One disadvantage compared with neural networks is that the noise images must
    also be trained and given a feature space, so that the system can recognize them as noise.

    6. For this SVM algorithm, I first used another approach, which was to save all the images in
    an array, and