
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS 1

Robust Traffic Sign Recognition Based on Color Global and Local Oriented Edge Magnitude Patterns

Xue Yuan, Xiaoli Hao, Houjin Chen, and Xueye Wei

Abstract: Most existing traffic sign recognition (TSR) systems make use of the inner region of the signs or of local features such as Haar, histograms of oriented gradients (HOG), and the scale-invariant feature transform for recognition, although these features remain limited in handling rotation, illumination, and scale variations. A good traffic sign feature should be both discriminative and robust. In this paper, a novel Color Global and Local Oriented Edge Magnitude Pattern (Color Global LOEMP) is proposed. The Color Global LOEMP is a framework that effectively combines color, global spatial structure, global direction structure, and local shape information and balances the two concerns of distinctiveness and robustness.

The contributions of this paper are as follows: 1) color angular patterns are proposed to provide color distinguishing information; 2) a context frame is established to provide global spatial information; because the context frame is established from the shape of the traffic sign, the cells remain well aligned with the inside part of the traffic sign even when rotation and scale variations occur; and 3) a LOEMP is proposed to represent each cell. In each cell, the distribution of the orientation patterns is described by the HOG feature, and then each direction of the HOG is represented in detail by the occurrence of the local binary pattern histogram in this direction. Experiments are performed to validate the effectiveness of the proposed approach within TSR systems, and the experimental results are satisfying, even for images containing traffic signs that have been rotated, damaged, altered in color, or subjected to affine transformations, and for images that were photographed under different weather or illumination conditions.

Index Terms: Histogram of oriented gradient (HOG), local binary pattern (LBP), rotation invariant, traffic sign recognition (TSR).

I. INTRODUCTION

AT PRESENT, intelligent transportation system technology is developing at a very rapid pace. Traffic problems, such as driving safety, city traffic congestion, and transportation efficiency, are expected to be alleviated through the application of information technology and the intelligent transportation of vehicles. As an important subsystem in intelligent transportation system technology, traffic sign recognition (TSR) systems based on computer vision have gradually become an important research topic in the field of intelligent transportation system technology [1]–[10].

Manuscript received February 25, 2013; revised July 17, 2013, October 8, 2013, and December 14, 2013; accepted January 6, 2014. This work was supported in part by the Specialized Research Fund for the Doctoral Program of Higher Education under Grants 20110009120003 and 20110009110001, by the National Natural Science Foundation of China under Grants 61301186 and 61271305, and by the School Foundation of Beijing Jiaotong University under Grants W11JB00460 and 2010JBZ010. The Associate Editor for this paper was S. S. Nedevschi.

X. Yuan is with the School of Electronics and Information Engineering, Beijing Jiaotong University, Beijing 100044, China, and also with the Chinese Academy of Surveying and Mapping, Beijing 100830, China.

X. Hao, H. Chen, and X. Wei are with the School of Electronics and Information Engineering, Beijing Jiaotong University, Beijing 100044, China.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TITS.2014.2298912

Traffic sign images are generally obtained from outdoor natural scenes by means of cameras installed on vehicles, and the images are then input to a computer for processing. Due to the many types of complicated factors present outdoors, outdoor environments are much more complex and challenging than indoor systems. The main difficulty for TSR systems is how to extract a robust, information-rich descriptor under various lighting conditions, shape rotations, affine transformations, dimension changes, and so on.

Through in-depth study of traffic sign data sets, some common characteristics of traffic signs may be observed, such as the following.

1) Some traffic signs have the same local features but different background colors [see Fig. 1(a)].

2) Some traffic signs have the same local features and background colors but different distributions of global shapes [see Fig. 1(b)].

3) Some traffic signs share the same components but have different meanings [see Fig. 1(c)].

The proposed system consists of the following two stages: detection performed using maximally stable extremal regions (MSERs) and recognition performed by the novel Color Global and Local Oriented Edge Magnitude Pattern (Color Global LOEMP) features, which are classified using a support vector machine (SVM). To the best of our knowledge, this is the first paper that adopts local binary pattern (LBP)-based features for TSR, which are more discriminative and robust in handling shape rotations, various illumination conditions, and scale changes in traffic sign images than the existing systems. The remainder of this paper is organized as follows. In Section II, we review previous work and describe our improvements. Section III presents the traffic sign detection algorithm, and Section IV details the Color Global LOEMP feature extraction. Recognition based on SVMs is presented in Section V. The experimental results are presented in Section VI. Finally, a conclusion is presented in Section VII.

II. RELATED WORK

In recent years, research with regard to TSR has grown rapidly due to the significant need for such systems in future vehicles. The most common approach consists of two main stages: detection and recognition. The detection stage identifies the regions of interest and is performed mostly using color segmentation, followed by some form of shape recognition. The detected candidates are then either identified or rejected during the recognition stage. The features used in the recognition stage are the inner components of traffic signs, Haar, and HOG features. Classifiers such as SVM [1]–[3], neural networks [4], and fuzzy regression tree frameworks [5] were reported in recent papers.

1524-9050 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Fig. 1. Examples of traffic signs. (a) Traffic signs with the same local texture patterns but different background colors. (b) Traffic signs with the same local texture patterns and background colors but different distributions of global geometrics. (c) Traffic signs that share the same components but have different meanings.

In the detection stage, the majority of the systems make use of color information as a method for segmenting the image. The performance of color-based traffic sign detection is often reduced in scenes with strong illumination, poor lighting, or adverse weather conditions. Color models, such as hue-saturation-value (HSV), YUV, YCbCr, and CIECAM97, have been used in an attempt to overcome these issues. For example, Gao et al. [6] proposed a TSR system based on the extraction of the red and blue color regions in the CIECAM97 color model. References [1] and [2] extracted color information for traffic sign detection in the HSV color model. Reference [4] detected potential road signs by means of the distribution of red pixels within the image in the YCbCr color model. In contrast, there are several approaches that ignore color information entirely and instead use only shape information from gray scale images. For example, Loy [7] proposed a system that used local radial symmetry to highlight the points of interest in each image and detect octagonal, square, and triangular traffic signs. Recently, Greenhalgh and Mirmehdi [3] have proposed a traffic sign detection algorithm using a novel application of MSERs; the authors validated the efficiency of the MSERs method.

In the recognition stage, the majority of the systems make use of the inner region. For example, Fleyeh and Davami [8] extracted the binary inner component of traffic signs for recognition. They performed size and rotation normalization before recognition in order to reduce the effects caused by shape rotation, affine transformation, and dimension changes. Then, they used the principal component analysis algorithm to determine the most effective eigenvectors of the traffic signs. Escalera et al. [9] indicated that a sign is the sum of a color border, an achromatic (white and/or black) inner component, and a shape. They proposed a method that computes color energy, chromatic energy, gradient energy, and distance energy. They also proposed two techniques for determining the minimum of the energy function as traffic sign regions, namely, simulated annealing and genetic algorithms. Maldonado-Bascón et al. also proposed a TSR system using gray images of inner regions [1], [2], in which the gray images were normalized and the contrasts were stretched and equalized before recognition to reduce the effect of illumination variations.

Some recent systems have made use of HOG, Haar, and scale-invariant feature transform (SIFT) features for TSR. For example, Ruta et al. [5] extracted Haar and HOG features from traffic sign images. Greenhalgh and Mirmehdi [3] proposed a TSR algorithm using HOG features. Takaki et al. [10] and Ihara et al. [11] proposed a TSR method based on keypoint classification by SIFT. In their system, two different feature subspaces are constructed from gradient and general images, and detected keypoints are then projected onto both subspaces. SIFT is a local descriptor that remains unchanged for images of different scales and small rotation angles. Abdel-Hakim and Farag [12] proposed a traffic sign detection and recognition technique by augmenting SIFT with the addition of new features related to the color of the keypoints. In [13], Yuan et al. proposed a context-aware SIFT-based algorithm for TSR. Their paper also proposes a method for computing the similarity between two images that focuses on the distribution of the matching points, rather than using the traditional SIFT approach of selecting the template with the maximum number of matching points as the final result.

However, some issues still remain when illumination and rotation variations occur. For example, the performance of inner-region-based TSR is often reduced in scenes with varying illumination, rotation, and scale and in images that have undergone affine transformations. HOG [14] is very effective in capturing gradients, which have long been known to be crucial for vision and robust to appearance and illumination changes. However, images are clearly more than just gradients, and traditional HOG is still limited in handling rotation variations. SIFT was proposed to solve these problems and is robust to various illumination conditions, shape rotations, and scale changes. However, the dimension of the SIFT feature depends on the number of detected keypoints, and the number of keypoints differs from image to image. Because the dimensions of the features differ, it is difficult to design a suitable classifier based on the SIFT feature. In this paper, an LBP-based feature, which is robust to shape rotations, various illumination conditions, and scale changes in traffic sign images, is proposed. Furthermore, the LBP-based feature is more suitable for combining with common classifiers.

Fig. 2. Examples in the process of traffic sign detection. (a) Original images. (b) Normalized red/blue images. (c) MSERs (each level of the MSER is painted a different color). (d) Borders of the MSERs. (e) Detection results.

Ojala et al. first proposed the concept of LBP [15], which may be converted to a rotational invariant version for application in texture classification. Various extensions of LBP, such as LBP variance with global matching [16], dominant LBPs [17], completed LBPs [18], and joint distribution of local patterns with Gaussian mixtures [19], have been proposed for rotational invariant texture classification.

To the best of our knowledge, there has been no public report of using LBP for TSR; this is because the following issues remain in traditional LBP.

1) LBP is unable to provide color information.

2) LBP only focuses on local textures while ignoring the distribution of global shapes.

3) The rotational invariant version of LBP proposed in [15], named $LBP^{riu2}$, has a very small size, and a feature of such a small size cannot effectively represent a complex traffic sign image.

In this paper, a novel traffic sign descriptor is proposed, known as Color Global LOEMP, which is robust to illumination conditions, scale, and rotation variations and balances the two concerns of distinctiveness and robustness. The main contributions of this paper are as follows.

1) The proposed color angular feature is able to exploit the discriminative color information derived from the spatiochromatic texture patterns of different spectral channels in a local region, which preserves the abundant color information of traffic signs.

2) A novel context frame is established to provide global spatial information. The context frame is established by the shape of the traffic sign, thus allowing the cells to be aligned well with the inside part of the traffic sign, even when rotation and scale variations occur.

3) A novel descriptor known as LOEMP is extracted from each cell, which is robust to lighting conditions and rotation variations. We apply the concept of calculating both the HOG-based structure, to describe the distribution of the holistic orientations, and the LBP-based structure, to describe the distribution of the local shape for each orientation.

Fig. 3. Flow of the proposed TSR system.

Comparative experiments were performed to test the effectiveness of the proposed Color Global LOEMP. The public traffic sign data sets used were the Spanish traffic sign set [20], the German Traffic Sign Recognition Benchmark (GTSRB) data set [21], and an image data set captured from a moving vehicle on cluttered China highways. The experimental results show that the proposed Color Global LOEMP feature is able to yield excellent performance when applied to challenging traffic sign images.

III. TRAFFIC SIGN DETECTION

This paper uses the MSERs method and shape information to extract traffic signs. First, the candidate regions of the traffic signs are detected as MSERs, which are regions that maintain their shapes when the image is thresholded at several levels [see Fig. 2(c), where each MSER level is painted a different color]. Then, the border of each MSER is extracted [see Fig. 2(d)]. Finally, elliptical, triangular, quadrilateral, and octagonal regions are located as the candidate regions for further recognition [see Fig. 2(e)]. Examples of this traffic sign detection process are illustrated in Fig. 2, and each step is presented in detail as follows.

Greenhalgh and Mirmehdi [3] proposed a traffic sign detection algorithm using a novel application of MSERs, and they showed that the MSERs were robust to variations in both lighting and contrast. In this paper, MSERs are adopted for detecting the traffic sign candidate regions. For each pixel of the original image, values are found for the ratio of the blue channel to the sum of all channels and the ratio of the red channel to the sum of all channels. The greater of these two values is used as the pixel value of the normalized red/blue image, i.e.,

$$\Omega_{RB} = \max\left(\frac{R}{R+G+B},\ \frac{B}{R+G+B}\right). \tag{1}$$
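As a minimal sketch of the normalized red/blue value in (1), the following pure-Python snippet computes it for a tiny image stored as nested lists of RGB tuples. The function names and the handling of pure-black pixels (returned as 0.0 to avoid division by zero) are assumptions made for illustration and are not taken from the paper.

```python
def normalized_red_blue(pixel):
    """Return max(R, B) / (R + G + B) for one RGB pixel."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        # Assumption: a pure-black pixel gets a zero response
        # instead of raising a division-by-zero error.
        return 0.0
    return max(r / total, b / total)


def normalized_red_blue_image(image):
    """Apply the per-pixel rule to an image given as rows of RGB tuples."""
    return [[normalized_red_blue(p) for p in row] for row in image]


# Tiny 2 x 2 example: red, green, blue, and mid-gray pixels.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]
omega = normalized_red_blue_image(image)
```

Pure red and pure blue pixels map to the maximum response, pure green maps to zero, and gray pixels map to one third, which is why red and blue sign borders stand out in the normalized image.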

MSERs are found for the image [see Fig. 2(b)]. Each image is binarized at a number of different threshold levels, and the connected components at each level are found. The connected components that maintain their shape through several threshold levels are selected as MSERs [see Fig. 2(c)].

An efficient ellipse detection method [22] is adopted to locate the ellipses from the MSER borders, and the general regular polygon detector proposed by Loy [7] is adopted to detect the triangular, quadrilateral, and octagonal regions.

It should be noted that the detected traffic sign candidate regions contain many noise blobs; the recognition system is used to judge whether a candidate region is a traffic sign. Examples of extracted traffic signs are shown in Fig. 2(e).

IV. COLOR GLOBAL LOEMP FEATURE EXTRACTION

The flow of the proposed TSR system is illustrated in Fig. 3. First, the input color image is divided into three color components, and the color angle patterns are computed as chromatic information. Then, a context frame is used to divide the traffic sign region into several cells, in order to describe the global spatial structure of each color angle pattern. After that, the LOEMP is extracted from each cell, and the LOEMPs extracted from all the color angle patterns and cells are combined as the final descriptor. Finally, an SVM is used as the classifier.

A. Global Feature Extraction

1) Color Angle Patterns: For the purpose of extracting the discriminative patterns contained among the different spectral bands, the ratio of the pixels between a pair of the spectral-band images is calculated (see Fig. 4), using a method proposed by Choi et al. [23]. This directional information may be useful for extracting discriminative color angular patterns for classification.

The ratio of the pixel values between the spectral bands is defined as

$$r^{(i,j)} = \frac{v_j}{v_i + \varepsilon}, \quad \text{for } i < j,\ i = 1, \ldots, K \tag{2}$$

where $v_i$ and $v_j$ are the elements of color vector $\mathbf{c}$ associated with the $i$th and $j$th spectral bands of the color image, respectively. Note that $\varepsilon$ is a small-valued constant used to avoid a zero-valued input in the denominator term. The color angle between the $i$th and $j$th spectral bands is computed as

$$\theta^{(i,j)} = \tan^{-1}\left(r^{(i,j)}\right) \tag{3}$$

where the values of $\theta^{(i,j)}$ fall between 0° and 90°. Note that, as shown in Fig. 4, $\theta^{(i,j)}$ represents the value of the angle computed between the axis (corresponding to the $i$th spectral band) and the reference line, which is formed by projecting $C$ onto the plane associated with the $i$th and $j$th spectral bands.

Fig. 4. Illustration of extracting the color angular patterns from pixel $C$ obtained from three color bands.

Fig. 5. Traffic sign detection and establishing the context frame.
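The ratio in (2) and the angle in (3) can be sketched for a single pixel as follows. This is an illustrative sketch only: the value of the small constant (`EPS = 1e-6`), the function names, and the zero-based band indexing are assumptions, since the paper does not state them.

```python
import math

EPS = 1e-6  # Assumed value of the small constant in the denominator of (2).


def color_angle(v_i, v_j, eps=EPS):
    """Angle in degrees between band i's axis and the projected color vector."""
    ratio = v_j / (v_i + eps)              # ratio of band j to band i, as in (2)
    return math.degrees(math.atan(ratio))  # arctangent of the ratio, as in (3)


def color_angles(pixel):
    """Return the color angle for every band pair (i, j) with i < j."""
    k = len(pixel)
    return {
        (i, j): color_angle(pixel[i], pixel[j])
        for i in range(k)
        for j in range(i + 1, k)
    }


# Example: an RGB pixel with equal red and green and no blue.
angles = color_angles((100, 100, 0))
```

For non-negative band values the ratio is non-negative, so every angle stays within the 0° to 90° range stated above; equal values in two bands give an angle of about 45°.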

2) Global Spatial Structure: In this paper, a novel context frame that is robust to image rotation and scale variations is established to provide global spatial structure information. In order to establish a more robust context frame, a frame of overlapping cells is built in polar coordinates. As shown in Figs. 5 and 6, the proposed context frame divides the image into $M \times N$ cells in polar coordinates [Fig. 5(a2) and (c2) shows examples of 2 × 4 cells, and Fig. 6 shows an example of 3 × 8 cells], where $M$ is the number of cells along the radial coordinate and $N$ is the number of cells along the angular coordinate. The proposed implementation is not exactly a logarithmic polar coordinate, since the radial increment is 0. It should be noted that the cells overlap each other on both the radial and angular coordinates in this paper; the overlapping rate on the radial coordinate is denoted $L_o$, and the overlapping rate on the angular coordinate is denoted $A_o$. The initial angle $\theta_0$ is used to build the context frame; it is equal to the angle between one border of the quadrilateral, octagon, or triangle and the $x$-axis [see Figs. 5(a2) and (c2) and 6]. The context frame is established by the shape of the traffic sign, thus allowing the cells to be aligned well with the inside part of the traffic sign, even when rotation and scale variations occur.

Fig. 6. Context frame model. $\theta_0$ is the initial angle of the context frame model.
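One possible reading of the overlapping polar context frame is sketched below. The paper does not spell out how the overlapping rates enlarge each cell, so this sketch assumes that every cell is widened symmetrically by the overlap fraction of its own radial or angular width; the function name `cells_containing`, the normalization by a sign radius, and the mathematical (counterclockwise) angle convention are likewise assumptions for illustration.

```python
import math


def cells_containing(x, y, cx, cy, radius, m, n, theta0=0.0, lo=0.5, ao=0.5):
    """Return the (ring, sector) cells of an m x n polar frame containing (x, y).

    theta0 is the initial angle of the frame in degrees; lo and ao are the
    radial and angular overlapping rates (assumed to widen each cell).
    """
    r = math.hypot(x - cx, y - cy) / radius  # radius normalized to [0, 1]
    if r > 1.0:
        return []  # pixel lies outside the sign region
    angle = (math.degrees(math.atan2(y - cy, x - cx)) - theta0) % 360.0
    ring_width = 1.0 / m
    sector_width = 360.0 / n
    cells = []
    for ring in range(m):
        ring_center = (ring + 0.5) * ring_width
        if abs(r - ring_center) > 0.5 * ring_width * (1.0 + lo):
            continue
        for sector in range(n):
            sector_center = (sector + 0.5) * sector_width
            diff = abs(angle - sector_center)
            diff = min(diff, 360.0 - diff)  # angular distance with wrap-around
            if diff <= 0.5 * sector_width * (1.0 + ao):
                cells.append((ring, sector))
    return cells


# A pixel on the frame's initial direction falls into two overlapping sectors.
on_border = cells_containing(50, 0, 0, 0, 100, m=3, n=12)
# Rotating the frame by 15 degrees moves the same pixel to one sector's center.
rotated = cells_containing(50, 0, 0, 0, 100, m=3, n=12, theta0=15.0)
```

Because the angle is measured relative to the initial angle of the frame, rotating the sign and the frame together leaves the cell assignment unchanged, which is the alignment property the context frame is meant to provide.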

B. Local Feature Extraction

Ojala et al. [15] proposed the rotation invariant LBP in 2002. In order to build rotation invariant features possessing distinctiveness for each cell, the authors propose applying the concept of calculating both the HOG-based structure, to describe the distribution of the holistic orientations, and the LBP-based structure, to describe the distribution of the local shape for each orientation. Then, the sequence of the bins of the holistic HOG is adjusted, and all histograms originate from its principal orientation. Based on the result of this step, all identical traffic sign images may be considered to be on the same rotation. Finally, the LBP codes of each orientation are integrated based on the distribution of the holistic edge information.

The image gradient is computed in one cell, and the gradient orientation of each pixel is evenly discretized across 0°–180°. Then, a HOG is formed from the gradient orientations of the image. As shown in Fig. 7, the HOG has $K$ bins covering the 180° range of orientations. Each sample added to the histogram is weighted by its gradient magnitude. The maximal bin of the HOG is assigned as the principal orientation $\theta_{\mathrm{main}}$. Then, all the bins of the histogram are shifted until the principal orientation shifts to the first position.
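The histogram construction and the shift to the principal orientation can be sketched as follows. This is a simplified illustration: it takes precomputed gradient vectors for one cell rather than an image, and the function names and the tie-breaking rule (the first maximal bin wins) are assumptions not taken from the paper.

```python
import math


def orientation_histogram(gradients, k):
    """Magnitude-weighted histogram of orientations over [0, 180) with k bins.

    gradients is an iterable of (gx, gy) pairs, one per pixel of the cell.
    """
    hist = [0.0] * k
    bin_width = 180.0 / k
    for gx, gy in gradients:
        magnitude = math.hypot(gx, gy)
        # Fold the signed direction into the unsigned 0-180 degree range.
        angle = math.degrees(math.atan2(gy, gx)) % 180.0
        index = min(int(angle / bin_width), k - 1)  # guard against rounding to 180
        hist[index] += magnitude
    return hist


def shift_to_principal(hist):
    """Circularly shift the histogram so that its maximal bin comes first."""
    main = max(range(len(hist)), key=hist.__getitem__)
    return hist[main:] + hist[:main], main


# Four gradient samples, with the strongest response near 117 degrees.
gradients = [(1.0, 0.5), (-1.0, 2.0), (-2.0, 4.0), (2.0, -1.0)]
hist = orientation_histogram(gradients, k=4)
shifted, main_bin = shift_to_principal(hist)
```

The circular shift keeps the relative order of the bins, so two rotated copies of the same cell produce the same shifted histogram as long as the rotation moves orientations by whole bins.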





To optimize the performance of the linear SVM classifier, an appropriate value for the regularization cost parameter $C$ has to be selected. Cross-validation is performed on the training set, and the value of $C$ that produces the highest cross-validation accuracy is used. In this paper, $C$ is set to 1.2.
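The selection of $C$ by held-out accuracy can be sketched as a k-fold search. The snippet below is a hypothetical harness, not the authors' procedure: the number of folds, the candidate values, and the names `k_fold_splits` and `select_cost` are assumptions, and `toy_fold_accuracy` is a stand-in that replaces actual SVM training so that the example stays self-contained.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


def select_cost(candidates, n_samples, k, fold_accuracy):
    """Return the candidate C with the highest mean validation accuracy."""
    best_c, best_accuracy = None, -1.0
    for c in candidates:
        scores = [fold_accuracy(c, train, validation)
                  for train, validation in k_fold_splits(n_samples, k)]
        mean_accuracy = sum(scores) / len(scores)
        if mean_accuracy > best_accuracy:
            best_c, best_accuracy = c, mean_accuracy
    return best_c, best_accuracy


def toy_fold_accuracy(c, train_indices, validation_indices):
    # Hypothetical stand-in for "train a linear SVM with cost c on the
    # training fold and score it on the validation fold". The peak at 1.2
    # is chosen only to make the example concrete.
    return 1.0 - abs(c - 1.2) / 10.0


best_c, best_accuracy = select_cost(
    [0.1, 0.6, 1.2, 2.4], n_samples=10, k=5, fold_accuracy=toy_fold_accuracy
)
```

In practice the stand-in would be replaced by a function that trains the classifier on the training indices and returns its accuracy on the validation indices.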

VI. EXPERIMENTS

In order to evaluate the effectiveness of the proposed method, a series of comparative experiments were performed using several traffic sign data sets. The experiments included two components, namely, traffic sign detection and classification.

A. Databases

We illustrate the effectiveness of the detection module by presenting experiments on two traffic sign data sets: the Spanish traffic sign set and the authors' data set. The effectiveness of our recognition module is illustrated by presenting experiments on two traffic sign data sets: the GTSRB data set and the authors' data set.

1) Spanish Traffic Sign Set [20]: Many sequences on different routes and under different lighting conditions were captured. Each sequence included thousands of images. With the aim of analyzing the most problematic situations, 313 images were selected from the thousands of 800 × 600 pixel images extracted from several sets in [20]. The images presented different detection and recognition problems, such as low illumination, rainy conditions, arrays of signs, similar background color, and occlusions.

2) GTSRB Data Set [21]: The GTSRB data set was created from approximately 10 h of video that was recorded while driving on different road types in Germany during the daytime. The sequences were recorded in March, October, and November. The testing set contains 12 630 traffic sign images of the 43 classes, and the training set contains 39 209 training images.

3) Authors' Data Set: The authors collected a data set by capturing images on different roads and under different lighting conditions. The camera images have a resolution of 1024 × 768 pixels. Each sequence included several thousand frames, among which more than 5000 frames were analyzed. Visibility status included occluded, blurred, shadowed, and visible. In order to evaluate the effectiveness of the recognition module, the traffic signs were divided into two sets, a testing set and a training set. The testing set contains 4540 actual traffic signs, and the training set contains 4605 actual traffic signs in 41 classes; the training images were captured on different routes from the test images.

It is important to note that all the aforementioned data sets are unbalanced, and the number of images representing different classes varies. Examples of the test images in the aforementioned detection databases are shown in Fig. 8(a) and (b), and examples of the test images used in the recognition database are shown in Fig. 9. As shown in Figs. 8(a) and (b) and 9, these images vary in rotation angle, geometric deformation, occlusion, and shadows, according to different weather and light conditions.

Fig. 8. Examples used in the experiments for traffic sign detection. (a) Examples on the Spanish traffic sign set. (b) Examples on the authors' data set.

Fig. 9. Examples used in the experiments for traffic sign recognition.

B. Experiments for Traffic Sign Detection

All the elliptical, triangular, quadrilateral, and octagonal regions were detected from the borders of the MSERs. The traffic sign regions in the two data sets were all manually labeled by the authors; the sizes of the traffic signs varied between 15 × 15 and 156 × 193 pixels. The manually labeled traffic sign regions were used to evaluate the efficiency of the detection system.

The accuracy of the traffic sign detection was evaluated by observing the outputs of both the detection and recognition modules. If the final result was identified as a traffic sign, then it was considered a detection. If the algorithm failed to detect a sign that was present in the test image, then it was a miss. Finally, if the system detected a non-road-sign object and classified it as a traffic sign, then it was a false alarm. Tables I and II summarize the results generated during the detection processing of the two data sets and include the following information: 1) the total number of traffic signs that appear in the test sequence; 2) the number of detections of traffic signs that have been correctly detected by both the detection and recognition modules; 3) the number of false alarms in the output of the system; and 4) the number of misses.

TABLE I. TRAFFIC SIGNS DETECTION RESULTS ON THE SPANISH TRAFFIC SIGNS DATA SET

TABLE II. TRAFFIC SIGNS DETECTION RESULTS ON THE AUTHORS' DATA SET

After the traffic sign detection step was completed, the traffic sign candidate regions were input into the recognition module for further classification.

C. Experiments for Traffic Sign Classification

Several comparative experiments were performed to validate the effectiveness of the proposed approach in TSR systems. The accuracy rate is computed in each comparative experiment with the formula

$$\text{Accuracy rate} = \frac{n_m}{n_t}$$

where $n_m$ is the number of accurately classified images, and $n_t$ is the number of test images.

1) Comparison With the LBP-Based Features: For comparison purposes, comparative experiments using nine sorts of LBP-based features ($LBP^{u2}_{P,R}$, $LBP^{riu2}_{P,R}$, Color + $LBP^{riu2}_{P,R}$, Global $LBP^{riu2}_{P,R}$, Color + Global $LBP^{riu2}_{P,R}$, LOEMP, Color + LOEMP, Global LOEMP, and Color + Global LOEMP) were performed to confirm the effectiveness of the proposed features, where the LBP parameters $R$ and $P$ were set to 2 and 8, respectively.

Additionally, in order to obtain the global spatial structure, the context frame was used to divide the whole image into $M \times N$ overlapping cells in polar coordinates, where $M$ was the number of cells along the radial coordinate and $N$ was the number of cells along the angular coordinate. On the other hand, in order to build the LOEMP feature, the HOG had $K$ bins covering the 180° range of orientations. The number of cells ($M \times N$) and the number of HOG bins ($K$) are two parameters that impact the performance of the proposed method. Fig. 10(a) and (b) shows the recognition rate when varying the number of cells and HOG bins. As expected, a cell size that is too large or too small results in a decreased recognition rate because of the loss of spatial information or sensitivity to local variations. A smaller number of HOG bins loses discriminative information, and a larger one increases the computational cost. Considering the tradeoff between the recognition rate and the computational cost, in the following experiments, the traffic sign images were divided into 3 × 12 cells in polar coordinates, where the overlapping rate on both the radial and angular coordinates ($L_o$, $A_o$) was set to 0.5. The number of HOG bins was set to 14.

The comparison features are presented as follows:

LBPu28,2: Extraction of the feature ofLB Pu28,2 (uniform

LBP) and computation of the occurrence histogram from

the whole traffic sign region as the final descriptor for

classification. The length of the feature vector LBPu28,2was 59.

LBPriu28,2 : Extraction of the feature ofLBPriu28,2 (uniform

rotation invariant LBP) and computation of the occurrence

histogram from the whole traffic sign region as the final

descriptor for classification, resulting in feature vectors of

length 10.

Color+ LBPriu28,2 : Extraction of the feature ofLB Priu28,2

and computation of the occurrence histogram from each

color angle pattern of the whole traffic sign region. After

that, the combination of the occurrence histograms of

all the color angle patterns as the final descriptor for

classification. The length of the feature vector Color+

LBPriu28,2 was 210= 20. It should be noted that only

two color angle patterns (RG andBG) were used in this

experiment.

GlobalLBPriu28,2 : Division of the traffic sign region into

several cells by the context frame and then extraction of

the feature of LBP and computation of the occurrence

histogram from each cell and each color angle pattern.

Combination of the occurrence histograms of all the cells

as final descriptor for classification. The length of the

feature vector GlobalLBPriu28,2 was 360.

Color+Global LBPriu28,2 : Division of the traffic signregion into several cells by the context frame and then

extraction of the feature ofLB Pand computation of theoccurrence histogram from each cell and each color angle

pattern. Combination of the occurrence histograms of all

the color angle patterns and cells as the final descriptor for

classification. The length of the feature vector was 720.

LOEMP: Extraction of the feature of LOEMP and compu-

tation of the occurrence histogram from the whole traffic

sign region as the final descriptor for classification, result-

ing in feature vectors of length 1410= 140. Color+LOEMP: Extraction of the feature of LOEMP and

computation of the occurrence histogram from each color

angle pattern. Combination of the occurrence histograms

of all the color angle patterns as the final descriptor for

classification, resulting in feature vectors of length 1402= 280.


9/12



Fig. 10. TSR rates with different parameters. (a) TSR rates with different cell numbers. (b) TSR rates with different HOG bin numbers.

TABLE IIIEXPERIMENTALR ESULTS OFC OMPARATIVEE XPERIMENTS

ON THE AUTHORS DATAS ET

Global LOEMP: Division of the traffic sign region into several cells by the context frame and then extraction of the feature of LOEMP and computation of the occurrence histogram from each cell. Combination of the occurrence histograms of all the cells as the final descriptor for classification. The length of the feature vector was 140 × 36 = 5040.

Color+Global LOEMP: Division of the traffic sign region into several cells by the context frame and then extraction of the feature of LOEMP and computation of the occurrence histogram from each cell and each color angle pattern. Combination of the occurrence histograms of all the color angle patterns and cells as the final descriptor for classification. The length of the feature vector was 10 080.
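The feature vector lengths quoted for the LBP-based variants can be checked with a short sketch. It first counts the rotation-invariant uniform (riu2) labels of an 8-point LBP, which gives the 10 bins used above, and then multiplies by the number of color angle patterns, cells, and HOG bins. The cell count (36) and HOG bin count (14) are assumptions inferred from the quoted lengths (360 / 10 and 140 / 10), and all variable names are illustrative only.

```python
"""Descriptor lengths for the LBP-based variants compared in Table III."""

P = 8  # number of LBP sampling points (the "8" in LBP^{riu2}_{8,2})


def riu2_label(pattern: int, p: int = P) -> int:
    """Map a p-bit LBP pattern to its rotation-invariant uniform label."""
    bits = [(pattern >> i) & 1 for i in range(p)]
    # U = number of 0/1 transitions around the circular pattern.
    transitions = sum(bits[i] != bits[(i + 1) % p] for i in range(p))
    # Uniform patterns (U <= 2) are labelled by their number of ones: 0..p.
    # All non-uniform patterns share the single label p + 1.
    return sum(bits) if transitions <= 2 else p + 1


# Number of distinct riu2 labels over all 2^P patterns (P + 2 bins).
lbp_bins = len({riu2_label(code) for code in range(2 ** P)})

# Assumed values, inferred from the quoted lengths rather than stated here.
N_COLOR = 2   # color angle patterns (RG and BG)
N_CELLS = 36  # cells produced by the context frame (360 / 10)
N_HOG = 14    # HOG orientation bins (140 / 10)

loemp_bins = N_HOG * lbp_bins  # one LBP histogram per HOG direction

lengths = {
    "Color+LBP": N_COLOR * lbp_bins,
    "Global LBP": N_CELLS * lbp_bins,
    "Color+Global LBP": N_COLOR * N_CELLS * lbp_bins,
    "LOEMP": loemp_bins,
    "Color+LOEMP": N_COLOR * loemp_bins,
    "Global LOEMP": N_CELLS * loemp_bins,
    "Color+Global LOEMP": N_COLOR * N_CELLS * loemp_bins,
}

for name, length in lengths.items():
    print(f"{name}: {length}")
```

Under these assumptions, the computed lengths agree with the values given for each variant, from 20 for Color+LBP up to 10 080 for Color+Global LOEMP.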

Table III presents the recognition results of the comparison experiments for LBP-based features on the authors' data set. As shown in Table III, the proposed Color Global LOEMP attains the highest recognition rate among all LBP-based feature extraction methods.

2) Comparison With Other Features: For comparison purposes, comparative experiments using 1) candidate traffic sign regions (64 × 64 and 32 × 32 pixels) [1], [2]; 2) two sorts of HOG (set 1 and set 2) [3], [5]; and 3) color histograms were performed to confirm the effectiveness of the proposed Color Global LOEMP.

Candidate traffic sign regions: The recognition stage inputs were candidate traffic sign regions that were scaled to sizes of 64 × 64 and 32 × 32 pixels as grayscale images.

TABLE IV. EXPERIMENTAL RESULTS OF COMPARISON WITH OTHER DESCRIPTORS ON THE GTSRB DATA SET

TABLE V. EXPERIMENTAL RESULTS OF COMPARISON WITH OTHER DESCRIPTORS ON THE AUTHORS' DATA SET

HOG descriptors: Based on the gradients of the color images, different weighted and normalized histograms were calculated, first for the small nonoverlapping cells of multiple pixels that cover the whole image (set 1) and then for the larger overlapping blocks that integrate over multiple cells (set 2). Two sets of features from differently configured HOG descriptors were used. To compute the HOG descriptors, all images were scaled to a size of 128 × 128 pixels. For sets 1 and 2, the sign of the gradient response was ignored. Sets 1 and 2 used a cell size of 16 × 16 pixels, a block size of 4 × 4 cells, and an orientation resolution of 8.

Color histograms: This set of features was provided to

complement the gradient-based feature sets with color

information. It contains a global histogram of the hue

values in HSV color space, resulting in 256 features per

image.
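The color histogram baseline above can be illustrated with a minimal sketch that builds a 256-bin global histogram of hue values using only the Python standard library. The function name, the normalization to unit sum, and the handling of gray pixels (which `colorsys` maps to hue 0) are assumptions made for illustration; the text specifies only a global hue histogram in HSV space with 256 features per image.

```python
import colorsys
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B) values


def hue_histogram(pixels: Sequence[Pixel], bins: int = 256) -> List[float]:
    """Return a normalized global histogram of HSV hue values."""
    hist = [0.0] * bins
    for r, g, b in pixels:
        # colorsys expects channels in [0, 1] and returns hue in [0, 1).
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Quantize the hue into one of `bins` equal-width intervals.
        idx = min(int(h * bins), bins - 1)
        hist[idx] += 1.0
    total = sum(hist)
    # Normalization is an assumption; an empty input yields all zeros.
    return [count / total for count in hist] if total else hist


# Toy "image": two red pixels, one green pixel, one blue pixel.
sample = [(255, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]
features = hue_histogram(sample)
print(len(features))
```

The resulting vector has one entry per hue bin, so it can be concatenated with gradient-based features or passed to a classifier on its own.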

The experimental results of the comparisons with other features are listed in Tables IV and V. The best results are marked in bold font. As shown in Tables IV and V, the proposed Color Global LOEMP attains the highest recognition rate among all feature extraction methods.

Additionally, in Fig. 11, the confusion matrix shows the recognition rate of the 41 classes in the authors' data set. The values on the x-axis represent the individual traffic sign classes, and the values on the y-axis represent the predictions made by the classifier. In Fig. 11, most of the confusions occurred between very similar images, for example, some triangular signs.

Fig. 11. Confusion matrix obtained for a 41-class traffic sign problem.

3) Comparison With Other Systems: In [1], Maldonado-Bascón et al. adopted the inner region of the traffic sign normalized into 31 × 31 pixels as the descriptor, and the linear SVM was adopted as the classifier. Using their method, the

accuracy rate on the GTSRB data set was 85.7869%. In [13],

Yuan et al. proposed a context-aware SIFT-based algorithm for TSR. Furthermore, their paper proposes a method for computing the similarity between two images that focuses on the distribution of the matching points, rather than using the traditional SIFT approach of selecting the template with the maximum number of matching points as the final result. Because the dimensions of the SIFT features differ from one image to another, it is difficult to design a suitable classifier based on the SIFT feature; template matching is therefore used in their system. Using the context-aware SIFT method, the recognition rate on the GTSRB data set was 79.2381%.

The results of GTSRB Competition Phase I are listed in [21]. The experiments made use of the inner region of signs, Haar, or HOG features for recognition. In the recognition stage, classifiers such as convolutional neural network (CNN), convolutional networks, and subspace analysis were proposed for traffic sign recognition. The CNN-based method attained the highest accuracy rate among all comparison experiments. In this paper, a novel feature of a traffic sign was proposed; in the recognition stage of our system, the linear SVM was adopted as the classifier. The best accuracy rate based on the linear SVM classifier was 95.89% on the results list of GTSRB Competition Phase I.

TABLE VI. PROCESSING TIME OF THE TSR SYSTEM PER FRAME

The proposed approach attained an accuracy rate of 97.2581% on the GTSRB data set, which was greater than that of all the aforementioned feature extraction methods of the traffic sign recognition systems.

4) Processing Time: The system ran on a 2.93-GHz Intel Core Duo E7500 CPU with MATLAB 7.0, the frame dimensions were 1360 × 1024, and the Just-In-Time Accelerator was used to speed up our programs. The system speed was around 4 frames/s. The average processing time of each part is listed in Table VI.

D. Discussion

As shown in Table III, the recognition rates increased by

about 8% when the color information was combined, due to

the fact that some traffic signs have the same local features

but different background colors. Combining the color information provided richer image features that are robust to illumination variations.

The recognition rates increased by about 23% when the

global spatial structure information was combined, due to the

fact that some traffic signs had the same local patterns and background colors but different distributions of global shapes.





Xiaoli Hao received the B.E., M.E., and Ph.D. degrees from Beijing Jiaotong University, Beijing, China, in 1992, 1995, and 2010, respectively.

In 1995 she joined the School of Electronics and Information Engineering, Beijing Jiaotong University, where she has been an Associate Professor since 2002. In 2006 she was a Visiting Scholar with the University of California, San Diego, CA, USA. Her research interests include optical imaging, signal processing, and machine vision.

Houjin Chen received the B.E. degree from Lanzhou Jiaotong University, Lanzhou, China, in 1986, and the M.E. and Ph.D. degrees from Beijing Jiaotong University, Beijing, China, in 1989 and 2003, respectively.

In 1989 he joined the School of Electronics and Information Engineering, Beijing Jiaotong University, where he became a Professor in 2000 and is currently the Dean of the School of Electronics and Information Engineering. In 1997 he was a Visiting Scholar with Rice University, Houston, TX, USA, and in 2000 he was a Visiting Scholar with the University of Texas at Austin, Austin, TX. His research interests include signal and information processing, image processing, and the simulation and modeling of biological systems.

Xueye Wei received the B.S. and M.S. degrees from Tianjin University, Tianjin, China, in 1985 and 1988, respectively, and the Ph.D. degree from Beijing Institute of Technology, Beijing, China, in 1994.

He is a Professor of electronics and information engineering, Beijing Jiaotong University, Beijing. His research interests are in the theory of automatic control.
