
Pedestrian Detection and Tracking in Urban Environment using a Multilayer Laserscanner

Samuel Gidel, Paul Checchin, Christophe Blanc, Thierry Chateau, and Laurent Trassoudaine, Lasmea, France

Abstract — Pedestrians are the most vulnerable participants in urban traffic. The first step toward protecting pedestrians is to detect them reliably and in real time. In this paper, a new approach is presented for pedestrian detection in urban traffic conditions, using a multilayer laser sensor mounted on board a vehicle. This sensor, placed on the front of the vehicle, collects distance information distributed over 4 planes. Like a vehicle, a pedestrian constitutes an obstacle in the vehicle environment which must be detected, located, then identified and tracked if necessary. In order to improve the robustness of pedestrian detection using a single laser sensor, a detection system based on the fusion of the information located in the 4 laser planes is proposed. The method uses a non-parametric kernel density estimation of the pedestrian position in each laser plane. The resulting pedestrian estimates are then sent to a decentralized fusion of the 4 planes. Temporal filtering of each object is finally achieved within a stochastic recursive Bayesian framework (Particle Filter), allowing a closer observation of the random dynamics of pedestrian movement. Many experimental results are given and validate the relevance of our pedestrian detection algorithm with respect to a method using only a single-row laser range scanner.

Index Terms — Pedestrian detection, LIDAR, intelligent vehicle, fusion, SIR PF, Parzen kernel method.

1 INTRODUCTION

Pedestrian detection is an essential functionality for intelligent vehicles, since avoiding crashes with pedestrians is a requisite for aiding the driver in urban environments. Currently, in France, more than 535 pedestrians die in road accidents every year, while several hundred thousand are injured. Most accidents (70%) take place in urban areas, where serious or fatal injuries often happen at relatively low speeds. It is therefore important to develop a pedestrian detection system. These issues arise in the context of the LOVe Project (Software for vulnerables observation), which aims at improving road safety, mainly focusing on pedestrian security [1].

In this paper, a system for pedestrian detection based on a multilayer laser sensor mounted on board a vehicle is presented. The system is composed of a single multilayer laser sensor mounted on board a vehicle, with a scan area limited to 150°. It is designed to work in a particularly challenging urban scenario, in which traditional pedestrian detection approaches would yield non-optimal results. Because it must detect all pedestrians (moving or static), no motion model of persons proposed in the literature is suitable. Using a laser system in this way presents many difficulties, including occlusions, non-rigid targets, obvious limitations of this sensor (no information about shape, contour, texture, or color of objects) and varying atmospheric conditions (rain and fog).

For a broad review of the various sensors used for pedestrian detection, one can consult [2], where piezoelectric, radar, ultrasound and laser range scanner sensors, and cameras operating in the visible or in the infrared are described. Using video sensors to solve the problems of detection and identification seems natural at first, given the capacity of this type of sensor to detect and analyze the size, shape and texture of a pedestrian. Many methods to detect human beings were developed in computer vision based on monocular or stereoscopic images [3]–[6]. However, the strong sensitivity to atmospheric conditions, the wide variability of human appearance, the limited aperture of this sensor and the impossibility of obtaining direct and accurate depth information have, among other reasons, given rise to an interest in detection methods based on an active sensor such as a radar or a laser sensor. In this article, we have chosen to focus on the latter type of sensor. Thus, we are interested in the development of a pedestrian detection technique using only data from a 4-layer laser sensor such as the one developed by [8]. This type of sensor, especially in its mono-layer version, has already been used in a great number of practical mobile robotic applications such as SLAM (Simultaneous Localization and Mapping), navigation of robots, and detection, localization and tracking of moving objects [9]–[19].

In real traffic conditions, the pitch of a vehicle in motion can cause the system to fail if a single-row laser range scanner is used. In fact, a small pitch movement (< 1°) can shift the laser plane by 50 cm at a range of 30 m, which can change the information contained in the laser layer. We propose to use the information located in the 4 laser planes in order to overcome this drawback and improve the robustness of the pedestrian detection algorithm. After this detection stage, a Sampling Importance Resampling based Particle Filter (SIR PF) is used in order to track more easily the random movement of pedestrians, which can include abrupt trajectory changes.
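
As a quick check of the order of magnitude quoted above, the vertical displacement of a laser plane induced by a pitch angle α at range D is approximately D · tan(α):

Δh ≈ D · tan(α) = 30 m · tan(1°) ≈ 0.52 m,

so a pitch of slightly less than 1° is indeed enough to move the plane by about half a metre at 30 m.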

This paper is organized as follows: in Section 2, a review of articles related to our research interests is carried out in order to position our work with respect to existing methods. In Section 3, our approach, the system and the sensor are described. In the LOVe project framework, the Renault manufacturer uses the IBEO ALASCA XT on board an experimental vehicle. The first part of Section 4 is dedicated to the segmentation of the laser image. In the second part, the method developed to isolate pedestrian objects and to merge the 4 laser layers is described. Finally, in Section 5, the SIR PF is used to track pedestrians. The results obtained on real data from several scenarios are presented in Section 6.

2 RELATED WORK

Much research work has been carried out over the last years concerning pedestrian detection from a laser sensor. The pedestrian recognition application on board vehicles is particularly challenging with a laser sensor due to the wide range of possible pedestrian appearances, occlusions and the cluttered (uncontrolled) background that are involved. The articles related to this research work can be divided into two main approaches:

• detection and/or tracking of pedestrians in a dynamic mode, in other words these methods detect only moving pedestrians;

• detection and/or tracking of pedestrians in static and dynamic modes, in other words these methods detect both static and moving pedestrians.

Several methods suggest using a sensor in a dynamic mode only. Zhao et al. [16] seek and track among the laser data pairs of clusters of points (equivalent to the feet of a pedestrian) which have a known periodic movement. They use a Kalman filter. In other methods the system is mounted on a mobile platform. Prassler et al. [9], Lindström et al. [12] and Elfes [7] detect pedestrians by assimilating them to particular moving areas, depending on their size, through a temporal analysis of an occupancy grid. Schulz et al. [14] use each local minimum to compute a set of two-dimensional position probability grids, each containing all the probabilities that a pedestrian's legs are at position <x, y> relative to the robot. They use a SJPDAF in order to track people.

In the second category, pedestrians are detected and/or tracked in both static and dynamic modes. Fod et al. [17] propose a method which subtracts a background model to aggregate laser measurements into big blobs and then matches the previous blobs with the new ones for each scan. Their tracking algorithm is based on a Kalman filter. In other methods, the system is mounted on a mobile platform. Szarvas et al. [15] cluster the laser data points before classifying them with a Convolutional Neural Network classifier. Montemerlo et al. compute probabilities based on disparities in x-y space. The probabilities of each point are computed from the Euclidean distance between these points and the closest object, be it a person or an occupied map cell. Fuerstenberg et al. [8] achieve pedestrian recognition by classifying objects according to pre-established criteria (vehicles, vulnerable items, etc.), then track pedestrians using a Kalman filter.

All the methods presented above do not use the 4 laser layers of a multilayer laserscanner before making a final decision and, furthermore, use the traditional Kalman filter, which in our view does not rest on the right assumptions (linear evolution model and Gaussian noise).


3 OVERVIEW

3.1 Proposed approach

In order to find a method enabling outdoor pedestrian detection from a 4-plane laser sensor such as the one developed by the IBEO company, different works listed in the bibliography guided us in this research. Without prior knowledge of the number of obstacles in the observed scene, a segmentation and classification algorithm [8], which clusters the laser measurements and classifies them into different geometrical classes so as to keep only the objects having a "pedestrian" shape, has been chosen. Our approach introduces a new idea which consists in using the information located in the 4 laser planes before making a final decision. This algorithm uses the Parzen method suggested in Cui's article [22] to extract the pedestrian objects in each plane before merging them. Contrary to Cui et al., Parzen methods are used here to detect pedestrians without focusing on parts of the body (legs for instance). Moreover, the use of a Gaussian kernel containing the geometrical information related to a pedestrian is original. In fact, to improve the performance of pedestrian detection algorithms based on a single laser layer sensor, the basic idea is that information missing at a given time t in some layers can be compensated by the other layers, and wrong information located in one or two layers can be rejected by using the others; thus the rate of correct detection is increased. In order to track pedestrians from a moving vehicle, a SIR PF [27] is used because it proved an efficient way to track a varying number of targets when no a priori knowledge or assumption about the movement of a pedestrian is available. An overview of this algorithm is presented in Fig. 2.

3.2 The IBEO Laserscanner

In the LOVe project framework, the Renault manufacturer uses the IBEO ALASCA XT. The IBEO laserscanner (see Fig. 1) has a variable scan area of up to 270°, but limited here to 150° for our experiments. The laserscanner is mounted in the center of the frontal area of the Renault test vehicle. From this position, the sensor can detect all relevant objects in front of the vehicle. The manufacturer indicates that the IBEO sensor has a measurement range of up to 128 m with an accuracy of +/- 5 cm. The angular resolution varies with the scan frequency (at 20 Hz the resolution angle is 0.5°), thus providing 300 measurements per channel and scan. The scan planes have a total opening angle of approximately 3.2°. Two SMAL video cameras mounted on the top of the vehicle (see Fig. 1) simultaneously record the scene.

Fig. 1: The IBEO ALASCA XT Laserscanner and the Renault test vehicle.

4 ALGORITHM PROPOSED FOR PEDESTRIAN DETECTION

In this section, the different modules of the object detection algorithm (see Fig. 2) are presented. In the context of the LOVe project, the algorithm has to detect all the moving and stationary pedestrians located in front of the vehicle. The first step of the algorithm is a segmentation phase. Then the uncommon use of the 4 laser layers and the method adopted to extract pedestrian objects are presented, with two complementary goals: filtering false detections due to information in one or two layers by using the others, and increasing the rate of correct detections which may appear in a single layer by seeking confirmation of the detection in the other three.

Fig. 2: Pedestrian detection algorithm using a multilayer laser sensor (for each of the 4 layers: segmentation, pedestrian extraction and false detection filtering, followed by fusion of the 4 layers and tracking).

4.1 Segmentation

Extracting observations from sensor data is the first fundamental stage of any object detection algorithm. For that purpose, the process starts by grouping all the measurements of a scan into several clusters, according to the distance between two consecutive points Pi and Pi+1, followed by line fitting of the points in each cluster. To extract segments or clusters, the algorithm chosen for our application is among those presented and evaluated in [21]. The selected technique minimizes the orthogonal distance between the measurement points and the estimated line. The process continues by incorporating into clusters (called beacons) all the points that have not been approximated by segments and which represent a large group of points (points close to each other).

Without a priori knowledge of the number of obstacles in the observed scene, our segmentation technique gathers the points into various geometrical classes with the aim of extracting from the laser image the characteristics relating to walls, cars or panels, and of keeping only the objects having a "pedestrian" signature (see § 4.2). All the points gathered in the geometrical classes are eliminated from the initial laser layer before looking for the "pedestrian" signature.
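
For illustration, a minimal Python sketch of the clustering stage described above is given below (the paper's reference implementation is in Matlab and C/C++; the distance threshold d_max and the helper names are assumptions, not values from the paper).

import numpy as np

def cluster_scan(points, d_max=0.3):
    """Group consecutive scan points into clusters when the gap between two
    consecutive points P_i and P_{i+1} stays below d_max (metres).
    `points` is an (N, 2) array of 2D laser readings ordered by angle."""
    clusters, current = [], [points[0]]
    for prev, cur in zip(points[:-1], points[1:]):
        if np.linalg.norm(cur - prev) < d_max:
            current.append(cur)
        else:
            clusters.append(np.array(current))
            current = [cur]
    clusters.append(np.array(current))
    return clusters

def fit_line_orthogonal(cluster):
    """Total least squares line fit: returns (centroid, direction) minimizing
    the orthogonal distance of the cluster points to the line."""
    centroid = cluster.mean(axis=0)
    # principal direction of the centred points (largest singular vector)
    _, _, vt = np.linalg.svd(cluster - centroid)
    return centroid, vt[0]

# Example: one noisy wall-like segment followed by a small distant blob
scan = np.vstack([np.c_[np.linspace(0, 2, 20), np.full(20, 5.0)],
                  np.array([[4.0, 8.0], [4.05, 8.02]])])
for c in cluster_scan(scan):
    print(len(c), fit_line_orthogonal(c))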

4.2 Pedestrian extraction in the 4 laser planes

This section presents the detection system based on the fusion of the information located in the 4 horizontal planes; this system improves pedestrian detection in comparison with a method using only a single-row laser range scanner. The main idea is to reduce false detections which may appear in a single layer by seeking confirmation of the detection in the other three. Part of the problem of occlusion of an object by a smaller one is also solved thanks to the angular spacing of about 1.07° between the 4 laser layers. After the segmentation step and the object classification in each of the four layers, we propose a non-parametric method based on a discrete modeling of the probability density of each laser reading using a kernel density estimator [22].

Initially, and for each layer, all the points which were not filtered out by the segmentation module are used in the Parzen density estimator to compute the pedestrian probability densities located in the observed laser plane. In the fusion module, all the "pedestrian objects" detected in each laser layer are projected onto a plane parallel to the ground plane. Then the Parzen density estimator is used again to compute the pedestrian probability density located in the scene observed by the 4 laser planes.

4.3 Pedestrian Detection

Using a laser scan, the system must deduce, from the position of the laser points, a non-parametric representation of the likelihood. Thus the complexity of this model depends directly on the assumptions made about the shape of a pedestrian in a laser image:

• How many raw observations are expected?
• How are these laser measurements distributed?

To answer these questions, we propose an original approach to build a non-parametric model based on kernel functions, allowing a smart selection of the most pertinent 2D laser points from a likelihood analysis function. A discriminating likelihood function permits the classification of each laser point as a pedestrian gravity center or not. This method is not supervised, so no prior knowledge is required to process a laser scan.

Let Z ≐ {z_k}, k = 1, ..., N_s, denote the vector composed of the 2D laser readings which have not been filtered out by the segmentation module in a laser plane. We define a Bernoulli random variable w_k ∈ {w_1, w_2}, given by w_k = w_1 if the associated event is classified as a pedestrian gravity center, and w_k = w_2 in all other cases.

The likelihood function p(Z|w_k) computes the probability that a laser point belongs to a pedestrian gravity center. We propose to express the likelihood p(Z|w_k) by a non-parametric model using an estimation based on kernel functions (Parzen window model):

p(Z|w_k) = (1 / N_bpts) · Σ_{i=1}^{N_s} φ(z_k, z_i)    (1)

where N_s represents the total number of points present in the image and N_bpts represents the number of theoretical points that a pedestrian should send back into a laser layer at distance D. Finally, φ(z_k, z_i) is the kernel function which modulates the zone of influence of a point over its neighbours; it is defined by:

φ(z_k, z_i) = exp[−λ_c · d_c(z_k, z_i)]    (2)

The λ_c parameter adjusts the weight calculation. The distance d_c is a Mahalanobis distance defined by:

d_c(z_k, z_i) = (z_k − z_i) Σ_φ^{−1} (z_k − z_i)^T    (3)

with Σ_φ the covariance matrix associated with the two geometrical components of a pedestrian (width and thickness) in a laser image:

Σ_φ = [ σ_width²  0 ; 0  σ_thickness² ]    (4)

Finally, the function that weights the likelihood function p(Z|w_k) according to the sensor characteristics, the sought objects and the detection distance is defined by:

N_bpts = W / (D · tan θ)    (5)

This expression takes into account the pedestrian's dimensions (width W) and the angular resolution (θ) of the sensor, as a function of the distance D separating the obstacle from the vehicle.
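
As an illustration with plausible values (the pedestrian width and range here are assumptions; the 0.25° angular resolution is the one used in the experiments of Section 6): for W ≈ 0.5 m, D = 20 m and θ = 0.25°,

N_bpts = 0.5 / (20 · tan(0.25°)) ≈ 0.5 / 0.087 ≈ 6 points per layer.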

For each plane, the 2D laser points having the highest probability of z_k ∈ w_1 are chosen by the maximum likelihood estimator as the pedestrians' positions:

Ẑ_{k|k} = argmax_k ( p(Z | w_k ∈ w_1) )    (6)

Once the pedestrian gravity center is defined, the next step consists in searching for all the points belonging to the group of points of that gravity center. These points are eliminated if their distance d_c(z_k, z_i) is lower than the threshold α. Thus the point list L_i below is eliminated:

L_i = { z_i : d_c(z_k, z_i) < α },  α = MaxPedestrianWidth / 2    (7)

This algorithm is reiterated as long as the maximum likelihood estimator contains a value higher than the trust threshold δ ∈ [0, 1] (see Algorithm 1).

Algorithm 1 Pedestrian detection algorithm with a non-parametric estimator

Input: the set of 2D laser measurements which have not been filtered out by the segmentation module, Z = {z_i}, i = 1, ..., N_s
Compute the likelihood function p(Z|w_k)
Initialization: m = 0 and Z_0 = Z
repeat
    m = m + 1
    Extract the maximum likelihood point: ẑ_m = argmax_k ( p(Z_m | w_k ∈ w_1) )
    Compute the associated point set: L_m = { z_i ∈ Z_m | d_c(ẑ_m, z_i) < α }
    Update the input set: Z_{m+1} = Z_m \ L_m
until the stopping criterion is reached: p(Z_m | w_k ∈ w_1) < δ
M = m
Return the set of selected points: Ẑ = {ẑ_1, ẑ_2, ..., ẑ_M}
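
The sketch below illustrates Algorithm 1 in Python (the paper's implementation is in Matlab and C/C++); the parameter values (λ_c, the covariance entries, α and δ) are placeholders chosen for the example, not values taken from the paper.

import numpy as np

def pedestrian_centers(points, n_bpts, lambda_c=1.0,
                       sigma_width=0.25, sigma_thick=0.15,
                       alpha=3.0, delta=0.6):
    """Iteratively extract pedestrian gravity centers from unfiltered 2D laser
    points with a Parzen-window likelihood (sketch of Algorithm 1)."""
    inv_cov = np.linalg.inv(np.diag([sigma_width**2, sigma_thick**2]))

    def maha(a, b):
        d = a - b
        return float(d @ inv_cov @ d.T)               # distance of eq. (3)

    def kernel(a, b):
        return np.exp(-lambda_c * maha(a, b))         # kernel of eq. (2)

    centers = []
    remaining = list(map(np.asarray, points))
    while remaining:
        # likelihood of each candidate being a gravity center, eq. (1)
        lik = [sum(kernel(zk, zi) for zi in remaining) / n_bpts
               for zk in remaining]
        best = int(np.argmax(lik))
        if lik[best] < delta:                         # trust threshold delta reached
            break
        center = remaining[best]
        centers.append(center)
        # remove the points associated with this center (gate alpha, placeholder units)
        remaining = [z for z in remaining if maha(center, z) >= alpha]
    return centers

# Example usage: a pedestrian-like blob of returns plus an isolated outlier
pts = [(10.0, 1.0), (10.05, 1.1), (9.95, 0.95), (10.1, 1.05), (15.0, -3.0)]
print(pedestrian_centers(pts, n_bpts=5))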

4.4 False detection filtering

What is a false detection? A false detection is an object classified as "pedestrian" when it is not. Indeed, in the complex urban environment, many detections are unfortunately wrong because many objects are not totally filtered out, such as cars, trucks, buses, poles, trees, crash barriers, etc. So, this module checks the size and the orientation angle of the segments labelled as "pedestrian" by the Parzen algorithm, in order to filter out all the segments with a size lower than 30 cm or greater than 80 cm, or with an orientation angle greater than 18° or lower than −18° in the laser reference frame.
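
A compact sketch of this size/orientation test (Python, for illustration; the segment representation is an assumption):

def keep_pedestrian_segment(size_m, orientation_deg):
    """Keep a 'pedestrian' labelled segment only if its size lies in
    [0.30 m, 0.80 m] and its orientation stays within +/-18 degrees
    in the laser reference frame."""
    return 0.30 <= size_m <= 0.80 and -18.0 <= orientation_deg <= 18.0

# Example: a 0.5 m wide, nearly fronto-parallel segment passes the test
print(keep_pedestrian_segment(0.5, 4.0))   # True
print(keep_pedestrian_segment(1.2, 4.0))   # False: too wide (e.g. a car side)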


4.5 Fusion of the 4 layers

Once all the positions Z = {z_k}, k = 1, ..., M, resulting from the 2D raw observations and representing the gravity centers of the pedestrians located in each laser plane are known, one must verify whether the 4 laser planes confirm the same information: that is the fusion stage. First, the fusion of the information located in the 4 laser planes consists in projecting all the points Z onto the same plane. Then the fusion is carried out by a method similar to the pedestrian detection (see Algorithm 1), with Σ_ψ the covariance matrix associated with the laser sensor inaccuracy in the two dimensions (x and y):

Σ_ψ = [ σ_x²  0 ; 0  σ_y² ]    (8)

The function that weights the likelihood function p(Z|w_k) according to the sensor characteristics, the sought objects and the detection distance is defined by:

N_l = Σ_{i=1}^{4} N_l(i)    (9)

with N_l(i) = 1 if 0 < (h_c + D · tan(φ_i)) < H    (10)

These expressions take into account the pedestrian's dimensions (height H), the angular spacing (φ_i) between the layers as a function of the distance D separating the obstacle from the vehicle, and the height of the sensor (h_c) in relation to the ground. The prominent 2D laser points z_k ∈ w_1 are selected as the pedestrian position by the maximum likelihood estimator (see equation 6).
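
A small Python sketch of this layer-count weighting follows; the sensor height and the individual layer elevation angles are assumptions (the paper only states a total opening of about 3.2° and a spacing of about 1.07°, so angles around ±0.53° and ±1.6° are plausible), as is the pedestrian height used in the example.

import math

def layers_hitting_pedestrian(D, H=1.70, h_c=0.5,
                              phi_deg=(-1.6, -0.53, 0.53, 1.6)):
    """Count how many of the 4 layers should intersect a pedestrian of height
    H (m) standing at distance D (m), following eq. (9)-(10): a layer counts
    if 0 < h_c + D*tan(phi_i) < H."""
    return sum(0.0 < h_c + D * math.tan(math.radians(p)) < H
               for p in phi_deg)

# With these assumed values the expected number of confirming layers
# decreases with distance:
for D in (5, 10, 20, 30):
    print(D, layers_hitting_pedestrian(D))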

Once the pedestrian gravity center is defined, the next step consists in searching for all the points belonging to the group of points of that gravity center. These points are eliminated if their distance d_c(z_k, z_i) is lower than β. Thus the point list L_i below is eliminated:

L_i = { z_i : d_c(z_k, z_i) < β },  β = SensorInaccuracy    (11)

This algorithm is reiterated as long as the maximum likelihood estimator contains a value higher than the trust threshold υ ∈ [0, 1].

5 ALGORITHM PROPOSED FOR PEDESTRIAN TRACKING

The choice of the tracking algorithm depends directly on the application. In the case of pedestrian tracking, no prior knowledge or assumption about their 2D motion, which can be very uncertain, is assumed. The most commonly used framework for tracking is based on Bayesian sequential estimation.

Under such assumptions (stochastic state equation and/or non-linear state and/or non-Gaussian noises), particle filters are particularly well adapted.

The SIR PF proposed by Gordon et al. [27], together with the derived Auxiliary and Regularized particle filters, are the most popular particle filters for estimating a non-Gaussian probability density or handling a non-linear evolution model.

5.1 SIR PF

In this section, the theory of sequential Monte Carlo methods in the framework of multiple object tracking is briefly recalled. For more details, the reader can refer to Gordon's work [27]. Let us consider a discrete dynamic system:

X_k = f(X_{k−1}) + W_k    (12)

Z_k = h(X_k) + V_k    (13)

where X_k represents the state vector and Z_k the measurement vector at instant k.

Particle filters provide an approximate Bayesian solution to discrete-time recursive problems by updating a rough description of the posterior filtering density p(x_k|z_{1:k}). This a posteriori belief represents the state in which the objects are.

The main purpose of particle filters is to approximate the prior distribution of the recursive Bayesian filter p(x_k|z_{1:k−1}) as a set of N_s samples, using the following equation:

p(x_k|z_{1:k−1}) = (1 / N_s) · Σ_{i=1}^{N_s} δ(x_k − x_k^i)    (14)

where δ is the discrete Dirac function. Then the posterior distribution p(x_k|z_{1:k}) can be estimated by:

p(x_k|z_{1:k}) = p(z_k|x_k) · Σ_{i=1}^{N_s} p(x_k|x_{k−1}^i)    (15)

This approach can be implemented by a bootstrap filter or a SIR PF.


Fig. 3: Pedestrian tracking algorithm (observations and assumptions feed a data association module, a track management module and the SIR PF tracker, with a limitation of the exploration state).

With regard to the dynamics of the pedestrians' movements (see equation 12), we suppose that no prior information on their trajectory (change of pace, direction, sudden stop, etc.) is available. In order to predict all these trajectory modifications as well as possible, an evolution model with a circular motion [29] is used. The heading angle is used as a disturbance of the predicted trajectory. The model with circular motion applied to each particle is defined below:

F_k = [ 1  ΔT·sin(θ_k)/θ_k         0  −(ΔT/θ_k)·(1−cos θ_k) ;
        0  cos θ_k                 0  −sin θ_k ;
        0  (ΔT/θ_k)·(1−cos θ_k)    1  ΔT·sin(θ_k)/θ_k ;
        0  sin θ_k                 0  cos θ_k ]    (16)

and θ_{k+1} = θ_k + b_g with b_g ∼ N(0, σ_g), where σ_g is the standard deviation of the heading angle of the pedestrian's trajectory.

The state vector used summarizes all the information observed in the scene, i.e. the number of observed pedestrians and their characteristics:

X_k = (O_k, x_{1,k}, ..., x_{N,k})    (17)

with O_k a discrete random variable representing the number of pedestrians present in the scene and x_{N,k} = (p_{N,k}, I_{N,k}) the state vector associated with object N. The 2D position and speed characteristics are given by p_{N,k}; I_{N,k} gives the identification, the age, and the number of points that the pedestrian sends back into a laser layer. According to equation 13:

Z_k = [I_{2×2}] x_k + V_k    (18)

where x_k represents the object position. Finally, the noises V_k and W_k are assumed to be Gaussian, zero-mean and of respective covariances:

Q_k = [ σ_x²  0 ; 0  σ_y² ],   R_k = [ σ_cx²  0 ; 0  σ_cy² ]    (19)

The variances of the added noises depend on the maximum movement amplitude possible for a pedestrian, i.e. σ_x = σ_y = 2 m, and on the maximum errors of the sensor measurements, σ_cx = 0.2 m and σ_cy = 0.2 m.
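
To make the filtering loop concrete, here is a minimal SIR PF cycle in Python: prediction with the circular model of eq. (16), Gaussian weighting of the 2D position measurement as in eq. (18), then multinomial resampling. The paper's implementation is in Matlab and C/C++; the value of σ_g and the particle count are assumptions, while ΔT = 0.14 s matches the scan period quoted in Section 6.

import numpy as np
rng = np.random.default_rng(0)

def f_circular(x, theta, dT=0.14):
    """Circular (coordinated-turn) evolution model of eq. (16) applied to one
    particle x = [px, vx, py, vy]; theta is the disturbed heading angle."""
    s, c = np.sin(theta), np.cos(theta)
    F = np.array([[1, dT * s / theta, 0, -dT * (1 - c) / theta],
                  [0, c,              0, -s],
                  [0, dT * (1 - c) / theta, 1, dT * s / theta],
                  [0, s,              0, c]])
    return F @ x

def sir_step(particles, z, sigma_pos=2.0, sigma_meas=0.2, sigma_g=0.3):
    """One SIR PF cycle: predict, weight by the position measurement z, resample."""
    n = len(particles)
    pred = []
    for x in particles:
        theta = rng.normal(0.0, sigma_g) or 1e-6       # avoid division by zero
        x_new = f_circular(x, theta)
        x_new[[0, 2]] += rng.normal(0.0, sigma_pos, 2)  # process noise on position
        pred.append(x_new)
    pred = np.array(pred)
    # measurement model of eq. (18): the observation is the 2D position
    d2 = np.sum((pred[:, [0, 2]] - z) ** 2, axis=1)
    w = np.exp(-0.5 * d2 / sigma_meas**2)
    w /= w.sum()
    # multinomial resampling (Sampling Importance Resampling)
    idx = rng.choice(n, size=n, p=w)
    return pred[idx]

# Example: 200 particles spread around a detection at (10 m, 2 m)
parts = np.tile([10.0, 0.0, 2.0, 0.0], (200, 1)) + rng.normal(0, 0.5, (200, 4))
parts = sir_step(parts, z=np.array([10.2, 2.1]))
print(parts[:, [0, 2]].mean(axis=0))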

5.2 Track management module

To allow the number of objects to change, Khan et al. [26] introduce RJMCMC (Reversible Jump Markov Chain Monte Carlo) methods. Indeed, as the number of visible objects may change, the state space dimension may also change, following the set of RJMCMC moves defined by the user. For example, the move set can include: {update, birth, death, merge, split, ...}.

In order to change the number of tracked objects, a track management module is used. Its behaviour is summarized below:

• If an observation cannot be associated with the set of assumptions, then the track management module proposes a new assumption.

• If an assumption is not associated with any observation for 500 ms, the track management module proposes to suppress the assumption. In this case, of course, an evolution model helps to guide the state space exploration of the SIR PF algorithm with a prediction of the state.

The limitation of the exploration state is given by the maximum displacement speed of a pedestrian (≈ 2 m/s).
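
A schematic Python sketch of these two rules follows; the class layout, the timeout constant and the simple nearest-neighbour gating are illustrative assumptions (the real module also interacts with the SIR PF and the validation gate of § 5.3).

import numpy as np

MAX_SPEED = 2.0        # m/s, limits the exploration state
TIMEOUT = 0.5          # s, matching the 500 ms suppression rule

class Track:
    def __init__(self, position, now):
        self.position = np.asarray(position, dtype=float)
        self.last_seen = now

def manage_tracks(tracks, observations, now, dt):
    """Associate observations to existing tracks, create a track for every
    unassociated observation, and delete tracks unseen for more than TIMEOUT."""
    gate = MAX_SPEED * dt          # a pedestrian cannot move further than this
    for z in observations:
        z = np.asarray(z, dtype=float)
        dists = [np.linalg.norm(t.position - z) for t in tracks]
        if dists and min(dists) < gate:            # nearest-neighbour association
            t = tracks[int(np.argmin(dists))]
            t.position, t.last_seen = z, now
        else:                                      # rule 1: propose a new assumption
            tracks.append(Track(z, now))
    # rule 2: suppress assumptions not observed for more than TIMEOUT
    return [t for t in tracks if now - t.last_seen <= TIMEOUT]

# Example: one existing track, one matching observation and one new one
tracks = [Track((10.0, 2.0), now=0.0)]
tracks = manage_tracks(tracks, [(10.1, 2.05), (4.0, -1.0)], now=0.14, dt=0.14)
print(len(tracks))   # 2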


5.3 Limitation of exploration state and data association

Multiple object tracking with a particle filter generally uses a data association step, in which each target is mapped to an object hypothesis. Conventional methods such as the Nearest Neighbor Standard Filter (NNSF), the Joint Probabilistic Data Association Filter (JPDAF) [28] and also Multi Hypothesis Tracking (MHT) [25] calculate a region delimiting the space where future observations are likely to occur [28]. Such a region is called a validation gate, or simply gate. Selecting too small a gate size may lead to missing the target-originated measurement, whereas selecting too large a size is computationally expensive and increases the probability of selecting false observations.

In our framework, the validation gate G_k can be approximated by an ellipsoidal region derived from a Gaussian density p(x_k|z_{1:k}) = N(x̄_k, P_k) with:

x̄_k = Σ_{i=1}^{N_s} w_k^i x_k^i    (20)

P_k = Σ_{i=1}^{N_s} w_k^i (x_k^i − x̄_k)(x_k^i − x̄_k)^T    (21)

where x̄_k and P_k are the first two moments of the predicted Gaussian density. In this case, the validation window is the ellipsoid of dimension N_z (the dimension of the measurement vector) defined as:

G_k = { z_k : (z_k − ẑ_k) S_k^{−1} (z_k − ẑ_k)^T ≤ γ }    (22)

where S_k = H · P_k · H^T + R is the covariance of the innovation corresponding to the true measurement. The threshold γ is obtained from the Chi-square tables for N_z degrees of freedom and represents the probability that the (true) measurement will fall in the gate.
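
A small Python sketch of this gating test is given below; the particle weights and the numerical value of γ (here the 99% Chi-square quantile for N_z = 2, about 9.21) are illustrative assumptions.

import numpy as np

def in_validation_gate(z, particles, weights, H, R, gamma=9.21):
    """Ellipsoidal gate of eq. (20)-(22): accept measurement z if its
    normalized innovation squared with respect to the weighted particle
    cloud stays below the Chi-square threshold gamma."""
    x_bar = weights @ particles                                  # eq. (20)
    diff = particles - x_bar
    P = (weights[:, None] * diff).T @ diff                       # eq. (21)
    S = H @ P @ H.T + R                                          # innovation covariance
    nu = z - H @ x_bar                                           # innovation
    return float(nu @ np.linalg.inv(S) @ nu) <= gamma            # eq. (22)

# Example: 2D position measured from a 4D state [px, vx, py, vy]
H = np.array([[1.0, 0, 0, 0], [0, 0, 1.0, 0]])
R = np.diag([0.2**2, 0.2**2])
particles = np.random.default_rng(1).normal([10, 0, 2, 0], 0.3, (500, 4))
weights = np.full(500, 1 / 500)
print(in_validation_gate(np.array([10.1, 2.0]), particles, weights, H, R))  # True
print(in_validation_gate(np.array([15.0, 6.0]), particles, weights, H, R))  # False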

In this paper, the NNSF was chosen in order to match the different measurements with the different assumptions. The NNSF is the most popular and widely used algorithm for target tracking due to its computational simplicity and its low computation time. Because the measurements (or observations) sent by the pedestrian detection algorithm (cf. § 4) lie in a lightly cluttered environment, the NNSF can be used with good performance [28].

6 EXPERIMENTS

This section presents the experiments which have allowed us to validate the pedestrian detection and tracking algorithm [20] [23].

6.1 Detection results

In order to evaluate the pedestrian detection algorithm, we have tried to answer the following question:

How can we objectively evaluate the performance of pedestrian detection?

In order to define a framework for detection evaluation, it is important to understand what qualities are essential to a good laser-based detection method. To do so, it can be helpful to consider what constitutes a "golden" pedestrian detection algorithm by means of a laser sensor. One could argue that a good laser detection, in a real-life situation, should:

1) detect all the moving and stationary pedestrians who are located in front of the vehicle;
2) have a false alarm rate equal to zero;
3) detect objects in all weather conditions (rain, fog, etc.);
4) accurately estimate pedestrian position;
5) be fast (real time pedestrian detection).

So, this evaluation method focuses on the generic tasks listed above. From this list of qualities, a ground truth has been created from one of our scenarios in order to evaluate our detection algorithm. This ground truth allows us to know, for each laser scan, all the pedestrian objects located in the scene with their exact position. Few studies are proposed in the literature concerning the evaluation of pedestrian detection methods by means of a laser sensor. The ground truth proposed in our framework takes into account the sensor's incapacity to detect a pedestrian occluded by other objects, in order not to bias the pedestrian detection rate. In fact, occluded objects are a special case that can cause spurious errors to appear when evaluating the configuration.

So, the "Ground Truth" scenario from our experiments allows us to evaluate the pedestrian detection algorithm with real data. All experiments are based on real laser scan sequences.

The "Ground Truth" scenario takes place in an ur-ban environment. This experiment is obtained withthe Renault test vehicle moving at real conditiontraffic. It includes several pedestrians(> 5) who

Page 9: Pedestrian Detection and Tracking in Urban …chateaut.free.fr/publications/2009gidelITS.pdftion of robots, detection, localization and tracking of moving objects [9]–[19]. In real

9

appear or who disappear in the sensor area. Thesensor resolution angle is 0.25o.

During experiments, a complete laser statementis memorized approximately every 140 ms. Thisalgorithm is implemented in Matlab and C/C++[30]. All the results are obtained using the samesingle set of parameters. This algorithm was testedin different situations such as an urban scene, asemi-urban scene or a car park. For each scan, thenumber of false detections is obtained by calculatingthe ratio:

rate_of_false_detections=NT −NP

NT(23)

with NT the total number of detections andNPthe number of detected pedestrians. The rate ofpedestrian detection is given by calculating the ratio:

rate_of_pedestrian_detection=NP

NP_VT(24)

with NP_VT the number of pedestrians who areeffectively in the sensor area.
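
For instance, these two ratios can be computed per scan as follows (the counts in the example are made up for illustration):

def detection_rates(n_total, n_pedestrians, n_pedestrians_gt):
    """Per-scan false detection rate, eq. (23), and pedestrian detection
    rate, eq. (24)."""
    false_rate = (n_total - n_pedestrians) / n_total
    detection_rate = n_pedestrians / n_pedestrians_gt
    return false_rate, detection_rate

# Example: 8 detections, 5 of which are real pedestrians out of 6 visible ones
print(detection_rates(8, 5, 6))   # (0.375, 0.833...)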

6.2 Advantage of the fusion of the 4-plane laser method

Table 1 shows the advantage of using the 4 laser layers in order to significantly decrease the number of false detections. It can also be noticed that the rate of pedestrian detections is higher when using the 4 laser layers on the "Ground Truth" scenario presented in the last paragraph.

TABLE 1: Rates of false and correct detections according to the number of layers used, for the scenario presented in the article.

                                          One layer                    4 layers
                                   false det.   pedestrian      false det.   pedestrian
                                   rate         det. rate       rate         det. rate
dynamic scenario obtained with
the Renault vehicle (50 s)         0.424        0.705           0.342        0.916

Fig. 4 presents the ROC (Receiver Operating Characteristic) curve obtained with the detection algorithm. In fact, ROC curves are a standard way to display the performance of a set of binary classifiers for all feasible ratios of the costs associated with false positives and false negatives. The aim of ROC analysis is to display in a single graph the performance of classifiers for all possible costs of misclassification. In this paper, laser pedestrian detection is considered as a classifier parameterized by a threshold δ ∈ [0,1] (see Algorithm 1), a threshold which determines the proportion of false or true positives.

Fig. 4: Receiver Operating Characteristic (ROC) curves (true positive rate versus false positive rate) obtained from the two algorithms (4 layers vs. one layer) with the same parameter set.

All the results presented in the different figures were obtained only with the detection method using the 4 laser layers. Taking into account the results of the ROC curve, we have chosen a pedestrian acceptance threshold of 0.6, which provides a good compromise between the false detection rate and the pedestrian detection rate. To illustrate the detections obtained in an outdoor environment, the detected pedestrians' estimated positions are projected into the video image. These experiments cover a great number of urban situations, which allows the robustness of our method to be tested. It is interesting to notice that pedestrian detection is correct at distances up to 20 m, which is difficult to achieve with an angular resolution of 0.25°.

6.3 Tracking evaluation

The behavior of the SIR PF is evaluated on different sequences acquired on board the vehicle (see Fig. 1) which served to validate our pedestrian detection algorithm. These sequences, acquired in an urban scene (see Fig. 9), show one or more people walking alone across the scene, passing each other, meeting at the center of the scene or walking together across the scene.

Fig. 5: Result of pedestrian tracking on depth and theta positions. Measurements are always represented as gray circles.

Fig. 6: Result of pedestrian tracking on depth and theta velocities.

Several experiments are now presented, demonstrating the performance of the SIR PF algorithm, tested on real data. The presented scenario (see Fig. 10) includes several pedestrians (> 5). In the urban scene, the pedestrians move in all directions. The vehicle moves at a speed ranging from 0 to 50 km/h, which allows the robustness of this method to be tested.

Fig. 7: Result of pedestrian tracking on depth and theta position errors.

Fig. 8: Result of pedestrian tracking on depth and theta velocity errors.

7 DISCUSSION AND CONCLUSION

In the future, vehicles are expected to become more intelligent and responsive, managing information delivery in the context of the driver's situation. Pedestrian protection is one way of accomplishing this goal. The study presented in this paper is related to the capability of detecting pedestrians using only a laser sensor mounted on the front of a vehicle.

Fig. 9: Video and laser screenshots in an urban environment (laser points of the 4 layers and pedestrian detections, axes X [m] and Z [m]). Below, the detected pedestrian in the laser image and, above, its projection in the video image.

This work takes place in the LOVe Project, which aims at improving road safety, mainly focusing on pedestrian security. The purpose is to design safe and reliable software for the observation of "vulnerables". A lot of research work has been carried out over the last years concerning pedestrian detection using a laser sensor. However, important issues still remain concerning reliability and mainly self-diagnosis algorithms. Intelligent systems still have to be developed before being integrated into mass-produced cars. Indeed, these systems must not deliver any false information concerning the observed scene. The scientific community is highly aware of this necessity and this is one of LOVe's objectives.

In most methods, a single-row laser range scanner is used, and it is unusual to take advantage of the complementarity of the planes provided by a multilayer laser sensor. Furthermore, few papers [8] deal with complex urban environments in real traffic conditions. And few authors propose a pedestrian detection and tracking system which localizes as accurately as possible all the pedestrians present in the scene, either standing still or in motion. In this paper, we introduce a new scheme which meets these requirements.

Fig. 10: Video and laser screenshots in an urban environment. Below, the detected pedestrians in the laser image and, above, their respective projections in the video image.

This paper has presented a new algorithm to increase safety and possibly avoid collisions with vulnerable road users. The goal of this work is to obtain a robust pedestrian detection algorithm allowing real time detection, location, identification and tracking. We first gave an account of our work concerning pedestrian detection using only a laser sensor, which enabled us to attest the originality of the chosen approach concerning the sensor as well as the algorithmic solution. We proposed a fusion of the 4 laser planes based on the Parzen kernel method. This work shows that a judicious use of the 4 laser planes improves pedestrian detection and significantly decreases the number of false alarms. Moreover, a SIR PF is used to track the pedestrians; this increases the robustness of the pedestrian detection algorithm and makes it possible to manage the occlusion of one pedestrian by another. At this stage of the study, we consider that Parzen's method, after a decentralized fusion of the 4 planes, allows an effective selection of the laser observation clusters having the geometrical characteristics of a pedestrian.

Currently, the results show that more than 90% (see Table 1) of collisions between pedestrians and vehicles could be detected if vehicles were equipped with our pedestrian collision avoidance system based on a laser sensor. However, frequent occlusions between objects, the obvious limitations of this sensor (no information about shape, contour, texture, or color of objects) and its sensitivity to atmospheric conditions such as rain and fog require devising a laser/camera fusion method to improve such a pedestrian collision avoidance system. Indeed, our pedestrian detection algorithm still returns about 30% (see Table 1) of false alarms. Therefore, the next research step will consist in developing a new laser/camera fusion method in order to improve these results.

REFERENCES

[1] http://love.univ-bpclermont.fr/.
[2] T. Gandhi and M.M. Trivedi, "Pedestrian Collision Avoidance Systems: a Survey of Computer Vision based Recent Studies", in Procs. IEEE Intelligent Transportation Systems 2006, Sept. 2006, pp. 976-981.
[3] C. Curio, J. Edelbrunner, T. Kalinke, C. Tzomakas, and W. von Seelen, "Walking Pedestrian Recognition", in IEEE Trans. on Intelligent Transportation Systems, 1(3):155-163, Sept. 2000.
[4] M. Bertozzi, A. Broggi, M. Del Rose, and M. Felisa, "A Symmetry-based Validator and Refinement System for Pedestrian Detection in Far Infrared Images", Seattle, WA, USA, 2007.
[5] A. Broggi, M. Bertozzi, M. Del Rose, M. Felisa, and A. Rakotomamonjy, "A Pedestrian Detector using Histograms of Oriented Gradients and a Support Vector Machine Classifier", in Procs. IEEE Intl. Conf. on Intelligent Transportation Systems 2007, pp. 144-148, Seattle, WA, USA, Sept. 2007.
[6] A. Broggi, M. Bertozzi, R.I. Fedriga, C.H. Gomez, G. Vezzoni, and M. Del Rose, "Pedestrian Detection in Far Infrared Images based on the use of Probabilistic Templates", in Procs. IEEE Intelligent Vehicles Symposium, pp. 327-332, Istanbul, Turkey, June 2007.
[7] A. Elfes, "Using Occupancy Grids for Mobile Robot Perception and Navigation", Computer, vol. 22, no. 6, pp. 46-57, June 1989.
[8] K.C. Fuerstenberg, D.T. Linzmeier, and K.C.J. Dietmayer, "Pedestrian Recognition and Tracking of Vehicles using a Vehicle based Multilayer Laserscanner", in 10th World Congress on Intelligent Transport Systems (ITS), Madrid, Spain, November 2003.
[9] E. Prassler, J. Scholz, and P. Fiorini, "Navigating a Robotic Wheelchair in a Railway Station during Rush Hour", Int. Journal on Robotics Research (IJRR), 18(7), pp. 760-772, 1999.
[10] B. Kluge, C. Koehler, and E. Prassler, "Fast and Robust Tracking of Multiple Moving Objects with a Laser Range Finder", in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), pp. 1683-1688, Seoul, Korea, 2001.
[11] S. Wender and K.C.J. Dietmayer, "An Adaptable Object Classification Framework", in Proc. of the IEEE Intelligent Vehicles Symposium (IV), Tokyo, Japan, 2006.
[12] M. Lindström and J.-O. Eklundh, "Detecting and Tracking Moving Objects from a Mobile Platform using a Laser Range Scanner", in Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), pp. 1364-1369, 2001.
[13] M. Montemerlo, S. Thrun, and W. Whittaker, "Conditional Particle Filters for Simultaneous Mobile Robot Localization and People-Tracking", in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), Washington D.C., 2002.
[14] D. Schulz, W. Burgard, D. Fox, and A.B. Cremers, "Tracking Multiple Moving Objects with a Mobile Robot", in Proc. of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2001.
[15] M. Szarvas, U. Sakai, and J. Ogata, "Real-time Pedestrian Detection using LIDAR and Convolutional Neural Network", in Proc. of the IEEE Intelligent Vehicles Symposium (IV), Tokyo, Japan, 2006.
[16] H. Zhao and R. Shibasaki, "A Novel System for Tracking Pedestrians using Multiple Single-Row Laser Range Scanners", IEEE Trans. on Systems, Man, and Cybernetics (SMC), Part A: Systems and Humans, vol. 35, no. 2, pp. 283-291, March 2005.
[17] A. Fod, A. Howard, and M. Mataric, "Laser-based People Tracking", in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), Washington D.C., 2002.
[18] O. Frank, J. Nieto, J. Guivant, and S. Scheding, "Multiple Target Tracking using Sequential Monte Carlo Methods and Statistical Data Association", in Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), 2003.
[19] X. Shao, H. Zhao, K. Nakamura, K. Katabira, R. Shibasaki, and Y. Nakagawa, "Detection and Tracking of Multiple Pedestrians by using Laser Range Scanner", in Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), 2007.
[20] S. Gidel, P. Checchin, C. Blanc, T. Chateau, and L. Trassoudaine, "Pedestrian Detection Method using a Multilayer Laserscanner: Application in Urban Environment", in Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), 2008.
[21] V. Nguyen, A. Martinelli, N. Tomatis, and R. Siegwart, "A Comparison of Line Extraction Algorithms using 2D Laser Rangefinder for Indoor Mobile Robotics", in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Edmonton, Canada, 2005.
[22] J. Cui, H. Zha, H. Zhao, and R. Shibasaki, "Laser-based Detection and Tracking of Multiple People in Crowds", in Computer Vision and Image Understanding (CVIU), Special Issue on "Advances in Vision Algorithms and Systems Beyond the Visible Spectrum", vol. 106, issues 2-3, pp. 300-312, May 2007.
[23] S. Gidel, P. Checchin, T. Chateau, C. Blanc, and L. Trassoudaine, "Parzen Method for Fusion of Laserscanner Data: Application to Pedestrian Detection", in Proc. of the IEEE Intelligent Vehicles Symposium (IV), Eindhoven, The Netherlands, 2008.
[24] R.E. Kalman, "A New Approach to Linear Filtering and Prediction Problems", Trans. ASME J. Basic Eng., vol. 82, pp. 34-45, March 1960.
[25] D.B. Reid, "An Algorithm for Tracking Multiple Targets", in Proc. CDC, pp. 1202-1211, December 1978.
[26] Z. Khan, T.R. Balch, and F. Dellaert, "An MCMC-based Particle Filter for Tracking Multiple Interacting Targets", in ECCV, vol. 3024, pp. 279-290, 2004.
[27] N.J. Gordon, D.J. Salmond, and A.F.M. Smith, "Novel Approach to Nonlinear/non-Gaussian Bayesian State Estimation", IEE Proc.-F, vol. 140, no. 2, pp. 107-113, 1993.
[28] Y. Bar-Shalom and T.E. Fortmann, "Tracking and Data Association", New York: Academic Press, 1988.
[29] X.R. Li and V.P. Jilkov, "A Survey of Maneuvering Target Tracking: Dynamic Models", in Proc. of SPIE Conference on Signal and Data Processing of Small Targets, Orlando, FL, USA, April 2000.
[30] J. Falcou, J. Sérot, L. Pech, and J.T. Lapresté, "Meta-programming Applied to Automatic SMP Parallelization of Linear Algebra Code", in Proceedings of EuroPar, Las Palmas, Gran Canaria, August 2008.
[31] F. Bardet and T. Chateau, "MCMC Particle Filter for Real-Time Visual Tracking of Vehicles", in Proc. of the IEEE Conference on Intelligent Transportation Systems, Beijing, China, 2008.