
HERIOT-WATT UNIVERSITY

Multiple Target Tracking with

The Probability Hypothesis Density Filter

Daniel Edward Clark

SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

ON COMPLETION OF RESEARCH IN THE

DEPARTMENT OF ELECTRICAL, ELECTRONIC AND COMPUTING ENGINEERING.

OCTOBER 2006

This copy of the thesis has been supplied on the condition that anyone who consults it is

understood to recognise that the copyright rests with the author and that no quotation from

the thesis and no information derived from it may be published without the written consent

of the author or the University (as may be appropriate).

Declaration

I hereby declare that the work presented in this thesis was carried out by myself at Heriot-Watt

University, except where due acknowledgement is made, and has not been submitted for

any other degree.

Signature of Daniel Edward Clark :

Signature of Supervisor:

Abstract

The random-set framework for multiple target tracking offers a distinct alternative to the tra-

ditional approach to multiple target tracking by treating the collections of individual targets

and observations as finite-sets. The multi-target state is predicted and updated recursively

based on the set-valued observation. The complexity of computing the multi-target recur-

sion grows exponentially with the number of targets and so a method for approximating

the optimal filter using a recursion for the first-order moment of the multi-target posterior,

known as the Probability Hypothesis Density (PHD) filter, was developed.

This thesis addresses some of the essential issues required for the PHD filter to be

of practical value in multiple target tracking applications. Two implementations of the

PHD filter are studied in detail: the Particle PHD filter, which is a Sequential Monte Carlo

technique based on particle filtering, and the Gaussian Mixture PHD filter, which provides

a closed form solution to the PHD filter.

A detailed study of the convergence properties is conducted which gives theoretical

justification for the use of the algorithms. Novel methods to determine the trajectories of

the targets for each of the algorithms are developed which enable the PHD filter to be used

for true multiple target tracking. These methods are implemented on forward-looking sonar

data and demonstrate that the multiple target tracking methods developed for the PHD filter

can be used for real applications.

Acknowledgements

A big thanks to Judith Bell for her excellent supervision throughout the course of my PhD;

her consistent support and guidance has been invaluable.

Thanks to QinetiQ for supporting this work, and, in particular, Douglas Carmichael and

Samantha Dugelay for their interest in this work.

At Heriot-Watt, thanks to Yvan Petillot and Ioseba Tena Ruiz for their expertise on

tracking and sonar and for developing the tracking algorithm with Kalman filters on sonar

data. Also at Heriot-Watt, thanks to Yves de Saint-Pern for providing his code, Chris

Haworth for the tracking work on millimetre wave images, Chris Capus for helping me

recover this thesis from my dead laptop, and to the excellent support staff in the department.

Thanks to Ba-Ngu Vo for an interesting couple of months in Melbourne and for devel-

oping the algorithms on which this thesis is based. Also in Melbourne, thanks to Kusha

Panta for his contribution to the Fusion paper. In Cambridge, thanks to Sumeetpal Singh

for helping with the complicated mathematics and his high level of rigour.

Thanks to Ronald Mahler for developing this interesting area in mathematics and en-

gineering and for inviting me to Florida to present some of this work. The anonymous

reviewers, some of whom have refereed a number of the articles in this thesis, have con-

tributed substantially to improving this work and deserve a special thanks.

Thanks to Spela for inspiring me to do something good.

Finally, the biggest thanks go to my parents for always supporting me, without whom

this work would not have been possible.

Contents

1 Introduction

1.1 Target Tracking

1.2 Multiple Target Tracking

1.3 The PHD Filter

1.4 Thesis Outline

1.5 Original Contributions

2 Bayesian Filtering

2.1 Background

2.2 Single-Target Bayesian Filtering

2.3 Kalman Filtering Techniques

2.3.1 The Kalman Filter

2.3.2 The Extended Kalman Filter

2.3.3 The Unscented Kalman Filter

2.3.4 The Gaussian Sum Filter

2.4 Sequential Monte Carlo Filtering

2.4.1 Sequential Importance Sampling and Resampling

2.4.2 The Particle Filter Algorithm

2.4.3 Convergence Properties

2.5 Summary

3 The Probability Hypothesis Density Filter

3.1 Introduction

3.2 Random Set Filtering

3.3 Point Process Theory

3.3.1 Janossy Measures

3.3.2 Probability Generating Functionals

3.4 PHD Filter Derivation

3.4.1 The PHD Prediction Equation

3.4.2 The PHD Measurement Update Equation

3.5 Summary

4 The Particle PHD Filter

4.1 Introduction

4.2 The Particle PHD Filter Algorithm

4.3 Convergence for the Particle PHD Filter Algorithm

4.3.1 Criteria for Convergence

4.3.2 Convergence of the Mean Square Errors

4.3.3 Convergence of Empirical Measures

4.4 Conclusions

5 The Gaussian Mixture PHD Filter

5.1 Introduction

5.2 The Gaussian Mixture PHD Filter Algorithm

5.3 Convergence of the Errors

5.3.1 Initialisation

5.3.2 Prediction Equation

5.3.3 Measurement Equation

5.4 Pruning and Merging of Gaussian components

5.4.1 Pruning

5.4.2 Merging

5.5 Non-linear Target Dynamic Models

5.5.1 Extended Kalman Prediction Equation

5.5.2 Extended Kalman Measurement Update

5.5.3 The Unscented Kalman PHD Filter

5.6 Conclusions

6 PHD Filter Target Estimation in Sonar Images

6.1 Introduction

6.2 The Particle PHD Filter with State Estimation

6.3 Forward-Looking Sonar Implementation

6.3.1 Simulated Data

6.3.2 Real Data

6.4 Discussion

7 State Estimation and Track Continuity for the Particle PHD Filter

7.1 Introduction

7.2 Multi-Target State Estimation

7.2.1 Cluster Analysis

7.2.2 Multi-target Miss Distance Metrics

7.2.3 Simulated Examples

7.2.4 PHD Filter Estimated Target Number

7.2.5 Time Complexity of PHD filter Tracker

7.2.6 Summary

7.3 Track Continuity

7.3.1 Particle Labelling Association

7.3.2 Estimate-to-Track Association

7.3.3 Simulated Examples

7.3.4 Summary

7.4 Discussion

8 The GM-PHD Filter Multiple Target Tracker

8.1 Introduction

8.2 The Gaussian Mixture PHD Filter Multiple Target Tracker

8.3 Simulations

8.3.1 Example 1

8.3.2 Example 2

8.4 Conclusions

9 Multiple Target Tracking in Sonar Images

9.1 Introduction

9.2 Tracking and Data Association

9.2.1 Tracking with Kalman filters

9.2.2 Tracking with the Particle PHD filter

9.2.3 Tracking with the GM-PHD Filter

9.3 Implementation on Forward-Looking Sonar

9.3.1 Simulated Sonar Data

9.3.2 Real Sonar Data

9.4 Results

9.4.1 Simulated Data

9.4.2 Real Data

9.5 Conclusions

10 Conclusions

10.1 Thesis Summary

10.2 Current Research

10.3 Future Work

Chapter 1

Introduction

1.1 Target Tracking

Target tracking is a necessary part of systems that perform functions such as surveillance,

guidance or obstacle avoidance. Tracking algorithms take their input measurements from

sensors which provide the signals such as radar, sonar or video. The measurements are

taken at regular intervals and the task is to estimate the state of a target at each point in

time, such as its position, velocity or other attribute. Successive estimates provide the

tracks which describe the trajectory of a target.

The almost universally accepted mathematical framework used to describe this prob-

lem is that of filtering theory and, in particular, Bayesian filtering. The posterior prob-

ability distribution is recursively predicted by propagating this distribution with the state

model, which describes the motion of a target, and updated when a new observation be-

comes available. The mean and covariance of the state are determined at each time-step

from the posterior distribution. The most widely used filtering technique is the ubiquitous

Kalman filter, derived in 1960 [1], for linear and Gaussian target models. More recently,

sample based techniques have proved to be popular, including the particle filter developed

by Gordon in 1993 [2]. Chapter 2 describes the most commonly used filtering algorithms.
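
The prediction and update steps described above can be written compactly as follows. The notation is assumed here for illustration and need not match that of chapter 2: $x_k$ is the target state at time $k$, $z_{1:k}$ the observations up to time $k$, $f$ the transition density of the state model and $g$ the observation likelihood.

```latex
\begin{align}
p(x_k \mid z_{1:k-1}) &= \int f(x_k \mid x_{k-1})\, p(x_{k-1} \mid z_{1:k-1})\, dx_{k-1}, \\
p(x_k \mid z_{1:k})   &= \frac{g(z_k \mid x_k)\, p(x_k \mid z_{1:k-1})}
                              {\int g(z_k \mid x)\, p(x \mid z_{1:k-1})\, dx}.
\end{align}
```

The first line propagates the previous posterior through the state model; the second applies Bayes' rule when the new observation $z_k$ arrives.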

One of the current research aims in engineering is to develop autonomous vehicles such

as unmanned aerial vehicles (UAVs) or autonomous underwater vehicles (AUVs). The aim

is to develop self-navigating robots which operate without human interaction. AUVs can be

equipped with a range of sensors including forward-look sonar, sidescan sonar and video

to enable them to navigate autonomously for applications such as mine countermeasures,

pipeline inspection or seabed habitat mapping. Methods for detecting and tracking ob-

jects on the seabed are required to aid path planning [3] and navigation [4]. The vehicle

has to sense its environment to prevent collision with any obstacles. In this thesis, novel

techniques are developed for tracking a variable number of targets and implemented on

forward-scan data obtained from an underwater vehicle.

1.2 Multiple Target Tracking

The multiple target tracking problem extends the scenario to a situation where the number

of targets may not be known and varies with time. The measurements which have originated

from targets are not known since some of them may be due to false alarms. We are now

required to estimate the positions of an unknown number of targets, based on observations

of the targets corrupted by noise, with the possibilities that there may be missed detections

and that observations may be false alarms due to clutter. In addition, the identities of

the targets may need to be known to determine their trajectories. The usual method for

solving this problem is to assign a single-target stochastic filter, such as a Kalman filter or

an extended Kalman filter, to each target and use a data association technique to assign the

correct measurement to each filter [5].

The data association problem in multiple target tracking usually involves ensuring that

the correct measurement is given to each stochastic filter so that the trajectory of each target

can be accurately estimated; this is referred to as measurement-to-track association. The

three classical approaches to this are the Nearest Neighbour Standard Filter (NNSF) [5],

the Joint Probabilistic Data Association Filter (JPDAF) [5], and the Multiple Hypothesis

Tracking filter (MHT) [6].

The Nearest Neighbour Standard Filter simply takes the nearest validated measurement

to the predicted measurement to update each of the target states. This can result in prob-

lems since the nearest validated measurement may be the same for two different targets.

The Joint Probabilistic Data Association Filter computes the joint probabilities for all the

pairings between the predicted measurements and estimated target states. This technique

also has to consider the false alarms from spurious measurements but is restricted to a

known, fixed number of targets. The ideal Multiple Hypothesis Tracking filter maintains

probabilities of all possible associations at each time step. Unlike the NNSF and JPDAF,

this does not just consider the probabilities from the previous time step, which allows for

backtracking and also track initiation. In practice, it is not feasible to keep track of all

possible hypotheses, as the computational complexity grows exponentially. Techniques for

reducing the complexity include gating, to ignore irrelevant observations, pruning, to elim-

inate low probability hypotheses, and merging, to combine hypotheses into a single track.

Some extensions of these techniques include the probabilistic MHT (PMHT) [7] which uses

a soft-gating procedure and Monte Carlo (MC)-JPDA [8], which uses a sample based JPDA

algorithm. A review of multiple target tracking and data association techniques was pre-

sented recently in [9], including novel developments for multi-target Monte Carlo filtering.
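
The weakness of the nearest neighbour rule noted above, where two targets claim the same measurement, can be illustrated with a small sketch. This is not the implementation used in the thesis: the function name, the Euclidean gate (a Mahalanobis gate based on the innovation covariance would be more usual) and the toy numbers are all assumptions made for illustration.

```python
import numpy as np

def nearest_neighbour_assign(predicted, measurements, gate):
    """For each predicted measurement, return the index of the nearest
    measurement within the gate, or None if no measurement is validated.

    Hypothetical helper: uses a plain Euclidean gate for simplicity.
    """
    predicted = np.asarray(predicted, dtype=float)
    measurements = np.asarray(measurements, dtype=float)
    assignments = []
    for p in predicted:
        if measurements.size == 0:
            assignments.append(None)
            continue
        # Distance from this predicted measurement to every observation.
        d = np.linalg.norm(measurements - p, axis=1)
        j = int(np.argmin(d))
        assignments.append(j if d[j] <= gate else None)
    return assignments

# Three predicted measurements, but only two observations in this scan.
predicted = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
measurements = [[0.6, 0.0], [5.0, 5.0]]
assignments = nearest_neighbour_assign(predicted, measurements, gate=2.0)
print(assignments)  # [0, 0, None]
```

The first two targets both select measurement 0, and the third has no validated measurement, so each filter is updated independently with no guarantee that the association is consistent across targets.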

An alternative solution to the multiple target tracking problem is to view the set of ob-

servations collectively, and try to estimate the set of target states directly, where the correct

report-to-track association is considered unobservable [10]. The disadvantage of this approach

is that the continuity of the individual target tracks is not kept. One such method

uses Finite Set Statistics for multiple target tracking [11], with an approach analogous to

the recursion used in Bayesian filtering by constructing multiple target posterior distributions.

The time required for calculating joint multi-target likelihoods grows exponentially

with the number of targets, so this approach is not very practical for sequential target estimation,

which may need to be undertaken in real time. A practical alternative to Bayesian multiple

target tracking was proposed in [12], which propagates the first-order statistical moment, or

Probability Hypothesis Density (PHD), instead of the multiple target posterior itself. An

overview of this technique is given in the next section and the mathematical framework is

described in chapter 3.

1.3 The PHD Filter

The mathematical foundation of the multiple target filtering methods used in this thesis

is based on the theory of Random Sets, which was first studied by Matheron [13] in the

1970s. Mahler constructed Finite-Set Statistics (FISST) [14] from the mathematical theory

of point processes [15] and Random Set theory in the mid 1990s as a way of extending

classical single-sensor, single-target statistics to a multi-sensor, multi-target statistics of

finite-set variates. The multi-target states and observations are represented as Random Fi-

nite Sets from which a theoretically optimal Bayesian multi-sensor multi-target filter can

be derived [11]. The multi-target Bayes filter is not tractable for real-time implementa-

tions due to the combinatorial complexity of the multiple target likelihoods [11] and so

the optimal filter must be approximated. A recursive approach was proposed to propagate

the first-order statistical moment, or expectation, of the multi-target posterior distribution

based on the Stein-Winter Probability Hypothesis Density (PHD) [16]. This was called the

PHD filter [12]. The predictive density is approximated by a Poisson point process to track

potentially many targets, including birth, death and spawning of targets automatically.
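
The defining property of the first-order moment can be stated in one line. The symbols are assumed here for illustration: $D_{k|k}$ denotes the PHD at time $k$, $X_k$ the random finite set of target states, $Z_{1:k}$ the observation sets received so far, and $S$ any region of the state space.

```latex
\[
\int_S D_{k|k}(x)\, dx \;=\; \mathbb{E}\!\left[\, |X_k \cap S| \;\middle|\; Z_{1:k} \right]
\]
```

That is, integrating the PHD over a region gives the expected number of targets in that region, and integrating over the whole state space gives the expected total number of targets. Unlike a probability density, the PHD does not generally integrate to one.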

Although the foundation was established in the form of Finite Set Statistics, its rela-

tionship to conventional probability was not entirely clear. Vo, Singh and Doucet estab-

lished the relationship between FISST and conventional probability [17], which led to the

development of a sequential Monte Carlo (SMC) multi-target filter. In addition, an SMC

implementation of the PHD filter was proposed in the form of a multi-target particle filter

which operates on sets of observations to provide a multi-modal intensity function from

which the multiple target states are determined [18] [17]. Particle filter methods for the

PHD filter were also devised by Zajic et al. [19] and Sidenbladh [20]. Convergence properties

of these algorithms have been established by Vo et al. [17], Clark [21] (as presented

in chapter 4) and Johansen et al. [22], which show that the empirical representation of the

PHD converges to the true PHD.

Practical applications of these methods have included tracking vehicles in different terrains [20],

tracking targets in passive radar located on ellipses [23], tracking a variable

number of targets in forward scan sonar [24] [25] [26] (as demonstrated in chapters 6 and

9), tracking feature points in image sequences [27], and locating an unknown time-varying

number of speakers [28].

The advantage of the particle PHD filter is that it can track a variable number of tar-

gets, estimating both the number of targets and their locations. It avoids the need for data

association techniques as part of the multiple-target framework, since the identities of the

individual targets are not required. In addition to estimating the number of targets and their

states at each point in time, it is also important in tracking scenarios to know the trajectories

of the targets and to be able to distinguish between different targets. Some early techniques

for associating the targets between frames have been reported in the literature. The first

of these [29] used the PHD filter for pre-filtering the data input to a Multiple Hypothesis

Tracker. The second technique [30] represents the PHD in a resolution cell to differentiate

the peaks of the PHD posterior, and validation gating was used to determine the weights of

the particles. More recently, two methods were presented independently in [31] (see chap-

ter 7) and [32]. The first of these considered associating target estimates between iterations,

also known as estimate-to-track association. The second method used the partitioning of

the particle data to assign labels to the particles within the same cluster and associate the

clusters between time frames if there is a large intersection of particles with the same label

from the previous time step.

The Gaussian mixture Probability Hypothesis Density (GM-PHD) filter was derived re-

cently to provide a closed-form solution to the PHD filter [33] [34]. It was shown that,

under linear-Gaussian assumptions, the posterior intensity at any point in time is a Gaus-

sian mixture. The means and covariances of the Gaussians are determined from the Kalman

filtering equations and the weights are calculated according to the PHD filter update equa-

tion. The asymptotic convergence properties of the GM-PHD filter have been established,

showing that the mixture approximation converges to the true PHD [35] (see chapter 5).

The multiple target states of the GM-PHD mixture are determined from the Gaussian com-

ponents with the highest weights. It can be shown that Gaussians within the mixture track

the evolution of individual target states which ensures the continuity of target identity (see

chapter 8).
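
As a rough illustration of how target states might be read off a Gaussian mixture intensity, the sketch below keeps the means of the highest-weighted components. The function name, the fixed weight threshold of 0.5 and the example numbers are assumptions made for illustration; the procedure actually used in the thesis is described in chapters 5 and 8.

```python
import numpy as np

def extract_states(weights, means, threshold=0.5):
    """Return the expected target number and the means of the components
    whose weight exceeds the threshold, ordered by decreasing weight.

    Hypothetical helper: the threshold value is an illustrative choice.
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    # Each Gaussian integrates to one, so the mixture integrates to the
    # sum of its weights, which is the expected number of targets.
    expected_number = float(weights.sum())
    keep = weights > threshold
    order = np.argsort(-weights[keep])
    return expected_number, means[keep][order]

weights = [0.9, 0.05, 0.8, 0.1]
means = [[0.0, 0.0], [3.0, 1.0], [5.0, 5.0], [9.0, 2.0]]
n_hat, states = extract_states(weights, means)
print(round(n_hat, 2))   # 1.85
print(states.tolist())   # [[0.0, 0.0], [5.0, 5.0]]
```

Because each retained component carries its own mean and covariance from one time step to the next, following a component over time gives a natural handle on target identity, which is the idea developed in chapter 8.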

The first practical implementations of both the Particle PHD filter and the GM-PHD

filter with track continuity are presented in chapter 9 for multiple-target tracking in forward-

looking sonar images [26] [36]. These techniques are compared with the traditional NN

approach with Kalman filters.

1.4 Thesis Outline

The theory of Bayesian filtering is presented in chapter 2 with its relation to target track-

ing. The Kalman filter [1] is derived using properties of Gaussian distributions [37] and

the extended Kalman filter [38] is presented to accommodate nonlinearities in the state and

observation models. A more recent alternative, the Unscented Kalman filter [39], approx-

imates the mean and covariance of a Gaussian by a set of sigma-points. More general

probability distributions can be represented by the Gaussian Sum filter [40] which uses a

weighted sum of Gaussians which are updated with the Extended Kalman filter equations.

Finally, it is shown how the Particle filter uses Sequential Monte Carlo methods to provide

an approximate solution to the problem without relying on the restrictive linear or linearised

conditions for the signal and observation processes and has guaranteed convergence prop-

erties [2] [41].

Chapter 3 describes the Probability Hypothesis Density (PHD) Filter [12]. The point

process theory required for the derivation of the PHD filter is given together with its rela-

tionship to Random Finite Sets. The Multiple Target Tracking model is presented with the

Bayesian recursion analogous to the single target scenario, from which the PHD filter is

derived [12] [42].

Two practical implementations of the PHD filter are studied in this thesis. The

first of these, in chapter 4, is the Sequential Monte Carlo implementation known as the

Particle PHD Filter [18] [17] which extends the single target particle filter to a multiple

target version. The second implementation, in chapter 5, is the Gaussian Mixture PHD fil-

ter [34] [33], which provides a closed form solution to the PHD filter under linear-Gaussian

conditions and is similar in style to the Gaussian Sum filter [40] [43].

A detailed study of the convergence properties is conducted for both of the implemen-

tations of the PHD filter. In chapter 4, it is shown that the empirical representation of the

PHD converges weakly to the true density as the number of particles increases and bounds

are provided for the mean square errors based on results for particle filters [21]. In chapter

5, it is shown that the Gaussian sum representation of the PHD converges uniformly to the

true PHD and error bounds are provided for the pruning and merging stages of the algo-

rithm to show that these fall within acceptable limits [35]. Conditions are provided for the

extended Kalman implementation of the algorithm to converge uniformly based on results

for the Gaussian Sum filter [44].

An example of the particle PHD filter algorithm for tracking in forward-looking sonar

data is shown in chapter 6 [24], demonstrating the potential of the algorithm for target

estimation in practical applications. A set of target states is estimated at each iteration

from range and bearing measurements obtained from a sonar device fitted to an underwater

vehicle surveying an area of seabed. The algorithm is demonstrated on real and simulated

sonar data with a variable number of targets in cluttered environments.

Since the posterior PHD is a multi-modal distribution, methods are required to deter-

mine the target state estimates at each iteration [45]. Methods for clustering the particles

are considered in chapter 7 for finding peaks in the empirical particle distribution. Another

important consideration for multiple target tracking is to maintain continuity of track identity

to identify the same target in successive iterations of the algorithm. Chapter 7 also presents

novel methods to enable track continuity for the Particle PHD filter [31]. Chapter 8 presents

the GM PHD Multi-target Tracker [46] and demonstrates that the Gaussian Mixture PHD

filter has the inherent ability to maintain target tracks by following the individual Gaussians

within the mixture.

The ninth chapter demonstrates that the methods developed for multiple target tracking

with the PHD filter can be implemented on real data with an application on forward-looking

sonar data. It is shown that the Particle PHD filter gives comparable performance to a nearest

neighbour approach with Kalman filters [26] [47]. In addition, it is shown that the GM

PHD filter can track a variable number of targets in a reasonably high level of clutter [36].

These results provide the first implementations of the PHD filter for multiple target track-

ing with continuity of track identity on real data and demonstrate that these techniques have

real practical value.

The final chapter summarises the work presented in this thesis and outlines future re-

search with PHD filters.

1.5 Original Contributions

This thesis addresses some of the essential issues required for the PHD filter to be of prac-

tical value in multiple target tracking applications. Two implementations of the PHD filter

are studied in detail. The first of these is the Particle PHD filter [17], which is a Sequential

Monte Carlo technique based on particle filtering. The second implementation

studied is the Gaussian Mixture PHD filter [33], which provides a closed form solution to

the PHD filter. The specific contributions of each chapter are outlined below.

Chapter 4: The Particle PHD Filter

This chapter presents mathematical proofs of convergence for the Particle PHD Filter algo-

rithm and gives bounds for the mean square error.

“Convergence Results for the Particle PHD Filter”, IEEE Transactions on Signal Processing,

Volume 54, No. 7, pp. 2652-2661, July 2006.

Chapter 5: The Gaussian Mixture PHD Filter

This chapter proves uniform convergence of the errors in the Gaussian Mixture PHD filter

algorithm and provides error bounds for the pruning and merging stages.

“Convergence Analysis of the GM PHD Filter”, IEEE Transactions on Signal Processing,

in press.

Chapter 6: PHD Filter Target Estimation in Sonar Images

An implementation of the particle PHD filter is demonstrated on real forward-looking sonar

data taken from an underwater vehicle to estimate both the number of targets and their locations.

“Bayesian Multiple Target Tracking in Sonar Images with the PHD Filter”, IEE Proceedings

on Radar, Sonar and Navigation, Volume 152, Issue 5, pp. 327-334, 2005.

“PHD Filter Multi-target Tracking in 3D Sonar”, IEEE Oceans Europe Conference, Brest,

Volume 1, June 20-23, 2005, pp. 265-270.

Chapter 7: Target Tracking with the Particle PHD Filter

Two clustering techniques are compared for determining the multiple-target states from

the particle density, namely k-means clustering and mixture modelling via the expectation-

maximization algorithm. Novel techniques are developed for associating the targets be-

tween frames to enable identification of the individual target tracks.

“Multi-Target State Estimation and Track Continuity for the PHD Filter”, IEEE Transactions

on Aerospace and Electronic Systems, Volume 43, No. 3, July 2007.

“Data Association for the PHD Filter”, Proceedings of the 2005 International Conference on

Intelligent Sensors, Sensor Networks and Information Processing, 5-8 Dec. 2005, pp. 217-222.

Chapter 8: The GM-PHD Filter Multi-Target Tracker

It is shown here that the trajectories of the targets can be determined directly from the

evolution of the Gaussian mixture of the PHD and that single Gaussians within this mix-

ture accurately track the correct targets. Furthermore, the technique is demonstrated to be

successful in estimating the correct number of targets and their trajectories in high clutter

density.

“The GM-PHD Filter Multi-Target Tracker”, Proceedings of the International Conference

on FUSION, July 2006.

Chapter 9: Multiple Target Tracking in Sonar Images

The multiple target tracking techniques developed for the two implementations of the PHD

filter are demonstrated on both simulated sonar and real forward-looking sonar data ob-

tained from an Autonomous Underwater Vehicle (AUV) and these approaches are com-

pared with a conventional Nearest Neighbour approach with Kalman filters. It is shown

that the Particle PHD filter with estimate-to-track association gives comparable tracking

performance to the Nearest Neighbour approach, and that the GM-PHD filter gives

comparable performance in higher levels of clutter.

“Particle PHD Filter Multiple Target Tracking in Sonar Images”, IEEE Transactions on

Aerospace and Electronic Systems, Volume 43, No. 3, July 2007.

“Multiple Target Tracking and Data Association in Sonar Images”, 2006 IEE Seminar on

Target Tracking, pp. 149-154.

“GM-PHD Multi-target Tracking in Sonar Images”, 2006 SPIE Defense and Security Symposium

[6235-29].

Chapter 2

Bayesian Filtering

2.1 Background

Single-target tracking requires the estimation of the state of a signal at each point in time

based on a discrete set of noisy measurements, where a new measurement is received at

each time-step. This definition coincides with the mathematical theory of filtering and the

terms have become synonymous in the engineering community (provided that the correct

measurement is assigned to the filter).

One interpretation of filtering theory, called optimal non-linear filtering, is defined as

follows. Suppose that we wish to estimate a process which cannot be observed directly, using

observations from a different noisy process, where a relationship between the two processes

is known (these processes may be continuous in time). The problem is to provide the

estimate of the signal based on the observations up to the current time. The signal and

observation processes are given by stochastic differential equations [48]. In the case where

an estimate is required when each measurement becomes available, the problem is known

as filtering [49]. The Kalman filter [1] is a special case of the nonlinear filtering problem

when the signal and observation processes are linear and the noise processes are Gaussian.

In this case, the stochastic differential equations can be solved explicitly since the linear

filtering equations form a closed set.

An alternative interpretation of filtering theory is Bayesian filtering, which recursively

applies Bayes’ rule to determine the conditional probability distribution of the signal process. The signal and observation processes are now discrete in time, which matches the way measurements are received in target tracking. The Bayesian derivation of the Kalman filter relies only on the properties of Gaussian distributions and does not

require an understanding of stochastic differential equations and martingale theory [37].

Furthermore, the Bayesian interpretation can allow for a Sequential Monte Carlo approach

to be adopted [41], where simulation based methods can be used for approximating the

posterior distributions. These techniques do not rely on any of the linearised/Gaussian as-

sumptions on the signal and observation models and have provable convergence properties.

This chapter provides a motivation for the filtering algorithms in the context of target

tracking before describing Bayesian filtering and presenting the commonly used techniques.

2.2 Single-Target Bayesian Filtering

To make an inference about the state of a dynamic system, two equations are needed, the

state equation which describes the evolution of state with time, or the motion of a target,

and the measurement equation which relates the observations received from a sensor to the

state. When an estimate is required every time a measurement is received, a recursive filter is used, in which the estimate is predicted and then updated at each time-step. The prediction stage uses the state equation to predict the state at the next time-step, and the update stage uses the measurement equation to calculate the posterior distribution according to Bayes’ rule.

Let $x_{0:t} := \{x_0, \ldots, x_t\}$ be an unobserved signal process of dimension $n$ that we wish to estimate, and let $Z_t := \sigma(z_1, \ldots, z_t)$ be the σ-algebra generated by noisy observations of dimension $m \le n$ related to this process. The single-target filtering problem is to estimate, recursively in time, the probability distribution $p(x_t|Z_t)$ of the signal. From this, an estimate $\hat{x}_t$ of the target location needs to be determined. One possible estimate is the conditional expectation, $\hat{x}_t = E(x_t|Z_t)$, of the signal. Another is the maximum a posteriori (MAP) estimate, which may be more appropriate for multi-modal distributions.

The evolution of the signal process x0:t is governed by the state equation

xt = ft(xt−1,vt−1), (2.1)

where ft is a (possibly) non-linear function representing the motion of the target and v0:t−1

is the process noise sequence representing the uncertainty in the target motion. The obser-

vations are governed by the measurement equation,

zt = ht(xt ,εt), (2.2)

where ht is a function related to observing xt and ε1:t := ε1, . . . ,εt is the observation noise

sequence reflecting errors in the observations. The process and measurement error noise

sequences are uncorrelated. When functions ft and ht are linear and the noise sequences

v0:t−1 and ε1:t are Gaussian, then the optimal estimate is given by the Kalman filter. When

these restrictive conditions are not met, alternative methods for obtaining xt are needed.

Expressed in Bayesian terms, the problem is to estimate recursively in time the posterior

distribution, p(xt |Zt), by the following prediction and update stages. The prediction stage

involves calculating the prior distribution, p(xt |Zt−1), of the state being in xt based on the

previous observations,

$$p(x_t|Z_{t-1}) = \int p(x_t|x_{t-1})\,p(x_{t-1}|Z_{t-1})\,dx_{t-1}. \tag{2.3}$$

When the new measurement, zt , has been observed, the update stage involves calculating

the posterior distribution by Bayes’ Rule,

$$p(x_t|Z_t) = \frac{p(z_t|x_t)\,p(x_t|Z_{t-1})}{p(z_t|Z_{t-1})} = \frac{p(z_t|x_t)\,p(x_t|Z_{t-1})}{\int p(z_t|x_t)\,p(x_t|Z_{t-1})\,dx_t}, \tag{2.4}$$

where Zt is the σ-algebra generated by the measurements up to time t and p(zt |xt) is the

likelihood of observing $z_t$ given signal $x_t$. The estimated signal $\hat{x}_t$ can, for example, be taken to be the conditional mean of $x_t$,

$$\hat{x}_t = E(x_t|Z_t) = \int x_t\,p(x_t|Z_t)\,dx_t. \tag{2.5}$$
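The recursion (2.3)-(2.5) can be made concrete with a brute-force evaluation on a grid. The following is only a minimal sketch, not a method used in the thesis: it assumes a scalar random-walk model $x_t = x_{t-1} + v_{t-1}$, $z_t = x_t + \varepsilon_t$ with Gaussian noises, and the function names, grid and variances are illustrative choices.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def grid_bayes_step(grid, posterior_prev, z, q_var, r_var):
    """One prediction/update cycle on a uniform 1-D grid.

    Assumed model (illustrative only): x_t = x_{t-1} + v, z_t = x_t + e,
    with v ~ N(0, q_var) and e ~ N(0, r_var).
    """
    dx = grid[1] - grid[0]
    # Prediction (2.3): integrate the transition density against the old posterior.
    prior = [
        sum(gaussian_pdf(x, xp, q_var) * p for xp, p in zip(grid, posterior_prev)) * dx
        for x in grid
    ]
    # Update (2.4): multiply by the likelihood and normalise by the evidence.
    unnorm = [gaussian_pdf(z, x, r_var) * p for x, p in zip(grid, prior)]
    evidence = sum(unnorm) * dx
    posterior = [u / evidence for u in unnorm]
    # Conditional mean (2.5), approximated by a Riemann sum.
    x_hat = sum(x * p for x, p in zip(grid, posterior)) * dx
    return posterior, x_hat

grid = [-10.0 + 0.05 * i for i in range(401)]
posterior0 = [gaussian_pdf(x, 0.0, 1.0) for x in grid]
posterior1, x_hat1 = grid_bayes_step(grid, posterior0, z=1.0, q_var=1.0, r_var=1.0)
```

For this linear Gaussian choice the grid estimate should agree closely with the Kalman filter of Section 2.3.1; the grid approach itself scales poorly with the state dimension, which is one motivation for the approximations described in the rest of this chapter.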

2.3 Kalman Filtering Techniques

In this section, the Kalman filter [1] is derived and variants of this technique for non-linear

scenarios are described including the extended Kalman filter (EKF) [38], unscented Kalman

filter (UKF) [39] and Gaussian sum filter [40].

2.3.1 The Kalman Filter

The Kalman Filter [1] recursively calculates the exact posterior distribution based on the

assumptions that the posterior distribution is Gaussian, the process and observation noises vt

and εt are uncorrelated, white noise sequences with mean zero, and state and measurement

equations ft and ht are linear functions. The state and measurement equations are

xt = Ftxt−1 +Γvt−1 (2.6)

zt = Htxt + εt (2.7)

where vt−1 and εt are uncorrelated, white Gaussian noise sequences with mean zero and

covariance matrices Q and R respectively, and Ft and Ht are matrices defining the linear

functions ft and ht respectively. The expectation and covariance of the signal given the set

of measurements up to time $t$ are denoted $\hat{x}_t := E(x_t|Z_t)$ and $P_t := \mathrm{Cov}(x_t|Z_t)$. The notation used for Gaussians shall be

$$\mathcal{N}(x; m, P) := \frac{1}{(2\pi)^{d/2}\det(P)^{1/2}} \exp\left(-\tfrac{1}{2}(x-m)^T P^{-1}(x-m)\right), \tag{2.8}$$

with variable x, mean m and covariance P.

The following two Theorems establish the prediction and update steps required in the

Kalman filter based on the Bayesian derivation of the Kalman filter by Ho and Lee [37].

THEOREM 1 Given the Gaussian posterior distribution at time $t-1$ and the linear state model, the prior probability distribution at time $t$ is the Gaussian

$$p(x_t|Z_{t-1}) = \mathcal{N}(x_t; \hat{x}_{t|t-1}, P_{t|t-1}), \tag{2.9}$$

where the predicted state estimate and covariance at time $t$ are

$$\hat{x}_{t|t-1} := F\hat{x}_{t-1}, \tag{2.10}$$

$$P_{t|t-1} := FP_{t-1}F^T + \Gamma Q\Gamma^T. \tag{2.11}$$

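The expectation of the state at time $t$, given measurements up to time $t-1$, is, by equation (2.6),

$$E(x_t|Z_{t-1}) = E(Fx_{t-1} + \Gamma v_{t-1}|Z_{t-1}), \tag{2.12}$$

which, by the linearity of expectation,

$$= E(Fx_{t-1}|Z_{t-1}) + E(\Gamma v_{t-1}|Z_{t-1}) = FE(x_{t-1}|Z_{t-1}) + \Gamma E(v_{t-1}|Z_{t-1}) = F\hat{x}_{t-1}. \tag{2.13}$$

The last equality holds since $\hat{x}_{t-1} := E(x_{t-1}|Z_{t-1})$ and $E(v_{t-1}) = 0$. Writing $\tilde{x}_{t-1} := x_{t-1} - \hat{x}_{t-1}$ for the estimation error, so that $x_t - \hat{x}_{t|t-1} = F\tilde{x}_{t-1} + \Gamma v_{t-1}$, the prediction covariance is calculated with

$$\mathrm{Cov}(x_t|Z_{t-1}) \tag{2.14}$$

$$= E\big((F\tilde{x}_{t-1} + \Gamma v_{t-1})(F\tilde{x}_{t-1} + \Gamma v_{t-1})^T \,\big|\, Z_{t-1}\big) \tag{2.15}$$

$$= E\big(F\tilde{x}_{t-1}\tilde{x}_{t-1}^T F^T + \Gamma v_{t-1}v_{t-1}^T\Gamma^T \,\big|\, Z_{t-1}\big) \tag{2.16}$$

(since the cross terms are zero)

$$= FP_{t-1}F^T + \Gamma Q\Gamma^T = P_{t|t-1}. \tag{2.17}$$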

Combining the mean and covariance gives the required Gaussian. ∎¹

¹ The black square is used to denote the end of a proof of a Theorem or Lemma throughout the thesis.

THEOREM 2 Given that the prior probability density at time $t$ is Gaussian and that the dynamic model is linear, the posterior distribution at time $t$ is also Gaussian, and is given by

$$p(x_t|Z_t) = \mathcal{N}(x_t; \hat{x}_t, P_t). \tag{2.18}$$

The state estimate $\hat{x}_t$ and covariance $P_t$ are obtained by

$$\hat{x}_t = \hat{x}_{t|t-1} + K_t(z_t - H\hat{x}_{t|t-1}), \tag{2.19}$$

$$P_t = [I - K_tH]P_{t|t-1}, \tag{2.20}$$

where $K_t$ is known as the Kalman gain,

$$K_t = P_{t|t-1}H^T[HP_{t|t-1}H^T + R]^{-1}. \tag{2.21}$$

To prove this theorem, we need Lemmas 1 and 2, which are given after the proof of the

main result.

By Bayes’ rule, equation (2.4),

$$p(x_t|Z_t) = \frac{p(z_t|x_t)\,p(x_t|Z_{t-1})}{p(z_t|Z_{t-1})}, \tag{2.22}$$

which, by Lemmas 1 and 2, and Theorem 1,

$$= \frac{\mathcal{N}(z_t; Hx_t, R)\,\mathcal{N}(x_t; \hat{x}_{t|t-1}, P_{t|t-1})}{\mathcal{N}(z_t; H\hat{x}_{t|t-1}, HP_{t|t-1}H^T + R)} = \mathcal{N}(x_t; \hat{x}_t, P_t), \tag{2.23}$$

where the final equality comes from completing the square, which results in the state estimate $\hat{x}_t$ and covariance $P_t$ given in equations (2.19) and (2.20). ∎

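LEMMA 1 The probability distribution of $z_t$ based on measurements up to time $t-1$ is given by the Gaussian

$$p(z_t|Z_{t-1}) = \mathcal{N}(z_t; H\hat{x}_{t|t-1}, HP_{t|t-1}H^T + R). \tag{2.24}$$

We compute the expectation of $z_t$ given the measurements up to time $t-1$, by first using the measurement equation,

$$E(z_t|Z_{t-1}) = E(Hx_t + \varepsilon_t|Z_{t-1}), \tag{2.25}$$

and then the state equation,

$$= E(H(Fx_{t-1} + \Gamma v_{t-1}) + \varepsilon_t|Z_{t-1}) = HF\hat{x}_{t-1} = H\hat{x}_{t|t-1}, \tag{2.26}$$

where the last equality holds since $E(v_{t-1}) = 0$ and $E(\varepsilon_t) = 0$. Now consider the covariance. Since $z_t - H\hat{x}_{t|t-1} = H(x_t - \hat{x}_{t|t-1}) + \varepsilon_t$ and $\varepsilon_t$ is uncorrelated with the prediction error,

$$\mathrm{Cov}(z_t|Z_{t-1}) = E\big((H(x_t - \hat{x}_{t|t-1}) + \varepsilon_t)(H(x_t - \hat{x}_{t|t-1}) + \varepsilon_t)^T \,\big|\, Z_{t-1}\big) = HP_{t|t-1}H^T + R. \tag{2.27}$$

The Lemma is proved by combining the mean and covariance above. ∎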

LEMMA 2 The likelihood of observing $z_t$ given state $x_t$ is the Gaussian

$$p(z_t|x_t) = \mathcal{N}(z_t; Hx_t, R). \tag{2.28}$$

The expectation is computed using the measurement equation,

$$E(z_t|x_t) = E(Hx_t + \varepsilon_t|x_t) = Hx_t, \tag{2.29}$$

since $E(\varepsilon_t) = 0$. Similarly, the covariance is calculated,

$$\mathrm{Cov}(z_t|x_t) = E\big((z_t - Hx_t)(z_t - Hx_t)^T \,\big|\, x_t\big) \tag{2.30}$$

$$= E(\varepsilon_t\varepsilon_t^T|x_t) = R. \tag{2.31}$$

Combining the mean and covariance gives the required Gaussian likelihood function. ∎
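The prediction and update steps of Theorems 1 and 2 can be sketched for the scalar case, where $F$, $\Gamma$, $H$, $Q$ and $R$ are plain numbers. This is an illustrative sketch only; the function names and the numerical values are assumptions made for the example and do not come from the thesis.

```python
def kalman_predict(x_hat, p, f, gamma, q):
    """Prediction step, equations (2.10)-(2.11), scalar case."""
    x_pred = f * x_hat
    p_pred = f * p * f + gamma * q * gamma
    return x_pred, p_pred

def kalman_update(x_pred, p_pred, z, h, r):
    """Update step, equations (2.19)-(2.21), scalar case."""
    s = h * p_pred * h + r           # innovation variance, as in Lemma 1
    k = p_pred * h / s               # Kalman gain (2.21)
    x_hat = x_pred + k * (z - h * x_pred)
    p = (1.0 - k * h) * p_pred
    return x_hat, p, k

# Hypothetical numbers: unit prior variance, unit noise variances, one measurement z = 1.
x_pred, p_pred = kalman_predict(x_hat=0.0, p=1.0, f=1.0, gamma=1.0, q=1.0)
x_hat, p, k = kalman_update(x_pred, p_pred, z=1.0, h=1.0, r=1.0)
```

With these values the predicted variance is 2, the gain is 2/3, and both the posterior mean and variance equal 2/3. In the vector case the divisions become matrix inverses and the products become matrix products, in the order given in (2.19)-(2.21).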

2.3.2 The Extended Kalman Filter

If the process to be estimated or the relationship between the measurement and the state

is non-linear, then the conditions required for the Kalman filter are no longer valid. The

extended Kalman filter linearises about the current mean and covariance using Taylor ap-

proximations [38]. The state and observation equations are now described by the equations,

xt = ft(xt−1,vt−1) (2.32)

zt = ht(xt ,εt), (2.33)

where the functions ft and ht can be non-linear. The noise sequences vt−1 and εt are

zero mean white Gaussian noises, as with the Kalman filter. For simplicity, we define the

notation ht(xt) := ht(xt ,0) and ft(xt) := ft(xt ,0).

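To derive the extended Kalman filter, the following partial derivatives are required for the state equation,

$$F_{t-1} = \left.\frac{\partial f_t(x_{t-1}, 0)}{\partial x_{t-1}}\right|_{x_{t-1}=\hat{x}_{t-1}}, \qquad G_{t-1} = \left.\frac{\partial f_t(\hat{x}_{t-1}, v_{t-1})}{\partial v_{t-1}}\right|_{v_{t-1}=0}, \tag{2.34}$$

and for the measurement equation,

$$H_t = \left.\frac{\partial h_t(x)}{\partial x}\right|_{x=\hat{x}_{t|t-1}}, \qquad U_t = \left.\frac{\partial h_t(\hat{x}_{t|t-1}, \varepsilon_t)}{\partial \varepsilon_t}\right|_{\varepsilon_t=0}. \tag{2.35}$$

The nonlinear functions $f_t$ and $h_t$ can then be expanded in terms of their Taylor series,

$$f_t(x_t) = f_t(\hat{x}_{t|t}) + F_t(x_t - \hat{x}_{t|t}) + \ldots \tag{2.36}$$

$$h_t(x_t) = h_t(\hat{x}_{t|t-1}) + H_t(x_t - \hat{x}_{t|t-1}) + \ldots, \tag{2.37}$$

and the model can be approximated with

$$x_{t+1} = F_tx_t + G_tv_t + (f_t(\hat{x}_{t|t}) - F_t\hat{x}_{t|t}), \tag{2.38}$$

$$z_t = H_tx_t + \varepsilon_t + (h_t(\hat{x}_{t|t-1}) - H_t\hat{x}_{t|t-1}). \tag{2.39}$$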

Prediction Step

The predicted state and covariance are given by the extended Kalman prediction equations,

$$\hat{x}_{t|t-1} = f_t(\hat{x}_{t-1}, 0), \tag{2.40}$$

$$P_{t|t-1} = G_{t-1}Q_{t-1}G_{t-1}^T + F_{t-1}P_{t-1}F_{t-1}^T. \tag{2.41}$$

Measurement Update

The updated state and covariance are computed with the extended Kalman update equations,

$$\hat{x}_t = \hat{x}_{t|t-1} + K_t(z_t - h_t(\hat{x}_{t|t-1})), \tag{2.42}$$

$$P_t = [I - K_tH_t]P_{t|t-1}, \tag{2.43}$$

where the extended Kalman filter gain is

$$K_t = P_{t|t-1}H_t^T[U_tR_tU_t^T + H_tP_{t|t-1}H_t^T]^{-1}. \tag{2.44}$$
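One cycle of the extended Kalman filter can be sketched for a scalar state. The sketch below assumes additive noise, so that $G = U = 1$, and takes the standard innovation form of the state update; the model functions `f`, `h` and their derivatives are passed in by the caller and are hypothetical examples, not models from the thesis.

```python
def ekf_step(x_hat, p, z, f, f_jac, h, h_jac, q, r):
    """One EKF cycle for a scalar state with additive noise (so G = U = 1)."""
    # Prediction: propagate the estimate and linearise about the previous estimate.
    x_pred = f(x_hat)
    F = f_jac(x_hat)
    p_pred = F * p * F + q
    # Update: linearise the measurement function about the predicted state.
    H = h_jac(x_pred)
    k = p_pred * H / (H * p_pred * H + r)
    x_new = x_pred + k * (z - h(x_pred))
    p_new = (1.0 - k * H) * p_pred
    return x_new, p_new

# Hypothetical model: static state observed through a quadratic sensor.
x_new, p_new = ekf_step(
    x_hat=1.0, p=0.5, z=1.44,
    f=lambda x: x, f_jac=lambda x: 1.0,
    h=lambda x: x * x, h_jac=lambda x: 2.0 * x,
    q=0.5, r=1.0,
)
```

Here the predicted variance is 1, the measurement Jacobian is 2 and the gain is 0.4, so the estimate moves from 1 to 1.176. The quality of the result depends on how well the first-order expansion represents `h` near the predicted state.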

In Theorems 1 and 2, it was shown that when the dynamic model is linear and Gaussian,

the Kalman prediction and update equations result in another Gaussian. Since the Extended Kalman Filter only produces approximations for $\hat{x}_{t|t-1}$ and $P_{t|t-1}$, and the true predicted density is no longer Gaussian, it is useful to know under what circumstances the approximation is accurate. The

following results, from Anderson and Moore [44], give conditions for the convergence of

the filter (we omit the proofs here).

LEMMA 3 If the prior probability at time $t$ is the Gaussian

$$p(x_t|Z_{t-1}) = \mathcal{N}(x_t; \hat{x}_{t|t-1}, P_{t|t-1}), \tag{2.45}$$

then for fixed $h_t$, $\hat{x}_{t|t-1}$ and $R_t$,

$$p(x_t|Z_t) \to \mathcal{N}(x_t; \hat{x}_{t|t}, P_{t|t}) \tag{2.46}$$

uniformly in $x_t$ and $z_t$ as $P_{t|t-1} \to 0$.

LEMMA 4 If the posterior density at time $t$ is the Gaussian

$$p(x_t|Z_t) = \mathcal{N}(x_t; \hat{x}_{t|t}, P_{t|t}), \tag{2.47}$$

then

$$p(x_{t+1}|Z_t) \to \mathcal{N}(x_{t+1}; \hat{x}_{t+1|t}, P_{t+1|t}) \tag{2.48}$$

as $P_{t|t} \to 0$.

These Lemmas will also be invoked to give convergence results for the Gaussian Sum

filter and the Gaussian Mixture Probability Hypothesis Density Filter to be described in a

later chapter.

2.3.3 The Unscented Kalman Filter

A new linear estimator, the Unscented Kalman filter, was developed in the mid-nineties by Julier [50]. It uses a set of discretely sampled points to parameterise the mean and covariance. This technique does not require the linearisation steps needed for the extended Kalman filter, and its performance was shown analytically to be superior to that of the extended Kalman filter.

The idea behind the unscented transform is that it is easier to approximate a Gaussian

distribution than an arbitrary non-linear function. A set of sigma-points are chosen so that

their mean and covariance are xt−1 and Pt−1. The non-linear transform is applied to each

of the points to obtain a set of transformed points with mean xt|t−1 and covariance Pt|t−1. If

the state dimension is n, then 2n+1 sample points are chosen deterministically with

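$$\chi^{(0)}_{t-1} = \hat{x}_{t-1}, \qquad w^{(0)}_{t-1} = \frac{\kappa}{n+\kappa}, \tag{2.49}$$

$$\chi^{(i)}_{t-1} = \hat{x}_{t-1} + \left(\sqrt{(n+\kappa)P_{t-1}}\right)_i, \qquad w^{(i)}_{t-1} = \frac{1}{2(n+\kappa)}, \tag{2.50}$$

$$\chi^{(i+n)}_{t-1} = \hat{x}_{t-1} - \left(\sqrt{(n+\kappa)P_{t-1}}\right)_i, \qquad w^{(i+n)}_{t-1} = \frac{1}{2(n+\kappa)}, \tag{2.51}$$

for $i = 1, \ldots, n$, where $\kappa \in \mathbb{R}$, $\left(\sqrt{(n+\kappa)P_{t-1}}\right)_i$ is the $i$th row or column of a matrix square root of $(n+\kappa)P_{t-1}$, and $w^{(i)}_{t-1}$ is the weight associated with the $i$th sigma point at time $t-1$. The weights sum to one. Similarly, a set of points is computed for the observation equation $h$.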

The predicted mean and covariance are computed with the following summations,

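$$\hat{x}_{t|t-1} = \sum_{i=0}^{2n} w^{(i)}_{t-1} f(\chi^{(i)}_{t-1}), \tag{2.52}$$

$$P_{t|t-1} = \sum_{i=0}^{2n} w^{(i)}_{t-1}\big(f(\chi^{(i)}_{t-1}) - \hat{x}_{t|t-1}\big)\big(f(\chi^{(i)}_{t-1}) - \hat{x}_{t|t-1}\big)^T, \tag{2.53}$$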

and these are updated with the usual Kalman filter when a measurement is received. Similar

calculations are performed to find the predicted observation and innovation covariance.

The mean and covariance are calculated accurately up to the second order whereas the

EKF is only accurate up to first order. The main difference between this and the EKF is

that the distribution is being approximated instead of the state function ft . Numerically

stable and efficient methods can be used to compute the sigma points and there is no need

to calculate complicated Jacobian matrices. Practical examples have demonstrated the use

of the UKF in real tracking scenarios and it compares favourably with the EKF both in

accuracy and ease of implementation.
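The sigma-point construction and the weighted sums for the transformed mean and covariance can be sketched for a scalar state ($n = 1$), where the matrix square root reduces to an ordinary square root. This is a minimal sketch: the function name and the default value of `kappa` are assumptions for illustration, and the outer weights are taken as $1/(2(n+\kappa))$ so that all weights sum to one.

```python
import math

def unscented_transform_1d(mean, var, func, kappa=2.0):
    """Propagate a scalar mean and variance through func using 2n+1 = 3 sigma points."""
    n = 1
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    # Apply the non-linear function to each sigma point.
    transformed = [func(p) for p in points]
    # Weighted mean and variance of the transformed points.
    y_mean = sum(w * y for w, y in zip(weights, transformed))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, transformed))
    return y_mean, y_var

lin_mean, lin_var = unscented_transform_1d(1.0, 0.5, lambda x: 2.0 * x + 1.0)
sq_mean, sq_var = unscented_transform_1d(1.0, 0.5, lambda x: x * x)
```

For the linear map the transform reproduces the exact mean $2m + 1$ and variance $4P$. For $f(x) = x^2$ the transformed mean equals $m^2 + P$, which is the exact mean of the square of a random variable with mean $m$ and variance $P$; a first-order linearisation about $m$ would give $m^2$ instead.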

2.3.4 The Gaussian Sum Filter

The Gaussian Sum filter [44] was developed to allow for more general probability distri-

butions than just a unimodal Gaussian distribution. The state estimate is a weighted sum

of the filter outputs from a set of Extended Kalman filters. The justification behind this

approach is due to a consequence of Wiener’s Theorem on Approximation that any prob-

ability density can be approximated to an arbitrary degree with a sum of Gaussians, see

Theorem 3.

THEOREM 3 Any density $v$ on $\mathbb{R}^d$ can be approximated as closely as desired in $L^1$ by a linear combination of Gaussian densities,

$$v(x) = \lim_{n\to\infty} \sum_{i=1}^{n} \alpha_i\,\mathcal{N}(x; \mu_i, P_i). \tag{2.54}$$

This result is due to Wiener’s theorem on approximation [51]. ∎

This means that given any $\varepsilon > 0$, a positive integer $N$ can be found such that

$$\int \Big| v(x) - \sum_{i=1}^{n} \alpha_i\,\mathcal{N}(x; \mu_i, P_i) \Big|\,dx \le \varepsilon, \tag{2.55}$$

for $n \ge N$.

Assume that the posterior at time t is given by the Gaussian sum

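$$p(x_t|Z_t) = \sum_{i=1}^{J_t} w^{(i)}_t\,\mathcal{N}(x_t; m^{(i)}_t, P^{(i)}_t). \tag{2.56}$$

Then the mean and covariance are

$$\hat{x}_t = \sum_{i=1}^{J_t} w^{(i)}_t m^{(i)}_t, \tag{2.57}$$

$$P_t = E[(x_t - \hat{x}_t)(x_t - \hat{x}_t)^T] = \sum_{i=1}^{J_t} w^{(i)}_t\big[P^{(i)}_t + (\hat{x}_t - m^{(i)}_t)(\hat{x}_t - m^{(i)}_t)^T\big], \tag{2.58}$$

and the sum of the weights is 1,

$$\sum_{i=1}^{J_t} w^{(i)}_t = 1. \tag{2.59}$$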
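The overall mean and covariance of a Gaussian sum can be checked numerically in the scalar case. The sketch below is illustrative only; the function name and the two-component mixture are assumptions chosen so that the result is easy to verify by hand.

```python
def mixture_moments(weights, means, variances):
    """Mean and variance of a scalar Gaussian mixture with normalised weights."""
    mean = sum(w * m for w, m in zip(weights, means))
    # Each component contributes its own variance plus the spread of its mean.
    var = sum(w * (v + (mean - m) ** 2) for w, m, v in zip(weights, means, variances))
    return mean, var

mix_mean, mix_var = mixture_moments([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
```

For two equally weighted unit-variance components at $-1$ and $+1$, the mean is 0 and the variance is 2. The overall mean lies between the two modes, which illustrates why a single mean and covariance can be a poor summary of a multi-modal posterior.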

The algorithm is initialised with a set of Gaussians and follows the prediction and update

recursion given below.

Prediction Step

The individual components are predicted into the next time step using the Extended Kalman

Filter prediction equations (2.40) and (2.41).

LEMMA 5 If the posterior at time $t-1$ is given by the sum of Gaussians

$$p(x_{t-1}|Z_{t-1}) = \sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1}\,\mathcal{N}(x_{t-1}; m^{(i)}_{t-1}, P^{(i)}_{t-1}), \tag{2.60}$$

then the predicted density approaches the Gaussian sum

$$p(x_t|Z_{t-1}) \to \sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1}\,\mathcal{N}(x_t; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}), \tag{2.61}$$

uniformly in $x_t$ as $P^{(i)}_{t-1} \to 0$ for each component $i$.

Each component converges uniformly by Lemma 4 for the Extended Kalman Filter and the result follows. ∎

Measurement Update

When the new measurement, zt , becomes available at time t, the Gaussian components are

updated with the Extended Kalman Update. The weights are recomputed according to the

Gaussian likelihood function,

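$$w^{(i)}_t = \frac{w^{(i)}_{t-1}\,\mathcal{N}\big(z_t; h_t(m^{(i)}_{t|t-1}), H_tP^{(i)}_{t|t-1}H_t^T + R\big)}{\sum_{l=1}^{J_t} w^{(l)}_{t-1}\,\mathcal{N}\big(z_t; h_t(m^{(l)}_{t|t-1}), H_tP^{(l)}_{t|t-1}H_t^T + R\big)}. \tag{2.62}$$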
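The weight update can be sketched for a scalar state as follows. The sketch assumes that each component's likelihood is scaled by its previous weight before normalisation, so that the updated weights sum to one; the function names and the example values are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def update_mixture_weights(weights, pred_means, pred_vars, z, h, h_jac, r):
    """Reweight each component by the Gaussian likelihood of z under its prediction."""
    scored = []
    for w, m, p in zip(weights, pred_means, pred_vars):
        H = h_jac(m)                      # measurement Jacobian at the component mean
        s = H * p * H + r                 # innovation variance of this component
        scored.append(w * gaussian_pdf(z, h(m), s))
    total = sum(scored)
    return [v / total for v in scored]

new_weights = update_mixture_weights(
    weights=[0.5, 0.5], pred_means=[-1.0, 1.0], pred_vars=[1.0, 1.0],
    z=1.0, h=lambda x: x, h_jac=lambda x: 1.0, r=1.0,
)
```

With a measurement at 1, the component predicted at $+1$ receives the larger weight. Each component's mean and covariance would then be updated separately with the Extended Kalman update.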

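LEMMA 6 Suppose that the predicted density at time $t$ is given by the Gaussian sum

$$p(x_t|Z_{t-1}) = \sum_{i=1}^{J_t} w^{(i)}_{t-1}\,\mathcal{N}(x_t; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}). \tag{2.63}$$

Then the updated density approaches the Gaussian sum

$$p(x_t|Z_t) \to \sum_{i=1}^{J_t} w^{(i)}_t\,\mathcal{N}(x_t; m^{(i)}_t, P^{(i)}_t), \tag{2.64}$$

uniformly in $x_t$ and $z_t$ as $P^{(i)}_{t|t-1} \to 0$ for each component $i$.

Each term converges uniformly by Lemma 3 for the Extended Kalman Filter and the result follows. ∎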

2.4 Sequential Monte Carlo Filtering

The Bayesian filtering equations can not usually be computed analytically for general prob-

ability distributions and so Sequential Monte Carlo techniques or particle filters have proved

to be a successful method for approximating them.

Particle filters are sequential Monte Carlo methods based on point mass or particle rep-

resentations of probability densities with weightings of the particles corresponding to the

probability distribution. The basic concept is a recursive Bayesian filter by Monte Carlo

simulations. The Bootstrap Filter was proposed by Gordon [2] for implementing recursive

Bayesian filters with empirical representations of the probability densities. The density of

the state vector is represented by particles updated and propagated by the algorithm. The

idea is to eliminate particles having low importance weights and multiply particles having

high importance weights. A tutorial on particle filters and their variants is given in [52] and

a review of Sequential Monte Carlo Methods with applications is presented in [41]. This

section describes sequential importance sampling, how this relates to particle filtering algo-

rithms and the convergence properties of these algorithms.

2.4.1 Sequential Importance Sampling and Resampling

A common technique for approximating a probability distribution is Importance Sampling. Suppose that we wish to draw samples from a probability distribution $p(x) \propto \pi(x)$ which is difficult to sample from, but for which $\pi(x)$ can be evaluated. Let $q(x)$ be an importance density from which we can generate $N$ samples $x^{(i)}$. Then a weighted approximation to $p(x)$ is given by

$$p(x) \approx \sum_{i=1}^{N} \omega^{(i)}\,\delta(x - x^{(i)}), \tag{2.65}$$

$$\omega^{(i)} \propto \frac{\pi(x^{(i)})}{q(x^{(i)})}, \tag{2.66}$$

where $\omega^{(i)}$ is the normalised weight of particle $x^{(i)}$.
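Importance sampling with normalised weights can be sketched as follows. The target, proposal and test function below are illustrative assumptions: the unnormalised target is $\exp(-x^2/2)$ (a standard Gaussian up to a constant), the importance density is a zero-mean Gaussian with standard deviation 2, and the quantity estimated is $E(x^2) = 1$.

```python
import math
import random

def self_normalised_is(f, unnorm_target, draw_q, pdf_q, n, rng):
    """Estimate E_p[f(x)] using samples from q and weights proportional to target/q."""
    xs = [draw_q(rng) for _ in range(n)]
    raw = [unnorm_target(x) / pdf_q(x) for x in xs]
    total = sum(raw)
    weights = [w / total for w in raw]        # normalised weights
    estimate = sum(w * f(x) for w, x in zip(weights, xs))
    return estimate, weights

def q_pdf(x, std=2.0):
    """Importance density: zero-mean Gaussian with the given standard deviation."""
    return math.exp(-0.5 * (x / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

rng = random.Random(0)
estimate, weights = self_normalised_is(
    f=lambda x: x * x,
    unnorm_target=lambda x: math.exp(-0.5 * x * x),
    draw_q=lambda r: r.gauss(0.0, 2.0),
    pdf_q=q_pdf,
    n=20000,
    rng=rng,
)
```

Because the weights are normalised, the unknown normalising constant of the target cancels, which is why only the ability to evaluate the unnormalised target is required. A proposal with heavier tails than the target, as here, keeps the weights bounded.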

The importance sampling distribution $\pi(x_{0:t}|Z_t)$ at time $t$ is given by

$$\pi(x_{0:t}|Z_t) = \pi(x_0)\prod_{k=1}^{t} \pi(x_k|x_{k-1}, Z_k). \tag{2.67}$$

The weights can be calculated recursively by

$$\omega^{(i)}_t \propto \omega^{(i)}_{t-1}\,\frac{p(z_t|x^{(i)}_t)\,p(x^{(i)}_t|x^{(i)}_{t-1})}{\pi(x^{(i)}_t|x^{(i)}_{t-1}, Z_t)}. \tag{2.68}$$

When the prior distribution is chosen as the importance sampling distribution,

$$\pi(x_{0:t}|Z_t) = p(x_{0:t}) = \pi_0(x_0)\prod_{k=1}^{t} p(x_k|x_{k-1}), \tag{2.69}$$

the weights satisfy

$$\omega^{(i)}_t \propto \omega^{(i)}_{t-1}\,p(z_t|x^{(i)}_t), \tag{2.70}$$

and we only need to calculate the likelihood function $p(z_t|x^{(i)}_t)$.

This technique suffers from a problem called degeneracy: after a few iterations, all but a few of the particles have negligible weights. This problem is addressed by resampling from the weighted distribution to obtain an unweighted particle set which approximates the posterior distribution.
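Degeneracy and resampling can be illustrated with a small sketch. The effective sample size $1/\sum_i (\omega^{(i)})^2$ used below is a commonly used diagnostic that is not discussed in this chapter and is introduced here only as an assumption for illustration; the resampling scheme shown is plain multinomial resampling with replacement.

```python
import random

def effective_sample_size(weights):
    """N_eff = 1 / sum(w_i^2) for normalised weights; equals N for uniform weights."""
    return 1.0 / sum(w * w for w in weights)

def multinomial_resample(particles, weights, rng):
    """Draw len(particles) samples with replacement, with probability equal to weight."""
    return rng.choices(particles, weights=weights, k=len(particles))

rng = random.Random(0)
particles = [0.0, 1.0, 2.0, 3.0]
weights = [0.97, 0.01, 0.01, 0.01]            # a degenerate weight set
n_eff = effective_sample_size(weights)
resampled = multinomial_resample(particles, weights, rng)
```

For this weight set the effective sample size is close to 1 even though there are four particles. After resampling, every particle carries equal weight and most copies are expected to come from the dominant particle.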

2.4.2 The Particle Filter Algorithm

The Particle Filter or Bootstrap Filter was proposed by Gordon[2] for implementing recur-

sive Bayesian filters. The density of the state vector is represented by the discrete samples,

or particles. A description of the algorithm is given below.

•Step 0: Initialisation Step at t = 0.

In the initialisation step, we assume that we can sample $N$ particles $x^{(i)}_0$ directly from the prior $\pi_0$. Each one is assigned a mass of $\omega^{(i)}_0 = 1/N$, hence

$$\pi^N_0 = \frac{1}{N}\sum_{i=1}^{N} \delta_{x^{(i)}_0}, \tag{2.71}$$

where $\delta_{x^{(i)}_0}$ represents the Dirac delta function located at particle position $x^{(i)}_0$. We have assumed that we can sample directly from $\pi_0$ so by the Glivenko-Cantelli Theorem [53], which states that empirical distributions converge almost surely to their true distributions,

$$\lim_{N\to\infty} \pi^N_0 = \pi_0 \quad \text{a.s.} \tag{2.72}$$

Set t = 1.

•Step 1: Prediction Step at t ≥ 1.

A predicted state for each particle $x^{(i)}_{t-1}$ is obtained by sampling from the Markov transition kernel $f(x^{(i)}_{t-1}, \cdot)$,

$$x^{(i)}_t \sim f(x^{(i)}_{t-1}, \cdot). \tag{2.73}$$

The set of particles gives a discrete approximation to the prior probability density $p(x_t|Z_{t-1})$.

•Step 2: Update Step at t ≥ 1.

When the new measurement zt is obtained, weights are updated for the particles by using

the likelihood function g(·|·),

$$\omega^{(i)}_t = \frac{g(z_t|x^{(i)}_t)}{\sum_{j=1}^{N} g(z_t|x^{(j)}_t)}. \tag{2.74}$$

The posterior distribution $p(x_t|Z_t) := \pi_t$ is represented by the measure,

$$\pi^N_t = \sum_{i=1}^{N} \omega^{(i)}_t\,\delta_{x^{(i)}_t}. \tag{2.75}$$

•Step 3: Resampling Step, for t ≥ 1.

$N$ new particles, again denoted $x^{(i)}_t$, $i = 1, \ldots, N$, are created by resampling with replacement from the weighted particles according to their weights. Thus particles with large weights will tend to be resampled more often than those with low weights, and particles with low weights may be eliminated. This creates an unweighted representation of the posterior distribution $\pi_t$,

$$\pi^N_t = \frac{1}{N}\sum_{i=1}^{N} \delta_{x^{(i)}_t}. \tag{2.76}$$

If $Z_t$ is the σ-algebra generated by the measurements up to time $t$, and $G_t$ is the σ-algebra generated by the particles at time $t$, then the estimated state $\hat{x}_t$ is approximated by

$$\hat{x}^N_t = E^N(x_t|Z_t) = E(x_t|G_t) = \frac{1}{N}\sum_{i=1}^{N} x^{(i)}_t, \tag{2.77}$$

where EN(xt |Zt) represents the approximation to the true expectation E(xt |Zt) given by the

particles.

Set t = t +1 and repeat from Step 1.
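Steps 1 to 3 above can be sketched for a scalar model as follows. This is a minimal sketch under stated assumptions rather than the implementation used in the thesis: the transition kernel is a Gaussian random walk, the likelihood is Gaussian, resampling is multinomial, and the function name, noise levels and particle count are illustrative.

```python
import math
import random

def bootstrap_step(particles, z, rng, q_std=1.0, r_std=1.0):
    """One cycle of prediction, weight update and resampling for a scalar state."""
    # Step 1: prediction, sampling each particle from the transition kernel.
    predicted = [x + rng.gauss(0.0, q_std) for x in particles]
    # Step 2: update, weighting by the (unnormalised) Gaussian likelihood of z.
    likes = [math.exp(-0.5 * ((z - x) / r_std) ** 2) for x in predicted]
    total = sum(likes)
    weights = [l / total for l in likes]
    # Step 3: resampling with replacement according to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    # State estimate: mean of the equally weighted resampled particles.
    x_hat = sum(resampled) / len(resampled)
    return resampled, x_hat

rng = random.Random(1)
particles = [rng.gauss(0.0, 1.0) for _ in range(5000)]   # Step 0: sample the prior
particles, x_hat = bootstrap_step(particles, z=1.0, rng=rng)
```

For this linear Gaussian choice the exact posterior mean is 2/3 (the Kalman filter result for unit prior and noise variances), so the particle estimate should lie close to that value, with an error that shrinks as the number of particles grows.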

2.4.3 Convergence Properties

One of the crucial considerations for the particle filter algorithm is whether it converges,

that is to say that as the number of particles increases does the empirical distribution given

by the particles tend to the true distribution in some sense, and can the errors in the ap-

proximation be bounded. Convergence studies by Crisan [54], [55], [56] amongst others

have demonstrated convergence of the mean square errors and weak convergence of the

empirical measures to the true measures at each step in the algorithm. When the density

in the inner product 〈., .〉 is continuous, it defines the integral inner product, and when it is

discrete, it defines the summation inner product, so that:

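$$\langle \pi_t, \varphi \rangle = \int \pi_t(x_t|Z_{1:t})\,\varphi(x_t)\,dx_t, \tag{2.78}$$

$$\langle \pi^N_t, \varphi \rangle = \sum_{i=1}^{N} \omega^{(i)}_t\,\varphi(x^{(i)}_t). \tag{2.79}$$

If $\pi_t$ is the posterior distribution at time $t$, then it can be shown that at time $t$, there is a constant $c$, independent of $N$, such that

$$E\big[(\langle \pi^N_t, \varphi \rangle - \langle \pi_t, \varphi \rangle)^2\big] \le c\,\frac{\|\varphi\|^2}{N}, \tag{2.80}$$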

for any bounded function ϕ.

To prove that an empirical distribution converges to its true distribution, we need to

have a notion of convergence for measures. This type of convergence is called weak con-

vergence, which is fundamental to the study of probability and statistics. With this type

of convergence, the values of the random variables are not important; it is the probabili-

ties with which they assume those values that matter. Thus, the probability distributions of

the random variables will be converging, not the values themselves [57]. Let µN and µ be

probability measures on $\mathbb{R}^d$. Then, the sequence $\mu^N$ converges weakly to $\mu$ if $\int f(x)\,\mu^N(dx)$ converges to $\int f(x)\,\mu(dx)$ for each real-valued continuous and bounded function $f$ on $\mathbb{R}^d$.

The empirical measures considered here are the particles that approximate the true mea-

sures, where N is the number of particles. Let Cb(Rd) be the set of real-valued continuous

bounded functions on Rd . If (µN) is a sequence of measures, then µN converges weakly to

µ if:

limN→∞

〈µN,ϕ〉 = 〈µ,ϕ〉. (2.81)

We can write this as

limN→∞

πNt = πt a.s. (2.82)

where a.s. stands for almost surely, i.e. true everywhere except on a set of probability zero.

In chapter 4, a study of the convergence properties of the particle implementation of

the Probability Hypothesis Density (PHD) filter is presented based on the results derived

for particle filters. One of the main differences between the two algorithms is that in the

particle filter, the total particle mass is 1 whereas in the PHD filter the particle mass gives

the expected number of targets.

2.5 Summary

This chapter has presented basic filtering theory from a Bayesian perspective. The Kalman

filter has been derived using the linear/Gaussian assumptions on the state and measurement models and Bayes’ rule. The extended Kalman filter and unscented Kalman filter have been presented for situations in which the linearity assumption of the Kalman filter is relaxed to accommodate mildly non-linear models. The Gaussian sum filter is then introduced for non-Gaussian

distributions by representing the probability distribution as a mixture of Gaussians. The

sequential Monte Carlo approach to filtering is described with the example of a particle

filter, which approximates the probability distribution with a set of discrete samples. All of

these algorithms were presented in the context of single-target tracking. The PHD filter is

presented in the next chapter as a means of tracking multiple targets.

Chapter 3

The Probability Hypothesis Density

Filter

3.1 Introduction

This chapter describes the Probability Hypothesis Density (PHD) Filter. The PHD filter is a

first-moment filter which propagates the first-order moment of a dynamic point process. In

order to obtain a closed-form recursion, a Poisson point process assumption is made after

the prediction and update steps.

In single target tracking problems, the constant gain Kalman filter provides the compu-

tationally fastest solution for approximate filtering which propagates the first-order moment

of the posterior distribution. The PHD filter was proposed to provide an analogous solu-

tion in multiple target tracking problems. The first-order statistical moment of the multiple

target posterior distribution, known as the PHD, is propagated instead of the posterior. The

integral of the PHD over the state space provides an estimate of the number of targets and

the target states can be estimated by determining the peaks of this distribution.

In this chapter, the random set filtering framework is presented as a multiple-target

Bayesian recursion analogous to the single target case given in the previous chapter. The

point process theory required for the derivation is given and the PHD filter is then derived

using standard results from probability theory.

3.2 Random Set Filtering

The multiple target tracking framework based on random-sets was first proposed by Mahler[12]

as a rigorous mathematical model which attempts to unify the problems of detection, clas-

sification and tracking. The approach is a Bayesian model for recursively estimating and

updating a multi-target density function based on measurements received at each time-step.

Multiple-target filtering requires the unobserved signal process $X_{0:t} = \{X_0, \ldots, X_t\}$ to be estimated based on the σ-algebra generated by the sets of observations up to time $t$, $Z_t := \sigma(Z_1, \ldots, Z_t)$, i.e. to obtain $\hat{X}_t = \{\hat{x}_{t,1}, \ldots, \hat{x}_{t,\hat{T}_t}\}$, where $\hat{x}_{t,i}$ are the individual target estimates and $\hat{T}_t$ is the estimate of the number of targets at time $t$. This is done by recursively calculating the posterior distribution, or filtering distribution, $p_t(\cdot|Z_t)$.

The set of objects tracked at time $t$ is modelled by the point process or Random Finite Set (RFS)

$$X_t = \Big(\bigcup_{x \in X_{t-1}} S_{t|t-1}(x)\Big) \cup \Big(\bigcup_{x \in X_{t-1}} B_{t|t-1}(x)\Big) \cup \Gamma_t, \tag{3.1}$$

where $S_{t|t-1}(x)$ is the RFS of targets survived at time $t$ from multi-target state $X_{t-1}$ at time $t-1$, $B_{t|t-1}(x)$ is the RFS of targets spawned from $X_{t-1}$ and $\Gamma_t$ is the RFS of targets that appear spontaneously at time $t$. (Note that the symbol $X_t$ is used here both for the RFS and for its realisation.) The multi-target measurement at time $t$ is modelled by the RFS

$$Z_t = K_t \cup \Big(\bigcup_{x \in X_t} \Theta_t(x)\Big), \tag{3.2}$$

where $\Theta_t(x)$ is the RFS of measurements from a target with state $x$ in the multi-target state $X_t$ and $K_t$ is the RFS of measurements due to clutter.

The optimal multi-target Bayes filter propagates the multi-target posterior density pt(·|Zt)

conditioned on the sets of observations up to time t, Zt , with the following recursion

$$p_{t|t-1}(X_t|Z_{t-1}) = \int f_{t|t-1}(X_t|X)\,p_{t-1}(X|Z_{t-1})\,\mu_s(dX), \tag{3.3}$$

$$p_t(X_t|Z_t) = \frac{g_t(Z_t|X_t)\,p_{t|t-1}(X_t|Z_{t-1})}{\int g_t(Z_t|X)\,p_{t|t-1}(X|Z_{t-1})\,\mu_s(dX)}, \tag{3.4}$$

where the dynamic model is governed by the transition density ft|t−1(Xt|Xt−1) and multi-

target likelihood gt(Zt |Xt) and µs takes the place of the Lebesgue measure, as described

in [17].

The function $g_t(Z_t|X_t)$ is the joint multi-target likelihood function, or global density, of observing the set of measurements, $Z$, given the set of target states, $X$, which is the total probability density of association between measurements in $Z$ and parameters in $X$. The parameters for this density are the set of observations, $Z = \{z_1, \ldots, z_k\}$, the unknown set of target states, $X = \{x_1, \ldots, x_t\}$ (where, in this example, $t$ denotes the number of targets rather than the time index), the sensor noise distribution, or observation noise, the probabilities of detection, $p_D$, and false alarm, $p_{FA}$, clutter models, and the detection profile of the sensor or field of view (FoV). For example, suppose that the sensor noise density or single target likelihood function is $g(z|x)$, that there are no false alarms and that the probability of detection is constant. Then the joint multi-target likelihood is given by

$$g(\{z_1, \ldots, z_k\}|\{x_1, \ldots, x_t\}) = p_D^k (1 - p_D)^{t-k} \sum_{1 \le i_1 \ne \ldots \ne i_k \le t} g(z_1|x_{i_1}) \cdots g(z_k|x_{i_k}). \tag{3.5}$$

The computational complexity of the joint multi-target likelihood grows exponentially with the number of targets and so becomes numerically intractable [58]. The PHD filter was

derived to provide a sub-optimal strategy for determining the set of target states at each

iteration by using the first-order statistical moment of the multi-target posterior distribu-

tion [12].
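The joint multi-target likelihood for the special case of no false alarms and constant detection probability can be evaluated directly for small problems, which also makes the combinatorial cost visible. The sketch below is illustrative: the function names, the scalar Gaussian single-target likelihood and the numerical values are assumptions, not part of the thesis.

```python
import itertools
import math

def gaussian_likelihood(z, x, var=1.0):
    """Assumed single-target likelihood g(z|x): scalar Gaussian centred on x."""
    return math.exp(-0.5 * (z - x) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def multi_target_likelihood(measurements, targets, single_likelihood, p_d):
    """Joint likelihood with no false alarms and constant detection probability p_d."""
    k, n = len(measurements), len(targets)
    if k > n:
        return 0.0  # without clutter there cannot be more measurements than targets
    total = 0.0
    # Sum over every ordered choice of k distinct targets for the k measurements.
    for assignment in itertools.permutations(range(n), k):
        term = 1.0
        for z, idx in zip(measurements, assignment):
            term *= single_likelihood(z, targets[idx])
        total += term
    return p_d ** k * (1.0 - p_d) ** (n - k) * total

targets = [0.0, 5.0]
lik_two = multi_target_likelihood([0.1, 4.8], targets, gaussian_likelihood, p_d=0.9)
lik_none = multi_target_likelihood([], targets, gaussian_likelihood, p_d=0.9)
```

The sum has $n!/(n-k)!$ terms for $n$ targets and $k$ measurements, so ten targets with ten measurements already require more than three million products. This growth is what motivates propagating only the first-order moment.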

3.3 Point Process Theory

It is a requirement for multiple target tracking problems that the number of targets and their

states are estimated. The set of target states can be modelled by a Point Process, where the state of the population is defined as an unordered set of points $X = \{x_1, \ldots, x_N\}$. The points are located in the state space $\chi$, a complete separable metric space, and both their number and their locations are random.

To formulate the multiple target tracking model as a point process, we need the following assumptions. A distribution $\{p_n, n \in \mathbb{N}\}$ is given which determines the total number of targets and satisfies $\sum_{n \in \mathbb{N}} p_n = 1$. For each $n \ge 1$, a probability distribution $d_n$ is given which determines the joint distribution of the $n$ targets. In target tracking, the realisations of the point processes involved are unordered sets. Thus we require that the $d_n$ are symmetric, that is, all permutations are given equal weight. If the distributions $d_n$ are not symmetric, they can be symmetrised,

$$d^{sym}_n(A_1 \times \ldots \times A_n) = \frac{1}{n!}\sum_{perm} d_n(A_{i_1} \times \ldots \times A_{i_n}), \tag{3.6}$$

where $(A_1, \ldots, A_n)$ is any partition of the state space and $\sum_{perm}$ is taken over all permutations $(i_1, \ldots, i_n)$ of the integers $(1, \ldots, n)$. $d^{sym}_n$ now has the desired symmetric property [15].

3.3.1 Janossy Measures

It is convenient to introduce the Janossy measure which is symmetric by definition:

$$ J_n(A_1 \times \ldots \times A_n) = n!\, p_n\, d_n^{sym}(A_1 \times \ldots \times A_n). \tag{3.7} $$

Let jn(x1, . . . ,xn) denote the density of Jn, then

$$ \sum_{n=0}^{\infty} \frac{1}{n!} \int j_n(x_1,\ldots,x_n)\, dx_1 \ldots dx_n = 1. \tag{3.8} $$

These densities directly relate to the multi-target posterior density of a random set Θ by

$$ f_{t|t}(\{x_1,\ldots,x_n\} \mid Z_t) = j_n(x_1,\ldots,x_n), \tag{3.9} $$

where {x1, . . . ,xn} is the random set of target states and Zt is the random set of observations [12]. The random set {x1, . . . ,xn} and vector (x1, . . . ,xn) can be used interchangeably since the Janossy density is symmetric.

3.3.2 Probability Generating Functionals

Let ξ be any bounded complex-valued Borel measurable function defined on complete sep-

arable metric space χ. Then, for any realisation, (x1, . . . ,xN) of a finite point process, the

product $\prod_{i=1}^{N} \xi(x_i)$ is well defined. For convenience, we define the notation

$$ \int p_X(dX)\, \Pi_X[\xi] := \int p_X(dx_1 \ldots dx_N) \prod_{i=1}^{N} \xi(x_i). \tag{3.10} $$

The probability generating functional (PGFL), GX, of a point process X is well defined, and is given by

$$ G_X[\xi] := E\Big( \prod_{i=1}^{N} \xi(x_i) \Big) = \int p_X(dX)\, \Pi_X[\xi], \tag{3.11} $$

where pX is the density of probability measure PX defined on point process X. The PGFL characterises the point process entirely. For instance, one can recover the Janossy measures by expanding GX[ξ] as

$$ G_X[\xi] = \sum_{n=0}^{\infty} \frac{1}{n!}\, J_X^{(n)}[\xi,\ldots,\xi], \tag{3.12} $$

where the functional $J_X^{(n)}$ is defined by

$$ J_X^{(n)}[\xi,\ldots,\xi] := \int_{\chi_1} \ldots \int_{\chi_n} J_n(dx_1 \times \ldots \times dx_n) \prod_{i=1}^{n} \xi(x_i). \tag{3.13} $$

Let M be the linear vector space of all bounded measurable complex-valued functions defined on χ. Let ξ and η be fixed elements of M and let ‖η‖ < 1. If r is the largest real number such that η + λξ ∈ Sg = {φ : ‖φ‖ ≤ r} for |λ| < r, then G[η + λξ] can be written as

$$ G[\eta+\lambda\xi] = \sum_{k=0}^{\infty} \lambda^{k} \sum_{n=k}^{\infty} \int \prod_{i=1}^{k} \xi(x_i) \prod_{i=k+1}^{n} \eta(x_i)\; p_X^{(n)}(dx_1 \times \ldots \times dx_n). \tag{3.14} $$

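The nth-order variation of generating functional G, or functional derivative, is defined to be

$$ \delta^{n}_{\xi_1,\ldots,\xi_n} G[\eta] := \left[ \frac{\partial^{n}}{\partial\lambda_1 \ldots \partial\lambda_n}\, G\Big[ \eta + \sum_{i=1}^{n} \lambda_i \xi_i \Big] \right]_{\lambda_1=\ldots=\lambda_n=0}, \tag{3.15} $$

where supx |η(x)| < 1 (see [59]). The nth-order Janossy measure can be determined from GX by evaluating $\delta^{n}_{\xi_1,\ldots,\xi_n} G[\eta]$ at η = 0,

$$ (d^{(n)}G_X)_0[\xi_1,\ldots,\xi_n] := \lim_{\eta \to 0} \delta^{n}_{\xi_1,\ldots,\xi_n} G[\eta] = J_X^{(n)}[\xi_1,\ldots,\xi_n]. \tag{3.16} $$

The intensity measure VX can be obtained by differentiating GX at η = 1,

$$ (dG_X)_1[\xi] := \lim_{\eta \to 1} \delta_{\xi} G[\eta] = V_X \cdot \xi := \int \xi(x)\, V_X(dx). \tag{3.17} $$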

VX is a measure in the conventional sense, i.e., non-negative and countably additive. If

this measure admits a density, then it defines the Probability Hypothesis Density. VX can

be defined in a few different ways. For instance, it can also be defined as the first-order moment or expectation measure via Random Counting Measures. Let x = {x1, . . . ,xn} be a collection of points in X. We define the counting measure N(·|x) to be

$$ N(A \mid x) = \sum_{i=1}^{n} I_A(x_i), \quad A \subset X, \tag{3.18} $$

where IA(xi) = 1 if xi ∈ A and 0 otherwise. A point process X has an equivalent representation in terms of the counting measure it induces. To see this, note that the product $\prod_{i=1}^{N} \xi(x_i)$ can be expressed as

$$ \prod_{i=1}^{N} \xi(x_i) = \exp \int \log \xi(y)\, N(dy \mid x). \tag{3.19} $$
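The counting-measure representation can be checked numerically. The sketch below is only an illustration: the point set, the region and the test function `xi` are hypothetical, and `xi` is restricted to positive real values so that the logarithm in (3.19) is defined, whereas the text allows complex-valued functions.

```python
import math


def counting_measure(points, in_region):
    """N(A|x) from (3.18): the number of points x_i that fall in the region A."""
    return sum(1 for x in points if in_region(x))


def product_direct(points, xi):
    """Left-hand side of (3.19): the product of xi over the points."""
    result = 1.0
    for x in points:
        result *= xi(x)
    return result


def product_via_counting_measure(points, xi):
    """Right-hand side of (3.19).

    Integrating log(xi) against the counting measure N(dy|x) is a sum of
    log(xi) over the points, so xi must be positive at every point.
    """
    return math.exp(sum(math.log(xi(x)) for x in points))


points = [0.2, 1.5, 1.7, 3.1]                  # hypothetical realisation
in_region = lambda x: 1.0 <= x < 2.0           # the set A = [1, 2)
xi = lambda x: 1.0 / (1.0 + x * x)             # positive bounded test function
```

Here `counting_measure(points, in_region)` counts the two points in [1, 2), and the two product functions agree up to floating-point rounding.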

Let N1 and N2 be the counting measure representation of two independent point processes.

We can define a third point process to be the superposition of these two [60],

N(A) = N1(A)+N2(A), A ⊂ X (3.20)

It follows that

GN[ξ] = GN1[ξ]GN2[ξ]. (3.21)
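As a worked instance of (3.17) and (3.21), consider the Poisson point process, which is the case assumed later in the derivation. The closed form used below is the standard PGFL of a Poisson process with intensity measure V; the symbols V, V1 and V2 are introduced here only for illustration.

$$ G[\xi] = \exp\big( V \cdot (\xi - 1) \big), \qquad \frac{\partial}{\partial\lambda}\, G[\eta + \lambda\xi] \Big|_{\lambda=0} = (V \cdot \xi)\, \exp\big( V \cdot (\eta - 1) \big) \;\xrightarrow{\;\eta \to 1\;}\; V \cdot \xi . $$

$$ G_N[\xi] = \exp\big( V_1 \cdot (\xi - 1) \big)\, \exp\big( V_2 \cdot (\xi - 1) \big) = \exp\big( (V_1 + V_2) \cdot (\xi - 1) \big). $$

The first line recovers the intensity measure from the PGFL, and the second shows that the superposition of two independent Poisson processes is again Poisson, with intensity measure V1 + V2.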

A joint probability generating functional (JPGFL), GX ,Y of point processes X and Y can be

defined by

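$$ G_{X,Y}[g,h] := \int\!\!\int p_{X,Y}(dx,dy)\, \Pi_X[g]\, \Pi_Y[h], \tag{3.22} $$

and has the following properties,

$$ (d^{n}G_{X,Y}[1,\cdot])_0[\eta_1,\ldots,\eta_n] = J_Y^{(n)}[\eta_1,\ldots,\eta_n], \tag{3.23} $$

$$ (d^{n}G_{X,Y}[g,\cdot])_{h=0}[\eta_1,\ldots,\eta_n] = p_X \cdot \big( J_{Y|X}^{(n)}[\eta_1,\ldots,\eta_n \mid x]\, \Pi_{(\cdot)}[g] \big), \tag{3.24} $$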

where η1, . . . ,ηn are complex-valued functions which are defined on the complete separable

metric space on which elements of point process Y are located. The second property is valid

provided that differentiation and expectation can be interchanged. This can be verified using

the Lebesgue Dominated Convergence Theorem [42]. If the Janossy measure JY |X admits a

density jY |X then it can be replaced with this in the second property above. We can now use

these results to define the conditional probability generating functional GX |y[η] = GX |Y [η|y]

using Bayes rule,

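$$ P_{X|Y}(dx \mid y) = \frac{P_X(dx)\, P_{Y|X}(y \mid x)}{\int P_X(dx)\, P_{Y|X}(y \mid x)}. \tag{3.25} $$

The conditional PGFL of X given Y = y is defined to be

$$ G_{X|Y}[\eta \mid y] := P_{X|Y}(\cdot \mid y) \cdot \Pi_{(\cdot)}[\eta] \tag{3.26} $$

$$ = \frac{P_X \cdot \big( p_{Y|X}(y \mid \cdot)\, \Pi_{(\cdot)}[\eta] \big)}{P_X \cdot p_{Y|X}(y \mid \cdot)} \tag{3.27} $$

$$ = \frac{(d^{n}G_{X,Y}[\eta,\cdot])_0[\delta_{y_1},\ldots,\delta_{y_n}]}{(d^{n}G_{X,Y}[1,\cdot])_0[\delta_{y_1},\ldots,\delta_{y_n}]}, \tag{3.28} $$

where δy represents the Dirac delta function centred at y.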

The theory presented in this section will be used in the next section to derive the PHD

filter.

3.4 PHD Filter Derivation

The PHD filter can be considered as the first-order moment of a dynamic point process

with Markov shifts [42]. A Poisson assumption is made in order to derive a closed form

solution. This section provides a derivation of the PHD recursion from point process theory

described in section 3.3 using standard probability theory, based on the derivations by Vo

and Singh [42]. The probability generating functional of the prediction distribution, ft|t−1,

is shown to be a transformation of the PGFL for the multi-target posterior at time t, ft|t .

The formula for the prediction PHD is then found in terms of posterior PHD by taking the

functional derivative of the probability generating functionals. The update equation for the

PHD filter is found by assuming that the prediction density ft|t−1 is approximately Poisson,

and finding the relationship with the posterior in terms of its joint probability generating

functional.

3.4.1 The PHD Prediction Equation

The dynamics of the system evolve according to the probability that a given target state

x ∈ Xt−1 will survive, pS,t , and the transition kernel ft|t−1.

The RFS Bt|t−1(x) models the set of target states spawned from target x ∈ Xt−1 and

Γt models the set of new target states which appear spontaneously. The random finite set

Xt , which models the multi-target state at time t, is the union of the targets which have

survived from t − 1, those which have been spawned by existing targets and those which

appear spontaneously at time t.

THEOREM 4 Suppose that the RFS of targets at time t is given by

$$ X_t = \Big( \bigcup_{x \in X_{t-1}} S_{t|t-1}(x) \Big) \cup \Big( \bigcup_{x \in X_{t-1}} B_{t|t-1}(x) \Big) \cup \Gamma_t. \tag{3.29} $$

If the intensity measures of the multi-target probability distributions admit densities, then the prediction equation for the PHD is given by

$$ D_{t|t-1}(x \mid Z_{t-1}) = \gamma_t(x) + \int \phi_{t|t-1}(x, x_{t-1})\, D_{t-1|t-1}(x_{t-1} \mid Z_{t-1})\, dx_{t-1}, \tag{3.30} $$

where Dt−1|t−1 is the density of intensity measure Vt−1|t−1 of the multi-target posterior at time t − 1, and Dt|t−1 is the predicted density to time t. The transition kernel φt|t−1 is given by

$$ \phi_{t|t-1}(x,\xi) = p_{S,t}(\xi)\, f_{t|t-1}(x \mid \xi) + \beta_{t|t-1}(x \mid \xi), \tag{3.31} $$

where γt is the PHD for spontaneous birth of a new target at time t, βt|t−1 is the PHD for spawned target birth of a new target at time t, pS,t is the probability of target survival and ft|t−1 is the single target motion distribution.

Let the intensity measure of the multi-target prediction density pt|t−1(·|Zt−1) be denoted Vt|t−1(·|Zt−1), and let the intensity measure of the multi-target posterior density pt(·|Zt) be denoted Vt(·|Zt). Furthermore, let these intensity measures admit densities Dt|t−1 and Dt respectively, which are the PHDs. Let $V_{S_{t|t-1}}$, $V_{B_{t|t-1}}$ and $V_{\Gamma_t}$ denote the intensity measures of St|t−1, Bt|t−1 and Γt respectively. The random sets St|t−1, Bt|t−1 and Γt are independent, so by superposition (equation 3.21) the union

$$ \Big( \bigcup_{x \in X_{t-1}} S_{t|t-1}(x) \Big) \cup \Big( \bigcup_{x \in X_{t-1}} B_{t|t-1}(x) \Big) \cup \Gamma_t $$

has probability generating functional $G_{S_{t|t-1}} G_{B_{t|t-1}} G_{\Gamma_t}$ and intensity measure $V_{S_{t|t-1}} + V_{B_{t|t-1}} + V_{\Gamma_t}$. Using the fact that St|t−1 and Bt|t−1 are formed from Markov shifts and Proposition 8.2IV [15],

$$ V_{t|t-1}(A) = \langle V_{t-1},\, V_{S_{t|t-1}} + V_{B_{t|t-1}} \rangle + V_{\Gamma_t}, \quad \forall A \in \mathcal{B}(Y), \tag{3.32} $$

which gives the desired PHD prediction, by taking the densities.
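A quick consistency check on the prediction equation is to integrate (3.30) over the state space, which gives the predicted expected number of targets. The symbols $N_{t|t-1}$, $N_{t-1|t-1}$ and $\bar{\gamma}_t$ below are introduced only for this illustration, and the single-target motion density is assumed to integrate to one.

$$ N_{t|t-1} := \int D_{t|t-1}(x \mid Z_{t-1})\, dx = \int \gamma_t(x)\, dx + \int \Big( p_{S,t}(\xi) + \int \beta_{t|t-1}(x \mid \xi)\, dx \Big) D_{t-1|t-1}(\xi \mid Z_{t-1})\, d\xi . $$

With a constant survival probability, no spawning and birth mass $\bar{\gamma}_t = \int \gamma_t(x)\, dx$, this reduces to $N_{t|t-1} = \bar{\gamma}_t + p_{S,t}\, N_{t-1|t-1}$. For example, $N_{t-1|t-1} = 3$, $p_{S,t} = 0.95$ and $\bar{\gamma}_t = 0.2$ give $N_{t|t-1} = 0.2 + 0.95 \times 3 = 3.05$.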

3.4.2 The PHD Measurement Update Equation

The set of noisy observations at time t, which may include false alarms and have missed detections, is modelled by a random finite set, or point process, Zt. A measurement, detected with probability pD,t, is distributed according to the conditional probability Lt,

$$ L_t(x, B) = P(z_t \in B \mid x_t = x), \tag{3.33} $$

which it is assumed admits a density known as the likelihood function, lt, determined from the Radon-Nikodym derivative,

$$ l_t(x, \cdot) = \frac{dL_t(x, \cdot)}{d\lambda_Z}, \tag{3.34} $$

on observation space Z. The random finite set representing the set of measurements is given by the union of Θt(x), the RFS which is either empty or is distributed according to Lt(x, ·), and Kt, the set of false alarms known as clutter points. The point process Zt can be described by the conditional probability measure

$$ P(Z_t \in V \mid X_t = x) \tag{3.35} $$

for all sets V on the observation space. The likelihood of observation z given target state x is written gt(z|x) := lt(x,z) (note that there will be no confusion with the multi-target likelihood since the parameters will relate to single targets).

THEOREM 5 Suppose that the set of measurements at time t is given by

$$ Z_t = K_t \cup \Big( \bigcup_{x \in X_t} \Theta_t(x) \Big), \tag{3.36} $$

where Kt and Xt are Poisson point processes. Furthermore, assume that the prediction distribution is Poisson, and that the intensity measures admit densities. Then the PHD Measurement Update Equation is given by

$$ D_{t|t}(x \mid Z_t) = \Big[ (1 - p_{D,t}) + \sum_{z \in Z_t} \frac{\psi_{t,z}(x)}{\kappa_t(z) + \langle D_{t|t-1}, \psi_{t,z} \rangle} \Big]\, D_{t|t-1}(x \mid Z_{t-1}), \tag{3.37} $$

where

$$ \kappa_t(z) = \lambda_t c_t(z) \tag{3.38} $$

is the clutter model, with λt being the Poisson parameter specifying the expected number of false alarms and ct the probability distribution over the observation space of clutter points, and

$$ \psi_{t,z}(x) = p_{D,t}(x)\, g(z \mid x), \tag{3.39} $$

where g is the single target likelihood function and pD,t is the probability of detection.

Using the independence of point processes Kt and Θt(x), by superposition, the conditional probability generating functional GZt|Xt[h|x] is given by

$$ G_{Z_t|X_t}[h \mid x] = G_{K_t}[h] \prod_{x \in X_t} G_{\Theta_t(x)}[h], \tag{3.40} $$

and since a measurement generated by x is distributed according to Lt, with probability pD,t that it is detected, the PGFL GΘt(x)[h] is given by

$$ G_{\Theta_t(x)}[h] = (1 - p_{D,t}) + p_{D,t}\, H[h](x), \tag{3.41} $$

where H[h] is the functional defined by

$$ H[h](x) := \int h(z)\, g(z \mid x)\, dz, \tag{3.42} $$

so that equation (3.40) is equal to

$$ G_{K_t}[h] \prod_{x \in X_t} \big( (1 - p_{D,t}) + p_{D,t}\, H[h](x) \big). \tag{3.43} $$

Hence, using 3.24, the joint probability generating functional GXt,Zt[ξ,h] is given by

$$ G_{X_t,Z_t}[\xi,h] = P_{t|t-1} \cdot \big( G_{Z_t|X_t}[h \mid \cdot]\, \Pi_{(\cdot)}[\xi] \big) = G_{K_t}[h]\, G_{X_t}\big[ \xi (1 - p_{D,t} + p_{D,t} H[h]) \big], \tag{3.44} $$

and since Xt and Kt are Poisson point processes,

$$ = \exp\Big( V_{K_t} \cdot (h - 1) + V_{X_t} \cdot \big( \xi (1 - p_{D,t} + p_{D,t} H[h]) - 1 \big) \Big). \tag{3.45} $$

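Using equation 3.28, from Bayes' rule,

$$ G_{X_t|Z_t}[\xi \mid y] = \frac{(d^{n}G_{X_t,Z_t}[\xi,\cdot])_0[\delta_{z_1},\ldots,\delta_{z_n}]}{(d^{n}G_{X_t,Z_t}[1,\cdot])_0[\delta_{z_1},\ldots,\delta_{z_n}]}, \tag{3.46} $$

where y = {z1, . . . ,zn} denotes the realisation of Zt, so that the conditional measure VXt|Zt[η|y] is

$$ V_{X_t|Z_t}[\eta \mid y] = V_{X_t|Z_t} \cdot \eta = \frac{\big( d\, (d^{n}G_{X_t,Z_t}[\xi,\cdot])_0[\delta_{z_1},\ldots,\delta_{z_n}] \big)_{\xi=1}[\eta]}{(d^{n}G_{X_t,Z_t}[1,\cdot])_0[\delta_{z_1},\ldots,\delta_{z_n}]}. \tag{3.47} $$

For ease of notation, define the functional Fy,h[ξ] as

$$ F_{y,h}[\xi] := (d^{n}G_{X_t,Z_t}[\xi,\cdot])_h[\delta_{z_1},\ldots,\delta_{z_n}], \tag{3.48} $$

so that the conditional intensity measure is

$$ V_{X_t|Z_t}[\eta \mid y] = \frac{(dF_{y,0})_1[\eta]}{F_{y,0}[1]}. \tag{3.49} $$

Evaluating Fy,h[ξ], by taking the functional derivatives, gives

$$ F_{y,h}[\xi] = G_{X_t,Z_t}[\xi,h] \prod_{z \in Z_t} \big( V_{X_t} \cdot \xi\psi_{t,z} + v_{K_t}(z) \big), \tag{3.50} $$

where ψt,z(x) = pD,t g(z|x). Taking the derivative of this at h = 0 and ξ = 1 using the chain rule, we get

\begin{align}
(dF_{y,0})_1[\eta] &= V_{X_t} \cdot \eta (1 - p_{D,t})\, F_{y,0}[1] \tag{3.51} \\
&\quad + G_{X_t,Z_t}[1,0] \prod_{z \in Z_t} \big( V_{X_t} \cdot \psi_{t,z} + v_{K_t}(z) \big) \sum_{z \in Z_t} \frac{V_{X_t} \cdot \eta\psi_{t,z}}{V_{X_t} \cdot \psi_{t,z} + v_{K_t}(z)}, \tag{3.52}
\end{align}

which by 3.50,

$$ = V_{X_t} \cdot \eta (1 - p_{D,t})\, F_{y,0}[1] + F_{y,0}[1] \sum_{z \in Z_t} \frac{V_{X_t} \cdot \eta\psi_{t,z}}{V_{X_t} \cdot \psi_{t,z} + v_{K_t}(z)}, \tag{3.53} $$

and taking out a factor of Fy,0[1],

$$ = F_{y,0}[1]\; V_{X_t} \cdot \eta \Big( (1 - p_{D,t}) + \sum_{z \in Z_t} \frac{\psi_{t,z}}{V_{X_t} \cdot \psi_{t,z} + v_{K_t}(z)} \Big). \tag{3.54} $$

Then, by 3.47, the conditional measure VXt|Zt[η|z] is

$$ V_{X_t|Z_t}[\eta \mid z] = V_{X_t} \cdot \eta \Big( (1 - p_{D,t}) + \sum_{z \in Z_t} \frac{\psi_{t,z}}{V_{X_t} \cdot \psi_{t,z} + v_{K_t}(z)} \Big). \tag{3.55} $$

Since VXt = Vt|t−1(·|Zt−1) and VXt|Zt = Vt, we have

$$ V_t[\eta \mid z] = V_{t|t-1} \cdot \Big( (1 - p_{D,t}) + \sum_{z \in Z_t} \frac{\psi_{t,z}}{V_{t|t-1} \cdot \psi_{t,z} + v_{K_t}(z)} \Big) \eta, \tag{3.56} $$

which, by taking the densities, gives the PHD measurement equation.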
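Integrating the measurement update (3.37) over the state space shows how each measurement contributes to the expected number of targets. The symbol $N_{t|t}$ and the numerical values below are hypothetical and serve only as an illustration.

$$ N_{t|t} := \int D_{t|t}(x \mid Z_t)\, dx = \langle D_{t|t-1},\, 1 - p_{D,t} \rangle + \sum_{z \in Z_t} \frac{\langle D_{t|t-1}, \psi_{t,z} \rangle}{\kappa_t(z) + \langle D_{t|t-1}, \psi_{t,z} \rangle}. $$

Each term of the sum lies between 0 and 1: it is close to 1 when the clutter intensity $\kappa_t(z)$ is small relative to $\langle D_{t|t-1}, \psi_{t,z} \rangle$, and close to 0 when clutter dominates. For instance, with a constant $p_{D,t} = 0.9$, predicted mass $\langle D_{t|t-1}, 1 \rangle = 2$, and a single measurement with $\langle D_{t|t-1}, \psi_{t,z} \rangle = 0.4$ and $\kappa_t(z) = 0.1$, the updated mass is $0.1 \times 2 + 0.4 / 0.5 = 1.0$.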

3.5 Summary

The framework for multiple target tracking used in this thesis has been described in this

chapter. This has been presented in the context of point processes, which enables results

from standard probability to be invoked. The PHD filter has been derived from point process

theory as the first-order moment of the optimal multiple-target Bayes filter with a dynamic

point process, from which a set-valued estimate can be determined at each time-step based

on a set-valued observation. The relationships between point process theory and random

counting measures have been shown.

In the next two chapters, two different implementations of the PHD filter are given.

The first is the Particle PHD filter, which extends the particle filter from chapter

2 to a multiple-target environment using a sequential Monte Carlo algorithm. The asymp-

totic convergence properties of this algorithm are established in the next chapter and error

bounds are determined for the mean square errors. The second implementation is the Gaus-

sian Mixture PHD filter described in chapter 5, which is similar in style to the Gaussian

Sum filter from chapter 1, where in this case the PHD is represented by a finite weighted

mixture of Gaussians in which the means and covariances are predicted and updated with

the Kalman filter equations and the weights are updated according to the PHD filter equa-

tions. The uniform convergence properties of this approximation to the PHD are derived.

Chapter 4

The Particle PHD Filter

4.1 Introduction

Sequential Monte Carlo approximations of the optimal multiple-target filter are computa-

tionally expensive. A practical suboptimal alternative to the optimal filter is the Probability

Hypothesis Density (PHD) filter, which propagates the first-order statistical moment in-

stead of the full multiple-target posterior. The integral of the PHD in any region of the state

space is the expected number of targets in that region [12].

Particle filter methods for the PHD-filter have been devised by Vo [18] and Zajic [19].

Practical applications of the filter include tracking vehicles in different terrains [61], track-

ing targets in passive radar located on ellipses [62], and tracking a variable number of targets

in forward-scan sonar [24]. The Sequential Monte Carlo implementation, or Particle PHD

Filter algorithm, is given in the next section.

It was noted in chapter 2 that one of the crucial considerations for particle filter algo-

rithms is convergence as the number of samples from the posterior distribution increases. It

is required that the empirical distribution represented by the particles tends to the true dis-

tribution and that the errors in the approximation can be bounded. Convergence studies by

Crisan [54], [55], [56] amongst others have demonstrated convergence of the mean square

errors and weak convergence of the empirical measures to the true measures at each step in

the algorithm. This chapter presents convergence results for the Particle PHD filter. Bounds

are established for the mean square error and weak convergence of the empirical particle

measure to the true PHD measure is shown.

4.2 The Particle PHD Filter Algorithm

The implementation of the PHD Particle filter is an adaptation of the method described by

Vo et al. [17], based on a Sequential Monte Carlo algorithm for multi-target tracking. The

algorithm can be informally described by the following stages. In the initialisation stage,

particles are distributed across the field of view according to the prior. The particles are

propagated in the prediction stage using the dynamic model with added process noise and,

in addition, particles are added to allow for incoming targets. When the measurements

are received, weights are calculated for the particles based on their likelihoods, which are

determined by the statistical distance of the particles to the set of observations. The sum

of the weights gives the estimated number of targets. Particles are then resampled from the

weighted particle set to give an unweighted representation of the PHD.

The Sequential Monte Carlo implementation of the PHD Filter is given here. The algo-

rithm is initialised in Step 0 and then iterates through Steps 1 to 3.

Step 0: Initialisation at t=0

The filter is initialised with N0 particles drawn from a prior distribution. The number of

particles is adapted at each stage so that it is proportional to the number of targets. Let N

be the number of particles per target. The mass assigned to each particle is T0/N0, where

T0 is the expected initial number of targets, which will be updated after an iteration of the

algorithm.

• ∀i = 1, . . . ,N0, sample $x_0^{(i)}$ from D0|0 and set t = 0.

Let $D^{N_0}_{0|0}$ be the measure:

$$ D^{N_0}_{0|0}(dx_t) := \frac{T_0}{N_0} \sum_{i=1}^{N_0} \delta_{x_t^{(i)}}(dx_t), \tag{4.1} $$

where $\delta_{x_t^{(i)}}$ is the Dirac delta function centred at $x_t^{(i)}$.

Step 1: Prediction Step, for t ≥ 0

In the prediction step, samples are obtained by two importance sampling proposal densities, qt and pt:

• ∀i = 1, . . . ,Nt−1, sample $x_t^{(i)}$ from a proposal density $q_t(\cdot \mid x_{t-1}^{(i)}, Z_t)$, and evaluate the predicted weights $\omega_{t|t-1}^{(i)}$:

$$ \omega_{t|t-1}^{(i)} = \frac{\phi_{t|t-1}(x_t^{(i)}, x_{t-1}^{(i)})}{q_t(x_t^{(i)} \mid x_{t-1}^{(i)}, Z_t)}\; \omega_{t-1}^{(i)}. \tag{4.2} $$

M new-born particles are also introduced from the spontaneous birth model to detect new targets entering the state space.

• ∀i = Nt−1 + 1, . . . ,Nt−1 + M, sample $x_t^{(i)}$ from another proposal density $p_t(\cdot \mid Z_t)$, and compute the weights of new-born particles $\omega_{t|t-1}^{(i)}$:

$$ \omega_{t|t-1}^{(i)} = \frac{\gamma_t(x_t^{(i)})}{p_t(x_t^{(i)} \mid Z_t)}. \tag{4.3} $$

Let $D^{N_{t-1}}_{t|t-1}$ and $D^{N_{t-1},M}_{t|t-1}$ be the measures:

$$ D^{N_{t-1}}_{t|t-1}(dx_t) := \sum_{i=1}^{N_{t-1}} \omega_{t|t-1}^{(i)}\, \delta_{x_t^{(i)}}(dx_t), \tag{4.4} $$

$$ D^{N_{t-1},M}_{t|t-1}(dx_t) := \sum_{i=1}^{N_{t-1}+M} \omega_{t|t-1}^{(i)}\, \delta_{x_t^{(i)}}(dx_t). \tag{4.5} $$
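The prediction step can be sketched in Python for a scalar state. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the motion model is a Gaussian random walk, there is no spawning, the proposal for surviving particles is the transition density itself (so the ratio in (4.2) reduces to the survival probability), and the birth proposal is uniform and proportional to the birth intensity. The birth weights are additionally divided by the number of new-born particles so that these particles jointly carry the birth mass; that normalisation is an assumption of this sketch and does not appear in (4.3) as printed. All function and parameter names are hypothetical.

```python
import random


def predict_particles(particles, weights, p_survive, motion_std,
                      n_birth, birth_mass, birth_range, rng):
    """Prediction step for a 1-D random-walk model without spawning.

    particles, weights : surviving particle states and weights from time t-1
    p_survive          : constant survival probability (assumed)
    motion_std         : standard deviation of the random-walk process noise
    n_birth            : number M of new-born particles
    birth_mass         : integral of the birth intensity gamma_t (assumed known)
    birth_range        : (low, high) support of the uniform birth proposal
    """
    new_particles, new_weights = [], []
    for x, w in zip(particles, weights):
        new_particles.append(x + rng.gauss(0.0, motion_std))  # sample from q_t
        new_weights.append(p_survive * w)                      # phi / q = p_S
    low, high = birth_range
    for _ in range(n_birth):
        new_particles.append(rng.uniform(low, high))           # sample from p_t
        new_weights.append(birth_mass / n_birth)               # gamma / (M p_t)
    return new_particles, new_weights
```

Under these assumptions the predicted mass is the survival probability times the previous mass plus the birth mass, which mirrors the integral of the PHD prediction equation.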

Step 2: Update Step, for t ≥ 0

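After the new measurements are obtained, the weights are recalculated using the likelihood function g(·|·) to update the distribution based on new information:

• Let Rt = Nt−1 + M. ∀z ∈ Zt, compute:

$$ \langle \omega_{t|t-1}, \psi_{t,z} \rangle = \sum_{i=1}^{R_t} \psi_{t,z}(x_t^{(i)})\, \omega_{t|t-1}^{(i)}. \tag{4.6} $$

• ∀i = 1, . . . ,Rt, update weights:

$$ \omega_t^{(i)} = \Big[ (1 - p_D) + \sum_{z \in Z_t} \frac{\psi_{t,z}(x_t^{(i)})}{\kappa_t(z) + \langle \omega_{t|t-1}, \psi_{t,z} \rangle} \Big]\, \omega_{t|t-1}^{(i)}. \tag{4.7} $$

Let $D^{R_t}_{t|t}$ be the measure:

$$ D^{R_t}_{t|t}(dx_t) := \sum_{i=1}^{R_t} \omega_t^{(i)}\, \delta_{x_t^{(i)}}(dx_t). \tag{4.8} $$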
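A minimal Python sketch of the weight update in (4.6) and (4.7) follows. It assumes a scalar state observed directly with Gaussian noise, a constant probability of detection, and a constant clutter intensity; these modelling choices and all names are illustrative assumptions rather than the configuration used in the thesis.

```python
import math


def likelihood(z, x, sigma):
    """Single-target likelihood g(z|x): scalar Gaussian (illustrative choice)."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def update_weights(particles, weights, measurements, p_d, clutter_intensity, sigma):
    """Update predicted weights with the measurement set, following (4.6)-(4.7).

    clutter_intensity is kappa_t(z), assumed constant over the observation space.
    """
    # Missed-detection term (1 - p_D) * w for every particle.
    updated = [(1.0 - p_d) * w for w in weights]
    for z in measurements:
        psi = [p_d * likelihood(z, x, sigma) for x in particles]  # psi_{t,z}(x_i)
        inner = sum(p * w for p, w in zip(psi, weights))          # equation (4.6)
        denom = clutter_intensity + inner
        if denom > 0.0:  # skip measurements that no particle can explain
            for i, (p, w) in enumerate(zip(psi, weights)):
                updated[i] += p * w / denom                       # detection term of (4.7)
    return updated
```

With no clutter, each measurement adds exactly one unit of mass, so the updated weights sum to (1 − pD) times the predicted mass plus the number of measurements; with heavy clutter the contribution of each measurement shrinks towards zero.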

Step 3: Resampling Step

The particles are resampled to obtain an unweighted representation of Dt|t. This is unweighted since the resampled representation of Dt|t is given by the particle density.

• Compute the mass of the particles:

$$ T_t = \sum_{i=1}^{R_t} \omega_t^{(i)}, \tag{4.9} $$

and set Nt = N · int(Tt) (where int(Tt) is the integer nearest to Tt). Target estimates are taken at this stage because the resampling stage introduces further approximations, resulting in less descriptive posterior distributions. In the PHD Filter algorithm, the weights are not normalised as in the standard particle filter algorithm as they do not sum to one but, instead, to the expected number of targets.

• Resample $\{ \omega_t^{(i)} / T_t,\; x_t^{(i)} \}_{i=1}^{R_t}$ to get $\{ T_t / N_t,\; x_t^{(i)} \}_{i=1}^{N_t}$.

The particles each have weight Tt/Nt after resampling. Let $D^{N_t}_{t|t}$ be the measure:

$$ D^{N_t}_{t|t}(dx_t) := \sum_{i=1}^{N_t} \omega_t^{(i)}\, \delta_{x_t^{(i)}}(dx_t). \tag{4.10} $$
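The resampling step can be sketched as below, using multinomial resampling, which matches the unbiased resampling assumed in the convergence analysis. The function name and the guard that keeps at least one particle when the mass rounds to zero are assumptions of this sketch, not part of the algorithm as stated.

```python
import random


def resample(particles, weights, particles_per_target, rng):
    """Multinomial resampling that preserves the total mass T_t.

    Returns the resampled particles, their equal weights T_t / N_t, and T_t.
    """
    total_mass = sum(weights)                       # T_t, equation (4.9)
    n_targets = int(round(total_mass))              # integer nearest to T_t
    # Guard (an assumption of this sketch): keep at least one particle.
    n_new = max(particles_per_target * n_targets, 1)
    # random.choices draws with replacement in proportion to the weights,
    # i.e. with probabilities w_i / T_t.
    chosen = rng.choices(particles, weights=weights, k=n_new)
    return chosen, [total_mass / n_new] * n_new, total_mass
```

Note that the weights are rescaled to sum to the estimated number of targets rather than to one, which is the difference from the standard particle filter highlighted in the text.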

4.3 Convergence for the Particle PHD Filter Algorithm

Convergence properties for the Particle PHD Filter will now be established. First, we consider the rate of convergence of the average mean square error $E\big[ (\langle D^{N_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle)^2 \big]$ for any function ϕ ∈ B(Rd), where B(Rd) is the set of bounded Borel measurable functions on Rd. Then we show almost-sure convergence of $D^{N_t}_{t|t}$ to Dt|t. When the measure in the inner product 〈·, ·〉 is continuous, it defines the integral inner product, and when it is discrete, it defines the summation inner product, so that:

$$ \langle D_{t|t}, \varphi \rangle = \int D_{t|t}(x_t \mid Z_{1:t})\, \varphi(x_t)\, dx_t, \tag{4.11} $$

$$ \langle D^{N_t}_{t|t}, \varphi \rangle = \sum_{i=1}^{N_t} \omega_t^{(i)}\, \varphi(x_t^{(i)}). \tag{4.12} $$

The norm ‖ϕ‖ used here is the supremum norm.

4.3.1 Criteria for Convergence

To show convergence, certain conditions on the functions need to be met:

• The transition kernel φt|t−1 satisfies the Feller property, i.e. ∀t > 0, $\int \varphi(y)\, \phi_{t|t-1}(x, dy)$ is continuous ∀ϕ ∈ Cb(Rd), where Cb(Rd) are the continuous bounded functions on Rd.

• ψt,z ∈ Cb(Rd).

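• For any rational-valued random variables $Q_t^{(i)}$ such that there exists p > 1, some constant C, and α < p − 1,

$$ E\Big[ \Big| \sum_{i=1}^{N} \big( Q_t^{(i)} - N\omega_t^{(i)} \big)\, q^{(i)} \Big|^{p} \Big] \le C N^{\alpha} \|q\|^{p} \tag{4.13} $$

for all vectors q = (q(1), . . . ,q(N)) and $\sum_{i=1}^{N} Q_t^{(i)} = N$.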

• We assume that the importance sampling ratios are bounded, i.e. there exist constants B1

and B2 such that ‖γt/pt‖ ≤ B1 and ‖φt|t−1/qt‖ ≤ B2.

• The resampling strategy is multinomial and hence unbiased [54], i.e. the resampled par-

ticle set is i.i.d. according to the empirical distribution before resampling.

The data update equation assumes a Poisson model, and hence is only an approximation.

The clutter parameter κt,z needs to be determined from the data and cannot be inferred from

the recursion. For the purpose of these proofs, it has been assumed that we know the correct

density ct and average number of Poisson distributed clutter points λt .

4.3.2 Convergence of the Mean Square Errors

If {µN}, N = 1, . . . ,∞, is a sequence of measures that depend on the number of particles, then we say µN converges to µ if ∀ϕ ∈ B(Rd),

$$ \lim_{N \to \infty} E\big[ (\langle \mu_N, \varphi \rangle - \langle \mu, \varphi \rangle)^2 \big] = 0. \tag{4.14} $$

We show that, in the case of the PHD, this depends only on T , the number of targets, and

Nt , the number of particles. Let the likelihood function g ∈ B(Rd) be a bounded function.

At each stage of the algorithm, the approximation admits a mean square error on the order

of the number of particles. We proceed by first showing that equation (4.15) is satisfied in

the initialisation step. Then we show that if equation (4.15) holds, then after the prediction

step equation (4.16) holds. If equation (4.16) holds, then equation (4.17) holds after the

update step. Finally, we show that if equation (4.17) holds, then equation (4.18) holds after

resampling.

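$$ E\big[ (\langle D^{N_{t-1}}_{t-1|t-1}, \varphi \rangle - \langle D_{t-1|t-1}, \varphi \rangle)^2 \big] \le c_{t-1|t-1} \frac{\|\varphi\|^2}{N_{t-1}}, \tag{4.15} $$

$$ E\big[ (\langle D^{N_{t-1},M}_{t|t-1}, \varphi \rangle - \langle D_{t|t-1}, \varphi \rangle)^2 \big] \le \|\varphi\|^2 \Big( \frac{c_{t|t-1}}{N_{t-1}} + \frac{d_t}{M} \Big), \tag{4.16} $$

$$ E\big[ (\langle D^{R_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle)^2 \big] \le c_{t|t} \frac{\|\varphi\|^2}{R_t}, \tag{4.17} $$

$$ E\big[ (\langle D^{N_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle)^2 \big] \le c_{t|t} \frac{\|\varphi\|^2}{N_t}. \tag{4.18} $$

In deriving the proofs, we use the Minkowski inequality, which states that for any two random variables X and Y in L2:

$$ E[(X+Y)^2]^{\frac{1}{2}} \le E[X^2]^{\frac{1}{2}} + E[Y^2]^{\frac{1}{2}}. \tag{4.19} $$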

LEMMA 7 For any ϕ ∈ B(Rd), there exists some real number c0|0 such that at Step 0 (Ini-

tialization), condition (4.18) holds at time t = 0.

Proof:

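We assume that at time t = 0, we can sample exactly from the initial distribution D0|0. Then,

$$ \langle D^{N_0}_{0|0}, \varphi \rangle - \langle D_{0|0}, \varphi \rangle = \frac{1}{N_0} \sum_{i=1}^{N_0} \big( T_0\, \varphi(x_t^{(i)}) - \langle D_{0|0}, \varphi \rangle \big). \tag{4.20} $$

Let $\xi_i = T_0\, \varphi(x_t^{(i)}) - \langle D_{0|0}, \varphi \rangle$. Then E[ξi] = 0, and ξ1, . . . ,ξN0 is a sequence of independent integrable random variables. From the Marcinkiewicz and Zygmund inequalities (see, for example, p 498 [53]), there exists a constant c such that

$$ E\Big[ \Big| \sum_{i=1}^{N_0} \xi_i \Big|^2 \Big] \le c\, E\Big[ \sum_{i=1}^{N_0} \xi_i^2 \Big], \tag{4.21} $$

and hence

$$ E\Big[ \Big( \frac{1}{N_0} \sum_{i=1}^{N_0} \xi_i \Big)^2 \Big] \le c\, \frac{\|\xi\|^2}{N_0}, \tag{4.22} $$

where ‖ξ‖ is the supremum norm. Using the definition of ξ, we have

$$ \|\xi_i\| \le 2 T_0 \|\varphi\|, \tag{4.23} $$

since $\langle D_{0|0}, \varphi \rangle \le \|\varphi\| \int D_{0|0}(dx) = \|\varphi\| T_0$, by Hölder's inequality. Therefore, at time t = 0, there is a real number c0|0, dependent on the initial number of targets T0, such that

$$ E\big[ (\langle D^{N_0}_{0|0}, \varphi \rangle - \langle D_{0|0}, \varphi \rangle)^2 \big] \le c_{0|0} \frac{\|\varphi\|^2}{N_0}, \tag{4.24} $$

so condition (4.18) holds at the beginning of the algorithm.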
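The rate in Lemma 7 can be illustrated numerically. The sketch below is an assumption-laden toy case, not an experiment from the thesis: the initial PHD is taken to be T0 times the uniform density on [0, 1], the test function is ϕ(x) = x, and the mean square error is estimated by repeated sampling. For this choice the exact mean square error is T0²/(12 N0), so quadrupling the number of particles should divide the error by about four.

```python
import random


def empirical_inner_product(samples, total_mass, phi):
    """<D^N, phi> for N equally weighted particles, each of mass total_mass / N."""
    n = len(samples)
    return (total_mass / n) * sum(phi(x) for x in samples)


def mean_square_error(n_particles, total_mass, phi, exact, n_trials, rng):
    """Monte Carlo estimate of E[(<D^N, phi> - <D, phi>)^2]."""
    acc = 0.0
    for _ in range(n_trials):
        samples = [rng.random() for _ in range(n_particles)]  # draws from D_0 / T_0
        err = empirical_inner_product(samples, total_mass, phi) - exact
        acc += err * err
    return acc / n_trials


rng = random.Random(1)
T0 = 3.0                       # assumed expected initial number of targets
phi = lambda x: x              # bounded test function on [0, 1]
exact = T0 * 0.5               # <D_0, phi> for D_0 = T0 * Uniform(0, 1)
mse_100 = mean_square_error(100, T0, phi, exact, 2000, rng)
mse_400 = mean_square_error(400, T0, phi, exact, 2000, rng)
```

The estimates `mse_100` and `mse_400` should lie near 0.0075 and 0.001875 respectively, consistent with an error bound of order 1/N0.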

LEMMA 8 Assume that for any ϕ ∈ B(Rd), (4.15) holds. Then, after Step 1 (Prediction),

for any ϕ ∈ B(Rd), (4.16) holds for some constant dt and some real number ct|t−1 that

depends on the number of spawned targets.

Before proving this Lemma, some considerations are given below. The Sequential

Monte Carlo implementation involves sampling from two densities: qt , the density propagated

from the previous time step, and pt , the density for spontaneous birth. Suppose that

the spontaneous birth density is sampled by M particles and the propagated density by Nt

particles.

To prove convergence, we use the fact that the sum of two sequences converges weakly

to the sum of the limits of those sequences, which follows from a basic result of Real

Analysis on the convergence of sequences of real numbers. It then suffices to establish

weak convergence of the two sequences independently.

We have assumed that we can sample exactly from the spontaneous birth density γt ,

so using the same argument for showing that the initial distribution is bounded (Lemma

7), and using the assumption that the importance ratio ‖γt/pt‖ is bounded, then there is a

constant dt such that

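$$ E\big[ (\langle \gamma^{M}_t, \varphi \rangle - \langle \gamma_t, \varphi \rangle)^2 \big] \le d_t \frac{\|\varphi\|^2}{M}. \tag{4.25} $$

Define D′t|t−1 to be Dt|t−1 − γt. We now show that the error in approximating D′t|t−1(x), the density propagated from the previous time step, is bounded.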

Proof:

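By the triangle inequality, we have

\begin{align}
& |\langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle - \langle D'_{t|t-1}, \varphi \rangle| \tag{4.26} \\
& \le |\langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle| + |\langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle - \langle D_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle|. \tag{4.27}
\end{align}

Let Gt−1 be the σ-algebra generated by the particles $\{x_{t-1}^{(i)}\}$. Then

$$ E\big[ \langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle \mid G_{t-1} \big] = \langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle, \tag{4.28} $$

and

\begin{align}
& E\Big[ \big( \langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle - E\big[ \langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle \mid G_{t-1} \big] \big)^2 \mid G_{t-1} \Big] \tag{4.29} \\
& = E\Big[ \big( \langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle \big)^2 \mid G_{t-1} \Big] \tag{4.30} \\
& = E\Big[ \langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle \big( \langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle \big) \mid G_{t-1} \Big] \tag{4.31} \\
& \quad - \langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle\, E\Big[ \langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle \mid G_{t-1} \Big]. \notag
\end{align}

The second term in (4.31) is zero, so the above simplifies to

$$ E\big[ \langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle^2 \mid G_{t-1} \big] - \langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle^2. \tag{4.32} $$

Writing out this as a sum, and using the independence of the particles, (4.32) equals

\begin{align}
& \Big( \frac{T_{t-1}}{N_{t-1}} \Big)^2 \sum_{i=1}^{N_{t-1}} \Bigg( E\Bigg[ \Bigg( \varphi(x_t^{(i)}) \frac{\phi_{t|t-1}(x_t^{(i)}, x_{t-1}^{(i)})}{q_t(x_t^{(i)} \mid x_{t-1}^{(i)}, Z_t)} \Bigg)^2 \,\Big|\, G_{t-1} \Bigg] - \big( (\phi_{t|t-1}\varphi)(x_{t-1}^{(i)}) \big)^2 \Bigg) \tag{4.33} \\
& \le \frac{T_{t-1}^2}{N_{t-1}} \|\varphi\|^2 \Big( \Big\| \frac{\phi_{t|t-1}}{q_t} \Big\|^2 + \|\phi_{t|t-1}\|^2 \Big). \tag{4.34}
\end{align}

Using Minkowski's inequality, we obtain

\begin{align}
& E\big[ (\langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle - \langle D'_{t|t-1}, \varphi \rangle)^2 \big]^{\frac{1}{2}} \tag{4.35} \\
& \le E\big[ (\langle D^{N_{t-1}}_{t|t-1}, \varphi \rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle)^2 \big]^{\frac{1}{2}} + E\big[ (\langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle - \langle D_{t-1|t-1}, \phi_{t|t-1}\varphi \rangle)^2 \big]^{\frac{1}{2}} \tag{4.36} \\
& \le \frac{\|\varphi\|}{\sqrt{N_{t-1}}} \Big( T_{t-1} \Big( \Big\| \frac{\phi_{t|t-1}}{q_t} \Big\|^2 + \|\phi_{t|t-1}\|^2 \Big)^{\frac{1}{2}} + \sqrt{c_{t-1|t-1}} \Big). \tag{4.37}
\end{align}

The transition kernel φt|t−1 is bounded by the single-target transition, ft|t−1, and the PHD of spawned targets, bt|t−1:

$$ \phi_{t|t-1}(x, x_{t-1}) = P_S(x_{t-1})\, f_{t|t-1}(x \mid x_{t-1}) + b_{t|t-1}(x \mid x_{t-1}). \tag{4.38} $$

Therefore ‖φt|t−1ϕ‖ ≤ 1 + Tt|t−1, where Tt|t−1 is the number of spawned targets. By assumption, the ratio ‖φt|t−1/qt‖ is bounded by some constant B2, and so the lemma is proved:

$$ E\big[ (\langle D^{N_{t-1},M}_{t|t-1}, \varphi \rangle - \langle D_{t|t-1}, \varphi \rangle)^2 \big] \le \|\varphi\|^2 \Big( \frac{c_{t|t-1}}{N_{t-1}} + \frac{d_t}{M} \Big), \tag{4.39} $$

where $c_{t|t-1} = \Big( T_{t-1} \big( B_2^2 + (1 + T_{t|t-1})^2 \big)^{\frac{1}{2}} + \sqrt{c_{t-1|t-1}} \Big)^2$.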

LEMMA 9 Assume that for any ϕ ∈ B(Rd), (4.16) holds. Then, after Step 2 (Data Update),

for any ϕ ∈ B(Rd), (4.17) holds for some real number ct|t that depends on the number of

targets.

Proof:

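From the definitions (4.11) and (4.12), and writing ν for the missed-detection factor (1 − pD) of (4.7), we have:

\begin{align}
& \langle D^{R_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle \tag{4.40} \\
& = \Big\langle \Big[ \nu + \sum_{z \in Z_t} \frac{\psi_{t,z}}{\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle} \Big] D^{N_{t-1},M}_{t|t-1}, \varphi \Big\rangle - \Big\langle \Big[ \nu + \sum_{z \in Z_t} \frac{\psi_{t,z}}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \Big] D_{t|t-1}, \varphi \Big\rangle \notag
\end{align}

(by linearity)

$$ = \langle D^{N_{t-1},M}_{t|t-1}, \varphi\nu \rangle - \langle D_{t|t-1}, \varphi\nu \rangle + \sum_{z \in Z_t} \Big( \frac{\langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle} - \frac{\langle D_{t|t-1}, \varphi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \Big) \tag{4.41} $$

(adding and subtracting a new term)

\begin{align}
& = \langle D^{N_{t-1},M}_{t|t-1}, \varphi\nu \rangle - \langle D_{t|t-1}, \varphi\nu \rangle \tag{4.42} \\
& \quad + \sum_{z \in Z_t} \Big( \Big[ \frac{\langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle} - \frac{\langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \Big] + \Big[ \frac{\langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} - \frac{\langle D_{t|t-1}, \varphi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \Big] \Big). \notag
\end{align}

The modulus of the first bracket in the summation from (4.42) is:

\begin{align}
& \Big| \frac{\langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle} - \frac{\langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \Big| \tag{4.43} \\
& = \frac{\big| \langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle (\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle) - \langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle (\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle) \big|}{(\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle)(\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle)} \tag{4.44} \\
& \le \frac{\big| \langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle (\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle) - \langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle (\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle) \big|}{\langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle \langle D_{t|t-1}, \psi_{t,z} \rangle} \tag{4.45} \\
& \le \frac{\|\varphi\|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \big| \langle D_{t|t-1}, \psi_{t,z} \rangle - \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle \big|. \tag{4.46}
\end{align}

The second bracket in the summation from (4.42) is

\begin{align}
& \frac{\big| \langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle - \langle D_{t|t-1}, \varphi\psi_{t,z} \rangle \big|}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \tag{4.47} \\
& \le \frac{\big| \langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle - \langle D_{t|t-1}, \varphi\psi_{t,z} \rangle \big|}{\langle D_{t|t-1}, \psi_{t,z} \rangle}. \tag{4.48}
\end{align}

Combining these, we get

\begin{align}
& \big| \langle D^{R_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle \big| \tag{4.49} \\
& \le \big| \langle D^{N_{t-1},M}_{t|t-1}, \varphi\nu \rangle - \langle D_{t|t-1}, \varphi\nu \rangle \big| \tag{4.50} \\
& \quad + \sum_{z \in Z_t} \Big( \frac{\|\varphi\|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \big| \langle D_{t|t-1}, \psi_{t,z} \rangle - \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle \big| + \frac{\big| \langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle - \langle D_{t|t-1}, \varphi\psi_{t,z} \rangle \big|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \Big). \notag
\end{align}

From Minkowski's inequality,

\begin{align}
E\big[ (\langle D^{R_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle)^2 \big]^{\frac{1}{2}} & \le E\big[ (\langle D^{N_{t-1},M}_{t|t-1}, \varphi\nu \rangle - \langle D_{t|t-1}, \varphi\nu \rangle)^2 \big]^{\frac{1}{2}} \tag{4.51} \\
& \quad + \sum_{z \in Z_t} \Big( \frac{\|\varphi\|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} E\big[ (\langle D_{t|t-1}, \psi_{t,z} \rangle - \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle)^2 \big]^{\frac{1}{2}} + \frac{E\big[ (\langle D^{N_{t-1},M}_{t|t-1}, \varphi\psi_{t,z} \rangle - \langle D_{t|t-1}, \varphi\psi_{t,z} \rangle)^2 \big]^{\frac{1}{2}}}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \Big) \notag \\
& \le \frac{\sqrt{c_{t|t-1}}\, \|\varphi\| \|\nu\|}{\sqrt{R_t}} + \sum_{z \in Z_t} \frac{2 \|\varphi\| \|\psi_{t,z}\| \sqrt{c_{t|t-1}}}{\langle D_{t|t-1}, \psi_{t,z} \rangle \sqrt{R_t}} \tag{4.52} \\
& \le \frac{\sqrt{c_{t|t-1}}\, \|\varphi\|}{\sqrt{R_t}} \Big( 1 + \sum_{z \in Z_t} \frac{2 \|\psi_{t,z}\|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \Big). \tag{4.53}
\end{align}

ψt,z is a bounded function, since g is bounded by assumption. Lemma 9 follows from this where $c_{t|t} = c_{t|t-1} \Big( 1 + \sum_{z \in Z_t} \frac{2 \|\psi_{t,z}\|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \Big)^2$.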

LEMMA 10 Assume that ∀ϕ ∈ B(Rd), (4.17) holds. Then after Step 3 (Resampling), there

exists a real number ct|t , that depends on the number of targets, such that ∀ϕ ∈ B(Rd),

(4.18) holds.

Proof:

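Adding and subtracting the term $\langle D^{R_t}_{t|t}, \varphi \rangle$ from the Data Update step, we have

$$ \langle D^{N_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle = \big( \langle D^{N_t}_{t|t}, \varphi \rangle - \langle D^{R_t}_{t|t}, \varphi \rangle \big) + \big( \langle D^{R_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle \big), \tag{4.54} $$

so by Minkowski's inequality,

$$ E\big[ (\langle D^{N_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle)^2 \big]^{\frac{1}{2}} \le E\big[ (\langle D^{N_t}_{t|t}, \varphi \rangle - \langle D^{R_t}_{t|t}, \varphi \rangle)^2 \big]^{\frac{1}{2}} + E\big[ (\langle D^{R_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle)^2 \big]^{\frac{1}{2}}. \tag{4.55} $$

Let Ft be the σ-algebra generated by $\{x^{(i)},\ i = 1, \ldots, R_t\}$. Then the expectation of the inner product $\langle D^{N_t}_{t|t}, \varphi \rangle$ conditioned on Ft is

$$ E\big[ \langle D^{N_t}_{t|t}, \varphi \rangle \mid F_t \big] = \langle D^{R_t}_{t|t}, \varphi \rangle. \tag{4.56} $$

Hence there exists a number c such that

$$ E\big[ (\langle D^{N_t}_{t|t}, \varphi \rangle - \langle D^{R_t}_{t|t}, \varphi \rangle)^2 \mid F_t \big] \le c\, \frac{\|\varphi\|^2}{N_t}. \tag{4.57} $$

This follows from the assumption that the resampling strategy is unbiased. Using Minkowski's inequality, as above, we have

$$ E\big[ (\langle D^{N_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle)^2 \big]^{\frac{1}{2}} \le \big( \sqrt{c} + \sqrt{c_{t|t}} \big) \frac{\|\varphi\|}{\sqrt{N_t}}. \tag{4.58} $$

Lemma 10 is then proved with $c_{t|t} = (\sqrt{c} + \sqrt{c_{t|t}})^2$, where the ct|t on the right-hand side is the constant from (4.17).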

THEOREM 6 ∀t ≥ 0, there is a real number ct|t , that depends on the number of new targets

but is independent of the number of particles, such that ∀ϕ ∈ B(Rd), (4.18) holds.

Proof:

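Combining the above proofs, we have shown that ∀t ≥ 0, ∃ct|t independent of Nt , but dependent on the number of targets, such that ∀ϕ ∈ B(Rd):

$$ E\big[ (\langle D^{N_t}_{t|t}, \varphi \rangle - \langle D_{t|t}, \varphi \rangle)^2 \big] \le c_{t|t} \frac{\|\varphi\|^2}{N_t}. \tag{4.59} $$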

4.3.3 Convergence of Empirical Measures

To prove that an empirical distribution converges to its true distribution, we need to have

a notion of convergence for measures. This type of convergence is called weak conver-

gence, which is fundamental to the study of probability and statistics. With this type of

convergence, the values of the random variables are not important; it is the probabilities

with which they assume those values that matter. Thus, the probability distributions of the

random variables will be converging, not the values themselves [57].

Let µN and µ be probability measures on Rd. Then, the sequence µN converges weakly to µ if $\int f(x)\, \mu_N(dx)$ converges to $\int f(x)\, \mu(dx)$ for each real-valued continuous and bounded function f on Rd.

This definition can be extended to more general measures, not just probability distribu-

tions. In our case, we will be considering the PHD measure, where the notion still applies.

(Further details on weak convergence for measures can be obtained from Billingsley [63].)

The empirical measures considered here are the particles that approximate the true mea-

sures, where Nt represents the number of particles. Let Cb(Rd) be the set of real-valued

continuous bounded functions on Rd . If (µN) is a sequence of measures, then µN converges

weakly to µ if:

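$$ \lim_{N \to \infty} \langle \mu_N, \varphi \rangle = \langle \mu, \varphi \rangle, \quad \forall \varphi \in C_b(\mathbb{R}^d). \tag{4.60} $$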

This section shows that after each stage of the PHD Filter algorithm, the measures

converge weakly. We proceed by first showing that equation (4.61) is satisfied in the ini-

tialisation step. Then we show that if equation (4.61) holds, then after the prediction step

equation (4.62) holds. If equation (4.62) holds, then equation (4.63) holds after the up-

date step. Finally, we show that if equation (4.63) holds, then equation (4.64) holds after

resampling.

limNt−1→∞

DNt−1t−1|t−1 = Dt−1|t−1 a.s. (4.61)

limNt−1,M→∞

DNt−1,Mt|t−1 = Dt|t−1 a.s. (4.62)

limRt→∞

DRtt|t = Dt|t a.s. (4.63)

limNt→∞

DNtt|t = Dt|t a.s. (4.64)

(a.s. stands for almost surely, i.e. true for all values outside the null set.)

We assume that at time t = 0, we can sample exactly from the initial distribution D0|0.

Then, from the Glivenko-Cantelli Theorem [57], which states that empirical distributions

converge to their actual distributions almost surely,

$$\lim_{N_0\to\infty} D^{N_0}_{0|0} = D_{0|0} \quad \text{a.s.} \qquad (4.65)$$
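The initialisation step can be illustrated numerically. The sketch below is not part of the proof: it assumes, purely for illustration, that the initial measure is a standard normal probability distribution (the PHD itself has total mass equal to the expected number of targets) and that the test function is ϕ(x) = cos(x), for which the exact integral is exp(−1/2).

```python
import math
import random

def empirical_integral(samples, phi):
    """<mu_N, phi> for the empirical measure mu_N = (1/N) * sum_i delta_{x_i}."""
    return sum(phi(x) for x in samples) / len(samples)

def phi(x):
    # Bounded continuous test function (an arbitrary illustrative choice).
    return math.cos(x)

random.seed(0)
# Exact value of <mu, phi> when mu is N(0, 1): E[cos X] = exp(-1/2).
exact = math.exp(-0.5)

errors = {}
for n in (100, 10_000, 200_000):
    samples = [random.gauss(0.0, 1.0) for _ in range(n)]
    errors[n] = abs(empirical_integral(samples, phi) - exact)
    print(n, errors[n])
```

The printed errors should shrink roughly like the inverse square root of the number of samples, which is the Monte Carlo behaviour underlying (4.65).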

LEMMA 11 Suppose (4.61) holds, then after Step 1 (Prediction), (4.62) holds.

Proof:

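Define $D'_{t|t-1}$ to be $D_{t|t-1} - \gamma_t$. It suffices to prove that

$$\lim_{M\to\infty} \gamma^{M}_t = \gamma_t \quad \text{a.s.} \qquad (4.66)$$

$$\lim_{N_{t-1}\to\infty} D^{N_{t-1}}_{t|t-1} = D'_{t|t-1} \quad \text{a.s.} \qquad (4.67)$$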

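We have assumed that we sample $M$ i.i.d. particles from $\gamma_t$ for the first of these, so by the Glivenko-Cantelli Theorem, (4.66) is true. Let $\mathcal{G}_{t-1}$ be the σ-algebra generated by $\{x^{(i)}_{0:t-1}\}_{i=1}^{N_{t-1}}$, then

$$E\left[\langle D^{N_{t-1}}_{t|t-1},\varphi\rangle \,\middle|\, \mathcal{G}_{t-1}\right] = \langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi\rangle. \qquad (4.68)$$

Since $E[\varphi(x^{(i)}_t)\,|\,\mathcal{G}_{t-1}] = (\phi_{t|t-1}\varphi)(x^{(i)}_{t-1})$ and $\{x^{(i)}_{0:t}\}_{i=1}^{N_{t-1}}$ are i.i.d. random variables conditional on $\mathcal{G}_{t-1}$, we have

$$E\left[\left(\langle D^{N_{t-1}}_{t|t-1},\varphi\rangle - E\left[\langle D^{N_{t-1}}_{t|t-1},\varphi\rangle \,\middle|\, \mathcal{G}_{t-1}\right]\right)^4 \,\middle|\, \mathcal{G}_{t-1}\right] \qquad (4.69)$$

$$= E\left[\left(\frac{T_{t-1}}{N_{t-1}} \sum_{i=1}^{N_{t-1}} \left(\varphi(x^{(i)}_t)\,\frac{\phi_{t|t-1}(x^{(i)}_t, x^{(i)}_{t-1})}{q_t(x^{(i)}_t\,|\,x^{(i)}_{t-1}, Z_t)} - (\phi_{t|t-1}\varphi)(x^{(i)}_{t-1})\right)\right)^4 \,\middle|\, \mathcal{G}_{t-1}\right]. \qquad (4.70)$$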

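For notational simplicity, define the function $\Phi$ as

$$\Phi(x^{(i)}_t, x^{(i)}_{t-1}) = \varphi(x^{(i)}_t)\,\frac{\phi_{t|t-1}(x^{(i)}_t, x^{(i)}_{t-1})}{q_t(x^{(i)}_t\,|\,x^{(i)}_{t-1}, Z_t)} - (\phi_{t|t-1}\varphi)(x^{(i)}_{t-1}), \qquad (4.71)$$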

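which has an expectation of zero. Then, writing $\Phi_i$ for $\Phi(x^{(i)}_t, x^{(i)}_{t-1})$, expanding the above quartic gives

$$\left(\frac{T_{t-1}}{N_{t-1}}\right)^4 \Bigg( \sum_{i=1}^{N_{t-1}} E\left[\Phi_i^4 \,\middle|\, \mathcal{G}_{t-1}\right] + \sum_{i \neq j}^{N_{t-1}} E\left[\Phi_i^3 \Phi_j + \Phi_i^2 \Phi_j^2 \,\middle|\, \mathcal{G}_{t-1}\right] + \sum_{i,j,k\ \text{distinct}}^{N_{t-1}} E\left[\Phi_i^2 \Phi_j \,\middle|\, \mathcal{G}_{t-1}\right] E\left[\Phi_k \,\middle|\, \mathcal{G}_{t-1}\right] + \sum_{i,j,k,l\ \text{distinct}}^{N_{t-1}} E\left[\Phi_i \Phi_j \Phi_k \Phi_l \,\middle|\, \mathcal{G}_{t-1}\right] \Bigg) \qquad (4.72)$$

$$= \left(\frac{T_{t-1}}{N_{t-1}}\right)^4 \Bigg( \sum_{i=1}^{N_{t-1}} E\left[\Phi_i^4 \,\middle|\, \mathcal{G}_{t-1}\right] + \sum_{i \neq j}^{N_{t-1}} E\left[\Phi_i^2 \Phi_j^2 \,\middle|\, \mathcal{G}_{t-1}\right] \Bigg), \qquad (4.73)$$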

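where the last equality holds because the $\Phi(x^{(i)}_t, x^{(i)}_{t-1})$ are mutually independent random variables with mean zero. Taking expectations of (4.70) and (4.71), there exists a constant $C$ such that

$$E\left[\left(\langle D^{N_{t-1}}_{t|t-1},\varphi\rangle - E\left[\langle D^{N_{t-1}}_{t|t-1},\varphi\rangle \,\middle|\, \mathcal{G}_{t-1}\right]\right)^4\right] \le C\,T^4_{t-1}\left(B_2^4 + (1+T_{t|t-1})^4\right)\frac{\|\varphi\|^4}{N^2_{t-1}}, \qquad (4.74)$$

since there are $O(N^2_{t-1})$ terms bounded by $C\,T^4_{t-1}(B_2^4 + (1+T_{t|t-1})^4)\|\varphi\|^4$, following a similar argument as in Lemma 7.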

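It then follows that

$$E\left[\left(\langle D^{N_{t-1}}_{t|t-1},\varphi\rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi\rangle\right)^4\right] \le C\,T^4_{t-1}\left(B_2^4 + (1+T_{t|t-1})^4\right)\frac{\|\varphi\|^4}{N^2_{t-1}}, \qquad (4.75)$$

and hence

$$\lim_{N_{t-1}\to\infty} \left(\langle D^{N_{t-1}}_{t|t-1},\varphi\rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \phi_{t|t-1}\varphi\rangle\right) = 0. \qquad (4.76)$$

Using the fact that the limit of a sum is the sum of the limits, we have

$$\lim_{N_{t-1},M\to\infty} \langle D^{N_{t-1},M}_{t|t-1},\varphi\rangle = \lim_{N_{t-1}\to\infty} \langle D^{N_{t-1}}_{t|t-1},\varphi\rangle + \lim_{M\to\infty} \langle \gamma^{M}_t,\varphi\rangle \qquad (4.77)$$

$$= \langle D'_{t|t-1},\varphi\rangle + \langle \gamma_t,\varphi\rangle = \langle D_{t|t-1},\varphi\rangle. \qquad (4.78)$$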

LEMMA 12 Suppose (4.62) holds, then after Step 2 (Data Update), (4.63) holds.

Proof:

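By definition,

$$\langle D^{R_t}_{t|t},\varphi\rangle = \langle D^{N_{t-1}}_{t|t-1},\varphi\nu\rangle + \sum_{z\in Z_t} \frac{\langle D^{N_{t-1}}_{t|t-1},\varphi\psi_{t,z}\rangle}{\kappa_{t,z} + \langle D^{N_{t-1}}_{t|t-1},\psi_{t,z}\rangle}. \qquad (4.79)$$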

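By continuity and Lemma 11, we have

$$\lim_{N_{t-1}\to\infty} \langle D^{N_{t-1}}_{t|t-1},\varphi\nu\rangle = \langle D_{t|t-1},\varphi\nu\rangle, \qquad (4.80)$$

$$\lim_{N_{t-1},M\to\infty} \langle D^{N_{t-1},M}_{t|t-1},\varphi\psi_{t,z}\rangle = \langle D_{t|t-1},\varphi\psi_{t,z}\rangle, \qquad (4.81)$$

$$\lim_{N_{t-1}\to\infty} \langle D^{N_{t-1}}_{t|t-1},\psi_{t,z}\rangle = \langle D_{t|t-1},\psi_{t,z}\rangle. \qquad (4.82)$$

Hence,

$$\lim_{R_t\to\infty} \langle D^{R_t}_{t|t},\varphi\rangle = \langle D_{t|t-1},\varphi\nu\rangle + \sum_{z\in Z_t} \frac{\langle D_{t|t-1},\varphi\psi_{t,z}\rangle}{\kappa_{t,z} + \langle D_{t|t-1},\psi_{t,z}\rangle} = \langle D_{t|t},\varphi\rangle \qquad (4.83)$$

and therefore

$$\lim_{R_t\to\infty} D^{R_t}_{t|t} = D_{t|t} \quad \text{a.s.} \qquad (4.84)$$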

LEMMA 13 Suppose (4.63) holds, then after Step 3 (Resampling), (4.64) holds.

Proof:

Let $P^{(i)}_t$ be the number of times that particle $x^{(i)}_t$ is resampled and let $Q^{(i)}_t = P^{(i)}_t \cdot R_t / N_t$.

Then, from our assumption, we have

|〈DNtt|t ,ϕ〉−〈DRt

t|t,ϕ〉|p]

∑i=1

|(Q(i)t −Rtω

(i)t )ϕ(x(i)

t )|)p]

≤ C‖ϕ‖p

Rt1+ε , (4.85)

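where $\varepsilon = p - \alpha - 1 \ge 0$. Hence,

$$\lim_{N_t\to\infty} \left(\langle D^{N_t}_{t|t},\varphi\rangle - \langle D^{R_t}_{t|t},\varphi\rangle\right) = 0 \quad \text{a.s.} \qquad (4.86)$$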

THEOREM 7 For all t ≥ 0, (4.64) holds.

Proof:

The initialisation result (4.65) together with Lemmas 11 to 13 shows, by induction on t, that for all t ≥ 0, $\lim_{N_t\to\infty} D^{N_t}_{t|t} = D_{t|t}$ a.s.

4.4 Conclusions

It has been shown, under the assumption that ϕ(x) is bounded above, that it is possible

to find bounds for the mean square error of the PHD Particle filter at each stage of the

algorithm. These depend on the number of targets introduced at each iteration, but if the

order of the number of targets is much lower than the order of the number of particles, i.e.

T << N, then the error tends to zero as N tends to infinity.

It has also been shown, under the additional assumptions that the transition kernel sat-

isfies the Feller property and the likelihood function is a continuous bounded function, that

the empirical distribution, represented by the particles, converges almost surely to the true

PHD distribution. These results are not dependent on the state dimension. The data update

equation assumes a Poisson model, and hence is only an approximation. The clutter param-

eter κt,z needs to be determined from the data and cannot be inferred from the recursion.

For the purpose of these proofs, it has been assumed that we know the correct density ct

and average number of Poisson clutter points λt .

The assumption that ϕ(x) is bounded above may be too restrictive for practitioners, and

the additional assumptions on the likelihood and transition kernel may be unrealistic for

practical applications; although applications of the PHD filter have demonstrated its poten-

tial for real-world applications. Despite these reservations, these results give justification to

the Sequential Monte Carlo implementation of the PHD filter given in chapter 3, and show

how the order of the mean squared error is reduced as the number of particles increases.

An implementation of the algorithm for estimating a variable number of targets in forward-looking sonar shall be demonstrated in chapter 6, and novel methods for introducing track continuity into the algorithm are presented in chapter 7, which will then be demonstrated on real sonar data in chapter 9.

Chapter 5

The Gaussian Mixture PHD Filter

5.1 Introduction

The second implementation of the PHD filter studied in this thesis, the Gaussian Mixture PHD filter, is described in this chapter. This closed-form solution to the PHD (Probability Hypothesis Density) filter was recently derived to provide a solution for multiple target tracking with linear/Gaussian models without the need for measurement-to-track data association [34, 33]. It was shown there that when the initial prior intensity of the random set of targets is a Gaussian mixture, the posterior intensity at any time step is also a Gaussian mixture.

This chapter demonstrates the uniform convergence of the errors for each of the stages

of the Gaussian Mixture PHD Filter [34, 33] using results already established for the particle

implementation of the PHD filter [21] and Wiener’s Theory of Approximation [51]. Error

bounds are provided in L1 for the pruning and merging stage of the algorithm, based on

those established for the Gaussian Sum filter [40].

Extensions of the Gaussian Mixture PHD filter proposed in [33], namely the Extended

Kalman PHD filter and the Unscented Kalman PHD filter, are also discussed. Conver-

gence results for the Extended Kalman PHD filter are given based on the Gaussian Sum

filter developed by Sorenson and Alspach [40], the L1 convergence properties discussed in

Anderson and Moore [44], and the fact that densities can be represented by a linear com-

bination of Gaussians in L1 [51]. Taken with the convergence results of the L1 error, the

Gaussian mixture approximation then converges to the true posterior intensity.

The results show that, under linear Gaussian assumptions of the dynamic model, the

Gaussian Mixture posterior intensity can approximate the true posterior intensity to any

desired degree of accuracy. In addition, error bounds have been established for the pruning

and merging stages of the algorithm which ensure that the accuracy of these stages can be

controlled.

5.2 The Gaussian Mixture PHD Filter Algorithm

In this section, we describe the linear-Gaussian multiple target model and the recently de-

veloped Gaussian Mixture PHD filter.

The multiple target model for the PHD recursion is described here. Each target follows

a linear Gaussian dynamical model,

ft|t−1(x|ζ) = N (x;Ft−1ζ,Qt−1), (5.1)

gt(z|x) = N (z;Htx,Rt), (5.2)

where N (·;m,P) denotes a Gaussian density with mean m and covariance P, Ft−1 is the

state transition matrix, Qt−1 is the process noise covariance, Ht is the observation matrix,

and Rt is the observation noise covariance.

The survival and detection probabilities are state independent, $p_{S,t}(x) = p_{S,t}$ and $p_{D,t}(x) = p_{D,t}$. The intensities of the spontaneous birth and spawned targets are Gaussian mixtures,

$$\gamma_t(x) = \sum_{i=1}^{J_{\gamma,t}} w^{(i)}_{\gamma,t}\,\mathcal{N}(x; m^{(i)}_{\gamma,t}, P^{(i)}_{\gamma,t}), \qquad (5.3)$$

$$\beta_{t|t-1}(x|\zeta) = \sum_{j=1}^{J_{\beta,t}} w^{(j)}_{\beta,t}\,\mathcal{N}(x; F^{(j)}_{\beta,t-1}\zeta + d^{(j)}_{\beta,t-1}, Q^{(j)}_{\beta,t-1}), \qquad (5.4)$$

where $J_{\gamma,t}$, $w^{(i)}_{\gamma,t}$, $m^{(i)}_{\gamma,t}$, $P^{(i)}_{\gamma,t}$, $i = 1,\dots,J_{\gamma,t}$, are given model parameters that determine the shape of the birth intensity. Similarly, $J_{\beta,t}$, $w^{(j)}_{\beta,t}$, $F^{(j)}_{\beta,t-1}$, $d^{(j)}_{\beta,t-1}$, and $Q^{(j)}_{\beta,t-1}$, $j = 1,\dots,J_{\beta,t}$, determine the shape of the spawning intensity of a target with previous state $\zeta$.

THEOREM 8 Suppose that each target follows a linear Gaussian dynamical model, the survival and detection probabilities are constant, the intensities of the birth and spawned targets are Gaussian mixtures, and the posterior intensity at time t − 1 is a Gaussian mixture of the form

$$D_{t-1|t-1}(x) = \sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1}\,\mathcal{N}(x; m^{(i)}_{t-1}, P^{(i)}_{t-1}). \qquad (5.5)$$

Then the predicted intensity to time t is also a Gaussian mixture, and is given by

$$D_{t|t-1}(x) = D_{S,t|t-1}(x) + D_{\beta,t|t-1}(x) + \gamma_t(x), \qquad (5.6)$$

where $D_{S,t|t-1}(x)$ is the PHD of existing targets, $D_{\beta,t|t-1}(x)$ is the PHD for spawned targets, and $\gamma_t(x)$ is the PHD of spontaneous birth targets. The density for existing targets, $D_{S,t|t-1}$, is determined from the linear Gaussian model using the Kalman prediction equations,

$$D_{S,t|t-1}(x) = p_{S,t} \sum_{j=1}^{J_{t-1}} w^{(j)}_{t-1}\,\mathcal{N}(x; m^{(j)}_{S,t|t-1}, P^{(j)}_{S,t|t-1}), \qquad (5.7)$$

$$m^{(j)}_{S,t|t-1} = F_{t-1}\, m^{(j)}_{t-1}, \qquad (5.8)$$

$$P^{(j)}_{S,t|t-1} = Q_{t-1} + F_{t-1} P^{(j)}_{t-1} F^{T}_{t-1}, \qquad (5.9)$$

and similarly for the spawned target density, $D_{\beta,t|t-1}$,

$$D_{\beta,t|t-1}(x) = \sum_{j=1}^{J_{t-1}} \sum_{\ell=1}^{J_{\beta,t}} w^{(j)}_{t-1} w^{(\ell)}_{\beta,t}\,\mathcal{N}(x; m^{(j,\ell)}_{\beta,t|t-1}, P^{(j,\ell)}_{\beta,t|t-1}), \qquad (5.11)$$

$$m^{(j,\ell)}_{\beta,t|t-1} = F^{(\ell)}_{\beta,t-1}\, m^{(j)}_{t-1} + d^{(\ell)}_{\beta,t-1}, \qquad (5.12)$$

$$P^{(j,\ell)}_{\beta,t|t-1} = Q^{(\ell)}_{\beta,t-1} + F^{(\ell)}_{\beta,t-1} P^{(j)}_{t-1} (F^{(\ell)}_{\beta,t-1})^{T}. \qquad (5.13)$$

Proof:

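In the PHD prediction equation,

$$D_{t|t-1}(x|\mathcal{F}_{t-1}) = \gamma_t(x) + \int \phi_{t|t-1}(x, x_{t-1})\, D_{t-1|t-1}(x_{t-1}|\mathcal{F}_{t-1})\, dx_{t-1}, \qquad (5.14)$$

we substitute the Gaussian state equation, constant probability of survival $p_{S,t}$, Gaussian mixture spontaneous birth and spawned target PHDs and the posterior at time $t-1$, writing $\zeta$ for the previous state $x_{t-1}$, to give

$$D_{t|t-1}(x|\mathcal{F}_{t-1}) = \sum_{i=1}^{J_{\gamma,t}} w^{(i)}_{\gamma,t}\,\mathcal{N}(x; m^{(i)}_{\gamma,t}, P^{(i)}_{\gamma,t}) + \int \Big[ p_{S,t}\,\mathcal{N}(x; F_{t-1}\zeta, Q_{t-1}) + \sum_{j=1}^{J_{\beta,t}} w^{(j)}_{\beta,t}\,\mathcal{N}(x; F^{(j)}_{\beta,t-1}\zeta + d^{(j)}_{\beta,t-1}, Q^{(j)}_{\beta,t-1}) \Big] \sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1}\,\mathcal{N}(\zeta; m^{(i)}_{t-1}, P^{(i)}_{t-1})\, d\zeta \qquad (5.15)$$

(expanding the integral gives)

$$= \sum_{i=1}^{J_{\gamma,t}} w^{(i)}_{\gamma,t}\,\mathcal{N}(x; m^{(i)}_{\gamma,t}, P^{(i)}_{\gamma,t}) \qquad (5.16)$$

$$+\; p_{S,t} \sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1} \int \mathcal{N}(x; F_{t-1}\zeta, Q_{t-1})\,\mathcal{N}(\zeta; m^{(i)}_{t-1}, P^{(i)}_{t-1})\, d\zeta \qquad (5.17)$$

$$+\; \sum_{j=1}^{J_{\beta,t}} \sum_{i=1}^{J_{t-1}} w^{(j)}_{\beta,t}\, w^{(i)}_{t-1} \int \mathcal{N}(x; F^{(j)}_{\beta,t-1}\zeta + d^{(j)}_{\beta,t-1}, Q^{(j)}_{\beta,t-1})\,\mathcal{N}(\zeta; m^{(i)}_{t-1}, P^{(i)}_{t-1})\, d\zeta. \qquad (5.18)$$

Clearly, the first summation is the PHD for spontaneous birth. The integrals in the second and third summations can be simplified using the following property for Gaussians,

$$\int \mathcal{N}(x; F\zeta + d, Q)\,\mathcal{N}(\zeta; m, P)\, d\zeta = \mathcal{N}(x; Fm + d, Q + FPF^{T}), \qquad (5.19)$$

which gives the required result.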
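The prediction step of Theorem 8 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the function name `gm_phd_predict`, the tuple layout of the birth and spawn terms, and the numerical values in the usage example are all hypothetical.

```python
import numpy as np

def gm_phd_predict(w, m, P, F, Q, p_s, birth, spawn=()):
    """Predict a Gaussian-mixture intensity one step, following (5.6)-(5.13).

    w, m, P : lists of weights, mean vectors and covariance matrices at t-1.
    birth   : list of (weight, mean, cov) triples describing gamma_t.
    spawn   : list of (weight, F_beta, d_beta, Q_beta) tuples for beta_{t|t-1}.
    """
    w_out, m_out, P_out = [], [], []
    # Surviving targets: Kalman prediction of each component, weight scaled by p_S.
    for wj, mj, Pj in zip(w, m, P):
        w_out.append(p_s * wj)
        m_out.append(F @ mj)
        P_out.append(Q + F @ Pj @ F.T)
    # Spawned targets: every prior component j is paired with every spawn term l.
    for wj, mj, Pj in zip(w, m, P):
        for wl, Fl, dl, Ql in spawn:
            w_out.append(wj * wl)
            m_out.append(Fl @ mj + dl)
            P_out.append(Ql + Fl @ Pj @ Fl.T)
    # Spontaneous births are appended unchanged.
    for wb, mb, Pb in birth:
        w_out.append(wb)
        m_out.append(mb)
        P_out.append(Pb)
    return w_out, m_out, P_out

# Illustrative constant-velocity model in one spatial dimension (assumed values).
F = np.array([[1.0, 1.0], [0.0, 1.0]])
Q = 0.1 * np.eye(2)
w0 = [0.9, 0.8]
m0 = [np.array([0.0, 1.0]), np.array([5.0, -1.0])]
P0 = [np.eye(2), 2.0 * np.eye(2)]
birth = [(0.1, np.zeros(2), 4.0 * np.eye(2))]
spawn = [(0.05, np.eye(2), np.zeros(2), 0.5 * np.eye(2))]

w1, m1, P1 = gm_phd_predict(w0, m0, P0, F, Q, 0.95, birth, spawn)
print(len(w1), sum(w1))
```

The number of predicted components is $J_{t-1}(1 + J_{\beta,t}) + J_{\gamma,t}$, and the sum of the weights is the predicted expected number of targets.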

THEOREM 9 Suppose that the above assumptions hold and that the predicted intensity to time t is a Gaussian mixture of the form

$$D_{t|t-1}(x) = \sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}). \qquad (5.20)$$

Then the posterior intensity at time t is also a Gaussian mixture, and is given by

$$D_{t|t}(x) = (1 - p_{D,t})\, D_{t|t-1}(x) + \sum_{z\in Z_t} \sum_{j=1}^{J_{t|t-1}} w^{(j)}_t(z)\,\mathcal{N}(x; m^{(j)}_{t|t}(z), P^{(j)}_{t|t}), \qquad (5.21)$$

where the weights are calculated according to the closed form PHD update equation,

$$w^{(j)}_t(z) = \frac{p_{D,t}\, w^{(j)}_{t|t-1}\,\mathcal{N}(z; H_t m^{(j)}_{t|t-1}, R_t + H_t P^{(j)}_{t|t-1} H^{T}_t)}{\kappa_t(z) + p_{D,t} \sum_{\ell=1}^{J_{t|t-1}} w^{(\ell)}_{t|t-1}\,\mathcal{N}(z; H_t m^{(\ell)}_{t|t-1}, R_t + H_t P^{(\ell)}_{t|t-1} H^{T}_t)}, \qquad (5.22)$$

and the mean and covariance are updated with the Kalman filter update equations,

$$m^{(j)}_{t|t}(z) = m^{(j)}_{t|t-1} + K^{(j)}_t (z - H_t m^{(j)}_{t|t-1}), \qquad (5.24)$$

$$P^{(j)}_{t|t} = [I - K^{(j)}_t H_t]\, P^{(j)}_{t|t-1}, \qquad (5.25)$$

$$K^{(j)}_t = P^{(j)}_{t|t-1} H^{T}_t (H_t P^{(j)}_{t|t-1} H^{T}_t + R_t)^{-1}. \qquad (5.26)$$

Proof:

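The PHD update equation is given by

$$D_{t|t}(x|\mathcal{F}_t) = \left[(1 - p_{D,t}) + \sum_{z\in Z_t} \frac{\psi_{t,z}(x)}{\kappa_t(z) + \langle D_{t|t-1},\psi_{t,z}\rangle}\right] D_{t|t-1}(x|\mathcal{F}_{t-1}). \qquad (5.27)$$

Substituting the Gaussian likelihood and Gaussian Mixture Prediction PHD, we get

$$D_{t|t}(x|\mathcal{F}_t) = (1 - p_{D,t}) \sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}) + \sum_{i=1}^{J_{t|t-1}} \sum_{z\in Z_t} \frac{p_{D,t}\, w^{(i)}_{t|t-1}\,\mathcal{N}(z; H_t x, R_t)}{\kappa_t(z) + p_{D,t} \sum_{\ell=1}^{J_{t|t-1}} w^{(\ell)}_{t|t-1} \int \mathcal{N}(\zeta; m^{(\ell)}_{t|t-1}, P^{(\ell)}_{t|t-1})\,\mathcal{N}(z; H_t\zeta, R_t)\, d\zeta}\,\mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}). \qquad (5.28)$$

Using the property of integrals of Gaussians in equation (5.19) and the following property of Gaussians,

$$\mathcal{N}(z; Hx, R)\,\mathcal{N}(x; m, P) = \mathcal{N}(z; Hm, R + HPH^{T})\,\mathcal{N}(x; m + K(z - Hm), (I - KH)P) \qquad (5.29)$$

(where $K$ is the Kalman gain, $K = PH^{T}(HPH^{T} + R)^{-1}$), the result follows.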
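The update of Theorem 9 admits a similarly short sketch. As before, this is an illustration under assumptions rather than the thesis implementation: `gm_phd_update`, `gauss_pdf`, the calling convention for the clutter intensity `kappa`, and the numerical values are hypothetical.

```python
import numpy as np

def gauss_pdf(z, mean, cov):
    """Multivariate Gaussian density N(z; mean, cov)."""
    d = z - mean
    norm = np.sqrt((2.0 * np.pi) ** len(z) * np.linalg.det(cov))
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm)

def gm_phd_update(w, m, P, Z, H, R, p_d, kappa):
    """Update a predicted Gaussian mixture with the measurement set Z, per (5.21)-(5.26).

    kappa : function returning the clutter intensity kappa_t(z).
    """
    # Missed-detection terms keep their mean and covariance.
    w_out = [(1.0 - p_d) * wj for wj in w]
    m_out = list(m)
    P_out = list(P)
    for z in Z:
        wz, mz, Pz = [], [], []
        for wj, mj, Pj in zip(w, m, P):
            S = H @ Pj @ H.T + R                        # innovation covariance
            K = Pj @ H.T @ np.linalg.inv(S)             # Kalman gain (5.26)
            wz.append(p_d * wj * gauss_pdf(z, H @ mj, S))
            mz.append(mj + K @ (z - H @ mj))            # mean update (5.24)
            Pz.append((np.eye(len(mj)) - K @ H) @ Pj)   # covariance update (5.25)
        denom = kappa(z) + sum(wz)                      # denominator of (5.22)
        w_out += [v / denom for v in wz]
        m_out += mz
        P_out += Pz
    return w_out, m_out, P_out

# Position-only observation of a position/velocity state (assumed values).
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
w_pred = [1.0]
m_pred = [np.array([0.0, 1.0])]
P_pred = [np.eye(2)]
Z = [np.array([0.3])]

w_u, m_u, P_u = gm_phd_update(w_pred, m_pred, P_pred, Z, H, R,
                              p_d=0.9, kappa=lambda z: 0.0)
print(w_u)
```

With no clutter and a single predicted component, the detected term receives weight one regardless of the likelihood value, since the numerator and denominator of (5.22) coincide.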

5.3 Convergence of the Errors

This section shows the L1 convergence of the Gaussian mixture PHD filter; in other words, it proves that each time step of the PHD filter maintains an approximation error that converges to zero as the number of Gaussians in the mixture tends to infinity. This is achieved through the successive application of the triangle inequality and Hölder's inequality. Finally the observation update is shown to converge using an adaptation of the previous result on particle PHD convergence [21].

Results for the convergence properties of the Gaussian Mixture PHD Filter are now established. Convergence of the L1 error is first shown, $\lim_{J_t\to\infty} |\langle D^{J_t}_t - D_t, \varphi\rangle| = 0$, for any bounded measurable function $\varphi$, where $D^{J_t}_t$ is the Gaussian mixture approximation to $D_t$ with $J_t$ Gaussian components. The $\langle\cdot,\cdot\rangle$ notation defines the usual inner product

$$\langle D_t, \varphi\rangle = \int D_t(x_t|Z_t)\,\varphi(x_t)\, dx_t, \qquad (5.30)$$

and the operator notations $f_{t|t-1}\varphi$, $v f_{t|t-1}$ are defined by

$$(f_{t|t-1}\varphi)(x_{t-1}) = \int f_{t|t-1}(x_t|x_{t-1})\,\varphi(x_t)\, dx_t, \qquad (5.31)$$

$$(v f_{t|t-1})(x_t) = \int v(x_{t-1})\, f_{t|t-1}(x_t|x_{t-1})\, dx_{t-1}. \qquad (5.32)$$

Note that $\langle v f_{t|t-1}, \varphi\rangle = \langle v, f_{t|t-1}\varphi\rangle$. Also the PHD prediction equation (5.14) can be written

$$D_{t|t-1} = (p_{S,t} D_{t-1}) f_{t|t-1} + D_{t-1}\beta_{t|t-1} + \gamma_t. \qquad (5.33)$$

In the proofs, we use an instance of Hölder's Inequality (see, for example, p. 27 of [64]),

$$|\langle v, \varphi\rangle| \le \|v\|_1 \|\varphi\|_\infty. \qquad (5.34)$$

The data update equation assumes a Poisson model and, hence, is only an approxima-

tion. The clutter parameters need to be determined from the data and cannot be inferred

from the recursion. For the purpose of these proofs, it has been assumed that the correct

density ct and average number of Poisson clutter points λt are known.

THEOREM 10 Any density on $\mathbb{R}^d$ can be approximated as closely as desired in L1 by a linear combination of Gaussian densities,

$$D(x) = \lim_{n\to\infty} \sum_{i=1}^{n} w^{(i)}\,\mathcal{N}(x; \mu_i, P_i). \qquad (5.35)$$

This result is due to Wiener's theorem on approximation [51].

This means that given any $\varepsilon > 0$, a positive integer $N$ can be found such that

$$\int \Big| D(x) - \sum_{i=1}^{n} w^{(i)}\,\mathcal{N}(x; \mu_i, P_i) \Big|\, dx \le \varepsilon, \qquad (5.36)$$

for $n \ge N$. This result shall be used to establish bounds for the error in the Gaussian approximation to the posterior intensity.

5.3.1 Initialisation

It is assumed that the initial intensity is known. By Theorem 10, this initial intensity can be approximated to any arbitrary degree of accuracy, so that, for any bounded measurable function $\varphi$ and any given $\varepsilon_0 > 0$, there is a positive integer $J$ such that

$$|\langle D_0 - D^{J_0}_0, \varphi\rangle| \le \varepsilon_0 \|\varphi\|_\infty, \qquad (5.37)$$

for any $J_0 > J$, using Hölder's Inequality where

$$\|D_0 - D^{J_0}_0\|_1 \le \varepsilon_0. \qquad (5.38)$$

The notation $v^{J}$ is used to denote the Gaussian mixture approximation to the density $v$, where $J$ is the number of Gaussians in the mixture.

5.3.2 Prediction Equation

Let us assume that the approximation of the posterior intensity, $D^{J_{t-1}}_{t-1}$, by a sum of Gaussians converges uniformly to the true posterior intensity $D_{t-1}$. Then, given any $\varepsilon_{t-1} > 0$, an integer $J$ can be found such that

$$|\langle D_{t-1} - D^{J_{t-1}}_{t-1}, \varphi\rangle| \le \varepsilon_{t-1}\|\varphi\|_\infty, \qquad (5.39)$$

for $J_{t-1} \ge J$, using Hölder's Inequality.

LEMMA 14 After the prediction step, there exist real numbers $b_{t|t-1}$, $d_t$ and $e_{t|t-1}$ such that

$$|\langle D^{J_{t|t-1}}_{t|t-1} - D_{t|t-1}, \varphi\rangle| \le (b_{t|t-1}\varepsilon_{t-1} + d_t + e_{t|t-1})\|\varphi\|_\infty, \qquad (5.40)$$

where $d_t$ and $e_{t|t-1}$ depend on the spontaneous birth and spawned target models.

Expanding the prediction density using equation (5.33) and using the triangle inequality,

$$|\langle D^{J_{t|t-1}}_{t|t-1} - D_{t|t-1}, \varphi\rangle| \le |\langle (p_{S,t} D_{t-1} f_{t|t-1})^{J_{t-1}} - p_{S,t} D_{t-1} f_{t|t-1}, \varphi\rangle| + |\langle (D_{t-1}\beta_{t|t-1})^{J_{t-1}J_{\beta,t}} - D_{t-1}\beta_{t|t-1}, \varphi\rangle| + |\langle \gamma^{J_{\gamma,t}}_t - \gamma_t, \varphi\rangle|. \qquad (5.41)$$

Taking the first term on the right hand side, which concerns the predicted intensity for existing targets, adding and subtracting $\langle p_{S,t} D^{J_{t-1}}_{t-1}, f_{t|t-1}\varphi\rangle$, and using the triangle inequality again we get

$$|\langle (p_{S,t} D_{t-1} f_{t|t-1})^{J_{t-1}} - p_{S,t} D_{t-1} f_{t|t-1}, \varphi\rangle| \le |\langle (p_{S,t} D_{t-1} f_{t|t-1})^{J_{t-1}}, \varphi\rangle - \langle p_{S,t} D^{J_{t-1}}_{t-1}, f_{t|t-1}\varphi\rangle| + p_{S,t}\, |\langle D^{J_{t-1}}_{t-1} - D_{t-1}, f_{t|t-1}\varphi\rangle|. \qquad (5.42)$$

The first term on the right hand side is zero due to the linear Gaussian prediction model. Moreover,

$$(f_{t|t-1}\varphi)(x_{t-1}) = \int f_{t|t-1}(x_t|x_{t-1})\,\varphi(x_t)\, dx_t \le \|\varphi\|_\infty \int f_{t|t-1}(x_t|x_{t-1})\, dx_t = \|\varphi\|_\infty, \qquad (5.43)$$

where the last equality follows from the fact that $f_{t|t-1}(x_t|x_{t-1})$ is a transition density. Hence,

$$|\langle D^{J_{t-1}}_{S,t|t-1} - p_{S,t} D_{t-1} f_{t|t-1}, \varphi\rangle| \le p_{S,t}\|\varphi\|_\infty\, \varepsilon_{t-1}, \qquad (5.44)$$

for $J_{t-1} \ge J$.

Now consider the birth model; there exists a constant $d_t$ and integer $J$ such that

$$|\langle \gamma^{J_{\gamma,t}}_t - \gamma_t, \varphi\rangle| \le d_t\|\varphi\|_\infty, \qquad (5.45)$$

for $J_{\gamma,t} \ge J$, since we assume that we can model this exactly.

Finally, for the spawned target model, adding and subtracting $\langle D^{J_{t-1}}_{t-1}, \beta^{J_{\beta,t}}_{t|t-1}\varphi\rangle$ and applying the triangle inequality gives

$$|\langle (D_{t-1}\beta_{t|t-1})^{J_{t-1}J_{\beta,t}} - D_{t-1}\beta_{t|t-1}, \varphi\rangle| \le |\langle (D_{t-1}\beta_{t|t-1})^{J_{t-1}J_{\beta,t}}, \varphi\rangle - \langle D^{J_{t-1}}_{t-1}, \beta^{J_{\beta,t}}_{t|t-1}\varphi\rangle| + |\langle D^{J_{t-1}}_{t-1}, \beta^{J_{\beta,t}}_{t|t-1}\varphi\rangle - \langle D_{t-1}, \beta_{t|t-1}\varphi\rangle|. \qquad (5.46)$$

The first term on the right is zero due to the linear Gaussian spawned target model. Using an argument similar to the prediction for existing targets, equation (5.43), there exists a number $e_{t|t-1}$ such that the second term is less than or equal to $e_{t|t-1}\|\varphi\|_\infty$, for $J_{\beta,t}J_{t|t-1} \ge J$. This number, $e_{t|t-1}$, is dependent on the L1 norm of the spawned target intensity, $\|\beta_{t|t-1}\|_1$. The lemma is proved by combining the three results above and setting $b_{t|t-1} = p_{S,t}$.

5.3.3 Measurement Equation

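Let us assume that the approximation of the prediction intensity, $D^{J_{t|t-1}}_{t|t-1}$, by a sum of Gaussians converges uniformly to the true prediction intensity $D_{t|t-1}$. Then, using the same arguments as in (5.37), for any $\varepsilon_{t|t-1} > 0$ an integer $J$ can be found such that

$$|\langle D^{J_{t|t-1}}_{t|t-1} - D_{t|t-1}, \varphi\rangle| \le \varepsilon_{t|t-1}\|\varphi\|_\infty. \qquad (5.47)$$

LEMMA 15 After the measurement update step, there exists a real number $b_t$, dependent on the number of measurements, such that

$$|\langle D^{J_t}_t - D_t, \varphi\rangle| \le b_t\, \varepsilon_{t|t-1}\|\varphi\|_\infty. \qquad (5.48)$$

We assume that the predicted intensity $D_{t|t-1}$ is non-zero. This is a reasonable assumption as there would be no intensity to update when the measurements are received if it were zero. Using the convergence result for the particle PHD filter (Lemma 9), we have the inequality

$$|\langle D^{J_t}_t - D_t, \varphi\rangle| \le (1 - p_{D,t})\, |\langle D^{J_{t|t-1}}_{t|t-1} - D_{t|t-1}, \varphi\rangle| + \sum_{z\in Z_t} \frac{1}{\langle D_{t|t-1}, \psi_{t,z}\rangle} \Big[ \|\varphi\|_\infty \big|\langle D_{t|t-1} - D^{J_{t|t-1}}_{t|t-1}, \psi_{t,z}\rangle\big| + \big|\langle D^{J_{t|t-1}}_{t|t-1} - D_{t|t-1}, \varphi\psi_{t,z}\rangle\big| \Big]. \qquad (5.49)$$

Using the assumption, we find that it is less than or equal to

$$\varepsilon_{t|t-1}\|\varphi\|_\infty \left[(1 - p_{D,t}) + \sum_{z\in Z_t} \frac{2\|\psi_{t,z}\|_\infty}{\langle D_{t|t-1}, \psi_{t,z}\rangle}\right], \qquad (5.50)$$

so that the lemma is proved with

$$b_t = (1 - p_{D,t}) + \sum_{z\in Z_t} \frac{2\|\psi_{t,z}\|_\infty}{\langle D_{t|t-1}, \psi_{t,z}\rangle}. \qquad (5.51)$$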

5.4 Pruning and Merging of Gaussian components

Since the number of Gaussians used to represent the Gaussian mixture increases at each

time step, methods are required to ensure that the complexity of the algorithm is controlled.

This is achieved through pruning, to eliminate the Gaussians with low weights, and merg-

ing, to combine Gaussians with similar means [33]. This section considers the errors

introduced in these stages and shows that they can be controlled. The first approximation

shows that a bound can be placed on the error introduced by eliminating terms with neg-

ligible weights, using a result for the Gaussian sum filter [40]. The second approximation

arises from the tendency of many terms to converge to the same result so that they can be

combined by adding their weights. When two terms are approximately equal, a bound on

the error can be introduced so that the errors introduced in the merging stage are within

tolerable limits, using another result for the Gaussian sum filter [43].

The number of Gaussian components used to represent the Gaussian mixture increases

without bound; at time t, the posterior intensity requires

(Jt−1(1+ Jβ,t)+ Jγ,t)(1+ |Zt|) = O(Jt−1|Zt |) (5.52)

Gaussian components, where |Zt | is the number of measurements at time t and O(·) rep-

resents the asymptotic complexity. Clearly this has implications for the complexity of the

algorithm, so it would be useful to reduce the total number of components required to

represent the PHD. To alleviate these problems, components with small weights, w(i)t , are

pruned, and components with similar means, m(i)t ≈ m( j)

t , are merged. The full procedure

is given in Vo and Ma [33]. It is shown here that bounds can be put on the L1 error when

these methods are used.
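The growth described by (5.52) is easy to tabulate. The short sketch below iterates the formula with no pruning or merging; the function name and the per-scan values (one spawn term, two birth terms, five measurements) are assumptions chosen only for illustration.

```python
def posterior_component_count(j_prev, j_beta, j_gamma, n_meas):
    """Number of Gaussians in D_t before pruning and merging, equation (5.52)."""
    return (j_prev * (1 + j_beta) + j_gamma) * (1 + n_meas)

# Illustrative values (assumed): 1 spawn term, 2 birth terms, 5 measurements per scan.
j = 4
history = [j]
for _ in range(4):
    j = posterior_component_count(j, j_beta=1, j_gamma=2, n_meas=5)
    history.append(j)
print(history)  # [4, 60, 732, 8796, 105564]
```

Even with these modest values the mixture exceeds one hundred thousand components after four scans, which is why the reduction steps are needed in practice.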

5.4.1 Pruning

The pruning stage of the algorithm allows us to drop terms with negligible weights. It is shown here that the error introduced in this stage can be bounded. Suppose that the posterior intensity at time t is given by the sum of Gaussians,

$$D_t(x) = \sum_{i=1}^{J_t} w^{(i)}_t\,\mathcal{N}(x; m^{(i)}_t, P^{(i)}_t). \qquad (5.53)$$

Assume, without loss of generality, that the components with indices $i = 1,\dots,N_P$ are those with weights, $w^{(i)}_t$, less than some specified threshold $\delta_1$. Prune these components and replace $D_t(x)$ by $D^{P}_t(x)$,

$$D^{P}_t(x) = \frac{\sum_{l=1}^{J_t} w^{(l)}_t}{\sum_{j=N_P+1}^{J_t} w^{(j)}_t} \sum_{i=N_P+1}^{J_t} w^{(i)}_t\,\mathcal{N}(x; m^{(i)}_t, P^{(i)}_t), \qquad (5.54)$$

where components with indices $i = N_P+1,\dots,J_t$ are the surviving components. The following bound can be established (from Sorenson and Alspach [40]),

$$\|D_t - D^{P}_t\|_1 \le 2\sum_{i=1}^{N_P} w^{(i)}_t \le 2N_P\delta_1. \qquad (5.55)$$

This shows that the L1 error can be made to fall within specified bounds for the pruning stage of the algorithm.
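The pruning bound (5.55) can be checked numerically on a small example. The sketch below uses a one-dimensional mixture and a Riemann sum for the L1 distance; the component values, the threshold and the helper names are assumptions made for illustration only.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def prune(components, delta1):
    """Drop components with weight below delta1 and rescale the survivors
    so that the total mass is unchanged, as in (5.54)."""
    kept = [c for c in components if c[0] >= delta1]
    total = sum(c[0] for c in components)
    kept_mass = sum(c[0] for c in kept)
    scale = total / kept_mass
    return [(w * scale, m, v) for (w, m, v) in kept]

def mixture(components, x):
    return sum(w * normal_pdf(x, m, v) for (w, m, v) in components)

def l1_distance(a, b, lo=-30.0, hi=30.0, n=6001):
    """Riemann-sum approximation of the L1 distance on [lo, hi]."""
    h = (hi - lo) / (n - 1)
    return h * sum(abs(mixture(a, lo + k * h) - mixture(b, lo + k * h))
                   for k in range(n))

# (weight, mean, variance) triples; the values are illustrative only.
D = [(0.9, 0.0, 1.0), (0.7, 6.0, 2.0), (0.004, -3.0, 1.0), (0.002, 8.0, 0.5)]
delta1 = 0.01
DP = prune(D, delta1)

pruned_mass = sum(w for (w, _, _) in D if w < delta1)
n_pruned = sum(1 for (w, _, _) in D if w < delta1)
err = l1_distance(D, DP)
print(err, 2 * pruned_mass, 2 * n_pruned * delta1)
```

The three printed numbers should appear in non-decreasing order, matching the two inequalities in (5.55).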

5.4.2 Merging

Several methods for Gaussian mixture reduction using merging techniques have been proposed for Gaussian sum filters. The first was derived by Alspach, who provided an L1 error bound for approximating two Gaussian components with the same covariance as follows [43]. Suppose that two components have the same covariance $P_t := P^{(1)}_t = P^{(2)}_t$ and similar means, $m^{(1)}_t \approx m^{(2)}_t$, so that for some threshold $\delta_2$,

$$(m^{(1)}_t - m^{(2)}_t)^{T} P^{-1}_t (m^{(1)}_t - m^{(2)}_t) \le (\delta_2)^2. \qquad (5.56)$$

Consider approximating $D_t(x)$ by

$$D^{M}_t(x) = \sum_{i=3}^{J_t} w^{(i)}_t\,\mathcal{N}(x; m^{(i)}_t, P^{(i)}_t) + w^{(l)}_t\,\mathcal{N}(x; m^{(l)}_t, P_t), \qquad (5.57)$$

where the weight and mean of the new component are given by

$$w^{(l)}_t = w^{(1)}_t + w^{(2)}_t, \qquad (5.58)$$

$$m^{(l)}_t = \frac{1}{w^{(l)}_t}\left(w^{(1)}_t m^{(1)}_t + w^{(2)}_t m^{(2)}_t\right), \qquad (5.59)$$

then the following bound holds,

$$\|D_t - D^{M}_t\|_1 \le \frac{2\, w^{(1)}_t w^{(2)}_t}{w^{(1)}_t + w^{(2)}_t}\,\delta_2. \qquad (5.60)$$

Note that as the covariance decreases, the distance between the terms must also decrease to retain the same bound. Unfortunately, this requires that both of the covariance matrices are the same, which may be an unrealistic assumption.
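As a numerical sanity check on the merging bound (5.60), the following sketch merges two one-dimensional components with a common variance and compares the L1 error, approximated by a Riemann sum, against the bound. The weights, means and variance are assumed values and the helper names are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def merge_equal_cov(w1, m1, w2, m2):
    """Weight and mean of the merged component, equations (5.58)-(5.59)."""
    w = w1 + w2
    return w, (w1 * m1 + w2 * m2) / w

# Two components with a common variance (illustrative values).
w1, m1, w2, m2, var = 0.6, 0.0, 0.3, 0.4, 1.0
delta2 = abs(m1 - m2) / math.sqrt(var)   # equality in (5.56) for scalar states
wl, ml = merge_equal_cov(w1, m1, w2, m2)

# Riemann sum of |w1 N1 + w2 N2 - wl Nl| over [-20, 20].
h, lo, n = 0.01, -20.0, 4001
l1_error = h * sum(
    abs(w1 * normal_pdf(x, m1, var) + w2 * normal_pdf(x, m2, var)
        - wl * normal_pdf(x, ml, var))
    for x in (lo + k * h for k in range(n))
)
bound = 2.0 * w1 * w2 / (w1 + w2) * delta2
print(l1_error, bound)
```

Because the merged mean is the weighted average of the two means, the first-order error terms cancel, so the observed error is typically well inside the bound for small $\delta_2$.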

Salmond proposed two techniques for merging Gaussian components named Joining and Clustering algorithms [65]. In the Joining algorithm, the two components, i and j, which are closest using the distance measure

$$\delta_3^2 = \frac{w^{(i)}_t w^{(j)}_t}{w^{(i)}_t + w^{(j)}_t}\,(m^{(i)}_t - m^{(j)}_t)^{T} P^{-1}_t (m^{(i)}_t - m^{(j)}_t) \qquad (5.61)$$

are merged, where $P_t$ is the covariance of the entire mixture. It was shown that the minimum distance increases monotonically as the reduction proceeds, and that it is bounded by the dimension of the state space, where a threshold is chosen to be a constant fraction of this. In the Clustering algorithm, the Gaussians with the largest weights are chosen as principal components which define cluster centres. The covariance in equation (5.61) is replaced with the covariance of the principal component, and components in set $L$ within a specified threshold can be merged with the following calculations to preserve the overall covariance of the cluster.

$$w^{(\ell)}_t = \sum_{i\in L} w^{(i)}_t, \qquad (5.62)$$

$$m^{(\ell)}_t = \frac{1}{w^{(\ell)}_t} \sum_{i\in L} w^{(i)}_t m^{(i)}_t, \qquad (5.63)$$

$$P^{(\ell)}_t = \frac{1}{w^{(\ell)}_t} \sum_{i\in L} w^{(i)}_t \left(P^{(i)}_t + (m^{(\ell)}_t - m^{(i)}_t)(m^{(\ell)}_t - m^{(i)}_t)^{T}\right). \qquad (5.64)$$

This procedure was used in the original formulation of the Gaussian mixture PHD filter [33] and is appropriate since the intensity is multi-modal, where the principal components represent the expected target states.
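The merge calculations (5.62) to (5.64) translate directly into array operations. The sketch below covers only the merge of an already selected set $L$; the selection of principal components and the thresholding are not shown, and the function name and numerical values are assumptions.

```python
import numpy as np

def merge_components(w, m, P):
    """Moment-preserving merge of a set L of components, equations (5.62)-(5.64)."""
    w = np.asarray(w, dtype=float)      # shape (n,)
    m = np.asarray(m, dtype=float)      # shape (n, d)
    P = np.asarray(P, dtype=float)      # shape (n, d, d)
    w_new = w.sum()
    m_new = (w[:, None] * m).sum(axis=0) / w_new
    diff = m_new - m                                # shape (n, d)
    spread = diff[:, :, None] * diff[:, None, :]    # outer products, shape (n, d, d)
    P_new = (w[:, None, None] * (P + spread)).sum(axis=0) / w_new
    return w_new, m_new, P_new

# Two nearby components in a two-dimensional state space (illustrative values).
w = [0.5, 0.3]
m = [[0.0, 0.0], [1.0, 2.0]]
P = [np.eye(2), 2.0 * np.eye(2)]

w_l, m_l, P_l = merge_components(w, m, P)
print(w_l, m_l)
print(P_l)
```

The spread-of-means term in (5.64) is what inflates the merged covariance so that the merged component has the same second moment as the components it replaces.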

Williams developed a reduction algorithm which considered the overall change in the probability distribution by evaluating the cost of each possible action and selecting the one which has the minimum effect on the entire mixture in an L2 sense [66]. The components are merged with equations (5.62)-(5.64) above, which preserve the mixture mean and covariance. This is good for probability distributions but it may not be desirable for intensities, as it has the effect of smearing out the modes.

5.5 Non-linear Target Dynamic Models

This section considers the convergence for the nonlinear extensions of the Gaussian mixture PHD filter proposed in [33]. As with the linear case, the survival and detection probabilities are assumed constant and the birth and spawned target intensities are Gaussian mixtures, but the state and observation processes can be relaxed to the nonlinear model:

xt = ϕt(xt−1,νt−1), (5.65)

zt = ht(xt ,εt), (5.66)

where ϕt and ht are known nonlinear functions, νt−1 and εt are zero-mean Gaussian pro-

cess noise and measurement noise with covariances Qt−1 and Rt respectively. Due to the

nonlinearity of ϕt and ht , the posterior intensity can no longer be represented as a Gaussian

mixture. However, the proposed Gaussian mixture PHD filter can be adapted to accommo-

date models with mild nonlinearities.

The results here show that the intensity function can be approximated by a set of ex-

tended Kalman filters where the covariance of each separate Gaussian component is suffi-

ciently small for the time evolution of its mean and covariance to be calculated accurately.

These are based on the results established for the Gaussian sum filter [44]. In a low noise

environment, the EK PHD filter can be nearly optimal. In a high noise environment, it may

be necessary to reinitialise the algorithm such that the error covariance of each Gaussian is

sufficiently small. If these conditions can not be met, then it may be more appropriate to use

the particle PHD filter [17], which can use non-linear dynamic models and non-Gaussian

state and observation noises, although this will result in a higher computational complexity.

We now establish the conditions for uniform convergence of the extended Kalman (EK)

PHD filter. It is shown that, as the covariance term tends to zero, the approximation is

optimal. In addition, convergence for the Unscented Kalman (UK) PHD filter is discussed.

5.5.1 Extended Kalman Prediction Equation

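Using the PHD prediction equation,

$$D_{t|t-1}(x) = \int \phi_{t|t-1}(x, \zeta)\, D_{t-1}(\zeta)\, d\zeta + \gamma_t(x), \qquad (5.67)$$

we show that the predicted intensity for the EK PHD filter can be given by a sum of Gaussians. The extended Kalman prediction equations for existing targets are given by

$$w^{(j)}_{S,t|t-1} = p_{S,t}\, w^{(j)}_{t-1}, \qquad (5.68)$$

$$m^{(j)}_{S,t|t-1} = \varphi_t(m^{(j)}_{t-1}, 0), \qquad (5.69)$$

$$P^{(j)}_{S,t|t-1} = G^{(j)}_{t-1} Q_{t-1} [G^{(j)}_{t-1}]^{T} + F^{(j)}_{t-1} P^{(j)}_{t-1} [F^{(j)}_{t-1}]^{T}, \qquad (5.70)$$

$$F^{(j)}_{t-1} = \left.\frac{\partial \varphi_t(x_{t-1}, 0)}{\partial x_{t-1}}\right|_{x_{t-1} = m^{(j)}_{t-1}}, \qquad G^{(j)}_{t-1} = \left.\frac{\partial \varphi_t(m^{(j)}_{t-1}, \nu_{t-1})}{\partial \nu_{t-1}}\right|_{\nu_{t-1} = 0}. \qquad (5.71)$$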
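Equations (5.68) to (5.71) can be sketched for a single component as follows. This is an illustration under assumptions: the Jacobians are approximated by central differences rather than derived analytically, and the function names and the mildly nonlinear motion model `phi` are hypothetical.

```python
import numpy as np

def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference Jacobian of func evaluated at x."""
    x = np.asarray(x, dtype=float)
    out_dim = np.asarray(func(x)).size
    J = np.zeros((out_dim, x.size))
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        J[:, k] = (np.asarray(func(x + step)) - np.asarray(func(x - step))) / (2.0 * eps)
    return J

def ek_predict_component(w, m, P, phi, Q, p_s):
    """Extended Kalman prediction of one surviving component, (5.68)-(5.71).

    phi(x, nu) is the state transition with process noise nu of covariance Q.
    """
    zero = np.zeros(Q.shape[0])
    F = numerical_jacobian(lambda x: phi(x, zero), m)     # d phi / d x at (m, 0)
    G = numerical_jacobian(lambda nu: phi(m, nu), zero)   # d phi / d nu at (m, 0)
    return p_s * w, phi(m, zero), G @ Q @ G.T + F @ P @ F.T

def phi(x, nu):
    # Mildly nonlinear motion model, chosen only for illustration.
    return np.array([x[0] + x[1] + nu[0], x[1] + 0.1 * np.sin(x[0]) + nu[1]])

w1, m1, P1 = ek_predict_component(0.8, np.array([0.0, 1.0]), np.eye(2),
                                  phi, 0.01 * np.eye(2), 0.95)
print(w1, m1)
print(P1)
```

For a linear model with additive noise the Jacobians reduce to the transition matrix and the identity, and the function recovers the Kalman prediction (5.8) and (5.9).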

LEMMA 16 If we know the dynamic model, and the posterior intensity at time t − 1 is given by the sum of Gaussians,

$$D_{t-1}(x) = \sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1}\,\mathcal{N}(x; m^{(i)}_{t-1}, P^{(i)}_{t-1}), \qquad (5.72)$$

then the predicted intensity approaches a sum of Gaussians in L1,

$$D^{EK}_{t|t-1}(x) \to D_{S,t|t-1}(x) + D_{\beta,t|t-1}(x) + \gamma_t(x), \qquad (5.73)$$

as $P^{(i)}_{t-1} \to 0$.¹

We assume that we know the birth intensity, $\gamma_t$, so that by Theorem 10, we can represent this by a sum of Gaussians as closely as we wish in L1,

$$\gamma_t(x) = \sum_{i=1}^{J_{\gamma,t}} w^{(i)}_{\gamma,t}\,\mathcal{N}(x; m^{(i)}_{\gamma,t}, P^{(i)}_{\gamma,t}). \qquad (5.74)$$

For the existing targets, using the intensity at time t − 1, $D_{t-1}(x)$, and the extended Kalman filter prediction equations, we obtain an approximate expression for the predicted estimate of each Gaussian component, $\mathcal{N}(x; m^{(i)}_{t-1}, P^{(i)}_{t-1})$, to a new Gaussian component, $\mathcal{N}(x; m^{(i)}_{S,t|t-1}, P^{(i)}_{S,t|t-1})$. Then using the result for the EK Gaussian Sum filter [44], we find

$$D^{EK}_{S,t|t-1}(x) \to p_{S,t} \sum_{j=1}^{J_{t-1}} w^{(j)}_{t-1}\,\mathcal{N}(x; m^{(j)}_{S,t|t-1}, P^{(j)}_{S,t|t-1}), \qquad (5.75)$$

uniformly in $x$ as $P^{(i)}_{t-1} \to 0$ for $i = 1,\dots,J_{t-1}$, where

$$m^{(j)}_{S,t|t-1} = \varphi_t(m^{(j)}_{t-1}, 0), \qquad (5.76)$$

$$P^{(j)}_{S,t|t-1} = G^{(j)}_{t-1} Q_{t-1} [G^{(j)}_{t-1}]^{T} + F^{(j)}_{t-1} P^{(j)}_{t-1} [F^{(j)}_{t-1}]^{T}. \qquad (5.77)$$

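¹The EK superscript refers to the extended Kalman approximation.

Finally, we come to the predicted intensity for spawned targets, $\beta_{t|t-1}(x|\zeta)$. Using the PHD prediction equations for the EK PHD filter, each of the Gaussian components at time t − 1 produces $J_{\beta,t}$ Gaussian components,

$$\beta_{t|t-1}(x|m^{(i)}_{t-1}) = \sum_{l=1}^{J_{\beta,t}} w^{(l)}_{\beta,t}\,\mathcal{N}(x; F^{(l)}_{\beta,t-1} m^{(i)}_{t-1} + d^{(l)}_{\beta,t-1}, Q^{(l)}_{\beta,t-1}). \qquad (5.78)$$

Similar to the result used for the prediction of existing targets, the sum over the $J_{t-1}$ components approaches a Gaussian sum

$$D^{EK}_{\beta,t|t-1}(x) \to \sum_{j=1}^{J_{t-1}} \sum_{\ell=1}^{J_{\beta,t}} w^{(j)}_{t-1} w^{(\ell)}_{\beta,t}\,\mathcal{N}(x; m^{(j,\ell)}_{\beta,t|t-1}, P^{(j,\ell)}_{\beta,t|t-1}), \qquad (5.79)$$

$$m^{(j,\ell)}_{\beta,t|t-1} = F^{(\ell)}_{\beta,t-1} m^{(j)}_{t-1} + d^{(\ell)}_{\beta,t-1}, \qquad (5.80)$$

$$P^{(j,\ell)}_{\beta,t|t-1} = G^{(j)}_{\beta,t-1} Q_{\beta,t-1} [G^{(j)}_{\beta,t-1}]^{T} + F^{(j)}_{\beta,t-1} P^{(j)}_{\beta,t-1} [F^{(j)}_{\beta,t-1}]^{T}. \qquad (5.81)$$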

5.5.2 Extended Kalman Measurement Update

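Using the EK PHD filter measurement update equation,

$$D_t(x) = [1 - p_{D,t}(x)]\, D_{t|t-1}(x) + \sum_{z\in Z_t} \frac{\psi_{t,z}(x)\, D_{t|t-1}(x)}{\kappa_t(z) + \int \psi_{t,z}(\xi)\, D_{t|t-1}(\xi)\, d\xi}, \qquad (5.82)$$

we show that the posterior intensity converges to a sum of Gaussians uniformly in L1. The PHD update components are given by

$$S^{(j)}_t = U^{(j)}_t R_t [U^{(j)}_t]^{T} + H^{(j)}_t P^{(j)}_{t|t-1} [H^{(j)}_t]^{T}, \qquad (5.83)$$

$$K^{(j)}_t = P^{(j)}_{t|t-1} [H^{(j)}_t]^{T} [S^{(j)}_t]^{-1}, \qquad (5.84)$$

$$P^{(j)}_{t|t} = [I - K^{(j)}_t H^{(j)}_t]\, P^{(j)}_{t|t-1}, \qquad (5.85)$$

$$H^{(j)}_t = \left.\frac{\partial h_t(x_t, 0)}{\partial x_t}\right|_{x_t = m^{(j)}_{t|t-1}}, \qquad U^{(j)}_t = \left.\frac{\partial h_t(m^{(j)}_{t|t-1}, \varepsilon_t)}{\partial \varepsilon_t}\right|_{\varepsilon_t = 0}. \qquad (5.86)$$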

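LEMMA 17 With the non-linear measurement equation $z_t = h_t(x_t, \varepsilon_t)$ and the predicted intensity given by the sum of Gaussians,

$$D_{t|t-1}(x) = \sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}), \qquad (5.87)$$

the updated density approaches the Gaussian sum

$$D^{EK}_t(x) \to (1 - p_{D,t})\, D_{t|t-1}(x) + \sum_{z\in Z_t} D_{D,t}(x; z), \qquad (5.88)$$

uniformly in $x_t$ and $Z_t$ as $P^{(i)}_{t|t-1} \to 0$ for $i = 1,\dots,J_{t|t-1}$.

Clearly the first term on the right of the measurement update equation, $[1 - p_{D,t}(x)]\,D_{t|t-1}(x)$, is a Gaussian sum, since the probability of detection $p_{D,t}(x) = p_{D,t}$ is assumed to be a constant and $D_{t|t-1}(x)$ is given by the predicted intensity.

Taking the numerator of the term inside the summation and using the predicted intensity,

$$\psi_{t,z}(x)\, D_{t|t-1}(x) = p_{D,t}\, g_t(z|x)\, D^{EK}_{t|t-1}(x) \qquad (5.89)$$

$$= p_{D,t}\,\mathcal{N}(z; h_t(x), R_t) \sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}), \qquad (5.90)$$

(which, by Anderson and Moore [44] pp. 215-216)

$$\to p_{D,t} \sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(z; \eta^{(i)}_{t|t-1}, R_t + H^{(i)T}_t P^{(i)}_{t|t-1} H^{(i)}_t)\,\mathcal{N}(x; m^{(i)}_{t|t}, P^{(i)}_{t|t}), \qquad (5.91)$$

uniformly as $P^{(i)}_{t|t-1} \to 0$ for all $i = 1,\dots,J_{t|t-1}$.

Now consider the denominator,

$$\kappa_t(z) + \int \psi_{t,z}(\xi)\, D_{t|t-1}(\xi)\, d\xi. \qquad (5.92)$$

Taking the integral and the L1 convergence result discussed above,

$$\int \psi_{t,z}(\xi)\, D_{t|t-1}(\xi)\, d\xi = p_{D,t} \int \mathcal{N}(z; h_t(\xi), R_t) \sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(\xi; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1})\, d\xi \qquad (5.93)$$

$$\to p_{D,t} \int \sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\, q^{(i)}_t(z)\,\mathcal{N}(\xi; m^{(i)}_{t|t}(\xi), P^{(i)}_{t|t})\, d\xi, \qquad (5.94)$$

$$q^{(i)}_t(z) = \mathcal{N}(z; \eta^{(i)}_{t|t-1}, H^{(i)T}_t P^{(i)}_{t|t-1} H^{(i)}_t + R_t). \qquad (5.95)$$

Changing the order of the summation and integral, this is equal to

$$p_{D,t} \sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1} \int q^{(i)}_t(z)\,\mathcal{N}(\xi; m^{(i)}_{t|t}(\xi), P^{(i)}_{t|t})\, d\xi \qquad (5.96)$$

$$= p_{D,t} \sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\, q^{(i)}_t(z)\,\mathcal{N}(x; m^{(i)}_{t|t}, P^{(i)}_{t|t}), \qquad (5.97)$$

so that,

$$D_t(x) \to (1 - p_{D,t})\, D_{t|t-1}(x) + p_{D,t} \sum_{i=1}^{J_{t|t-1}} \frac{w^{(i)}_{t|t-1}\, q^{(i)}_t(z)}{\kappa_t(z) + p_{D,t} \sum_{l=1}^{J_{t|t-1}} w^{(l)}_{t|t-1}\, q^{(l)}_t(z)}\,\mathcal{N}(x; m^{(i)}_{t|t}, P^{(i)}_{t|t}), \qquad (5.98)$$

uniformly as $P^{(i)}_{t|t-1} \to 0$ for all $i = 1,\dots,J_{t|t-1}$, where

$$m^{(j)}_{t|t}(z) = m^{(j)}_{t|t-1} + K^{(j)}_t (z - h_t(m^{(j)}_{t|t-1})). \qquad (5.99)$$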

5.5.3 The Unscented Kalman PHD Filter

Instead of linearising the model, as is the case with the extended Kalman filter, the un-

scented Kalman filter [50] approximates the mean and covariance with a set of sigma points

using the unscented transform. It can be shown that the predicted mean converges to an esti-

mate which is accurate to a second order, which is more accurate than the estimate given by

the extended Kalman filter, and that the predicted covariance converges to the same as that

estimated through linearisation using the EKF. In the unscented PHD filter, the unscented

transform is applied in the prediction step to each term in the Gaussian mixture and the up-

date step is the same as the Gaussian mixture PHD filter update. The convergence analysis

of the UK PHD filter is omitted here, and the interested reader is referred to the work by

Julier and Uhlmann [39] for an analysis of the convergence of the unscented Kalman filter.

5.6 Conclusions

A consequence of Wiener’s Theory of Approximation is that density functions can be ap-

proximated uniformly with a sum of Gaussians. This result has been used to show that the

error for the recently proposed Gaussian mixture PHD filter converges uniformly for each

of the steps in the algorithm. Error bounds have been provided for the pruning and merging

stages, which are used to reduce the number of Gaussian components, based on those es-

tablished for the Gaussian sum filter. These results give further theoretical justification for

the use of the Gaussian mixture PHD filter in multiple target tracking problems.

Proofs of uniform convergence are also derived for the extended Kalman PHD filter.

The accuracy of the unscented Kalman PHD filter is discussed as an extension to the results

already established for the unscented Kalman filter.

Chapter 6

PHD Filter Target Estimation in Sonar Images

6.1 Introduction

One of the goals of the sonar research community is to develop Autonomous Underwater

Vehicles (AUVs), self-navigating robots which operate underwater. Such vehicles can be

equipped with a range of sensors including forward-look sonar, sidescan sonar and video

to enable them to navigate autonomously and undertake a range of missions, for example

mine countermeasures, pipeline inspection or seabed habitat mapping. To enable AUVs to

do this successfully, methods for detecting and tracking objects on the seabed are required

to aid path planning and navigation as well as using these techniques as an integral part of

the mission. The obvious initial application is to enable the vehicle to sense its environment

and prevent collision with any object. Although AUVs are typically equipped with inertial

navigation systems, they are prone to drifting and errors in the measured vehicle position

increase during the mission.

This chapter demonstrates an application of the Particle PHD Filter from chapter 4

for estimating a variable number of targets in a sequence of sonar images in the presence

of clutter. The objects of interest to be tracked will either be stationary on the seabed

or moving through the water. The stationary objects will appear to move in the AUV's image

reference frame, as it is the vehicle which is moving. Tracking the

stationary objects can aid registration of the sequence of images generated by the forward-

look sonar which could be useful for concurrent mapping and localisation of the underwater

terrain [67], AUV path planning [3] and navigation [4].

Traditional multi-target tracking is based on coupling trackers such as Kalman fil-

ters, extended Kalman filters or particle filters with a data association technique (see Bar-

Shalom [5] for example). The aim of the data association process is to interpret which

measurements are due to the targets and which are due to false alarms. Another technique

which has been applied to sonar imagery uses optical flow calculations to estimate the direction of motion [68].

The PHD Filter is a method of propagating a multi-modal measure within a unified

framework without associating the measurements and has the ability to estimate the number

and position of targets in data with clutter. Data association techniques are avoided as the

identities of the targets are not kept. This is a drawback of the PHD Filter tracker as often

continuity of identity is needed. Novel techniques for this shall be presented in chapter 7.

One of the advantages of the PHD Filter is its ability to track objects in clutter, which is

often the case in sonar data where there are many spurious measurements due to the noisy

data. The measurements are taken in the sonar reference plane so that a stationary object in

the global or world reference plane will be moving with respect to the underwater vehicle.

Whilst many of the objects to be tracked will be in the world reference plane, there could

also be moving objects which it may be necessary to track such as other vehicles, marine

mammals or fish. Thus the ability to track a variable number of targets in the presence of

missed detections and spurious measurements is advantageous in this application.

This section describes the method for multiple target tracking which has been imple-

mented for forward-looking sonar data. The Particle PHD-Filter can be viewed as an ex-

tension of the single target particle filter with the ability to track multiple targets without

data association. An application version of the Particle PHD-Filter from chapter 4 is given

here in pseudocode form. The system model describes the evolution of state with time i.e.

the motion of the underwater vehicle and the measurement model relates the measurements

to the state i.e. the objects on the seabed. New targets are introduced into the model by

the birth model which assigns M uniformly distributed particles at the end of the sector

for incoming objects into the FoV according to the rate in which the sonar moves across

the seabed (the new targets are assumed to come at the end of the sector as this is the new

section of seabed surveyed). Again, the particles represent a hypothesis about speed and

position of a target although they are not specifically attached to any particular target. In the

target location estimation stage, the number of targets is estimated by summing all particle weights and rounding to the nearest integer. A Gaussian mixture model is then fitted to the particles to determine the target locations.

The Particle PHD Filter algorithm is first shown working on a simulated target trajectory

where the accuracy of the tracker can be determined. It is then demonstrated using real

data to show that this technique is applicable in a real scenario. The sonar data returns

from objects on the seabed have a much higher intensity than the surrounding region of

seabed due to a combination of higher reflectivity properties and the geometry of the sonar

imaging process which will result in multiple returns at the same time. The measurements

for the tracker have been obtained by thresholding the images and finding the centroid of

the regions above the threshold. The signal to noise ratio for objects is high so this simple

technique is effective for finding the target locations although there is often a large number

of spurious measurements. However, as will be demonstrated in the results, the estimated

positions converge to the actual target locations despite the false alarms.
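As an illustration only, the measurement extraction described above (thresholding followed by centroid finding) might be sketched in Python as follows. The function name, the use of 4-connected regions and the unweighted centroid are assumptions made for this sketch; the text specifies only that the images are thresholded and the centroids of the regions above the threshold are taken.

```python
def threshold_centroids(image, threshold):
    """Return (row, col) centroids of 4-connected regions above a threshold.

    `image` is a list of equal-length rows of intensities (hypothetical input
    format). Regions are grown with a simple stack-based flood fill.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Grow a new region starting from this bright pixel.
            stack = [(r, c)]
            seen[r][c] = True
            pixels = []
            while stack:
                pr, pc = stack.pop()
                pixels.append((pr, pc))
                for nr, nc in ((pr + 1, pc), (pr - 1, pc), (pr, pc + 1), (pr, pc - 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and not seen[nr][nc] and image[nr][nc] > threshold:
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            # Unweighted centroid of the region's pixel coordinates.
            mean_row = sum(p[0] for p in pixels) / len(pixels)
            mean_col = sum(p[1] for p in pixels) / len(pixels)
            centroids.append((mean_row, mean_col))
    return centroids
```

Each returned centroid would then serve as one measurement for the tracker, with spurious bright regions producing the clutter measurements discussed above.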

6.2 The Particle PHD Filter with State Estimation

The sequential Monte Carlo implementation of the PHD filter described in chapter 4 has

been implemented for multiple target state estimation of obstacles in sonar data. The algo-

rithm is initialised by distributing particles across the observation space, or field of view,

with randomly chosen states. The particles are predicted with the linear state equation into

the next time step and Gaussian noise is added. When the measurements are received, the

particle weights are updated with the PHD filter update equation. The sum of the weights

after the update step provides an estimate of the number of targets in the scene which is

used for the multiple target state estimation. The particles are resampled according to their

weights and reweighted so that each particle has the same weight. The multiple target

states are estimated using the Expectation-Maximisation algorithm to fit a Gaussian Mix-

ture model to the particle data to find the state estimates and covariances. The number of

Gaussian components fitted to the particles is the expected number of targets determined

from the total particle mass, or sum of the weights after the PHD update step.

Particle PHD filter Algorithm Implementation

step 0. (Initialisation, at t = 0.)

    for $i = 1, \ldots, N_0$

        sample $x^{(i)}_0 \sim D_{0|0}$, the prior PHD.

        assign particle weight, $\omega^{(i)}_0$, the mass $\omega^{(i)}_0 = T_0/N$.

    set t = 1.

step 1. (Prediction Step, for t ≥ 1.)

    for $i = 1, \ldots, N_{t-1}$

        Project particle with state equation $f_{t|t-1}(x_t \mid \xi^{(i)}_{t-1})$.

        Assign weight $\omega^{(i)}_{t|t-1} = p_S/N$ according to their probability of survival $p_S$, which is dependent on the position in the FoV.

    Introduce M particles at end of FoV for birth model.

        Assign weight $\omega^{(i)}_{t|t-1} = p_B/M$, where $p_B$ is the probability of target birth.

    Let $R_t = N_{t-1} + M$.

step 2. (Update Step, for t ≥ 1.)

    for $z \in Z_t$,

        compute $\langle \omega_{t|t-1}, \psi_{t,z} \rangle = \sum_{i=1}^{R_t} \psi_{t,z}(x^{(i)}_t)\,\omega^{(i)}_{t|t-1}$.

    for $i = 1, \ldots, R_t$,

        update weights, $\omega^{(i)}_t = \left[ (1 - p_D) + \sum_{z \in Z_t} \frac{\psi_{t,z}(x^{(i)}_t)}{\kappa_t(z) + \langle \omega_{t|t-1}, \psi_{t,z} \rangle} \right] \omega^{(i)}_{t|t-1}$.

step 3. (Resampling Step, for t ≥ 1.)

    Compute the total particle mass, $T_t = \sum_{i=1}^{R_t} \omega^{(i)}_t$.

    set $N_t = N \cdot \mathrm{int}(T_t)$ (where $\mathrm{int}(T_t)$ is the nearest integer to $T_t$).

    Resample $\{\omega^{(i)}_t / T_t,\, x^{(i)}_t\}_{i=1}^{R_t}$ to get $\{T_t/N,\, x^{(i)}_t\}_{i=1}^{N_t}$.

    The particles each have weight $T_t/N$ after resampling.

step 4. (Target Estimation, for t ≥ 1.)

    The locations of the targets are found by fitting a Gaussian mixture model to the particles, where the number of mixture components is the expected number of targets at time t, $T_t = \sum_i \omega^{(i)}_t$.
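The update step and the target number estimate can be illustrated with a short Python sketch. It is a simplification under stated assumptions rather than the implementation used for the results: the state is a scalar position, the likelihood is Gaussian with variance `meas_var`, the detection probability `p_d` and clutter intensity are constants, and all function names are hypothetical.

```python
import math


def gaussian_pdf(z, mean, var):
    """Scalar Gaussian density, used here as the likelihood g(z | x)."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def phd_update(particles, weights, measurements, p_d, clutter_intensity, meas_var):
    """PHD weight update for scalar particle states (illustrative assumption)."""
    # psi[k][i] = p_D * g(z_k | x_i), one row per measurement.
    psi = [[p_d * gaussian_pdf(z, x, meas_var) for x in particles]
           for z in measurements]
    # Normaliser for each measurement: kappa(z) + <omega, psi_z>.
    denom = [clutter_intensity + sum(p * w for p, w in zip(row, weights))
             for row in psi]
    updated = []
    for i, w in enumerate(weights):
        gain = (1.0 - p_d) + sum(psi[k][i] / denom[k] for k in range(len(measurements)))
        updated.append(gain * w)
    return updated


def estimated_target_count(weights):
    """Nearest integer to the total particle mass."""
    return int(math.floor(sum(weights) + 0.5))
```

With perfect detection and no clutter, each measurement contributes exactly one unit of mass, so the total updated mass equals the number of measurements; with clutter present each measurement contributes slightly less than one unit.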

6.3 Forward-Looking Sonar Implementation

The forward-looking sonar can be considered as having k beams, where the angular dis-

tance between their central axes is δθ degrees. The data for each of the beams is in the

form of acoustic intensity against time. The time values are related to the slant range to

the object. If isovelocity conditions are assumed then they can be translated into range

measurements. The measurements taken from the sonar are in polar co-ordinates and so

the tracker implemented for forward-scan sonar will track range and bearing measurements

obtained from thresholded sonar images.

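The following state space model is used:

$$\mathbf{x}_t = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \mathbf{x}_{t-1} + v_{t-1}, \tag{6.1}$$

$$o_t = \begin{pmatrix} \arctan(y_t/x_t) \\ \sqrt{(x_t)^2 + (y_t)^2} \end{pmatrix} + n_t. \tag{6.2}$$

$v_t$ and $n_t$ are the process and measurement noises respectively, which are uncorrelated.

The state vector is defined as the 2D position and velocity vector of the target, relative to a fixed external reference frame:

$$\mathbf{x}_t = (x_t, \dot{x}_t, y_t, \dot{y}_t)^T. \tag{6.3}$$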

The observation at time t (ot) is the bearing angle and range from the fixed observer

towards the target. Although more complex noise models can be used, Gaussian observation

and state noise distributions have been used for initial investigation with the filter.
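A noise-free Python sketch of this model is given below for illustration. The function names are hypothetical, the noise terms are omitted, and `math.atan2` is used in place of arctan(y/x); the two agree for targets with positive x and `atan2` avoids division by zero, which is an implementation choice of this sketch rather than part of the model above.

```python
import math


def predict_state(state):
    """Constant-velocity prediction for state (x, x_dot, y, y_dot), unit time step."""
    x, x_dot, y, y_dot = state
    return (x + x_dot, x_dot, y + y_dot, y_dot)


def observe(state):
    """Noise-free bearing and range of the target from the observer at the origin."""
    x, _, y, _ = state
    bearing = math.atan2(y, x)
    target_range = math.hypot(x, y)
    return (bearing, target_range)
```

In a particle implementation, each particle would be passed through `predict_state` with process noise added, and the likelihood of a measurement would be evaluated around the output of `observe`.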

The motion of the sonar is assumed to be linear for the particle filter and so the objects

are moving towards the sonar. The FoV is a sector of 10 degrees with range 20m to 60m.

Due to the linear motion of the sonar and the FoV, new objects are most likely to appear

at the end of the sector and disappear at the beginning with the objects moving towards

the sonar and so the probabilities of birth pB and survival pS have been defined to reflect

this. The birth model will allow for new targets entering the FoV by distributing particles

uniformly at the end of the sector.

6.3.1 Simulated Data

To demonstrate the performance of the technique the tracker is firstly tested on simulated

data. The advantage of using simulated data is that it allows various different realistic

scenarios and trajectories to be created easily. The exact locations of the vehicle and objects

are known and thus the accuracy of the tracker can be determined.

A sequence of forward-looking sonar images is simulated using the Sonar Simulator

developed by Bell [69] which has the capability of modelling sonar in complex underwater

terrain. The artificial seabed is modelled by a 100× 100m2 textured image. Spherical

shaped objects of radius 0.5m have been placed on the seabed.

Figure 6.1 shows the sequence of simulated sonar images from the above scenario,

where the highlights are created by the objects. The sonar has followed an approximately

linear trajectory with a small amount of deviation from this. For ease of display, the images

are shown on a rectangular grid although the data is in polar form. Blank lines separate the

images in the sequence.

The results of the tracking for the simulated sonar run are displayed in both the sonar image reference frame and the global reference frame. In the sonar image reference frame the objects move towards the sonar; figure 6.2 shows the measurements and estimated positions with respect to the sonar. In the global reference frame (figure 6.3), the sonar positions are marked on a global co-ordinate map and the objects are stationary. The actual positions are shown along with the measurements and estimated positions. The tracker estimates the number of targets in each image well; this number varies between 1 and 4. There are a few outliers where the tracker has wrongly estimated a target location. This often happens in the initialisation stage, where the particles have been spread uniformly and their distribution is not yet sufficiently localised onto the targets.

Obtaining accurate navigation information can be a significant problem during AUV

missions. However, this technique is still robust when no navigation information is present.

The same scenario as above has been repeated, but with the sequence of sonar images simulated for an AUV following a sinusoidal trajectory. The tracking was then repeated with the system having no knowledge of the actual motion, and the results in the global reference frame are displayed in figure 6.4. The measurements were less accurate than for the linear path (figure 6.3), but the tracker still follows the measurements well, with only a few false estimates.

Figure 6.1: Sequence of Simulated Forward Scan Sonar Images with Objects: range (x axis), bearing (y axis)

Figure 6.2: Linear Tracking in Sonar Image Reference Frame (plot title: "Measurements and Estimated Positions in Sonar Reference Frame"; legend: measured positions, inferred positions, observer position)

Figure 6.3: Linear Tracking in Global Reference Frame (plot title: "Motion of Sonar with Targets and Estimated Positions"; legend: target locations, measured positions, inferred positions, sonar position)

Figure 6.4: Sinusoidal Sonar Tracking in Global Reference Frame (plot title: "Motion of Sonar with Targets and Estimated Positions"; legend: target locations, measured positions, inferred positions, sonar position)

6.3.2 Real Data

The tracker was then applied to a sequence of real forward-look sonar images. The data

was obtained from a forward-looking sonar device fitted to an underwater vehicle where

the sonar scans a sector of seabed in the direction of the vehicle motion. The sequence of

18 images (figure 6.5) was obtained as the vehicle was flown towards a cylindrical object

lying on the seabed. The images are very noisy but the object can be seen in the sequence as

the small bright highlight moving from top right to bottom left in the sequence. The seabed

over which the sonar traverses appears to be composed of two different sediment types, one

of which provided higher intensity returns. This can be seen in the region before the target

in the first 10 images of the sequence.

Figure 6.5: Sequence of Real Forward-Scan Images: range (x axis), bearing (y axis)

The results of the tracking are illustrated in figure 6.6. The measurements and tracked positions are in the sonar reference frame; the global positions are unknown, since no navigation information was provided with the data and no accurate ground truth of the object's location was available. The location of the cylinder is given by the sequence of points from the lower right hand region of figure 6.6, moving towards the centre of the figure, as the vehicle moves closer to the object. In the first few images, there were a lot of false targets, or clutter points, due to bad observations of the cylinder as a result of high intensity returns from a region of seabed. These are the group of measurements in the top half of figure 6.6.

To show how well the implementation works with clutter, the tracker has been run on

the data forward in time (where there are a lot of clutter points initially and fewer in the

later images) and backward in time (where there are few clutter points initially and more in

the later images). This demonstration is to show how the initial conditions affect the per-

formance of the algorithm and the convergence as the clutter density at the start is different

in this sequence of images when run backward and forward. Running the algorithm in

the forward direction results in poor estimation initially but afterwards converges onto the

correct target location (see figure 6.6). When the algorithm is run on the data backwards,

the algorithm quickly converges onto the correct target and manages to predict the correct

location through the clutter (see figure 6.7).

These results show that if a sequence of images begins with few false alarms and then encounters a cluttered region, the algorithm can still predict the correct target location, but it performs worse if the cluttered region is at the start. This is to be expected, as the distribution of the particles is propagated from one frame to the next, so if the particles are predominantly located in the region of the true target they are more likely to track it well.

Figure 6.6: Tracked Cylinder in Forward Direction (plot title: "PHD Tracking Example on Real Sonar Data")

6.4 Discussion

An application of the particle PHD filter has been implemented for estimating a variable

number of objects in a sequence of forward-looking sonar images. The filter was shown

working on simulated data where the results could be displayed on a global map and then

on real sonar data with clutter for tracking a cylindrical object on the seabed. The simu-

lated data provided a test case scenario where accurate ground truth was available, but the

simulated sonar images contain significantly less noise and clutter than real data.

This technique also has the capacity to incorporate measurements obtained from other

sensing equipment such as video data although the implementation here is restricted to

sonar.

The identities of the objects are not determined in this implementation and so data association techniques are not used. In many applications, knowledge of which target in the current frame relates to which target in the previous frame is important and so the data association problem needs to be addressed. Methods for incorporating this into the PHD filter framework are presented in chapter 7. One of the advantages of the PHD Filter is its ability to filter clutter, so that the number of spurious measurements is reduced. Sonar data is very noisy, which gives rise to many spurious measurements, and the PHD Filter copes well with this.

Figure 6.7: Tracked Cylinder in Backward Direction (plot title: "PHD Tracking Example on Real Sonar Data")

Chapter 7

State Estimation and Track Continuity for the Particle PHD Filter

7.1 Introduction

The output from the PHD filter provides a multimodal density from which we need to

estimate the states of the targets at each iteration. This chapter considers and compares

the different approaches for clustering data. We consider two techniques for estimating the

target states at each iteration, namely k-means clustering and mixture modelling via the

expectation-maximization algorithm.

The advantage of the PHD filter is that it can track a variable number of targets, estimat-

ing both the number of targets and their locations. It avoids the need for data association

techniques, since the identities of the individual targets are not required. Whilst this may

be advantageous if the main concern is where the targets are, it is a major drawback if it is

necessary to identify the trajectories of the different targets. Two methods for associating

the targets between frames have been reported in the literature. The first of these, by Panta

et al. [29], used the PHD filter for pre-filtering the data input to a Multiple Hypothesis

Tracker. The second technique, proposed by Lin [30], represents the PHD in a resolution

cell to differentiate the peaks of the PHD posterior, and validation gating was used to de-

termine the weights of the particles. The PHD filter estimated the number and locations of

the targets and the results of data association determined the peaks of the PHD.

This chapter presents two techniques for enabling track continuity with the particle PHD filter, based on the pseudo-code for the Particle PHD filter given in Table I.

Table I: Pseudo-code for the Particle PHD filter with track continuity.

step 0. (Initialization at t = 0.)

    for $i = 1, \ldots, N_0$

        sample $x^{(i)}_0 \sim D_{0|0}$, the prior PHD.

        assign particle weight, $\omega^{(i)}_0$, the mass $\omega^{(i)}_0 = T_0/N$.

    set t = 1.

step 1. (Prediction Step, for t ≥ 1.)

    for $i = 1, \ldots, N_{t-1}$

        sample $x^{(i)}_t$ from a proposal density $q_t(\cdot \mid x^{(i)}_{t-1}, Z_t)$.

        evaluate the predicted weights $\omega^{(i)}_{t|t-1} = \frac{\phi_{t|t-1}(x^{(i)}_t, x^{(i)}_{t-1})}{q_t(x^{(i)}_t \mid x^{(i)}_{t-1}, Z_t)}\,\omega^{(i)}_{t-1}$.

    for $i = N_{t-1} + 1, \ldots, N_{t-1} + M$

        sample $x^{(i)}_t$ from another proposal density $p_t(\cdot \mid Z_t)$.

        compute the weights of newborn particles $\omega^{(i)}_{t|t-1} = \frac{1}{M} \frac{\gamma_t(x^{(i)}_t)}{p_t(x^{(i)}_t \mid Z_t)}$.

    Let $R_t = N_{t-1} + M$.

step 2. (Update Step, for t ≥ 1.)

    for $z \in Z_t$,

        compute $\langle \omega_{t|t-1}, \psi_{t,z} \rangle = \sum_{i=1}^{R_t} \psi_{t,z}(x^{(i)}_t)\,\omega^{(i)}_{t|t-1}$.

    for $i = 1, \ldots, R_t$,

        update weights, $\omega^{(i)}_t = \left[ (1 - p_D) + \sum_{z \in Z_t} \frac{\psi_{t,z}(x^{(i)}_t)}{\kappa_t(z) + \langle \omega_{t|t-1}, \psi_{t,z} \rangle} \right] \omega^{(i)}_{t|t-1}$.

step 3. (Resampling Step, for t ≥ 1.)

    Compute the total particle mass, $T_t = \sum_{i=1}^{R_t} \omega^{(i)}_t$.

    set $N_t = N \cdot \mathrm{int}(T_t)$ (where $\mathrm{int}(T_t)$ is the nearest integer to $T_t$).

    Resample $\{\omega^{(i)}_t / T_t,\, x^{(i)}_t\}_{i=1}^{R_t}$ to get $\{T_t/N,\, x^{(i)}_t\}_{i=1}^{N_t}$.

    The particles each have weight $T_t/N$ after resampling.

step 4. (Target Estimation, for t ≥ 1.)

    Target state estimates obtained from PHD by clustering (tables II and III).

step 5. (Association, for t ≥ 2.)

    Estimates associated with existing target tracks, tracks initiated/deleted (tables V and VI).

7.2 Multi-Target State Estimation

This section addresses step 4 of the Particle PHD filter algorithm summarised in Table I,

namely estimating the target states from the particle representation of the PHD density. We

consider different techniques for clustering the data and compare the two preferred methods

for estimating the target states. The accuracy of the algorithms is measured against the true

target trajectories and the run-time is compared for each of the algorithms.

7.2.1 Cluster Analysis

The aim of cluster analysis is to separate data points into homogeneous groups, or clusters,

based on some discriminative criteria. Three main classes of technique are used in the lit-

erature [70]: agglomerative hierarchical clustering techniques, fitting mixture models, such

as the Expectation-Maximization (EM) algorithm, and optimization methods, such as the

k-means algorithm (e.g. [71]). If N is the number of particles, then the time complexity of

the hierarchical clustering algorithm is O(N2 logN) which is impractical for large numbers

of particles. If T is the number of targets and τ is the number of iterations in the clustering

algorithm, then the time complexity of k-means is O(τT N) and the time complexity of the

EM algorithm is O(τT 2N). These two techniques will be considered further.

Gaussian Mixture Modelling with the Expectation-Maximization Algorithm

Since we are using Gaussian noise for the state and observation equations, we can model the

posterior particle distribution as a multi-modal Gaussian and try to determine its parameters.

Let Tt be the estimated number of targets determined from the total particle mass. Then

the set of parameters which specify the Gaussian mixture is

$$\theta_t = \{(\pi_{t,n}, m_{t,n}, S_{t,n})\}_{n=1}^{T_t}, \tag{7.1}$$

where the tuple θt,n = (πt,n,mt,n,St,n) contains the probability that a particle is in the nth

Gaussian and the mean and covariance of this Gaussian respectively. See Table II for a

description of this algorithm.
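To make the expectation and maximisation steps concrete, the following Python sketch fits a Gaussian mixture to scalar data. It is a simplified illustration, not the algorithm of Table II: the data are one-dimensional, the initial means are supplied by the caller, a fixed number of iterations replaces the convergence test, and the variance floor and function names are assumptions of this sketch.

```python
import math


def normal_pdf(x, mean, var):
    """Scalar Gaussian density."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, initial_means, iterations=50, initial_var=1.0):
    """EM for a 1D Gaussian mixture; returns (weights, means, variances)."""
    k = len(initial_means)
    n = len(data)
    pis = [1.0 / k] * k
    means = list(initial_means)
    variances = [initial_var] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [pis[c] * normal_pdf(x, means[c], variances[c]) for c in range(k)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: closed-form updates of weight, mean and variance.
        for c in range(k):
            mass = sum(r[c] for r in resp)
            pis[c] = mass / n
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / mass
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / mass
            variances[c] = max(var, 1e-6)  # floor to avoid a degenerate component
    return pis, means, variances
```

In the tracking application the data would be the resampled particle states and the number of components would be the estimated number of targets.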

Clustering with the k-means Algorithm

The k-means clustering algorithm takes a set of points, in this case the particles, and sepa-

rates them into k partitions, Pt,1, ...,Pt,k, with means Mt = mt,1, . . . ,mt,k, called centres,

such that the mean squared distance from each point to its nearest centre is minimized.

One of the most common algorithms for k-means clustering is Lloyd’s algorithm which

is based on the observation that the best placement of a centre is at the centroid of the as-

sociated cluster. Each stage of Lloyd’s algorithm moves every centre, mt, j, to the centroid

of its partition, Pt, j, and then updates the partition by recomputing the distance from each

point to its nearest centre. These steps are repeated until a convergence criterion is met. See

Table III for a description of this algorithm.
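The two alternating stages of Lloyd's algorithm can be sketched in Python as follows. This is a minimal illustration under assumptions of its own: points are tuples of floats, the optional `initial_centres` argument and the fixed `seed` are conveniences added here for reproducibility, an empty partition keeps its previous centre, and iteration stops when the centres no longer change.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def kmeans(points, k, initial_centres=None, max_iter=100, seed=0):
    """Lloyd's algorithm; returns (centres, partitions)."""
    if initial_centres is None:
        # Choose k of the points at random as the initial centres.
        initial_centres = random.Random(seed).sample(points, k)
    centres = [tuple(c) for c in initial_centres]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Partition step: assign every point to its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda n: squared_distance(p, centres[n]))
            clusters[nearest].append(p)
        # Recalculate centres; an empty partition keeps its previous centre.
        new_centres = [centroid(c) if c else centres[n] for n, c in enumerate(clusters)]
        if new_centres == centres:
            break
        centres = new_centres
    return centres, clusters
```

As with any k-means implementation, the result depends on the initial centres, so a poor random initialisation can converge to a local minimum rather than the intended grouping.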

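Table II: The EM Algorithm (step 4 of Table I).

given: particles $x^{(1)}_t, \ldots, x^{(N_t)}_t$ and estimated target number $T_t$.

step 0. (Initialization.)

    For $n = 1, \ldots, T_t$

        Initialize $(\pi^{(1)}_{t,n}, m^{(1)}_{t,n}, S^{(1)}_{t,n})$ with

            $\pi^{(1)}_{t,n} = 1/T_t$,

            $m^{(1)}_{t,n} = x^{(i)}_t$, where $i = \lfloor (n-1)(N_t - 1)/(T_t - 1) \rfloor + 1$,

            $S^{(1)}_{t,n} = \frac{1}{N_t} \sum_{i=1}^{N_t} x^{(i)}_t x^{(i)T}_t$.

        Compute $p(x \mid n, \theta^{(1)}) = \mathcal{N}(x; m^{(1)}_{t,n}, S^{(1)}_{t,n})$.

    Set j := 1.

repeat:

step 1. (Expectation.)

    For $n = 1, \ldots, T_t$

        Calculate the (provisional) probability that a particle belongs to Gaussian n, $\tilde{\pi}_{t,n}$, the mean, $\tilde{m}_{t,n}$, and the covariance, $\tilde{S}_{t,n}$:

            $\tilde{\pi}_{t,n} = \frac{1}{N_t} \sum_{i=1}^{N_t} p(n \mid x^{(i)}_t, \theta^{(j)})$

            $\tilde{m}_{t,n} = \frac{1}{N_t \tilde{\pi}_{t,n}} \sum_{i=1}^{N_t} x^{(i)}_t\, p(n \mid x^{(i)}_t, \theta^{(j)})$

            $\tilde{S}_{t,n} = \frac{1}{N_t \tilde{\pi}_{t,n}} \sum_{i=1}^{N_t} (x^{(i)}_t - \tilde{m}_{t,n})(x^{(i)}_t - \tilde{m}_{t,n})^T\, p(n \mid x^{(i)}_t, \theta^{(j)})$

        Provisional Gaussian is $p(x \mid n, \theta^{(j)}) = \mathcal{N}(x; m^{(j)}_{t,n}, S^{(j)}_{t,n})$.

    Compute: $Q(\theta; \theta^{(j)}) = E[\log p(x, n \mid \theta) \mid x, \theta^{(j)}]$,

        by expanding the expectation, $= \sum_{n=1}^{T_t} \sum_{i=1}^{N_t} \log p(x^{(i)}_t, n \mid \theta)\, p(n \mid x^{(i)}_t, \theta^{(j)})$,

        using $p(x, n \mid \theta) = \pi_n\, p(x \mid n, \theta)$, $= \sum_{n=1}^{T_t} \sum_{i=1}^{N_t} \log[\pi_n\, p(x^{(i)}_t \mid n, \theta)]\, p(n \mid x^{(i)}_t, \theta^{(j)})$,

        and Bayes' rule, $= \sum_{n=1}^{T_t} \sum_{i=1}^{N_t} \log[\pi_n\, p(x^{(i)}_t \mid n, \theta)]\, \frac{\pi^{(j)}_n\, p(x^{(i)}_t \mid n, \theta^{(j)})}{\sum_{l=1}^{T_t} \pi^{(j)}_l\, p(x^{(i)}_t \mid l, \theta^{(j)})}$.

step 2. (Maximization.)

    Maximize $Q(\theta; \theta^{(j)})$ with respect to $\theta \in \Omega(K)$ using Lagrange multipliers,

        $(\pi^{(j+1)}, m^{(j+1)}, S^{(j+1)}) = \arg\max_{(\pi, m, S) \in \Omega(K)} Q(\theta; \theta^{(j)})$.

    Set j := j + 1.

until: $|Q(\theta; \theta^{(j)}) - Q(\theta; \theta^{(j-1)})| < \varepsilon$, for specified threshold $\varepsilon$.

output: means and covariances of $T_t$ partitions $(x_{t,1}, S_{t,1}), \ldots, (x_{t,T_t}, S_{t,T_t})$.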

Table III: The k-means Algorithm (step 4 of Table I).

given: particles $x^{(1)}_t, \ldots, x^{(N_t)}_t$ and estimated target number $T_t$.

step 0. (Initialization.)

    Choose $k = T_t$ particles at random to be the initial centres, $m^{(1)}_{t,1}, \ldots, m^{(1)}_{t,T_t} := x^{(k_1)}_t, \ldots, x^{(k_{T_t})}_t$.

    Set j := 1.

repeat:

step 1. (Partition.)

    Partition the particles, $P^{(j)}_{t,1}, \ldots, P^{(j)}_{t,T_t}$, such that $x^{(i)}_t \in P^{(j)}_{t,l}$ if $\arg\min_n \| x^{(i)}_t - m^{(j)}_{t,n} \| = l$.

step 2. (Recalculate centres.)

    Calculate means $m^{(j+1)}_{t,l} = \mathrm{mean}(P^{(j)}_{t,l})$ for $l = 1, \ldots, T_t$.

    Set j := j + 1.

until: $\left| \sum_{i=1}^{N_t} \sum_{n=1}^{T_t} \| x^{(i)}_t - m^{(j)}_{t,n} \| - \sum_{i=1}^{N_t} \sum_{n=1}^{T_t} \| x^{(i)}_t - m^{(j-1)}_{t,n} \| \right| < \varepsilon$

step 3. (Calculate covariances of partitions.)

    Calculate covariances $S_{t,l} = \mathrm{cov}(P^{(j-1)}_{t,l})$ for $l = 1, \ldots, T_t$.

output: means and covariances of $T_t$ partitions $(x_{t,1}, S_{t,1}), \ldots, (x_{t,T_t}, S_{t,T_t})$.

7.2.2 Multi-target Miss Distance Metrics

To compare the accuracy of the two techniques (k-means and EM algorithms) for estimating

the target states, we need appropriate metrics. We consider both the Hausdorff distance and

the Wasserstein distance.

The Hausdorff distance is a common method for measuring the distance between two

sets, originating from pure mathematics. The Hausdorff distance provides a good means of

assessing overall localisation performance.

While the Hausdorff distance is good at assessing localization performance, it is insen-

sitive to different numbers of targets. Hoffman et al. [72] adopted the Wasserstein distance

from theoretical statistics as a means of defining a metric for multitarget distances that

penalizes estimation of an incorrect number of targets. It has been used to assess the per-

formance of the PHD filter [17] [29]; we will compare it with the Hausdorff distance. Table

IV describes the two metrics.

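Table IV: Multi-target Miss Distance Metrics

Hausdorff Distance

Let $X_t$ and $\hat{X}_t$ be the finite sets of target states and estimated target states at time t. The Hausdorff distance between the two sets is defined as

$$d_H(X_t, \hat{X}_t) = \max\left\{ \max_{x_i \in X_t} \min_{\hat{x}_j \in \hat{X}_t} d(x_i, \hat{x}_j),\; \max_{\hat{x}_j \in \hat{X}_t} \min_{x_i \in X_t} d(\hat{x}_j, x_i) \right\}.$$

Wasserstein Distance

Let $X_t$ and $\hat{X}_t$ be the finite sets of target states and estimated target states at time t. The $L^p$ Wasserstein distance between the two sets is defined as

$$d^W_p(X_t, \hat{X}_t) = \inf_C \left( \sum_{x_i \in X_t} \sum_{\hat{x}_j \in \hat{X}_t} C_{ij}\, d(x_i, \hat{x}_j)^p \right)^{1/p},$$

where C is an $|X_t| \times |\hat{X}_t|$ matrix $C_{ij}$ such that, for all $i = 1, \ldots, |X_t|$ and $j = 1, \ldots, |\hat{X}_t|$,

$$\sum_{i=1}^{|X_t|} C_{ij} = \frac{1}{|\hat{X}_t|}, \qquad \sum_{j=1}^{|\hat{X}_t|} C_{ij} = \frac{1}{|X_t|}, \qquad C_{ij} \geq 0.$$

The $L^\infty$ Wasserstein distance is defined as

$$d^W_\infty(X_t, \hat{X}_t) = \inf_C \max_{x_i \in X_t,\, \hat{x}_j \in \hat{X}_t} \tilde{C}_{ij}\, d(x_i, \hat{x}_j),$$

where $\tilde{C}_{ij} = 1$ if $C_{ij} > 0$ and $\tilde{C}_{ij} = 0$ if $C_{ij} = 0$.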
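The two metrics can be illustrated with a short Python sketch using the Euclidean distance. The Hausdorff function follows the definition directly. The Wasserstein function is restricted, as an assumption of this sketch, to the special case where the two sets have the same number of elements: the optimal transport matrix is then a permutation matrix scaled by 1/n, so a brute-force search over permutations suffices for small sets. The general case with differing cardinalities requires a linear programme and is not shown. Function names are hypothetical.

```python
import itertools
import math


def hausdorff(truth, estimates):
    """Hausdorff distance between two finite, non-empty sets of points."""
    forward = max(min(math.dist(x, e) for e in estimates) for x in truth)
    backward = max(min(math.dist(e, x) for x in truth) for e in estimates)
    return max(forward, backward)


def wasserstein_equal_size(truth, estimates, p=2):
    """L^p Wasserstein distance for two point sets of equal cardinality.

    Searches all assignments, so it is only practical for a few targets.
    """
    n = len(truth)
    if n != len(estimates):
        raise ValueError("this sketch only handles sets of equal size")
    best = min(
        sum(math.dist(truth[i], estimates[perm[i]]) ** p for i in range(n))
        for perm in itertools.permutations(range(n))
    )
    return (best / n) ** (1.0 / p)
```

For example, with true targets at (0, 0) and (10, 0) and estimates at (10, 1) and (0, 1), both functions return 1. Adding a spurious estimate at (5, 0) raises the Hausdorff distance to 5, since that estimate is 5 units from the nearest true target.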

7.2.3 Simulated Examples

This section demonstrates results on estimating target locations from the estimated PHD

(step 4 of the algorithm). Trajectories of targets have been simulated, and noise has been

added to generate the measurements. The k-means and EM algorithms have been run on the

particle cloud outputs within the iteration of the PHD filter to obtain target estimates. The

run time has been measured for both of the algorithms at each iteration. To determine the

accuracy of the algorithms, we compute the Hausdorff and Wasserstein distances between

the estimates and the true trajectories, using the Euclidean distance to determine individual

distances between targets. Comparing the error metrics between the outputs of the k-means

and EM algorithms will show if there is any significant difference in the performance.

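For simplicity, our simulated examples use linear Gaussian dynamics with the following state space model:

$$\mathbf{x}_t = \begin{pmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{pmatrix} \mathbf{x}_{t-1} + \begin{pmatrix} T^2/2 & 0 \\ T & 0 \\ 0 & T^2/2 \\ 0 & T \end{pmatrix} v_{t-1}, \tag{7.2}$$

and observation model:

$$z_t = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \mathbf{x}_t + n_t. \tag{7.3}$$

$v_t$ and $n_t$ are the uncorrelated process and measurement noises, respectively.

The state vector is defined as the 2D position and velocity vector of the target:

$$\mathbf{x}_t = (x_t, \dot{x}_t, y_t, \dot{y}_t)^T. \tag{7.4}$$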

Example 1

The first example demonstrates how the time complexities of the algorithms are affected by

increasing the number of targets. The program was implemented in C and run on a 2.8 GHz Mobile Pentium 4 HT with 512 MB system memory and 1 MB cache, although the code was not optimised. We assign 50 particles per target and an additional 40 particles

for newborn targets randomly distributed across the state space. We start with one target,

and introduce another at increments of 30 iterations, until there are nine in total (see Figure

7.2). The graph in Figure 7.1 shows the time taken to estimate the target locations using the

EM and k-means algorithms. The EM algorithm has a quadratic complexity in the number

of targets, whereas the k-means algorithm has a linear complexity in the number of targets.

When the number of targets is low, i.e. between one and three, the computation time to

estimate is fairly similar, but as the number of targets increases, the EM algorithm rapidly

becomes infeasible for realtime operations.

Because there is no clutter in this example, the estimated number of targets is the same

as the number of measurements, and the Hausdorff and Wasserstein distances are the same.

The simulated measurements and estimated positions for the k-means and EM algorithms

are shown in Figures 7.3 and 7.4. By inspection, we can see that there are fewer spurious

estimates using the k-means algorithm. Figures 7.5 and 7.6 show the Hausdorff distances

for each of the algorithms, as well as the maximum measurement errors. The error between

the estimates and the true positions is generally better than the error between the measure-

ments and the true positions with both algorithms. Because we are introducing a new target

at every iteration multiple of 30, there are spikes in the Hausdorff distance corresponding

to a poor initial estimate for the target. This rapidly decreases to below the measurement

error after a few iterations, when all the targets are well estimated. The k-means algorithm

provides fewer spurious estimates than the EM algorithm (which can be seen through fewer

spikes in the Wasserstein distance). Overall, in this example, the k-means algorithm has a

much better run-time and is more accurate than the EM algorithm.
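The effect of a spurious estimate on the Hausdorff distance can be illustrated with a short sketch. It uses the standard definition of the Hausdorff distance between finite point sets; the function name and the sample points are assumptions for illustration, not data from the experiments.

```python
import math

def hausdorff(X, Y):
    """Hausdorff distance between two non-empty finite point sets."""
    def directed(A, B):
        # Largest distance from a point of A to its nearest point in B.
        return max(min(math.dist(a, b) for b in B) for a in A)
    return max(directed(X, Y), directed(Y, X))

truth = [(0.0, 0.0), (10.0, 0.0)]
good = [(0.0, 1.0), (10.0, 0.0)]
spurious = good + [(50.0, 0.0)]   # one extra, far-away estimate

print(hausdorff(truth, good))      # 1.0
print(hausdorff(truth, spurious))  # 40.0
```

A single outlying estimate dominates the distance, which is why poorly initialised targets appear as spikes in the error plots.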

Figure 7.1: Target Estimation Example 1. Time Comparison: EM vs k-means. (Plot of time for target estimation against iteration number; series: k-means, EM algorithm.)

Example 2

In our second example, we have four targets with an average of one clutter point per itera-

tion, distributed according to a Poisson model. Figures 7.7 and 7.8 show the measurements,

including clutter points, and target estimates for each of the algorithms. The PHD filter esti-

mated the correct number of targets (four) except in two of the iterations, when it estimated

that there were five targets. Figures 7.9 and 7.10 show that the Hausdorff distance for the

k-means algorithm is generally lower than the measurement error, but the EM algorithm has

estimated poorly in a couple of cases (spikes in the graph above the measurement error).

Figures 7.11 and 7.12 show the Wasserstein L∞ distance, which is the same as the Hausdorff

distance except in the couple of iterations where the PHD filter has incorrectly estimated

the number of targets, which are shown by the two spikes on the right of the graph. In this

case, all the weights are assigned to the outliers. Again, the k-means algorithm outperforms

the EM algorithm in run-time, with results similar to Figures 7.1 and 7.2 from iterations

90−120, where there are four targets.

Figure 7.2: Target Estimation Example 1. True number of targets from PHD filter. (Plot of the number-of-targets estimate against iteration number; series: target mass.)

7.2.4 PHD Filter Estimated Target Number

The estimated target number from the PHD filter is obtained by computing the total particle

mass (Table I, step 3). If this number is different to the actual number of particle clusters,

then incorrect state estimates may be obtained. A simple example of this with k-means is

when the estimated number of targets is 1 and the actual number of clusters is 2. The state

estimate in this case will be the mean, which will be somewhere between the two clusters.

Similar poor performance will be exhibited with the EM algorithm. When the number of

estimates is higher than the actual number, incorrect state estimates will be given.
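The mismatch described above can be reproduced with a small sketch. It assumes a plain Lloyd's k-means with a simple deterministic initialisation and hypothetical particle data; it is not the implementation used for the experiments.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on 2-D points; centres start at k evenly spaced samples."""
    step = max(1, len(points) // k)
    centres = [points[i * step] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centres]
            groups[d.index(min(d))].append(p)
        centres = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return centres

# Two particle clusters around x = 0 and x = 100 (hypothetical data).
particles = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0),
             (100.0, 0.0), (101.0, 0.0), (99.0, 0.0)]
weights = [1.0 / 6] * 6   # total mass close to 1, although two clusters exist

n_targets = round(sum(weights))             # target number from particle mass
print(n_targets, kmeans(particles, n_targets))  # one estimate, between the clusters
print(kmeans(particles, 2))                     # one estimate per cluster
```

With the target number underestimated, the single state estimate falls midway between the two clusters and corresponds to neither target.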

In our examples, we have used a high probability of detection (pD ≈ 1) and low clutter.

We have assumed that the false alarms have been filtered out, which is only reasonable in

low uncorrelated clutter. In high clutter environments, smaller clusters of particles form

around false alarms and the estimated number of targets and their states can be incorrect.

A practical application on sonar data recently demonstrated that the PHD filter with track

continuity gives comparable performance to tracking with Kalman filters when the clutter is low [26].

Figure 7.3: Target Estimation Example 1. Measurements and estimated positions: EM algorithm. (Plot of target estimates against x position; series: measurements, EM target estimates.)

Figure 7.4: Target Estimation Example 1. Measurements and estimated positions: k-means. (Plot of target estimates against x position; series: measurements, k-means target estimates.)

Figure 7.5: Target Estimation Example 1. Hausdorff Distances: EM algorithm. (Plot of Hausdorff target error against iteration number; series: measurement error, EM error.)

If the number of clusters is not known, it would be useful to be able to determine this

directly from the data. Bouman's unsupervised algorithm for modelling Gaussian mixtures [73]

uses a goodness-of-fit measure called the Rissanen criterion or Minimum

Description Length (MDL) estimator. This works by attempting to find the model order

which minimizes the number of bits required to code the data samples and parameter vector.

The final number of clusters chosen is the value which minimizes the MDL over possible

values of k.

One method of determining the correct number of clusters for the k-means algorithm is

v-fold cross validation, which computes the mean squared distance between the particles

and their nearest target state estimate for each value of k. Plotted against k, this

distance exhibits a scree-plot pattern: it decreases rapidly as the number of clusters

increases and levels off around the true value.

Figure 7.6: Target Estimation Example 1. Hausdorff Distances: k-means. (Plot against iteration number; series: measurement error, k-means error.)

Although we do not use these techniques here, they could provide an alternative to

using the target number estimate from the PHD filter. An empirical analysis has been

demonstrated on the PHD particle distribution [25]. These techniques involve running the

respective algorithms over a range of values for the number of targets and so they increase

the run-time of the algorithms.

7.2.5 Time Complexity of PHD filter Tracker

The algorithm is initialised with $N_0$ particles drawn from a prior distribution, which requires

$O(N_0)$ calculations. In the prediction step, $N_t$ particles are sampled from one proposal distribution

and $M_t$ from a birth proposal distribution, which requires $O(N_t + M_t)$ calculations.

In the update step, the weights are recalculated, requiring $O((N_t + M_t)|Z_t|)$ calculations.

Figure 7.7: Target Estimation Example 2. Measurements and estimated positions: EM algorithm. (Plot of target estimates against x position; series: measurements, EM target estimates.)

Figure 7.8: Target Estimation Example 2. Measurements and estimated positions: k-means. (Plot of target estimates against x position; series: measurements, k-means target estimates.)

Figure 7.9: Target Estimation Example 2. Hausdorff Distances: EM algorithm. (Plot against iteration number.)

Figure 7.10: Target Estimation Example 2. Hausdorff Distances: k-means. (Plot against iteration number.)

Figure 7.11: Target Estimation Example 2. Wasserstein Distances: EM algorithm. (Plot of Wasserstein target error against iteration number.)

The resampling step requires $O(N_{t+1})$ calculations, where $N_{t+1} = N T_t$ depends on

the estimated number of targets $T_t$. We have used the k-means algorithm here to estimate

the target locations, which has linear time complexity in the estimated number of targets

and number of particles, $O(T_t N_{t+1} n)$, where $n$ is the number of iterations for the k-means

algorithm.

7.2.6 Summary

In our experiments, the k-means has outperformed the EM algorithm for our task of extract-

ing the target states from the PHD filter particles in a linear Gaussian tracking scenario. The

k-means algorithm is faster both in terms of the asymptotic time complexity (k-means is lin-

ear in the number of targets whereas the EM algorithm is quadratic) and in the empirical

analysis shown here. The results from the different error metrics comparing the true trajectories

with the estimated trajectories using the two clustering algorithms showed that the

k-means algorithm has also made fewer errors than the EM algorithm. For a real-time tracking

algorithm with a variable number of targets, the quadratic time complexity for the EM

algorithm rapidly becomes infeasible. In the following section, we opt to use the k-means

algorithm for providing the estimated target states for track continuity and data association.

Figure 7.12: Target Estimation Example 2. Wasserstein Distances: k-means. (Plot of Wasserstein target error against iteration number.)

7.3 Track Continuity

This section addresses step 5 of the algorithm shown in Table I, namely associating target

state estimates with target tracks. The data association problem in multiple target tracking

usually involves ensuring that the correct measurement is given to each stochastic filter so

that the trajectories of each target can be accurately estimated. This is referred to as mea-

surement to track association. The three main approaches in the literature are the Nearest

Neighbour Standard Filter (NNSF), the Joint Probabilistic Data Association Filter (JPDAF)

and the Multiple Hypothesis Tracking filter (MHT filter) [5]. Before describing these, the

required terminology is briefly outlined.

Each track $i$ has an associated innovation covariance $I_{t,i}$, which defines a validation

region

$$V_{t,i}(\gamma) := \{ z : [z - z_{t|t-1,i}]^T (I_{t,i})^{-1} [z - z_{t|t-1,i}] \le \gamma \}. \tag{7.5}$$

The predicted measurement $z_{t|t-1,i} = H x_{t|t-1,i}$ is obtained by projecting the previous estimate

using the motion model ($x_{t|t-1,i} = F x_{t-1,i}$) and then using the observation function $H$.

The difference between the new observation and the predicted measurement is called the

innovation: $\nu_{ji} = |z_{t,j} - z_{t|t-1,i}|$. The set of targets at time $t$ is $X_t = \{x_{t,1}, \ldots, x_{t,T_t}\}$, and the set

of measurements is $Z_t = \{z_{t,1}, \ldots, z_{t,m_t}\}$.
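The validation test in (7.5) amounts to thresholding a squared Mahalanobis distance. A minimal sketch, assuming NumPy and an illustrative innovation covariance (the function name `in_gate` and the numbers are assumptions):

```python
import numpy as np

def in_gate(z, z_pred, innov_cov, gamma):
    """True if z lies in the validation region (7.5) around z_pred."""
    nu = np.asarray(z, dtype=float) - np.asarray(z_pred, dtype=float)
    d2 = nu @ np.linalg.solve(innov_cov, nu)   # squared Mahalanobis distance
    return bool(d2 <= gamma)

I_cov = np.diag([4.0, 1.0])   # assumed innovation covariance
z_pred = [0.0, 0.0]
print(in_gate([3.0, 0.0], z_pred, I_cov, gamma=4.0))  # True  (d2 = 2.25)
print(in_gate([0.0, 3.0], z_pred, I_cov, gamma=4.0))  # False (d2 = 9.0)
```

The same displacement is accepted along the axis with the larger variance and rejected along the other, which is the purpose of weighting by the inverse covariance.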

The NNSF simply takes the validated measurement nearest to the predicted measure-

ment for updating each of the target states. This can result in problems as the nearest

validated measurement may be the same for two different targets. The Joint Probabilistic

Data Association Filter computes the joint probabilities for all the pairings between the pre-

dicted measurements and estimated target states. This technique also has to consider false

alarms from spurious measurements.
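The conflict that can arise with nearest-neighbour assignment is easy to reproduce. The sketch below is a simplification: it uses Euclidean distance and omits the validation gate, and the track and measurement positions are hypothetical.

```python
import math

def nearest_neighbour(predicted, measurements):
    """For each predicted measurement, the index of the closest measurement."""
    return [
        min(range(len(measurements)), key=lambda j: math.dist(p, measurements[j]))
        for p in predicted
    ]

predicted = [(0.0, 0.0), (2.0, 0.0)]      # two closely spaced tracks
measurements = [(1.2, 0.0), (9.0, 0.0)]   # the second measurement is far from both
print(nearest_neighbour(predicted, measurements))  # [0, 0]
```

Both tracks claim measurement 0, so one of them is updated with a measurement that belongs to the other target.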

The ideal MHT filter maintains probabilities of all possible associations at each time

step. Unlike the NNSF and JPDAF, MHT does not just consider the probabilities from the

previous time step, which allows for backtracking. It also allows for track initiation. In

practice, it is not feasible to keep track of all possible hypotheses as the time and compu-

tational complexity grows exponentially. Techniques for reducing the complexity include

gating (ignore measurements outside validated regions), pruning (eliminating low proba-

bility hypotheses) and merging (combining hypotheses into a single track).

The techniques for data association we will consider here are based on peak-to-track

association. In the PHD filter, estimates of the target locations are given at each time step,

but track continuity is not maintained. We present two methods for enabling identification

of the same targets between frames based on the target estimates provided by the PHD filter.

These techniques take advantage of the ability of the PHD filter to estimate the number and

locations of targets, and to filter out the clutter. The complexity of these techniques is less

than the MHT and JPDA filters because the target states are provided by the PHD filter

algorithm. We have assumed that the false alarms are filtered out by the PHD filter and

there is no backtracking.

The previous section showed that the k-means algorithm was a clear favourite for es-

timating the target states from the particle PHD estimate. We use it here as the preferred

choice for associating the target tracks between frames.

7.3.1 Particle Labelling Association

Our first method for data association is based on the observation that in the particle repre-

sentation of the multimodal density, the particles representing one of the modes will tend to

track that mode if the motion of the particles is modelled well. In previous implementations

of the Particle PHD filter, clustering techniques have been employed to extract the peaks of

the PHD distribution. Here, we extend this idea to assign labels to the particles based on a

partitioning of the data created by the k-means clustering technique.

The method can be informally explained as follows. At each iteration, partition the

particles in the position domain and give each particle in the same partition the same label

(partitioning is initially only done in the position domain as the variance of the particle

distribution is usually lower than in the velocity domain, so is a better discriminator, and

it is also computationally faster to do so). In subsequent iterations, when resampling, give

the children of a particle the same label as its parent. After resampling, repartition the data,

and if the majority of the particles in one partition have the same label, then associate these

partitions.
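The majority rule in this informal description can be sketched as follows. This is a simplified illustration under stated assumptions: labels are plain strings, a strict majority is required, and the refinements of Table V (the threshold on matrix A and the tie-breaking matrix C) are left out.

```python
from collections import Counter

def associate_partitions(inherited_labels, new_partition):
    """Map each new partition id to the label held by most of its particles.

    inherited_labels[i]: label particle i inherited from its parent at resampling.
    new_partition[i]:    partition id particle i received from re-clustering.
    A partition without a strict majority label is mapped to None.
    """
    votes = {}
    for label, part in zip(inherited_labels, new_partition):
        votes.setdefault(part, Counter())[label] += 1
    result = {}
    for part, counts in votes.items():
        label, n = counts.most_common(1)[0]
        result[part] = label if 2 * n > sum(counts.values()) else None
    return result

labels = ["A", "A", "A", "B", "B", "A", "NEW", "NEW"]
parts = [0, 0, 0, 1, 1, 1, 2, 2]
print(associate_partitions(labels, parts))  # {0: 'A', 1: 'B', 2: 'NEW'}
```

Partition 1 keeps label B even though one of its particles drifted in from track A, and the partition made up of newborn particles is reported under the newborn label.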

The example state model for the particles used here will be the two-dimensional position

and velocity vector. If the data is not well partitioned in the position domain (i.e. not within

2 standard deviations), then it can be partitioned in the velocity domain; this will help to

keep track of targets that cross each other. This technique is presented in Table V.

Panta et al. [32] presented a technique similar to those described in this chapter, which

was based on labelling the clusters and finding the maximum sum of particles with the same

label from the previous time step (Table V, step 5, matrix A). If there are two partitions

in the current timestep which have mostly labels from the same partition in the previous

timestep, then this can result in the wrong assignment. This problem is reduced here by

also considering the number of particles which have been resampled from the previous

timestep (Table V, step 5, matrix C).

Table V: Track continuity Method 1 (used in conjunction with Table I).

For $i = 1, \ldots, N_{t-1}$:

    Assign prediction labels to the particles, $L^P_t(x^{(i)}_t) := L_{t-1}(x^{(i)}_{t-1})$.

Define prediction partitions $P^P_{t,1}, \ldots, P^P_{t,T_{t-1}} := P_{t-1,1}, \ldots, P_{t-1,T_{t-1}}$.

For $i = 1, \ldots, M$:

    Assign labels to newborn particles, $L^P_t(x^{(i)}_t) := L_{NEW}$.

Define newborn particle partition $P^P_{t,L_{NEW}}$.

Define update partitions $P^U_{t,1}, \ldots, P^U_{t,T_{t-1}+1} := P^P_{t,1}, \ldots, P^P_{t,T_{t-1}} \cup P^P_{t,L_{NEW}}$.

If $x^{(j)}_t \in \mathrm{Child}(x^{(i)}_t)$, then assign label $L^R_t(x^{(j)}_t) := L^U_t(x^{(i)}_t)$.

The resampling partitions are defined accordingly, $P^R_{t,1}, \ldots, P^R_{t,T_{t-1}+1}$.

Step 4. (Multi-Target State Estimation, for $t \ge 1$.)

Determine target state estimates and covariances (see Tables II/III), $(x_{t,1}, S_{t,1}), \ldots, (x_{t,T_t}, S_{t,T_t})$.

If there are state estimates $x_{t,i}$ and $x_{t,j}$ such that $(Hx_{t,i} - Hx_{t,j})^T (H S_{t,i} H^T)^{-1} (Hx_{t,i} - Hx_{t,j}) < 4$,

    then repartition based on velocity.

Assign labels to new partitions $L^E_{t,1}, \ldots, L^E_{t,T_t}$.

Step 5. (Association, for $t \ge 2$.)

Define matrices $A$ and $C$ as follows:

    If $|\{i : x^{(i)}_t \in P^R_{t,j} \cap P^E_{t,k}\}| > \epsilon_1 N$, then $A_{j,k} = 1$, else $A_{j,k} = 0$.

    $C_{j,k} = |\{i : \mathrm{Child}(x^{(i)}_t) \in P^R_{t,j} \cap P^E_{t,k}\}|$.

Associate estimates with tracks as follows:

    If $\sum_k A_{j,k} = 0$, delete track $L_{t,j}$;

    else if $\sum_k A_{j,k} = 1$, associate $P^R_{t,j}$ with $L_{t,k}$ for $k$ such that $A_{j,k} = 1$;

    else if $\sum_k A_{j,k} > 1$, associate $P^R_{t,j}$ with $L_{t,k}$ for $k = \arg\max_k C_{j,k}$;

    declare new tracks for $k \in \{k_1, \ldots, k_n\}$ targets such that $A_{j,k} = 1$.

7.3.2 Estimate-to-Track Association

At each stage of the PHD filter algorithm, the target states are estimated by clustering the

particles and obtaining means of the clusters. The second association method which we

present here uses these estimated states and finds the best association between them and the

predicted estimate derived from projecting the previous estimates with the motion model.

The method proceeds as follows. In step 4 of the particle filter algorithm, the estimated

target locations are found by clustering the data and taking the mean positions as the estimated

set of state vectors for the targets $\{x_{t,1}, \ldots, x_{t,T_t}\}$. Let $F$ be the transition function for

the dynamic model that is used in the prediction model for the particles. Then, before receiving

any measurements, the predicted state vector for the estimated target $x_{t-1,j}$ at time

$t$ is $x_{t|t-1,j} := F x_{t-1,j}$ and the estimated set is $\{x_{t|t-1,1}, \ldots, x_{t|t-1,T_{t-1}}\}$.

The estimated target locations are known at each time step. The goal of the association

stage is to connect these locations between time steps so that there is continuity of identity

for each target. Thus, the association here does not involve the measurements, only the

estimated positions. This has three advantages. First, the estimated positions should give

better estimates than the measurements. Second, spurious measurements due to false alarms

should have been filtered out. Finally, the estimated state vectors have the unobservable di-

mensions such as velocity, which allows for better discrimination when the measurement

positions are close but velocities are different. This technique is presented in Table VI.

Table VI: Track continuity Method 2 (used in conjunction with Table I).

Step 5. (Association, for $t \ge 2$.)

For $j = 1, \ldots, T_{t-1}$:

    Use the state equation to obtain the predicted state estimate $x_{t|t-1,j} := F x_{t-1,j}$.

Predicted target state estimates are $\{x_{t|t-1,1}, \ldots, x_{t|t-1,T_{t-1}}\}$.

For $i = 1, \ldots, T_t$:

    Create validation gate for target state estimate, $V_{t,i}(\gamma) := \{x : [x - x_{t,i}]^T (S_{t,i})^{-1} [x - x_{t,i}] \le \gamma\}$.

Evaluate $\beta_t$, the set of validated one-to-one correspondences between $\{x_{t|t-1,1}, \ldots, x_{t|t-1,T_{t-1}}\}$ and $\{x_{t,1}, \ldots, x_{t,T_t}\}$.

Find association $b_t \in \beta_t$ such that $b_t = \arg\max_{b \in \beta_t} \sum_b \exp\left(-\tfrac{1}{2} (x_{t,i} - x_{t|t-1,j})^T (S_{t,i})^{-1} (x_{t,i} - x_{t|t-1,j})\right)$.

Declare new target tracks, $L_{t,k_1}, \ldots, L_{t,k_n}$, for target state estimates $x_{t,k_1}, \ldots, x_{t,k_n}$ for which no association is made.
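The association step of Table VI can be sketched with a brute-force search over gated one-to-one pairings. This is an illustration under stated assumptions, not the thesis implementation: it assumes NumPy, enumerates every pairing (so it only suits a handful of targets), and the function name, test states and covariances are hypothetical.

```python
import itertools
import numpy as np

def associate(predicted, estimates, covs, gamma):
    """Best gated one-to-one pairing of predicted states with current estimates.

    Returns (pairs, new_tracks): pairs is a list of (j, i) linking predicted
    state j to estimate i; new_tracks lists the estimates left unassociated.
    """
    n_pred, n_est = len(predicted), len(estimates)

    def score(j, i):
        d = np.asarray(estimates[i], float) - np.asarray(predicted[j], float)
        d2 = d @ np.linalg.solve(covs[i], d)
        return np.exp(-0.5 * d2) if d2 <= gamma else None   # None: outside gate

    best_pairs, best_total = [], 0.0
    # Pad with None so that a predicted state may stay unassigned.
    slots = list(range(n_est)) + [None] * n_pred
    for perm in set(itertools.permutations(slots, n_pred)):
        pairs, total = [], 0.0
        for j, i in enumerate(perm):
            if i is None:
                continue
            s = score(j, i)
            if s is None:
                break                      # pairing violates a gate
            pairs.append((j, i))
            total += s
        else:
            if total > best_total:
                best_pairs, best_total = pairs, total
    used = {i for _, i in best_pairs}
    return best_pairs, [i for i in range(n_est) if i not in used]

predicted = [[0.0, 0.0], [10.0, 0.0]]
estimates = [[10.5, 0.0], [0.2, 0.0], [50.0, 50.0]]
pairs, new_tracks = associate(predicted, estimates, [np.eye(2)] * 3, gamma=9.0)
print(pairs, new_tracks)   # [(0, 1), (1, 0)] [2]
```

The third estimate falls outside every gate and is therefore declared a new track. For larger problems an assignment solver would replace the enumeration.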

7.3.3 Simulated Examples

We now demonstrate our two proposed methods for data association and compare the track-

ing results. Simulated trajectories of targets have been generated with added noise to gen-

erate measurements. We use a linear Gaussian model again, and present the estimated

trajectories of the targets. The k-means algorithm partitions the data and provides the target

estimates. The errors for the target estimation were shown in the previous section, so we

concentrate here only on tracking continuity. The particle output for the PHD-filter is the

same in both cases, so the target estimates are the same.

Figure 7.13: Data Association Example 1. Method 1. (Plot of tracking with data association against x position; series: measurements, Tracks 1 to 9.)

Example 1

Our first example demonstrates the tracking on measurements without clutter. Linear tra-

jectories for the targets have been randomly generated. The targets may enter and leave the

field of view, showing the capability for birth and death of targets. In this example, the data

starts with 3 targets and 2 targets enter at time t = 10, one at t = 15 and another at t = 20,

giving a total of 7 targets. Figures 7.13 and 7.14 show results for each of the methods, both

of which manage to follow the correct targets. The first method is initially unable to

keep track of target 7 for a couple of iterations, but tracks it after this. Similarly, with the

second method, target 7 is lost after it enters, but is picked up after a couple of iterations. This

implies that there is an insufficient number of particles in the PHD filter around this target

to estimate it well initially, but the estimates improve quickly.

Figure 7.14: Data Association Example 1. Method 2. (Plot against x position; series: measurements, Tracks 1 to 8.)

Example 2

This section compares the two methods for their ability to track in a cluttered scenario,

where the measurements may be due to noisy observations or false alarms. The data starts

with 3 targets, 2 targets enter at time t = 10, one at t = 15 and another at t = 20, giving

a total of 7 targets. In addition to the measurements from the targets, there are additional

points generated according to a Poisson model with an average of 1 per iteration. The

estimated number of targets from the PHD filter is given in Figure 7.17. This graph shows

that the PHD filter generally gives the correct number, but it may estimate incorrectly due to

false alarms. Figure 7.18 shows the particle output from the PHD filter, the measurements

and the true target positions. The particle clusters are around the correct targets and not

the two clutter points. Figures 7.15 and 7.16 show the tracking and data association for this

example for methods 1 and 2 respectively. These figures show that, over the time period where

the tracker is run, the total number of tracks in the second method is higher than in

the first method. Since the estimates are determined by the PHD filter and clustering, this

comparison shows us the relative performance of track maintenance. As shown

in the figures, the first method is able to maintain tracks for longer and so performs better

than the second method.

Figure 7.15: Data Association Example 2. Method 1. Total number of tracks = 16. (Plot against x position.)

7.3.4 Summary

We have proposed two methods to enable track continuity for the PHD filter, based on

the output of the particle filter algorithm with clustering to partition the particles and pro-

vide the target estimates. Both methods have shown their ability to track sets of targets in

clutter and also handle the introduction of new targets. In the example with no clutter, both

algorithms performed comparably well, but when there is clutter, the first method has maintained

the tracks for longer. This can be explained by the nature of the two methods: the

first method uses the particles directly and keeps track of the individual particle movements

to validate and associate target tracks, while the second method uses the observation covariance

matrix to validate measurements, which is a much less flexible means of associating

measurements.

Figure 7.16: Data Association Example 2. Method 2. Total number of tracks = 30. (Plot against x position.)

7.4 Discussion

This chapter has addressed two fundamental issues for the particle PHD filter algorithm:

target estimation from the PHD filter and data association of the target estimates between

frames. A comparison of the clustering techniques for target estimation from the particles

has shown that the k-means algorithm is an effective technique for extracting target states,

both in terms of the accuracy of the estimates and the time taken to provide these estimates.

In particular, it provides a significant improvement in time complexity over the EM

algorithm when there is a higher number of targets.

Figure 7.17: Data Association Example 2. Estimated Target Number. (Plot of number of targets against iteration number; series: estimated target mass, true number of targets.)

This chapter proposed two novel methods for incorporating track continuity into the Par-

ticle PHD filter. These methods are simpler in complexity than other reported techniques

and have been illustrated using simulated data with clutter. The first of the techniques par-

titions the particles at the target extraction stage into clusters around the individual targets,

and these partitions are used between the frames to enable track continuity. The second

method estimates the target in the next frame via the previous target state estimate and the

motion model followed by a validation procedure. Newborn targets are located using the

additional particles provided by the birth-model. Both techniques have demonstrated their

potential for track continuity. The first technique performed better in the example with

clutter, as it uses the particles directly instead of estimating the targets and then using these

distilled results.

Figure 7.18: Data Association Example 2. Clustering Output. (Clustering output from iteration 27; series: measurements, true target positions, tracks.)

A recent study on the PHD filter has demonstrated that when the probability of detec-

tion, pD, is low, a track can be prematurely destroyed [74]. It may be possible to incorporate

a technique commonly used with the Kalman filter, updating a track when there is no de-

tection, into the PHD filter tracking framework. The track could be propagated with the

state equation so that the track could be maintained. This would involve maintaining more

tracks than those estimated at the current time step. Future work could also consider situa-

tions where false target states are estimated.

Chapter 8

The GM-PHD Filter Multiple Target Tracker

8.1 Introduction

In the last chapter, track continuity methods were developed for the Particle PHD filter.

A similar method is developed in this chapter for the Gaussian mixture implementation

described in chapter 5.

The Gaussian Mixture Probability Hypothesis Density Filter (GM-PHD Filter) described

in chapter 5 provided a closed form solution to the PHD filter recursion for multiple target

tracking [33]. The posterior intensity function is estimated by a sum of weighted Gaussian

components whose means, weights and covariances can be propagated analytically in time.

In particular, the means and covariances are propagated by the Kalman filter.

The original Gaussian Mixture PHD filter algorithm provided a means of estimating

the number of targets and their states at each point in time. The method for determining

the targets simply used the weights of the Gaussian components and did not take into ac-

count temporal continuity. We show that if a target is not detected at each iteration, the

Gaussian components can still track the targets in the presence of some missed detections.

Furthermore, the trajectory of the target in the past, before it has been detected, can also be

determined by keeping the trajectories of each of the Gaussian components.

The original formulation of the GM PHD filter allowed targets to be spawned from ex-

isting targets. For simplicity, we have removed this functionality, although it is anticipated

that the algorithm presented here could be extended to incorporate this scenario.

8.2 The Gaussian Mixture PHD Filter Multiple Target Tracker

In the previous chapter, a method for enabling track continuity for the Particle PHD filter

was developed. This technique operates directly on the empirical PHD distribution, using a

labelling process to identify clusters of particles representing a target. A similar technique

for the Gaussian Mixture PHD filter is presented here, which labels each Gaussian instead

of each cluster of particles; unlike the particle-based method, it can operate in much

higher clutter levels. A comparison of the two techniques on real data will be given in the

next chapter.

The means and covariances of each Gaussian in the mixture are predicted and updated

with the Kalman filter equations. These are given here for convenience. The predicted state

estimate $m_{t|t-1}$ and state covariance $P_{t|t-1}$ at time $t$ are given by

$$m_{t|t-1} = F_{t-1} m_{t-1}, \tag{8.1}$$

$$P_{t|t-1} = Q_{t-1} + F_{t-1} P_{t-1} F_{t-1}^T. \tag{8.2}$$

When measurement $z$ is received, the updated estimate $m_{t|t}$ and covariance $P_{t|t}$ are given by

$$m_{t|t}(z) = m_{t|t-1} + K_t (z - H_t m_{t|t-1}), \tag{8.3}$$

$$P_{t|t} = [I - K_t H_t] P_{t|t-1}, \tag{8.4}$$

$$K_t = P_{t|t-1} H_t^T (H_t P_{t|t-1} H_t^T + R_t)^{-1}, \tag{8.5}$$

where $K_t$ is the Kalman gain.
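Equations (8.1) to (8.5) translate directly into code. A minimal sketch, assuming NumPy; the constant-velocity matrices and noise levels in the example are assumed values chosen only for illustration.

```python
import numpy as np

def kf_predict(m, P, F, Q):
    """Equations (8.1) and (8.2)."""
    return F @ m, Q + F @ P @ F.T

def kf_update(m_pred, P_pred, z, H, R):
    """Equations (8.3) to (8.5)."""
    S = H @ P_pred @ H.T + R                      # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)           # Kalman gain (8.5)
    m = m_pred + K @ (z - H @ m_pred)             # (8.3)
    P = (np.eye(len(m_pred)) - K @ H) @ P_pred    # (8.4)
    return m, P

# Assumed example: state (x, x_dot), unit time step, position-only sensor.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
Q = 0.01 * np.eye(2)
H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])

m, P = kf_predict(np.array([0.0, 1.0]), np.eye(2), F, Q)
m, P = kf_update(m, P, np.array([1.2]), H, R)
```

In the tracker each Gaussian component carries its own mean and covariance, so these two functions would be applied once per component and, in the update, once per measurement.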

The algorithm presented here is initialised in Step 0 and then iterates through Steps 1

Initialisation

Initialise the algorithm with the weighted sum of $J_0$ Gaussians,

$$D_{0|0}(x) = \sum_{i=1}^{J_0} w^{(i)}_0 \mathcal{N}(x; m^{(i)}_0, P^{(i)}_0). \tag{8.6}$$

Each Gaussian in the mixture is given a label,

$$L_0 = \{L^{(1)}_0, \ldots, L^{(J_0)}_0\}. \tag{8.7}$$

The sum of weights,

$$\sum_{i=1}^{J_0} w^{(i)}_0 = T_0, \tag{8.8}$$

is the expected number of targets at the start of the algorithm.

Prediction

In the prediction step, each Gaussian component is predicted with (8.1) and (8.2) to give

$$D_{S,t|t-1}(x) = p_S \sum_{j=1}^{J_{t-1}} w^{(j)}_{t-1} \mathcal{N}(x; m^{(j)}_{S,t|t-1}, P^{(j)}_{S,t|t-1}), \tag{8.9}$$

(8.10)

where pS is the probability of survival. In addition, new Gaussian components are added

for the spontaneous birth model

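$$\gamma_t(x) = \sum_{i=1}^{J_{\gamma,t}} w^{(i)}_{\gamma,t} \mathcal{N}(x; m^{(i)}_{\gamma,t}, P^{(i)}_{\gamma,t}). \tag{8.11}$$

The intensity, $D_{t|t-1}$, at time $t$ is then

$$D_{t|t-1}(x) = D_{S,t|t-1}(x) + \gamma_t(x). \tag{8.12}$$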

The set of labels from the previous time step is concatenated with new labels from the

Gaussians introduced for the spontaneous birth model to form the set of prediction labels,

$$L_{t|t-1} = L_{t-1} \cup \{L^{(1)}_{\gamma,t}, \ldots, L^{(J_{\gamma,t})}_{\gamma,t}\}. \tag{8.13}$$

Update

When the measurements, $Z_t = \{z_{t,1}, \ldots, z_{t,m_t}\}$, are received at time $t$, compute the posterior

intensity,

$$D_{t|t}(x) = (1 - p_D) D_{t|t-1}(x) + \sum_{z \in Z_t} \sum_{j=1}^{J_{t|t-1}} w^{(j)}_t(z) \mathcal{N}(x; m^{(j)}_{t|t}(z), P^{(j)}_{t|t}), \tag{8.14}$$

where the means, $m^{(j)}_{t|t}$, and covariances, $P^{(j)}_{t|t}$, are computed using (8.3) and (8.4), and the

weights are calculated with the PHD filter update equation [33],

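$$w^{(j)}_t(z) = \frac{p_D\, w^{(j)}_{t|t-1}\, \mathcal{N}(z; H_t m^{(j)}_{t|t-1}, R_t + H_t P^{(j)}_{t|t-1} H_t^T)}{\kappa_t(z) + p_D \sum_{\ell=1}^{J_{t|t-1}} w^{(\ell)}_{t|t-1}\, \mathcal{N}(z; H_t m^{(\ell)}_{t|t-1}, R_t + H_t P^{(\ell)}_{t|t-1} H_t^T)}. \tag{8.15}$$

$p_D$ is the probability of detection and $\kappa_t(z) = \lambda_t c_t(z)$ is the clutter intensity, where $\lambda_t$ is the expected number of clutter points and $c_t$ is the

distribution of these across the state space.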
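The weight update (8.15) for a single measurement can be sketched as follows, assuming NumPy. The helper names are hypothetical, and the clutter intensity is passed in as a scalar already evaluated at the measurement.

```python
import numpy as np

def gauss_pdf(z, mean, cov):
    """Density of N(z; mean, cov)."""
    d = np.asarray(z, float) - np.asarray(mean, float)
    norm = np.sqrt(np.linalg.det(2.0 * np.pi * np.asarray(cov, float)))
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm)

def update_weights(z, w_pred, m_pred, P_pred, H, R, p_D, kappa):
    """Weights w_t^{(j)}(z) of (8.15) for one measurement z.

    kappa is the clutter intensity evaluated at z (a scalar here).
    """
    q = np.array([
        p_D * w * gauss_pdf(z, H @ m, R + H @ P @ H.T)
        for w, m, P in zip(w_pred, m_pred, P_pred)
    ])
    return q / (kappa + q.sum())

# Assumed 1-D example: components at 0 and 10, measurement near the first.
H1, R1 = np.eye(1), np.eye(1)
means = [np.array([0.0]), np.array([10.0])]
covs = [np.eye(1), np.eye(1)]
w = update_weights(np.array([0.1]), [0.5, 0.5], means, covs, H1, R1, 0.9, 0.01)
```

Almost all of the updated weight goes to the component near the measurement, and the clutter term keeps the weights for this measurement summing to less than one.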

There are $(1 + |Z_t|) J_{t|t-1}$ Gaussian components, $(1 + |Z_t|)$ for each prediction term. For

each component, the same label as its related prediction component is assigned to form the

set of update labels,

$$L_{t,u} = L^{D_{t|t-1}}_{t|t-1} \cup L^{z_1}_{t|t-1} \cup \ldots \cup L^{z_{|Z_t|}}_{t|t-1}. \tag{8.16}$$

Pruning

The Gaussian components with low weights are eliminated in the pruning stage to ensure

that the complexity of the algorithm does not grow exponentially. Let the weights

$w^{(1)}_t, \ldots, w^{(N_P)}_t$ be those which are below the truncation threshold $T$, and let

$$D_{t|t}(x) := \frac{\sum_{l=1}^{J_t} w^{(l)}_t}{\sum_{j=N_P+1}^{J_t} w^{(j)}_t} \sum_{i=N_P+1}^{J_t} w^{(i)}_t \mathcal{N}(x; m^{(i)}_t, P^{(i)}_t). \tag{8.17}$$
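The rescaling in (8.17) keeps the total mass, and hence the expected number of targets, unchanged after the low-weight components are removed. A minimal sketch, assuming at least one component survives; the function name and the example weights are illustrative.

```python
def prune(weights, threshold):
    """Drop components below `threshold` and rescale the rest as in (8.17).

    Returns (kept_indices, new_weights); the total mass is unchanged.
    Assumes at least one weight is at or above the threshold.
    """
    kept = [i for i, w in enumerate(weights) if w >= threshold]
    total = sum(weights)
    kept_mass = sum(weights[i] for i in kept)
    scale = total / kept_mass
    return kept, [weights[i] * scale for i in kept]

idx, w = prune([0.9, 0.002, 0.6, 0.001], threshold=0.01)
print(idx)      # [0, 2]
print(sum(w))   # the original total mass, up to rounding
```

The means, covariances and labels of the surviving components are left untouched; only their weights are rescaled.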

Merging

In the merging stage, Gaussian components whose distance between the means falls within

a threshold, $U$, defined by the covariance matrix are merged. For example, if the means of

components $i$ and $j$ are such that

$$(m^{(i)}_t - m^{(j)}_t)^T (P^{(i)}_t)^{-1} (m^{(i)}_t - m^{(j)}_t) \le U, \tag{8.18}$$

then merge. (The full procedure is given in [33] [46].) If two or more components still have

the same label $L^{(i)}_t$, then this is given to the one with the largest weight $w^{(i)}_t$ and new labels

are assigned to the other components.
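The merging test (8.18) and one common way of combining the selected components can be sketched as follows, assuming NumPy. The moment-matched merge shown here is an assumed, standard choice for Gaussian mixtures; the full procedure referred to above may differ in detail.

```python
import numpy as np

def should_merge(m_i, P_i, m_j, U):
    """Merging test (8.18)."""
    d = np.asarray(m_i, float) - np.asarray(m_j, float)
    return bool(d @ np.linalg.solve(P_i, d) <= U)

def merge(ws, ms, Ps):
    """Moment-matched merge of several components (an assumed, standard choice)."""
    ws = np.asarray(ws, float)
    w = ws.sum()
    m = sum(wi * mi for wi, mi in zip(ws, ms)) / w
    P = sum(wi * (Pi + np.outer(m - mi, m - mi))
            for wi, mi, Pi in zip(ws, ms, Ps)) / w
    return w, m, P

ms = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]
Ps = [np.eye(2), np.eye(2)]
if should_merge(ms[0], Ps[0], ms[1], U=4.0):
    w, m, P = merge([0.6, 0.2], ms, Ps)   # w = 0.8, m = [0.25, 0]
```

The merged weight is the sum of the component weights, so merging, like pruning, leaves the expected number of targets unchanged.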

State Estimation

Target states are determined from Gaussians whose weights are above a specific threshold.

In addition, the Gaussians that have previously had weights above this threshold are also

taken to be target states. These are identified by their label, i.e. the set of live tracks from

time $t$ is

$$L_t = \{ L^{(i)}_t : w^{(i)}_t > 0.5 \} \tag{8.19}$$

and the set of estimates is

$$X_t = \{ m^{(i)}_t : L^{(i)}_t \in L_j,\ j = 1, \ldots, t \}. \tag{8.20}$$
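The label bookkeeping in (8.19) and (8.20) can be sketched as follows. The data layout (tuples of label, weight and mean) and the function name are assumptions made for illustration.

```python
def extract_states(components, confirmed, threshold=0.5):
    """One step of (8.19) and (8.20).

    components: list of (label, weight, mean) after pruning and merging.
    confirmed:  set of labels whose weight exceeded the threshold at this or
                any earlier time; it is updated in place.
    Returns {label: mean} for every surviving component with a confirmed label.
    """
    confirmed.update(label for label, w, _ in components if w > threshold)
    return {label: mean for label, _, mean in components if label in confirmed}

confirmed = set()
step1 = extract_states([("a", 0.9, (0.0, 0.0)), ("b", 0.2, (5.0, 5.0))], confirmed)
step2 = extract_states([("a", 0.3, (1.0, 0.0)), ("b", 0.2, (5.0, 6.0))], confirmed)
print(step1)   # {'a': (0.0, 0.0)}
print(step2)   # {'a': (1.0, 0.0)}
```

In the second step the weight of component `a` has dropped below the threshold, for example after a missed detection, but it is still reported because its label was confirmed earlier.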

The above procedure allows the determination of the trajectories of the Gaussian com-

ponents in the mixture by keeping the means associated with each identifying tag. In the

original formulation of the GM PHD filter as described in chapter 5, estimates of the tar-

get states were taken at each stage of the algorithm by choosing the components with the

maximum weights. In the version here, we have temporal continuity which enables us to

keep track of targets when their weights fall below the desired threshold. In addition, the

trajectory of the targets in the past can be determined by looking at the previous trajectory

of the Gaussian once its weight is above a given threshold. Once the weight falls below

another threshold, the Gaussian component is deleted, indicating that it does not contribute

significantly to the intensity function and so the target is not likely to still exist. Note that if

$p_{D,k} < 1$, the component is not deleted when a measurement is not received for a target, so

that we can continue to track even with missed detections. If the space requirements for this

do not allow all of the Gaussians to be kept in memory, tracks could be deleted if the weight

was not above a threshold for a specified number of updates. This procedure is significantly

better than the estimate-to-track association used in the particle implementation of the fil-

ter, which only considered estimates in the last frame and relied on the prediction instead of

the updated Gaussian. This shows that the GM PHD filter has the inherent ability to track

multiple targets with track continuity which shall be demonstrated in the simulations.
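The label-based state extraction described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: each Gaussian component is represented by a dictionary with hypothetical keys `label`, `weight` and `mean`, and the function names are not taken from the thesis.

```python
def update_live_tracks(live_labels, components, threshold=0.5):
    """Add labels of components whose weight exceeds the threshold (cf. 8.19)."""
    live_labels |= {c["label"] for c in components if c["weight"] > threshold}
    return live_labels


def extract_states(live_labels, components):
    """Means of components whose label is live now or was live earlier (cf. 8.20)."""
    return [c["mean"] for c in components if c["label"] in live_labels]


# Two time steps: the weight of label 1 drops below 0.5 at the second step,
# but the component is still reported because its label is already live.
live = set()
step_1 = [{"label": 1, "weight": 0.8, "mean": (0.0, 0.0)},
          {"label": 2, "weight": 0.2, "mean": (5.0, 5.0)}]
step_2 = [{"label": 1, "weight": 0.3, "mean": (1.0, 0.0)},
          {"label": 2, "weight": 0.4, "mean": (5.0, 6.0)}]
live = update_live_tracks(live, step_1)
estimates_1 = extract_states(live, step_1)
live = update_live_tracks(live, step_2)
estimates_2 = extract_states(live, step_2)
```

A component pruned from the mixture simply no longer appears in `components`, so its track ends without any extra bookkeeping in this sketch.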

The probability of survival $p_{S,k}$ is adjusted for the expected lengths of the target tracks. When this is too low, target tracks are lost more often and when it is too high, the tracks continue for longer after the target has died. In the SMC version of the PHD filter, it was reported that targets are prematurely destroyed when the probability of detection $p_{D,k}$ is low [74]. This could have been due to the particle mass being used to determine

the number of targets and clustering to determine the state estimates. Similar problems

were not encountered here since the weights of the Gaussians were used to determine the

target states and these Gaussians were assumed to represent targets until the weight of the

Gaussian fell below the pruning threshold.

8.3 Simulations

Simulated examples have been created to test the performance of the GM PHD Filter Multi-

ple Target Tracker and the results of these are compared against the track-oriented Multiple

Hypothesis Tracker [75] with a batch of 10 frames where the log-likelihood ratio was used

to rank tracks and the best global hypothesis was selected for data outputs.

8.3.1 Example 1

In this example, a two-dimensional scenario with an unknown and time varying number of targets has been simulated in clutter over the region $[-1000,1000] \times [-1000,1000]$. The state $x_t = [\,p_{x,t},\ p_{y,t},\ \dot{p}_{x,t},\ \dot{p}_{y,t}\,]^T$ of each target consists of position $(p_{x,t}, p_{y,t})$ and velocity $(\dot{p}_{x,t}, \dot{p}_{y,t})$, while the measurement is a noisy version of the position.

Each target has survival probability $p_{S,k} = 0.9$, detection probability $p_{D,k} = 0.99$ and follows the linear Gaussian dynamics from the previous chapter.

We assume no spawning, and that the spontaneous birth intensity is Poisson with four Gaussian terms distributed across the surveillance region,
\[
\gamma_t(x) = \sum_{i=1}^{4} 0.1\, \mathcal{N}(x; m_{\gamma,i}, P_{\gamma}).
\]
Note that this does not need to sum to one but reflects the expected number of spontaneously appearing targets at time $t$.

The detected measurements are immersed in clutter that can be modelled as a Poisson RFS $K_t$ with intensity
\[
\kappa_t(z) = \lambda_t V u(z), \tag{8.21}
\]
where $u(\cdot)$ is the uniform density over the surveillance region, $V = 4 \times 10^{6}\,\mathrm{m}^2$ is the area of the surveillance region, and $\lambda_t = 5 \times 10^{-6}\,\mathrm{m}^{-2}$ is the average number of clutter returns per unit area, which corresponds to an average of $\lambda_t V = 20$ clutter measurements per scan.
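To make the clutter model concrete, the following sketch checks the arithmetic $\lambda_t V = 20$ and draws one scan of clutter from a Poisson RFS with uniform spatial density. It is an assumption-laden illustration: the function names are hypothetical and the Poisson count is drawn with Knuth's multiplication method, which is adequate for small means but is not claimed to be what was used in the simulations.

```python
import math
import random

AREA = 4e6      # V: area of the surveillance region in m^2 (2000 m x 2000 m)
LAMBDA = 5e-6   # lambda_t: average clutter returns per m^2


def expected_clutter_count(lam=LAMBDA, area=AREA):
    """Average number of clutter points per scan, lambda_t * V."""
    return lam * area


def sample_poisson(mean, rng):
    """Draw a Poisson count with the given mean (Knuth's method)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_clutter(rng, lo=-1000.0, hi=1000.0):
    """One scan of clutter: Poisson count, positions uniform over the region."""
    n = sample_poisson(expected_clutter_count(), rng)
    return [(rng.uniform(lo, hi), rng.uniform(lo, hi)) for _ in range(n)]
```

Calling `sample_clutter(random.Random(0))` returns a list of `(x, y)` points; the list length varies from scan to scan around the mean of 20.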

The Gaussian mixture PHD filter is run with the following pruning parameters: elimination threshold $T = 10^{-5}$, merging threshold $U = 4$, and maximum number of Gaussian terms $J_{\max} = 200$.

Figure 8.1 (top) shows the simulated scenario with the true target trajectories and an

average of 20 clutter points per scan. Figure 8.1 (bottom) gives results of the PHD filter on

a set of measurements over 100 iterations. The dots show the true target locations and the

lines show the estimated trajectories. It can be seen that the GM PHD Filter Tracker has

very few false tracks, can pick up a track very quickly, does not drop the tracks while the

target still exists and eliminates tracks shortly after the target leaves the surveillance region.

Five hundred sets of measurements for these target trajectories have been generated

to compare the two algorithms. The Wasserstein multi-target miss distance, described in

chapter 7, has been used to compare the accuracy of the estimates and also the expected

absolute error in the estimated number of targets.

When the estimated number of targets is incorrect, the Wasserstein distance puts all the

weight on the outliers. Figure 8.2 shows the results of the mean Wasserstein distance over

the 500 measurement sets for each time step. The spikes in the result for the GM-PHD filter

usually indicate that either a new target has entered the scene but has not yet been detected

or has died and has not been eliminated.

Error in Estimating the Number of Targets

The expected absolute error on the number of targets has been calculated for each of the algorithms,
\[
E\,\bigl|\,|\hat{X}_t| - |X_t|\,\bigr|.
\]
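For a fixed time step, this expectation can be approximated by averaging over the Monte Carlo runs. The short sketch below assumes that each run supplies one estimated set and one true set at that time step; the function name and argument layout are hypothetical.

```python
def mean_abs_cardinality_error(estimated_sets, true_sets):
    """Average of | |X_hat| - |X| | over paired runs at one time step."""
    if len(estimated_sets) != len(true_sets) or not true_sets:
        raise ValueError("need the same, non-zero number of runs on both sides")
    total = sum(abs(len(est) - len(tru))
                for est, tru in zip(estimated_sets, true_sets))
    return total / len(true_sets)
```

Only the set sizes enter the measure, so the positional accuracy of the estimates has to be assessed separately, for example with the Wasserstein distance.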

Note that standard performance measures such as the mean square distance error are not

applicable to multi-target filters that jointly estimate number of targets and their states.

Figure 8.3 shows the absolute error in the estimation of the number of targets, averaged

over 500 measurement sets. The GM-PHD filter can reliably estimate the correct number of targets; it has fewer false tracks than the Multiple Hypothesis Tracker and can initiate the correct tracks more easily.

8.3.2 Example 2

In this example, we consider the theoretical constraints of the algorithm and illustrate these through a simulation. Consider a situation where we have two targets, then ideally this would be represented by two Gaussians,
\[
D_t(x) = w_t^{(1)} \mathcal{N}(x; m_t^{(1)}, P_t) + w_t^{(2)} \mathcal{N}(x; m_t^{(2)}, P_t). \tag{8.22}
\]
(For simplicity, it is assumed that the covariance matrix is the same for each Gaussian. This can be achieved through diagonalisation, since the covariance matrix is symmetric and non-negative definite.)

Suppose that the targets cross, then $D_t(x)$ is unimodal with mean $(m_t^{(1)} + m_t^{(2)})/2$ when
\[
(m_t^{(1)} - m_t^{(2)})^T P_t^{-1} (m_t^{(1)} - m_t^{(2)}) < 4,
\]
see [76]. This means that the PHD will fail to distinguish between targets within this separation. Furthermore, these components could actually

be merged into the same Gaussian if the means fall within the merging threshold, U . Thus,

if the tracks of the targets are to be maintained when the targets are too close, alternative

methods for data association need to be used. If the trajectories of the targets are known in

the past, these could be used to separate the tracks after the targets have crossed.
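The unimodality condition can be checked numerically in one dimension. The sketch below is illustrative only: it assumes equal weights and a common standard deviation `sigma`, so that the condition becomes $(d/\sigma)^2 < 4$ for means separated by $d$, and it counts strict local maxima of the unnormalised mixture on a grid. The function names and grid settings are hypothetical.

```python
import math


def mixture(x, d, sigma=1.0):
    """Unnormalised equal-weight mixture with means -d/2 and +d/2."""
    def bump(mean):
        return math.exp(-0.5 * ((x - mean) / sigma) ** 2)
    return bump(-d / 2.0) + bump(d / 2.0)


def count_modes(d, sigma=1.0, half_width=8.0, steps=3201):
    """Count strict local maxima of the mixture on a uniform grid."""
    xs = [-half_width + 2.0 * half_width * i / (steps - 1) for i in range(steps)]
    f = [mixture(x, d, sigma) for x in xs]
    return sum(1 for i in range(1, steps - 1)
               if f[i] > f[i - 1] and f[i] > f[i + 1])


modes_close = count_modes(1.5)  # (d / sigma)^2 = 2.25 < 4
modes_apart = count_modes(3.0)  # (d / sigma)^2 = 9 > 4
```

With a separation of 1.5 standard deviations the intensity has a single peak at the midpoint, whereas at 3 standard deviations the two peaks are resolved.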

A simulation of the above scenario has been created, with Gaussians from the spontaneous birth included to ensure that if a track is lost, then it can be recaptured. Targets 1 and 2 are born at the same time but at two different locations. These two targets travel along straight lines and their tracks cross at $k = 53$ s, see figure 8.4 for the paths of the targets.

Two sets of measurements have been generated to show how the tracker behaves with crossing targets; figure 8.5 shows the crossing region with two outcomes. In the first outcome, the target trajectories are correctly estimated through the crossing point. In the second outcome however, whilst the estimates from the GM-PHD filter are not affected, the

tracks follow the wrong trajectories after the crossing point. It is anticipated that this prob-

lem could be resolved by associating tracks from predictions before the Gaussians are in

the merging region with estimates after the targets have crossed, similar to techniques used

with multiple hypothesis tracking.

8.4 Conclusions

An algorithm has been presented for tracking multiple targets in high clutter density which

has the ability to estimate the number of targets, track the trajectories of the targets over

time, operate with missed detections and give the trajectories of the targets in the past once

a target has been identified. It has been shown to outperform the track-oriented Multiple Hypothesis Tracker in its ability to operate in clutter with fewer false tracks, and it can initiate and eliminate targets more accurately. The theoretical constraints of the proposed tracking

algorithm have been discussed in the case of crossing targets. It is anticipated that the prob-

lem of retaining the correct target identity in this scenario can be resolved by considering

the previous trajectories of targets.

Figure 8.1: True target positions (lines) and measurements (crosses) (top). GM-PHD estimated target trajectories (lines) and true positions (crosses) (bottom). Horizontal axis: time step.

Figure 8.2: Mean Wasserstein Distance. Horizontal axis: time step.

Figure 8.3: Absolute Error in Target Number Estimate. Horizontal axis: time step.

Figure 8.4: Example 2. Horizontal axis: x co-ordinate.

Figure 8.5: Example 2. Horizontal axis: x co-ordinate.

Chapter 9

Multiple Target Tracking in Sonar Images

9.1 Introduction

Underwater vehicles can be fitted with a range of sensing equipment, including sonar and

video. As the vehicles traverse through the water column, the sensing equipment is used

to provide sequences of images of the scene. The sequences of data obtained from the

underwater vehicles need to be interpreted to gain an understanding of the environment in

which the vehicles are deployed.

One of the important tasks is to identify objects on the seabed or in the water column

which would need to be avoided in path planning and navigation [3] [77]. In the case

where the navigation of the vehicle is determined by its current environment and the path

of the vehicle is determined by the incoming data, tracking algorithms are required so that

obstacles can be avoided. Three approaches for tracking obstacles in sequences of sonar

images are considered in this chapter, the first of which uses an association technique to

assign measurements to single-target filters, the second uses the Particle PHD filter and

estimate-to-track association presented in chapter 7, and the third uses the Gaussian Mixture

Multiple Target Tracker presented in chapter 8. Measurements are found by pre-processing

the sonar data to find potential objects based on their size and reflected intensity.

The tracking output of each of the algorithms is compared on real and simulated sonar

data. The accuracy of the tracking algorithms can be compared directly since the target

locations are known in the simulated data. An initial comparison between the Kalman filter approach and the Particle PHD filter on real sonar data was given in [26], and between the Particle PHD filter and GM-PHD filter in [36]. It is shown that the Particle PHD filter with estimate-to-track association gives comparable tracking performance to the conventional Nearest Neighbour approach with Kalman filters, and that the GM-PHD filter gives comparable performance in higher levels of clutter.

9.2 Tracking and Data Association

At each point in time $t$, we have a set of noisy measurements, $Z_t = \{z_{t,1}, \ldots, z_{t,m_t}\}$, where $z_{t,j}$ represents a single target measurement or false alarm and $m_t$ is the number of observations at time $t$. From this set of measurements, we must estimate how many targets $T_t$ there are and their set of locations, $X_t = \{x_{t,1}, \ldots, x_{t,T_t}\}$, where $x_{t,i}$ represents the state of an individual target and $T_t$ is the number of targets at time $t$. The first approach considered involves

assigning a single-target stochastic filter to each estimated target and uses a data association

technique to ensure that each filter is assigned the correct measurement. The mechanism

for distributing the correct measurement to each filter is called data association or, more

specifically, measurement-to-track association. This approach is compared with the two

different PHD filter implementations with track continuity.

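A linear Gaussian dynamic model with the following state space form is used:
\[
x_{t+1} =
\begin{pmatrix}
1 & T & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & T \\
0 & 0 & 0 & 1
\end{pmatrix} x_t +
\begin{pmatrix}
T^2/2 & 0 \\
T & 0 \\
0 & T^2/2 \\
0 & T
\end{pmatrix} v_t, \tag{9.1}
\]
and observation model:
\[
z_t =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0
\end{pmatrix} x_t + w_t. \tag{9.2}
\]
$v_t$ and $w_t$ are the process and measurement noises, respectively, and are uncorrelated. The state vector is defined as the 2D position and velocity vector of the target:
\[
x_t = \left( x_t \ \ \dot{x}_t \ \ y_t \ \ \dot{y}_t \right)^T. \tag{9.3}
\]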
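The constant-velocity model above can be written out directly. The sketch below is a plain-Python illustration with the state ordered as (x, x-velocity, y, y-velocity); it assumes the usual constant-velocity form of the noise gain matrix with rows for both position and velocity, and the function names are hypothetical.

```python
def make_model(T):
    """Transition F, noise gain G and observation H for sampling period T."""
    F = [[1, T, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 1, T],
         [0, 0, 0, 1]]
    G = [[T * T / 2, 0],
         [T, 0],
         [0, T * T / 2],
         [0, T]]
    H = [[1, 0, 0, 0],
         [0, 0, 1, 0]]
    return F, G, H


def matvec(A, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def propagate(x, v, T):
    """One step of the dynamic model: F x + G v."""
    F, G, _ = make_model(T)
    return [a + b for a, b in zip(matvec(F, x), matvec(G, v))]


def observe(x, w, T=1.0):
    """Observation model: H x + w, i.e. noisy position only."""
    _, _, H = make_model(T)
    return [a + b for a, b in zip(matvec(H, x), w)]
```

For example, a target at the origin with velocity (1, 2) and no process noise moves to position (0.5, 1.0) after a step of T = 0.5, and only that position is observed.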

9.2.1 Tracking with Kalman filters

The first multiple tracking model assigns one Kalman filter per object and manages the

measurements for each filter with a measurement-to-track data association technique. The

Kalman filter has been chosen as the single-target filter, since it has been shown to be

effective for multiple-target tracking with measurement-to-track association in sonar [78].

Other techniques, such as the Extended Kalman filter or particle filter could also have been

used. The procedure used for the tracking algorithm is the Nearest Neighbour Standard

Filter (NNSF) [5], and the implementation is outlined in figure 9.1. After the sonar data

is acquired, it is segmented and features are extracted. Depending on whether targets are

expected in a region, a Kalman filter is either initialised or measurements are associated.

Regions of interest are set to determine which areas to segment more carefully in subsequent

iterations.

The measurement-to-track data association problem in multiple target tracking involves

ensuring that the correct measurement is given to each stochastic filter so that the trajecto-

ries of each target can be accurately estimated. The approach used for data association here

is based on a variant of nearest neighbour. The predicted measurements, $\hat{z}_{t|t-1}$, are calculated for each target by projecting the previous estimate using the state transition function, $\hat{x}_{t|t-1} = F\hat{x}_{t-1}$, and then the observation function, $\hat{z}_{t|t-1} = H\hat{x}_{t|t-1}$. The innovation covariance $S_t$ is found for each track,
\[
S_t = H P_{t|t-1} H^T + R. \tag{9.4}
\]
A validation region is defined for each track, given by
\[
V_t(\gamma) := \{ z : [z - \hat{z}_{t|t-1}]^T S_t^{-1} [z - \hat{z}_{t|t-1}] \le \gamma \}. \tag{9.5}
\]
From this we can choose a validation gate $\nu$. For each track, select the observations that either fall within the validation gate or intersect the tracked object. Each track will then have a list of zero or more possible observations for the update. If there are no observations to update the track, then the track is deleted. If there is one observation, then update the track with the new observation. If there are two or more observations as possible updates, choose the closest.

Figure 9.1: Kalman Filter Tracking Procedure. (Flowchart blocks: Sonar Data Acquisition, Coarse Segmentation, Feature Extraction, Initialise Kalman Filters, Segmentation of Regions of Interest, Feature Extraction, Measurement-to-track Association, Update Kalman Filters, Set Regions of Interest, Track Estimates.)
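The gating and nearest neighbour rule can be sketched for a single track as follows. This is a simplified illustration under assumptions that are not stated in the text: measurements are two-dimensional, "closest" is taken to mean smallest gate distance rather than Euclidean distance, the test for observations that intersect the tracked object is omitted, and the function names and the value of `gamma` used in the example are hypothetical.

```python
def gate_distance(z, z_pred, S):
    """Quadratic form [z - z_pred]^T S^{-1} [z - z_pred] for 2-D measurements."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det <= 0:
        raise ValueError("innovation covariance must be positive definite")
    dx, dy = z[0] - z_pred[0], z[1] - z_pred[1]
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def associate(z_pred, S, observations, gamma):
    """Return the validated observation nearest to z_pred, or None.

    None means that no observation fell inside the validation region,
    in which case the track is deleted in the procedure described above.
    """
    best, best_d = None, None
    for z in observations:
        dist = gate_distance(z, z_pred, S)
        if dist <= gamma and (best_d is None or dist < best_d):
            best, best_d = z, dist
    return best


S = ((1.0, 0.0), (0.0, 1.0))
chosen = associate((0.0, 0.0), S, [(5.0, 5.0), (1.0, 0.0), (0.5, 0.5)], gamma=9.0)
missed = associate((0.0, 0.0), S, [(5.0, 5.0)], gamma=9.0)
```

In the example, two of the three observations fall inside the gate and the one with the smaller gate distance is selected; with only the distant observation available, no update is possible.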

9.2.2 Tracking with the Particle PHD filter

The Particle PHD filter algorithm with state estimation and estimate-to-track association

from chapter 7 is used here and the procedure for the tracking implementation is given in

figure 9.2. The main difference between this approach and the NNSF is that all the extracted features are used directly as input to the filter and estimates are associated instead of measurements. The number of particles adaptively changes to be proportional to the number of targets, with $N = 1000$ particles per target. The particles are propagated with the prediction and update steps and k-means is used to repartition the particles. Partitions are associated to a target track if the majority of the particles in the new partition correspond to the particles propagated from a partition in the previous time-step. This approach was chosen instead of the particle labelling association presented in chapter 7 since it is simpler.

Figure 9.2: Particle PHD Filter Tracking Procedure. (Flowchart blocks: Sonar Data Acquisition, Coarse Segmentation, Feature Extraction, PHD filter Estimates, Set Regions of Interest, Segmentation of Regions of Interest, Feature Extraction, Estimate-to-track Association, Track Estimates.)

Figure 9.3: GM-PHD Tracking Implementation. (Flowchart blocks: Sonar Data Acquisition, Segmentation, Feature Extraction, GM PHD Filter Estimates, Track Estimates.)
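The majority rule for associating partitions with tracks can be sketched as below. The sketch assumes that each particle remembers the track of the partition it was propagated from and that k-means has already assigned each particle to a new cluster; how ancestry is carried through resampling is not modelled, and the function name and data layout are hypothetical.

```python
from collections import Counter


def associate_partitions(prev_track_of_particle, new_cluster_of_particle):
    """Map each new cluster to the track held by a strict majority of its particles.

    Both arguments are sequences with one entry per particle. A cluster with
    no strict majority is mapped to None, i.e. it is not given an existing track.
    """
    votes = {}
    for track, cluster in zip(prev_track_of_particle, new_cluster_of_particle):
        votes.setdefault(cluster, Counter())[track] += 1
    result = {}
    for cluster, counts in votes.items():
        track, n = counts.most_common(1)[0]
        result[cluster] = track if 2 * n > sum(counts.values()) else None
    return result


mapping = associate_partitions([1, 1, 1, 2, 2, 2, 2, 1],
                               [0, 0, 0, 1, 1, 1, 0, 1])
```

Here cluster 0 inherits track 1 (three of its four particles) and cluster 1 inherits track 2, even though one particle has switched partition in each case.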

9.2.3 Tracking with the GM-PHD Filter

The GM-PHD filter multiple target tracking algorithm presented in chapter 8 is used and

the procedure for the tracking implementation is given in figure 9.3. Since the GM-PHD

filter tracker can operate in higher density clutter than the particle PHD filter, the feature

extraction process used is simpler, as described in the next section.

9.3 Implementation on Forward-Looking Sonar

The multi-target tracking methods have been implemented for tracking obstacles in forward-

looking sonar. Since the Particle PHD filter requires the k-means algorithm, the clutter lev-

els have been reduced so that inaccurate estimates are minimized. The segmentation and

feature extraction methods for determining the measurements are the same for the Nearest

Neighbour and Particle PHD filter approaches but a simpler method is used for the GM-

PHD filter as it can operate successfully in higher clutter levels. An initial comparison of

the Particle PHD filter and Nearest Neighbour techniques is given on simulated data with

estimates of the errors before presenting the results on real sonar data.

9.3.1 Simulated Sonar Data

The tracking methods have been run on simulated forward-looking sonar data. The advan-

tage of using simulated data is that it allows various realistic scenarios and trajectories to

be created easily. The exact locations of the vehicle and objects are known, and thus the

accuracy of the tracker can be determined. This will allow us to directly compare the results

of the multi-target tracking algorithms to the ground truth data.

A sequence of forward-looking sonar images has been generated using the Sonar Simulator developed by Bell [69] which has the capability of modelling sonar in complex underwater terrain. An artificial seabed is modelled by a $100 \times 100\,\mathrm{m}^2$ textured image, see figure 9.6. Spherical shaped objects of radius 0.5 m have been placed on the seabed.

The specification of the sonar has been modelled to be as close as possible to the sonar equipment used to provide the real data. The range of the sonar is 40 m and it scans a sector of 120 degrees, see figure 9.5 for an example image. A sinusoidal trajectory with added noise has been simulated for the sonar as though it were fitted onto an Autonomous Underwater Vehicle (AUV), see figure 9.4 for the simulated trajectory with objects.

Figure 9.4: Simulated Sonar Trajectory with Objects. Axes: x (metres), y (metres). Legend: Sonar Trajectory, Objects.

9.3.2 Real Sonar Data

The sequences of images were obtained from a forward looking multi-beam sonar which

was fitted to an Autonomous Underwater Vehicle (AUV). The vehicle was travelling at a

rate of approximately 1 knot over a region with stationary targets on the seabed. The sonar

was mounted on the front of the AUV scanning forwards for a range of 40m and was angled

towards the seabed. The sonar scanned an angular region of 120 degrees, using 120 beams

each with a vertical beam width of 1 degree and a horizontal beam width of 40 degrees.

The sonar had an operating frequency of 600 kHz.

Figure 9.5: Simulated Sonar Image.

Figure 9.6: Artificial seabed.

Figure 9.7: Original sonar image (top). Image after filtering (middle). Resulting image after segmentation with regions of interest shown as the boxes and potential targets as the white segmented areas (bottom).

Feature extraction for the Kalman Filter and Particle PHD Filter

Multi-beam sonar images can be very noisy, due to reverberation from the seabed, surface

or water column and so need to be filtered if they are to be of use. See figure 9.7 (top)

for an example sonar image. The objects which we wish to track have a higher reflectivity

property than the surrounding environment, and so the measurements can be determined

by thresholding the sonar images on intensity. A two-layer segmentation has been used to

identify areas of interest, the first of which uses a fast segmentation algorithm based on the

intensity of the returned energy. The second layer more selectively segments regions where

objects are expected based on previous knowledge.

To reduce the speckle noise, the images are first filtered. A mean filter was found to be

effective at removing the noise and has a relatively cheap computational cost, see figure 9.7

(middle) for an example of an image after filtering. A threshold is then applied to identify

regions with high reflected energy where there are potential objects.

A double threshold is applied by firstly using an adaptive threshold to identify regions

of high reflectivity and then using a higher threshold to identify the regions with the highest returns. Neighbouring pixels are grouped together to form regions, and the centroids of

these regions are taken as the measurements which will be used as input to the tracking

algorithms. After the images have been segmented and the regions with high reflectivity

have been identified, features of the potential targets can be found, and regions which are

too small to be an obstacle are discarded. The features which we use for tracking in our

application here are the centroid positions of the segmented regions. Other features have

been used in the tracking such as the perimeter and area of the objects [78]; although, for

simplicity we restrict ourselves to the positions of the targets. See figure 9.7 (bottom) for

an example of a segmented image with regions of interest.
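The filtering, thresholding and centroid steps described above can be sketched on a small intensity grid. This is an illustration under several assumptions that the text does not fix: the adaptive threshold is supplied by the caller as `low`, the double threshold is interpreted as keeping a region only if it contains at least one pixel above `high`, pixels are grouped by 4-connectivity, and `min_size` stands in for the test that a region is too small to be an obstacle. All names are hypothetical.

```python
from collections import deque


def mean_filter(img, k=3):
    """k x k mean filter; near the border only pixels inside the image are averaged."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out


def segment_centroids(img, low, high, min_size=2):
    """Centroids (x, y) of 4-connected regions of pixels >= low that
    contain a pixel >= high and have at least min_size pixels."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    centroids = []
    for y0 in range(h):
        for x0 in range(w):
            if seen[y0][x0] or img[y0][x0] < low:
                continue
            queue, region = deque([(y0, x0)]), []
            seen[y0][x0] = True
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and img[ny][nx] >= low):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_size and any(img[y][x] >= high for y, x in region):
                cx = sum(x for _, x in region) / len(region)
                cy = sum(y for y, _ in region) / len(region)
                centroids.append((cx, cy))
    return centroids


image = [[0, 0, 0, 0, 0, 0],
         [0, 5, 9, 0, 0, 0],
         [0, 5, 5, 0, 0, 4],
         [0, 0, 0, 0, 0, 4],
         [0, 0, 0, 0, 0, 0]]
measurements = segment_centroids(image, low=4, high=8)
```

In this toy image the bright 2 × 2 block yields one measurement at its centroid, while the faint two-pixel region on the right is rejected because it never reaches the higher threshold.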

Feature Extraction for the GM-PHD filter tracker

The same double threshold approach is used as above but the measurements are determined by taking the centroids and do not rely on the tracking prediction. Figure 9.8 shows the

sonar image, the image after filtering, and the segmented image from which the measure-

ments are determined by taking the centroids of high intensity regions. The procedure used

for tracking is shown in figure 9.3. In our approach used here, we simply use the double

thresholding described above which results in higher clutter levels. We demonstrate that

the GM-PHD filter copes well in these circumstances and compare the results with those

obtained for the Particle PHD filter in lower clutter levels.

9.4 Results

This section presents the results for both of the tracking algorithms on real and simulated

forward-looking sonar data. For the simulated data, the positions of the targets are known

which enables us to compare the methods. A direct comparison of the errors in the set of

target state estimates from the true target locations for each of the algorithms is given.

9.4.1 Simulated Data

The simulated sonar data provides us with a ground truth with which we can compare the

accuracy of the target estimation from each of the tracking methods. In this example, there

is no clutter and the number of estimates is the same as the number of targets in view.

Previous studies have demonstrated that the Particle PHD filter can operate successfully in higher levels of clutter [79] [29] [30].

Figure 9.8: Forward-scan sonar image (top). Sonar image after filtering (middle). Image after segmentation for GM PHD measurements (bottom).

Tracking Technique          Kalman Filters    Particle PHD filter
Hausdorff Pixel Error       35.274            35.125
RMS Pixel Error             28.26             28.83
Pixel Standard Deviation    6.2199            6.5523

Figure 9.9: Comparison of Errors.

The true positions give the centres of the spherical objects in the image. The trackers, however, estimate the position of the centroid of the highlight of the object from the reflected acoustic energy, which introduces an inherent bias that is reflected in the results. Figures 9.10 and 9.11 show the images with tracking results superimposed. The results of the tracking are very similar, although the nearest neighbour approach with Kalman filters managed to keep some of the tracks longer.

Let $X_t$ be the set of target states at time $t$ and $\hat{X}_t$ be the set of estimated target states. We compare the performance of the algorithms using the $L_2$ pixel errors,
\[
d(x_i, \hat{x}_j) = \sqrt{(x_i^1 - \hat{x}_j^1)^2 + (x_i^2 - \hat{x}_j^2)^2},
\]
between the estimates and true positions. For each iteration, the mean and maximum pixel errors have been calculated. The maximum error here is the same as the Hausdorff distance [72], $\max_{x_i \in X_t} \min_{\hat{x}_j \in \hat{X}_t} d(x_i, \hat{x}_j)$, which gives the tracking error in the worst case. The errors have been averaged over the length of the sequence and the table of results is given in figure 9.9.
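The per-frame errors can be sketched as follows. The sketch computes the one-sided max-min distance exactly as written above and, as an assumption, takes the mean error to be the average distance from each true position to its nearest estimate. Both sets are assumed to be non-empty, as in the simulated example, and the function names are hypothetical.

```python
import math


def pixel_error(x, x_hat):
    """L2 distance between a true position and an estimate, in pixels."""
    return math.hypot(x[0] - x_hat[0], x[1] - x_hat[1])


def worst_case_error(truth, estimates):
    """max over true positions of the distance to the nearest estimate."""
    return max(min(pixel_error(x, e) for e in estimates) for x in truth)


def mean_error(truth, estimates):
    """Average distance from each true position to its nearest estimate."""
    return sum(min(pixel_error(x, e) for e in estimates) for x in truth) / len(truth)


truth = [(0.0, 0.0), (10.0, 0.0)]
estimates = [(3.0, 4.0), (10.0, 1.0)]
worst = worst_case_error(truth, estimates)
average = mean_error(truth, estimates)
```

In this example the nearest-estimate distances are 5 and 1 pixels, so the worst-case error is 5 and the mean error is 3.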

The tracking techniques have given comparable performance in their ability to estimate

the correct position and in the standard deviation of errors. The average error throughout

the sequence was around 30 pixels with a standard deviation of 6 in both cases.

9.4.2 Real Data

The tracking algorithms have been tested on the same sequence of sonar data and in this

section a comparison of the different techniques is given. The images in the sequence are 24-bit colour of size $1276 \times 833$, which were converted to grayscale. A mean filter of size

11×11 was used to reduce the impulse noise before segmenting the image by thresholding.

The measurements obtained by this process are fed into the tracking algorithms.

Kalman filters and Particle PHD filter

Selected frames from the sequence are presented in figures 9.12 and 9.13. In the first frame shown, there are three targets being tracked in each image, and the trajectories on the left are fairly similar. The target on the right has been tracked for longer with the PHD filter than the Kalman filter, although the ability to track without measurements has been removed in the case of the Kalman filter [78]. This was to enable a fairer comparison, since this functionality has not been used with the estimate-to-track PHD filter, although it could be incorporated into future implementations. In the next two images, both techniques have similar target trajectories, although the Kalman filter tracking is smoother. This is due to the weight of the model on the Kalman filter.

GM PHD filter

Figures 9.13 and 9.14 show results of the Particle PHD and GM-PHD filters respectively. The first of these uses a more complex pre-processing procedure for determining the measurements to reduce clutter levels and the second uses simple thresholding which gives more clutter points. The average number of clutter points with the first method was less than 1 and with the second around 5.

While the empirical distribution from the Particle PHD filter can handle high clutter levels and estimate the correct number of targets, estimating the target states relies on clustering techniques which can lead to inaccurate and false estimates being obtained. The target states are determined from the GM-PHD filter by taking the Gaussians with the highest weights. In simulations, it has been shown that individual Gaussians can accurately track the correct targets [46] in high clutter levels. This has a number of advantages over the particle implementation. First, the complexity of the algorithm is lower, since the number of Gaussians used is small (a maximum of 200 after pruning and merging, compared with 1000 particles per target). Second, the means of the Gaussians are known and do not need to be determined through clustering techniques. Finally, individual Gaussians determine the target states and are tracked more reliably through the labelling process compared to track continuity techniques developed for the Particle PHD filter. For these reasons, the GM-PHD filter can perform better under higher clutter levels.

Figure 9.10: Tracking results using Kalman filters on simulated data. Frames 34, 85 and 97 in a sequence of 100 frames.

The number of clutter points in the sequence ranged between 0 and 12 points. Figure

9.15 gives an example of the original, filtered and segmented images used. There are 5 false

alarms in this example.

Figure 9.11: Tracking results using PHD filter on simulated data. Frames 34, 85 and 97 in a sequence of 100 frames.

Figure 9.12: Tracking results using Kalman filters. From left, frames 39, 58, and 98 in a sequence of 100 frames.

Figure 9.13: Tracking results using Particle PHD filter. From left, frames 39, 58, 83 in a sequence of 100 frames.

Figure 9.14: Tracking results using GM PHD filter. From left, frames 39, 58, 83 in a sequence of 100 frames.

Figure 9.15: Example with clutter using GM PHD filter. From left, raw, filtered and segmented image with tracking superimposed.

9.5 Conclusions

The multiple target tracking techniques developed for the Particle PHD filter and GM PHD filter have been demonstrated on real forward-scan sonar data. The algorithms have been compared on both simulated sonar data and real forward-looking sonar data obtained from an Autonomous Underwater Vehicle (AUV). The results demonstrate that the PHD filter can be effectively used for practical multiple target tracking applications and compares well with conventional approaches for multiple target tracking. The identities of the individual target tracks have been maintained using the methods presented in this thesis.

A comparison of the Particle PHD filter with the traditional Nearest Neighbour ap-

proach with Kalman filters has shown that the two methods developed in this thesis give

comparable performance to conventional methods of multiple target tracking. Furthermore,

it is shown that the GM PHD filter multi-target tracker can successfully track the correct

targets in reasonably high levels of clutter without the need for data association, since this

is an inherent property of the algorithm.

The performance of the algorithms shown in chapters 7 and 8 shows that the Gaussian mixture PHD filter can operate in higher levels of clutter than the Particle PHD filter, because of the clustering that the latter requires to determine the target states. Whilst the convergence properties

of chapter 4 are not affected by this, the ability to use the algorithm for target tracking

in clutter is. Since clustering is not required in the Gaussian mixture version, it is much

easier to extract the correct target states by taking the Gaussian components with the largest

weights.

Chapter 10

Conclusions

10.1 Thesis Summary

The random-set framework for multiple target tracking developed by Mahler [76] offers

a distinct alternative to the traditional approach to multiple target tracking by treating the

collection of individual targets as a set-valued state and the collection of individual obser-

vations as a set-valued observation. The set-valued state is predicted and updated at each

time-step based on the set-valued observation. The multiple target posterior can be esti-

mated using a generalisation of the single target Bayesian filtering equations to a multiple

target scenario. This model can also incorporate clutter, or false measurements, into the

framework.

The complexity of computing this recursion grows exponentially with the number of

targets and so the optimal filter must be approximated. To alleviate the complexity of

computing the multi-target posterior, a recursion was derived for the first order moment of

the multi-target posterior distribution, known as the PHD filter.

The Sequential Monte Carlo implementation of the PHD filter [17], known as the Par-

ticle PHD filter, demonstrated that practical applications of the filter were possible. In

chapter 4, a study of the convergence of this algorithm was conducted showing that the em-

pirical measure approximating the PHD converges weakly to the true density. An example

of this algorithm was illustrated with the application of sequentially estimating targets in

forward-looking sonar data in chapter 6.

A closed-form version of the PHD filter for linear-Gaussian target dynamics, called the

Gaussian Mixture (GM) PHD filter, was developed recently [34] to provide a multi-target

tracker without the complexity of the particle filtering approach. Convergence properties

of the GM-PHD filter were shown in chapter 5, and bounds were found for the approximation

stages used to alleviate the computational complexity.

In the Sequential Monte Carlo version of the PHD filter [17], the target estimates needed

to be determined from the particle distribution by using clustering techniques such as the

EM algorithm and k-means. A comparison of the accuracy and time complexity of these

algorithms is given in chapter 7, which showed empirically that the k-means algorithm can

outperform the EM algorithm in both of these properties. In addition to estimating the num-

ber of targets and their states at each point in time, it is also important in tracking scenarios

to know the trajectories of the targets and to be able to distinguish between different targets.

Two novel methods for incorporating track continuity into the Particle PHD filter were pro-

posed. These methods are simpler in complexity than other reported techniques [29] [30]

and have been illustrated using simulated data with clutter in chapter 7.

The Gaussian mixture multi-target tracker, developed in chapter 8, showed that individual

Gaussians within the mixture are able to track targets successfully. Hence the ability to

track is an inherent part of the GM-PHD filter, and this technique can operate with a high

number of false alarms.

The methods developed in this thesis for state estimation and track continuity for both

of the implementations of the PHD filter are demonstrated in chapter 9 for multiple target

tracking in sequences of forward scan sonar. It is shown that these methods can be used for

practical tracking applications and compare well with conventional approaches to multiple

target tracking.

10.2 Current Research

A recent study on the PHD filter showed that the estimate of the number of targets provided

by taking the integral of the PHD over the state space is potentially unstable in the presence

of missed detections and high clutter density when the probability of detection is less than

one [74]. It was argued that the estimate is unstable due to the linearisation formula for

the expected number of targets. Examples of this phenomenon can be seen in chapter 7.

One possible way of resolving this difficulty is to average the estimated number over

several iterations, although the obvious drawback of this is that there may be a delay

before the target number is updated and short tracks may be missed. Since the GM-PHD filter

multi-target tracker presented in chapter 8 did not rely on the target number estimate, this

problem was not encountered.
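The averaging idea, and the delay it introduces, can be seen in a small sketch. It assumes that a raw target number estimate is available at each iteration (for example the integral of the PHD) and smooths it with a sliding window; the function name `smoothed_target_numbers` and the window length are assumptions made for this illustration.

```python
from collections import deque
from typing import Deque, Iterable, List


def smoothed_target_numbers(raw_estimates: Iterable[float], window: int = 3) -> List[int]:
    """Average the raw target number estimates over the last `window`
    iterations and round the average to the nearest integer."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent: Deque[float] = deque(maxlen=window)  # holds the latest estimates only
    smoothed: List[int] = []
    for estimate in raw_estimates:
        recent.append(estimate)
        average = sum(recent) / len(recent)
        # Round half up; the estimates are assumed to be non-negative.
        smoothed.append(int(average + 0.5))
    return smoothed


if __name__ == "__main__":
    # A third target appears at the fourth iteration and stays.
    print(smoothed_target_numbers([2, 2, 2, 3, 3, 3]))
    # A third target is present for a single iteration only.
    print(smoothed_target_numbers([2, 2, 3, 2, 2]))
```

With a window of three iterations, the first sequence only reports three targets one iteration after the third target appears, and the second sequence never reports the short-lived target at all. These are the two drawbacks noted above.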

Mahler [80] [81] derived an extension to the PHD filter in which the probability

distribution of the target number is propagated in addition to the PHD. This filter was named

the Cardinalised PHD (CPHD) filter. A Gaussian mixture approximation to the CPHD fil-

ter has been implemented [82] which extends the GM-PHD filter given in chapter 5. It is

anticipated that the method for multiple-target tracking with the GM-PHD filter proposed

in chapter 8 could also be applied to this implementation of the filter to determine target

tracks. Furthermore, particle filter approximations have been implemented by Mahler et al.

and the state estimation and track continuity methods presented in chapter 7 could prove to

be useful for this implementation.

10.3 Future Work

Techniques developed in this thesis have enabled track continuity for the two PHD filter im-

plementations, either by labelling the different clusters or Gaussians which are propagated

with the filter, or by estimating the set of targets at the next time step and gating. It is antic-

ipated that these techniques will be useful for future implementations of PHD filters. These

techniques consider only the time-step immediately preceding the current one and have not

incorporated methods for backtracking. Backtracking methods, such as those used in multiple

hypothesis tracking (MHT) [75] or probabilistic MHT (PMHT) [7], would allow the previous

trajectories of the targets to be considered and hence crossing and closely spaced targets

to be discriminated more accurately. A recent technique developed for particle filters,

fixed-lag SMC data association [83], uses measurements collected over several time-steps to

determine the correct trajectories of multiple targets and could be used for the Particle

PHD filter.
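The one-step gating mentioned at the start of this section, which looks only at the immediately preceding time-step, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the method of chapter 7: it assumes two-dimensional positions, a circular Euclidean gate and a greedy nearest-neighbour assignment, and the names `gate_estimates` and `gate_radius` are hypothetical.

```python
import math
from typing import Dict, List, Optional, Set, Tuple

Point = Tuple[float, float]


def gate_estimates(
    predicted: Dict[int, Point],
    estimates: List[Point],
    gate_radius: float,
) -> List[Optional[int]]:
    """For each state estimate, return the label of the nearest unused
    predicted target within `gate_radius`, or None if no target gates."""
    used: Set[int] = set()
    labels: List[Optional[int]] = []
    for estimate in estimates:
        best_label: Optional[int] = None
        best_distance = gate_radius
        for label, prediction in predicted.items():
            if label in used:
                continue  # each track label is assigned at most once
            distance = math.dist(estimate, prediction)
            if distance <= best_distance:
                best_label, best_distance = label, distance
        if best_label is not None:
            used.add(best_label)
        labels.append(best_label)
    return labels


if __name__ == "__main__":
    predicted = {1: (0.0, 0.0), 2: (10.0, 10.0)}
    estimates = [(0.5, 0.2), (9.0, 10.5), (50.0, 50.0)]
    print(gate_estimates(predicted, estimates, gate_radius=2.0))
```

An estimate that falls outside every gate receives no label and would be treated as a new track or as clutter. Because only one time-step is considered, two crossing targets can exchange labels, which is the limitation that backtracking methods aim to address.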

Possible theoretical developments with PHD filters could consider higher order multi-

target moment approximations to improve the accuracy of the PHD filter. The CPHD filter

is a partial second-order filter, which is first order in the states of individual targets but

second order in the target number. The current PHD filter framework does not extend to a

second-order approximation, so an alternative approach would be required [81]. Such an

approach may not be computationally tractable, and this remains a topic for future research.

Bibliography

[1] R. E. Kalman. A new approach to linear filtering and prediction problems. Transac-

tions of the ASME–Journal of Basic Engineering, 82(Series D):35–45, 1960.

[2] N.J. Gordon, D.J. Salmond, and A.F.M. Smith. Novel approach to nonlinear/non-

Gaussian Bayesian state estimation. IEE Proceedings on Radar and Signal Process-

ing, 140:107–113, 1993.

[3] Y. Petillot, I. Tena Ruiz, and D. M. Lane. Underwater vehicle obstacle avoidance and

path planning using a multi-beam forward looking sonar. IEEE Journal of Oceanic

Engineering, Vol. 26, No. 2, 240-251, April 2001.

[4] I. Tena Ruiz, Y. Petillot, and D. M. Lane. AUV navigation using a forward looking

sonar. Unmanned Underwater Vehicle Symposium. Rhode Island, USA., 2000.

[5] Y. Bar-Shalom and T.E. Fortmann. Tracking and Data Association. Academic Press,

[6] D. Reid. An algorithm for tracking multiple targets. IEEE Trans. Automatic Control,

24 no. 6, 1979.

[7] R.L. Streit and T.E. Luginbuhl. Probabilistic Multi-Hypothesis Tracking. NUWC-NPT

Technical Report 10 428, Naval Undersea Warfare Center, Newport, Rhode Island,

[8] D. Schulz, W Burgard, D. Fox, and A. B. Cremers. People tracking with a mobile

robot using sample-based Joint Probabilistic Data Association Filters. International

Journal of Robotics Research, pages 99–116, 2003.

[9] J. Vermaak, S. J. Godsill, and P. Perez. Monte Carlo filtering for multi target tracking

and data association. IEEE Transactions on Aerospace and Electronic Systems, 41,

1:309 – 332, 2005.

[10] I. R. Goodman, R. P. S. Mahler, and H. T. Nguyen. Mathematics of Data Fusion.

Kluwer Academic Publishers, 1997.

[11] R. P. S. Mahler. An introduction to multisource-multitarget statistics and its applica-

tions. Technical monograph, Lockheed Martin, March 2000.

[12] R. Mahler. Multitarget Bayes filtering via first-order multitarget moments. IEEE

Transactions on Aerospace and Electronic Systems, 39, No.4:1152–1178, 2003.

[13] G. Matheron. Random sets and integral geometry. J. Wiley, 1975.

[14] R. Mahler. Global integrated data fusion. in Proc. 7th Nat. Symp. on Sensor Fu-

sion, 1, (Unclassified) Sandia National Laboratories, Albuquerque, ERIM Ann Arbor

MI:187–199, 1994.

[15] D.J. Daley and D. Vere-Jones. An introduction to the theory of point processes.

Springer, 1988.

[16] R. Mahler. A theoretical foundation for the Stein-Winter Probability Hypothesis Den-

sity (PHD) multi-target tracking approach. in Proc. 2002 MSS Nat’l Symp. on Sensor

and Data Fusion, 1, (Unclassified) Sandia National Laboratories, San Antonio TX,

[17] B. Vo, S. Singh, and A. Doucet. Sequential Monte Carlo methods for Multi-target

Filtering with Random Finite Sets. IEEE Trans. Aerospace Elec. Systems, 41,

No.4:1224–1245, 2005.

[18] B. Vo, S. Singh, and A. Doucet. Sequential Monte Carlo Implementation of the PHD

filter for Multi-target Tracking. Proc. FUSION 2003, pages 792–799, 2003.

[19] T. Zajic and R. Mahler. A particle-systems implementation of the PHD multitarget

tracking filter. SPIE Vol. 5096 Signal Processing, Sensor Fusion and Target Recogni-

tion, pages 291–299, 2003.

[20] H. Sidenbladh. Multi-target particle filtering for the Probability Hypothesis Density.

International Conference on Information Fusion, pages 800–806, 2003.

[21] D. E. Clark and J. Bell. Convergence Results for the Particle PHD Filter. IEEE

Transactions on Signal Processing, 54, No.7:2652–2661, 2006.

[22] A. M. Johansen, S. S. Singh, A. Doucet, and B. Vo. Convergence of the SMC imple-

mentation of the PHD filter. Methodology and Computing in Applied Probability, to

appear., 2006.

[23] M. Tobias and A. D. Lanterman. Probability Hypothesis Density-based multi-target

tracking with bistatic range and Doppler observations. IEE Radar, Sonar and Naviga-

tion, Volume 152, Issue 3 , p. 195-205., 2005.

[24] D.E. Clark and J. Bell. Bayesian Multiple Target Tracking in Forward Scan Sonar

Images Using the PHD Filter. IEE Radar, Sonar and Navigation, Volume 152, Issue

5, p. 327-334, 2005.

[25] D. E. Clark, J. Bell, Y. de S.-Pern, and Y. Petillot. PHD Filter Multi-target Tracking in

3D Sonar. IEEE Oceans Europe Conference, Brest June 2005. Volume 1, June 20-23,

2005 p265 - 270.

[26] D. E. Clark, I. Tena-Ruiz, Y. Petillot, and J. Bell. Multiple target tracking and data

association in sonar images. IEE Seminar on Target Tracking: Algorithms and Appli-

cations. Birmingham, UK. March 2006., pages 149–154, 2006.

[27] N. Ikoma, T. Uchino, and T. Maeda. Tracking of feature points in image sequence by

SMC implementation of PHD filter. SICE 2004 Annual Conference, 4-6 Aug, 2004. p

1696 - 1701 vol. 2.

[28] B. Vo, W. K. Ma, and S. Singh. Locating an unknown time-varying number of speak-

ers: A Bayesian random finite set approach. in Proc. 2005 IEEE Int. Conf. Acoust.,

Speech, Signal Processing, Philadelphia, 4:1073–1076, 2005.

[29] K. Panta, B. Vo, S. Singh, and A. Doucet. Probability hypothesis density filter versus

multiple hypothesis tracking. Proceedings of SPIE – Volume 5429 Signal Processing,

Sensor Fusion, and Target Recognition XIII, Ivan Kadar, Editor, August 2004, pp.

284-295.

[30] Lin Lin. Parameter estimation and data association for multitarget tracking. PhD

Thesis, The University of Connecticut, 2004.

[31] D. E. Clark and J. Bell. Data Association for the PHD Filter. ISSNIP, Melbourne,

Australia. 5th-8th December 2005., pages 217 – 222.

[32] K. Panta, B. Vo, and S. Singh. Improved probability hypothesis density filter (PHD)

for multitarget tracking. Proceedings ICISIP, Bangalore 12th-15th December 2005.

[33] B. Vo and W. K. Ma. The Gaussian Mixture Probability Hypothesis Density Filter.

IEEE Transactions on Signal Processing, to appear, 2006.

[34] B. Vo and W. K. Ma. A closed-form solution to the Probability Hypothesis Density

filter. in Proc. Int’l Conf. on Information Fusion, Philadelphia, 2005.

[35] D. E. Clark and B. Vo. Convergence Analysis of the Gaussian Mixture PHD Filter.

IEEE Transactions on Signal Processing, to appear, 2006.

[36] D. Clark, B. Vo, and J. Bell. GM-PHD Filter Multi-target Tracking in Sonar Images.

Proc. SPIE Defense and Security Symposium. Orlando, Florida [6235-29], 2006.

[37] Y. C. Ho and R. C. K. Lee. A Bayesian approach to problems in stochastic estimation

and control. IEEE Trans. AC, AC-9:333–339, 1964.

[38] A. Jazwinski. Stochastic processes and filtering theory. Academic Press, 1970.

[39] S. J. Julier and J. K. Uhlmann. A General Method for Approximating Nonlinear Trans-

formations of Probability Distributions. Technical Report, RRG, Dept. of Engineering

Science, University of Oxford., 1996.

[40] H. W. Sorenson and D. L. Alspach. Recursive Bayesian estimation using Gaussian

sum. Automatica, 7:465–479, 1971.

[41] A. Doucet, N. de Freitas, and N. Gordon. Sequential Monte Carlo Methods in Practice.

Springer-Verlag, 2001.

[42] B. Vo and S. Singh. Technical aspects of the Probability Hypothesis Density recursion.

Tech. Rep. TR05-006 EEE Dept. The University of Melbourne, Australia, 2005.

[43] D. L. Alspach. A Bayesian Approximation Technique for Estimation and Control of

Discrete Time Systems. PhD thesis, University of California, San Diego, 1970.

[44] B. D. Anderson and J. B. Moore. Optimal Filtering. Prentice-Hall, New Jersey, 1979.

[45] D. E. Clark and J. Bell. Multi-target State Estimation and Track Continuity for the

Particle PHD Filter. IEEE Transactions on Aerospace and Electronic Systems, 43 no

3, July 2007.

[46] D. Clark, K. Panta, and B. Vo. The GM-PHD Filter Multiple Target Tracker. Proc.

International Conference on Information Fusion. Florence., July 2006.

[47] D. E. Clark, I. Tena-Ruiz, Y. Petillot, and J. Bell. Multiple Target Tracking in Sonar

Images. IEEE Transactions on Aerospace and Electronic Systems, 43 no 3, July 2007.

[48] B. Oksendal. Stochastic differential equations, 6th edition. Springer Verlag, Heidel-

berg, 2003.

[49] Venkatarama Krishnan. Nonlinear Filtering and Smoothing : An Introduction to Mar-

tingales, Stochastic Integrals and Estimation. Dover, 2005.

[50] S. Julier and J. Uhlmann. A new extension of the Kalman filter to nonlinear systems.

In Int. Symp. Aerospace/Defense Sensing, Simul. and Controls, Orlando, FL., 1997.

[51] J. T.-H. Lo. Finite-dimensional sensor orbits and optimal non-linear filtering. IEEE

Trans. IT, IT-18(5):583–588, 1972.

[52] S. Arulampalam, S. Maskell, N. J. Gordon, and T. Clapp. A tutorial on particle filters

for on-line non-linear/non-Gaussian Bayesian tracking. IEEE Trans. SP, 50(2):174–

188, 2002.

[53] A. N. Shiryaev. Probability. Number 95 in Graduate Texts in Mathematics. Springer

Verlag, New York, second edition, 1995.

[54] D. Crisan and A. Doucet. A survey of convergence results on particle filtering for

practitioners, 2002.

[55] D. Crisan and A. Doucet. Convergence of sequential Monte Carlo methods, 2000.

[56] D. Crisan. Sequential Monte Carlo Methods in Practice, chapter 2, pages 17–41.

Springer-Verlag, 2001.

[57] J. Jacod and P. Protter. Probability Essentials. Springer, 2000.

[58] D.L. Hall and J. Llinas, editors. Handbook of Multisensor Data Fusion, chapter 7.

CRC Press, 2001.

[59] S. K. Srinivasan. Stochastic Point Processes and Their Applications. Griffin’s Statis-

tical Monographs and Courses, 1973.

[60] D. R. Cox and V. Isham. Point Processes. Chapman & Hall, 1980.

[61] H. Sidenbladh and S.L. Wirkander. Tracking random sets of vehicles in terrain. IEEE

Workshop on Multi-Object Tracking, Madison, WI, USA, 2003.

[62] M. Tobias and A.D. Lanterman. A Probability Hypothesis Density-based multitarget

tracker using multiple bistatic range and velocity measurements. System Theory, 2004.

Proceedings of the Thirty-Sixth Southeastern Symposium on , March 14-16, 2004,

pages 205–209, 2004.

[63] P. Billingsley. Convergence of probability measures. Wiley, New-York, 1968.

[64] B. Rynne and M. Youngson. Linear Functional Analysis. Springer-Verlag, 2000.

[65] D. Salmond. Tracking in Uncertain Environments. PhD thesis, University of Sussex,

[66] J. L. Williams. Gaussian mixture reduction for tracking multiple maneuvering targets

in clutter. Master’s thesis, Air Force Institute of Technology, 2003.

[67] I. Tena Ruiz, S. Raucourt, Y. Petillot, and D. M. Lane. Concurrent mapping and

localisation using side-scan sonar for autonomous navigation. Oceanic Engineering,

IEEE Journal of, 29, Issue 2:442–456, 2004.

[68] I. Tena Ruiz, D. M. Lane, and M. J. Chantler. A comparison of inter-frame feature

measures for robust object classification in sector scan sonar image sequences. IEEE

Journal of Oceanic Engineering, 24, No.4:458–469, 1999.

[69] J.M. Bell. A model for the simulation of side scan sonar. PhD Thesis. Heriot-Watt

University, 1995.

[70] B. S. Everitt and G. Dunn. Applied Multivariate Data Analysis. Arnold, 2nd edition,

[71] T. Kanungo, D. M. Mount, N. Netanyahu, C. Piatko, R. Silverman, and A. Y. Wu.

A Local Search Approximation Algorithm for k-means Clustering. Proc. of the 18th

Annual ACM Symp. on Computational Geometry, pages 10–18, 2002.

[72] J. Hoffman and R. Mahler. Multitarget miss distance via optimal assignment. IEEE

Trans. Sys., Man, and Cybernetics-Part A, 34(3):327–336, 2004.

[73] C. A. Bouman. Cluster: An unsupervised algorithm for modeling Gaussian mixtures.

Available from http://www.ece.purdue.edu/~bouman, April 1997.

[74] O. Erdinc, P. Willett, and Y. Bar-Shalom. Probability Hypothesis Density Filter for

Multitarget Multisensor Tracking. Proc. FUSION 2005.

[75] T. Kurien. Issues in the design of practical multi-target tracking algorithms. Multi-

target Multi-sensor Tracking: Advanced Applications, pages 43–83, 1990.

[76] R. Mahler. Multi-target Bayes filtering via first-order multi-target moments. IEEE

Trans. AES, 39(4):1152–1178, 2003.

[77] I. Tena Ruiz, Y. Petillot, D. M. Lane, and C. Salson. Feature Extraction and Data

Association for AUV Concurrent Mapping and Localisation. Proceedings of the 2001

IEEE Conference on Robotics and Automation. Seoul, Korea. May 2001.

[78] I. Tena Ruiz, Y. Petillot, D. Lane, and J. Bell. Tracking objects in underwater multi-

beam sonar images. Motion Analysis and Tracking (Ref. No. 1999/103), IEE Collo-

quium on , 10 May 1999, pages 11/1 – 11/7, 1999.

[79] C. Haworth, Y. de Saint-Pern, D. Clark, E. Trucco, and Y. Petillot. Detection and

tracking of multiple metallic objects in millimetre-wave images. International Journal

of Computer Vision, 71, no. 2:183–196, February 2007.

[80] R. Mahler. A Theory of PHD Filters of Higher Order in Target Number. SPIE Defense

and Security Symposium, Orlando, Florida, 2006.

[81] R. Mahler. PHD Filters of Higher Order in Target Number. submitted to IEEE Trans.

AES., 2005.

[82] B. T. Vo, B. Vo, and A. Cantoni. The CPHD Filter for linear Gaussian multi-target

models. Proc. 40th Annual Conf. on Info. Sciences and Systems (CISS’06), Stanford,

[83] M. Briers, A. Doucet, S. Maskell, and P. Horridge. Fixed-lag sequential Monte Carlo

data association. SPIE Defense and Security Symposium, Orlando, Florida, 2006.
