HERIOT-WATT UNIVERSITY
Multiple Target Tracking with
The Probability Hypothesis Density Filter
Daniel Edward Clark
SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
ON COMPLETION OF RESEARCH IN THE
DEPARTMENT OF ELECTRICAL, ELECTRONIC AND COMPUTING ENGINEERING.
OCTOBER 2006
This copy of the thesis has been supplied on the condition that anyone who consults it is
understood to recognise that the copyright rests with the author and that no quotation from
the thesis and no information derived from it may be published without the written consent
of the author or the University (as may be appropriate).
Declaration
I hereby declare that the work presented in this thesis was carried out by myself at Heriot-Watt
University, except where due acknowledgement is made, and has not been submitted for
any other degree.
Signature of Daniel Edward Clark :
Signature of Supervisor:
Abstract
The random-set framework for multiple target tracking offers a distinct alternative to the tra-
ditional approach to multiple target tracking by treating the collections of individual targets
and observations as finite-sets. The multi-target state is predicted and updated recursively
based on the set-valued observation. The complexity of computing the multi-target recur-
sion grows exponentially with the number of targets and so a method for approximating
the optimal filter using a recursion for the first-order moment of the multi-target posterior,
known as the Probability Hypothesis Density (PHD) filter, was developed.
This thesis addresses some of the essential issues required for the PHD filter to be
of practical value in multiple target tracking applications. Two implementations of the
PHD filter are studied in detail: the Particle PHD filter, which is a Sequential Monte Carlo
technique based on particle filtering, and the Gaussian Mixture PHD filter, which provides
a closed form solution to the PHD filter.
A detailed study of the convergence properties is conducted which gives theoretical
justification for the use of the algorithms. Novel methods to determine the trajectories of
the targets for each of the algorithms are developed which enable the PHD filter to be used
for true multiple target tracking. These methods are implemented on forward-looking sonar
data and demonstrate that the multiple target tracking methods developed for the PHD filter
can be used for real applications.
Acknowledgements
A big thanks to Judith Bell for her excellent supervision throughout the course of my PhD;
her consistent support and guidance have been invaluable.
Thanks to QinetiQ for supporting this work, and, in particular, Douglas Carmichael and
Samantha Dugelay for their interest in this work.
At Heriot-Watt, thanks to Yvan Petillot and Ioseba Tena Ruiz for their expertise on
tracking and sonar and for developing the tracking algorithm with Kalman filters on sonar
data. Also at Heriot-Watt, thanks to Yves de Saint-Pern for providing his code, Chris
Haworth for the tracking work on millimetre wave images, Chris Capus for helping me
recover this thesis from my dead laptop, and to the excellent support staff in the department.
Thanks to Ba-Ngu Vo for an interesting couple of months in Melbourne and for devel-
oping the algorithms on which this thesis is based. Also in Melbourne, thanks to Kusha
Panta for his contribution to the Fusion paper. In Cambridge, thanks to Sumeetpal Singh
for helping with the complicated mathematics and his high level of rigour.
Thanks to Ronald Mahler for developing this interesting area in mathematics and en-
gineering and for inviting me to Florida to present some of this work. The anonymous
reviewers, some of whom have refereed a number of the articles in this thesis, have con-
tributed substantially to improving this work and deserve a special thanks.
Thanks to Spela for inspiring me to do something good.
Finally, the biggest thanks go to my parents for always supporting me, without whom
this work would not have been possible.
Contents
1 Introduction
1.1 Target Tracking
1.2 Multiple Target Tracking
1.3 The PHD Filter
1.4 Thesis Outline
1.5 Original Contributions
2 Bayesian Filtering
2.1 Background
2.2 Single-Target Bayesian Filtering
2.3 Kalman Filtering Techniques
2.3.1 The Kalman Filter
2.3.2 The Extended Kalman Filter
2.3.3 The Unscented Kalman Filter
2.3.4 The Gaussian Sum Filter
2.4 Sequential Monte Carlo Filtering
2.4.1 Sequential Importance Sampling and Resampling
2.4.2 The Particle Filter Algorithm
2.4.3 Convergence Properties
2.5 Summary
3 The Probability Hypothesis Density Filter
3.1 Introduction
3.2 Random Set Filtering
3.3 Point Process Theory
3.3.1 Janossy Measures
3.3.2 Probability Generating Functionals
3.4 PHD Filter Derivation
3.4.1 The PHD Prediction Equation
3.4.2 The PHD Measurement Update Equation
3.5 Summary
4 The Particle PHD Filter
4.1 Introduction
4.2 The Particle PHD Filter Algorithm
4.3 Convergence for the Particle PHD Filter Algorithm
4.3.1 Criteria for Convergence
4.3.2 Convergence of the Mean Square Errors
4.3.3 Convergence of Empirical Measures
4.4 Conclusions
5 The Gaussian Mixture PHD Filter
5.1 Introduction
5.2 The Gaussian Mixture PHD Filter Algorithm
5.3 Convergence of the Errors
5.3.1 Initialisation
5.3.2 Prediction Equation
5.3.3 Measurement Equation
5.4 Pruning and Merging of Gaussian components
5.4.1 Pruning
5.4.2 Merging
5.5 Non-linear Target Dynamic Models
5.5.1 Extended Kalman Prediction Equation
5.5.2 Extended Kalman Measurement Update
5.5.3 The Unscented Kalman PHD Filter
5.6 Conclusions
6 PHD Filter Target Estimation in Sonar Images
6.1 Introduction
6.2 The Particle PHD Filter with State Estimation
6.3 Forward-Looking Sonar Implementation
6.3.1 Simulated Data
6.3.2 Real Data
6.4 Discussion
7 State Estimation and Track Continuity for the Particle PHD Filter
7.1 Introduction
7.2 Multi-Target State Estimation
7.2.1 Cluster Analysis
7.2.2 Multi-target Miss Distance Metrics
7.2.3 Simulated Examples
7.2.4 PHD Filter Estimated Target Number
7.2.5 Time Complexity of PHD filter Tracker
7.2.6 Summary
7.3 Track Continuity
7.3.1 Particle Labelling Association
7.3.2 Estimate-to-Track Association
7.3.3 Simulated Examples
7.3.4 Summary
7.4 Discussion
8 The GM-PHD Filter Multiple Target Tracker
8.1 Introduction
8.2 The Gaussian Mixture PHD Filter Multiple Target Tracker
8.3 Simulations
8.3.1 Example 1
8.3.2 Example 2
8.4 Conclusions
9 Multiple Target Tracking in Sonar Images
9.1 Introduction
9.2 Tracking and Data Association
9.2.1 Tracking with Kalman filters
9.2.2 Tracking with the Particle PHD filter
9.2.3 Tracking with the GM-PHD Filter
9.3 Implementation on Forward-Looking Sonar
9.3.1 Simulated Sonar Data
9.3.2 Real Sonar Data
9.4 Results
9.4.1 Simulated Data
9.4.2 Real Data
9.5 Conclusions
10 Conclusions
10.1 Thesis Summary
10.2 Current Research
10.3 Future Work
Chapter 1
Introduction
1.1 Target Tracking
Target tracking is a necessary part of systems that perform functions such as surveillance,
guidance or obstacle avoidance. Tracking algorithms take their input measurements from
sensors such as radar, sonar or video, which provide the signals. The measurements are
taken at regular intervals and the task is to estimate the state of a target at each point in
time, such as its position, velocity or other attribute. Successive estimates provide the
tracks which describe the trajectory of a target.
The almost universally accepted mathematical framework used to describe this prob-
lem is that of filtering theory and, in particular, Bayesian filtering. The posterior prob-
ability distribution is recursively predicted by propagating this distribution with the state
model, which describes the motion of a target, and updated when a new observation be-
comes available. The mean and covariance of the state are determined at each time-step
from the posterior distribution. The most widely used filtering technique is the ubiquitous
Kalman filter, derived in 1960 [1], for linear and Gaussian target models. More recently,
sample based techniques have proved to be popular, including the particle filter developed
by Gordon in 1993 [2]. Chapter 2 describes the most commonly used filtering algorithms.
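The predict-and-update recursion described above can be sketched for the linear-Gaussian case as follows. This is a minimal illustration only, assuming a one-dimensional constant-velocity target observed in position; the matrices F, Q, H and R, the function names and the measurement values are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np

def kalman_predict(m, P, F, Q):
    """Propagate the mean and covariance through the linear state model."""
    return F @ m, F @ P @ F.T + Q

def kalman_update(m, P, z, H, R):
    """Correct the predicted mean and covariance with a new observation z."""
    S = H @ P @ H.T + R               # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
    m_new = m + K @ (z - H @ m)
    P_new = (np.eye(len(m)) - K @ H) @ P
    return m_new, P_new

# Assumed constant-velocity model: state = [position, velocity], unit time-step.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
Q = 0.01 * np.eye(2)
H = np.array([[1.0, 0.0]])   # only the position is observed
R = np.array([[0.25]])

m, P = np.array([0.0, 0.0]), 10.0 * np.eye(2)   # vague prior
for z in [1.1, 1.9, 3.2, 4.0]:                   # made-up position measurements
    m, P = kalman_predict(m, P, F, Q)
    m, P = kalman_update(m, P, np.array([z]), H, R)
```

After the loop, m holds the estimated position and velocity and P the associated covariance; the sequence of means over time forms the track.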
One of the current research aims in engineering is to develop autonomous vehicles such
as unmanned aerial vehicles (UAVs) or autonomous underwater vehicles (AUVs). The aim
is to develop self-navigating robots which operate without human interaction. AUVs can be
equipped with a range of sensors including forward-look sonar, sidescan sonar and video
to enable them to navigate autonomously for applications such as mine countermeasures,
pipeline inspection or seabed habitat mapping. Methods for detecting and tracking ob-
jects on the seabed are required to aid path planning [3] and navigation [4]. The vehicle
has to sense its environment to prevent collision with any obstacles. In this thesis, novel
techniques are developed for tracking a variable number of targets and implemented on
forward-scan data obtained from an underwater vehicle.
1.2 Multiple Target Tracking
The multiple target tracking problem extends the scenario to a situation where the number
of targets may not be known and varies with time. It is not known which measurements
originated from targets, since some of them may be due to false alarms. We are now
required to estimate the positions of an unknown number of targets, based on observations
of the targets corrupted by noise, with the possibilities that there may be missed detections
and that observations may be false alarms due to clutter. In addition, the identities of
the targets may need to be known to determine their trajectories. The usual method for
solving this problem is to assign a single-target stochastic filter, such as a Kalman filter or
an extended Kalman filter, to each target and use a data association technique to assign the
correct measurement to each filter [5].
The data association problem in multiple target tracking usually involves ensuring that
the correct measurement is given to each stochastic filter so that the trajectory of each
target can be accurately estimated; this is referred to as measurement-to-track association. The
three classical approaches to this are the Nearest Neighbour Standard Filter (NNSF) [5],
the Joint Probabilistic Data Association Filter (JPDAF) [5], and the Multiple Hypothesis
Tracking filter (MHT) [6].
The Nearest Neighbour Standard Filter simply takes the nearest validated measurement
to the predicted measurement to update each of the target states. This can result in prob-
lems since the nearest validated measurement may be the same for two different targets.
The Joint Probabilistic Data Association Filter computes the joint probabilities for all the
pairings between the predicted measurements and estimated target states. This technique
also has to consider the false alarms from spurious measurements but is restricted to a
known, fixed number of targets. The ideal Multiple Hypothesis Tracking filter maintains
probabilities of all possible associations at each time step. Unlike the NNSF and JPDAF,
this does not just consider the probabilities from the previous time step, which allows for
backtracking and also track initiation. In practice, it is not feasible to keep track of all
possible hypotheses, as the computational complexity grows exponentially. Techniques for
reducing the complexity include gating, to ignore irrelevant observations, pruning, to elim-
inate low probability hypotheses, and merging, to combine hypotheses into a single track.
Some extensions of these techniques include the probabilistic MHT (PMHT) [7] which uses
a soft-gating procedure and Monte Carlo (MC)-JPDA [8], which uses a sample based JPDA
algorithm. A review of multiple target tracking and data association techniques was pre-
sented recently in [9], including novel developments for multi-target Monte Carlo filtering.
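The nearest-neighbour ambiguity noted earlier in this section can be illustrated with a small sketch. The predicted measurements, the gate radius and the function name are hypothetical and chosen only for illustration; a simple Euclidean gate is assumed here in place of a covariance-based validation region.

```python
import numpy as np

def nearest_neighbour_assign(predicted, measurements, gate):
    """For each predicted measurement, return the index of the nearest
    measurement inside the gate, or None if no measurement is validated."""
    assignments = []
    for p in predicted:
        dists = np.linalg.norm(measurements - p, axis=1)
        j = int(np.argmin(dists))
        assignments.append(j if dists[j] <= gate else None)
    return assignments

# Two closely spaced targets; one detection is missed and one is clutter.
predicted = np.array([[0.0, 0.0], [1.0, 0.0]])
measurements = np.array([[0.4, 0.1], [8.0, 8.0]])

assignments = nearest_neighbour_assign(predicted, measurements, gate=2.0)
# Both tracks select measurement 0, so one measurement updates two filters.
conflict = assignments[0] is not None and assignments[0] == assignments[1]
```

Each track is treated independently in this sketch, which is why nothing prevents the same measurement from being used twice.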
An alternative solution to the multiple target tracking problem is to view the set of ob-
servations collectively, and try to estimate the set of target states directly, where the correct
report-to-track association is considered unobservable [10]. The disadvantage of this approach
is that the continuity of the individual target tracks is not kept. One such method
uses Finite Set Statistics for multiple target tracking [11], with an approach analogous to
the recursion used in Bayesian filtering by constructing multiple target posterior distributions.
The time required for calculating joint multi-target likelihoods grows exponentially
with the number of targets, so this approach is not very practical for sequential target
estimation, which may need to be undertaken in real time. A practical alternative to Bayesian
multiple target tracking was proposed in [12], which propagates the first-order statistical moment, or
Probability Hypothesis Density (PHD), instead of the multiple target posterior itself. An
overview of this technique is given in the next section and the mathematical framework is
described in chapter 3.
1.3 The PHD Filter
The mathematical foundation of the multiple target filtering methods used in this thesis
is based on the theory of Random Sets, which was first studied by Matheron [13] in the
1970s. Mahler constructed Finite-Set Statistics (FISST) [14] from the mathematical theory
of point processes [15] and Random Set theory in the mid 1990s as a way of extending
classical single-sensor, single-target statistics to a multi-sensor, multi-target statistics of
finite-set variates. The multi-target states and observations are represented as Random Fi-
nite Sets from which a theoretically optimal Bayesian multi-sensor multi-target filter can
be derived [11]. The multi-target Bayes filter is not tractable for real-time implementa-
tions due to the combinatorial complexity of the multiple target likelihoods [11] and so
the optimal filter must be approximated. A recursive approach was proposed to propagate
the first-order statistical moment, or expectation, of the multi-target posterior distribution
based on the Stein-Winter Probability Hypothesis Density (PHD) [16]. This was called the
PHD filter [12]. The predictive density is approximated by a Poisson point process to track
potentially many targets, including birth, death and spawning of targets automatically.
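The defining property of the first-order moment can be stated compactly. As a sketch in notation assumed here rather than taken from this section, write $X$ for the random finite set of target states and $D(x)$ for its PHD (intensity function); then for any region $S$ of the state space:

```latex
\[
  \int_S D(x)\,\mathrm{d}x \;=\; \mathbb{E}\bigl[\,|X \cap S|\,\bigr],
\]
```

so that integrating the PHD over the whole state space gives the expected number of targets, and the peaks of $D$ indicate likely target locations.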
Although the foundation was established in the form of Finite Set Statistics, its rela-
tionship to conventional probability was not entirely clear. Vo, Singh and Doucet estab-
lished the relationship between FISST and conventional probability [17], which led to the
development of a sequential Monte Carlo (SMC) multi-target filter. In addition, an SMC
implementation of the PHD filter was proposed in the form of a multi-target particle filter
which operates on sets of observations to provide a multi-modal intensity function from
which the multiple target states are determined [18] [17]. Particle filter methods for the
PHD filter were also devised by Zajic et al. [19] and Sidenbladh [20]. Convergence properties
of these algorithms have been established by Vo et al. [17], Clark [21] (as presented
in chapter 4) and Johansen et al. [22], which show that the empirical representation of the
PHD converges to the true PHD.
Practical applications of these methods have included tracking vehicles in different terrains [20],
tracking targets in passive radar located on ellipses [23], tracking a variable
number of targets in forward scan sonar [24] [25] [26] (as demonstrated in chapters 6 and
9), tracking feature points in image sequences [27], and locating an unknown time-varying
number of speakers [28].
The advantage of the particle PHD filter is that it can track a variable number of tar-
gets, estimating both the number of targets and their locations. It avoids the need for data
association techniques as part of the multiple-target framework, since the identities of the
individual targets are not required. In addition to estimating the number of targets and their
states at each point in time, it is also important in tracking scenarios to know the trajectories
of the targets and to be able to distinguish between different targets. Some early techniques
for associating the targets between frames have been reported in the literature. The first
of these [29] used the PHD filter for pre-filtering the data input to a Multiple Hypothesis
Tracker. The second technique [30] represents the PHD in a resolution cell to differentiate
the peaks of the PHD posterior, and validation gating was used to determine the weights of
the particles. More recently, two methods were presented independently in [31] (see chap-
ter 7) and [32]. The first of these considered associating target estimates between iterations,
also known as estimate-to-track association. The second method used the partitioning of
the particle data to assign labels to the particles within the same cluster and associate the
clusters between time frames if there is a large intersection of particles with the same label
from the previous time step.
The Gaussian mixture Probability Hypothesis Density (GM-PHD) filter was derived re-
cently to provide a closed-form solution to the PHD filter [33] [34]. It was shown that,
under linear-Gaussian assumptions, the posterior intensity at any point in time is a Gaus-
sian mixture. The means and covariances of the Gaussians are determined from the Kalman
filtering equations and the weights are calculated according to the PHD filter update equa-
tion. The asymptotic convergence properties of the GM-PHD filter have been established,
showing that the mixture approximation converges to the true PHD [35] (see chapter 5).
The multiple target states of the GM-PHD mixture are determined from the Gaussian com-
ponents with the highest weights. It can be shown that Gaussians within the mixture track
the evolution of individual target states which ensures the continuity of target identity (see
chapter 8).
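As a rough illustration of extracting state estimates from a Gaussian mixture intensity, the sketch below keeps the means of the components whose weights exceed a threshold and uses the sum of the weights as the expected number of targets. The threshold of 0.5, the function name and the example mixture are assumptions made for illustration and are not necessarily the procedure used in this thesis.

```python
import numpy as np

def extract_states(weights, means, threshold=0.5):
    """Return the means of mixture components whose weight exceeds the threshold."""
    weights = np.asarray(weights)
    means = np.asarray(means)
    return means[weights > threshold]

# Hypothetical posterior intensity: two strong components and two weak ones.
weights = [0.95, 0.05, 0.9, 0.1]
means = [[0.0, 0.0], [0.3, 0.1], [5.0, 5.0], [9.0, 1.0]]

states = extract_states(weights, means)
# The integral of a Gaussian mixture intensity is the sum of its weights.
expected_number = float(np.sum(weights))
```

Because each retained estimate is tied to one Gaussian component, following a component from one time-step to the next gives a natural handle on target identity.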
The first practical implementations of both the Particle PHD filter and the GM-PHD
filter with track continuity are presented in chapter 9 for multiple-target tracking in forward-looking
sonar images [26] [36]. These techniques are compared with the traditional nearest
neighbour (NN) approach with Kalman filters.
1.4 Thesis Outline
The theory of Bayesian filtering is presented in chapter 2 with its relation to target track-
ing. The Kalman filter [1] is derived using properties of Gaussian distributions [37] and
the extended Kalman filter [38] is presented to accommodate nonlinearities in the state and
observation models. A more recent alternative, the Unscented Kalman filter [39], approx-
imates the mean and covariance of a Gaussian by a set of sigma-points. More general
probability distributions can be represented by the Gaussian Sum filter [40], which uses a
weighted sum of Gaussians that are updated with the Extended Kalman filter equations.
Finally, it is shown how the Particle filter uses Sequential Monte Carlo methods to provide
an approximate solution to the problem without relying on the restrictive linear or linearised
conditions for the signal and observation processes and has guaranteed convergence prop-
erties [2] [41].
Chapter 3 describes the Probability Hypothesis Density (PHD) Filter [12]. The point
process theory required for the derivation of the PHD filter is given together with its rela-
tionship to Random Finite Sets. The Multiple Target Tracking model is presented with the
Bayesian recursion analogous to the single target scenario, from which the PHD filter is
derived [12] [42].
Two practical implementations of the PHD filter are studied in this thesis. The
first of these, in chapter 4, is the Sequential Monte Carlo implementation known as the
Particle PHD Filter [18] [17] which extends the single target particle filter to a multiple
target version. The second implementation, in chapter 5, is the Gaussian Mixture PHD fil-
ter [34] [33], which provides a closed form solution to the PHD filter under linear-Gaussian
conditions and is similar in style to the Gaussian Sum filter [40] [43].
A detailed study of the convergence properties is conducted for both of the implemen-
tations of the PHD filter. In chapter 4, it is shown that the empirical representation of the
PHD converges weakly to the true density as the number of particles increases and bounds
are provided for the mean square errors based on results for particle filters [21]. In chapter
5, it is shown that the Gaussian sum representation of the PHD converges uniformly to the
true PHD and error bounds are provided for the pruning and merging stages of the algo-
rithm to show that these fall within acceptable limits [35]. Conditions are provided for the
extended Kalman implementation of the algorithm to converge uniformly based on results
for the Gaussian Sum filter [44].
An example of the particle PHD filter algorithm for tracking in forward-looking sonar
data is shown in chapter 6 [24], demonstrating the potential of the algorithm for target
estimation in practical applications. A set of target states is estimated at each iteration
from range and bearing measurements obtained from a sonar device fitted to an underwater
vehicle surveying an area of seabed. The algorithm is demonstrated on real and simulated
sonar data with a variable number of targets in cluttered environments.
Since the posterior PHD is a multi-modal distribution, methods are required to deter-
mine the target state estimates at each iteration [45]. Methods for clustering the particles
are considered in chapter 7 for finding peaks in the empirical particle distribution. Another
important consideration for multiple target tracking is to maintain continuity of track identity
to identify the same target in successive iterations of the algorithm. Chapter 7 also presents
novel methods to enable track continuity for the Particle PHD filter [31]. Chapter 8 presents
the GM PHD Multi-target Tracker [46] and demonstrates that the Gaussian Mixture PHD
filter has the inherent ability to maintain target tracks by following the individual Gaussians
within the mixture.
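The idea of finding peaks in the empirical particle distribution can be sketched as follows: the number of targets is taken as the rounded sum of the particle weights, and the particles are then clustered into that many groups. This is only a sketch under stated assumptions: equally weighted particles (as after resampling), a plain k-means with a deterministic farthest-point start, and synthetic particle data; the function name and all numerical values are hypothetical.

```python
import numpy as np

def kmeans(points, k, iterations=20):
    """Plain (unweighted) k-means with a deterministic farthest-point start."""
    centres = [points[0]]
    for _ in range(1, k):
        # Next centre: the particle farthest from all centres chosen so far.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centres], axis=0)
        centres.append(points[np.argmax(d)])
    centres = np.array(centres, dtype=float)
    for _ in range(iterations):
        # Assign each particle to its nearest centre.
        d = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        # Move each centre to the mean of its particles (unchanged if empty).
        for j in range(k):
            if np.any(labels == j):
                centres[j] = points[labels == j].mean(axis=0)
    return centres

# Synthetic particle cloud: two well-separated groups of 200 particles each.
rng = np.random.default_rng(0)
particles = np.vstack([
    rng.normal([0.0, 0.0], 0.3, size=(200, 2)),
    rng.normal([10.0, 10.0], 0.3, size=(200, 2)),
])
# Equal weights that sum to the assumed expected number of targets (2).
weights = np.full(len(particles), 2.0 / len(particles))

n_targets = int(round(weights.sum()))
centres = kmeans(particles, n_targets)
```

The cluster centres serve as the target state estimates for the current iteration; they carry no identity from one iteration to the next, which is why a separate track continuity step is needed.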
The ninth chapter demonstrates that the methods developed for multiple target tracking
with the PHD filter can be implemented on real data with an application on forward-looking
sonar data. It is shown that the Particle PHD filter gives comparable performance to a nearest
neighbour approach with Kalman filters [26] [47]. In addition, it is shown that the GM
PHD filter can track a variable number of targets in a reasonably high level of clutter [36].
These results provide the first implementations of the PHD filter for multiple target track-
ing with continuity of track identity on real data and demonstrate that these techniques have
real practical value.
The final chapter summarises the work presented in this thesis and outlines future re-
search with PHD filters.
1.5 Original Contributions
This thesis addresses some of the essential issues required for the PHD filter to be of prac-
tical value in multiple target tracking applications. Two implementations of the PHD filter
are studied in detail. The first of these is the Particle PHD filter [17], which is a Sequential
Monte Carlo technique based on particle filtering. The second implementation
studied is the Gaussian Mixture PHD filter [33], which provides a closed form solution to
the PHD filter. The specific contributions of each chapter are outlined below.
Chapter 4: The Particle PHD Filter
This chapter presents mathematical proofs of convergence for the Particle PHD Filter algo-
rithm and gives bounds for the mean square error.
“Convergence Results for the Particle PHD Filter”, IEEE Transactions on Signal Processing,
Volume 54, No. 7, p. 2652-2661, July 2006.
Chapter 5: The Gaussian Mixture PHD Filter
This chapter proves uniform convergence of the errors in the Gaussian Mixture PHD filter
algorithm and provides error bounds for the pruning and merging stages.
“Convergence Analysis of the GM PHD Filter”, IEEE Transactions on Signal Processing,
in press.
Chapter 6: PHD Filter Target Estimation in Sonar Images
An implementation of the particle PHD filter is demonstrated on real forward-looking sonar
data taken from an underwater vehicle to estimate both the number of targets and their locations.
“Bayesian Multiple Target Tracking in Sonar Images with the PHD Filter”, IEE Proceedings
on Radar, Sonar and Navigation, Volume 152, Issue 5, p. 327-334, 2005.
“PHD Filter Multi-target Tracking in 3D Sonar”, IEEE Oceans Europe Conference, Brest,
Volume 1, June 20-23, 2005, p. 265-270.
Chapter 7: Target Tracking with the Particle PHD Filter
Two clustering techniques are compared for determining the multiple-target states from
the particle density, namely k-means clustering and mixture modelling via the expectation-
maximization algorithm. Novel techniques are developed for associating the targets be-
tween frames to enable identification of the individual target tracks.
“Multi-Target State Estimation and Track Continuity for the PHD Filter”, IEEE Transactions
on Aerospace and Electronic Systems, Volume 43, No. 3, July 2007.
“Data Association for the PHD Filter”, Proceedings of the 2005 International Conference on
Intelligent Sensors, Sensor Networks and Information Processing, 5-8 Dec. 2005,
p. 217-222.
Chapter 8: The GM-PHD Filter Multi-Target Tracker
It is shown here that the trajectories of the targets can be determined directly from the
evolution of the Gaussian mixture of the PHD and that single Gaussians within this mix-
ture accurately track the correct targets. Furthermore, the technique is demonstrated to be
successful in estimating the correct number of targets and their trajectories in high clutter
density.
“The GM-PHD Filter Multi-Target Tracker”, Proceedings of the International Conference
on FUSION, July 2006.
Chapter 9: Multiple Target Tracking in Sonar Images
The multiple target tracking techniques developed for the two implementations of the PHD
filter are demonstrated on both simulated sonar data and real forward-looking sonar data
obtained from an Autonomous Underwater Vehicle (AUV), and these approaches are compared
with a conventional Nearest Neighbour approach with Kalman filters. It is shown
that the Particle PHD filter with estimate-to-track association gives comparable tracking
performance to the Nearest Neighbour approach, and that the GM-PHD filter
gives comparable performance in higher levels of clutter.
“Particle PHD Filter Multiple Target Tracking in Sonar Images”, IEEE Transactions on
Aerospace and Electronic Systems, Volume 43, No. 3, July 2007.
“Multiple Target Tracking and Data Association in Sonar Images”, 2006 IEE Seminar on
Target Tracking, p. 149-154.
“GM-PHD Multi-target Tracking in Sonar Images”, 2006 SPIE Defense and Security Symposium
[6235-29].
Chapter 2
Bayesian Filtering
2.1 Background
Single-target tracking requires the estimation of the state of a signal at each point in time
based on a discrete set of noisy measurements, where a new measurement is received at
each time-step. This definition coincides with the mathematical theory of filtering and the
terms have become synonymous in the engineering community (provided that the correct
measurement is assigned to the filter).
One interpretation of filtering theory, called optimal non-linear filtering, is defined as
follows. Suppose that we wish to estimate a process which cannot be observed directly, using
observations from a different noisy process, where a relationship between the two processes
is known (these processes may be continuous in time). The problem is to provide the
estimate of the signal based on the observations up to the current time. The signal and
observation processes are given by stochastic differential equations [48]. In the case where
an estimate is required when each measurement becomes available, the problem is known
as filtering [49]. The Kalman filter [1] is a special case of the nonlinear filtering problem
when the signal and observation processes are linear and the noise processes are Gaussian.
In this case, the stochastic differential equations can be solved explicitly since the linear
filtering equations form a closed set.
An alternative interpretation of filtering theory is Bayesian filtering, which recursively
applies Bayes’ rule to determine the conditional probability distribution of the signal process.
The signal and observation processes are now discrete in time, which matches the way
measurements are received in target tracking. The Bayesian derivation
of the Kalman filter relies only on the properties of Gaussian distributions and does not
require an understanding of stochastic differential equations and martingale theory [37].
Furthermore, the Bayesian interpretation can allow for a Sequential Monte Carlo approach
to be adopted [41], where simulation based methods can be used for approximating the
posterior distributions. These techniques do not rely on any of the linearised/Gaussian as-
sumptions on the signal and observation models and have provable convergence properties.
This chapter provides a motivation for the filtering algorithms in the context of target
tracking before describing Bayesian filtering and presenting the commonly used techniques.
2.2 Single-Target Bayesian Filtering
To make an inference about the state of a dynamic system, two equations are needed: the
state equation, which describes the evolution of the state with time (the motion of a target),
and the measurement equation, which relates the observations received from a sensor to the
state. In the case where an estimate is required every time a measurement is received, a
recursive filter is used, whose estimate is predicted and updated at each time-step. The
prediction stage uses the state equation to predict the state in the next time-step and the update
stage uses the measurement equation to calculate the posterior distribution according to
Bayes’ rule.
Let $x_{0:t} := \{x_0, \ldots, x_t\}$ be an unobserved signal process of dimension $n$ that we wish
to estimate, and $Z_t := \sigma(z_1, \ldots, z_t)$ be the σ-algebra generated by noisy observations of
dimension $m \le n$ related to this process. The single-target filtering problem is to estimate
recursively in time the probability distribution $p(x_t | Z_t)$ of the signal. From this, an estimate
of the target location, $\hat{x}_t$, needs to be determined. One possible estimate is the conditional
expectation, $\hat{x}_t = E(x_t | Z_t)$, of the signal. Another possible choice is the maximum
a posteriori estimate, which may be more appropriate for multi-modal distributions.
The evolution of the signal process $x_{0:t}$ is governed by the state equation
$$x_t = f_t(x_{t-1}, v_{t-1}), \tag{2.1}$$
where $f_t$ is a (possibly) non-linear function representing the motion of the target and $v_{0:t-1}$
is the process noise sequence representing the uncertainty in the target motion. The observations
are governed by the measurement equation,
$$z_t = h_t(x_t, \varepsilon_t), \tag{2.2}$$
where $h_t$ is a function related to observing $x_t$ and $\varepsilon_{1:t} := \{\varepsilon_1, \ldots, \varepsilon_t\}$ is the observation noise
sequence reflecting errors in the observations. The process and measurement noise
sequences are uncorrelated. When the functions $f_t$ and $h_t$ are linear and the noise sequences
$v_{0:t-1}$ and $\varepsilon_{1:t}$ are Gaussian, then the optimal estimate is given by the Kalman filter. When
these restrictive conditions are not met, alternative methods for obtaining $\hat{x}_t$ are needed.
Expressed in Bayesian terms, the problem is to estimate recursively in time the posterior
distribution, $p(x_t | Z_t)$, by the following prediction and update stages. The prediction stage
involves calculating the prior distribution, $p(x_t | Z_{t-1})$, of the state being in $x_t$ based on the
previous observations,
$$p(x_t | Z_{t-1}) = \int p(x_t | x_{t-1})\, p(x_{t-1} | Z_{t-1})\, dx_{t-1}. \tag{2.3}$$
When the new measurement, $z_t$, has been observed, the update stage involves calculating
the posterior distribution by Bayes’ rule,
$$p(x_t | Z_t) = \frac{p(z_t | x_t)\, p(x_t | Z_{t-1})}{p(z_t | Z_{t-1})} = \frac{p(z_t | x_t)\, p(x_t | Z_{t-1})}{\int p(z_t | x_t)\, p(x_t | Z_{t-1})\, dx_t}, \tag{2.4}$$
where $Z_t$ is the σ-algebra generated by the measurements up to time $t$ and $p(z_t | x_t)$ is the
likelihood of observing $z_t$ given signal $x_t$. The estimated signal $\hat{x}_t$ can, for example, be
taken to be the conditional mean of $x_t$,
$$\hat{x}_t = E(x_t | Z_t) = \int x_t\, p(x_t | Z_t)\, dx_t. \tag{2.5}$$
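As an illustration of the recursion (2.3)-(2.5), the following sketch evaluates the prediction and update integrals numerically on a fixed one-dimensional grid. The random-walk transition density, the Gaussian likelihood, the grid limits and the measurement values are all assumptions made for this example and are not taken from the thesis.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def predict(grid, posterior, dx, process_var):
    """Prediction (2.3) with an assumed random-walk transition p(x_t | x_{t-1})."""
    prior = []
    for x in grid:
        total = sum(gaussian_pdf(x, xp, process_var) * p for xp, p in zip(grid, posterior))
        prior.append(total * dx)  # Riemann sum over x_{t-1}
    return prior

def update(grid, prior, dx, z, meas_var):
    """Update (2.4) with an assumed likelihood p(z_t | x_t) = N(z_t; x_t, meas_var)."""
    unnormalised = [gaussian_pdf(z, x, meas_var) * p for x, p in zip(grid, prior)]
    evidence = sum(unnormalised) * dx  # approximates p(z_t | Z_{t-1})
    return [u / evidence for u in unnormalised]

def conditional_mean(grid, density, dx):
    """Conditional mean estimate (2.5)."""
    return sum(x * p for x, p in zip(grid, density)) * dx

dx = 0.1
grid = [-10.0 + dx * i for i in range(201)]
density = [gaussian_pdf(x, 0.0, 4.0) for x in grid]  # assumed initial density
for z in [1.0, 1.2, 0.9]:  # hypothetical measurements
    density = update(grid, predict(grid, density, dx, 0.25), dx, z, 1.0)
estimate = conditional_mean(grid, density, dx)
```

A grid is only practical in low dimensions, which is one motivation for the closed-form and Monte Carlo approximations described in the remainder of the chapter.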
2.3 Kalman Filtering Techniques
In this section, the Kalman filter [1] is derived and variants of this technique for non-linear
scenarios are described including the extended Kalman filter (EKF) [38], unscented Kalman
filter (UKF) [39] and Gaussian sum filter [40].
2.3.1 The Kalman Filter
The Kalman Filter [1] recursively calculates the exact posterior distribution based on the
assumptions that the posterior distribution is Gaussian, the process and observation noises $v_t$
and $\varepsilon_t$ are uncorrelated, white noise sequences with mean zero, and the state and measurement
equations $f_t$ and $h_t$ are linear functions. The state and measurement equations are
$$x_t = F_t x_{t-1} + \Gamma v_{t-1}, \tag{2.6}$$
$$z_t = H_t x_t + \varepsilon_t, \tag{2.7}$$
where $v_{t-1}$ and $\varepsilon_t$ are uncorrelated, white Gaussian noise sequences with mean zero and
covariance matrices $Q$ and $R$ respectively, and $F_t$ and $H_t$ are matrices defining the linear
functions $f_t$ and $h_t$ respectively. In what follows, the time subscripts on $F_t$ and $H_t$ are
dropped for brevity. The expectation and covariance of the signal given the set
of measurements up to time $t$ are denoted $E(x_t | Z_t) := \hat{x}_t$ and $\mathrm{Cov}(x_t | Z_t) := P_t$. The notation
used for Gaussians shall be
$$\mathcal{N}(x; m, P) := \frac{1}{(2\pi)^{d/2} \det(P)^{1/2}} \exp\left(-\tfrac{1}{2}(x-m)^T P^{-1} (x-m)\right), \tag{2.8}$$
with variable $x$ of dimension $d$, mean $m$ and covariance $P$.
The following two Theorems establish the prediction and update steps required in the
Kalman filter based on the Bayesian derivation of the Kalman filter by Ho and Lee [37].
THEOREM 1 Given the Gaussian posterior distribution at time $t-1$ and the linear state
model, the prior probability distribution at time $t$ is the Gaussian
$$p(x_t | Z_{t-1}) = \mathcal{N}(x_t; \hat{x}_{t|t-1}, P_{t|t-1}), \tag{2.9}$$
where the predicted state estimate and covariance to time $t$ are
$$\hat{x}_{t|t-1} := F \hat{x}_{t-1}, \tag{2.10}$$
$$P_{t|t-1} := F P_{t-1} F^T + \Gamma Q \Gamma^T. \tag{2.11}$$
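Proof
The expectation of the state at time $t$, given measurements up to time $t-1$, is, by equation
(2.6),
$$E(x_t | Z_{t-1}) = E(F x_{t-1} + \Gamma v_{t-1} | Z_{t-1}), \tag{2.12}$$
which, by the linearity of expectation,
$$= E(F x_{t-1} | Z_{t-1}) + E(\Gamma v_{t-1} | Z_{t-1}) = F E(x_{t-1} | Z_{t-1}) + \Gamma E(v_{t-1} | Z_{t-1}) = F \hat{x}_{t-1}. \tag{2.13}$$
The last equality holds since $\hat{x}_{t-1} := E(x_{t-1} | Z_{t-1})$ and $E(v_{t-1}) = 0$. The prediction covariance
is calculated with (here $x_{t-1}$ is taken relative to its conditional mean $\hat{x}_{t-1}$, so that
$E(x_{t-1} x_{t-1}^T | Z_{t-1}) = P_{t-1}$)
$$\mathrm{Cov}(x_t | Z_{t-1}) \tag{2.14}$$
$$= E((F x_{t-1} + \Gamma v_{t-1})(F x_{t-1} + \Gamma v_{t-1})^T | Z_{t-1}) \tag{2.15}$$
$$= E(F x_{t-1} x_{t-1}^T F^T + \Gamma v_{t-1} v_{t-1}^T \Gamma^T | Z_{t-1}) \tag{2.16}$$
(since the cross terms are zero)
$$= F P_{t-1} F^T + \Gamma Q \Gamma^T = P_{t|t-1}. \tag{2.17}$$
Combining the mean and covariance gives the required Gaussian. ■¹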
THEOREM 2 Given that the prior probability density to time $t$ is Gaussian and that the
dynamic model is linear, the posterior distribution at time $t$ is also Gaussian, and is given
by
$$p(x_t | Z_t) = \mathcal{N}(x_t; \hat{x}_t, P_t). \tag{2.18}$$
The state estimate $\hat{x}_t$ and covariance $P_t$ are obtained by
$$\hat{x}_t = \hat{x}_{t|t-1} + K_t (z_t - H \hat{x}_{t|t-1}), \tag{2.19}$$
$$P_t = [I - K_t H] P_{t|t-1}, \tag{2.20}$$
where $K_t$ is known as the Kalman gain,
$$K_t = P_{t|t-1} H^T [H P_{t|t-1} H^T + R]^{-1}. \tag{2.21}$$
¹ The black square (■) shall be used to denote the end of a proof of a Theorem or Lemma throughout the thesis.
To prove this theorem, we need Lemmas 1 and 2, which are given after the proof of the
main result.
Proof
By Bayes’ rule, equation (2.4),
$$p(x_t | Z_t) = \frac{p(z_t | x_t)\, p(x_t | Z_{t-1})}{p(z_t | Z_{t-1})}, \tag{2.22}$$
which, by Lemmas 1 and 2, and Theorem 1,
$$= \frac{\mathcal{N}(z_t; H x_t, R)\, \mathcal{N}(x_t; F \hat{x}_{t-1}, P_{t|t-1})}{\mathcal{N}(z_t; H F \hat{x}_{t-1}, H P_{t|t-1} H^T + R)} = \mathcal{N}(x_t; \hat{x}_t, P_t), \tag{2.23}$$
where the final equality comes from completing the square, which results in the state estimate
$\hat{x}_t$ and covariance $P_t$ given in equations (2.19) and (2.20). ■
LEMMA 1 The probability distribution of $z_t$ based on measurements up to time $t-1$ is
given by the Gaussian
$$p(z_t | Z_{t-1}) = \mathcal{N}(z_t; H F \hat{x}_{t-1}, H P_{t|t-1} H^T + R). \tag{2.24}$$
Proof
We compute the expectation of $z_t$ given the measurements up to time $t-1$, by first using
the measurement equation,
$$E(z_t | Z_{t-1}) = E(H x_t + \varepsilon_t | Z_{t-1}), \tag{2.25}$$
and then the state equation,
$$= E(H(F x_{t-1} + \Gamma v_{t-1}) + \varepsilon_t | Z_{t-1}) = H F \hat{x}_{t-1}, \tag{2.26}$$
where the last equality holds since $E(v_{t-1}) = 0$ and $E(\varepsilon_t) = 0$. Now consider the covariance,
$$\mathrm{Cov}(z_t | Z_{t-1}) = E((H x_t + \varepsilon_t)(H x_t + \varepsilon_t)^T | Z_{t-1}) = H P_{t|t-1} H^T + R. \tag{2.27}$$
The Lemma is proved by combining the mean and covariance above. ■
LEMMA 2 The likelihood of observing $z_t$ given state $x_t$ is the Gaussian
$$p(z_t | x_t) = \mathcal{N}(z_t; H x_t, R). \tag{2.28}$$
Proof
The expectation is computed using the measurement equation,
$$E(z_t | x_t) = E(H x_t + \varepsilon_t | x_t) = H x_t, \tag{2.29}$$
since $E(\varepsilon_t) = 0$. Similarly, the covariance is calculated,
$$\mathrm{Cov}(z_t | x_t) = E((H x_t + \varepsilon_t)(H x_t + \varepsilon_t)^T | x_t) \tag{2.30}$$
$$= E(H x_t x_t^T H^T + \varepsilon_t \varepsilon_t^T | x_t) = H\, \mathrm{Cov}(x_t | x_t)\, H^T + R = R. \tag{2.31}$$
Combining the mean and covariance gives the required Gaussian likelihood function. ■
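The prediction and update steps of Theorems 1 and 2 can be sketched for a scalar state, where the matrices F, Γ, H, Q and R reduce to numbers. The function names, the model values and the measurements below are assumptions chosen for illustration only.

```python
def kalman_predict(x_est, p_est, f, gamma, q):
    """Prediction step, equations (2.10) and (2.11), for a scalar state."""
    x_pred = f * x_est
    p_pred = f * p_est * f + gamma * q * gamma
    return x_pred, p_pred

def kalman_update(x_pred, p_pred, z, h, r):
    """Update step, equations (2.19) to (2.21), for a scalar state."""
    innovation_var = h * p_pred * h + r      # H P H^T + R
    gain = p_pred * h / innovation_var       # Kalman gain K_t
    x_est = x_pred + gain * (z - h * x_pred)
    p_est = (1.0 - gain * h) * p_pred
    return x_est, p_est

# Assumed random-walk model with initial estimate 0 and variance 4.
x_est, p_est = 0.0, 4.0
for z in [1.0, 1.2, 0.9]:  # hypothetical measurements
    x_pred, p_pred = kalman_predict(x_est, p_est, f=1.0, gamma=1.0, q=0.25)
    x_est, p_est = kalman_update(x_pred, p_pred, z, h=1.0, r=1.0)
```

In the vector case the divisions become matrix inverses and the products acquire transposes, exactly as in equations (2.11) and (2.21).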
2.3.2 The Extended Kalman Filter
If the process to be estimated or the relationship between the measurement and the state
is non-linear, then the conditions required for the Kalman filter are no longer valid. The
extended Kalman filter linearises about the current mean and covariance using Taylor ap-
proximations [38]. The state and observation equations are now described by the equations,
xt = ft(xt−1,vt−1) (2.32)
zt = ht(xt ,εt), (2.33)
where the functions ft and ht can be non-linear. The noise sequences vt−1 and εt are
zero mean white Gaussian noises, as with the Kalman filter. For simplicity, we define the
notation ht(xt) := ht(xt ,0) and ft(xt) := ft(xt ,0).
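To derive the extended Kalman filter, the following partial derivatives are required for
the state equation,
$$F_{t-1} = \left.\frac{\partial f_t(x_{t-1}, 0)}{\partial x_{t-1}}\right|_{x_{t-1} = \hat{x}_{t-1}}, \qquad G_{t-1} = \left.\frac{\partial f_t(\hat{x}_{t-1}, v_{t-1})}{\partial v_{t-1}}\right|_{v_{t-1} = 0}, \tag{2.34}$$
and for the measurement equation,
$$H_t = \left.\frac{\partial h_t(x)}{\partial x}\right|_{x = \hat{x}_{t|t-1}}, \qquad U_t = \left.\frac{\partial h_t(\hat{x}_{t|t-1}, \varepsilon_t)}{\partial \varepsilon_t}\right|_{\varepsilon_t = 0}. \tag{2.35}$$
The nonlinear functions $f_t$ and $h_t$ can then be expanded in terms of their Taylor series,
$$f_t(x_t) = f_t(\hat{x}_{t|t}) + F_t (x_t - \hat{x}_{t|t}) + \ldots \tag{2.36}$$
$$h_t(x_t) = h_t(\hat{x}_{t|t-1}) + H_t (x_t - \hat{x}_{t|t-1}) + \ldots, \tag{2.37}$$
and the model can be approximated with
$$x_{t+1} = F_t x_t + G_t v_t + (f_t(\hat{x}_{t|t}) - F_t \hat{x}_{t|t}), \tag{2.38}$$
$$z_t = H_t x_t + \varepsilon_t + (h_t(\hat{x}_{t|t-1}) - H_t \hat{x}_{t|t-1}). \tag{2.39}$$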
Prediction Step
The predicted state and covariance are given by the extended Kalman prediction equations,
$$\hat{x}_{t|t-1} = f_t(\hat{x}_{t-1}, 0), \tag{2.40}$$
$$P_{t|t-1} = G_{t-1} Q_{t-1} [G_{t-1}]^T + F_{t-1} P_{t-1} [F_{t-1}]^T. \tag{2.41}$$
Measurement Update
The updated state and covariance are computed with the extended Kalman update equations,
$$\hat{x}_t = \hat{x}_{t|t-1} + K_t (z_t - h_t(\hat{x}_{t|t-1})), \tag{2.42}$$
$$P_t = [I - K_t H_t] P_{t|t-1}, \tag{2.43}$$
where the extended Kalman filter gain is
$$K_t = P_{t|t-1} [H_t]^T [U_t R_t U_t^T + H_t P_{t|t-1} H_t^T]^{-1}. \tag{2.44}$$
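A single extended Kalman filter step can be sketched for a scalar state with additive noise, so that the noise Jacobians G and U are both equal to one. The state update is written in the standard form, predicted state plus gain times innovation. The quadratic measurement function and all numerical values are assumptions for this example.

```python
def ekf_step(x_est, p_est, z, f, df_dx, h, dh_dx, q, r):
    """One scalar EKF cycle with additive noise (G = U = 1)."""
    # Prediction: propagate the mean through f, the variance through its derivative.
    x_pred = f(x_est)
    F = df_dx(x_est)
    p_pred = F * p_est * F + q
    # Update: linearise h about the predicted state.
    H = dh_dx(x_pred)
    gain = p_pred * H / (H * p_pred * H + r)
    x_new = x_pred + gain * (z - h(x_pred))
    p_new = (1.0 - gain * H) * p_pred
    return x_new, p_new

# Assumed model: static state observed through z = x^2 + noise.
x_new, p_new = ekf_step(
    x_est=2.0, p_est=0.5, z=4.41,
    f=lambda x: x, df_dx=lambda x: 1.0,
    h=lambda x: x * x, dh_dx=lambda x: 2.0 * x,
    q=0.01, r=0.1,
)
```

With these values the measurement 4.41 corresponds to a state of 2.1, and the update moves the estimate from 2.0 towards that value while reducing the variance.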
In Theorems 1 and 2, it was shown that when the dynamic model is linear and Gaussian,
the Kalman prediction and update equations result in another Gaussian. Since the Extended
Kalman Filter only produces approximations for $\hat{x}_{t|t-1}$ and $P_{t|t-1}$, and the true predicted
density is no longer Gaussian, it is useful to know under what circumstances the approximation
is accurate. The following results, from Anderson and Moore [44], give conditions for the
convergence of the filter (we omit the proofs here).
LEMMA 3 If the prior probability at time $t$ is the Gaussian
$$p(x_t | Z_{t-1}) = \mathcal{N}(x_t; \hat{x}_{t|t-1}, P_{t|t-1}), \tag{2.45}$$
then for fixed $h_t$, $\hat{x}_{t|t-1}$ and $R_t$,
$$p(x_t | Z_t) \to \mathcal{N}(x_t; \hat{x}_{t|t}, P_{t|t}) \tag{2.46}$$
uniformly in $x_t$ and $z_t$ as $P_{t|t-1} \to 0$.
LEMMA 4 If the posterior density at time $t$ is the Gaussian
$$p(x_t | Z_t) = \mathcal{N}(x_t; \hat{x}_{t|t}, P_{t|t}), \tag{2.47}$$
then
$$p(x_{t+1} | Z_t) \to \mathcal{N}(x_{t+1}; \hat{x}_{t+1|t}, P_{t+1|t}) \tag{2.48}$$
as $P_t \to 0$.
These Lemmas will also be invoked to give convergence results for the Gaussian Sum
filter and the Gaussian Mixture Probability Hypothesis Density Filter to be described in a
later chapter.
2.3.3 The Unscented Kalman Filter
The Unscented Kalman filter, a new linear estimator developed in the mid-nineties by Julier [50],
uses a set of discretely sampled points to parameterise the
mean and covariance. This technique does not require the linearisation steps needed for the
extended Kalman filter and it was shown that the performance is analytically superior to the
extended Kalman filter.
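The idea behind the unscented transform is that it is easier to approximate a Gaussian
distribution than an arbitrary non-linear function. A set of sigma-points is chosen so that
their mean and covariance are $\hat{x}_{t-1}$ and $P_{t-1}$. The non-linear transform is applied to each
of the points to obtain a set of transformed points with mean $\hat{x}_{t|t-1}$ and covariance $P_{t|t-1}$. If
the state dimension is $n$, then $2n+1$ sample points are chosen deterministically with
$$\chi^{(0)}_{t-1} = \hat{x}_{t-1}, \qquad w^{(0)}_{t-1} = \kappa/(n+\kappa), \tag{2.49}$$
$$\chi^{(i)}_{t-1} = \hat{x}_{t-1} + \left(\sqrt{(n+\kappa) P_{t-1}}\right)_i, \qquad w^{(i)}_{t-1} = 1/(2(n+\kappa)), \tag{2.50}$$
$$\chi^{(i+n)}_{t-1} = \hat{x}_{t-1} - \left(\sqrt{(n+\kappa) P_{t-1}}\right)_i, \qquad w^{(i+n)}_{t-1} = 1/(2(n+\kappa)), \tag{2.51}$$
for $i = 1, \ldots, n$, where $\kappa \in \mathbb{R}$, $(\sqrt{(n+\kappa) P_{t-1}})_i$ is the $i$th row or column of the matrix
square root, and $w^{(i)}_{t-1}$ is the weight associated with the $i$th sigma point at time $t-1$, so that
the weights sum to one. Similarly, a set of points is computed for the observation equation $h$.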
The predicted mean and covariance are computed with the following summations,
$$\hat{x}_{t|t-1} = \sum_{i=0}^{2n} w^{(i)}_{t-1} f(\chi^{(i)}_{t-1}), \tag{2.52}$$
$$P_{t|t-1} = \sum_{i=0}^{2n} w^{(i)}_{t-1} (f(\chi^{(i)}_{t-1}) - \hat{x}_{t|t-1})(f(\chi^{(i)}_{t-1}) - \hat{x}_{t|t-1})^T, \tag{2.53}$$
and these are updated with the usual Kalman filter when a measurement is received. Similar
calculations are performed to find the predicted observation and innovation covariance.
The mean and covariance are calculated accurately up to the second order whereas the
EKF is only accurate up to first order. The main difference between this and the EKF is
that the distribution is being approximated instead of the state function ft . Numerically
stable and efficient methods can be used to compute the sigma points and there is no need
to calculate complicated Jacobian matrices. Practical examples have demonstrated the use
of the UKF in real tracking scenarios and it compares favourably with the EKF both in
accuracy and ease of implementation.
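The unscented transform can be sketched for a scalar state (n = 1), where the matrix square root reduces to an ordinary square root and three sigma points are used. The sketch assumes the standard weights, κ/(n+κ) for the central point and 1/(2(n+κ)) for each outer point, which sum to one. The quadratic test function and the choice κ = 2 are assumptions for illustration.

```python
import math

def sigma_points(mean, var, kappa):
    """Sigma points and weights for a scalar state (n = 1)."""
    n = 1
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    outer = 1.0 / (2.0 * (n + kappa))
    weights = [kappa / (n + kappa), outer, outer]
    return points, weights

def unscented_transform(f, mean, var, kappa):
    """Propagate a mean and variance through f using weighted sigma points."""
    points, weights = sigma_points(mean, var, kappa)
    transformed = [f(p) for p in points]
    y_mean = sum(w * y for w, y in zip(weights, transformed))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, transformed))
    return y_mean, y_var

ut_mean, ut_var = unscented_transform(lambda x: x * x, mean=1.0, var=0.25, kappa=2.0)
```

For a Gaussian with mean m and variance P, the square of the variable has mean m² + P and variance 4m²P + 2P², that is 1.25 and 1.125 here. With n + κ = 3 the sigma points reproduce both values for this quadratic function, without any derivative of the function being required.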
2.3.4 The Gaussian Sum Filter
The Gaussian Sum filter [44] was developed to allow for more general probability distri-
butions than just a unimodal Gaussian distribution. The state estimate is a weighted sum
of the filter outputs from a set of Extended Kalman filters. The justification behind this
approach is due to a consequence of Wiener’s Theorem on Approximation that any prob-
ability density can be approximated to an arbitrary degree with a sum of Gaussians, see
Theorem 3.
THEOREM 3 Any density $v$ on $\mathbb{R}^d$ can be approximated as closely as desired in $L^1$ by a linear
combination of Gaussian densities,
$$v(x) = \lim_{n\to\infty} \sum_{i=1}^{n} \alpha_i \mathcal{N}(x; \mu_i, P_i). \tag{2.54}$$
Proof
This result is due to Wiener’s theorem on approximation [51]. ■
This means that given any $\varepsilon > 0$, a positive integer $N$ can be found such that
$$\int \left| v(x) - \sum_{i=1}^{n} \alpha_i \mathcal{N}(x; \mu_i, P_i) \right| dx \le \varepsilon, \tag{2.55}$$
for $n \ge N$.
Assume that the posterior at time $t$ is given by the Gaussian sum
$$p(x_t | Z_t) = \sum_{i=1}^{J_t} w^{(i)}_t \mathcal{N}(x; m^{(i)}_t, P^{(i)}_t). \tag{2.56}$$
Then the mean and covariance are
$$\hat{x}_t = \sum_{i=1}^{J_t} w^{(i)}_t m^{(i)}_t, \tag{2.57}$$
$$P_t = E[(x_t - \hat{x}_t)(x_t - \hat{x}_t)^T] = \sum_{i=1}^{J_t} w^{(i)}_t [P^{(i)}_t + (\hat{x}_t - m^{(i)}_t)(\hat{x}_t - m^{(i)}_t)^T], \tag{2.58}$$
and the sum of the weights is 1,
$$\sum_{i=1}^{J_t} w^{(i)}_t = 1. \tag{2.59}$$
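The overall mean and covariance of a Gaussian sum, equations (2.57) and (2.58), can be computed as in the following scalar sketch. The two-component mixture is an assumed example.

```python
def mixture_moments(weights, means, variances):
    """Mean (2.57) and variance (2.58) of a scalar Gaussian mixture."""
    mean = sum(w * m for w, m in zip(weights, means))
    var = sum(w * (p + (mean - m) ** 2) for w, m, p in zip(weights, means, variances))
    return mean, var

# Assumed two-component mixture whose weights sum to one, as required by (2.59).
mix_mean, mix_var = mixture_moments([0.3, 0.7], [-1.0, 2.0], [0.5, 1.0])
```

The second term in the variance accounts for the spread of the component means about the overall mean, so the mixture variance (2.74 here) exceeds the weighted average of the component variances (0.85).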
The algorithm is initialised with a set of Gaussians and follows the prediction and update
recursion given below.
Prediction Step
The individual components are predicted into the next time step using the Extended Kalman
Filter prediction equations (2.40) and (2.41).
LEMMA 5 If the posterior at time $t-1$ is given by the sum of Gaussians
$$p(x_{t-1} | Z_{t-1}) = \sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1} \mathcal{N}(x; m^{(i)}_{t-1}, P^{(i)}_{t-1}), \tag{2.60}$$
then the predicted density approaches the Gaussian sum
$$p(x_t | Z_{t-1}) \to \sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1} \mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}), \tag{2.61}$$
uniformly in $x_t$ as $P^{(i)}_{t-1} \to 0$ for each component $i$.
Proof
Each component converges uniformly using Lemma 4, the prediction result for the Extended
Kalman Filter, and the result follows. ■
Measurement Update
When the new measurement, $z_t$, becomes available at time $t$, the Gaussian components are
updated with the Extended Kalman update. The weights are recomputed according to the
Gaussian likelihood function and normalised so that they sum to one,
$$w^{(i)}_t = \frac{w^{(i)}_{t-1}\, \mathcal{N}(z_t; h_t(m^{(i)}_{t|t-1}), H_t P^{(i)}_{t|t-1} H_t^T + R)}{\sum_{l=1}^{J_t} w^{(l)}_{t-1}\, \mathcal{N}(z_t; h_t(m^{(l)}_{t|t-1}), H_t P^{(l)}_{t|t-1} H_t^T + R)}. \tag{2.62}$$
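The weight update can be sketched for scalar components as follows. The sketch assumes that each prior weight multiplies the Gaussian likelihood of the measurement under that component and that the results are normalised to sum to one. The linear measurement function and the numerical values are assumptions for the example.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def update_weights(weights, pred_means, pred_vars, z, h, dh_dx, r):
    """Reweight each component by the likelihood of z under its predicted measurement."""
    unnormalised = []
    for w, m, p in zip(weights, pred_means, pred_vars):
        H = dh_dx(m)                       # linearisation about the component mean
        innovation_var = H * p * H + r     # H P H^T + R
        unnormalised.append(w * gaussian_pdf(z, h(m), innovation_var))
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

# Assumed example: two equally weighted components, measurement close to the first.
new_weights = update_weights(
    [0.5, 0.5], [0.0, 4.0], [1.0, 1.0],
    z=1.0, h=lambda x: x, dh_dx=lambda x: 1.0, r=1.0,
)
```

The component whose predicted measurement lies closer to the observation receives most of the weight, about 0.88 in this example.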
LEMMA 6 Suppose that the predicted density to time $t$ is given by the Gaussian sum
$$p(x_t | Z_{t-1}) = \sum_{i=1}^{J_t} w^{(i)}_{t-1} \mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}). \tag{2.63}$$
Then the updated density approaches the Gaussian sum
$$p(x_t | Z_t) \to \sum_{i=1}^{J_t} w^{(i)}_t \mathcal{N}(x; m^{(i)}_t, P^{(i)}_t) \tag{2.64}$$
as $P^{(i)}_{t|t-1} \to 0$ for each component $i$.
Proof
Each term converges uniformly by Lemma 3, the update result for the Extended Kalman filter,
and the result follows. ■
2.4 Sequential Monte Carlo Filtering
The Bayesian filtering equations cannot usually be computed analytically for general probability
distributions, and so Sequential Monte Carlo techniques, or particle filters, have proved
to be a successful method for approximating them.
Particle filters are sequential Monte Carlo methods based on point mass, or particle,
representations of probability densities, in which the weights of the particles correspond to the
probability distribution. The basic concept is to implement a recursive Bayesian filter by Monte Carlo
simulation. The Bootstrap Filter was proposed by Gordon [2] for implementing recursive
Bayesian filters with empirical representations of the probability densities. The density of
the state vector is represented by particles which are updated and propagated by the algorithm. The
idea is to eliminate particles having low importance weights and multiply particles having
high importance weights. A tutorial on particle filters and their variants is given in [52] and a
review of Sequential Monte Carlo methods with applications is presented in [41]. This
section describes sequential importance sampling, how this relates to particle filtering algorithms
and the convergence properties of these algorithms.
2.4.1 Sequential Importance Sampling and Resampling
A common technique for approximating a probability distribution is Importance Sampling.
Suppose that we wish to draw samples from a probability distribution $p(x) \propto \pi(x)$
which is difficult to sample from, but for which $\pi(x)$ can be evaluated. Let $q(x)$ be an
importance density from which we can generate $N$ samples $x^{(i)}$. Then a weighted approximation
to $p(x)$ is given by
$$p(x) \approx \sum_{i=1}^{N} \omega^{(i)} \delta(x - x^{(i)}), \tag{2.65}$$
$$\omega^{(i)} \propto \frac{\pi(x^{(i)})}{q(x^{(i)})}, \tag{2.66}$$
where $\omega^{(i)}$ is the normalised weight of particle $x^{(i)}$.
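Importance sampling can be sketched as follows: samples are drawn from an importance density q, weighted by the ratio of the unnormalised target to q, and the weights are normalised. The particular target (proportional to a Gaussian with mean 1 and unit variance), the proposal (a zero-mean Gaussian with standard deviation 2), the sample size and the seed are assumptions for this example.

```python
import math
import random

def target_unnormalised(x):
    """Unnormalised target: proportional to N(x; 1, 1)."""
    return math.exp(-0.5 * (x - 1.0) ** 2)

def proposal_pdf(x, sigma):
    """Importance density q(x) = N(x; 0, sigma^2)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def importance_sample(n, rng, sigma=2.0):
    """Draw n samples from q and return them with normalised importance weights."""
    samples = [rng.gauss(0.0, sigma) for _ in range(n)]
    raw = [target_unnormalised(x) / proposal_pdf(x, sigma) for x in samples]
    total = sum(raw)
    return samples, [w / total for w in raw]

rng = random.Random(0)
samples, weights = importance_sample(20000, rng)
is_mean = sum(w * x for w, x in zip(weights, samples))  # estimate of the target mean
```

Because the weights are normalised, the target density only needs to be known up to a constant factor.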
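In the sequential setting, $\pi$ is used from here on to denote the importance sampling
distribution. At time $t$ it is given by
$$\pi(x_t | Z_t) = \pi(x_0) \prod_{k=1}^{t} \pi(x_k | x_{k-1}, Z_k). \tag{2.67}$$
The weights can be calculated recursively by
$$\omega^{(i)}_t \propto \omega^{(i)}_{t-1} \frac{p(z_t | x^{(i)}_t)\, p(x^{(i)}_t | x^{(i)}_{t-1})}{\pi(x^{(i)}_t | x^{(i)}_{t-1}, Z_t)}. \tag{2.68}$$
This technique can be applied sequentially when the prior distribution is the importance
sampling distribution, with initial distribution $\pi_0$,
$$\pi(x_t | Z_t) = p(x_t) = \pi_0(x_0) \prod_{k=1}^{t} p(x_k | x_{k-1}). \tag{2.69}$$
The weights then satisfy
$$\omega^{(i)}_t \propto \omega^{(i)}_{t-1}\, p(z_t | x^{(i)}_t), \tag{2.70}$$
and we need only calculate the likelihood function $p(z_t | x^{(i)}_t)$.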
This technique suffers from a problem called degeneracy: after a few
iterations, all but a few of the particles have negligible weights. This problem is resolved by resampling
from the weighted distribution to obtain an unweighted particle set which approximates the
posterior distribution.
2.4.2 The Particle Filter Algorithm
The Particle Filter or Bootstrap Filter was proposed by Gordon [2] for implementing recursive
Bayesian filters. The density of the state vector is represented by the discrete samples,
or particles. A description of the algorithm is given below.
• Step 0: Initialisation Step at t = 0.
In the initialisation step, we assume that we can sample $N$ particles directly from the prior
$\pi_0$; each one is assigned a mass of $\omega^{(i)}_0 = 1/N$, hence
$$\pi^N_0 = \frac{1}{N} \sum_{i=1}^{N} \delta_{x^{(i)}_0}, \tag{2.71}$$
where $\delta_{x^{(i)}_0}$ represents the Dirac delta function located at particle position $x^{(i)}_0$. We have
assumed that we can sample directly from $\pi_0$ so by the Glivenko-Cantelli Theorem [53],
which states that empirical distributions converge almost surely to their true distributions,
$$\lim_{N\to\infty} \pi^N_0 = \pi_0 \quad \text{a.s.} \tag{2.72}$$
Set t = 1.
• Step 1: Prediction Step at t ≥ 1.
A predicted state $\tilde{x}^{(i)}_t$ for each particle $x^{(i)}_{t-1}$ is obtained by sampling from the Markov
transition kernel $f(x^{(i)}_{t-1}, \cdot)$,
$$\tilde{x}^{(i)}_t \sim f(x^{(i)}_{t-1}, \cdot). \tag{2.73}$$
The set of predicted particles gives a discrete approximation to the prior probability density $p(x_t | Z_{t-1})$.
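• Step 2: Update Step at t ≥ 1.
When the new measurement $z_t$ is obtained, weights are updated for the predicted particles,
denoted $\tilde{x}^{(i)}_t$, by using the likelihood function $g(\cdot|\cdot)$,
$$\omega^{(i)}_t = \frac{g(z_t | \tilde{x}^{(i)}_t)}{\sum_{j=1}^{N} g(z_t | \tilde{x}^{(j)}_t)}. \tag{2.74}$$
The posterior distribution $p(x_t | Z_t) := \pi_t$ is represented by the measure
$$\pi^N_t = \sum_{i=1}^{N} \omega^{(i)}_t \delta_{\tilde{x}^{(i)}_t}. \tag{2.75}$$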
• Step 3: Resampling Step, for t ≥ 1.
$N$ new particles, $x^{(i)}_t$, $i = 1, \ldots, N$, are created by resampling from the predicted
particles, $\tilde{x}^{(i)}_t$, $i = 1, \ldots, N$, according
to their weights. Thus particles with large weights will tend to be resampled more often than
those with low weights, and particles with low weights may be eliminated. This creates an
unweighted representation of the posterior distribution $\pi_t$,
$$\pi^N_t = \frac{1}{N} \sum_{i=1}^{N} \delta_{x^{(i)}_t}. \tag{2.76}$$
If $Z_t$ is the σ-algebra generated by the measurements up to time $t$, and $G_t$ is the σ-algebra
generated by the particles at time $t$, then the estimated state $\hat{x}_t$ is approximated by
$$\hat{x}^N_t = E^N(x_t | Z_t) = E(x_t | G_t) = \frac{1}{N} \sum_{i=1}^{N} x^{(i)}_t, \tag{2.77}$$
where $E^N(x_t | Z_t)$ represents the approximation to the true expectation $E(x_t | Z_t)$ given by the
particles.
Set t = t + 1 and repeat from Step 1.
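Steps 0 to 3 above can be sketched for a scalar state as follows. The random-walk transition kernel, the Gaussian likelihood, the prior, the number of particles, the seed and the measurements are assumptions made for this example; multinomial resampling is used as one simple choice of resampling scheme.

```python
import math
import random

def likelihood(z, x, meas_var):
    """Assumed Gaussian likelihood g(z | x), up to a constant that cancels on normalisation."""
    return math.exp(-0.5 * (z - x) ** 2 / meas_var)

def bootstrap_filter(measurements, n_particles, rng,
                     prior_var=4.0, process_var=0.25, meas_var=1.0):
    # Step 0: sample particles directly from the (assumed Gaussian) prior.
    particles = [rng.gauss(0.0, math.sqrt(prior_var)) for _ in range(n_particles)]
    estimates = []
    for z in measurements:
        # Step 1: propagate each particle through the transition kernel.
        predicted = [x + rng.gauss(0.0, math.sqrt(process_var)) for x in particles]
        # Step 2: weight the predicted particles by the likelihood and normalise.
        raw = [likelihood(z, x, meas_var) for x in predicted]
        total = sum(raw)
        weights = [w / total for w in raw]
        # Step 3: resample with replacement according to the weights.
        particles = rng.choices(predicted, weights=weights, k=n_particles)
        # State estimate: mean of the unweighted, resampled particles.
        estimates.append(sum(particles) / n_particles)
    return estimates

pf_estimates = bootstrap_filter([1.0, 1.2, 0.9], 5000, random.Random(1))
```

For this linear Gaussian model the exact posterior mean after the third measurement is approximately 0.963 (the Kalman filter value), so the particle estimate should lie close to it, with an error that shrinks as the number of particles grows.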
2.4.3 Convergence Properties
One of the crucial considerations for the particle filter algorithm is whether it converges:
that is, as the number of particles increases, does the empirical distribution given
by the particles tend to the true distribution in some sense, and can the errors in the
approximation be bounded? Convergence studies by Crisan [54], [55], [56] amongst others
have demonstrated convergence of the mean square errors and weak convergence of the
empirical measures to the true measures at each step in the algorithm. When the density
in the inner product $\langle \cdot, \cdot \rangle$ is continuous, it defines the integral inner product, and when it is
discrete, it defines the summation inner product, so that
$$\langle \pi_t, \varphi \rangle = \int \pi_t(x_t | Z_{1:t})\, \varphi(x_t)\, dx_t \tag{2.78}$$
and
$$\langle \pi^N_t, \varphi \rangle = \sum_{i=1}^{N} \omega^{(i)}_t \varphi(x^{(i)}_t). \tag{2.79}$$
If $\pi_t$ is the posterior distribution at time $t$, then it can be shown that at time $t$, there is a
constant $c$ such that
$$E\left[ \left( \langle \pi^N_t, \varphi \rangle - \langle \pi_t, \varphi \rangle \right)^2 \right] \le c\, \frac{\|\varphi\|^2}{N}, \tag{2.80}$$
for any bounded function $\varphi$.
To prove that an empirical distribution converges to its true distribution, we need to
have a notion of convergence for measures. This type of convergence is called weak convergence,
which is fundamental to the study of probability and statistics. With this type
of convergence, the values of the random variables are not important; it is the probabilities
with which they assume those values that matter. Thus, the probability distributions of
the random variables will be converging, not the values themselves [57]. Let $\mu_N$ and $\mu$ be
probability measures on $\mathbb{R}^d$. Then, the sequence $\mu_N$ converges weakly to $\mu$ if $\int f(x)\, \mu_N(dx)$
converges to $\int f(x)\, \mu(dx)$ for each real-valued continuous and bounded function $f$ on $\mathbb{R}^d$.
The empirical measures considered here are given by the particles that approximate the true
measures, where $N$ is the number of particles. Let $C_b(\mathbb{R}^d)$ be the set of real-valued continuous
bounded functions on $\mathbb{R}^d$. If $(\mu_N)$ is a sequence of measures, then $\mu_N$ converges weakly to
$\mu$ if, for every $\varphi \in C_b(\mathbb{R}^d)$,
$$\lim_{N\to\infty} \langle \mu_N, \varphi \rangle = \langle \mu, \varphi \rangle. \tag{2.81}$$
We can write this as
$$\lim_{N\to\infty} \pi^N_t = \pi_t \quad \text{a.s.} \tag{2.82}$$
where a.s. stands for almost surely, i.e. true for all outcomes outside a null set (a set of probability zero).
In chapter 4, a study of the convergence properties of the particle implementation of
the Probability Hypothesis Density (PHD) filter is presented based on the results derived
for particle filters. One of the main differences between the two algorithms is that in the
particle filter, the total particle mass is 1 whereas in the PHD filter the particle mass gives
the expected number of targets.
2.5 Summary
This chapter has presented basic filtering theory from a Bayesian perspective. The Kalman
filter has been derived using the linear/Gaussian assumptions on the state and measurement
models and Bayes’ rule. The extended Kalman filter and unscented Kalman filter are shown
for situations when the assumptions for the Kalman filter can be relaxed to accommodate
mildly non-linear models. The Gaussian sum filter is then introduced for non-Gaussian
distributions by representing the probability distribution as a mixture of Gaussians. The
sequential Monte Carlo approach to filtering is described with the example of a particle
filter, which approximates the probability distribution with a set of discrete samples. All of
these algorithms were presented in the context of single-target tracking. The PHD filter is
presented in the next chapter as a means of tracking multiple targets.
Chapter 3
The Probability Hypothesis Density Filter
3.1 Introduction
This chapter describes the Probability Hypothesis Density (PHD) Filter. The PHD filter is a
first-moment filter which propagates the first-order moment of a dynamic point process. In
order to obtain a closed-form recursion, a Poisson point process assumption is made after
the prediction and update steps.
In single target tracking problems, the constant gain Kalman filter provides the compu-
tationally fastest solution for approximate filtering which propagates the first-order moment
of the posterior distribution. The PHD filter was proposed to provide an analogous solu-
tion in multiple target tracking problems. The first-order statistical moment of the multiple
target posterior distribution, known as the PHD, is propagated instead of the posterior. The
integral of the PHD over the state space provides an estimate of the number of targets and
the target states can be estimated by determining the peaks of this distribution.
In this chapter, the random set filtering framework is presented as a multiple-target
Bayesian recursion analogous to the single target case given in the previous chapter. The
point process theory required for the derivation is given and the PHD filter is then derived
using standard results from probability theory.
3.2 Random Set Filtering
The multiple target tracking framework based on random-sets was first proposed by Mahler[12]
as a rigorous mathematical model which attempts to unify the problems of detection, clas-
sification and tracking. The approach is a Bayesian model for recursively estimating and
updating a multi-target density function based on measurements received at each time-step.
Multiple-target filtering requires the unobserved signal process $X_{0:t} = \{X_0, \ldots, X_t\}$ to
be estimated based on the σ-algebra generated by the sets of observations up to time $t$,
$Z_t := \sigma(Z_1, \ldots, Z_t)$, i.e. to obtain $\hat{X}_t = \{\hat{x}_{t,1}, \ldots, \hat{x}_{t,\hat{T}_t}\}$, where $\hat{x}_{t,i}$ are the individual target
estimates and $\hat{T}_t$ is the estimate of the number of targets at time $t$. This is done by recursively
calculating the posterior distribution, or filtering distribution, $p_t(\cdot | Z_t)$.
The set of objects tracked at time $t$ is modelled by the point process or Random Finite
Set (RFS)
$$X_t = \left( \bigcup_{x \in X_{t-1}} S_{t|t-1}(x) \right) \cup \left( \bigcup_{x \in X_{t-1}} B_{t|t-1}(x) \right) \cup \Gamma_t, \tag{3.1}$$
where $S_{t|t-1}$ is the RFS of targets that have survived to time $t$ from the multi-target state $X_{t-1}$¹ at time
$t-1$, $B_{t|t-1}$ is the RFS of targets spawned from $X_{t-1}$ and $\Gamma_t$ is the RFS of targets that appear
spontaneously at time $t$. The multi-target measurement at time $t$ is modelled by the RFS
$$Z_t = K_t \cup \left( \bigcup_{x \in X_t} \Theta_t(x) \right), \tag{3.2}$$
where $\Theta_t(X_t)$ is the RFS of measurements from multi-target state $X_t$ and $K_t$ is the RFS of
measurements due to clutter.
¹ Note that Xt represents the RFS and Xt represents its realisation.
The optimal multi-target Bayes filter propagates the multi-target posterior density $p_t(\cdot | Z_t)$
conditioned on the sets of observations up to time $t$, $Z_t$, with the following recursion
$$p_{t|t-1}(X_t | Z_{t-1}) = \int f_{t|t-1}(X_t | X)\, p_{t-1}(X | Z_{t-1})\, \mu_s(dX), \tag{3.3}$$
$$p_t(X_t | Z_t) = \frac{g_t(Z_t | X_t)\, p_{t|t-1}(X_t | Z_{t-1})}{\int g_t(Z_t | X)\, p_{t|t-1}(X | Z_{t-1})\, \mu_s(dX)}, \tag{3.4}$$
where the dynamic model is governed by the transition density $f_{t|t-1}(X_t | X_{t-1})$ and multi-target
likelihood $g_t(Z_t | X_t)$, and $\mu_s$ takes the place of the Lebesgue measure, as described
in [17].
The function $g_t(Z_t | X_t)$ is the joint multi-target likelihood function, or global density,
of observing the set of measurements, $Z$, given the set of target states, $X$, which is the
total probability density of association between measurements in $Z$ and parameters in $X$.
The parameters for this density are the set of observations, $Z = \{z_1, \ldots, z_k\}$, the unknown
set of target states, $X = \{x_1, \ldots, x_t\}$ (where $t$ here denotes the number of targets), the sensor
noise distribution, or observation noise, the
probabilities of detection, $p_D$, and false alarm, $p_{FA}$, clutter models, and the detection profile
of the sensor or field of view (FoV). For example, suppose that the sensor noise density or single
target likelihood function is $g(z|x)$, that there are no false alarms and that the probability of
detection is constant; then the joint multi-target likelihood is given by
$$g(\{z_1, \ldots, z_k\} | \{x_1, \ldots, x_t\}) = p_D^k (1 - p_D)^{t-k} \sum_{1 \le i_1 \ne \ldots \ne i_k \le t} g(z_1 | x_{i_1}) \cdots g(z_k | x_{i_k}). \tag{3.5}$$
The computational complexity of the joint multi-target likelihood grows exponentially with
the number of targets and so becomes numerically intractable [58]. The PHD filter was
derived to provide a sub-optimal strategy for determining the set of target states at each
iteration by using the first-order statistical moment of the multi-target posterior distribution [12].
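The joint multi-target likelihood (3.5) can be evaluated by brute force, summing over every assignment of the k measurements to distinct targets. The sketch below does this for scalar states; the Gaussian single-target likelihood, the target positions and the detection probability are assumptions for illustration.

```python
import itertools
import math

def single_target_likelihood(z, x, var=1.0):
    """Assumed single-target likelihood g(z | x) = N(z; x, var)."""
    return math.exp(-0.5 * (z - x) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def multi_target_likelihood(measurements, targets, p_d):
    """Equation (3.5): no false alarms, constant probability of detection p_d."""
    k, t = len(measurements), len(targets)
    if k > t:
        return 0.0  # without false alarms there cannot be more measurements than targets
    total = 0.0
    # Each ordered choice of k distinct target indices is one association hypothesis.
    for assignment in itertools.permutations(range(t), k):
        term = 1.0
        for z, idx in zip(measurements, assignment):
            term *= single_target_likelihood(z, targets[idx])
        total += term
    return p_d ** k * (1.0 - p_d) ** (t - k) * total

lik = multi_target_likelihood([0.1, 4.9], [0.0, 5.0], 0.9)
lik_swapped = multi_target_likelihood([4.9, 0.1], [0.0, 5.0], 0.9)
missed = multi_target_likelihood([], [0.0, 5.0], 0.9)
n_terms = sum(1 for _ in itertools.permutations(range(6), 6))
```

The likelihood does not depend on the order in which the measurements are listed, as expected for a set-valued observation. The number of terms is t!/(t-k)!, which is already 720 for six targets and six measurements, and this growth is what makes the exact recursion impractical.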
3.3 Point Process Theory
It is a requirement for multiple target tracking problems that the number of targets and their
states are estimated. The set of target states can be modelled by a Point Process, where the
state of the population is defined as an unordered set of points $X = \{x_1, \ldots, x_N\}$. The points
are located in the state space $\chi$, a complete separable metric space, where the number of points and
their locations are random.
To formulate the multiple target tracking model as a point process, we need the following
assumptions. A distribution $\{p_n, n \in \mathbb{N}\}$ is given which determines the total number
of targets and satisfies $\sum_{n \in \mathbb{N}} p_n = 1$. For each $n \ge 1$, a probability distribution $d_n$ is given
which determines the joint distribution of the $n$ targets. In target tracking, the realisations
of the point processes involved are unordered sets. Thus we require that the $d_n$ are symmetric,
that is, all permutations are given equal weight. If the distributions $d_n$ are not symmetric,
they can be symmetrised,
$$d^{sym}_n(A_1 \times \ldots \times A_n) = \frac{1}{n!} \sum_{perm} d_n(A_{i_1} \times \ldots \times A_{i_n}), \tag{3.6}$$
where $(A_1, \ldots, A_n)$ is any partition of the state space and $\sum_{perm}$ is taken over all permutations
$(i_1, \ldots, i_n)$ of the integers $(1, \ldots, n)$. $d^{sym}_n$ now has the desired symmetric property [15].
3.3.1 Janossy Measures
It is convenient to introduce the Janossy measure which is symmetric by definition:
$$J_n(A_1 \times \ldots \times A_n) = n!\, p_n\, d^{sym}_n(A_1 \times \ldots \times A_n). \tag{3.7}$$
Let $j_n(x_1, \ldots, x_n)$ denote the density of $J_n$, then
$$\sum_{n=0}^{\infty} \frac{1}{n!} \int j_n(x_1, \ldots, x_n)\, dx_1 \ldots dx_n = 1. \tag{3.8}$$
These densities directly relate to the multi-target posterior density of a random set $\Theta$ by
$$f_{t|t}(\{x_1, \ldots, x_n\} | Z_t) = j_n(x_1, \ldots, x_n), \tag{3.9}$$
where $\{x_1, \ldots, x_n\}$ is the random-set of target states and $Z_t$ is the random-set of observations [12].
The random-set $\{x_1, \ldots, x_n\}$ and vector $(x_1, \ldots, x_n)$ can be used interchangeably
since the Janossy density is symmetric.
46
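The normalisation (3.8) can be checked numerically in a simple special case. The sketch below assumes, purely for illustration, a Poisson point process with expected number of points `lam` and a spatial probability density s, for which $j_n(x_1, \ldots, x_n) = e^{-lam} \prod_i lam\, s(x_i)$, so that the integral of $j_n$ is $e^{-lam}\, lam^n$. The function names and the truncation level `n_max` are hypothetical and not part of the original text.

```python
import math

def janossy_mass(n, lam):
    """Integral of the n-th Janossy density of a Poisson point process.

    Assumes j_n(x_1..x_n) = exp(-lam) * prod(lam * s(x_i)) with s a
    probability density, so the spatial integral is exp(-lam) * lam**n.
    """
    return math.exp(-lam) * lam ** n

def janossy_normalisation(lam, n_max=60):
    """Truncated evaluation of sum_n (1/n!) * integral(j_n), cf. (3.8)."""
    return sum(janossy_mass(n, lam) / math.factorial(n) for n in range(n_max + 1))

# With three expected points the truncated sum is numerically one.
total = janossy_normalisation(lam=3.0)
```

Each term `janossy_mass(n, lam) / math.factorial(n)` is the probability $p_n$ of observing exactly n points, which is why the sum equals one.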
3.3.2 Probability Generating Functionals
Let ξ be any bounded complex-valued Borel measurable function defined on complete separable
metric space χ. Then, for any realisation $(x_1, \ldots, x_N)$ of a finite point process, the
product $\prod_{i=1}^{N} \xi(x_i)$ is well defined. For convenience, we define the notation
\[ \int p_X(dX)\, \Pi_X[\xi] := \int p_X(dx_1 \ldots dx_N) \prod_{i=1}^{N} \xi(x_i). \tag{3.10} \]
The probability generating functional (PGFL), $G_X$, of a point process X is well defined, and
is given by
\[ G_X[\xi] := E\left( \prod_{i=1}^{N} \xi(x_i) \right) = \int p_X(dX)\, \Pi_X[\xi], \tag{3.11} \]
where $p_X$ is the density of probability measure $P_X$ defined on point process X. The PGFL
characterises the point process entirely. For instance, one can recover the Janossy measures
by expanding $G_X[\xi]$ as
\[ G_X[\xi] = \sum_{n=0}^{\infty} \frac{1}{n!} J^{(n)}_X[\xi, \ldots, \xi], \tag{3.12} \]
where the functional $J^{(n)}_X$ is defined by
\[ J^{(n)}_X[\xi, \ldots, \xi] := \int_{\chi_1 \ldots \chi_n} J_n(dx_1 \times \ldots \times dx_n) \prod_{i=1}^{n} \xi(x_i). \tag{3.13} \]
Let M be the linear vector space of all bounded measurable complex-valued functions
defined on χ. Let ξ and η be fixed elements of M and let ‖η‖ < 1. If r is
the largest real number such that $\eta + \lambda\xi \in S_g = \{\phi : \|\phi\| \leq r\}$ for |λ| < r, then G[η + λξ]
can be written as
\[ G[\eta + \lambda\xi] = \sum_{k=0}^{\infty} \lambda^k \sum_{n=k}^{\infty} \int_{\chi^{(n)}} \prod_{i=1}^{k} \xi(x_i) \prod_{i=k+1}^{n} \eta(x_i)\, p^{(n)}_X(dx_1 \times \ldots \times dx_n). \tag{3.14} \]
The nth-order variation of generating functional G, or functional derivative, is defined to be
\[ \delta^n_{\xi_1, \ldots, \xi_n} G[\eta] := \left[ \frac{\partial^n}{\partial\lambda_1 \ldots \partial\lambda_n} G\left[ \eta + \sum_{i=1}^{n} \lambda_i \xi_i \right] \right] \Bigg|_{\lambda_1 = \ldots = \lambda_n = 0}, \tag{3.15} \]
where $\sup_x |\eta(x)| < 1$ (see [59]). The nth order Janossy measure can be determined from
$G_X$ by evaluating $\delta^n_{\xi_1, \ldots, \xi_n} G[\eta]$ at η = 0,
\[ (d^{(n)} G_X)_0[\xi_1, \ldots, \xi_n] := \lim_{\eta \to 0} \delta^n_{\xi_1, \ldots, \xi_n} G[\eta] = J^{(n)}_X[\xi_1, \ldots, \xi_n]. \tag{3.16} \]
The intensity measure $V_X$ can be obtained by differentiating $G_X$ at η = 1,
\[ (dG_X)_1[\xi] := \lim_{\eta \to 1} \delta_\xi G[\eta] = V_X \cdot \xi := \int \xi(x)\, V_X(dx). \tag{3.17} \]
$V_X$ is a measure in the conventional sense, i.e., non-negative and countably additive. If
this measure admits a density, then it defines the Probability Hypothesis Density. $V_X$ can
be defined in a few different ways. For instance, it can also be defined as the first-order
moment or expectation measure via Random Counting Measures. Let $x = \{x_1, \ldots, x_n\}$ be a
collection of points in X. We define the counting measure N(·|x) to be
\[ N(A|x) = \sum_{i=1}^{n} I_A(x_i), \quad A \subset X, \tag{3.18} \]
where $I_A(x_i) = 1$ if $x_i \in A$ and 0 otherwise. A point process X has an equivalent representation
in terms of the counting measure it induces. To see this, note that the product
$\prod_{i=1}^{N} \xi(x_i)$ can be expressed as
\[ \prod_{i=1}^{N} \xi(x_i) = \exp \int \log \xi(y)\, N(dy|x). \tag{3.19} \]
Let $N_1$ and $N_2$ be the counting measure representation of two independent point processes.
We can define a third point process to be the superposition of these two [60],
\[ N(A) = N_1(A) + N_2(A), \quad A \subset X. \tag{3.20} \]
It follows that
\[ G_N[\xi] = G_{N_1}[\xi]\, G_{N_2}[\xi]. \tag{3.21} \]
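The counting-measure identities can be illustrated on a single realisation. The following minimal sketch assumes a one-dimensional state space, a strictly positive real test function ξ (so that the logarithm in (3.19) is defined), and hypothetical helper names; it checks (3.18) to (3.20) for fixed point sets and is not a proof of (3.21), which concerns expectations over independent processes.

```python
import math

def counting_measure(points, in_region):
    """N(A|x) of (3.18): number of points of the realisation lying in A."""
    return sum(1 for x in points if in_region(x))

def product_form(points, xi):
    """Left-hand side of (3.19): product of xi over the points."""
    return math.prod(xi(x) for x in points)

def exponential_form(points, xi):
    """Right-hand side of (3.19).

    Integrating log(xi) against a counting measure reduces to a sum
    of log(xi) over the points of the realisation.
    """
    return math.exp(sum(math.log(xi(x)) for x in points))

x1 = [0.2, 1.5, 3.1]              # realisation of the first process
x2 = [0.7, 2.4]                   # realisation of the second process
superposed = x1 + x2              # superposition, cf. (3.20)
in_A = lambda x: 0.0 <= x < 2.0   # indicator of the region A = [0, 2)
xi = lambda x: 0.5 + 0.1 * x      # positive bounded test function

n_total = counting_measure(superposed, in_A)
```

For a fixed pair of realisations the product over the superposed set factorises into the two separate products; taking expectations over independent processes then gives the factorisation of the generating functionals.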
A joint probability generating functional (JPGFL), $G_{X,Y}$, of point processes X and Y can be
defined by
\[ G_{X,Y}[g,h] := \int\!\!\int p_{X,Y}(dx, dy)\, \Pi_X[g]\, \Pi_Y[h], \tag{3.22} \]
and has the following properties,
\[ (d^n G_{X,Y}[1, \cdot])_0[\eta_1, \ldots, \eta_n] = J^{(n)}_Y[\eta_1, \ldots, \eta_n], \tag{3.23} \]
\[ (d^n G_{X,Y}[g, \cdot])_{h=0}[\eta_1, \ldots, \eta_n] = p_X \cdot (J^{(n)}_{Y|X}[\eta_1, \ldots, \eta_n | x]\, \Pi_{(\cdot)}[g]), \tag{3.24} \]
where $\eta_1, \ldots, \eta_n$ are complex-valued functions which are defined on the complete separable
metric space on which elements of point process Y are located. The second property is valid
provided that differentiation and expectation can be interchanged. This can be verified using
the Lebesgue Dominated Convergence Theorem [42]. If the Janossy measure $J_{Y|X}$ admits a
density $j_{Y|X}$ then it can be replaced with this in the second property above. We can now use
these results to define the conditional probability generating functional $G_{X|y}[\eta] = G_{X|Y}[\eta|y]$
using Bayes rule,
\[ P_{X|Y}(dx|y) = \frac{P_X(dx)\, P_{Y|X}(y|x)}{\int P_X(dx)\, P_{Y|X}(y|x)}. \tag{3.25} \]
The conditional PGFL of X given Y = y is defined to be
\begin{align}
G_{X|Y}[\eta|y] &:= P_{X|Y}(\cdot|y) \cdot \Pi_{(\cdot)}[\eta] \tag{3.26} \\
&= \frac{P_X \cdot p_{Y|X}(y|\cdot)\, \Pi_{(\cdot)}[\eta]}{P_X \cdot p_{Y|X}(y|\cdot)} \tag{3.27} \\
&= \frac{(d^n G_{X,Y}[g, \cdot])_0[\delta_{y_1}, \ldots, \delta_{y_n}]}{(d^n G_{X,Y}[1, \cdot])_0[\delta_{y_1}, \ldots, \delta_{y_n}]}, \tag{3.28}
\end{align}
where $\delta_y$ represents the Dirac delta function centred at y.
The theory presented in this section will be used in the next section to derive the PHD
filter.
3.4 PHD Filter Derivation
The PHD filter can be considered as the first-order moment of a dynamic point process
with Markov shifts [42]. A Poisson assumption is made in order to derive a closed form
solution. This section provides a derivation of the PHD recursion from point process theory
described in section 3.3 using standard probability theory, based on the derivations by Vo
and Singh [42]. The probability generating functional of the prediction distribution, ft|t−1,
is shown to be a transformation of the PGFL for the multi-target posterior at time t, ft|t .
The formula for the prediction PHD is then found in terms of posterior PHD by taking the
functional derivative of the probability generating functionals. The update equation for the
PHD filter is found by assuming that the prediction density ft|t−1 is approximately Poisson,
and finding the relationship with the posterior in terms of its joint probability generating
functional.
3.4.1 The PHD Prediction Equation
The dynamics of the system evolve according to the probability that a given target state
x ∈ Xt−1 will survive, pS,t , and the transition kernel ft|t−1.
The RFS Bt|t−1(x) models the set of target states spawned from target x ∈ Xt−1 and
Γt models the set of new target states which appear spontaneously. The random finite set
Xt , which models the multi-target state at time t, is the union of the targets which have
survived from t − 1, those which have been spawned by existing targets and those which
appear spontaneously at time t.
THEOREM 4 Suppose that the RFS of targets at time t is given by
\[ X_t = \left( \bigcup_{x \in X_{t-1}} S_{t|t-1}(x) \right) \cup \left( \bigcup_{x \in X_{t-1}} B_{t|t-1}(x) \right) \cup \Gamma_t. \tag{3.29} \]
If the intensity measures of the multi-target probability distributions admit densities, then
the prediction equation for the PHD is given by
\[ D_{t|t-1}(x|Z_{t-1}) = \gamma_t(x) + \int \varphi_{t|t-1}(x, x_{t-1})\, D_{t-1|t-1}(x_{t-1}|Z_{t-1})\, dx_{t-1}, \tag{3.30} \]
where $D_{t-1|t-1}$ is the density of intensity measure $V_{t-1|t-1}$ of the multi-target posterior at
time t − 1, and $D_{t|t-1}$ is the predicted density to time t. The transition kernel $\varphi_{t|t-1}$ is given
by
\[ \varphi_{t|t-1}(x, \xi) = p_{S,t}(\xi)\, f_{t|t-1}(x|\xi) + \beta_{t|t-1}(x|\xi), \tag{3.31} \]
where $\gamma_t$ is the PHD for spontaneous birth of a new target at time t, $\beta_{t|t-1}$ is the PHD for spawned
target birth of a new target at time t, $p_{S,t}$ is the probability of target survival and $f_{t|t-1}$ is the
single target motion distribution.
Proof
Let the intensity measure of the multi-target prediction density $p_{t|t-1}(\cdot|Z_{t-1})$ be denoted
$V_{t|t-1}(\cdot|Z_{t-1})$, and let the intensity measure of the multi-target posterior density $p_t(\cdot|Z_t)$
be denoted $V_t(\cdot|Z_t)$. Furthermore, let these intensity measures admit densities $D_{t|t-1}$ and $D_t$
respectively, which are the PHDs. Let $V_{S_{t|t-1}}$, $V_{B_{t|t-1}}$ and $V_{\Gamma_t}$ denote the intensity measures of
$S_{t|t-1}$, $B_{t|t-1}$ and $\Gamma_t$ respectively. The random sets $S_{t|t-1}$, $B_{t|t-1}$ and $\Gamma_t$ are independent, so
by superposition (equation 3.21) the union
\[ \left( \bigcup_{x \in X_{t-1}} S_{t|t-1}(x) \right) \cup \left( \bigcup_{x \in X_{t-1}} B_{t|t-1}(x) \right) \cup \Gamma_t \]
has probability generating functional $G_{S_{t|t-1}} G_{B_{t|t-1}} G_{\Gamma_t}$ and intensity measure
$V_{S_{t|t-1}} + V_{B_{t|t-1}} + V_{\Gamma_t}$. Using
the fact that $S_{t|t-1}$ and $B_{t|t-1}$ are formed from Markov shifts and Proposition 8.2IV [15],
\[ V_{t|t-1}(A) = \langle V_{t-1}, V_{S_{t|t-1}} + V_{B_{t|t-1}} \rangle + V_{\Gamma_t}, \quad \forall A \in \mathcal{B}(Y), \tag{3.32} \]
which gives the desired PHD prediction by taking the densities.
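As an illustration of the prediction equation (3.30), the sketch below evaluates it by a Riemann sum on a one-dimensional grid. Everything specific here is an assumption made for the example rather than part of the derivation: a constant survival probability, a Gaussian random-walk motion model, no spawning (β = 0, so the kernel in (3.31) reduces to the survival term), and a Gaussian-shaped birth intensity.

```python
import math

def gaussian(x, mean, std):
    """Density of a 1-D Gaussian, used as a stand-in motion and birth model."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def predict_phd(grid, dx, phd_prev, p_survive, motion_std, birth):
    """Riemann-sum approximation of the PHD prediction (3.30) on a grid.

    Assumes phi(x, x_prev) = p_survive * f(x | x_prev) with Gaussian f,
    i.e. the kernel (3.31) without the spawning term.
    """
    predicted = []
    for x in grid:
        integral = sum(
            p_survive * gaussian(x, x_prev, motion_std) * d_prev
            for x_prev, d_prev in zip(grid, phd_prev)
        ) * dx
        predicted.append(birth(x) + integral)
    return predicted

dx = 0.1
grid = [i * dx for i in range(-100, 101)]               # state space [-10, 10]
phd_prev = [2.0 * gaussian(x, 0.0, 1.0) for x in grid]  # two expected targets
birth = lambda x: 0.2 * gaussian(x, 5.0, 1.0)           # 0.2 expected new targets
phd_pred = predict_phd(grid, dx, phd_prev, 0.9, 0.5, birth)
expected_targets = sum(phd_pred) * dx
```

Integrating the predicted PHD gives the expected number of targets, here 0.9 × 2 + 0.2 = 2, which is a quick sanity check on any implementation of the prediction step.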
3.4.2 The PHD Measurement Update Equation
The set of noisy observations at time t, which may include false alarms and have missed
detections, is modelled by a random finite set, or point process, $Z_t$. A measurement, detected
with probability $p_{D,t}$, is distributed according to the conditional probability $L_t$,
\[ L_t(x, B) = P(z_t \in B \,|\, x_t = x), \tag{3.33} \]
which is assumed to admit a density known as the likelihood function, $l_t$, determined from
the Radon-Nikodym derivative,
\[ l_t(x, \cdot) = \frac{dL_t(x, \cdot)}{d\lambda_Z}, \tag{3.34} \]
on observation space Z. The random finite set representing the set of measurements is
given by the union of $\Theta_t(x)$, the RFS which is either empty or is distributed according to
$L_t(x, \cdot)$, and $K_t$, the set of false alarms known as clutter points. The point process $Z_t$ can be
described by the conditional probability measure
\[ P(Z_t \in V \,|\, X_t = x) \tag{3.35} \]
for all sets V on the observation space. The likelihood of observation z given target state
x is written $g_t(z|x) := l_t(x, z)$ (note that there will be no confusion with the multi-target
likelihood since the parameters will relate to single targets).
THEOREM 5 Suppose that the set of measurements at time t is given by
\[ Z_t = K_t \cup \left( \bigcup_{x \in X_t} \Theta_t(x) \right), \tag{3.36} \]
where $K_t$ and $X_t$ are Poisson point processes. Furthermore, assume that the prediction
distribution is Poisson, and that the intensity measures admit densities. Then the PHD
Measurement Update Equation is given by
\[ D_{t|t}(x|Z_t) = \left[ (1 - p_{D,t}) + \sum_{z \in Z_t} \frac{\psi_{t,z}(x)}{\kappa_t(z) + \langle D_{t|t-1}, \psi_{t,z} \rangle} \right] D_{t|t-1}(x|Z_{t-1}), \tag{3.37} \]
where
\[ \kappa_t(z) = \lambda_t c_t(z) \tag{3.38} \]
is the clutter model, with $\lambda_t$ being the Poisson parameter specifying the expected number
of false alarms and $c_t$ the probability distribution over the observation space of clutter
points, and
\[ \psi_{t,z}(x) = p_{D,t}(x)\, g(z|x), \tag{3.39} \]
where g is the single target likelihood function and $p_{D,t}$ is the probability of detection.
Proof
Using the independence of point processes $K_t$ and $\Theta_t(x)$, by superposition, the conditional
probability generating functional $G_{Z_t|X_t}[h|x]$ is given by
\[ G_{Z_t|X_t}[h|x] = G_{K_t}[h] \prod_{x \in X_t} G_{\Theta_t(x)}[h], \tag{3.40} \]
and since x is distributed according to $L_t$ with probability $p_{D,t}$ that it is detected, PGFL
$G_{\Theta_t(x)}[h]$ is given by
\[ G_{\Theta_t(x)}[h] = (1 - p_{D,t}) + p_{D,t} H[h](x), \tag{3.41} \]
where H[h] is the functional defined by
\[ H[h](x) := \int h(z)\, g(z|x)\, dz, \tag{3.42} \]
so that equation (3.40) is equal to
\[ G_{K_t}[h] \prod_{x \in X_t} \left( (1 - p_{D,t}) + p_{D,t} H[h](x) \right). \tag{3.43} \]
Hence, using 3.24, the joint probability generating functional $G_{X_t,Z_t}[\xi, h]$ is given by
\[ G_{X_t,Z_t}[\xi, h] = P_{t|t-1} \cdot (G_{Z_t|X_t}[h|\cdot]\, \Pi_{(\cdot)}[\xi]) = G_{K_t}[h]\, G_{X_t}[\xi(1 - p_{D,t} + p_{D,t} H[h])], \tag{3.44} \]
and since $X_t$ and $K_t$ are Poisson point processes,
\[ = \exp\left( V_{K_t} \cdot (h - 1) + V_{X_t} \cdot (\xi(1 - p_{D,t} + p_{D,t} H[h])) \right). \tag{3.45} \]
Using equation 3.28, from Bayes rule,
\[ G_{X_t|Z_t}[y] = \frac{(d^n G_{X_t,Z_t}[Z_t, \cdot])_0[\delta_{z_1}, \ldots, \delta_{z_n}]}{(d^n G_{X_t,Z_t}[1, \cdot])_0[\delta_{z_1}, \ldots, \delta_{z_n}]}, \tag{3.46} \]
so that the conditional measure $V_{X_t|Z_t}[\eta|y]$ is
\[ V_{X_t|Z_t}[\eta|y] = V_{X_t} \cdot \eta = \frac{(d(d^n G_{X_t,Z_t}[y, \cdot])_0[\delta_{z_1}, \ldots, \delta_{z_n}])_{\xi=1}[\eta]}{(d^n G_{X_t,Z_t}[1, \cdot])_0[\delta_{z_1}, \ldots, \delta_{z_n}]}. \tag{3.47} \]
For ease of notation, define the functional $F_{y,h}[\xi]$ as
\[ F_{y,h}[\xi] := (d^n G_{X_t,Z_t}[y, \cdot])_h[\xi, \ldots, \xi], \tag{3.48} \]
so that the conditional intensity measure is
\[ V_{X_t|Z_t}[\eta|y] = \frac{(dF_{y,0})_1[\eta]}{F_{y,0}[1]}. \tag{3.49} \]
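Evaluating $F_{y,h}[\xi]$, by taking the functional derivatives, gives
\[ F_{y,h}[\xi] = G_{X_t,Z_t}[\xi, h] \prod_{z \in Z_t} (V_{X_t} \cdot \xi\psi_{t,z} + v_{K_t}(z)), \tag{3.50} \]
where $\psi_{t,z}(x) = p_{D,t}\, g(z|x)$. Taking the derivative of this at h = 0 and ξ = 1 using the chain
rule, we get
\begin{align}
&(dF_{y,0})_1[\eta] = \tag{3.51} \\
&V_{X_t} \cdot \eta(1 - p_{D,t})\, F_{y,0}[1] + G_{X_t,Z_t}[1, 0] \prod_{z \in Z_t} (V_{X_t} \cdot \xi\psi_{t,z} + v_{K_t}(z)) \sum_{z \in Z_t} \frac{V_{X_t} \cdot \eta\psi_{t,z}}{V_{X_t} \cdot \psi_{t,z} + v_{K_t}(z)}, \tag{3.52}
\end{align}
which by 3.50,
\[ = V_{X_t} \cdot \eta(1 - p_{D,t})\, F_{y,0}[1] + F_{y,0}[1] \sum_{z \in Z_t} \frac{V_{X_t} \cdot \eta\psi_{t,z}}{V_{X_t} \cdot \psi_{t,z} + v_{K_t}(z)}, \tag{3.53} \]
and taking out a factor of $F_{y,0}[1]$,
\[ = F_{y,0}[1]\, V_{X_t} \cdot \eta \left( (1 - p_{D,t}) + \sum_{z \in Z_t} \frac{\psi_{t,z}}{V_{X_t} \cdot \psi_{t,z} + v_{K_t}(z)} \right). \tag{3.54} \]
Then, by 3.47, the conditional measure $V_{X_t|Z_t}[\eta|z]$ is
\[ V_{X_t|Z_t}[\eta|z] = V_{X_t} \cdot \eta \left( (1 - p_{D,t}) + \sum_{z \in Z_t} \frac{\psi_{t,z}}{V_{X_t} \cdot \psi_{t,z} + v_{K_t}(z)} \right). \tag{3.55} \]
Since $V_{X_t} = V_{t|t-1}(\cdot|Z_{t-1})$ and $V_{X_t|Z_t} = V_t$, we have
\[ V_t[\eta|z] = V_{t|t-1} \cdot \left( (1 - p_{D,t}) + \sum_{z \in Z_t} \frac{\psi_{t,z}}{V_{t|t-1} \cdot \psi_{t,z} + v_{K_t}(z)} \right) \eta, \tag{3.56} \]
which, by taking the densities, gives the PHD measurement equation.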
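The measurement update (3.37) can be sketched on a grid in the same way as the prediction. The model choices below are assumptions for illustration only: a constant detection probability, a Gaussian single-target likelihood, and clutter that is uniform over the interval [-10, 10] with 0.5 expected false alarms. The function and variable names are hypothetical.

```python
import math

def gaussian(x, mean, std):
    """Density of a 1-D Gaussian, used as a stand-in likelihood g(z|x)."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def update_phd(grid, dx, phd_pred, measurements, p_detect, meas_std, clutter):
    """Riemann-sum approximation of the PHD update (3.37) on a grid."""
    # kappa_t(z) + <D_{t|t-1}, psi_{t,z}> for each z, with psi = p_D * g(z|x)
    denominators = []
    for z in measurements:
        inner = sum(
            p_detect * gaussian(z, x, meas_std) * d for x, d in zip(grid, phd_pred)
        ) * dx
        denominators.append(clutter(z) + inner)
    updated = []
    for x, d in zip(grid, phd_pred):
        gain = 1.0 - p_detect                      # missed-detection term
        for z, denom in zip(measurements, denominators):
            gain += p_detect * gaussian(z, x, meas_std) / denom
        updated.append(gain * d)
    return updated

dx = 0.05
grid = [i * dx for i in range(-200, 201)]          # state space [-10, 10]
phd_pred = [gaussian(x, 0.0, 1.0) for x in grid]   # one expected target
kappa = lambda z: 0.5 / 20.0                       # lambda_t * c_t(z), uniform clutter
phd_post = update_phd(grid, dx, phd_pred, [0.1], 0.9, 0.5, kappa)
expected_targets = sum(phd_post) * dx
```

Each measurement contributes at most one expected target to the integral of the updated PHD, and its contribution shrinks as the clutter intensity at that measurement grows.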
3.5 Summary
The framework for multiple target tracking used in this thesis has been described in this
chapter. This has been presented in the context of point processes, which enables results
from standard probability to be invoked. The PHD filter has been derived from point process
theory as the first-order moment of the optimal multiple-target Bayes filter with a dynamic
point process, from which a set-valued estimate can be determined at each time-step based
on a set-valued observation. The relationships between point process theory and random
counting measures have been shown.
In the next two chapters, two different implementations of the PHD filter are given.
The first of which is the Particle PHD filter, which extends the particle filter from chapter
2 to a multiple-target environment using a sequential Monte Carlo algorithm. The asymp-
totic convergence properties of this algorithm are established in the next chapter and error
bounds are determined for the mean square errors. The second implementation is the Gaus-
sian Mixture PHD filter described in chapter 5, which is similar in style to the Gaussian
Sum filter from chapter 1, where in this case the PHD is represented by a finite weighted
mixture of Gaussians in which the means and covariances are predicted and updated with
the Kalman filter equations and the weights are updated according to the PHD filter equations.
The uniform convergence properties of this approximation to the PHD are derived.
Chapter 4
The Particle PHD Filter
4.1 Introduction
Sequential Monte Carlo approximations of the optimal multiple-target filter are computa-
tionally expensive. A practical suboptimal alternative to the optimal filter is the Probability
Hypothesis Density (PHD) filter, which propagates the first-order statistical moment in-
stead of the full multiple-target posterior. The integral of the PHD in any region of the state
space is the expected number of targets in that region [12].
Particle filter methods for the PHD-filter have been devised by Vo [18] and Zajic [19].
Practical applications of the filter include tracking vehicles in different terrains [61], track-
ing targets in passive radar located on ellipses [62], and tracking a variable number of targets
in forward-scan sonar [24]. The Sequential Monte Carlo implementation, or Particle PHD
Filter algorithm, is given in the next section.
It was noted in chapter 2 that one of the crucial considerations for particle filter algo-
rithms is convergence as the number of samples from the posterior distribution increases. It
is required that the empirical distribution represented by the particles tends to the true distribution
and that the errors in the approximation can be bounded. Convergence studies by
Crisan [54], [55], [56] amongst others have demonstrated convergence of the mean square
errors and weak convergence of the empirical measures to the true measures at each step in
the algorithm. This chapter presents convergence results for the Particle PHD filter. Bounds
are established for the mean square error and weak convergence of the empirical particle
measure to the true PHD measure is shown.
4.2 The Particle PHD Filter Algorithm
The implementation of the PHD Particle filter is an adaptation of the method described by
Vo et al. [17], based on a Sequential Monte Carlo algorithm for multi-target tracking. The
algorithm can be informally described by the following stages. In the initialisation stage,
particles are distributed across the field of view according to the prior. The particles are
propagated in the prediction stage using the dynamic model with added process noise and,
in addition, particles are added to allow for incoming targets. When the measurements
are received, weights are calculated for the particles based on their likelihoods, which are
determined by the statistical distance of the particles to the set of observations. The sum
of the weights gives the estimated number of targets. Particles are then resampled from the
weighted particle set to give an unweighted representation of the PHD.
The Sequential Monte Carlo implementation of the PHD Filter is given here. The algo-
rithm is initialised in Step 0 and then iterates through Steps 1 to 3.
Step 0: Initialisation at t=0
The filter is initialised with $N_0$ particles drawn from a prior distribution. The number of
particles is adapted at each stage so that it is proportional to the number of targets. Let N
be the number of particles per target. The mass assigned to each particle is $T_0/N_0$, where
$T_0$ is the expected initial number of targets, which will be updated after an iteration of the
algorithm.
• $\forall i = 1, \ldots, N_0$, sample $x^{(i)}_0$ from $D_{0|0}$ and set t = 0.
Let $D^{N_0}_{0|0}$ be the measure:
\[ D^{N_0}_{0|0}(dx_t) := \frac{T_0}{N_0} \sum_{i=1}^{N_0} \delta_{x^{(i)}_t}(dx_t), \tag{4.1} \]
where $\delta_{x^{(i)}_t}$ is the Dirac delta function centred at $x^{(i)}_t$.
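Step 0 can be sketched as follows, assuming a one-dimensional state and a caller-supplied sampler for the normalised prior. The helper names `initialise_particles` and `inner_product` are hypothetical; the second evaluates the particle measure (4.1) against a test function.

```python
import random

def initialise_particles(sample_prior, n_particles, expected_targets, rng):
    """Step 0: draw N_0 particles and give each the mass T_0 / N_0."""
    particles = [sample_prior(rng) for _ in range(n_particles)]
    weights = [expected_targets / n_particles] * n_particles
    return particles, weights

def inner_product(particles, weights, phi):
    """Integral of phi against the weighted particle measure, cf. (4.1)."""
    return sum(w * phi(x) for x, w in zip(particles, weights))

rng = random.Random(0)
# Assumed prior for the example: uniform over a field of view [0, 100].
particles, weights = initialise_particles(lambda r: r.uniform(0.0, 100.0), 500, 2.0, rng)
total_mass = inner_product(particles, weights, lambda x: 1.0)
```

Evaluating the measure on the constant function 1 returns the total mass $T_0$, the expected initial number of targets.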
Step 1: Prediction Step, for t ≥ 0
In the prediction step, samples are obtained by two importance sampling proposal densities,
$q_t$ and $p_t$:
• $\forall i = 1, \ldots, N_{t-1}$, sample $x^{(i)}_t$ from a proposal density $q_t(\cdot|x^{(i)}_{t-1}, Z_t)$, and evaluate the
predicted weights $\omega^{(i)}_{t|t-1}$:
\[ \omega^{(i)}_{t|t-1} = \frac{\varphi_{t|t-1}(x^{(i)}_t, x^{(i)}_{t-1})}{q_t(x^{(i)}_t | x^{(i)}_{t-1}, Z_t)}\, \omega^{(i)}_{t-1}. \tag{4.2} \]
M new-born particles are also introduced from the spontaneous birth model to detect new
targets entering the state space.
• $\forall i = N_{t-1} + 1, \ldots, N_{t-1} + M$, sample $x^{(i)}_t$ from another proposal density $p_t(\cdot|Z_t)$, and
compute the weights of new born particles $\omega^{(i)}_{t|t-1}$:
\[ \omega^{(i)}_{t|t-1} = \frac{1}{M} \frac{\gamma_t(x^{(i)}_t)}{p_t(x^{(i)}_t | Z_t)}. \tag{4.3} \]
Let $D^{N_{t-1}}_{t|t-1}$ and $D^{N_{t-1},M}_{t|t-1}$ be the measures:
\[ D^{N_{t-1}}_{t|t-1}(dx_t) := \sum_{i=1}^{N_{t-1}} \omega^{(i)}_{t|t-1}\, \delta_{x^{(i)}_t}(dx_t), \tag{4.4} \]
\[ D^{N_{t-1},M}_{t|t-1}(dx_t) := \sum_{i=1}^{N_{t-1}+M} \omega^{(i)}_{t|t-1}\, \delta_{x^{(i)}_t}(dx_t). \tag{4.5} \]
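A minimal sketch of the prediction step under simplifying assumptions that are not part of the algorithm as stated: the proposal $q_t$ is taken to be the motion model itself, with no spawning and constant survival probability, so the ratio in (4.2) reduces to the survival probability; and the birth proposal $p_t$ is taken to be the birth intensity normalised by its total mass, so each new-born weight in (4.3) is that mass divided by M. The Gaussian random-walk motion and all names are hypothetical.

```python
import random

def predict_particles(particles, weights, p_survive, motion_std,
                      n_birth, birth_mass, sample_birth, rng):
    """One prediction step in the spirit of (4.2)-(4.5).

    Assumes q_t equals the motion model (so phi / q = p_survive) and
    p_t = gamma_t / birth_mass (so gamma_t / p_t = birth_mass).
    """
    # Persistent particles: propagate through the motion model, cf. (4.2).
    moved = [rng.gauss(x, motion_std) for x in particles]
    moved_weights = [p_survive * w for w in weights]
    # New-born particles from the spontaneous birth model, cf. (4.3).
    born = [sample_birth(rng) for _ in range(n_birth)]
    born_weights = [birth_mass / n_birth] * n_birth
    return moved + born, moved_weights + born_weights

rng = random.Random(0)
particles, weights = predict_particles(
    [0.0, 1.0], [1.0, 1.0], p_survive=0.9, motion_std=0.5,
    n_birth=4, birth_mass=0.2, sample_birth=lambda r: r.gauss(5.0, 1.0), rng=rng)
predicted_mass = sum(weights)
```

The predicted mass is the surviving mass plus the birth mass, matching the integral of (3.30) when spawning is absent.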
Step 2: Update Step, for t ≥ 0
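After the new measurements are obtained, the weights are recalculated using the likelihood
function g(·|·) to update the distribution based on new information:
• Let $R_t = N_{t-1} + M$. $\forall z \in Z_t$, compute:
\[ \langle \omega_{t|t-1}, \psi_{t,z} \rangle = \sum_{i=1}^{R_t} \psi_{t,z}(x^{(i)}_t)\, \omega^{(i)}_{t|t-1}. \tag{4.6} \]
• $\forall i = 1, \ldots, R_t$, update weights:
\[ \omega^{(i)}_t = \left[ (1 - p_D) + \sum_{z \in Z_t} \frac{\psi_{t,z}(x^{(i)}_t)}{\kappa_t(z) + \langle \omega_{t|t-1}, \psi_{t,z} \rangle} \right] \omega^{(i)}_{t|t-1}. \tag{4.7} \]
Let $D^{R_t}_{t|t}$ be the measure:
\[ D^{R_t}_{t|t}(dx_t) := \sum_{i=1}^{R_t} \omega^{(i)}_t\, \delta_{x^{(i)}_t}(dx_t). \tag{4.8} \]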
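The weight update of (4.6) and (4.7) translates almost directly into code. In this sketch the constant detection probability, the Gaussian likelihood in the usage example, and the function names are assumptions for illustration.

```python
import math

def gaussian(x, mean, std):
    """Density of a 1-D Gaussian, used as a stand-in likelihood g(z|x)."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def update_weights(particles, weights, measurements, p_detect, likelihood, clutter):
    """Particle weight update following (4.6) and (4.7)."""
    psi = lambda z, x: p_detect * likelihood(z, x)
    # (4.6): clutter intensity plus <omega_{t|t-1}, psi_{t,z}> for each z.
    denominators = [
        clutter(z) + sum(psi(z, x) * w for x, w in zip(particles, weights))
        for z in measurements
    ]
    # (4.7): missed-detection term plus one term per measurement.
    updated = []
    for x, w in zip(particles, weights):
        gain = (1.0 - p_detect) + sum(
            psi(z, x) / denom for z, denom in zip(measurements, denominators)
        )
        updated.append(gain * w)
    return updated

particles = [0.0, 0.5, 4.0]
prior_weights = [0.4, 0.4, 0.2]
likelihood = lambda z, x: gaussian(z, x, 0.5)
no_clutter = lambda z: 0.0
posterior = update_weights(particles, prior_weights, [0.3], 0.9, likelihood, no_clutter)
```

With zero clutter, a single measurement contributes exactly one unit of mass, so the updated weights sum to (1 − p_D) times the predicted mass plus one; non-zero clutter reduces that contribution.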
Step 3: Resampling Step
The particles are resampled to obtain an unweighted representation of $D_{t|t}$. This is unweighted
since the resampled representation of $D_{t|t}$ is given by the particle density.
• Compute the mass of the particles:
\[ T_t = \sum_{i=1}^{R_t} \omega^{(i)}_t, \tag{4.9} \]
and set $N_t = N \cdot int(T_t)$ (where $int(T_t)$ is the integer nearest to $T_t$). Target estimates are taken
at this stage because the resampling stage introduces further approximations, resulting in
less descriptive posterior distributions. In the PHD Filter algorithm, the weights are not
normalized as in the standard particle filter algorithm as they do not sum to one but, instead,
to the expected number of targets.
• Resample $\left\{ \omega^{(i)}_t / T_t,\ x^{(i)}_t \right\}_{i=1}^{R_t}$ to get $\left\{ T_t / N_t,\ x^{(i)}_t \right\}_{i=1}^{N_t}$.
The particles each have weight $T_t/N_t$ after resampling. Let $D^{N_t}_{t|t}$ be the measure:
\[ D^{N_t}_{t|t}(dx_t) := \sum_{i=1}^{N_t} \omega^{(i)}_t\, \delta_{x^{(i)}_t}(dx_t). \tag{4.10} \]
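The resampling step can be sketched with multinomial resampling from the standard library. Two details are assumptions of this example rather than part of the algorithm as stated: Python's `round` is used for the nearest integer (it rounds exact halves to the even integer), and at least one target's worth of particles is kept so that the particle set never becomes empty.

```python
import random

def resample(particles, weights, particles_per_target, rng):
    """Step 3: multinomial resampling to N_t equally weighted particles."""
    total_mass = sum(weights)                            # T_t in (4.9)
    # N_t = N * int(T_t); the max(1, ...) guard is an added assumption.
    n_new = particles_per_target * max(1, round(total_mass))
    # Draw with probabilities omega_t^(i) / T_t; choices normalises the weights.
    chosen = rng.choices(particles, weights=weights, k=n_new)
    new_weights = [total_mass / n_new] * n_new           # each particle has T_t / N_t
    return chosen, new_weights, total_mass

rng = random.Random(0)
old_particles = [0.0, 1.0, 2.0, 3.0]
old_weights = [0.5, 0.9, 0.6, 0.3]
new_particles, new_weights, mass = resample(old_particles, old_weights, 50, rng)
```

The total mass is preserved by construction, so the estimated number of targets is unchanged by resampling; only the representation of the PHD changes.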
4.3 Convergence for the Particle PHD Filter Algorithm
Convergence properties for the Particle PHD Filter will now be established. First, we consider
the rate of convergence of the average mean square error
$E\left[ (\langle D^{N_t}_{t|t}, \phi \rangle - \langle D_{t|t}, \phi \rangle)^2 \right]$
for any function $\phi \in B(\mathbb{R}^d)$, where $B(\mathbb{R}^d)$ is the set of bounded Borel measurable functions
on $\mathbb{R}^d$. Then we show almost-sure convergence of $D^{N_t}_{t|t}$ to $D_{t|t}$. When the measure in the
inner product ⟨·,·⟩ is continuous, it defines the integral inner product, and when it is discrete,
it defines the summation inner product, so that:
\[ \langle D_{t|t}, \phi \rangle = \int D_{t|t}(x_t | Z_{1:t})\, \phi(x_t)\, dx_t \tag{4.11} \]
and
\[ \langle D^{N_t}_{t|t}, \phi \rangle = \sum_{i=1}^{N_t} \omega^{(i)}_t\, \phi(x^{(i)}_t). \tag{4.12} \]
The norm ‖ϕ‖ used here is the supremum norm.
4.3.1 Criteria for Convergence
To show convergence, certain conditions on the functions need to be met:
• The transition kernel $\varphi_{t|t-1}$ satisfies the Feller Property, i.e. $\forall t > 0$, $\int \phi(y)\, \varphi_{t|t-1}(x, dy)$ is
continuous $\forall \phi \in C_b(\mathbb{R}^d)$, where $C_b(\mathbb{R}^d)$ are the continuous bounded functions on $\mathbb{R}^d$.
• $\psi_{t,z} \in C_b(\mathbb{R}^d)$.
• The rational-valued random variables $Q^{(i)}_t$ are such that there exist p > 1, some constant
C, and α < p − 1, with
\[ E\left[ \left| \sum_{i=1}^{N} (Q^{(i)}_t - N\omega^{(i)}_t)\, q^{(i)} \right|^p \right] \leq C N^{\alpha} \|q\|^p \tag{4.13} \]
for all vectors $q = (q^{(1)}, \ldots, q^{(N)})$ and $\sum_{i=1}^{N} Q^{(i)}_t = N$.
• We assume that the importance sampling ratios are bounded, i.e. there exist constants $B_1$
and $B_2$ such that $\|\gamma_t / p_t\| \leq B_1$ and $\|\varphi_{t|t-1} / q_t\| \leq B_2$.
• The resampling strategy is multinomial and hence unbiased [54], i.e. the resampled particle
set is i.i.d. according to the empirical distribution before resampling.
The data update equation assumes a Poisson model, and hence is only an approximation.
The clutter parameter κt,z needs to be determined from the data and cannot be inferred from
the recursion. For the purpose of these proofs, it has been assumed that we know the correct
density ct and average number of Poisson distributed clutter points λt .
4.3.2 Convergence of the Mean Square Errors
If $\{\mu^N\}$, $N = 1, \ldots, \infty$, is a sequence of measures that depend on the number of particles, then
we say $\mu^N$ converges to µ if $\forall \phi \in B(\mathbb{R}^d)$,
\[ \lim_{N \to \infty} E\left[ (\langle \mu^N, \phi \rangle - \langle \mu, \phi \rangle)^2 \right] = 0. \tag{4.14} \]
We show that, in the case of the PHD, this depends only on T, the number of targets, and
$N_t$, the number of particles. Let the likelihood function $g \in B(\mathbb{R}^d)$ be a bounded function.
At each stage of the algorithm, the approximation admits a mean square error that is inversely
proportional to the number of particles. We proceed by first showing that equation (4.15) is satisfied in
the initialisation step. Then we show that if equation (4.15) holds, then after the prediction
step equation (4.16) holds. If equation (4.16) holds, then equation (4.17) holds after the
update step. Finally, we show that if equation (4.17) holds, then equation (4.18) holds after
resampling.
\[ E\left[ (\langle D^{N_{t-1}}_{t-1|t-1}, \phi \rangle - \langle D_{t-1|t-1}, \phi \rangle)^2 \right] \leq c_{t-1|t-1} \frac{\|\phi\|^2}{N_{t-1}}, \tag{4.15} \]
\[ E\left[ (\langle D^{N_{t-1},M}_{t|t-1}, \phi \rangle - \langle D_{t|t-1}, \phi \rangle)^2 \right] \leq \|\phi\|^2 \left( \frac{c_{t|t-1}}{N_{t-1}} + \frac{d_t}{M} \right), \tag{4.16} \]
\[ E\left[ (\langle D^{R_t}_{t|t}, \phi \rangle - \langle D_{t|t}, \phi \rangle)^2 \right] \leq c_{t|t} \frac{\|\phi\|^2}{R_t}, \tag{4.17} \]
\[ E\left[ (\langle D^{N_t}_{t|t}, \phi \rangle - \langle D_{t|t}, \phi \rangle)^2 \right] \leq c_{t|t} \frac{\|\phi\|^2}{N_t}. \tag{4.18} \]
In deriving the proofs, we use the Minkowski inequality, which states that for any two
random variables X and Y in $L^2$:
\[ E[(X + Y)^2]^{\frac{1}{2}} \leq E[X^2]^{\frac{1}{2}} + E[Y^2]^{\frac{1}{2}}. \tag{4.19} \]
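The Minkowski inequality (4.19) is the triangle inequality for the $L^2$ norm, and it holds exactly for empirical averages as well as for expectations. The short check below uses arbitrary sample distributions chosen for the example; `l2_norm` is a hypothetical helper.

```python
import math
import random

def l2_norm(values):
    """Empirical E[V^2]^(1/2) over equally weighted samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
ys = [rng.uniform(-2.0, 2.0) for _ in range(1000)]

lhs = l2_norm([x + y for x, y in zip(xs, ys)])   # E[(X + Y)^2]^(1/2)
rhs = l2_norm(xs) + l2_norm(ys)                  # E[X^2]^(1/2) + E[Y^2]^(1/2)
```

In the proofs that follow, X and Y play the role of the two error terms obtained by adding and subtracting an intermediate quantity, so a bound on each term yields a bound on the total error.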
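LEMMA 7 For any $\phi \in B(\mathbb{R}^d)$, there exists some real number $c_{0|0}$ such that at Step 0
(Initialisation), condition (4.18) holds at time t = 0.
Proof:
We assume that at time t = 0, we can sample exactly from the initial distribution $D_{0|0}$. Then,
\[ \langle D^{N_0}_{0|0}, \phi \rangle - \langle D_{0|0}, \phi \rangle = \frac{1}{N_0} \sum_{i=1}^{N_0} \left( T_0\, \phi(x^{(i)}_t) - \langle D_{0|0}, \phi \rangle \right). \tag{4.20} \]
Let $\xi_i = T_0\, \phi(x^{(i)}_t) - \langle D_{0|0}, \phi \rangle$. Then $E[\xi_i] = 0$, and $\xi_1, \ldots, \xi_{N_0}$ is a sequence of independent
integrable random variables. From the Marcinkiewicz and Zygmund inequalities (see, for
example, p 498 [53]), there exists a constant c such that
\[ E\left[ \left( \frac{1}{N_0} \sum_{i=1}^{N_0} \xi_i \right)^2 \right] \leq c\, E\left[ \frac{1}{N_0^2} \sum_{i=1}^{N_0} \xi_i^2 \right], \tag{4.21} \]
and hence
\[ \frac{1}{N_0^2} E\left[ \left( \sum_{i=1}^{N_0} \xi_i \right)^2 \right] \leq c \frac{\|\xi\|^2}{N_0}, \tag{4.22} \]
where ‖ξ‖ is the supremum norm. Using the definition of ξ, we have
\[ \|\xi_i\| \leq 2 T_0 \|\phi\|, \tag{4.23} \]
since $\langle D_{0|0}, \phi \rangle \leq \|\phi\| \int D_{0|0}(dx) = \|\phi\| T_0$, by Hölder's Inequality. Therefore, at time t = 0,
there is a real number $c_{0|0}$, dependent on the initial number of targets $T_0$, such that
\[ E\left[ (\langle D^{N_0}_{0|0}, \phi \rangle - \langle D_{0|0}, \phi \rangle)^2 \right] \leq c_{0|0} \frac{\|\phi\|^2}{N_0}, \tag{4.24} \]
so condition (4.18) holds at the beginning of the algorithm.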
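The 1/N_0 decay of the mean square error at initialisation can be observed in a small Monte Carlo experiment. The setting below is an assumption made for illustration: the normalised initial PHD is a standard Gaussian, there are three expected targets, and the test function is the indicator of the positive half-line, for which the exact inner product is 1.5 and the mean square error is 2.25/N_0. The helper names are hypothetical.

```python
import random

def initial_inner_product(n_particles, expected_targets, phi, sample_prior, rng):
    """Particle estimate of <D_{0|0}, phi> with N_0 equally weighted particles."""
    total = sum(phi(sample_prior(rng)) for _ in range(n_particles))
    return expected_targets / n_particles * total

def mean_square_error(n_particles, truth, trials, expected_targets, phi, sample_prior, rng):
    """Monte Carlo estimate of the left-hand side of (4.24)."""
    errors = [
        (initial_inner_product(n_particles, expected_targets, phi, sample_prior, rng) - truth) ** 2
        for _ in range(trials)
    ]
    return sum(errors) / trials

rng = random.Random(1)
T0 = 3.0
phi = lambda x: 1.0 if x > 0.0 else 0.0        # bounded test function, sup norm 1
sample_prior = lambda r: r.gauss(0.0, 1.0)     # assumed normalised D_{0|0}
truth = T0 * 0.5                               # exact <D_{0|0}, phi>

mse_small = mean_square_error(100, truth, 400, T0, phi, sample_prior, rng)
mse_large = mean_square_error(1600, truth, 400, T0, phi, sample_prior, rng)
```

Multiplying each estimate by its particle count should give roughly the same constant (about 2.25 in this setting, up to Monte Carlo noise), which is the behaviour that (4.24) bounds.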
LEMMA 8 Assume that for any ϕ ∈ B(Rd), (4.15) holds. Then, after Step 1 (Prediction),
for any ϕ ∈ B(Rd), (4.16) holds for some constant dt and some real number ct|t−1 that
depends on the number of spawned targets.
Before proving this Lemma, some considerations are given below. The Sequential
Monte Carlo implementation involves sampling from two densities: $q_t$, the density propagated
from the previous time step, and $p_t$, the density for spontaneous birth. Suppose that
the spontaneous birth density is sampled by M particles and the propagated density by $N_{t-1}$
particles.
To prove convergence, we use the fact that the sum of two sequences converges weakly
to the sum of the limits of those sequences, which follows from a basic result of Real
Analysis on the convergence of sequences of real numbers. It then suffices to establish
weak convergence of the two sequences independently.
We have assumed that we can sample exactly from the spontaneous birth density $\gamma_t$,
so using the same argument for showing that the initial distribution is bounded (Lemma
7), and using the assumption that the importance ratio $\|\gamma_t / p_t\|$ is bounded, there is a
constant $d_t$ such that
\[ E\left[ (\langle \gamma^M_t, \phi \rangle - \langle \gamma_t, \phi \rangle)^2 \right] \leq d_t \frac{\|\phi\|^2}{M}. \tag{4.25} \]
Define $D'_{t|t-1}$ to be $D_{t|t-1} - \gamma_t$. We now show that the error in approximating $D'_{t|t-1}(x)$, the density
propagated from the previous time step, is bounded.
Proof:
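By the triangle inequality, we have
\begin{align}
&|\langle D^{N_{t-1}}_{t|t-1}, \phi \rangle - \langle D'_{t|t-1}, \phi \rangle| \tag{4.26} \\
&\leq |\langle D^{N_{t-1}}_{t|t-1}, \phi \rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle| + |\langle D^{N_{t-1}}_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle - \langle D_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle|. \tag{4.27}
\end{align}
Let $G_{t-1}$ be the σ-algebra generated by the particles $\{x^{(i)}_{t-1}\}$. Then
\[ E\left[ \langle D^{N_{t-1}}_{t|t-1}, \phi \rangle \,|\, G_{t-1} \right] = \langle D^{N_{t-1}}_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle, \tag{4.28} \]
hence
\begin{align}
&E\left[ \left( \langle D^{N_{t-1}}_{t|t-1}, \phi \rangle - E\left[ \langle D^{N_{t-1}}_{t|t-1}, \phi \rangle | G_{t-1} \right] \right)^2 | G_{t-1} \right] \tag{4.29} \\
&= E\left[ \left( \langle D^{N_{t-1}}_{t|t-1}, \phi \rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle \right)^2 | G_{t-1} \right] \tag{4.30} \\
&= E\left[ \langle D^{N_{t-1}}_{t|t-1}, \phi \rangle \left( \langle D^{N_{t-1}}_{t|t-1}, \phi \rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle \right) | G_{t-1} \right] \tag{4.31} \\
&\quad - \langle D^{N_{t-1}}_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle\, E\left[ \langle D^{N_{t-1}}_{t|t-1}, \phi \rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle | G_{t-1} \right]. \notag
\end{align}
The second term in (4.31) is zero, so the above simplifies to
\[ E\left[ \langle D^{N_{t-1}}_{t|t-1}, \phi \rangle^2 | G_{t-1} \right] - \langle D^{N_{t-1}}_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle^2. \tag{4.32} \]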
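Writing out this as a sum, and using the independence of the particles, (4.32) equals
\begin{align}
&\left( \frac{T_{t-1}}{N_{t-1}} \right)^2 \sum_{i=1}^{N_{t-1}} \left( E\left[ \left( \phi(x^{(i)}_t) \frac{\varphi_{t|t-1}(x^{(i)}_t, x^{(i)}_{t-1})}{q_t(x^{(i)}_t | x^{(i)}_{t-1}, Z_t)} \right)^2 | G_{t-1} \right] - (\varphi_{t|t-1}\phi)(x^{(i)}_{t-1})^2 \right) \tag{4.33} \\
&\leq \frac{T^2_{t-1}}{N_{t-1}} \|\phi\|^2 \left( \left\| \frac{\varphi_{t|t-1}}{q_t} \right\|^2 + \|\varphi_{t|t-1}\|^2 \right). \tag{4.34}
\end{align}
Using Minkowski's inequality, we obtain
\begin{align}
&E\left[ (\langle D^{N_{t-1}}_{t|t-1}, \phi \rangle - \langle D'_{t|t-1}, \phi \rangle)^2 \right]^{\frac{1}{2}} \tag{4.35} \\
&\leq E\left[ (\langle D^{N_{t-1}}_{t|t-1}, \phi \rangle - \langle D^{N_{t-1}}_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle)^2 \right]^{\frac{1}{2}} \tag{4.36} \\
&\quad + E\left[ (\langle D^{N_{t-1}}_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle - \langle D_{t-1|t-1}, \varphi_{t|t-1}\phi \rangle)^2 \right]^{\frac{1}{2}} \notag \\
&\leq \frac{1}{\sqrt{N_{t-1}}} \|\phi\| \left( T_{t-1} \left( \left\| \frac{\varphi_{t|t-1}}{q_t} \right\|^2 + \|\varphi_{t|t-1}\|^2 \right)^{\frac{1}{2}} + \sqrt{c_{t-1|t-1}} \right). \tag{4.37}
\end{align}
The transition kernel $\varphi_{t|t-1}$ is bounded by the single-target transition, $f_{t|t-1}$, and the PHD
of spawned targets, $b_{t|t-1}$:
\[ \varphi_{t|t-1}(x, x_{t-1}) = P_S(x_{t-1})\, f_{t|t-1}(x|x_{t-1}) + b_{t|t-1}(x|x_{t-1}). \tag{4.38} \]
Therefore $\|\varphi_{t|t-1}\phi\| \leq 1 + T_{t|t-1}$, where $T_{t|t-1}$ is the number of spawned targets. By assumption,
the ratio $\|\varphi_{t|t-1} / q_t\|$ is bounded by some constant $B_2$, and so the lemma is proved:
\[ E\left[ (\langle D^{N_{t-1},M}_{t|t-1}, \phi \rangle - \langle D_{t|t-1}, \phi \rangle)^2 \right] \leq \|\phi\|^2 \left( \frac{c_{t|t-1}}{N_{t-1}} + \frac{d_t}{M} \right), \tag{4.39} \]
where $c_{t|t-1} = \left( T_{t-1} (B_2^2 + (1 + T_{t|t-1})^2)^{\frac{1}{2}} + \sqrt{c_{t-1|t-1}} \right)^2$.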
LEMMA 9 Assume that for any ϕ ∈ B(Rd), (4.16) holds. Then, after Step 2 (Data Update),
for any ϕ ∈ B(Rd), (4.17) holds for some real number ct|t that depends on the number of
targets.
Proof:
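From the definitions (4.11) and (4.12), we have:
\begin{align}
&\langle D^{R_t}_{t|t}, \phi \rangle - \langle D_{t|t}, \phi \rangle \tag{4.40} \\
&= \left\langle \left[ \nu + \sum_{z \in Z_t} \frac{\psi_{t,z}}{\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle} \right] D^{N_{t-1},M}_{t|t-1}, \phi \right\rangle - \left\langle \left[ \nu + \sum_{z \in Z_t} \frac{\psi_{t,z}}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \right] D_{t|t-1}, \phi \right\rangle \notag
\end{align}
(by linearity)
\begin{align}
&= \left( \langle D^{N_{t-1},M}_{t|t-1}, \phi\nu \rangle - \langle D_{t|t-1}, \phi\nu \rangle \right) \notag \\
&\quad + \sum_{z \in Z_t} \left[ \frac{\langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle} - \frac{\langle D_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \right] \tag{4.41}
\end{align}
(adding and subtracting a new term)
\begin{align}
&= \left( \langle D^{N_{t-1},M}_{t|t-1}, \phi\nu \rangle - \langle D_{t|t-1}, \phi\nu \rangle \right) \tag{4.42} \\
&\quad + \sum_{z \in Z_t} \left[ \frac{\langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle} - \frac{\langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \right. \notag \\
&\qquad\qquad \left. + \frac{\langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} - \frac{\langle D_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \right]. \notag
\end{align}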
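The modulus of the first bracket in the summation from (4.42) is:
\begin{align}
&\left| \frac{\langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle} - \frac{\langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \right| \tag{4.43} \\
&= \frac{\left| \langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle (\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle) - \langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle (\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle) \right|}{(\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle)(\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle)} \tag{4.44} \\
&\leq \frac{\left| \langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle (\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle) - \langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle (\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle) \right|}{\langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle \langle D_{t|t-1}, \psi_{t,z} \rangle} \tag{4.45} \\
&\leq \frac{\|\phi\|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \left| \langle D_{t|t-1}, \psi_{t,z} \rangle - \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle \right|. \tag{4.46}
\end{align}
The second bracket in the summation from (4.42) is
\begin{align}
\left| \frac{\langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} - \frac{\langle D_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \right| &= \frac{\left| \langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle - \langle D_{t|t-1}, \phi\psi_{t,z} \rangle \right|}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \tag{4.47} \\
&\leq \frac{\left| \langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle - \langle D_{t|t-1}, \phi\psi_{t,z} \rangle \right|}{\langle D_{t|t-1}, \psi_{t,z} \rangle}. \tag{4.48}
\end{align}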
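Combining these, we get
\begin{align}
&\left( \langle D^{N_{t-1},M}_{t|t-1}, \phi\nu \rangle - \langle D_{t|t-1}, \phi\nu \rangle \right) \tag{4.49} \\
&\quad + \sum_{z \in Z_t} \left[ \frac{\langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle} - \frac{\langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} + \frac{\langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} - \frac{\langle D_{t|t-1}, \phi\psi_{t,z} \rangle}{\kappa_{t,z} + \langle D_{t|t-1}, \psi_{t,z} \rangle} \right] \notag \\
&\leq \left| \langle D^{N_{t-1},M}_{t|t-1}, \phi\nu \rangle - \langle D_{t|t-1}, \phi\nu \rangle \right| \tag{4.50} \\
&\quad + \sum_{z \in Z_t} \left( \frac{\|\phi\|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \left| \langle D_{t|t-1}, \psi_{t,z} \rangle - \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle \right| + \frac{\left| \langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle - \langle D_{t|t-1}, \phi\psi_{t,z} \rangle \right|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \right). \notag
\end{align}
From Minkowski's inequality,
\begin{align}
&E\left[ (\langle D^{R_t}_{t|t}, \phi \rangle - \langle D_{t|t}, \phi \rangle)^2 \right]^{\frac{1}{2}} \leq E\left[ (\langle D^{N_{t-1},M}_{t|t-1}, \phi\nu \rangle - \langle D_{t|t-1}, \phi\nu \rangle)^2 \right]^{\frac{1}{2}} \tag{4.51} \\
&\quad + \sum_{z \in Z_t} \left( \frac{\|\phi\|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} E\left[ (\langle D_{t|t-1}, \psi_{t,z} \rangle - \langle D^{N_{t-1},M}_{t|t-1}, \psi_{t,z} \rangle)^2 \right]^{\frac{1}{2}} + \frac{E\left[ (\langle D^{N_{t-1},M}_{t|t-1}, \phi\psi_{t,z} \rangle - \langle D_{t|t-1}, \phi\psi_{t,z} \rangle)^2 \right]^{\frac{1}{2}}}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \right) \notag \\
&\leq \frac{\sqrt{c_{t|t-1}}\, \|\phi\| \|\nu\|}{\sqrt{R_t}} + \sum_{z \in Z_t} \left( \frac{2 \|\phi\| \|\psi_{t,z}\| \sqrt{c_{t|t-1}}}{\langle D_{t|t-1}, \psi_{t,z} \rangle \sqrt{R_t}} \right) \tag{4.52} \\
&\leq \frac{\sqrt{c_{t|t-1}}\, \|\phi\|}{\sqrt{R_t}} \left[ 1 + \sum_{z \in Z_t} \left( \frac{2 \|\psi_{t,z}\|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \right) \right]. \tag{4.53}
\end{align}
$\psi_{t,z}$ is a bounded function, since g is bounded by assumption. Lemma 9 follows from this
where $c_{t|t} = c_{t|t-1} \left[ 1 + \sum_{z \in Z_t} \left( \frac{2 \|\psi_{t,z}\|}{\langle D_{t|t-1}, \psi_{t,z} \rangle} \right) \right]^2$.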
LEMMA 10 Assume that ∀ϕ ∈ B(Rd), (4.17) holds. Then after Step 3 (Resampling), there
exists a real number ct|t , that depends on the number of targets, such that ∀ϕ ∈ B(Rd),
(4.18) holds.
Proof:
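Adding and subtracting the term $\langle D^{R_t}_{t|t}, \phi \rangle$ from the Data Update step, we have
\[ \langle D^{N_t}_{t|t}, \phi \rangle - \langle D_{t|t}, \phi \rangle = (\langle D^{N_t}_{t|t}, \phi \rangle - \langle D^{R_t}_{t|t}, \phi \rangle) + (\langle D^{R_t}_{t|t}, \phi \rangle - \langle D_{t|t}, \phi \rangle), \tag{4.54} \]
so by Minkowski's inequality,
\[ E\left[ (\langle D^{N_t}_{t|t}, \phi \rangle - \langle D_{t|t}, \phi \rangle)^2 \right]^{\frac{1}{2}} \leq E\left[ (\langle D^{N_t}_{t|t}, \phi \rangle - \langle D^{R_t}_{t|t}, \phi \rangle)^2 \right]^{\frac{1}{2}} + E\left[ (\langle D^{R_t}_{t|t}, \phi \rangle - \langle D_{t|t}, \phi \rangle)^2 \right]^{\frac{1}{2}}. \tag{4.55} \]
Let $F_t$ be the σ-algebra generated by $\{x^{(i)}\}$, $i = 1, \ldots, R_t$. Then the expectation of the inner
product $\langle D^{N_t}_{t|t}, \phi \rangle$ conditioned on $F_t$ is
\[ E\left[ \langle D^{N_t}_{t|t}, \phi \rangle | F_t \right] = \langle D^{R_t}_{t|t}, \phi \rangle. \tag{4.56} \]
Hence there exists a number c such that
\[ E\left[ (\langle D^{N_t}_{t|t}, \phi \rangle - \langle D^{R_t}_{t|t}, \phi \rangle)^2 | F_t \right] \leq \frac{c}{N_t} \|\phi\|^2. \tag{4.57} \]
This follows from the assumption that the resampling strategy is unbiased. Using Minkowski's
inequality, as above, we have
\[ E\left[ (\langle D^{N_t}_{t|t}, \phi \rangle - \langle D_{t|t}, \phi \rangle)^2 \right]^{\frac{1}{2}} \leq (\sqrt{c} + \sqrt{c_{t|t}}) \frac{\|\phi\|}{\sqrt{N_t}}. \tag{4.58} \]
Lemma 10 is then proved, with the constant in (4.18) given by $(\sqrt{c} + \sqrt{c_{t|t}})^2$, where $c_{t|t}$ on the right is the constant from (4.17).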
THEOREM 6 ∀t ≥ 0, there is a real number ct|t , that depends on the number of new targets
but is independent of the number of particles, such that ∀ϕ ∈ B(Rd), (4.18) holds.
Proof:
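Combining the above proofs, we have shown that $\forall t \geq 0$, $\exists c_{t|t}$ independent of $N_t$, but
dependent on the number of targets, such that $\forall \phi \in B(\mathbb{R}^d)$:
\[ E\left[ (\langle D^{N_t}_{t|t}, \phi \rangle - \langle D_{t|t}, \phi \rangle)^2 \right] \leq c_{t|t} \frac{\|\phi\|^2}{N_t}. \tag{4.59} \]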
4.3.3 Convergence of Empirical Measures
To prove that an empirical distribution converges to its true distribution, we need to have
a notion of convergence for measures. This type of convergence is called weak conver-
gence, which is fundamental to the study of probability and statistics. With this type of
convergence, the values of the random variables are not important; it is the probabilities
with which they assume those values that matter. Thus, the probability distributions of the
random variables will be converging, not the values themselves [57].
Let $\mu^N$ and µ be probability measures on $\mathbb{R}^d$. Then, the sequence $\mu^N$ converges weakly
to µ if $\int f(x)\, \mu^N(dx)$ converges to $\int f(x)\, \mu(dx)$ for each real-valued continuous and bounded
function f on $\mathbb{R}^d$.
This definition can be extended to more general measures, not just probability distribu-
tions. In our case, we will be considering the PHD measure, where the notion still applies.
(Further details on weak convergence for measures can be obtained from Billingsley [63].)
The empirical measures considered here are the particles that approximate the true measures,
where $N_t$ represents the number of particles. Let $C_b(\mathbb{R}^d)$ be the set of real-valued
continuous bounded functions on $\mathbb{R}^d$. If $(\mu^N)$ is a sequence of measures, then $\mu^N$ converges
weakly to µ if, for all $\phi \in C_b(\mathbb{R}^d)$:
\[ \lim_{N \to \infty} \langle \mu^N, \phi \rangle = \langle \mu, \phi \rangle. \tag{4.60} \]
This section shows that after each stage of the PHD Filter algorithm, the measures
converge weakly. We proceed by first showing that equation (4.61) is satisfied in the
initialisation step. Then we show that if equation (4.61) holds, then after the prediction step
equation (4.62) holds. If equation (4.62) holds, then equation (4.63) holds after the update
step. Finally, we show that if equation (4.63) holds, then equation (4.64) holds after
resampling.
\[ \lim_{N_{t-1} \to \infty} D^{N_{t-1}}_{t-1|t-1} = D_{t-1|t-1} \quad \text{a.s.} \tag{4.61} \]
\[ \lim_{N_{t-1}, M \to \infty} D^{N_{t-1},M}_{t|t-1} = D_{t|t-1} \quad \text{a.s.} \tag{4.62} \]
\[ \lim_{R_t \to \infty} D^{R_t}_{t|t} = D_{t|t} \quad \text{a.s.} \tag{4.63} \]
\[ \lim_{N_t \to \infty} D^{N_t}_{t|t} = D_{t|t} \quad \text{a.s.} \tag{4.64} \]
(a.s. stands for almost surely, i.e. true for all values outside the null set.)
We assume that at time t = 0, we can sample exactly from the initial distribution $D_{0|0}$.
Then, from the Glivenko-Cantelli Theorem [57], which states that empirical distributions
converge to their actual distributions almost surely,
\[ \lim_{N_0 \to \infty} D^{N_0}_{0|0} = D_{0|0} \quad \text{a.s.} \tag{4.65} \]
LEMMA 11 Suppose (4.61) holds, then after Step 1 (Prediction), (4.62) holds.
Proof:
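Define $D'_{t|t-1}$ to be $D_{t|t-1} - \gamma_t$. It suffices to prove that
\[ \lim_{M\to\infty} \gamma^{M}_t = \gamma_t \quad \text{a.s.} \tag{4.66} \]
and
\[ \lim_{N_{t-1}\to\infty} D^{N_{t-1}}_{t|t-1} = D'_{t|t-1} \quad \text{a.s.} \tag{4.67} \]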
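We have assumed that we sample $M$ i.i.d. particles from $\gamma_t$ for the first of these, so by
the Glivenko-Cantelli Theorem, (4.66) is true. Let $\mathcal{G}_{t-1}$ be the σ-algebra generated by
$\{x^{(i)}_{0:t-1}\}_{i=1}^{N_{t-1}}$, then
\[ E\left[\langle D^{N_{t-1}}_{t|t-1},\phi\rangle \mid \mathcal{G}_{t-1}\right] = \langle D^{N_{t-1}}_{t-1|t-1},\varphi_{t|t-1}\phi\rangle. \tag{4.68} \]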
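Since $E\left[\phi(x^{(i)}_t) \mid \mathcal{G}_{t-1}\right] = (\varphi_{t|t-1}\phi)(x^{(i)}_{t-1})$ and $\{x^{(i)}_{0:t}\}_{i=1}^{N_{t-1}}$ are i.i.d. random variables
conditional on $\mathcal{G}_{t-1}$, we have
\begin{align}
& E\left[\left(\langle D^{N_{t-1}}_{t|t-1},\phi\rangle - E\left[\langle D^{N_{t-1}}_{t|t-1},\phi\rangle \mid \mathcal{G}_{t-1}\right]\right)^4 \mid \mathcal{G}_{t-1}\right] \tag{4.69} \\
&= E\left[\left(\frac{T_{t-1}}{N_{t-1}}\sum_{i=1}^{N_{t-1}}\left(\phi(x^{(i)}_t)\frac{\varphi_{t|t-1}(x^{(i)}_t,x^{(i)}_{t-1})}{q_t(x^{(i)}_t|x^{(i)}_{t-1},Z_t)} - (\varphi_{t|t-1}\phi)(x^{(i)}_{t-1})\right)\right)^4 \mid \mathcal{G}_{t-1}\right]. \tag{4.70}
\end{align}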
For notational simplicity, define the function $\Phi$ as
\[ \Phi(x^{(i)}_t,x^{(i)}_{t-1}) = \phi(x^{(i)}_t)\frac{\varphi_{t|t-1}(x^{(i)}_t,x^{(i)}_{t-1})}{q_t(x^{(i)}_t|x^{(i)}_{t-1},Z_t)} - (\varphi_{t|t-1}\phi)(x^{(i)}_{t-1}), \tag{4.71} \]
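which has an expectation of zero. Then, expanding the above quartic gives
\begin{align}
= \left(\frac{T_{t-1}}{N_{t-1}}\right)^4 \Bigg( & \sum_{i=1}^{N_{t-1}} E\left[\Phi(x^{(i)}_t,x^{(i)}_{t-1})^4 \mid \mathcal{G}_{t-1}\right] \tag{4.72} \\
& + \sum_{i\neq j}^{N_{t-1}} E\left[\Phi(x^{(i)}_t,x^{(i)}_{t-1})^3\,\Phi(x^{(j)}_t,x^{(j)}_{t-1}) + \Phi(x^{(i)}_t,x^{(i)}_{t-1})^2\,\Phi(x^{(j)}_t,x^{(j)}_{t-1})^2 \mid \mathcal{G}_{t-1}\right] \notag \\
& + \sum_{i,j,k\ \text{distinct}}^{N_{t-1}} E\left[\Phi(x^{(i)}_t,x^{(i)}_{t-1})^2\,\Phi(x^{(j)}_t,x^{(j)}_{t-1}) \mid \mathcal{G}_{t-1}\right] E\left[\Phi(x^{(k)}_t,x^{(k)}_{t-1}) \mid \mathcal{G}_{t-1}\right] \notag \\
& + \sum_{i,j,k,l\ \text{distinct}}^{N_{t-1}} E\left[\Phi(x^{(i)}_t,x^{(i)}_{t-1})\,\Phi(x^{(j)}_t,x^{(j)}_{t-1})\,\Phi(x^{(k)}_t,x^{(k)}_{t-1})\,\Phi(x^{(l)}_t,x^{(l)}_{t-1}) \mid \mathcal{G}_{t-1}\right] \Bigg) \notag \\
= \left(\frac{T_{t-1}}{N_{t-1}}\right)^4 \Bigg( & \sum_{i=1}^{N_{t-1}} E\left[\Phi(x^{(i)}_t,x^{(i)}_{t-1})^4 \mid \mathcal{G}_{t-1}\right] + \sum_{i\neq j}^{N_{t-1}} E\left[\Phi(x^{(i)}_t,x^{(i)}_{t-1})^2\,\Phi(x^{(j)}_t,x^{(j)}_{t-1})^2 \mid \mathcal{G}_{t-1}\right] \Bigg), \tag{4.73}
\end{align}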
where the last equality holds because $\Phi(x^{(i)}_t,x^{(i)}_{t-1})$ are mutually independent random
variables with mean zero. Taking expectations of (4.70) and (4.71), there exists a constant $C$
such that
\[ E\left[\left(\langle D^{N_{t-1}}_{t|t-1},\phi\rangle - E\left[\langle D^{N_{t-1}}_{t|t-1},\phi\rangle \mid \mathcal{G}_{t-1}\right]\right)^4\right] \le C\,\frac{T^4_{t-1}\left(B_2^4 + (1+T_{t|t-1})^4\right)\|\phi\|^4}{N^2_{t-1}}, \tag{4.74} \]
since there are $O(N^2_{t-1})$ terms bounded by $C\,T^4_{t-1}(B_2^4 + (1+T_{t|t-1})^4)\|\phi\|^4$, following a
similar argument as in Lemma 7.
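The $1/N^2$ rate in (4.74) comes from the fact that only $O(N^2)$ of the $N^4$ cross terms survive for independent zero-mean variables. A minimal sketch of this effect, assuming the simplest possible case of independent ±1 coin flips in place of the terms $\Phi$ (an illustrative stand-in, not the quantities in the proof), computes the fourth moment of the sample mean exactly:

```python
from fractions import Fraction
from math import comb

def fourth_moment_of_mean(n):
    """Exact E[(S_n / n)^4] where S_n is a sum of n independent +/-1 coin flips."""
    total = Fraction(0)
    for heads in range(n + 1):
        s = 2 * heads - n                         # value of the sum S_n
        prob = Fraction(comb(n, heads), 2 ** n)   # binomial probability of that value
        total += prob * Fraction(s, n) ** 4
    return total

for n in (10, 50, 200):
    m4 = fourth_moment_of_mean(n)
    # n**2 * m4 stays bounded (it equals 3 - 2/n), so m4 = O(1/n**2)
    print(n, float(m4), float(n ** 2 * m4))
```

Here the $n$ diagonal terms and the $3n(n-1)$ paired terms are the only contributions, which mirrors the two sums retained in (4.73).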
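It then follows that
\[ E\left[\left(\langle D^{N_{t-1}}_{t|t-1},\phi\rangle - \langle D^{N_{t-1}}_{t-1|t-1},\varphi_{t|t-1}\phi\rangle\right)^4\right] \le C\,\frac{T^4_{t-1}\left(B_2^4 + (1+T_{t|t-1})^4\right)\|\phi\|^4}{N^2_{t-1}}, \tag{4.75} \]
and hence, since the right-hand side is summable over $N_{t-1}$, the Borel-Cantelli lemma gives
\[ \lim_{N_{t-1}\to\infty} \left(\langle D^{N_{t-1}}_{t|t-1},\phi\rangle - \langle D^{N_{t-1}}_{t-1|t-1},\varphi_{t|t-1}\phi\rangle\right) = 0 \quad \text{a.s.} \tag{4.76} \]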
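Since the limit of a sum is the sum of the limits, we have
\begin{align}
\lim_{N_{t-1},M\to\infty} \langle D^{N_{t-1},M}_{t|t-1},\phi\rangle &= \lim_{N_{t-1}\to\infty}\langle D^{N_{t-1}}_{t|t-1},\phi\rangle + \lim_{M\to\infty}\langle \gamma^{M}_t,\phi\rangle \tag{4.77} \\
&= \langle D'_{t|t-1},\phi\rangle + \langle \gamma_t,\phi\rangle = \langle D_{t|t-1},\phi\rangle. \tag{4.78}
\end{align}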
LEMMA 12 Suppose (4.62) holds, then after Step 2 (Data Update), (4.63) holds.
Proof:
By definition,
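\[ \langle D^{R_t}_{t|t},\phi\rangle = \langle D^{N_{t-1}}_{t|t-1},\phi\nu\rangle + \sum_{z\in Z_t}\frac{\langle D^{N_{t-1}}_{t|t-1},\phi\psi_{t,z}\rangle}{\kappa_{t,z} + \langle D^{N_{t-1}}_{t|t-1},\psi_{t,z}\rangle}. \tag{4.79} \]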
By continuity and Lemma 11, we have
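\[ \lim_{N_{t-1}\to\infty} \langle D^{N_{t-1}}_{t|t-1},\phi\nu\rangle = \langle D_{t|t-1},\phi\nu\rangle, \tag{4.80} \]
\[ \lim_{N_{t-1},M\to\infty} \langle D^{N_{t-1},M}_{t|t-1},\phi\psi_{t,z}\rangle = \langle D_{t|t-1},\phi\psi_{t,z}\rangle, \tag{4.81} \]
and
\[ \lim_{N_{t-1}\to\infty} \langle D^{N_{t-1}}_{t|t-1},\psi_{t,z}\rangle = \langle D_{t|t-1},\psi_{t,z}\rangle. \tag{4.82} \]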
Hence,
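\[ \lim_{R_t\to\infty} \langle D^{R_t}_{t|t},\phi\rangle = \langle D_{t|t-1},\phi\nu\rangle + \sum_{z\in Z_t}\left(\frac{\langle D_{t|t-1},\phi\psi_{t,z}\rangle}{\kappa_{t,z} + \langle D_{t|t-1},\psi_{t,z}\rangle}\right) = \langle D_{t|t},\phi\rangle \tag{4.83} \]
and therefore
\[ \lim_{R_t\to\infty} D^{R_t}_{t|t} = D_{t|t} \quad \text{a.s.} \tag{4.84} \]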
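The update in (4.79) can be sketched on a handful of weighted particles. This is a minimal scalar illustration, not the thesis implementation: it assumes $\nu(x) = 1 - p_D$ and $\psi_{t,z}(x) = p_D\,g(z|x)$ with a Gaussian likelihood $g$, and the helper names `gaussian` and `phd_update` are hypothetical.

```python
import math

def gaussian(z, mean, var):
    """Scalar Gaussian density N(z; mean, var)."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def phd_update(particles, weights, measurements, p_d, clutter, meas_var):
    """Reweight particles with the PHD data update (hypothetical helper).

    Assumes nu(x) = 1 - p_d and psi_{t,z}(x) = p_d * g(z | x) with a
    Gaussian likelihood g; clutter plays the role of kappa_{t,z}.
    """
    new_weights = [(1.0 - p_d) * w for w in weights]
    for z in measurements:
        psi = [p_d * gaussian(z, x, meas_var) for x in particles]
        denom = clutter + sum(p * w for p, w in zip(psi, weights))  # kappa + <D, psi>
        for i, (p, w) in enumerate(zip(psi, weights)):
            new_weights[i] += p * w / denom
    return new_weights

particles = [-1.0, 0.0, 0.5, 2.0]
weights = [0.25, 0.25, 0.25, 0.25]   # predicted mass <D, 1> = 1 target
updated = phd_update(particles, weights, measurements=[0.4], p_d=0.9,
                     clutter=0.05, meas_var=0.25)
expected_targets = sum(updated)      # <D_{t|t}, 1>
print(updated, expected_targets)
```

Summing the updated weights gives the estimated number of targets, which corresponds to taking $\phi \equiv 1$ in (4.79).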
LEMMA 13 Suppose (4.63) holds, then after Step 3 (Resampling), (4.64) holds.
Proof:
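Let $P^{(i)}_t$ be the number of times that particle $x^{(i)}_t$ is resampled and let $Q^{(i)}_t = P^{(i)}_t \cdot R_t/N_t$.
Then, from our assumption, we have
\[ E\left[\left|\langle D^{N_t}_{t|t},\phi\rangle - \langle D^{R_t}_{t|t},\phi\rangle\right|^p\right] = E\left[\left(\frac{1}{R_t}\sum_{i=1}^{R_t}\left|(Q^{(i)}_t - R_t\,\omega^{(i)}_t)\,\phi(x^{(i)}_t)\right|\right)^p\right] \le \frac{C\|\phi\|^p}{R_t^{1+\varepsilon}}, \tag{4.85} \]
where $\varepsilon = p - \alpha - 1 \ge 0$. Hence,
\[ \lim_{N_t\to\infty} \langle D^{N_t}_{t|t},\phi\rangle - \langle D^{R_t}_{t|t},\phi\rangle = 0 \quad \text{a.s.} \tag{4.86} \]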
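The effect described by (4.86) can be observed numerically. The sketch below assumes plain multinomial resampling (the scheme used in the thesis may differ) and an arbitrary bounded test function; it compares $\langle D, \phi\rangle$ before and after resampling a weighted particle set whose total mass is not one.

```python
import random

def resample(particles, weights, n_out, rng):
    """Multinomial resampling: draw n_out particles with probability
    proportional to weight; each survivor gets weight mass / n_out."""
    mass = sum(weights)
    drawn = rng.choices(particles, weights=weights, k=n_out)
    return drawn, [mass / n_out] * n_out

def integrate(particles, weights, phi):
    """<D, phi> for a weighted particle set."""
    return sum(w * phi(x) for x, w in zip(particles, weights))

rng = random.Random(1)                          # fixed seed for reproducibility
particles = [rng.gauss(0.0, 1.0) for _ in range(2000)]
weights = [rng.random() for _ in particles]
total = sum(weights)
weights = [2.5 * w / total for w in weights]    # total mass of 2.5 "targets"

phi = lambda x: 1.0 / (1.0 + x * x)             # bounded test function
before = integrate(particles, weights, phi)
new_particles, new_weights = resample(particles, weights, 50_000, rng)
after = integrate(new_particles, new_weights, phi)
print(before, after)
```

The total mass is preserved exactly, while the integral against $\phi$ changes only by a small random amount that shrinks as the number of resampled particles grows.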
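THEOREM 7 For all $t \ge 0$, (4.64) holds.
Proof:
The initialisation result (4.65) together with the above three lemmas shows, by induction on $t$,
that for all $t \ge 0$, $\lim_{N_t\to\infty} D^{N_t}_{t|t} = D_{t|t}$ a.s.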
4.4 Conclusions
It has been shown, under the assumption that ϕ(x) is bounded above, that it is possible
to find bounds for the mean square error of the PHD Particle filter at each stage of the
algorithm. These depend on the number of targets introduced at each iteration, but if the
order of the number of targets is much lower than the order of the number of particles, i.e.
T << N, then the error tends to zero as N tends to infinity.
It has also been shown, under the additional assumptions that the transition kernel sat-
isfies the Feller property and the likelihood function is a continuous bounded function, that
the empirical distribution, represented by the particles, converges almost surely to the true
PHD distribution. These results are not dependent on the state dimension. The data update
equation assumes a Poisson model, and hence is only an approximation. The clutter param-
eter κt,z needs to be determined from the data and cannot be inferred from the recursion.
For the purpose of these proofs, it has been assumed that we know the correct density ct
and average number of Poisson clutter points λt .
The assumption that ϕ(x) is bounded above may be too restrictive for practitioners, and
the additional assumptions on the likelihood and transition kernel may be unrealistic for
practical applications, although applications of the PHD filter have demonstrated its
potential for real-world use. Despite these reservations, these results give justification to
the Sequential Monte Carlo implementation of the PHD filter given in chapter 3, and show
how the order of the mean squared error is reduced as the number of particles increases.
An implementation of the algorithm for estimating a variable number of targets in
forward-looking sonar shall be demonstrated in chapter 6, and novel methods for introducing
track continuity into the algorithm are presented in chapter 7, which will then be
demonstrated on real sonar data in chapter 9.
Chapter 5
The Gaussian Mixture PHD Filter
5.1 Introduction
This chapter describes the second implementation of the PHD filter studied in this thesis,
the Gaussian Mixture PHD filter. A closed-form solution to the PHD (Probability Hypothesis
Density) filter was recently derived for multiple target tracking with linear/Gaussian
models, without the need for measurement-to-track data association [34, 33]. It was shown
that when the initial prior intensity of the random set of targets is a Gaussian mixture, the
posterior intensity at any time step is also a Gaussian mixture.
This chapter demonstrates the uniform convergence of the errors for each of the stages
of the Gaussian Mixture PHD Filter [34, 33] using results already established for the particle
implementation of the PHD filter [21] and Wiener’s Theory of Approximation [51]. Error
bounds are provided in L1 for the pruning and merging stage of the algorithm, based on
those established for the Gaussian Sum filter [40].
Extensions of the Gaussian Mixture PHD filter proposed in [33], namely the Extended
Kalman PHD filter and the Unscented Kalman PHD filter, are also discussed. Conver-
gence results for the Extended Kalman PHD filter are given based on the Gaussian Sum
filter developed by Sorenson and Alspach [40], the L1 convergence properties discussed in
Anderson and Moore [44], and the fact that densities can be represented by a linear com-
bination of Gaussians in L1 [51]. Taken with the convergence results of the L1 error, the
Gaussian mixture approximation then converges to the true posterior intensity.
The results show that, under linear Gaussian assumptions of the dynamic model, the
Gaussian Mixture posterior intensity can approximate the true posterior intensity to any
desired degree of accuracy. In addition, error bounds have been established for the pruning
and merging stages of the algorithm which ensure that the accuracy of these stages can be
controlled.
5.2 The Gaussian Mixture PHD Filter Algorithm
In this section, we describe the linear-Gaussian multiple target model and the recently de-
veloped Gaussian Mixture PHD filter.
The multiple target model for the PHD recursion is described here. Each target follows
a linear Gaussian dynamical model,
\[ f_{t|t-1}(x|\zeta) = \mathcal{N}(x; F_{t-1}\zeta, Q_{t-1}), \tag{5.1} \]
\[ g_t(z|x) = \mathcal{N}(z; H_t x, R_t), \tag{5.2} \]
where N (·;m,P) denotes a Gaussian density with mean m and covariance P, Ft−1 is the
state transition matrix, Qt−1 is the process noise covariance, Ht is the observation matrix,
and Rt is the observation noise covariance.
The survival and detection probabilities are state independent, pS,t(x) = pS,t , and pD,t(x) =
pD,t . The intensities of the spontaneous birth and spawned targets are Gaussian mixtures,
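\[ \gamma_t(x) = \sum_{i=1}^{J_{\gamma,t}} w^{(i)}_{\gamma,t}\,\mathcal{N}(x; m^{(i)}_{\gamma,t}, P^{(i)}_{\gamma,t}), \tag{5.3} \]
\[ \beta_{t|t-1}(x|\zeta) = \sum_{j=1}^{J_{\beta,t}} w^{(j)}_{\beta,t}\,\mathcal{N}(x; F^{(j)}_{\beta,t-1}\zeta + d^{(j)}_{\beta,t-1}, Q^{(j)}_{\beta,t-1}), \tag{5.4} \]
where $J_{\gamma,t}$, $w^{(i)}_{\gamma,t}$, $m^{(i)}_{\gamma,t}$, $P^{(i)}_{\gamma,t}$, $i = 1,\ldots,J_{\gamma,t}$, are given model parameters that determine the
shape of the birth intensity. Similarly, $J_{\beta,t}$, $w^{(j)}_{\beta,t}$, $F^{(j)}_{\beta,t-1}$, $d^{(j)}_{\beta,t-1}$, and $Q^{(j)}_{\beta,t-1}$, $j = 1,\ldots,J_{\beta,t}$,
determine the shape of the spawning intensity of a target with previous state $\zeta$.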
THEOREM 8 Under the assumptions that each target follows a linear Gaussian dynamical
model, the survival and detection probabilities are constant, the intensities of the birth and
spawned targets are Gaussian mixtures, and that the posterior intensity at time t − 1 is a
Gaussian mixture of the form
\[ D_{t-1|t-1}(x) = \sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1}\,\mathcal{N}(x; m^{(i)}_{t-1}, P^{(i)}_{t-1}). \tag{5.5} \]
Then the predicted intensity to time $t$ is also a Gaussian mixture, and is given by
\[ D_{t|t-1}(x) = D_{S,t|t-1}(x) + D_{\beta,t|t-1}(x) + \gamma_t(x), \tag{5.6} \]
where $D_{S,t|t-1}(x)$ is the PHD of existing targets, $D_{\beta,t|t-1}(x)$ is the PHD for spawned targets,
and $\gamma_t(x)$ is the PHD of spontaneous birth targets. The density for existing targets, $D_{S,t|t-1}$,
is determined from the linear Gaussian model using the Kalman prediction equations,
\[ D_{S,t|t-1}(x) = p_{S,t}\sum_{j=1}^{J_{t-1}} w^{(j)}_{t-1}\,\mathcal{N}(x; m^{(j)}_{S,t|t-1}, P^{(j)}_{S,t|t-1}), \tag{5.7} \]
where
\[ m^{(j)}_{S,t|t-1} = F_{t-1}\,m^{(j)}_{t-1}, \tag{5.8} \]
\[ P^{(j)}_{S,t|t-1} = Q_{t-1} + F_{t-1}P^{(j)}_{t-1}F^T_{t-1}, \tag{5.9} \]
and similarly for the spawned target density, $D_{\beta,t|t-1}$,
\[ D_{\beta,t|t-1}(x) = \sum_{j=1}^{J_{t-1}}\sum_{\ell=1}^{J_{\beta,t}} w^{(j)}_{t-1}\,w^{(\ell)}_{\beta,t}\,\mathcal{N}(x; m^{(j,\ell)}_{\beta,t|t-1}, P^{(j,\ell)}_{\beta,t|t-1}), \tag{5.11} \]
where
\[ m^{(j,\ell)}_{\beta,t|t-1} = F^{(\ell)}_{\beta,t-1}\,m^{(j)}_{t-1} + d^{(\ell)}_{\beta,t-1}, \tag{5.12} \]
\[ P^{(j,\ell)}_{\beta,t|t-1} = Q^{(\ell)}_{\beta,t-1} + F^{(\ell)}_{\beta,t-1}P^{(j)}_{t-1}(F^{(\ell)}_{\beta,t-1})^T. \tag{5.13} \]
Proof:
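In the PHD prediction equation,
\[ D_{t|t-1}(x|\mathcal{F}_{t-1}) = \gamma_t(x) + \int \varphi_{t|t-1}(x,x_{t-1})\,D_{t-1|t-1}(x_{t-1}|\mathcal{F}_{t-1})\,dx_{t-1}, \tag{5.14} \]
we substitute the Gaussian state equation, constant probability of survival $p_{S,t}$, Gaussian
spontaneous birth and spawned target PHDs and the posterior at time $t-1$ to give
\begin{align}
D_{t|t-1}(x|\mathcal{F}_{t-1}) ={} & \sum_{i=1}^{J_{\gamma,t}} w^{(i)}_{\gamma,t}\,\mathcal{N}(x; m^{(i)}_{\gamma,t}, P^{(i)}_{\gamma,t}) \tag{5.15} \\
& + \int \left( p_{S,t}\,\mathcal{N}(x; F_{t-1}\zeta, Q_{t-1}) + \sum_{j=1}^{J_{\beta,t}} w^{(j)}_{\beta,t}\,\mathcal{N}(x; F^{(j)}_{\beta,t-1}\zeta + d^{(j)}_{\beta,t-1}, Q^{(j)}_{\beta,t-1}) \right) \sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1}\,\mathcal{N}(\zeta; m^{(i)}_{t-1}, P^{(i)}_{t-1})\,d\zeta \notag
\end{align}
(expanding the integral gives)
\begin{align}
={} & \sum_{i=1}^{J_{\gamma,t}} w^{(i)}_{\gamma,t}\,\mathcal{N}(x; m^{(i)}_{\gamma,t}, P^{(i)}_{\gamma,t}) \tag{5.16} \\
& + p_{S,t}\sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1}\int \mathcal{N}(x; F_{t-1}\zeta, Q_{t-1})\,\mathcal{N}(\zeta; m^{(i)}_{t-1}, P^{(i)}_{t-1})\,d\zeta \tag{5.17} \\
& + \sum_{j=1}^{J_{\beta,t}}\sum_{i=1}^{J_{t-1}} w^{(j)}_{\beta,t}\,w^{(i)}_{t-1}\int \mathcal{N}(x; F^{(j)}_{\beta,t-1}\zeta + d^{(j)}_{\beta,t-1}, Q^{(j)}_{\beta,t-1})\,\mathcal{N}(\zeta; m^{(i)}_{t-1}, P^{(i)}_{t-1})\,d\zeta. \tag{5.18}
\end{align}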
Clearly, the first summation is the PHD for spontaneous birth. The integrals in the second
and third summations can be simplified using the following property for Gaussians,
\[ \int \mathcal{N}(x; F\zeta + d, Q)\,\mathcal{N}(\zeta; m, P)\,d\zeta = \mathcal{N}(x; Fm + d, Q + FPF^T), \tag{5.19} \]
which gives the required result.
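The prediction step of Theorem 8 reduces to simple bookkeeping on the mixture components. The following is a minimal scalar sketch, assuming a one-dimensional state so that every matrix becomes a number; the function name `gm_phd_predict` and the tuple layout are illustrative choices, not part of the original algorithm description.

```python
from typing import List, Tuple

Component = Tuple[float, float, float]  # (weight, mean, variance)

def gm_phd_predict(posterior: List[Component], births: List[Component],
                   spawn: List[Tuple[float, float, float, float]],
                   p_s: float, f: float, q: float) -> List[Component]:
    """Scalar version of (5.6)-(5.13).

    spawn holds (weight, F_beta, d_beta, Q_beta) for each spawning term.
    """
    predicted = []
    # Surviving targets: Kalman prediction of each component, (5.7)-(5.9).
    for w, m, p in posterior:
        predicted.append((p_s * w, f * m, q + f * p * f))
    # Spawned targets: one component per (posterior, spawn) pair, (5.11)-(5.13).
    for w, m, p in posterior:
        for w_b, f_b, d_b, q_b in spawn:
            predicted.append((w * w_b, f_b * m + d_b, q_b + f_b * p * f_b))
    # Spontaneous births are added unchanged, (5.3).
    predicted.extend(births)
    return predicted

posterior = [(0.6, 0.0, 1.0), (0.4, 5.0, 2.0)]
births = [(0.1, -3.0, 4.0)]
spawn = [(0.05, 1.0, 0.5, 1.0)]
predicted = gm_phd_predict(posterior, births, spawn, p_s=0.95, f=1.0, q=0.5)
for component in predicted:
    print(component)
```

With two posterior components, one spawning term and one birth term, the predicted mixture has $2(1+1)+1 = 5$ components, in line with the count of terms in (5.6).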
THEOREM 9 Under the above assumptions, and that the predicted intensity to time t is a
Gaussian mixture of the form
\[ D_{t|t-1}(x) = \sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}). \tag{5.20} \]
Then the posterior intensity at time $t$ is also a Gaussian mixture, and is given by
\[ D_{t|t}(x) = (1-p_{D,t})\,D_{t|t-1}(x) + \sum_{z\in Z_t}\sum_{j=1}^{J_{t|t-1}} w^{(j)}_t(z)\,\mathcal{N}(x; m^{(j)}_{t|t}(z), P^{(j)}_{t|t}) \tag{5.21} \]
where the weights are calculated according to the closed form PHD update equation,
\[ w^{(j)}_t(z) = \frac{p_{D,t}\,w^{(j)}_{t|t-1}\,\mathcal{N}(z; H_t m^{(j)}_{t|t-1}, R_t + H_t P^{(j)}_{t|t-1}H_t^T)}{\kappa_t(z) + p_{D,t}\sum_{\ell=1}^{J_{t|t-1}} w^{(\ell)}_{t|t-1}\,\mathcal{N}(z; H_t m^{(\ell)}_{t|t-1}, R_t + H_t P^{(\ell)}_{t|t-1}H_t^T)}, \tag{5.22} \]
and the mean and covariance are updated with the Kalman filter update equations,
\[ m^{(j)}_{t|t}(z) = m^{(j)}_{t|t-1} + K^{(j)}_t(z - H_t m^{(j)}_{t|t-1}), \tag{5.24} \]
\[ P^{(j)}_{t|t} = [I - K^{(j)}_t H_t]\,P^{(j)}_{t|t-1}, \tag{5.25} \]
\[ K^{(j)}_t = P^{(j)}_{t|t-1}H_t^T(H_t P^{(j)}_{t|t-1}H_t^T + R_t)^{-1}. \tag{5.26} \]
Proof:
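The PHD update equation is given by
\[ D_{t|t}(x|\mathcal{F}_t) = \left[(1-p_{D,t}) + \sum_{z\in Z_t}\frac{\psi_{t,z}(x)}{\kappa_t(z) + \langle D_{t|t-1},\psi_{t,z}\rangle}\right] D_{t|t-1}(x|\mathcal{F}_{t-1}). \tag{5.27} \]
Substituting the Gaussian likelihood and Gaussian Mixture Prediction PHD, we get
\begin{align}
D_{t|t}(x|\mathcal{F}_t) ={} & (1-p_{D,t})\sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}) \tag{5.28} \\
& + \sum_{i=1}^{J_{t|t-1}}\sum_{z\in Z_t}\frac{p_{D,t}\,w^{(i)}_{t|t-1}\,\mathcal{N}(z; H_t x, R_t)}{\kappa_t(z) + p_{D,t}\sum_{\ell=1}^{J_{t|t-1}} w^{(\ell)}_{t|t-1}\int \mathcal{N}(\zeta; m^{(\ell)}_{t|t-1}, P^{(\ell)}_{t|t-1})\,\mathcal{N}(z; H_t\zeta, R_t)\,d\zeta}\,\mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}). \notag
\end{align}
Using the property of integrals of Gaussians (5.19) and the following property of
Gaussians,
\[ \mathcal{N}(z; Hx, R)\,\mathcal{N}(x; m, P) = \mathcal{N}(z; Hm, R + HPH^T)\,\mathcal{N}(x; m + K(z - Hm), (I - KH)P) \tag{5.29} \]
(where $K$ is the Kalman gain, $K = PH^T(HPH^T + R)^{-1}$), the result follows.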
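The closed-form update of Theorem 9 can be sketched in the same scalar setting. This is an illustrative sketch only: the state and measurement are one-dimensional, `clutter` is a hypothetical callable standing in for $\kappa_t(z)$, and the function names are not taken from the source.

```python
import math

def normal_pdf(x, mean, var):
    """Scalar Gaussian density N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gm_phd_update(predicted, measurements, p_d, h, r, clutter):
    """Scalar version of (5.21)-(5.26); clutter(z) plays the role of kappa_t(z)."""
    # Missed-detection terms: (1 - p_D) * D_{t|t-1}.
    updated = [((1.0 - p_d) * w, m, p) for w, m, p in predicted]
    for z in measurements:
        terms = []
        for w, m, p in predicted:
            s = h * p * h + r                 # innovation variance R + H P H^T
            k = p * h / s                     # Kalman gain (5.26)
            q = normal_pdf(z, h * m, s)       # N(z; H m, R + H P H^T)
            terms.append((p_d * w * q, m + k * (z - h * m), (1.0 - k * h) * p))
        norm = clutter(z) + sum(t[0] for t in terms)   # denominator of (5.22)
        updated.extend((wt / norm, m, p) for wt, m, p in terms)
    return updated

predicted = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
updated = gm_phd_update(predicted, measurements=[0.2, 3.9], p_d=0.9,
                        h=1.0, r=0.25, clutter=lambda z: 0.01)
for component in updated:
    print(component)
```

Each measurement contributes one new component per predicted component, so the output here has $2(1+2) = 6$ components, and the weights generated by a single measurement always sum to less than one.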
5.3 Convergence of the Errors
This section shows the L1 convergence of the Gaussian mixture PHD filter; in other words,
proving that each step in time of the PHD filter will maintain a suitable approximation
error that converges to zero as the number of Gaussians in the mixture tends to infinity.
This is achieved through the successive application of triangle inequalities and Hölder’s
inequality. Finally the observation update is shown to converge using an adaptation of the
previous result on particle PHD convergence [21].
Results for the convergence properties of the Gaussian Mixture PHD Filter are now
established. Convergence of the L1 error is first shown, $\lim_{J_t\to\infty} |\langle D^{J_t}_t - D_t,\phi\rangle| = 0$, for
any bounded measurable function $\phi$, where $D^{J_t}_t$ is the Gaussian mixture approximation to $D_t$
with $J_t$ Gaussian components. The $\langle\cdot,\cdot\rangle$ notation defines the usual inner product
\[ \langle D_t,\phi\rangle = \int D_t(x_t|Z_t)\,\phi(x_t)\,dx_t, \tag{5.30} \]
and the operator notations $f_{t|t-1}\phi$, $v f_{t|t-1}$ are defined by
\[ (f_{t|t-1}\phi)(x_{t-1}) = \int f_{t|t-1}(x_t|x_{t-1})\,\phi(x_t)\,dx_t, \tag{5.31} \]
\[ (v f_{t|t-1})(x_t) = \int v(x_{t-1})\,f_{t|t-1}(x_t|x_{t-1})\,dx_{t-1}. \tag{5.32} \]
Note that $\langle v f_{t|t-1},\phi\rangle = \langle v, f_{t|t-1}\phi\rangle$. Also the PHD prediction equation (5.14) can be written
as
\[ D_{t|t-1} = (p_{S,t}D_{t-1})\,f_{t|t-1} + D_{t-1}\beta_{t|t-1} + \gamma_t. \tag{5.33} \]
In the proofs, we use an instance of Hölder’s Inequality (see, for example, p. 27 of [64]),
\[ |\langle v,\phi\rangle| \le \|v\|_1\,\|\phi\|_\infty. \tag{5.34} \]
The data update equation assumes a Poisson model and, hence, is only an approxima-
tion. The clutter parameters need to be determined from the data and cannot be inferred
from the recursion. For the purpose of these proofs, it has been assumed that the correct
density ct and average number of Poisson clutter points λt are known.
THEOREM 10 Any density on Rd can be approximated as closely as desired in L1 by a
linear combination of Gaussian densities,
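\[ D(x) = \lim_{n\to\infty}\sum_{i=1}^{n} w^{(i)}\,\mathcal{N}(x;\mu_i,P_i). \tag{5.35} \]
Proof
This result is due to Wiener’s theorem on approximation [51].
This means that given any $\varepsilon > 0$, a positive integer $N$ can be found such that
\[ \int \left| D(x) - \sum_{i=1}^{n} w^{(i)}\,\mathcal{N}(x;\mu_i,P_i) \right| dx \le \varepsilon, \tag{5.36} \]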
for n ≥ N. This result shall be used to establish bounds for the error in the Gaussian ap-
proximation to the posterior intensity.
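To make (5.36) concrete, the sketch below measures the L1 error between a simple non-Gaussian density and Gaussian mixtures with an increasing number of components. The triangular target density, the regular grid of means and the choice of standard deviation equal to the grid spacing are all assumptions made for this illustration; they are one convenient construction, not the one used in Wiener's theorem.

```python
import math

def normal_pdf(x, mean, std):
    """Scalar Gaussian density with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def target_density(x):
    """Triangular density on [-1, 1] (an arbitrary, non-Gaussian test density)."""
    return max(0.0, 1.0 - abs(x))

def mixture_l1_error(n, step=0.002):
    """L1 distance between the target and an n-component Gaussian mixture.

    Components sit on a regular grid over [-1, 1]; each weight is the
    midpoint-rule mass of its cell and the standard deviation equals the
    cell width.
    """
    h = 2.0 / n
    means = [-1.0 + (i + 0.5) * h for i in range(n)]
    weights = [target_density(m) * h for m in means]
    error = 0.0
    x = -3.0
    while x < 3.0:                      # Riemann sum for the L1 integral
        mix = sum(w * normal_pdf(x, m, h) for w, m in zip(weights, means))
        error += abs(target_density(x) - mix) * step
        x += step
    return error

errors = {n: mixture_l1_error(n) for n in (4, 16, 64)}
print(errors)
```

The error decreases as components are added, which is the behaviour that (5.36) guarantees can be pushed below any $\varepsilon$.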
5.3.1 Initialisation
It is assumed that the initial intensity is known. By Theorem 10, this initial intensity can
be approximated to any arbitrary degree of accuracy, so that, for any bounded measurable
function $\phi$ and any given $\varepsilon_0 > 0$, there is a positive integer $J$ such that
\[ |\langle D_0 - D^{J_0}_0,\phi\rangle| \le \varepsilon_0\,\|\phi\|_\infty, \tag{5.37} \]
for any $J_0 > J$, using Hölder’s Inequality where
\[ \|D_0 - D^{J_0}_0\|_1 \le \varepsilon_0. \tag{5.38} \]
The notation vJ is used to denote the Gaussian mixture approximation to the density v,
where J is the number of Gaussians in the mixture.
5.3.2 Prediction Equation
Let us assume that the approximation of the posterior intensity, $D^{J_{t-1}}_{t-1}$, by a sum of Gaussians
converges uniformly to the true posterior intensity $D_{t-1}$. Then, given any $\varepsilon_{t-1} > 0$, an
integer $J$ can be found such that
\[ |\langle D_{t-1} - D^{J_{t-1}}_{t-1},\phi\rangle| \le \varepsilon_{t-1}\,\|\phi\|_\infty, \tag{5.39} \]
for $J_{t-1} \ge J$, using Hölder’s Inequality.
LEMMA 14 After the prediction step, there exist real numbers $b_{t|t-1}$, $d_t$ and $e_{t|t-1}$ such that
\[ |\langle D^{J_{t|t-1}}_{t|t-1} - D_{t|t-1},\phi\rangle| \le (b_{t|t-1}\,\varepsilon_{t-1} + d_t + e_{t|t-1})\,\|\phi\|_\infty, \tag{5.40} \]
where $d_t$ and $e_{t|t-1}$ depend on the spontaneous birth and spawned target models.
Proof
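Expanding the prediction density using equation (5.33) and using the triangle inequality,
\begin{align}
|\langle D^{J_{t|t-1}}_{t|t-1} - D_{t|t-1},\phi\rangle| \le{} & |\langle (p_{S,t}D_{t-1}f_{t|t-1})^{J_{t-1}} - p_{S,t}D_{t-1}f_{t|t-1},\phi\rangle| \notag \\
& + |\langle (D_{t-1}\beta_{t|t-1})^{J_{t-1}J_{\beta,t}} - D_{t-1}\beta_{t|t-1},\phi\rangle| + |\langle \gamma^{J_{\gamma,t}}_t - \gamma_t,\phi\rangle|. \tag{5.41}
\end{align}
Taking the first term on the right hand side, which concerns the predicted intensity for
existing targets, adding and subtracting $\langle p_{S,t}D^{J_{t-1}}_{t-1}, f_{t|t-1}\phi\rangle$, and using the triangle inequality
again we get
\begin{align}
|\langle (p_{S,t}D_{t-1}f_{t|t-1})^{J_{t-1}} - p_{S,t}D_{t-1}f_{t|t-1},\phi\rangle| \le{} & |\langle (p_{S,t}D_{t-1}f_{t|t-1})^{J_{t-1}},\phi\rangle - \langle p_{S,t}D^{J_{t-1}}_{t-1}, f_{t|t-1}\phi\rangle| \notag \\
& + p_{S,t}\,|\langle D^{J_{t-1}}_{t-1} - D_{t-1}, f_{t|t-1}\phi\rangle|, \tag{5.42}
\end{align}
where the first term on the right hand side is zero due to the linear Gaussian prediction model.
Moreover,
\begin{align}
(f_{t|t-1}\phi)(x_{t-1}) &= \int f_{t|t-1}(x_t|x_{t-1})\,\phi(x_t)\,dx_t \tag{5.43} \\
&\le \|\phi\|_\infty \int f_{t|t-1}(x_t|x_{t-1})\,dx_t \notag \\
&= \|\phi\|_\infty, \notag
\end{align}
where the last equation follows from the fact that $f_{t|t-1}(x_t|x_{t-1})$ is a transition density.
Hence,
\[ |\langle D^{J_{t-1}}_{S,t|t-1} - p_{S,t}D_{t-1}f_{t|t-1},\phi\rangle| \le p_{S,t}\,\|\phi\|_\infty\,\varepsilon_{t-1}, \tag{5.44} \]
for $J_{t-1} \ge J$.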
Now consider the birth model; there exists a constant dt and integer J such that
\[ |\langle \gamma^{J_{\gamma,t}}_t - \gamma_t,\phi\rangle| \le d_t\,\|\phi\|_\infty, \tag{5.45} \]
for Jγ,t ≥ J, since we assume that we can model this exactly.
Finally, for the spawned target model, adding and subtracting $\langle D^{J_{t-1}}_{t-1},\beta^{J_{\beta,t}}_{t|t-1}\phi\rangle$ and
applying the triangle inequality gives
\begin{align}
|\langle (D_{t-1}\beta_{t|t-1})^{J_{t-1}J_{\beta,t}} - D_{t-1}\beta_{t|t-1},\phi\rangle| \le{} & |\langle (D_{t-1}\beta_{t|t-1})^{J_{t-1}J_{\beta,t}},\phi\rangle - \langle D^{J_{t-1}}_{t-1},\beta^{J_{\beta,t}}_{t|t-1}\phi\rangle| \notag \\
& + |\langle D^{J_{t-1}}_{t-1},\beta^{J_{\beta,t}}_{t|t-1}\phi\rangle - \langle D_{t-1},\beta_{t|t-1}\phi\rangle|, \tag{5.46}
\end{align}
where the first term on the right is zero due to the linear Gaussian spawned target model. Using an
argument similar to the prediction for existing targets, equation (5.43), there exists a number
$e_{t|t-1}$, such that the second term is less than or equal to $e_{t|t-1}\|\phi\|_\infty$, for $J_{\beta,t}J_{t|t-1} \ge J$. This
number, $e_{t|t-1}$, is dependent on the L1 norm of the spawned target intensity, $\|\beta_{t|t-1}\|_1$. The
lemma is proved by combining the three results above and setting $b_{t|t-1} = p_{S,t}$.
5.3.3 Measurement Equation
Let us assume that the approximation of the prediction intensity, $D^{J_{t|t-1}}_{t|t-1}$, by a sum of
Gaussians converges uniformly to the true prediction intensity $D_{t|t-1}$. Then, using the same
arguments as in (5.37), for any $\varepsilon_{t|t-1} > 0$ an integer $J$ can be found such that
\[ |\langle D^{J_{t|t-1}}_{t|t-1} - D_{t|t-1},\phi\rangle| \le \varepsilon_{t|t-1}\,\|\phi\|_\infty. \tag{5.47} \]
LEMMA 15 After the measurement update step, there exists a real number bt , dependent
on the number of measurements such that
\[ |\langle D^{J_t}_t - D_t,\phi\rangle| \le b_t\,\varepsilon_{t|t-1}\,\|\phi\|_\infty. \tag{5.48} \]
We assume that the predicted intensity Dt|t−1 is non-zero. This is a reasonable assumption
as there would be no intensity to update when the measurements are received if it were zero.
Proof
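Using the convergence result for the particle PHD filter (Lemma 9), we have the inequality
\begin{align}
|\langle D^{J_t}_t - D_t,\phi\rangle| \le{} & (1-p_{D,t})\,|\langle D^{J_{t|t-1}}_{t|t-1} - D_{t|t-1},\phi\rangle| \tag{5.49} \\
& + \sum_{z\in Z_t}\left(\frac{1}{\langle D_{t|t-1},\psi_{t,z}\rangle}\left(\|\phi\|_\infty\left|\langle D_{t|t-1} - D^{J_{t|t-1}}_{t|t-1},\psi_{t,z}\rangle\right| + \left|\langle D^{J_{t|t-1}}_{t|t-1} - D_{t|t-1},\phi\psi_{t,z}\rangle\right|\right)\right). \notag
\end{align}
Applying the assumption (5.47) to each inner product, we find that the right hand side is less
than or equal to
\[ \varepsilon_{t|t-1}\,\|\phi\|_\infty\left((1-p_{D,t}) + \sum_{z\in Z_t}\left(\frac{2\|\psi_{t,z}\|_\infty}{\langle D_{t|t-1},\psi_{t,z}\rangle}\right)\right), \tag{5.50} \]
so that the lemma is proved with
\[ b_t = \left((1-p_{D,t}) + \sum_{z\in Z_t}\left(\frac{2\|\psi_{t,z}\|_\infty}{\langle D_{t|t-1},\psi_{t,z}\rangle}\right)\right). \tag{5.51} \]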
5.4 Pruning and Merging of Gaussian components
Since the number of Gaussians used to represent the Gaussian mixture increases at each
time step, methods are required to ensure that the complexity of the algorithm is controlled.
This is achieved through pruning, to eliminate the Gaussians with low weights, and merg-
ing, to combine Gaussians with similar means [33]. This section considers the errors
introduced in these stages and shows that they can be controlled. The first approximation
shows that a bound can be placed on the error introduced by eliminating terms with neg-
ligible weights, using a result for the Gaussian sum filter [40]. The second approximation
arises from the tendency of many terms to converge to the same result so that they can be
combined by adding their weights. When two terms are approximately equal, a bound on
the error can be introduced so that the errors introduced in the merging stage are within
tolerable limits, using another result for the Gaussian sum filter [43].
The number of Gaussian components used to represent the Gaussian mixture increases
without bound; at time t, the posterior intensity requires
\[ (J_{t-1}(1 + J_{\beta,t}) + J_{\gamma,t})(1 + |Z_t|) = O(J_{t-1}|Z_t|) \tag{5.52} \]
Gaussian components, where |Zt | is the number of measurements at time t and O(·) rep-
resents the asymptotic complexity. Clearly this has implications for the complexity of the
algorithm, so it would be useful to reduce the total number of components required to
represent the PHD. To alleviate these problems, components with small weights, $w^{(i)}_t$, are
pruned, and components with similar means, $m^{(i)}_t \approx m^{(j)}_t$, are merged. The full procedure
is given in Vo and Ma [33]. It is shown here that bounds can be put on the L1 error when
these methods are used.
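The growth described by (5.52) is easy to tabulate. The sketch below iterates the component count over a few time steps; the particular values of $J_{\beta,t}$, $J_{\gamma,t}$ and $|Z_t|$ are arbitrary assumptions chosen only to show how quickly the count grows without pruning or merging.

```python
def components_after_update(j_prev, j_spawn, j_birth, n_meas):
    """Number of Gaussian components in the posterior, as in (5.52)."""
    j_pred = j_prev * (1 + j_spawn) + j_birth   # survivors, spawned, births
    return j_pred * (1 + n_meas)                # missed detection + one per measurement

counts = [1]
for _ in range(5):
    counts.append(components_after_update(counts[-1], j_spawn=1, j_birth=2, n_meas=3))
print(counts)
```

With these values the count multiplies by roughly eight at every step, which illustrates why the reduction steps that follow are needed in practice.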
5.4.1 Pruning
The pruning stage of the algorithm allows us to drop terms with negligible weights. It is
shown here that the error introduced in this stage can be bounded. Suppose that the posterior
intensity at time t is given by the sum of Gaussians,
\[ D_t(x) = \sum_{i=1}^{J_t} w^{(i)}_t\,\mathcal{N}(x; m^{(i)}_t, P^{(i)}_t). \tag{5.53} \]
Assume, without loss of generality, that the components with indices $i = 1,\ldots,N_P$ are those
with weights, $w^{(i)}_t$, less than some specified threshold $\delta_1$. Prune these components and
replace $D_t(x)$ by $D^P_t(x)$,
\[ D^P_t(x) = \frac{\sum_{l=1}^{J_t} w^{(l)}_t}{\sum_{j=N_P+1}^{J_t} w^{(j)}_t}\sum_{i=N_P+1}^{J_t} w^{(i)}_t\,\mathcal{N}(x; m^{(i)}_t, P^{(i)}_t), \tag{5.54} \]
where components with indices $i = N_P+1,\ldots,J_t$ are the surviving components. The
following bound can be established (from Sorenson and Alspach [40]),
\[ \|D_t - D^P_t\|_1 \le 2\sum_{i=1}^{N_P} w^{(i)}_t \le 2N_P\,\delta_1. \tag{5.55} \]
This shows that the L1 error can be selected to fall within specified bounds for the pruning
stage of the algorithm.
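The bound (5.55) can be checked numerically on a small scalar mixture. In this sketch the mixture, the threshold and the integration grid are assumptions chosen for illustration, and the L1 distance is approximated with a midpoint rule rather than computed in closed form.

```python
import math

def normal_pdf(x, mean, var):
    """Scalar Gaussian density N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture(components, x):
    """Evaluate a mixture of (weight, mean, variance) components at x."""
    return sum(w * normal_pdf(x, m, p) for w, m, p in components)

def prune(components, threshold):
    """Drop components with weight below threshold and rescale the rest
    so that the total mass is unchanged, as in (5.54)."""
    kept = [c for c in components if c[0] >= threshold]
    scale = sum(c[0] for c in components) / sum(c[0] for c in kept)
    return [(scale * w, m, p) for w, m, p in kept]

def l1_distance(a, b, lo=-15.0, hi=15.0, step=0.005):
    """Midpoint-rule approximation of the L1 distance between two mixtures."""
    n = int(round((hi - lo) / step))
    return sum(abs(mixture(a, lo + (k + 0.5) * step) - mixture(b, lo + (k + 0.5) * step))
               for k in range(n)) * step

components = [(0.004, -2.0, 1.0), (0.003, 4.0, 0.5), (1.0, 0.0, 1.0), (0.8, 3.0, 2.0)]
pruned = prune(components, threshold=0.01)
removed_mass = sum(w for w, _, _ in components if w < 0.01)
error = l1_distance(components, pruned)
bound = 2.0 * removed_mass          # middle term of (5.55)
print(error, bound)
```

The rescaling in `prune` keeps the expected number of targets unchanged, and the measured error stays below twice the removed weight, as (5.55) predicts.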
5.4.2 Merging
Several methods for Gaussian mixture reduction using merging techniques have been
proposed for Gaussian sum filters. The first was derived by Alspach, who provided
an L1 error bound for approximating two Gaussian components with the same covariance
as follows [43]. Suppose that two components have the same covariance $P_t := P^{(1)}_t = P^{(2)}_t$,
and similar means, $m^{(1)}_t \approx m^{(2)}_t$, so that for some threshold $\delta_2$,
\[ (m^{(1)}_t - m^{(2)}_t)^T P_t^{-1}(m^{(1)}_t - m^{(2)}_t) \le (\delta_2)^2. \tag{5.56} \]
Consider approximating $D_t(x)$ by
\[ D^M_t(x) = \sum_{i=3}^{J_t} w^{(i)}_t\,\mathcal{N}(x; m^{(i)}_t, P^{(i)}_t) + w^{(l)}_t\,\mathcal{N}(x; m^{(l)}_t, P_t), \tag{5.57} \]
where the weight and mean of the new component are given by
\[ w^{(l)}_t = w^{(1)}_t + w^{(2)}_t, \tag{5.58} \]
\[ m^{(l)}_t = \frac{1}{w^{(l)}_t}\left(w^{(1)}_t m^{(1)}_t + w^{(2)}_t m^{(2)}_t\right), \tag{5.59} \]
then the following bound holds,
\[ \|D_t - D^M_t\|_1 \le \frac{2\,w^{(1)}_t w^{(2)}_t}{w^{(1)}_t + w^{(2)}_t}\,\delta_2. \tag{5.60} \]
Note that as the covariance decreases, the distance between the terms must also decrease to
retain the same bound. Unfortunately, this bound requires both covariance matrices to be
the same, which may be an unrealistic assumption.
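A quick numerical check of (5.60) for two scalar components with a shared variance is sketched below. The weights, means and variance are arbitrary assumptions, and the L1 distance is approximated by a midpoint rule on a finite interval.

```python
import math

def normal_pdf(x, mean, var):
    """Scalar Gaussian density N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

w1, m1 = 0.6, 0.0
w2, m2 = 0.4, 0.5
var = 1.0                                   # shared covariance P_t

delta = abs(m1 - m2) / math.sqrt(var)       # smallest delta_2 satisfying (5.56)
w_merged = w1 + w2                          # (5.58)
m_merged = (w1 * m1 + w2 * m2) / w_merged   # (5.59)

n_steps = 20_000
step = 20.0 / n_steps
error = 0.0
for k in range(n_steps):                    # midpoint rule on [-10, 10]
    x = -10.0 + (k + 0.5) * step
    before = w1 * normal_pdf(x, m1, var) + w2 * normal_pdf(x, m2, var)
    after = w_merged * normal_pdf(x, m_merged, var)
    error += abs(before - after) * step

bound = 2.0 * w1 * w2 / (w1 + w2) * delta   # right-hand side of (5.60)
print(error, bound)
```

For closely spaced means the measured error is typically far below the bound, since the bound is linear in $\delta_2$ while the actual error shrinks faster.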
Salmond proposed two techniques for merging Gaussian components named Joining
and Clustering algorithms [65]. In the Joining algorithm, the two components, $i$ and $j$,
which are closest using the distance measure
\[ \delta_3^2 = \frac{w^{(i)}_t w^{(j)}_t}{w^{(i)}_t + w^{(j)}_t}\,(m^{(i)}_t - m^{(j)}_t)^T P_t^{-1}(m^{(i)}_t - m^{(j)}_t) \tag{5.61} \]
are merged, where $P_t$ is the covariance of the entire mixture. It was shown that the minimum
distance increases monotonically as the reduction proceeds, and that it is bounded by the
dimension of the state space; the threshold is chosen to be a constant fraction of this bound.
In the Clustering algorithm, the Gaussians with the largest weights are chosen as principal
components which define cluster centres. The covariance in equation (5.61) is replaced
with the covariance of the principal component, and the components in a set $L$ within a specified
threshold can be merged with the following calculations to preserve the overall covariance
of the cluster.
\[ w^{(\ell)}_t = \sum_{i\in L} w^{(i)}_t, \tag{5.62} \]
\[ m^{(\ell)}_t = \frac{1}{w^{(\ell)}_t}\sum_{i\in L} w^{(i)}_t\,m^{(i)}_t, \tag{5.63} \]
\[ P^{(\ell)}_t = \frac{1}{w^{(\ell)}_t}\sum_{i\in L} w^{(i)}_t\left(P^{(i)}_t + (m^{(\ell)}_t - m^{(i)}_t)(m^{(\ell)}_t - m^{(i)}_t)^T\right). \tag{5.64} \]
This procedure was used in the original formulation of the Gaussian mixture PHD filter [33]
and is appropriate since the intensity is multi-modal, where the principal components rep-
resent the expected target states.
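The merge in (5.62)-(5.64) preserves the weight, mean and covariance of the cluster. A minimal scalar sketch, with an arbitrary three-component cluster assumed for illustration, is:

```python
def merge(components):
    """Moment-preserving merge of scalar components (w, m, p), as in (5.62)-(5.64)."""
    w = sum(c[0] for c in components)
    m = sum(wi * mi for wi, mi, _ in components) / w
    p = sum(wi * (pi + (m - mi) ** 2) for wi, mi, pi in components) / w
    return w, m, p

cluster = [(0.5, 1.0, 0.2), (0.3, 1.4, 0.3), (0.2, 0.8, 0.1)]
w, m, p = merge(cluster)

# Second moment of the original cluster, used to confirm the variance is preserved.
second_moment = sum(wi * (pi + mi ** 2) for wi, mi, pi in cluster) / w
print(w, m, p, second_moment - m ** 2)
```

The last printed value is the variance of the original cluster computed directly from its second moment, and it agrees with the merged variance up to rounding.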
Williams developed a reduction algorithm which considered the overall change in the
probability distribution by evaluating the cost of each possible action and selecting the one
which has the minimum effect on the entire mixture in an L2 sense [66]. The components
are merged with equations (5.62)-(5.64) above, which preserves the mixture mean and
covariance. This is good for probability distributions but it may not be desirable for intensities,
as it has the effect of smearing out the modes.
5.5 Non-linear Target Dynamic Models
This section considers the convergence for the nonlinear extensions of the Gaussian mixture
PHD filter proposed in [33]. As with the linear case, the survival and detection probabilities
are assumed constant and the birth and spawned target intensities are
Gaussian mixtures, but the state and observation processes can be relaxed to the nonlinear model:
\[ x_t = \phi_t(x_{t-1},\nu_{t-1}), \tag{5.65} \]
\[ z_t = h_t(x_t,\varepsilon_t), \tag{5.66} \]
where ϕt and ht are known nonlinear functions, νt−1 and εt are zero-mean Gaussian pro-
cess noise and measurement noise with covariances Qt−1 and Rt respectively. Due to the
nonlinearity of ϕt and ht , the posterior intensity can no longer be represented as a Gaussian
mixture. However, the proposed Gaussian mixture PHD filter can be adapted to accommo-
date models with mild nonlinearities.
The results here show that the intensity function can be approximated by a set of ex-
tended Kalman filters where the covariance of each separate Gaussian component is suffi-
ciently small for the time evolution of its mean and covariance to be calculated accurately.
These are based on the results established for the Gaussian sum filter [44]. In a low noise
environment, the EK PHD filter can be nearly optimal. In a high noise environment, it may
be necessary to reinitialise the algorithm such that the error covariance of each Gaussian is
sufficiently small. If these conditions cannot be met, then it may be more appropriate to use
the particle PHD filter [17], which can use non-linear dynamic models and non-Gaussian
state and observation noises, although this will result in a higher computational complexity.
We now establish the conditions for uniform convergence of the extended Kalman (EK)
PHD filter. It is shown that, as the covariance term tends to zero, the approximation is
optimal. In addition, convergence for the Unscented Kalman (UK) PHD filter is discussed.
5.5.1 Extended Kalman Prediction Equation
Using the PHD prediction equation,
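\[ D_{t|t-1}(x) = \int \varphi_{t|t-1}(x,\zeta)\,D_{t-1}(\zeta)\,d\zeta + \gamma_t(x), \tag{5.67} \]
we show that the predicted intensity for the EK PHD filter can be given by a sum of
Gaussians. The extended Kalman prediction tools for existing targets are given by
\[ w^{(j)}_{S,t|t-1} = p_{S,t}\,w^{(j)}_{t-1}, \tag{5.68} \]
\[ m^{(j)}_{S,t|t-1} = \phi_t(m^{(j)}_{t-1},0), \tag{5.69} \]
\[ P^{(j)}_{S,t|t-1} = G^{(j)}_{t-1}Q_{t-1}[G^{(j)}_{t-1}]^T + F^{(j)}_{t-1}P^{(j)}_{t-1}[F^{(j)}_{t-1}]^T, \tag{5.70} \]
where
\[ F^{(j)}_{t-1} = \left.\frac{\partial\phi_t(x_{t-1},0)}{\partial x_{t-1}}\right|_{x_{t-1}=m^{(j)}_{t-1}}, \qquad G^{(j)}_{t-1} = \left.\frac{\partial\phi_t(m^{(j)}_{t-1},\nu_{t-1})}{\partial\nu_{t-1}}\right|_{\nu_{t-1}=0}. \tag{5.71} \]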
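For a single component, (5.68)-(5.71) amount to pushing the mean through the nonlinear function and linearising around it. The sketch below is a scalar illustration; the dynamic model $x_t = x_{t-1} + 0.1\sin(x_{t-1}) + \nu_{t-1}$ is a hypothetical example chosen because its Jacobians are easy to write down, and the function name is illustrative.

```python
import math

def ek_predict_component(w, m, p, p_s, q, phi, dphi_dx, dphi_dnu):
    """Scalar extended Kalman prediction of one component, as in (5.68)-(5.71)."""
    f = dphi_dx(m, 0.0)      # Jacobian with respect to the state, at the mean
    g = dphi_dnu(m, 0.0)     # Jacobian with respect to the noise, at zero noise
    return p_s * w, phi(m, 0.0), g * q * g + f * p * f

# Hypothetical mildly nonlinear model: x_t = x_{t-1} + 0.1 sin(x_{t-1}) + nu_{t-1}.
phi = lambda x, nu: x + 0.1 * math.sin(x) + nu
dphi_dx = lambda x, nu: 1.0 + 0.1 * math.cos(x)
dphi_dnu = lambda x, nu: 1.0

w_pred, m_pred, p_pred = ek_predict_component(
    w=0.7, m=0.0, p=0.5, p_s=0.9, q=0.2,
    phi=phi, dphi_dx=dphi_dx, dphi_dnu=dphi_dnu)
print(w_pred, m_pred, p_pred)
```

In a full filter this function would be applied to every component of the posterior mixture, with the spawned and birth terms handled as in the linear case.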
LEMMA 16 If we know the dynamic model, and the posterior intensity at time t − 1 is
given by the sum of Gaussians,
\[ D_{t-1}(x) = \sum_{i=1}^{J_{t-1}} w^{(i)}_{t-1}\,\mathcal{N}(x; m^{(i)}_{t-1}, P^{(i)}_{t-1}), \tag{5.72} \]
then the predicted intensity approaches a sum of Gaussians in L1,
\[ D^{EK}_{t|t-1}(x) \to D_{S,t|t-1}(x) + D_{\beta,t|t-1}(x) + \gamma_t(x), \tag{5.73} \]
as $P^{(i)}_{t-1} \to 0$.¹
Proof
We assume that we know the birth intensity, γt , so that by Theorem 10, we can represent
this by a sum of Gaussians as closely as we wish in L1,
\[ \gamma_t(x) = \sum_{i=1}^{J_{\gamma,t}} w^{(i)}_{\gamma,t}\,\mathcal{N}(x; m^{(i)}_{\gamma,t}, P^{(i)}_{\gamma,t}). \tag{5.74} \]
For the existing targets, using the intensity at time $t-1$, $D_{t-1}(x)$, and the extended
Kalman filter prediction equations, we obtain an approximate expression for the predicted
estimate of each Gaussian component, $\mathcal{N}(x; m^{(i)}_{t-1}, P^{(i)}_{t-1})$, to a new Gaussian component,
$\mathcal{N}(x; m^{(i)}_{S,t|t-1}, P^{(i)}_{S,t|t-1})$. Then using the result for the EK Gaussian Sum filter [44], we find
that
\[ D^{EK}_{S,t|t-1}(x) \to p_{S,t}\sum_{j=1}^{J_{t-1}} w^{(j)}_{t-1}\,\mathcal{N}(x; m^{(j)}_{S,t|t-1}, P^{(j)}_{S,t|t-1}), \tag{5.75} \]
uniformly in $x$ as $P^{(i)}_{t-1} \to 0$ for $i = 1,\ldots,J_{t-1}$, where
\[ m^{(j)}_{S,t|t-1} = \phi_t(m^{(j)}_{t-1},0), \tag{5.76} \]
\[ P^{(j)}_{S,t|t-1} = G^{(j)}_{t-1}Q_{t-1}[G^{(j)}_{t-1}]^T + F^{(j)}_{t-1}P^{(j)}_{t-1}[F^{(j)}_{t-1}]^T. \tag{5.77} \]
Finally, we come to the predicted intensity for spawned targets, $\beta_{t|t-1}(x|\zeta)$. Using the
PHD prediction equations for the EK PHD filter, each of the Gaussian components at time
$t-1$ produces $J_{\beta,t}$ Gaussian components,
\[ \beta_{t|t-1}(x|m^{(i)}_{t-1}) = \sum_{l=1}^{J_{\beta,t}} w^{(l)}_{\beta,t}\,\mathcal{N}(x; F^{(l)}_{\beta,t-1}m^{(i)}_{t-1} + d^{(l)}_{\beta,t-1}, Q^{(l)}_{\beta,t-1}). \tag{5.78} \]
Similar to the result used for the prediction of existing targets, the sum over the $J_{t-1}$
components approaches a Gaussian sum
\[ D^{EK}_{\beta,t|t-1}(x) \to \sum_{j=1}^{J_{t-1}}\sum_{\ell=1}^{J_{\beta,t}} w^{(j)}_{t-1}\,w^{(\ell)}_{\beta,t}\,\mathcal{N}(x; m^{(j,\ell)}_{\beta,t|t-1}, P^{(j,\ell)}_{\beta,t|t-1}), \tag{5.79} \]
where
\[ m^{(j,\ell)}_{\beta,t|t-1} = F^{(\ell)}_{\beta,t-1}\,m^{(j)}_{t-1} + d^{(\ell)}_{\beta,t-1}, \tag{5.80} \]
\[ P^{(j,\ell)}_{\beta,t|t-1} = G^{(j)}_{\beta,t-1}Q_{\beta,t-1}[G^{(j)}_{\beta,t-1}]^T + F^{(j)}_{\beta,t-1}P^{(j)}_{\beta,t-1}[F^{(j)}_{\beta,t-1}]^T. \tag{5.81} \]
¹ The EK superscript refers to the extended Kalman approximation.
5.5.2 Extended Kalman Measurement Update
Using the EK PHD filter measurement update equation,
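\[ D_t(x) = [1 - p_{D,t}(x)]\,D_{t|t-1}(x) + \sum_{z\in Z_t}\frac{\psi_{t,z}(x)\,D_{t|t-1}(x)}{\kappa_t(z) + \int\psi_{t,z}(\xi)\,D_{t|t-1}(\xi)\,d\xi}, \tag{5.82} \]
we show that the posterior intensity converges to a sum of Gaussians uniformly in L1. The
PHD update components are given by
\[ S^{(j)}_t = U^{(j)}_t R_t[U^{(j)}_t]^T + H^{(j)}_t P^{(j)}_{t|t-1}[H^{(j)}_t]^T, \tag{5.83} \]
\[ K^{(j)}_t = P^{(j)}_{t|t-1}[H^{(j)}_t]^T[S^{(j)}_t]^{-1}, \tag{5.84} \]
\[ P^{(j)}_{t|t} = [I - K^{(j)}_t H^{(j)}_t]\,P^{(j)}_{t|t-1}, \tag{5.85} \]
where
\[ H^{(j)}_t = \left.\frac{\partial h_t(x_t,0)}{\partial x_t}\right|_{x_t=m^{(j)}_{t|t-1}}, \qquad U^{(j)}_t = \left.\frac{\partial h_t(m^{(j)}_{t|t-1},\varepsilon_t)}{\partial\varepsilon_t}\right|_{\varepsilon_t=0}. \tag{5.86} \]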
LEMMA 17 With the non-linear measurement equation zt = ht(xt ,εt) and the predicted
intensity given by the sum of Gaussians,
\[ D_{t|t-1}(x) = \sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}), \tag{5.87} \]
the updated density approaches the Gaussian sum
\[ D^{EK}_t(x) \to (1-p_{D,t})\,D_{t|t-1}(x) + \sum_{z\in Z_t} D_{D,t}(x;z), \tag{5.88} \]
uniformly in $x_t$ and $Z_t$ as $P^{(i)}_{t|t-1} \to 0$ for $i = 1,\ldots,J_{t|t-1}$.
Proof
Clearly the first term of the measurement update equation (5.82), $[1 - p_{D,t}(x)]\,D_{t|t-1}(x)$, is a
Gaussian sum, since the probability of detection $p_{D,t}(x) = p_{D,t}$ is assumed to be a constant and
$D_{t|t-1}(x)$ is given by the predicted intensity.
Taking the numerator of the term on the right inside the summation and using the
predicted intensity,
\begin{align}
\psi_{t,z}(x)\,D_{t|t-1}(x) &= p_{D,t}\,g_t(z|x)\,D^{EK}_{t|t-1}(x) \tag{5.89} \\
&= p_{D,t}\,\mathcal{N}(z; h_t(x), R_t)\sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(x; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1}), \tag{5.90}
\end{align}
(which, by Anderson and Moore [44] pp 215-216)
\[ \to p_{D,t}\sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(z; \eta^{(i)}_{t|t-1}, R_t + H^{(i)}_t P^{(i)}_{t|t-1}[H^{(i)}_t]^T)\,\mathcal{N}(x; m^{(i)}_{t|t}, P^{(i)}_{t|t}), \tag{5.91} \]
uniformly as $P^{(i)}_{t|t-1} \to 0$ for all $i = 1,\ldots,J_{t|t-1}$.
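Now consider the denominator,
\[ \kappa_t(z) + \int\psi_{t,z}(\xi)\,D_{t|t-1}(\xi)\,d\xi. \tag{5.92} \]
Taking the integral and the L1 convergence result discussed above,
\begin{align}
\int\psi_{t,z}(\xi)\,D_{t|t-1}(\xi)\,d\xi &= p_{D,t}\int\mathcal{N}(z; h_t(\xi), R_t)\sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,\mathcal{N}(\xi; m^{(i)}_{t|t-1}, P^{(i)}_{t|t-1})\,d\xi \tag{5.93} \\
&\to p_{D,t}\int\sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,q^{(i)}_t(z)\,\mathcal{N}(\xi; m^{(i)}_{t|t}, P^{(i)}_{t|t})\,d\xi, \tag{5.94}
\end{align}
where
\[ q^{(i)}_t(z) = \mathcal{N}(z; \eta^{(i)}_{t|t-1}, H^{(i)}_t P^{(i)}_{t|t-1}[H^{(i)}_t]^T + R_t). \tag{5.95} \]
Changing the order of the summation and integral, this is equal to
\begin{align}
& p_{D,t}\sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,q^{(i)}_t(z)\int\mathcal{N}(\xi; m^{(i)}_{t|t}, P^{(i)}_{t|t})\,d\xi \tag{5.96} \\
&= p_{D,t}\sum_{i=1}^{J_{t|t-1}} w^{(i)}_{t|t-1}\,q^{(i)}_t(z), \tag{5.97}
\end{align}
since each Gaussian density integrates to one, so that,
\[ D_t(x) \to (1-p_{D,t})\,D_{t|t-1}(x) + p_{D,t}\sum_{z\in Z_t}\sum_{i=1}^{J_{t|t-1}}\frac{w^{(i)}_{t|t-1}\,q^{(i)}_t(z)}{\kappa_t(z) + p_{D,t}\sum_{l=1}^{J_{t|t-1}} w^{(l)}_{t|t-1}\,q^{(l)}_t(z)}\,\mathcal{N}(x; m^{(i)}_{t|t}, P^{(i)}_{t|t}), \tag{5.98} \]
uniformly as $P^{(i)}_{t|t-1} \to 0$ for all $i = 1,\ldots,J_{t|t-1}$, where
\[ m^{(j)}_{t|t}(z) = m^{(j)}_{t|t-1} + K^{(j)}_t\left(z - h_t(m^{(j)}_{t|t-1})\right). \tag{5.99} \]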
5.5.3 The Unscented Kalman PHD Filter
Instead of linearising the model, as is the case with the extended Kalman filter, the
unscented Kalman filter [50] approximates the mean and covariance with a set of sigma points
using the unscented transform. It can be shown that the predicted mean converges to an
estimate which is accurate to second order, which is more accurate than the estimate given by
the extended Kalman filter, and that the predicted covariance converges to the same as that
estimated through linearisation using the EKF. In the unscented PHD filter, the unscented
transform is applied in the prediction step to each term in the Gaussian mixture and the
update step is the same as the Gaussian mixture PHD filter update. The convergence analysis
of the UK PHD filter is omitted here, and the interested reader is referred to the work by
Julier and Uhlmann [39] for an analysis of the convergence of the unscented Kalman filter.
5.6 Conclusions
A consequence of Wiener’s Theory of Approximation is that density functions can be ap-
proximated uniformly with a sum of Gaussians. This result has been used to show that the
error for the recently proposed Gaussian mixture PHD filter converges uniformly for each
of the steps in the algorithm. Error bounds have been provided for the pruning and merging
stages, which are used to reduce the number of Gaussian components, based on those es-
tablished for the Gaussian sum filter. These results give further theoretical justification for
the use of the Gaussian mixture PHD filter in multiple target tracking problems.
Proofs of uniform convergence are also derived for the extended Kalman PHD filter.
The accuracy of the unscented Kalman PHD filter is discussed as an extension to the results
already established for the unscented Kalman filter.
Chapter 6
PHD Filter Target Estimation in Sonar
Images
6.1 Introduction
One of the goals of the sonar research community is to develop Autonomous Underwater
Vehicles (AUVs), self-navigating robots which operate underwater. Such vehicles can be
equipped with a range of sensors including forward-look sonar, sidescan sonar and video
to enable them to navigate autonomously and undertake a range of missions, for example
mine countermeasures, pipeline inspection or seabed habitat mapping. To enable AUVs to
do this successfully, methods for detecting and tracking objects on the seabed are required
to aid path planning and navigation as well as using these techniques as an integral part of
the mission. The obvious initial application is to enable the vehicle to sense its environment
and prevent collision with any object. Although AUVs are typically equipped with inertial
navigation systems, they are prone to drifting and errors in the measured vehicle position
increase during the mission.
This chapter demonstrates an application of the Particle PHD Filter from chapter 4
for estimating a variable number of targets in a sequence of sonar images in the presence
of clutter. The objects of interest will either be stationary on the seabed or moving
through the water. Stationary objects will appear to move in the sonar image plane,
since it is the vehicle that is moving. Tracking the stationary objects can aid registration
of the sequence of images generated by the forward-look sonar, which could be useful for
concurrent mapping and localisation of the underwater terrain [67], AUV path planning [3]
and navigation [4].
Traditional multi-target tracking is based on coupling trackers such as Kalman fil-
ters, extended Kalman filters or particle filters with a data association technique (see Bar-
Shalom [5] for example). The aim of the data association process is to interpret which
measurements are due to the targets and which are due to false alarms. Another technique
which has been applied to sonar imagery uses optical flow calculations to estimate the
direction of motion [68].
The PHD Filter is a method of propagating a multi-modal measure within a unified
framework without associating the measurements and has the ability to estimate the number
and position of targets in data with clutter. Data association techniques are avoided as the
identities of the targets are not kept. This is a drawback of the PHD Filter tracker as often
continuity of identity is needed. Novel techniques for this shall be presented in chapter 7.
One of the advantages of the PHD Filter is its ability to track objects in clutter, which is
often the case in sonar data where there are many spurious measurements due to the noisy
data. The measurements are taken in the sonar reference plane so that a stationary object in
the global or world reference plane will be moving with respect to the underwater vehicle.
Whilst many of the objects to be tracked will be in the world reference plane, there could
also be moving objects which it may be necessary to track such as other vehicles, marine
mammals or fish. Thus the ability to track a variable number of targets in the presence of
missed detections and spurious measurements is advantageous in this application.
This section describes the method for multiple target tracking which has been imple-
mented for forward-looking sonar data. The Particle PHD-Filter can be viewed as an ex-
tension of the single target particle filter with the ability to track multiple targets without
data association. An application version of the Particle PHD-Filter from chapter 4 is given
here in pseudocode form. The system model describes the evolution of state with time i.e.
the motion of the underwater vehicle and the measurement model relates the measurements
to the state i.e. the objects on the seabed. New targets are introduced into the model by
the birth model which assigns M uniformly distributed particles at the end of the sector
for incoming objects into the FoV according to the rate in which the sonar moves across
the seabed (the new targets are assumed to come at the end of the sector as this is the new
section of seabed surveyed). Again, each particle represents a hypothesis about the speed
and position of a target, although the particles are not attached to any particular target.
In the target location estimation stage, the number of targets is estimated by summing all
particle weights and rounding to the nearest integer. A Gaussian mixture model is then
fitted to the particles to determine the target locations.
The Particle PHD Filter algorithm is first shown working on a simulated target trajectory
where the accuracy of the tracker can be determined. It is then demonstrated using real
data to show that this technique is applicable in a real scenario. The sonar data returns
from objects on the seabed have a much higher intensity than the surrounding region of
seabed due to a combination of higher reflectivity properties and the geometry of the sonar
imaging process which will result in multiple returns at the same time. The measurements
for the tracker have been obtained by thresholding the images and finding the centroid of
the regions above the threshold. The signal to noise ratio for objects is high so this simple
technique is effective for finding the target locations although there is often a large number
of spurious measurements. However, as will be demonstrated in the results, the estimated
positions converge to the actual target locations despite the false alarms.
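The measurement extraction described above (thresholding followed by region centroids) can be sketched as follows. This is an illustrative version only: the use of 4-connected regions, the function name and the toy image are assumptions, not details taken from the implementation used for the results.

```python
from collections import deque


def threshold_centroids(image, threshold):
    """Return the (row, col) centroid of each 4-connected region above threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill over one bright region.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                pr, pc = queue.popleft()
                pixels.append((pr, pc))
                for nr, nc in ((pr + 1, pc), (pr - 1, pc), (pr, pc + 1), (pr, pc - 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and not seen[nr][nc] and image[nr][nc] > threshold:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            mean_r = sum(p[0] for p in pixels) / len(pixels)
            mean_c = sum(p[1] for p in pixels) / len(pixels)
            centroids.append((mean_r, mean_c))
    return centroids


# Toy intensity image: one 2x2 highlight and one isolated bright pixel.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 0, 0, 0, 8],
]
print(threshold_centroids(image, 5))  # [(1.5, 1.5), (3.0, 4.0)]
```

Every region above the threshold yields one measurement, so bright patches of seabed produce false alarms alongside the true targets; these are the spurious measurements the filter must cope with.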
6.2 The Particle PHD Filter with State Estimation
The sequential Monte Carlo implementation of the PHD filter described in chapter 4 has
been implemented for multiple target state estimation of obstacles in sonar data. The algo-
rithm is initialised by distributing particles across the observation space, or field of view,
with randomly chosen states. The particles are predicted with the linear state equation into
the next time step and Gaussian noise is added. When the measurements are received, the
particle weights are updated with the PHD filter update equation. The sum of the weights
after the update step provides an estimate of the number of targets in the scene which is
used for the multiple target state estimation. The particles are resampled according to their
weights and reweighted so that each particle has the same weight. The multiple target
states are estimated using the Expectation-Maximisation algorithm to fit a Gaussian Mix-
ture model to the particle data to find the state estimates and covariances. The number of
Gaussian components fitted to the particles is the expected number of targets determined
from the total particle mass, or sum of the weights after the PHD update step.
Particle PHD filter Algorithm Implementation
step 0. (Initialisation, at t = 0.)
    for $i = 1, \dots, N_0$
        sample $x^{(i)}_0 \sim D_{0|0}$, the prior PHD.
        assign particle weight, $\omega^{(i)}_0$, the mass $\omega^{(i)}_0 = T_0/N$.
    set t = 1.
step 1. (Prediction Step, for t ≥ 1.)
    for $i = 1, \dots, N_{t-1}$
        Project particle with state equation $f_{t|t-1}(x_t \mid \xi^{i}_{t-1})$.
        Assign weight $\omega^{i}_{t|t-1} = p_S/N$ according to their probability of survival $p_S$,
        which is dependent on the position in the FoV.
    Introduce M particles at end of FoV for birth model.
        Assign weight $\omega^{i}_{t|t-1} = p_B/M$, where $p_B$ is the probability of target birth.
    Let $R_t = N_{t-1} + M$.
step 2. (Update Step, for t ≥ 1.)
    for $z \in Z_t$,
        compute $\langle \omega_{t|t-1}, \psi_{t,z} \rangle = \sum_{i=1}^{R_t} \psi_{t,z}(x^{(i)}_t)\,\omega^{(i)}_{t|t-1}$.
    for $i = 1, \dots, R_t$,
        update weights, $\omega^{(i)}_t = \left[ (1 - p_D) + \sum_{z \in Z_t} \frac{\psi_{t,z}(x^{(i)}_t)}{\kappa_t(z) + \langle \omega_{t|t-1}, \psi_{t,z} \rangle} \right] \omega^{(i)}_{t|t-1}$.
step 3. (Resampling Step, for t ≥ 1.)
    Compute the total particle mass, $T_t = \sum_{i=1}^{R_t} \omega^{(i)}_t$,
    set $N_t = N \cdot \mathrm{int}(T_t)$ (where $\mathrm{int}(T_t)$ is the nearest integer to $T_t$).
    Resample $\{\omega^{(i)}_t / T_t,\ x^{(i)}_t\}_{i=1}^{R_t}$ to get $\{T_t/N,\ x^{(i)}_t\}_{i=1}^{N_t}$.
    The particles each have weight $T_t/N$ after resampling.
step 4. (Target Estimation, for t ≥ 1.)
    The locations of the targets are found by fitting a Gaussian mixture model to the
    particles, where the number of mixture components is the expected number of targets
    at time t, $T_t = \sum_i \omega^{i}_t$.
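Steps 2 and 3 of the algorithm above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used for the results: the state is one-dimensional, `gaussian_likelihood` stands in for the measurement likelihood, all names and numbers are hypothetical, and the resampled weights are set to the total mass divided by the number of resampled particles so that the mass is preserved.

```python
import math
import random


def gaussian_likelihood(z, x, sigma=1.0):
    """Hypothetical 1-D measurement likelihood g(z | x)."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def phd_update(particles, weights, measurements, p_detect, clutter,
               likelihood=gaussian_likelihood):
    """Step 2: w <- [(1 - pD) + sum_z psi_z(x) / (kappa(z) + <w, psi_z>)] * w."""
    # psi[k][i] = pD * g(z_k | x_i)
    psi = [[p_detect * likelihood(z, x) for x in particles] for z in measurements]
    # One normalising term per measurement: kappa(z) + sum_i psi_z(x_i) * w_i
    denom = [clutter(z) + sum(p * w for p, w in zip(row, weights))
             for z, row in zip(measurements, psi)]
    updated = []
    for i, w in enumerate(weights):
        gain = (1.0 - p_detect) + sum(psi[k][i] / denom[k]
                                      for k in range(len(measurements)))
        updated.append(gain * w)
    return updated


def resample(particles, weights, per_target, rng):
    """Step 3: estimate the target number from the mass, then resample."""
    total = sum(weights)
    n_targets = int(round(total))  # nearest integer to the total particle mass
    n_new = per_target * n_targets
    if n_new == 0:
        return [], [], 0
    chosen = rng.choices(particles, weights=weights, k=n_new)
    # Equal weights that sum to the total mass (an assumption of this sketch).
    return chosen, [total / n_new] * n_new, n_targets


rng = random.Random(0)
particles = [0.0, 0.1, 5.0, 5.2]
weights = [0.5, 0.5, 0.5, 0.5]
# Idealised settings (pD = 1, no clutter) so the total mass is easy to check.
updated = phd_update(particles, weights, [0.05, 5.1],
                     p_detect=1.0, clutter=lambda z: 0.0)
new_particles, new_weights, n_targets = resample(particles, updated,
                                                 per_target=10, rng=rng)
print(n_targets, len(new_particles))
```

With perfect detection and no clutter, each measurement contributes exactly one unit of mass, so two measurements give an estimated target number of two.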
6.3 Forward-Looking Sonar Implementation
The forward-looking sonar can be considered as having k beams, where the angular dis-
tance between their central axes is δθ degrees. The data for each of the beams is in the
form of acoustic intensity against time. The time values are related to the slant range to
the object. If isovelocity conditions are assumed then they can be translated into range
measurements. The measurements taken from the sonar are in polar co-ordinates and so
the tracker implemented for forward-scan sonar will track range and bearing measurements
obtained from thresholded sonar images.
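The following state space model is used:
\begin{align}
\mathbf{x}_t &= \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} \mathbf{x}_{t-1} + \begin{pmatrix} 0.5 & 0 \\ 1 & 0 \\ 0 & 0.5 \\ 0 & 1 \end{pmatrix} v_{t-1}, \tag{6.1} \\
o_t &= \begin{pmatrix} \arctan(y_t/x_t) \\ \sqrt{(x_t)^2 + (y_t)^2} \end{pmatrix} + n_t. \tag{6.2}
\end{align}
$v_t$ and $n_t$ are the process and measurement noises respectively, which are uncorrelated.
The state vector is defined as the 2D position and velocity vector of the target, relative to a
fixed external reference frame:
\begin{equation}
\mathbf{x}_t = \begin{pmatrix} x_t & \dot{x}_t & y_t & \dot{y}_t \end{pmatrix}^T. \tag{6.3}
\end{equation}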
The observation at time t (ot) is the bearing angle and range from the fixed observer
towards the target. Although more complex noise models can be used, Gaussian observation
and state noise distributions have been used for initial investigation with the filter.
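The model in (6.1) to (6.3) can be exercised with a short sketch. The noise standard deviations and the example state are hypothetical, and `math.atan2` is used in place of $\arctan(y_t/x_t)$; the two agree whenever $x_t > 0$, which is the case for targets ahead of the sonar.

```python
import math
import random

# Transition and noise-gain matrices from (6.1); state is (x, x_dot, y, y_dot).
F = [[1, 1, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 0, 1]]
G = [[0.5, 0],
     [1, 0],
     [0, 0.5],
     [0, 1]]


def predict(state, sigma_v, rng):
    """One step of (6.1) with zero-mean Gaussian process noise."""
    v = [rng.gauss(0.0, sigma_v), rng.gauss(0.0, sigma_v)]
    return [sum(F[r][c] * state[c] for c in range(4))
            + sum(G[r][c] * v[c] for c in range(2))
            for r in range(4)]


def observe(state, sigma_bearing, sigma_range, rng):
    """Bearing and range measurement of (6.2) with additive Gaussian noise."""
    x, _, y, _ = state
    bearing = math.atan2(y, x) + rng.gauss(0.0, sigma_bearing)
    distance = math.hypot(x, y) + rng.gauss(0.0, sigma_range)
    return bearing, distance


rng = random.Random(1)
state = [40.0, -1.0, 2.0, 0.0]  # hypothetical target 40 m ahead, closing at 1 m per step
for _ in range(3):
    state = predict(state, sigma_v=0.05, rng=rng)
    print(observe(state, sigma_bearing=0.01, sigma_range=0.2, rng=rng))
```

Setting the noise levels to zero reduces the prediction to constant-velocity motion, which is a convenient check of the matrices.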
The motion of the sonar is assumed to be linear for the particle filter and so the objects
are moving towards the sonar. The FoV is a sector of 10 degrees with range 20m to 60m.
Due to the linear motion of the sonar and the FoV, new objects are most likely to appear
at the end of the sector and disappear at the beginning with the objects moving towards
the sonar and so the probabilities of birth pB and survival pS have been defined to reflect
this. The birth model will allow for new targets entering the FoV by distributing particles
uniformly at the end of the sector.
6.3.1 Simulated Data
To demonstrate the performance of the technique the tracker is firstly tested on simulated
data. The advantage of using simulated data is that it allows various different realistic
scenarios and trajectories to be created easily. The exact locations of the vehicle and objects
are known and thus the accuracy of the tracker can be determined.
A sequence of forward-looking sonar images is simulated using the Sonar Simulator
developed by Bell [69] which has the capability of modelling sonar in complex underwater
terrain. The artificial seabed is modelled by a 100 m × 100 m textured image. Spherical
objects of radius 0.5 m have been placed on the seabed.
Figure 6.1 shows the sequence of simulated sonar images from the above scenario,
where the highlights are created by the objects. The sonar has followed an approximately
linear trajectory with a small amount of deviation from this. For ease of display, the images
are shown on a rectangular grid although the data is in polar form. Blank lines separate the
images in the sequence.
The results of the tracking for the simulated sonar run are displayed in both the sonar
image reference frame and the global reference frame. In the sonar image reference frame
the objects are moving towards the sonar; figure 6.2 shows the measurements and estimated
positions with respect to the sonar. In the global reference frame (figure 6.3) the sonar
positions are marked on a global co-ordinate map and the objects are stationary. The actual
positions are shown along with the measurements and estimated positions. The tracker
estimates the number of targets in each image well, which is between 1 and 4. There
are a few outliers where the tracker has wrongly estimated a target location. This often
happens in the initialisation stage, where the particles have been uniformly spread and the
distribution of particles is not yet sufficiently localised onto the target.
Obtaining accurate navigation information can be a significant problem during AUV
missions. However, this technique is still robust when no navigation information is present.
The same scenario as above has been repeated, but the sequence of sonar images has been
simulated with the AUV following a sinusoidal trajectory. The tracking was then repeated
with the system having no knowledge of the actual motion, and the results in the global
reference frame are displayed in figure 6.4. The measurements were not as accurate as
those for the linear path (figure 6.3), but the tracker still follows the measurements well,
with only a few false estimates.
Figure 6.1: Sequence of Simulated Forward Scan Sonar Images with Objects: range (x axis), bearing (y axis)
[Plot: "Measurements and Estimated Positions in Sonar Reference Frame"; axes: x (m) against y (m); legend: measured positions, inferred positions, observer position]
Figure 6.2: Linear Tracking in Sonar Image Reference Frame
[Plot: "Motion of Sonar with Targets and Estimated Positions"; axes: x (m) against y (m); legend: target locations, measured positions, inferred positions, sonar position]
Figure 6.3: Linear Tracking in Global Reference Frame
[Plot: "Motion of Sonar with Targets and Estimated Positions"; axes: x (m) against y (m); legend: target locations, measured positions, inferred positions, sonar position]
Figure 6.4: Sinusoidal Sonar Tracking in Global Reference Frame
6.3.2 Real Data
The tracker was then applied to a sequence of real forward-look sonar images. The data
was obtained from a forward-looking sonar device fitted to an underwater vehicle where
the sonar scans a sector of seabed in the direction of the vehicle motion. The sequence of
18 images (figure 6.5) was obtained as the vehicle was flown towards a cylindrical object
lying on the seabed. The images are very noisy but the object can be seen in the sequence as
the small bright highlight moving from top right to bottom left in the sequence. The seabed
over which the sonar traverses appears to be composed of two different sediment types, one
of which provided higher intensity returns. This can be seen in the region before the target
in the first 10 images of the sequence.
Figure 6.5: Sequence of Real Forward-Scan Images: range (x axis), bearing (y axis)
The results of the tracking are illustrated in figure 6.6. The measurements and tracked
positions are in the sonar reference frame; the global positions are unknown since no
navigation information was provided with the data and no accurate ground truth of the
object's location was available. The location of the cylinder is given by the sequence of
points from the lower right hand region of figure 6.6, moving towards the centre of the
figure, as the vehicle moves closer to the object. In the first few images, there were many
false targets, or clutter points, due to bad observations of the cylinder as a result of high
intensity returns from a region of seabed. These are the group of measurements in the top
half of figure 6.6.
To show how well the implementation works with clutter, the tracker has been run on
the data forward in time (where there are a lot of clutter points initially and fewer in the
later images) and backward in time (where there are few clutter points initially and more in
the later images). This demonstration is to show how the initial conditions affect the per-
formance of the algorithm and the convergence as the clutter density at the start is different
in this sequence of images when run backward and forward. Running the algorithm in
the forward direction results in poor estimation initially but afterwards converges onto the
correct target location (see figure 6.6). When the algorithm is run on the data backwards,
the algorithm quickly converges onto the correct target and manages to predict the correct
location through the clutter (see figure 6.7).
These results show that if a sequence of images begins with few false alarms and then
encounters a cluttered region, the algorithm can still estimate the correct target, but it
performs worse if the cluttered region is at the start. This is to be expected, as the
distribution of the particles is propagated from one frame to the next, so if the particles are
predominantly located in the region of the true target they are more likely to track it well.
[Plot: "PHD Tracking Example on Real Sonar Data"; axes: x (m) against y (m); legend: measured positions, inferred positions, observer position]
Figure 6.6: Tracked Cylinder in Forward Direction
6.4 Discussion
An application of the particle PHD filter has been implemented for estimating a variable
number of objects in a sequence of forward-looking sonar images. The filter was shown
working on simulated data where the results could be displayed on a global map and then
on real sonar data with clutter for tracking a cylindrical object on the seabed. The simu-
lated data provided a test case scenario where accurate ground truth was available, but the
simulated sonar images contain significantly less noise and clutter than real data.
This technique also has the capacity to incorporate measurements obtained from other
sensing equipment such as video data although the implementation here is restricted to
sonar.
The identities of the objects are not determined in this implementation and so data
association techniques are not used. In many applications, knowledge of which target in
the current frame relates to which target in the previous frame is important and so the data
association problem needs to be addressed. Methods for incorporating this into the PHD
filter framework are presented in chapter 7. One of the advantages of the PHD Filter is its
ability to filter clutter and so the number of spurious measurements is reduced. Sonar data
is very noisy, which gives rise to many spurious measurements, and the PHD Filter copes
well with this.
[Plot: "PHD Tracking Example on Real Sonar Data"; axes: x (m) against y (m); legend: measured positions, inferred positions, observer position]
Figure 6.7: Tracked Cylinder in Backward Direction
Chapter 7
State Estimation and Track Continuity
for the Particle PHD Filter
7.1 Introduction
The output from the PHD filter provides a multimodal density from which we need to
estimate the states of the targets at each iteration. This chapter considers and compares
the different approaches for clustering data. We consider two techniques for estimating the
target states at each iteration, namely k-means clustering and mixture modelling via the
expectation-maximization algorithm.
The advantage of the PHD filter is that it can track a variable number of targets, estimat-
ing both the number of targets and their locations. It avoids the need for data association
techniques, since the identities of the individual targets are not required. Whilst this may
be advantageous if the main concern is where the targets are, it is a major drawback if it is
necessary to identify the trajectories of the different targets. Two methods for associating
the targets between frames have been reported in the literature. The first of these, by Panta
et al. [29], used the PHD filter for pre-filtering the data input to a Multiple Hypothesis
Tracker. The second technique, proposed by Lin [30], represents the PHD in a resolution
cell to differentiate the peaks of the PHD posterior, and validation gating was used to de-
termine the weights of the particles. The PHD filter estimated the number and locations of
the targets and the results of data association determined the peaks of the PHD.
This chapter presents two techniques for enabling track continuity with the particle PHD
filter based on the pseudo-code for the Particle PHD filter given in Table I.
Table I: Pseudo-code for the Particle PHD filter with track continuity.
step 0. (Initialization at t = 0.)
    for $i = 1, \dots, N_0$
        sample $x^{(i)}_0 \sim D_{0|0}$, the prior PHD.
        assign particle weight, $\omega^{(i)}_0$, the mass $\omega^{(i)}_0 = T_0/N$.
    set t = 1.
step 1. (Prediction Step, for t ≥ 1.)
    for $i = 1, \dots, N_{t-1}$
        sample $x^{(i)}_t$ from a proposal density $q_t(\cdot \mid x^{(i)}_{t-1}, Z_t)$.
        evaluate the predicted weights $\omega^{(i)}_{t|t-1} = \frac{\phi_{t|t-1}(x^{(i)}_t, x^{(i)}_{t-1})}{q_t(x^{(i)}_t \mid x^{(i)}_{t-1}, Z_t)}\,\omega^{(i)}_{t-1}$.
    for $i = N_{t-1} + 1, \dots, N_{t-1} + M$
        sample $x^{(i)}_t$ from another proposal density $p_t(\cdot \mid Z_t)$.
        compute the weights of newborn particles $\omega^{(i)}_{t|t-1} = \frac{1}{M} \frac{\gamma_t(x^{(i)}_t)}{p_t(x^{(i)}_t \mid Z_t)}$.
    Let $R_t = N_{t-1} + M$.
step 2. (Update Step, for t ≥ 1.)
    for $z \in Z_t$,
        compute $\langle \omega_{t|t-1}, \psi_{t,z} \rangle = \sum_{i=1}^{R_t} \psi_{t,z}(x^{(i)}_t)\,\omega^{(i)}_{t|t-1}$.
    for $i = 1, \dots, R_t$,
        update weights, $\omega^{(i)}_t = \left[ (1 - p_D) + \sum_{z \in Z_t} \frac{\psi_{t,z}(x^{(i)}_t)}{\kappa_t(z) + \langle \omega_{t|t-1}, \psi_{t,z} \rangle} \right] \omega^{(i)}_{t|t-1}$.
step 3. (Resampling Step, for t ≥ 1.)
    Compute the total particle mass, $T_t = \sum_{i=1}^{R_t} \omega^{(i)}_t$,
    set $N_t = N \cdot \mathrm{int}(T_t)$ (where $\mathrm{int}(T_t)$ is the nearest integer to $T_t$).
    Resample $\{\omega^{(i)}_t / T_t,\ x^{(i)}_t\}_{i=1}^{R_t}$ to get $\{T_t/N,\ x^{(i)}_t\}_{i=1}^{N_t}$.
    The particles each have weight $T_t/N$ after resampling.
step 4. (Target Estimation, for t ≥ 1.)
    Target state estimates obtained from PHD by clustering (tables II and III).
step 5. (Association, for t ≥ 2.)
    Estimates associated with existing target tracks,
    tracks initiated/deleted (tables V and VI).
7.2 Multi-Target State Estimation
This section addresses step 4 of the Particle PHD filter algorithm summarised in Table I,
namely estimating the target states from the particle representation of the PHD density. We
consider different techniques for clustering the data and compare the two preferred methods
for estimating the target states. The accuracy of the algorithms is measured against the true
target trajectories and the run-time is compared for each of the algorithms.
7.2.1 Cluster Analysis
The aim of cluster analysis is to separate data points into homogeneous groups, or clusters,
based on some discriminative criteria. Three main classes of technique are used in the lit-
erature [70]: agglomerative hierarchical clustering techniques, fitting mixture models, such
as the Expectation-Maximization (EM) algorithm, and optimization methods, such as the
k-means algorithm (e.g. [71]). If $N$ is the number of particles, then the time complexity of
the hierarchical clustering algorithm is $O(N^2 \log N)$, which is impractical for large numbers
of particles. If $T$ is the number of targets and $\tau$ is the number of iterations in the clustering
algorithm, then the time complexity of k-means is $O(\tau T N)$ and the time complexity of the
EM algorithm is $O(\tau T^2 N)$. These two techniques will be considered further.
Gaussian Mixture Modelling with the Expectation-Maximization Algorithm
Since we are using Gaussian noise for the state and observation equations, we can model the
posterior particle distribution as a multi-modal Gaussian and try to determine its parameters.
Let $T_t$ be the estimated number of targets determined from the total particle mass. Then
the set of parameters which specify the Gaussian mixture is
\begin{equation}
\theta_t = \{(\pi_{t,n}, m_{t,n}, S_{t,n})\}_{n=1}^{T_t}, \tag{7.1}
\end{equation}
where the tuple $\theta_{t,n} = (\pi_{t,n}, m_{t,n}, S_{t,n})$ contains the probability that a particle is in the nth
Gaussian and the mean and covariance of this Gaussian respectively. See Table II for a
description of this algorithm.
Clustering with the k-means Algorithm
The k-means clustering algorithm takes a set of points, in this case the particles, and
separates them into k partitions, $P_{t,1}, \dots, P_{t,k}$, with means $M_t = \{m_{t,1}, \dots, m_{t,k}\}$, called
centres, such that the mean squared distance from each point to its nearest centre is minimized.
One of the most common algorithms for k-means clustering is Lloyd's algorithm, which
is based on the observation that the best placement of a centre is at the centroid of the
associated cluster. Each stage of Lloyd's algorithm moves every centre, $m_{t,j}$, to the centroid
of its partition, $P_{t,j}$, and then updates the partition by recomputing the distance from each
point to its nearest centre. These steps are repeated until a convergence criterion is met. See
Table III for a description of this algorithm.
Table II: The EM Algorithm (step 4 of Table I).
given: particles $x^{(1)}_t, \dots, x^{(N_t)}_t$ and estimated target number $T_t$.
step 0. (Initialization.)
    For $n = 1, \dots, T_t$
        Initialize $(\pi^{(1)}_{t,n}, m^{(1)}_{t,n}, S^{(1)}_{t,n})$ with
            $\pi^{(1)}_{t,n} = 1/T_t$,
            $m^{(1)}_{t,n} = x^{(i)}_t$, where $i = \lfloor (n-1)(N_t-1)/(T_t-1) \rfloor + 1$,
            $S^{(1)}_{t,n} = \frac{1}{N_t} \sum_{i=1}^{N_t} x^{(i)}_t x^{(i)T}_t$,
        Compute $p(x|n,\theta^{(1)}) = \mathcal{N}(x; m^{(1)}_{t,n}, S^{(1)}_{t,n})$.
    Set j := 2.
repeat:
step 1. (Expectation.)
    For $n = 1, \dots, T_t$
        Calculate (provisional) probability particle has Gaussian n, $\pi_{t,n}$, mean, $m_{t,n}$,
        and covariance, $S_{t,n}$:
            $\pi_{t,n} = \frac{1}{N_t} \sum_{i=1}^{N_t} p(n|x^{(i)}_t, \theta^{(j)})$
            $m_{t,n} = \frac{1}{N_t \pi_{t,n}} \sum_{i=1}^{N_t} x^{(i)}_t\,p(n|x^{(i)}_t, \theta^{(j)})$
            $S_{t,n} = \frac{1}{N_t \pi_{t,n}} \sum_{i=1}^{N_t} (x^{(i)}_t - m_{t,n})(x^{(i)}_t - m_{t,n})^T\,p(n|x^{(i)}_t, \theta^{(j)})$
        Provisional Gaussian is $p(x|n,\theta^{(j)}) = \mathcal{N}(x; m^{(j)}_{t,n}, S^{(j)}_{t,n})$.
    Compute: $Q(\theta;\theta^{(j)}) = E[\log p(x|\theta) \mid x, \theta^{(j)}]$,
        by expanding the expectation,
            $= \sum_{n=1}^{T_t} \log p(x,n|\theta)\,p(n|x,\theta^{(j)})$,
        using $p(x,n|\theta) = p(n|x,\theta)\,p(x|\theta)$,
            $= \sum_{n=1}^{T_t} \sum_{i=1}^{N_t} \log[\pi_n p(x|\theta)]\,p(n|x,\theta^{(j)})$,
        and Bayes' rule,
            $= \sum_{n=1}^{T_t} \sum_{i=1}^{N_t} \log[\pi_n p(x|\theta)]\,\frac{p(x|n,\theta^{(j)})\,\pi_n}{\sum_{l=1}^{T_t} p(x|l,\theta^{(j)})}$.
step 2. (Maximization.)
    Maximize $Q(\theta;\theta^{(j)})$ with respect to $\theta \in \Omega(K)$ using Lagrange multipliers,
        $(\pi^{(j+1)}, m^{(j+1)}, S^{(j+1)}) = \arg\max_{(\pi,m,S) \in \Omega(K)} Q(\theta;\theta^{(j)})$.
    Set j := j + 1.
until: $|Q(\theta;\theta^{(j)}) - Q(\theta;\theta^{(j-1)})| < \varepsilon$, for specified threshold $\varepsilon$.
output: means and covariances of $T_t$ partitions $(x_{t,1}, S_{t,1}), \dots, (x_{t,T_t}, S_{t,T_t})$.
Table III: The k-means Algorithm (step 4 of Table I).
given: particles $x^{(1)}_t, \dots, x^{(N_t)}_t$ and estimated target number $T_t$.
step 0. (Initialization.)
    Choose $k = T_t$ particles at random to be the initial centres,
        $m^{(1)}_{t,1}, \dots, m^{(1)}_{t,T_t} := x^{(k_1)}_t, \dots, x^{(k_{T_t})}_t$.
    Set j := 2.
repeat:
step 1. (Partition.)
    Partition the particles, $P^{(j)}_{t,1}, \dots, P^{(j)}_{t,T_t}$, such that $x^{(i)}_t \in P^{(j)}_{t,l}$ if
        $\arg\min_n \|x^{(i)}_t - m^{(j)}_{t,n}\| = l$.
step 2. (Recalculate centres.)
    Calculate means $m^{(j)}_{t,l} = \mathrm{mean}(P^{(j)}_{t,l})$ for $l = 1, \dots, T_t$.
    Set j := j + 1.
until: $\left| \sum_{i=1}^{N_t} \sum_{n=1}^{T_t} \|x^{(i)}_t - m^{(j)}_{t,n}\| - \sum_{i=1}^{N_t} \sum_{n=1}^{T_t} \|x^{(i)}_t - m^{(j-1)}_{t,n}\| \right| < \varepsilon$
step 3. (Calculate covariances of partitions.)
    Calculate covariances $S^{(j)}_{t,l} = \mathrm{cov}(P^{(j)}_{t,l})$ for $l = 1, \dots, T_t$.
output: means and covariances of $T_t$ partitions $(x_{t,1}, S_{t,1}), \dots, (x_{t,T_t}, S_{t,T_t})$.
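The partition and recalculation steps of Table III can be sketched as below. This is an illustrative version of Lloyd's algorithm, not the C implementation used for the timings: the stopping rule monitors the total squared distance to the nearest centre, empty partitions keep their previous centre, the optional `init` argument is added so that the example is reproducible, and the covariance step is omitted. All names are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def centroid(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def kmeans(points, k, rng, init=None, tol=1e-9, max_iter=100):
    """Lloyd's algorithm: alternate partitioning and centre recalculation."""
    centres = list(init) if init is not None else rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    previous_cost = float("inf")
    for _ in range(max_iter):
        # Partition: assign each particle to its nearest centre.
        clusters = [[] for _ in range(k)]
        cost = 0.0
        for p in points:
            distances = [squared_distance(p, c) for c in centres]
            nearest = distances.index(min(distances))
            clusters[nearest].append(p)
            cost += distances[nearest]
        # Recalculate: move each centre to the centroid of its partition.
        centres = [centroid(c) if c else centres[i] for i, c in enumerate(clusters)]
        if abs(previous_cost - cost) < tol:
            break
        previous_cost = cost
    return centres, clusters


# Two well-separated groups of particle positions.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centres, clusters = kmeans(points, 2, random.Random(0), init=[(0, 0), (10, 10)])
print(centres)
```

Each pass costs one distance evaluation per particle per centre, which is the $O(\tau T N)$ behaviour quoted in section 7.2.1.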
7.2.2 Multi-target Miss Distance Metrics
To compare the accuracy of the two techniques (k-means and EM algorithms) for estimating
the target states, we need appropriate metrics. We consider both the Hausdorff distance and
the Wasserstein distance.
The Hausdorff distance is a common method for measuring the distance between two
sets, originating from pure mathematics. The Hausdorff distance provides a good means of
assessing overall localisation performance.
While the Hausdorff distance is good at assessing localization performance, it is insen-
sitive to different numbers of targets. Hoffman et al. [72] adopted the Wasserstein distance
from theoretical statistics as a means of defining a metric for multitarget distances that
penalizes estimation of an incorrect number of targets. It has been used to assess the per-
formance of the PHD filter [17] [29]; we will compare it with the Hausdorff distance. Table
IV describes the two metrics.
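Table IV: Multi-target Miss Distance Metrics
Hausdorff Distance
    Let $X_t$ and $\hat{X}_t$ be the finite sets of target states and estimated target states at time t.
    The Hausdorff distance between the two sets is defined as
        $d_H(X_t, \hat{X}_t) = \max\left\{ \max_{x_i \in X_t} \min_{\hat{x}_j \in \hat{X}_t} d(x_i, \hat{x}_j),\ \max_{\hat{x}_j \in \hat{X}_t} \min_{x_i \in X_t} d(\hat{x}_j, x_i) \right\}$.
Wasserstein Distance
    Let $X_t$ and $\hat{X}_t$ be the finite sets of target states and estimated target states at time t.
    The $L^p$ Wasserstein distance between the two sets is defined as
        $d^W_p(X_t, \hat{X}_t) = \inf_C \left( \sum_{x_i \in X_t} \sum_{\hat{x}_j \in \hat{X}_t} C_{ij}\,d(x_i, \hat{x}_j)^p \right)^{1/p}$,
    where $C$ is an $|X_t| \times |\hat{X}_t|$ matrix $(C_{ij})$ such that,
    for all $i = 1, \dots, |X_t|$ and $j = 1, \dots, |\hat{X}_t|$:
        $\sum_{i=1}^{|X_t|} C_{ij} = \frac{1}{|\hat{X}_t|}$, $\sum_{j=1}^{|\hat{X}_t|} C_{ij} = \frac{1}{|X_t|}$, $C_{ij} \geq 0$.
    The $L^\infty$ Wasserstein distance is defined as
        $d^W_\infty(X_t, \hat{X}_t) = \inf_C \max_{x_i \in X_t,\ \hat{x}_j \in \hat{X}_t} \tilde{C}_{ij}\,d(x_i, \hat{x}_j)$,
    where $\tilde{C}_{ij} = 1$ if $C_{ij} > 0$ and $\tilde{C}_{ij} = 0$ if $C_{ij} = 0$.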
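The two metrics in Table IV can be illustrated with a short sketch using the Euclidean distance. The Hausdorff function follows the definition directly. The Wasserstein function covers only the special case of equally sized sets, where an optimal transport matrix can be taken to be a scaled permutation matrix, and it uses a brute-force search that is only practical for a handful of targets; the general case needs a linear programme. Function names and test points are hypothetical.

```python
import math
from itertools import permutations


def hausdorff(truth, estimates):
    """Hausdorff distance between two finite sets of points."""
    forward = max(min(math.dist(x, e) for e in estimates) for x in truth)
    backward = max(min(math.dist(e, x) for x in truth) for e in estimates)
    return max(forward, backward)


def wasserstein_equal_size(truth, estimates, p=2):
    """L^p Wasserstein distance for sets of equal cardinality (brute force)."""
    n = len(truth)
    if n != len(estimates):
        raise ValueError("this sketch only handles sets of equal size")
    best = min(
        sum(math.dist(truth[i], estimates[perm[i]]) ** p for i in range(n))
        for perm in permutations(range(n))
    )
    return (best / n) ** (1.0 / p)


truth = [(0.0, 0.0), (10.0, 0.0)]
estimates = [(0.0, 1.0), (10.0, 0.0)]
print(hausdorff(truth, estimates))               # 1.0
print(wasserstein_equal_size(truth, estimates))  # sqrt(0.5), about 0.707

# An extra estimate close to a true target barely changes the Hausdorff distance.
print(hausdorff(truth, [(0.0, 0.0), (10.0, 0.0), (10.0, 0.5)]))  # 0.5
```

The last line shows why the Hausdorff distance is insensitive to the number of targets: a spurious estimate near a true target is penalised only by its distance to that target.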
7.2.3 Simulated Examples
This section demonstrates results on estimating target locations from the estimated PHD
(step 4 of the algorithm). Trajectories of targets have been simulated, and noise has been
added to generate the measurements. The k-means and EM algorithms have been run on the
particle cloud outputs within the iteration of the PHD filter to obtain target estimates. The
run time has been measured for both of the algorithms at each iteration. To determine the
accuracy of the algorithms, we compute the Hausdorff and Wasserstein distances between
the estimates and the true trajectories, using the Euclidean distance to determine individual
distances between targets. Comparing the error metrics between the outputs of the k-means
and EM algorithms will show if there is any significant difference in the performance.
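For simplicity, our simulated examples use linear Gaussian dynamics with the following
state space model:
\begin{equation}
\mathbf{x}_t = \begin{pmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{pmatrix} \mathbf{x}_{t-1} + \begin{pmatrix} T^2/2 & 0 \\ T & 0 \\ 0 & T^2/2 \\ 0 & T \end{pmatrix} v_{t-1}, \tag{7.2}
\end{equation}
and observation model:
\begin{equation}
o_t = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \mathbf{x}_t + n_t. \tag{7.3}
\end{equation}
$v_t$ and $n_t$ are the uncorrelated process and measurement noises, respectively.
The state vector is defined as the 2D position and velocity vector of the target:
\begin{equation}
\mathbf{x}_t = \begin{pmatrix} x_t & \dot{x}_t & y_t & \dot{y}_t \end{pmatrix}^T. \tag{7.4}
\end{equation}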
Example 1
The first example demonstrates how the time complexities of the algorithms are affected by
increasing the number of targets. The program was implemented in C and run on a 2.8GHz
Mobile Pentium 4 HT with 512Mb system memory and 1Mb cache, although the code was
not designed to be optimal. We assign 50 particles per target and an additional 40 particles
for newborn targets randomly distributed across the state space. We start with one target,
and introduce another at increments of 30 iterations, until there are nine in total (see Figure
7.2). The graph in Figure 7.1 shows the time taken to estimate the target locations using the
EM and k-means algorithms. The EM algorithm has a quadratic complexity in the number
of targets, whereas the k-means algorithm has a linear complexity in the number of targets.
When the number of targets is low, i.e. between one and three, the computation time to
estimate is fairly similar, but as the number of targets increases, the EM algorithm rapidly
becomes infeasible for realtime operations.
Because there is no clutter in this example, the estimated number of targets is the same
as the number of measurements, and the Hausdorff and Wasserstein distances are the same.
The simulated measurements and estimated positions for the k-means and EM algorithms
are shown in Figures 7.3 and 7.4. By inspection, we can see that there are fewer spurious
estimates using the k-means algorithm. Figures 7.5 and 7.6 show the Hausdorff distances
for each of the algorithms, as well as the maximum measurement errors. The error between
the estimates and the true positions is generally better than the error between the measure-
ments and the true positions with both algorithms. Because we are introducing a new target
at every iteration multiple of 30, there are spikes in the Hausdorff distance corresponding
to a poor initial estimate for the target. This rapidly decreases to below the measurement
error after a few iterations, when all the targets are well estimated. The k-means algorithm
provides fewer spurious estimates than the EM algorithm (which can be seen through fewer
spikes in the Wasserstein distance). Overall, in this example, the k-means algorithm has a
much better run-time and is more accurate than the EM algorithm.
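To illustrate why a single spurious estimate produces a spike in the error curves, the following sketch computes the Hausdorff distance between two finite point sets using its standard definition with a Euclidean base metric. The function name and the example points are assumptions made for illustration only.

```python
import math


def hausdorff(xs, ys):
    """Hausdorff distance between two non-empty finite point sets."""
    def directed(a_set, b_set):
        # Largest distance from a point of a_set to its nearest point in b_set.
        return max(min(math.dist(a, b) for b in b_set) for a in a_set)
    return max(directed(xs, ys), directed(ys, xs))


truth = [(0.0, 0.0), (10.0, 0.0)]
good = [(0.0, 1.0), (10.0, -1.0)]          # one estimate near each target
spurious = good + [(50.0, 0.0)]            # plus one far-away false estimate

d_good = hausdorff(truth, good)
d_spurious = hausdorff(truth, spurious)
```

Because the distance is driven by the worst-matched point, the single outlier in `spurious` dominates the result even though the other two estimates are unchanged.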
Figure 7.1: Target Estimation Example 1. Time Comparison: EM vs k-means. [Plot "Time for Target Estimation": time (seconds) against iteration number, for k-means and the EM algorithm.]
Example 2
In our second example, we have four targets with an average of one clutter point per itera-
tion, distributed according to a Poisson model. Figures 7.7 and 7.8 show the measurements,
including clutter points, and target estimates for each of the algorithms. The PHD filter esti-
mated the correct number of targets (four) except in two of the iterations, when it estimated
that there were five targets. Figures 7.9 and 7.10 show that the Hausdorff distance for the
k-means algorithm is generally lower than the measurement error, but the EM algorithm has
estimated poorly in a couple of cases (spikes in the graph above the measurement error).
Figures 7.11 and 7.12 show the Wasserstein L∞ distance, which is the same as the Hausdorff
distance except in the couple of iterations where the PHD filter has incorrectly estimated
the number of targets, which are shown by the two spikes on the right of the graph. In this
case, all the weights are assigned to the outliers. Again, the k-means algorithm outperforms
the EM algorithm in run-time, with results similar to Figures 7.1 and 7.2 from iterations
90−120, where there are four targets.
Figure 7.2: Target Estimation Example 1. True number of targets from PHD filter. [Plot "Number of Targets Estimate": estimated target number (target mass) against iteration number.]
7.2.4 PHD Filter Estimated Target Number
The estimated target number from the PHD filter is obtained by computing the total particle
mass (Table I, step 3). If this number is different to the actual number of particle clusters,
then incorrect state estimates may be obtained. A simple example of this with k-means is
when the estimated number of targets is 1 and the actual number of clusters is 2. The state
estimate in this case will be the mean, which will be somewhere between the two clusters.
Similar poor performance will be exhibited with the EM algorithm. When the number of
estimates is higher than the actual number, incorrect state estimates will be given.
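The failure mode described above can be reproduced with a small sketch. We assume here that the target number is the total particle mass rounded to the nearest integer (the rounding rule is our assumption), and we use the fact that k-means with a single cluster returns the mean of all particles. All names and values are illustrative.

```python
def estimated_target_number(weights):
    """Total particle mass, rounded to the nearest integer (assumed rounding rule)."""
    return round(sum(weights))


def cluster_mean(points):
    """Mean of a set of points; this is the k-means solution when k = 1."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


cluster_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
cluster_b = [(100.0, 100.0), (101.0, 100.0), (100.0, 101.0), (101.0, 101.0)]
particles = cluster_a + cluster_b
weights = [0.125] * len(particles)   # total mass 1.0 although two clusters exist

k = estimated_target_number(weights)
single_estimate = cluster_mean(particles)
```

Here the mass suggests one target, so the single state estimate falls midway between the two particle clusters and is close to neither.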
In our examples, we have used a high probability of detection (pD ≈ 1) and low clutter.
We have assumed that the false alarms have been filtered out, which is only reasonable in
low uncorrelated clutter. In high clutter environments, smaller clusters of particles form
around false alarms and the estimated number of targets and their states can be incorrect.
A practical application on sonar data recently demonstrated that the PHD filter with track
continuity gives comparable performance to tracking with Kalman filters when the clutter
is low [26].
Figure 7.3: Target Estimation Example 1. Measurements and estimated positions: EM algorithm. [Plot "Target Estimates": y position against x position, showing measurements and EM target estimates.]
Figure 7.4: Target Estimation Example 1. Measurements and estimated positions: k-means. [Plot "Target Estimates": y position against x position, showing measurements and k-means target estimates.]
Figure 7.5: Target Estimation Example 1. Hausdorff Distances: EM algorithm. [Plot "Hausdorff Target Error": Hausdorff distance against iteration number, showing measurement error and EM error.]
If the number of clusters is not known, it would be useful to be able to determine this
directly from the data. In Bouman’s unsupervised algorithm for modelling Gaussian mix-
tures [73], a measure of goodness of fit is found called the Rissanen criterion or Minimum
Description Length (MDL) estimator. This works by attempting to find the model order
which minimizes the number of bits required to code the data samples and parameter vector.
The final number of clusters chosen is the value which minimizes the MDL over possible
values of k.
One method for determining the number of clusters for the k-means algorithm is
v-fold cross-validation, which computes the mean squared distance between the particles
and their nearest target state estimate for each value of k. Plotted against k, this
distance exhibits a scree-plot pattern: it decreases rapidly as the number of
clusters increases and levels off around the true value.
Figure 7.6: Target Estimation Example 1. Hausdorff Distances: k-means. [Plot "Hausdorff Target Error": Hausdorff distance against iteration number, showing measurement error and k-means error.]
Although we do not use these techniques here, they could provide an alternative to
using the target number estimate from the PHD filter. An empirical analysis has been
demonstrated on the PHD particle distribution [25]. These techniques involve running the
respective algorithms over a range of values for the number of targets and so they increase
the run-time of the algorithms.
7.2.5 Time Complexity of PHD filter Tracker
The algorithm is initialised with N0 particles drawn from a prior distribution which requires
O(N0) calculations. In the prediction step, Nt particles are sampled from one proposal dis-
tribution and Mt from a birth proposal distribution which requires O(Nt +Mt) calculations.
In the update step, the weights are recalculated, requiring O((Nt + Mt)|Zt |) calculations.
Figure 7.7: Target Estimation Example 2. Measurements and estimated positions: EM algorithm. [Plot "Target Estimates": y position against x position, showing measurements and EM target estimates.]
Figure 7.8: Target Estimation Example 2. Measurements and estimated positions: k-means. [Plot "Target Estimates": y position against x position, showing measurements and k-means target estimates.]
Figure 7.9: Target Estimation Example 2. Hausdorff Distances: EM algorithm. [Plot "Hausdorff Target Error": Hausdorff distance against iteration number, showing measurement error and EM error.]
Figure 7.10: Target Estimation Example 2. Hausdorff Distances: k-means. [Plot "Hausdorff Target Error": Hausdorff distance against iteration number, showing measurement error and k-means error.]
Figure 7.11: Target Estimation Example 2. Wasserstein Distances: EM algorithm. [Plot "Wasserstein Target Error": Wasserstein distance against iteration number, showing measurement error and EM error.]
The resampling step requires $O(N_{t+1})$ calculations, where $N_{t+1} = N T_t$ depends on
the estimated number of targets $T_t$ (N being the number of particles allocated per target).
We have used the k-means algorithm here to estimate the target locations, which has linear
time complexity in the estimated number of targets and number of particles,
$O(T_t N_{t+1} n)$, where n is the number of iterations for the k-means algorithm.
7.2.6 Summary
In our experiments, the k-means has outperformed the EM algorithm for our task of extract-
ing the target states from the PHD filter particles in a linear Gaussian tracking scenario. The
k-means algorithm is faster both in terms of the asymptotic time complexity (k-means is lin-
ear in the number of targets whereas the EM algorithm is quadratic) and in the empirical
analysis shown here. The results from the different error metrics comparing the true trajectories
with the estimated trajectories using the two clustering algorithms showed that the
k-means algorithm has also made fewer errors than the EM algorithm. For a real-time tracking
algorithm with a variable number of targets, the quadratic time complexity of the EM
algorithm rapidly makes it infeasible. In the following section, we opt to use the k-means
algorithm for providing the estimated target states for track continuity and data association.
Figure 7.12: Target Estimation Example 2. Wasserstein Distances: k-means. [Plot "Wasserstein Target Error": Wasserstein distance against iteration number, showing measurement error and k-means error.]
7.3 Track Continuity
This section addresses step 5 of the algorithm shown in Table I, namely associating target
state estimates with target tracks. The data association problem in multiple target tracking
usually involves ensuring that the correct measurement is given to each stochastic filter so
that the trajectories of each target can be accurately estimated. This is referred to as mea-
surement to track association. The three main approaches in the literature are the Nearest
Neighbour Standard Filter (NNSF), the Joint Probabilistic Data Association Filter (JPDAF)
and the Multiple Hypothesis Tracking filter (MHT filter) [5]. Before describing these, the
required terminology is briefly outlined.
Each track i has an associated innovation covariance $I_{t,i}$, which defines a validation
region
$$V_{t,i}(\gamma) := \{ z : [z - \hat{z}_{t,i}]^T (I_{t,i})^{-1} [z - \hat{z}_{t,i}] \le \gamma \}. \qquad (7.5)$$
The predicted measurement $\hat{z}_{t|t-1,i} = H \hat{x}_{t|t-1,i}$ is obtained by projecting the previous
estimate using the motion model ($\hat{x}_{t|t-1,i} = F \hat{x}_{t-1,i}$) and then using the observation function H.
The difference between the new observation and the predicted measurement is called the
innovation: $\nu_{ji} = |z_{t,j} - \hat{z}_{t,i}|$. The set of targets at time t is $X_t = \{x_{t,1}, \ldots, x_{t,T_t}\}$, and the set
of measurements is $Z_t = \{z_{t,1}, \ldots, z_{t,m_t}\}$.
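The validation region (7.5) can be sketched as a simple gate on the squared Mahalanobis distance. The fragment below assumes two-dimensional measurements so that the covariance can be inverted in closed form. The function names, the covariance values and the gate size 9.21 (roughly the 99% point of a chi-square distribution with two degrees of freedom) are illustrative assumptions.

```python
def mahalanobis_sq(z, z_pred, S):
    """Return [z - z_pred]^T S^{-1} [z - z_pred] for a 2x2 covariance S."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("innovation covariance must be invertible")
    dx, dy = z[0] - z_pred[0], z[1] - z_pred[1]
    # S^{-1} = (1 / det) * [[d, -b], [-c, a]]
    return (dx * (d * dx - b * dy) + dy * (a * dy - c * dx)) / det


def validation_region(measurements, z_pred, S, gamma):
    """Keep the measurements that fall inside the gate V(gamma) of (7.5)."""
    return [z for z in measurements if mahalanobis_sq(z, z_pred, S) <= gamma]


S = [[4.0, 0.0], [0.0, 1.0]]                     # illustrative innovation covariance
measurements = [(2.0, 1.0), (10.0, 0.0), (0.0, -2.5)]
gated = validation_region(measurements, (0.0, 0.0), S, gamma=9.21)
```

The measurement at (10, 0) is rejected because it lies five standard deviations from the prediction along the x axis.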
The NNSF simply takes the validated measurement nearest to the predicted measure-
ment for updating each of the target states. This can result in problems as the nearest
validated measurement may be the same for two different targets. The Joint Probabilistic
Data Association Filter computes the joint probabilities for all the pairings between the pre-
dicted measurements and estimated target states. This technique also has to consider false
alarms from spurious measurements.
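The conflict that can arise with nearest-neighbour association is easy to reproduce. The sketch below assumes gating has already been applied and, for brevity, uses the Euclidean distance instead of the innovation-weighted distance. The function names and example points are hypothetical.

```python
import math


def nearest_neighbour(predicted, measurements):
    """Map each track index to the index of its nearest measurement."""
    return {
        i: min(range(len(measurements)), key=lambda j: math.dist(p, measurements[j]))
        for i, p in enumerate(predicted)
    }


def conflicts(assignment):
    """Return the measurement indices claimed by more than one track."""
    seen, shared = set(), set()
    for j in assignment.values():
        (shared if j in seen else seen).add(j)
    return shared


predicted = [(0.0, 0.0), (2.0, 0.0)]          # two closely spaced tracks
measurements = [(1.2, 0.0), (9.0, 0.0)]
assignment = nearest_neighbour(predicted, measurements)
shared = conflicts(assignment)
```

Both tracks select the first measurement, so one of them would be updated with an observation that belongs to the other target.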
The ideal MHT filter maintains probabilities of all possible associations at each time
step. Unlike the NNSF and JPDAF, MHT does not just consider the probabilities from the
previous time step, which allows for backtracking. It also allows for track initiation. In
practice, it is not feasible to keep track of all possible hypotheses as the time and computational
complexity grows exponentially. Techniques for reducing the complexity include
gating (ignoring measurements outside validated regions), pruning (eliminating low-probability
hypotheses) and merging (combining hypotheses into a single track).
The techniques for data association we will consider here are based on peak-to-track
association. In the PHD filter, estimates of the target locations are given at each time step,
but track continuity is not maintained. We present two methods for enabling identification
of the same targets between frames based on the target estimates provided by the PHD filter.
These techniques take advantage of the ability of the PHD filter to estimate the number and
locations of targets, and to filter out the clutter. The complexity of these techniques is less
than the MHT and JPDA filters because the target states are provided by the PHD filter
algorithm. We have assumed that the false alarms are filtered out by the PHD filter and
there is no backtracking.
The previous section showed that the k-means algorithm was a clear favourite for es-
timating the target states from the particle PHD estimate. We use it here as the preferred
choice for associating the target tracks between frames.
7.3.1 Particle Labelling Association
Our first method for data association is based on the observation that in the particle repre-
sentation of the multimodal density, the particles representing one of the modes will tend to
track that mode if the motion of the particles is modelled well. In previous implementations
of the Particle PHD filter, clustering techniques have been employed to extract the peaks of
the PHD distribution. Here, we extend this idea to assign labels to the particles based on a
partitioning of the data created by the k-means clustering technique.
The method can be informally explained as follows. At each iteration, partition the
particles in the position domain and give each particle in the same partition the same label
(partitioning is initially only done in the position domain as the variance of the particle
distribution is usually lower than in the velocity domain, so is a better discriminator, and
it is also computationally faster to do so). In subsequent iterations, when resampling, give
the children of a particle the same label as its parent. After resampling, repartition the data,
and if the majority of the particles in one partition have the same label, then associate these
partitions.
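The informal description above can be sketched in a few lines. We assume that resampling returns, for each child particle, the index of its parent, and that the k-means partition of the resampled particles is given as a list of partition indices. The strict-majority rule and all names are illustrative assumptions, not the exact thresholds of Table V.

```python
from collections import Counter


def inherit_labels(parent_labels, parent_index_of_child):
    """Resampling step: each child particle takes the label of its parent."""
    return [parent_labels[i] for i in parent_index_of_child]


def associate(partition_of_particle, labels):
    """Map each new partition to the label held by a strict majority of its particles."""
    votes = {}
    for part, label in zip(partition_of_particle, labels):
        votes.setdefault(part, Counter())[label] += 1
    result = {}
    for part, counts in votes.items():
        label, n = counts.most_common(1)[0]
        # None marks a partition with no majority label (a candidate new track).
        result[part] = label if 2 * n > sum(counts.values()) else None
    return result


parent_labels = ["A", "A", "B", "B"]
children = inherit_labels(parent_labels, [0, 0, 1, 2, 3, 3])
tracks = associate([0, 0, 0, 0, 1, 1], children)   # hypothetical k-means partition
```

In this example one particle labelled "B" is clustered with the "A" particles, but the majority vote still associates each partition with the correct track.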
The example state model for the particles used here will be the two-dimensional position
and velocity vector. If the data is not well partitioned in the position domain (i.e. not within
2 standard deviations), then it can be partitioned in the velocity domain; this will help to
keep track of targets that cross each other. This technique is presented in Table V.
Panta et al. [32] presented a technique similar to those described in this chapter, which
is based on labelling the clusters and finding the maximum sum of particles with the same
label from the previous time step (Table V, step 5, matrix A). If two partitions
in the current time step have mostly labels from the same partition in the previous
time step, then this can result in the wrong assignment. This problem is reduced here, as
we also consider the number of particles which have been resampled from the previous
time step (Table V, step 5, matrix C).
Table V: Track continuity Method 1 (used in conjunction with Table I).
Step 1. (Prediction Step, for t ≥ 2.)
  For $i = 1, \ldots, N_{t-1}$:
    Assign prediction labels to the particles, $L^P_t(x^{(i)}_t) := L_{t-1}(x^{(i)}_{t-1})$.
  Define prediction partitions $\{P^P_{t,1}, \ldots, P^P_{t,T_{t-1}}\} := \{P_{t-1,1}, \ldots, P_{t-1,T_{t-1}}\}$.
  For $i = 1, \ldots, M$:
    Assign labels to newborn particles, $L^P_t(x^{(i)}_t) := L_{NEW}$.
  Define newborn particle partition $P^P_{t,L_{NEW}}$.
Step 2. (Update Step, for t ≥ 2.)
  Define update partitions $\{P^U_{t,1}, \ldots, P^U_{t,T_{t-1}+1}\} := \{P^P_{t,1}, \ldots, P^P_{t,T_{t-1}}\} \cup P^P_{NEW}$.
Step 3. (Resampling Step, for t ≥ 2.)
  If $x^{(j)}_t \in \mathrm{Child}(x^{(i)}_t)$, then assign label $L^R_t(x^{(j)}_t) := L^U_t(x^{(i)}_t)$.
  The resampling partitions are defined accordingly, $\{P^R_{t,1}, \ldots, P^R_{t,T_{t-1}+1}\}$.
Step 4. (Multi-Target State Estimation, for t ≥ 1.)
  Determine target state estimates and covariances (see Tables II/III), $(\hat{x}_{t,1}, S_{t,1}), \ldots, (\hat{x}_{t,T_t}, S_{t,T_t})$.
  If there are state estimates $\hat{x}_{t,i}$ and $\hat{x}_{t,j}$ such that
    $\exp\{-\tfrac{1}{2}(H\hat{x}_{t,i} - H\hat{x}_{t,j})^T (H^T S_{t,i} H)(H\hat{x}_{t,i} - H\hat{x}_{t,j})\} < 4$,
  then repartition based on velocity.
  Assign labels to new partitions, $L^E_{t,1}, \ldots, L^E_{t,T_t}$.
Step 5. (Association, for t ≥ 2.)
  Define matrices A and C as follows:
    If $|\{i : x^{(i)}_t \in P^R_{t,j} \cap P^E_{t,k}\}| > \epsilon_1 N$, then $A_{j,k} = 1$, else $A_{j,k} = 0$.
    $C_{j,k} = |\{i : \mathrm{Child}(x^{(i)}_t) \in P^R_{t,j} \cap P^E_{t,k}\}|$.
  Associate estimates with tracks as follows:
    If $\sum_k A_{j,k} = 0$, delete track $L_{t,j}$;
    else if $\sum_k A_{j,k} = 1$, associate $P^R_{t,j}$ with $L_{t,k}$ for k such that $A_{j,k} = 1$;
    else if $\sum_k A_{j,k} > 1$, associate $P^R_{t,j}$ with $L_{t,k}$ for $k = \arg\max_k C_{j,k}$, and
      declare new tracks for targets $k \in \{k_1, \ldots, k_n\}$ such that $A_{j,k} = 1$.
7.3.2 Estimate-to-Track Association
At each stage of the PHD filter algorithm, the target states are estimated by clustering the
particles and obtaining means of the clusters. The second association method which we
present here uses these estimated states and finds the best association between them and the
predicted estimate derived from projecting the previous estimates with the motion model.
The method proceeds as follows. In step 4 of the particle filter algorithm, the estimated
target locations are found by clustering the data and taking the mean positions as the estimated
set of state vectors for the targets $\{\hat{x}_{t,1}, \ldots, \hat{x}_{t,T_t}\}$. Let F be the transition function for
the dynamic model that is used in the prediction model for the particles. Then, before receiving
any measurements, the predicted state vector for the estimated target $\hat{x}_{t-1,j}$ at time
t is $\hat{x}_{t|t-1,j} := F \hat{x}_{t-1,j}$ and the estimated set is $\{\hat{x}_{t|t-1,1}, \ldots, \hat{x}_{t|t-1,T_{t-1}}\}$.
The estimated target locations are known at each time step. The goal of the association
stage is to connect these locations between time steps so that there is continuity of identity
for each target. Thus, the association here does not involve the measurements, only the
estimated positions. This has three advantages. First, the estimated positions should give
better estimates than the measurements. Second, spurious measurements due to false alarms
should have been filtered out. Finally, the estimated state vectors have the unobservable di-
mensions such as velocity, which allows for better discrimination when the measurement
positions are close but velocities are different. This technique is presented in Table VI.
Table VI: Track continuity Method 2 (used in conjunction with Table I).
Step 5. (Association, for t ≥ 2.)
  For $j = 1, \ldots, T_{t-1}$:
    Use state equation to obtain predicted state estimate $\hat{x}_{t|t-1,j} := F \hat{x}_{t-1,j}$.
  Predicted target state estimates are $\{\hat{x}_{t|t-1,1}, \ldots, \hat{x}_{t|t-1,T_{t-1}}\}$.
  For $i = 1, \ldots, T_t$:
    Create validation gate for target state estimate, $V_{t,i}(\gamma) := \{x : [x - \hat{x}_{t,i}]^T (S_{t,i})^{-1} [x - \hat{x}_{t,i}] \le \gamma\}$.
  Evaluate $\beta_t$, the set of validated 1-1 correspondences between $\{\hat{x}_{t|t-1,1}, \ldots, \hat{x}_{t|t-1,T_{t-1}}\}$
    and $\{\hat{x}_{t,1}, \ldots, \hat{x}_{t,T_t}\}$.
  Find association $b_t \in \beta_t$ such that
    $b_t = \arg\max_{b \in \beta_t} \sum_b \exp\{-\tfrac{1}{2}(\hat{x}_{t,i} - \hat{x}_{t|t-1,j})^T (S_{t,i})^{-1} (\hat{x}_{t,i} - \hat{x}_{t|t-1,j})\}$.
  Declare new target tracks, $L_{t,k_1}, \ldots, L_{t,k_n}$, for target state estimates $\hat{x}_{t,k_1}, \ldots, \hat{x}_{t,k_n}$
    for which no association is made.
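The association step of Table VI can be sketched by exhaustive search over the validated one-to-one correspondences, which is only practical for a small number of targets. The fragment below assumes diagonal covariances (given as lists of variances) to avoid a matrix inverse. The function names, the gate size and the example values are illustrative assumptions.

```python
import math
from itertools import permutations


def gate_dist_sq(x, x_pred, var):
    """Squared Mahalanobis distance for a diagonal covariance (list of variances)."""
    return sum((a - b) ** 2 / v for a, b, v in zip(x, x_pred, var))


def best_association(estimates, variances, predictions, gamma):
    """Search validated one-to-one pairings; return (pairs, unassigned estimates)."""
    n_est = len(estimates)
    slots = list(range(len(predictions))) + [None] * n_est   # None = no predicted track
    best_score, best_pairs = -1.0, {}
    for perm in set(permutations(slots, n_est)):
        score, pairs = 0.0, {}
        for i, j in enumerate(perm):
            if j is None:
                continue
            d2 = gate_dist_sq(estimates[i], predictions[j], variances[i])
            if d2 > gamma:              # pairing falls outside the validation gate
                break
            score += math.exp(-0.5 * d2)
            pairs[i] = j
        else:
            if score > best_score:
                best_score, best_pairs = score, pairs
    unassigned = [i for i in range(n_est) if i not in best_pairs]
    return best_pairs, unassigned


estimates = [(0.0, 0.0), (10.0, 10.0), (50.0, 50.0)]
variances = [(1.0, 1.0)] * 3
predictions = [(10.5, 10.0), (0.0, 0.5)]
pairs, new_tracks = best_association(estimates, variances, predictions, gamma=9.0)
```

The third estimate has no validated correspondence, so it is returned in `new_tracks` and would be declared a new target track.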
7.3.3 Simulated Examples
We now demonstrate our two proposed methods for data association and compare the track-
ing results. Simulated trajectories of targets have been generated with added noise to gen-
erate measurements. We use a linear Gaussian model again, and present the estimated
trajectories of the targets. The k-means algorithm partitions the data and provides the target
estimates. The errors for the target estimation were shown in the previous section, so we
concentrate here only on tracking continuity. The particle output for the PHD-filter is the
same in both cases, so the target estimates are the same.
Figure 7.13: Data Association Example 1. Method 1. [Plot "Tracking with Data Association": y position against x position, showing measurements and Tracks 1 to 9.]
Example 1
Our first example demonstrates the tracking on measurements without clutter. Linear tra-
jectories for the targets have been randomly generated. The targets may enter and leave the
field of view, showing the capability for birth and death of targets. In this example, the data
starts with 3 targets and 2 targets enter at time t = 10, one at t = 15 and another at t = 20,
giving a total of 7 targets. Figures 7.13 and 7.14 show results for each of the methods, both
of which manage to follow the correct targets. The first method is initially unable to
keep track of target 7 for a couple of iterations, but tracks it after this. Similarly, with the
second method, target 7 is lost after it enters, but is picked up after a couple of iterations. This
implies that there is an insufficient number of particles in the PHD filter around this target
to estimate it well initially, but the estimates improve quickly.
Figure 7.14: Data Association Example 1. Method 2. [Plot "Tracking with Data Association": y position against x position, showing measurements and Tracks 1 to 8.]
Example 2
This section compares the two methods for their ability to track in a cluttered scenario,
where the measurements may be due to noisy observations or false alarms. The data starts
with 3 targets, 2 targets enter at time t = 10, one at t = 15 and another at t = 20, giving
a total of 7 targets. In addition to the measurements from the targets, there are additional
points generated according to a Poisson model with an average of 1 per iteration. The
estimated number of targets from the PHD filter is given in Figure 7.17. This graph shows
that the PHD filter generally gives the correct number, but it may estimate incorrectly due to
false alarms. Figure 7.18 shows the particle output from the PHD filter, the measurements
and the true target positions. The particle clusters are around the correct targets and not
the two clutter points. Figures 7.15 and 7.16 show the tracking and data association for this
example for methods 1 and 2 respectively. These figures show that, over the time period in which
the tracker is run, the total number of tracks in the second method (30) is higher than in
the first method (16). Since the estimates are determined by the PHD filter and clustering, this
comparison shows us the relative performance of track maintenance. As shown
in the figures, the first method is able to maintain tracks for longer and so performs better
than the second method.
Figure 7.15: Data Association Example 2. Method 1. Total number of tracks = 16. [Plot "Tracking with Data Association": y position against x position, showing measurements and Tracks 1 to 16.]
7.3.4 Summary
We have proposed two methods to enable track continuity for the PHD filter, based on
the output of the particle filter algorithm with clustering to partition the particles and pro-
vide the target estimates. Both methods have shown their ability to track sets of targets in
clutter and also handle the introduction of new targets. In the example with no clutter, both
algorithms performed comparably well, but when there is clutter, the first method has maintained
the tracks for longer. This can be explained by the nature of the two methods: the
first method uses the particles directly and keeps track of the individual particle movements
to validate and associate target tracks, while the second method uses the observation covariance
matrix to validate measurements, which is a much less flexible means of associating
measurements.
Figure 7.16: Data Association Example 2. Method 2. Total number of tracks = 30. [Plot "Tracking with Data Association": y position against x position, showing measurements and Tracks 1 to 30.]
7.4 Discussion
This chapter has addressed two fundamental issues for the particle PHD filter algorithm:
target estimation from the PHD filter and data association of the target estimates between
frames. A comparison of the clustering techniques for target estimation from the particles
has shown that the k-means algorithm is an effective technique for extracting target states,
both in terms of the accuracy of the estimates and the time taken to provide these estimates.
In particular, it provides a significant improvement in time complexity over the EM
algorithm when there is a higher number of targets.
Figure 7.17: Data Association Example 2. Estimated Target Number. [Plot "Number of Targets": estimated target number against iteration number, showing estimated target mass and true number of targets.]
This chapter proposed two novel methods for incorporating track continuity into the Par-
ticle PHD filter. These methods are simpler in complexity than other reported techniques
and have been illustrated using simulated data with clutter. The first of the techniques par-
titions the particles at the target extraction stage into clusters around the individual targets,
and these partitions are used between the frames to enable track continuity. The second
method estimates the target in the next frame via the previous target state estimate and the
motion model followed by a validation procedure. Newborn targets are located using the
additional particles provided by the birth-model. Both techniques have demonstrated their
potential for track continuity. The first technique performed better in the example with
clutter, as it uses the particles directly instead of estimating the targets and then using these
distilled results.
Figure 7.18: Data Association Example 2. Clustering Output. [Plot "Clustering Output from Iteration 27": measurements, true target positions and Tracks 1, 2, 3, 10, 11, 13 and 16.]
A recent study on the PHD filter has demonstrated that when the probability of detec-
tion, pD, is low, a track can be prematurely destroyed [74]. It may be possible to incorporate
a technique commonly used with the Kalman filter, updating a track when there is no de-
tection, into the PHD filter tracking framework. The track could be propagated with the
state equation so that the track could be maintained. This would involve maintaining more
tracks than those estimated at the current time step. Future work could also consider situa-
tions where false target states are estimated.
Chapter 8
The GM-PHD Filter Multiple Target
Tracker
8.1 Introduction
In the last chapter, track continuity methods were developed for the Particle PHD filter.
A similar method is developed in this chapter for the Gaussian mixture implementation
described in chapter 5.
The Gaussian Mixture Probability Hypothesis Density Filter (GM-PHD Filter) described
in chapter 5 provided a closed form solution to the PHD filter recursion for multiple target
tracking [33]. The posterior intensity function is estimated by a sum of weighted Gaussian
components whose means, weights and covariances can be propagated analytically in time.
In particular, the means and covariances are propagated by the Kalman filter.
The original Gaussian Mixture PHD filter algorithm provided a means of estimating
the number of targets and their states at each point in time. The method for determining
the targets simply used the weights of the Gaussian components and did not take into ac-
count temporal continuity. We show that if a target is not detected at each iteration, the
Gaussian components can still track the targets in the presence of some missed detections.
Furthermore, the trajectory of the target in the past, before it has been detected, can also be
determined by keeping the trajectories of each of the Gaussian components.
The original formulation of the GM PHD filter allowed targets to be spawned from ex-
isting targets. For simplicity, we have removed this functionality, although it is anticipated
that the algorithm presented here could be extended to incorporate this scenario.
8.2 The Gaussian Mixture PHD Filter Multiple Target
Tracker
In the previous chapter, a method for enabling track continuity for the Particle PHD filter
was developed. This technique directly uses the empirical PHD distribution, with a labelling
process to identify clusters of particles representing a target. A similar technique
for the Gaussian Mixture PHD filter is presented here, which labels each Gaussian instead
of each cluster of particles; unlike the particle method, it can operate in much
higher clutter levels. A comparison of the two techniques on real data will be given in the
next chapter.
The means and covariances of each Gaussian in the mixture are predicted and updated
with the Kalman filter equations. These are given here for convenience. The predicted state
estimate $m_{t|t-1}$ and state covariance $P_{t|t-1}$ to time t are given by
$$m_{t|t-1} = F_{t-1} m_{t-1}, \qquad (8.1)$$
$$P_{t|t-1} = Q_{t-1} + F_{t-1} P_{t-1} F_{t-1}^T. \qquad (8.2)$$
When measurement z is received, the updated estimate $m_{t|t}$ and covariance $P_{t|t}$ are given by
$$m_{t|t}(z) = m_{t|t-1} + K_t (z - H_t m_{t|t-1}), \qquad (8.3)$$
$$P_{t|t} = [I - K_t H_t] P_{t|t-1}, \qquad (8.4)$$
$$K_t = P_{t|t-1} H_t^T (H_t P_{t|t-1} H_t^T + R_t)^{-1}, \qquad (8.5)$$
where $K_t$ is the Kalman gain.
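As an illustration of (8.1) to (8.5), the following sketch applies one prediction and one update to a single Gaussian component. To remain self-contained it assumes a scalar measurement, so that the inverse in (8.5) reduces to a division. The helper names and the constant-velocity example values are our own assumptions.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def kalman_predict(m, P, F, Q):
    """(8.1)-(8.2): m_pred = F m and P_pred = Q + F P F^T."""
    m_pred = [sum(f * x for f, x in zip(row, m)) for row in F]
    FPFt = matmul(matmul(F, P), transpose(F))
    P_pred = [[q + v for q, v in zip(rq, rv)] for rq, rv in zip(Q, FPFt)]
    return m_pred, P_pred


def kalman_update(m_pred, P_pred, z, h, r):
    """(8.3)-(8.5) for a scalar measurement z = h . x + noise of variance r."""
    n = len(h)
    Ph = [sum(p * hj for p, hj in zip(row, h)) for row in P_pred]     # P H^T
    s = sum(hi * v for hi, v in zip(h, Ph)) + r                        # H P H^T + R
    K = [v / s for v in Ph]                                            # Kalman gain
    innovation = z - sum(hi * x for hi, x in zip(h, m_pred))
    m_upd = [x + k * innovation for x, k in zip(m_pred, K)]
    hP = [sum(h[i] * P_pred[i][j] for i in range(n)) for j in range(n)]  # H P
    P_upd = [[P_pred[i][j] - K[i] * hP[j] for j in range(n)] for i in range(n)]
    return m_upd, P_upd


F = [[1.0, 1.0], [0.0, 1.0]]                 # position-velocity model, unit time step
Q = [[0.0, 0.0], [0.0, 0.0]]                 # no process noise in this toy example
m_pred, P_pred = kalman_predict([0.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], F, Q)
m_upd, P_upd = kalman_update(m_pred, P_pred, z=3.0, h=[1.0, 0.0], r=2.0)
```

The update moves the mean halfway towards the measurement because the predicted position variance equals the measurement variance in this example.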
The algorithm presented here is initialised in Step 0 and then iterates through Steps 1
to 5:
Initialisation
Initialise the algorithm with the weighted sum of $J_0$ Gaussians,
$$D_{0|0}(x) = \sum_{i=1}^{J_0} w_0^{(i)} \mathcal{N}(x; m_0^{(i)}, P_0^{(i)}). \qquad (8.6)$$
Each Gaussian in the mixture is given a label,
$$L_0 = \{L_0^{(1)}, \ldots, L_0^{(J_0)}\}. \qquad (8.7)$$
The sum of weights,
$$\sum_{i=1}^{J_0} w_0^{(i)} = T_0, \qquad (8.8)$$
is the expected number of targets at the start of the algorithm.
Prediction
In the prediction step, each Gaussian component is predicted with (8.1) and (8.2) to give
$$D_{S,t|t-1}(x) = p_S \sum_{j=1}^{J_{t-1}} w_{t-1}^{(j)} \mathcal{N}(x; m_{S,t|t-1}^{(j)}, P_{S,t|t-1}^{(j)}), \qquad (8.9)$$
where $p_S$ is the probability of survival. In addition, new Gaussian components are added
for the spontaneous birth model
$$\gamma_t(x) = \sum_{i=1}^{J_{\gamma,t}} w_{\gamma,t}^{(i)} \mathcal{N}(x; m_{\gamma,t}^{(i)}, P_{\gamma,t}^{(i)}). \qquad (8.11)$$
The intensity, $D_{t|t-1}$, to time t is then
$$D_{t|t-1}(x) = D_{S,t|t-1}(x) + \gamma_t(x). \qquad (8.12)$$
The set of labels from the previous time step is concatenated with new labels from the
Gaussians introduced for the spontaneous birth model to form the set of prediction labels,
$$L_{t|t-1} = L_{t-1} \cup \{L_{\gamma_t}^{(1)}, \ldots, L_{\gamma_t}^{(J_{\gamma_t})}\}. \qquad (8.13)$$
Update
When the measurements, $Z_t = \{z_{t,1}, \ldots, z_{t,m_t}\}$, are received at time t, compute the posterior
intensity,
$$D_{t|t}(x) = (1 - p_D) D_{t|t-1}(x) + \sum_{z \in Z_t} \sum_{j=1}^{J_{t|t-1}} w_t^{(j)}(z) \mathcal{N}(x; m_{t|t}^{(j)}(z), P_{t|t}^{(j)}), \qquad (8.14)$$
where the means, $m_{t|t}^{(j)}$, and covariances, $P_{t|t}^{(j)}$, are computed using (8.3) and (8.4), and the
weights are calculated with the PHD filter update equation [33],
$$w_t^{(j)}(z) = \frac{p_D \, w_{t|t-1}^{(j)} \, \mathcal{N}(z; H_t m_{t|t-1}^{(j)}, R_t + H_t P_{t|t-1}^{(j)} H_t^T)}{\kappa_t(z) + p_D \sum_{\ell=1}^{J_{t|t-1}} w_{t|t-1}^{(\ell)} \, \mathcal{N}(z; H_t m_{t|t-1}^{(\ell)}, R_t + H_t P_{t|t-1}^{(\ell)} H_t^T)}. \qquad (8.15)$$
Here $p_D$ is the probability of detection and $\kappa_t(z) = \lambda_t c_t(z)$ is the clutter intensity, where
$\lambda_t$ is the expected number of clutter points and $c_t$ is the
distribution of these across the state space.
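The weight update (8.15) can be sketched for scalar measurements as follows. We assume that the predicted measurement means $H_t m_{t|t-1}^{(j)}$ and innovation variances $R_t + H_t P_{t|t-1}^{(j)} H_t^T$ have already been computed, and that the clutter intensity is supplied as a function. All names and numerical values are illustrative.

```python
import math


def gaussian_pdf(z, mean, var):
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def update_weights(pred_weights, pred_meas_means, innov_vars, measurements, p_d, clutter):
    """Return (missed-detection weights, {z: detection weights}) as in (8.14)-(8.15)."""
    missed = [(1.0 - p_d) * w for w in pred_weights]
    detected = {}
    for z in measurements:
        numerators = [p_d * w * gaussian_pdf(z, m, s)
                      for w, m, s in zip(pred_weights, pred_meas_means, innov_vars)]
        denom = clutter(z) + sum(numerators)      # kappa_t(z) + p_D * sum of likelihoods
        detected[z] = [q / denom for q in numerators]
    return missed, detected


weights, means, variances = [1.0, 1.0], [0.0, 10.0], [1.0, 1.0]
missed, no_clutter = update_weights(weights, means, variances, [0.1, 9.8],
                                    p_d=0.9, clutter=lambda z: 0.0)
_, with_clutter = update_weights(weights, means, variances, [0.1, 9.8],
                                 p_d=0.9, clutter=lambda z: 0.01)
```

Without clutter the detection weights for each measurement sum to one; a positive clutter intensity reduces that sum, reflecting the possibility that the measurement is a false alarm.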
There are $(1 + |Z_t|) J_{t|t-1}$ Gaussian components, $(1 + |Z_t|)$ for each prediction term. For
each component, the same label as its related prediction component is assigned to form the
set of update labels,
$$L_{t,u} = L_{t|t-1}^{D_{t|t-1}} \cup L_{t|t-1}^{z_1} \cup \ldots \cup L_{t|t-1}^{z_{|Z_t|}}. \qquad (8.16)$$
Pruning
The Gaussian components with low weights are eliminated in the pruning stage to ensure
that the complexity of the algorithm does not grow exponentially. Let the weights
$w_t^{(1)}, \ldots, w_t^{(N_P)}$ be those which are below the truncation threshold T, and let
$$D_{t|t}(x) := \frac{\sum_{l=1}^{J_t} w_t^{(l)}}{\sum_{j=N_P+1}^{J_t} w_t^{(j)}} \sum_{i=N_P+1}^{J_t} w_t^{(i)} \mathcal{N}(x; m_t^{(i)}, P_t^{(i)}). \qquad (8.17)$$
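A minimal sketch of the pruning step (8.17) is given below: components below the truncation threshold are discarded and the surviving weights are rescaled so that the total mass, and hence the expected number of targets, is preserved. The tuple layout and the threshold value are illustrative assumptions.

```python
def prune(components, threshold):
    """components: list of (weight, mean, cov, label) tuples.

    Keeps components with weight >= threshold and rescales them as in (8.17).
    """
    kept = [c for c in components if c[0] >= threshold]
    total_mass = sum(c[0] for c in components)
    kept_mass = sum(c[0] for c in kept)
    if kept_mass == 0.0:
        return []
    scale = total_mass / kept_mass
    return [(w * scale, m, P, label) for (w, m, P, label) in kept]


mixture = [(0.9, 0.0, 1.0, "a"), (0.05, 3.0, 1.0, "b"),
           (0.05, 7.0, 1.0, "c"), (1.0, 20.0, 1.0, "d")]
pruned = prune(mixture, threshold=0.1)
```

The labels of the surviving components are untouched, so track identity is carried through the pruning stage.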
Merging
In the merging stage, Gaussian components whose distance between the means falls within
a threshold, U, defined by the covariance matrix are merged. For example, if the means of
components i and j are such that
$$(m_t^{(i)} - m_t^{(j)})^T (P_t^{(i)})^{-1} (m_t^{(i)} - m_t^{(j)}) \le U, \qquad (8.18)$$
then merge. (The full procedure is given in [33] [46].) If two or more components still have
the same label $L_t^{(i)}$, then this is given to the one with the largest weight $w_t^{(i)}$ and new labels
are assigned to the other components.
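One common way to realise the merging criterion (8.18) is a greedy scheme that repeatedly takes the strongest remaining component and absorbs its neighbours by moment matching. The sketch below follows that scheme for scalar states; it is an assumption on our part and not necessarily the exact procedure of [33] [46]. The merged component keeps the label of the strongest member, and the distance is measured with the variance of that member.

```python
def merge(components, U):
    """components: list of (weight, mean, var, label) with scalar states."""
    remaining = sorted(components, key=lambda c: c[0], reverse=True)
    merged = []
    while remaining:
        _, m0, v0, label0 = remaining[0]            # strongest remaining component
        close = [c for c in remaining if (c[1] - m0) ** 2 / v0 <= U]
        remaining = [c for c in remaining if (c[1] - m0) ** 2 / v0 > U]
        w = sum(c[0] for c in close)
        m = sum(c[0] * c[1] for c in close) / w
        # Moment matching: spread of the means is added to the component variances.
        v = sum(c[0] * (c[2] + (m - c[1]) ** 2) for c in close) / w
        merged.append((w, m, v, label0))
    return merged


mixture = [(0.6, 0.0, 1.0, "a"), (0.4, 1.0, 1.0, "b"), (1.0, 20.0, 1.0, "c")]
result = merge(mixture, U=4.0)
```

Components "a" and "b" are combined into a single Gaussian labelled "a", while "c" is too far away to be merged with either.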
State Estimation
Target states are determined from Gaussians whose weights are above a specific threshold.
In addition, the Gaussians that have previously had weights above this threshold are also
taken to be target states. These are identified by their label, i.e. the set of live tracks from
time t is
Lt = L(i)t : w(i)
t > 0.5 (8.19)
162
and the set of estimates is
$$
\hat{X}_t = \{ m_t^{(i)} : L_t^{(i)} \in \mathcal{L}_j,\ j = 1, \dots, t \}. \tag{8.20}
$$
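The label-based state extraction in (8.19) and (8.20) can be sketched as below. The sketch assumes that each component carries a hashable label and that the set of labels confirmed so far is kept between time steps; the names `extract_states` and `confirmed` are hypothetical.

```python
def extract_states(components, confirmed, threshold=0.5):
    """Return (label, mean) pairs for the live tracks at the current time step.

    components: list of (weight, mean, label) at the current time step.
    confirmed:  set of labels whose weight has exceeded `threshold` at this or
                an earlier time step; it is updated in place.
    """
    for weight, _, label in components:
        if weight > threshold:
            confirmed.add(label)
    return [(label, mean) for (_, mean, label) in components if label in confirmed]


confirmed = set()
step_1 = extract_states([(0.9, 1.0, "a"), (0.2, 5.0, "b")], confirmed)
# The weight of track "a" drops below 0.5, but its label is already confirmed.
step_2 = extract_states([(0.3, 1.5, "a"), (0.1, 5.2, "b")], confirmed)
```

Keeping the confirmed labels across time steps is what gives the temporal continuity discussed in the text: a track is still reported when its weight dips, for example after a missed detection.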
The above procedure allows the determination of the trajectories of the Gaussian com-
ponents in the mixture by keeping the means associated with each identifying tag. In the
original formulation of the GM PHD filter as described in chapter 5, estimates of the tar-
get states were taken at each stage of the algorithm by choosing the components with the
maximum weights. In the version here, we have temporal continuity which enables us to
keep track of targets when their weights fall below the desired threshold. In addition, the
trajectory of the targets in the past can be determined by looking at the previous trajectory
of the Gaussian after the weight is above a given threshold. Once the weight falls below
another threshold, the Gaussian component is deleted indicating that it does not contribute
significantly to the intensity function and so the target is not likely to still exist. Note that if
pD,k < 1, the component is not deleted when a measurement is not received for a target, so
that we can continue to track even with missed detections. If the space requirements for this
do not allow all of the Gaussians to be kept in memory, tracks could be deleted if the weight
was not above a threshold for a specified number of updates. This procedure is significantly
better than the estimate-to-track association used in the particle implementation of the fil-
ter, which only considered estimates in the last frame and relied on the prediction instead of
the updated Gaussian. This shows that the GM PHD filter has the inherent ability to track
multiple targets with track continuity which shall be demonstrated in the simulations.
The probability of survival pS,k is adjusted for the expected lengths of the target tracks.
When this is too low, target tracks are lost more often and when it is too high, the tracks
continue for longer after the target has died. In the SMC version of the PHD filter, it was
reported that targets are prematurely destroyed when the probability of detection pD,k is
low [74]. This could have been due to the particle mass being used to determine
the number of targets and clustering to determine the state estimates. Similar problems
were not encountered here since the weights of the Gaussians were used to determine the
target states and these Gaussians were assumed to represent targets until the weight of the
Gaussian fell below the pruning threshold.
8.3 Simulations
Simulated examples have been created to test the performance of the GM PHD Filter Multi-
ple Target Tracker and the results of these are compared against the track-oriented Multiple
Hypothesis Tracker [75] with a batch of 10 frames where the log-likelihood ratio was used
to rank tracks and the best global hypothesis was selected for data outputs.
8.3.1 Example 1
In this example, a two-dimensional scenario with an unknown and time varying number of
targets has been simulated in clutter over the region [−1000,1000]× [−1000,1000]. The
state $x_t = [\, p_{x,t}, p_{y,t}, \dot{p}_{x,t}, \dot{p}_{y,t} \,]^T$ of each target consists of position $(p_{x,t}, p_{y,t})$ and velocity
$(\dot{p}_{x,t}, \dot{p}_{y,t})$, while the measurement is a noisy version of the position.
Each target has survival probability pS,k = 0.9, detection probability pD,k = 0.99 and
follows the linear Gaussian dynamics from the previous chapter.
We assume no spawning, and that the spontaneous birth intensity is Poisson with four
Gaussian terms distributed across the surveillance region,
$$
\gamma_t(x) = \sum_{i=1}^{4} 0.1\, \mathcal{N}(x; m_{\gamma,i}, P_\gamma).
$$
Note that this does not need to sum to one but reflects the expected number of spontaneously
appearing targets at time k.
The detected measurements are immersed in clutter that can be modelled as a Poisson
RFS Kt with intensity
$$
\kappa_t(z) = \lambda_t V u(z), \tag{8.21}
$$
where $u(\cdot)$ is the uniform density over the surveillance region, $V = 4 \times 10^6\,\mathrm{m}^2$ is the area of
the surveillance region, and $\lambda_t = 5 \times 10^{-6}\,\mathrm{m}^{-2}$ is the average number of clutter returns per
unit area which relates to 20 clutter measurements per scan.
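The arithmetic behind these figures can be checked with a short Python sketch. The constants are the values quoted in the text; the variable and function names are hypothetical. Integrating the clutter intensity over the surveillance region gives the expected number of clutter points per scan, and integrating the birth intensity gives the expected number of spontaneous births per time step.

```python
AREA = 4e6           # V, area of the surveillance region in m^2
CLUTTER_RATE = 5e-6  # lambda_t, average clutter returns per m^2


def clutter_intensity(z):
    """kappa_t(z) = lambda_t * V * u(z), with u uniform on [-1000, 1000]^2."""
    inside = all(-1000.0 <= coordinate <= 1000.0 for coordinate in z)
    u = 1.0 / AREA if inside else 0.0
    return CLUTTER_RATE * AREA * u


# The integral of a constant intensity over the region is intensity * area.
expected_clutter = clutter_intensity((0.0, 0.0)) * AREA

# Each of the four birth Gaussians integrates to one and has weight 0.1.
expected_births = 4 * 0.1
```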
The Gaussian mixture PHD filter is run with elimination threshold $T = 10^{-5}$,
merging threshold $U = 4$, and maximum number of Gaussian terms $J_{max} = 200$.
Figure 8.1 (top) shows the simulated scenario with the true target trajectories and an
average of 20 clutter points per scan. Figure 8.1 (bottom) gives results of the PHD filter on
a set of measurements over 100 iterations. The dots show the true target locations and the
lines show the estimated trajectories. It can be seen that the GM PHD Filter Tracker has
very few false tracks, can pick up a track very quickly, does not drop the tracks while the
target still exists and eliminates tracks shortly after the target leaves the surveillance region.
Five hundred sets of measurements for these target trajectories have been generated
to compare the two algorithms. The Wasserstein multi-target miss distance, described in
chapter 7, has been used to compare the accuracy of the estimates and also the expected
absolute error in the estimated number of targets.
When the estimated number of targets is incorrect, the Wasserstein distance puts all the
weight on the outliers. Figure 8.2 shows the results of the mean Wasserstein distance over
the 500 measurement sets for each time step. The spikes in the result for the GM-PHD filter
usually indicate either that a new target has entered the scene but has not yet been detected,
or that a target has died but has not yet been eliminated.
Error in Estimating the Number of Targets
The expected absolute error on the number of targets has been calculated for each of the
algorithms,
$$
E\,\big|\, |\hat{X}_t| - |X_t| \,\big|.
$$
Note that standard performance measures such as the mean square distance error are not
applicable to multi-target filters that jointly estimate number of targets and their states.
Figure 8.3 shows the absolute error in the estimation of the number of targets, averaged
over 500 measurement sets. The GM-PHD filter can reliably estimate the correct number
of targets; it has fewer false tracks and can initiate the correct tracks more easily.
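For a single time step, the expected absolute error in the number of targets can be estimated by averaging over the Monte Carlo runs. The sketch below assumes that the estimated target count from each run is available as a list of integers; the function name and the example counts are hypothetical and are not results from the thesis.

```python
def mean_abs_cardinality_error(estimated_counts, true_count):
    """Average of | |X_hat_t| - |X_t| | over Monte Carlo runs at one time step."""
    errors = [abs(n - true_count) for n in estimated_counts]
    return sum(errors) / len(errors)


# Four runs at one time step with three true targets (illustrative numbers).
error = mean_abs_cardinality_error([3, 2, 3, 4], true_count=3)
```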
8.3.2 Example 2
In this example, we consider the theoretical constraints of the algorithm and illustrate these
through a simulation. Consider a situation where we have two targets; ideally, this
would be represented by two Gaussians,
$$
D_t(x) = w_t^{(1)} \mathcal{N}(x; m_t^{(1)}, P_t) + w_t^{(2)} \mathcal{N}(x; m_t^{(2)}, P_t). \tag{8.22}
$$
(For simplicity, it is assumed that the covariance matrix is the same for each Gaussian.
This can be achieved through diagonalisation, since the covariance matrix is symmetric
and positive semi-definite.)
Suppose that the targets cross, then $D_t(x)$ is unimodal with mean $(m_t^{(1)} + m_t^{(2)})/2$ when
$(m_t^{(1)} - m_t^{(2)})^T P_t^{-1} (m_t^{(1)} - m_t^{(2)}) < 4$, see [76]. This means that the PHD will fail to
distinguish between targets within this separation. Furthermore, these components could actually
be merged into the same Gaussian if the means fall within the merging threshold, U . Thus,
if the tracks of the targets are to be maintained when the targets are too close, alternative
methods for data association need to be used. If the trajectories of the targets are known in
the past, these could be used to separate the tracks after the targets have crossed.
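The unimodality condition can be checked numerically in one dimension. The sketch below assumes two equally weighted components with unit variance, so that the quadratic form above reduces to the squared separation of the means; it counts the local maxima of the mixture on a grid. The function names, grid range and step size are hypothetical choices.

```python
import math


def mixture(x, separation):
    """Sum of two unit-variance Gaussians centred at +/- separation / 2.

    Normalising constants are omitted because they do not affect the modes.
    """
    a = separation / 2.0
    return math.exp(-0.5 * (x - a) ** 2) + math.exp(-0.5 * (x + a) ** 2)


def count_modes(separation, half_width=10.0, step=0.01):
    """Count the local maxima of the mixture on a grid that contains x = 0."""
    n = int(round(half_width / step))
    values = [mixture(i * step, separation) for i in range(-n, n + 1)]
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i] > values[i - 1] and values[i] >= values[i + 1]
    )


modes_close = count_modes(1.5)  # squared separation 2.25, below 4
modes_far = count_modes(3.0)    # squared separation 9.0, above 4
```

With a squared separation below 4 the grid search finds a single peak at the midpoint, and with a larger separation it finds two peaks, which is consistent with the criterion quoted from [76].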
A simulation of the above scenario has been created, with Gaussians from the spontaneous
birth intensity included to ensure that if a track is lost, then it can be recaptured. Targets 1
and 2 are born at the same time but at two different locations. These two targets travel along
straight lines and their tracks cross at k = 53s, see figure 8.4 for the paths of the targets.
Two sets of measurements have been generated to show how the tracker behaves with
crossing targets; figure 8.5 shows the crossing region with two outcomes. In the first
outcome, the target trajectories are correctly estimated through the crossing point. In the
second outcome, however, whilst the estimates from the GM-PHD filter are not affected, the
tracks follow the wrong trajectories after the crossing point. It is anticipated that this prob-
lem could be resolved by associating tracks from predictions before the Gaussians are in
the merging region with estimates after the targets have crossed, similar to techniques used
with multiple hypothesis tracking.
8.4 Conclusions
An algorithm has been presented for tracking multiple targets in high clutter density which
has the ability to estimate the number of targets, track the trajectories of the targets over
time, operate with missed detections and give the trajectories of the targets in the past once
a target has been identified. It has been shown to outperform the track-oriented Multiple
Hypothesis Tracker in its ability to operate in clutter with fewer false tracks and can initiate
and eliminate targets more accurately. The theoretical constraints of the proposed tracking
algorithm have been discussed in the case of crossing targets. It is anticipated that the prob-
lem of retaining the correct target identity in this scenario can be resolved by considering
the previous trajectories of targets.
Figure 8.1: True target positions (lines) and measurements (crosses) (top). GM-PHD estimated target trajectories (lines) and true positions (crosses) (bottom). [Each part plots x and y against time step.]
Figure 8.2: Mean Wasserstein Distance. [Axes: time step, mean Wasserstein distance.]
Figure 8.3: Absolute Error in Target Number Estimate. [Axes: time step, mean absolute error in target number estimate.]
Figure 8.4: Example 2. [Axes: x co-ordinate, y co-ordinate.]
Figure 8.5: Example 2. [Two panels; axes: x co-ordinate, y co-ordinate.]
Chapter 9
Multiple Target Tracking in Sonar Images
9.1 Introduction
Underwater vehicles can be fitted with a range of sensing equipment, including sonar and
video. As the vehicles traverse through the water column, the sensing equipment is used
to provide sequences of images of the scene. The sequences of data obtained from the
underwater vehicles need to be interpreted to gain an understanding of the environment in
which the vehicles are deployed.
One of the important tasks is to identify objects on the seabed or in the water column
which would need to be avoided in path planning and navigation [3] [77]. In the case
where the navigation of the vehicle is determined by its current environment and the path
of the vehicle is determined by the incoming data, tracking algorithms are required so that
obstacles can be avoided. Three approaches for tracking obstacles in sequences of sonar
images are considered in this chapter, the first of which uses an association technique to
assign measurements to single-target filters, the second uses the Particle PHD filter and
estimate-to-track association presented in chapter 7, and the third uses the Gaussian Mixture
Multiple Target Tracker presented in chapter 8. Measurements are found by pre-processing
the sonar data to find potential objects based on their size and reflected intensity.
The tracking output of each of the algorithms is compared on real and simulated sonar
data. The accuracy of the tracking algorithms can be compared directly since the target
locations are known in the simulated data. An initial comparison between the Kalman filter
approach and the Particle PHD filter on real sonar data was given in [26], and between
the Particle PHD filter and GM-PHD filter in [36]. It is shown that the Particle PHD
filter with estimate-to-track association gives comparable tracking performance to the
conventional Nearest Neighbour approach with Kalman filters, and that the GM-PHD filter
gives comparable performance in higher levels of clutter.
9.2 Tracking and Data Association
At each point in time $t$, we have a set of noisy measurements, $Z_t = \{z_{t,1}, \dots, z_{t,m_t}\}$, where $z_{t,j}$
represents a single target measurement or false alarm and $m_t$ is the number of observations
at time $t$. From this set of measurements, we must estimate how many targets $T_t$ there are
and their set of locations, $X_t = \{x_{t,1}, \dots, x_{t,T_t}\}$, where $x_{t,i}$ represents the state of an individual
target and $T_t$ is the number of targets at time $t$. The first approach considered involves
assigning a single-target stochastic filter to each estimated target and uses a data association
technique to ensure that each filter is assigned the correct measurement. The mechanism
for distributing the correct measurement to each filter is called data association or, more
specifically, measurement-to-track association. This approach is compared with the two
different PHD filter implementations with track continuity.
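A linear Gaussian dynamic model with the following state space model is used:
$$
x_{t+1} =
\begin{pmatrix}
1 & T & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & T \\
0 & 0 & 0 & 1
\end{pmatrix} x_t +
\begin{pmatrix}
T^2/2 & 0 \\
T & 0 \\
0 & T^2/2 \\
0 & T
\end{pmatrix} v_t, \tag{9.1}
$$
and observation model:
$$
z_t =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0
\end{pmatrix} x_t + w_t. \tag{9.2}
$$
$v_t$ and $w_t$ are the process and measurement noises, respectively, and are uncorrelated.
The state vector is defined as the 2D position and velocity vector of the target:
$$
x_t = \begin{pmatrix} x_t & \dot{x}_t & y_t & \dot{y}_t \end{pmatrix}^T. \tag{9.3}
$$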
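The model in (9.1) to (9.3) can be written out directly in Python. This is a minimal sketch: the sampling period is not stated in this section, so the value T = 1 is an assumption, and the helper names are hypothetical. The noise vectors are passed in explicitly so that the example is deterministic.

```python
SAMPLE_PERIOD = 1.0  # T in (9.1); assumed value for illustration
T = SAMPLE_PERIOD

# State ordering is (x, x_dot, y, y_dot), as in (9.3).
F = [[1, T, 0, 0], [0, 1, 0, 0], [0, 0, 1, T], [0, 0, 0, 1]]
G = [[T * T / 2, 0], [T, 0], [0, T * T / 2], [0, T]]
H = [[1, 0, 0, 0], [0, 0, 1, 0]]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def propagate(state, process_noise):
    """x_{t+1} = F x_t + G v_t, equation (9.1)."""
    return [a + b for a, b in zip(matvec(F, state), matvec(G, process_noise))]


def observe(state, measurement_noise):
    """z_t = H x_t + w_t, equation (9.2)."""
    return [a + b for a, b in zip(matvec(H, state), measurement_noise)]


state = [0.0, 1.0, 0.0, 2.0]  # at the origin, moving with velocity (1, 2)
next_state = propagate(state, [0.0, 0.0])
measurement = observe(next_state, [0.0, 0.0])
```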
9.2.1 Tracking with Kalman filters
The first multiple tracking model assigns one Kalman filter per object and manages the
measurements for each filter with a measurement-to-track data association technique. The
Kalman filter has been chosen as the single-target filter, since it has been shown to be
effective for multiple-target tracking with measurement-to-track association in sonar [78].
Other techniques, such as the Extended Kalman filter or particle filter could also have been
used. The procedure used for the tracking algorithm is the Nearest Neighbour Standard
Filter (NNSF) [5], and the implementation is outlined in figure 9.1. After the sonar data
is acquired, it is segmented and features are extracted. Depending on whether targets are
expected in a region, a Kalman filter is either initialised or measurements are associated.
Regions of interest are set to determine which areas to segment more carefully in subsequent
iterations.
The measurement-to-track data association problem in multiple target tracking involves
ensuring that the correct measurement is given to each stochastic filter so that the trajecto-
ries of each target can be accurately estimated. The approach used for data association here
is based on a variant of nearest neighbour. The predicted measurements, $\hat{z}_{t|t-1}$, are
calculated for each target by projecting the previous estimate using the state transition function,
$\hat{x}_{t|t-1} = F\hat{x}_{t-1}$, and then the observation function, $\hat{z}_{t|t-1} = H\hat{x}_{t|t-1}$. The innovation
covariance $S_t$ is found for each track,
$$
S_t = H P_{t|t-1} H^T + R. \tag{9.4}
$$
A validation region is defined for each track, given by
$$
V_t(\gamma) := \{ z : [z - \hat{z}_{t|t-1}]^T S_t^{-1} [z - \hat{z}_{t|t-1}] \le \gamma \}. \tag{9.5}
$$
From this we can choose a validation gate ν. For each track, select the observations that
either fall within the validation gate or intersect the tracked object. Each track will then
have a list of zero or more possible observations for its update. If there are no
observations to update the track, then the track is deleted. If there is one observation, then
the track is updated with the new observation. If there are two or more observations as possible
updates, the closest is chosen.
Figure 9.1: Kalman Filter Tracking Procedure. [Flowchart boxes: Sonar Data Acquisition; Coarse Segmentation; Feature Extraction; Initialise Kalman Filters; Set Regions of Interest; Segmentation of Regions of Interest; Feature Extraction; Measurement-to-track Association; Update Kalman Filters; Track Estimates.]
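The gating and nearest-neighbour rule described in this section can be sketched as follows for two-dimensional measurements. The sketch covers only the validation-gate test of (9.5); the additional test of whether an observation intersects the tracked object is not modelled. The function names are hypothetical, and the gate value 9.21 (the 99% point of a chi-square distribution with two degrees of freedom) is an illustrative assumption, not a value from the thesis.

```python
def gate_distance(z, z_pred, s):
    """[z - z_pred]^T S^{-1} [z - z_pred] for 2-D vectors and a 2x2 matrix S."""
    dx, dy = z[0] - z_pred[0], z[1] - z_pred[1]
    (a, b), (c, d) = s
    det = a * d - b * c
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def nearest_neighbour(z_pred, s, measurements, gamma):
    """Return the gated measurement closest to the prediction, or None.

    None corresponds to the case where no observation is available, in which
    case the procedure in the text deletes the track.
    """
    scored = [(gate_distance(z, z_pred, s), z) for z in measurements]
    gated = [(dist, z) for dist, z in scored if dist <= gamma]
    if not gated:
        return None
    return min(gated, key=lambda pair: pair[0])[1]


S = [[4.0, 0.0], [0.0, 4.0]]  # innovation covariance from (9.4), assumed value
chosen = nearest_neighbour((0.0, 0.0), S, [(3.0, 0.0), (10.0, 10.0), (1.0, 1.0)], 9.21)
missed = nearest_neighbour((0.0, 0.0), S, [(10.0, 10.0)], 9.21)
```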
9.2.2 Tracking with the Particle PHD filter
The Particle PHD filter algorithm with state estimation and estimate-to-track association
from chapter 7 is used here and the procedure for the tracking implementation is given in
figure 9.2. The main difference between this approach and the NNSF is that all the extracted
features are used directly as input to the filter and estimates are associated instead of mea-
surements. The number of particles adaptively changes to be proportional to the number of
targets, with N = 1000 particles per target. The particles are propagated with the prediction
and update steps and k-means is used to repartition the particles. Partitions are associated to
a target track if the majority of the particles in the new partition correspond to the particles
propagated from a partition in the previous time-step. This approach was chosen instead of
the particle labelling association presented in chapter 7 since it is simpler.
Figure 9.2: Particle PHD Filter Tracking Procedure. [Flowchart boxes: Sonar Data Acquisition; Coarse Segmentation; Feature Extraction; PHD filter Estimates; Set Regions of Interest; Segmentation of Regions of Interest; Feature Extraction; Estimate-to-track Association; Track Estimates.]
Figure 9.3: GM-PHD Tracking Implementation. [Flowchart boxes: Sonar Data Acquisition; Segmentation; Feature Extraction; GM PHD Filter Estimates; Track Estimates.]
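The majority rule used to associate partitions with tracks can be sketched as below. The sketch assumes that every particle carries the track label of the partition it was propagated from (or None if it is new) and that k-means has already assigned each particle to a new cluster; how these labels survive resampling is not addressed. The function and variable names are hypothetical.

```python
from collections import Counter


def associate_partitions(new_partition, previous_track):
    """Map each new cluster to a track label by strict majority vote.

    new_partition[i]:  cluster index of particle i after k-means.
    previous_track[i]: track label particle i was propagated from, or None.
    Returns a dict {cluster index: track label, or None if no majority}.
    """
    votes = {}
    for cluster, track in zip(new_partition, previous_track):
        votes.setdefault(cluster, Counter())[track] += 1
    association = {}
    for cluster, counter in votes.items():
        track, count = counter.most_common(1)[0]
        size = sum(counter.values())
        has_majority = track is not None and 2 * count > size
        association[cluster] = track if has_majority else None
    return association


clusters = [0, 0, 0, 1, 1, 1, 1]
tracks = ["a", "a", "b", "b", "b", "b", None]
association = associate_partitions(clusters, tracks)
```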
9.2.3 Tracking with the GM-PHD Filter
The GM-PHD filter multiple target tracking algorithm presented in chapter 8 is used and
the procedure for the tracking implementation is given in figure 9.3. Since the GM-PHD
filter tracker can operate in higher density clutter than the particle PHD filter, the feature
extraction process used is simpler, as described in the next section.
9.3 Implementation on Forward-Looking Sonar
The multi-target tracking methods have been implemented for tracking obstacles in forward-
looking sonar. Since the Particle PHD filter requires the k-means algorithm, the clutter lev-
els have been reduced so that inaccurate estimates are minimized. The segmentation and
feature extraction methods for determining the measurements are the same for the Nearest
Neighbour and Particle PHD filter approaches but a simpler method is used for the GM-
PHD filter as it can operate successfully in higher clutter levels. An initial comparison of
the Particle PHD filter and Nearest Neighbour techniques is given on simulated data with
estimates of the errors before presenting the results on real sonar data.
9.3.1 Simulated Sonar Data
The tracking methods have been run on simulated forward-looking sonar data. The advan-
tage of using simulated data is that it allows various realistic scenarios and trajectories to
be created easily. The exact locations of the vehicle and objects are known, and thus the
accuracy of the tracker can be determined. This will allow us to directly compare the results
of the multi-target tracking algorithms to the ground truth data.
A sequence of forward-looking sonar images has been generated using the Sonar Simulator
developed by Bell [69] which has the capability of modelling sonar in complex underwater
terrain. An artificial seabed is modelled by a $100 \times 100\,\mathrm{m}^2$ textured image, see figure
9.6. Spherical shaped objects of radius 0.5 m have been placed on the seabed.
The specification of the sonar has been modelled to be as close as possible to the sonar
equipment used to provide the real data. The sonar has a range of 40 m and scans a sector of 120
degrees, see figure 9.5 for an example image. A sinusoidal trajectory with added noise
has been simulated for the sonar as though it were fitted onto an Autonomous Underwater
Vehicle (AUV), see figure 9.4 for the simulated trajectory with objects.
Figure 9.4: Simulated Sonar Trajectory with Objects. [Axes: x (metres), y (metres); legend: Sonar Trajectory, Objects.]
9.3.2 Real Sonar Data
The sequences of images were obtained from a forward looking multi-beam sonar which
was fitted to an Autonomous Underwater Vehicle (AUV). The vehicle was travelling at a
rate of approximately 1 knot over a region with stationary targets on the seabed. The sonar
was mounted on the front of the AUV scanning forwards for a range of 40m and was angled
towards the seabed. The sonar scanned an angular region of 120 degrees, using 120 beams
each with a vertical beam width of 1 degree and a horizontal beam width of 40 degrees.
The sonar had an operating frequency of 600 kHz.
Figure 9.5: Simulated Sonar Image.
Figure 9.6: Artificial seabed.
Figure 9.7: Original sonar image (top). Image after filtering (middle). Resulting image after segmentation with regions of interest shown as the boxes and potential targets as the white segmented areas (bottom).
Feature extraction for the Kalman Filter and Particle PHD Filter
Multi-beam sonar images can be very noisy, due to reverberation from the seabed, surface
or water column and so need to be filtered if they are to be of use. See figure 9.7 (top)
for an example sonar image. The objects which we wish to track have a higher reflectivity
property than the surrounding environment, and so the measurements can be determined
by thresholding the sonar images on intensity. A two-layer segmentation has been used to
identify areas of interest, the first of which uses a fast segmentation algorithm based on the
intensity of the returned energy. The second layer more selectively segments regions where
objects are expected based on previous knowledge.
To reduce the speckle noise, the images are first filtered. A mean filter was found to be
effective at removing the noise and has a relatively cheap computational cost, see figure 9.7
(middle) for an example of an image after filtering. A threshold is then applied to identify
regions with high reflected energy where there are potential objects.
A double threshold is applied by firstly using an adaptive threshold to identify regions
of high reflectivity and then using a higher threshold to identify the regions with the
highest returns. Neighbouring pixels are grouped together to form regions, and the centroids of
these regions are taken as the measurements which will be used as input to the tracking
algorithms. After the images have been segmented and the regions with high reflectivity
have been identified, features of the potential targets can be found, and regions which are
too small to be an obstacle are discarded. The features which we use for tracking in our
application here are the centroid positions of the segmented regions. Other features have
been used in the tracking such as the perimeter and area of the objects [78]; although, for
simplicity we restrict ourselves to the positions of the targets. See figure 9.7 (bottom) for
an example of a segmented image with regions of interest.
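The thresholding, grouping and centroid steps can be sketched in pure Python as follows. Several details are assumptions made for this sketch and are not taken from the thesis: the adaptive threshold is taken to be the image mean plus k standard deviations, a region is kept only if it contains at least one pixel above the higher threshold, pixels are grouped by 4-connectivity, and the mean filtering step is omitted. All names and parameter values are hypothetical.

```python
import statistics


def segment_centroids(image, k=1.0, high=200, min_area=2):
    """Return (row, column) centroids of bright regions in a 2-D intensity list."""
    pixels = [value for row in image for value in row]
    low = statistics.mean(pixels) + k * statistics.pstdev(pixels)  # assumed rule
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or image[r0][c0] < low:
                continue
            # Flood fill the 4-connected region that contains (r0, c0).
            stack, region = [(r0, c0)], []
            seen[r0][c0] = True
            while stack:
                r, c = stack.pop()
                region.append((r, c))
                for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= rr < rows and 0 <= cc < cols
                            and not seen[rr][cc] and image[rr][cc] >= low):
                        seen[rr][cc] = True
                        stack.append((rr, cc))
            peak = max(image[r][c] for r, c in region)
            # Discard regions that are too small or lack a very strong return.
            if len(region) >= min_area and peak >= high:
                centroids.append((
                    sum(r for r, _ in region) / len(region),
                    sum(c for _, c in region) / len(region),
                ))
    return centroids


image = [
    [10, 10, 10, 10, 10, 10],
    [10, 220, 230, 10, 10, 10],
    [10, 210, 240, 10, 10, 150],
    [10, 10, 10, 10, 10, 10],
    [10, 10, 10, 250, 10, 10],
]
measurements = segment_centroids(image)
```

In this toy image the 2 × 2 bright block yields one measurement at its centroid, while the two isolated bright pixels are rejected by the minimum-area test.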
Feature Extraction for the GM-PHD filter tracker
The same double threshold approach is used as above, but the measurements are determined
by taking the centroids directly and do not rely on the tracking prediction. Figure 9.8 shows the
sonar image, the image after filtering, and the segmented image from which the measure-
ments are determined by taking the centroids of high intensity regions. The procedure used
for tracking is shown in figure 9.3. In our approach used here, we simply use the double
thresholding described above which results in higher clutter levels. We demonstrate that
the GM-PHD filter copes well in these circumstances and compare the results with those
obtained for the Particle PHD filter in lower clutter levels.
9.4 Results
This section presents the results for both of the tracking algorithms on real and simulated
forward-looking sonar data. For the simulated data, the positions of the targets are known
which enables us to compare the methods. A direct comparison of the errors in the set of
target state estimates from the true target locations for each of the algorithms is given.
9.4.1 Simulated Data
The simulated sonar data provides us with a ground truth with which we can compare the
accuracy of the target estimation from each of the tracking methods. In this example, there
is no clutter and the number of estimates is the same as the number of targets in view.
Previous studies have demonstrated that the Particle PHD filter can operate successfully in
higher levels of clutter [79] [29] [30].
Figure 9.8: Forward-scan sonar image (top). Sonar image after filtering (middle). Image after segmentation for GM PHD measurements (bottom).
Tracking Technique          Kalman Filters    Particle PHD filter
Hausdorff Pixel Error       35.274            35.125
RMS Pixel Error             28.26             28.83
Pixel Standard Deviation    6.2199            6.5523
Figure 9.9: Comparison of Errors.
The true positions give the centres of the spherical objects in the image. The trackers,
however, estimate the position of the centroid of the highlight of the object from the
reflected acoustic energy, which introduces an inherent bias that is reflected in the
results. Figures 9.10 and 9.11 show the images with tracking results superimposed. The
results of the tracking are very similar, although the nearest neighbour approach with Kalman
filters managed to keep some of the tracks longer.
Let $X_t$ be the set of target states at time $t$ and $\hat{X}_t$ be the set of estimated target states. We
compare the performance of the algorithms using the $L_2$ pixel errors,
$$
d(x_i, \hat{x}_j) = \sqrt{(x_i^1 - \hat{x}_j^1)^2 + (x_i^2 - \hat{x}_j^2)^2},
$$
between the estimates and true positions. For each
iteration, the mean and maximum pixel errors have been calculated. The maximum error
here is the same as the Hausdorff distance [72], $\max_{x_i \in X_t} \min_{\hat{x}_j \in \hat{X}_t} d(x_i, \hat{x}_j)$, which gives
the tracking error in the worst case. The errors have been averaged over the length of the
sequence and the table of results is given in figure 9.9.
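The per-frame error measures can be sketched as follows. The max-min expression is implemented as written in the text; for the mean error, the sketch assumes that each true position is paired with its nearest estimate, since the pairing is not spelled out here. The function names and the example coordinates are hypothetical.

```python
import math


def pixel_error(x, x_hat):
    """Euclidean (L2) distance between a true and an estimated pixel position."""
    return math.hypot(x[0] - x_hat[0], x[1] - x_hat[1])


def nearest_errors(truth, estimates):
    """Distance from each true position to its nearest estimate."""
    return [min(pixel_error(x, e) for e in estimates) for x in truth]


def hausdorff_error(truth, estimates):
    """max over true positions of the min distance to an estimate."""
    return max(nearest_errors(truth, estimates))


def mean_error(truth, estimates):
    """Mean nearest-estimate error (assumed pairing)."""
    errors = nearest_errors(truth, estimates)
    return sum(errors) / len(errors)


truth = [(0.0, 0.0), (10.0, 0.0)]
estimates = [(3.0, 4.0), (10.0, 2.0)]
worst = hausdorff_error(truth, estimates)
average = mean_error(truth, estimates)
```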
The tracking techniques have given comparable performance in their ability to estimate
the correct position and in the standard deviation of errors. The average error throughout
the sequence was around 30 pixels with a standard deviation of 6 in both cases.
9.4.2 Real Data
The tracking algorithms have been tested on the same sequence of sonar data and in this
section a comparison of the different techniques is given. The images in the sequence are
24-bit colour of size 1276 × 833 and were converted to grayscale. A mean filter of size
11×11 was used to reduce the impulse noise before segmenting the image by thresholding.
The measurements obtained by this process are fed into the tracking algorithms.
Kalman filters and Particle PHD filter
Selected frames from the sequence are presented in figures 9.12 and 9.13. In the first
frame shown, there are three targets being tracked in each image, and the trajectories on the
left are fairly similar. The target on the right has been tracked for longer with the PHD
filter than the Kalman filter, although the ability to track without measurements has been
removed in the case of the Kalman filter [78]. This was to enable a fairer comparison,
since this functionality has not been used with the estimate-to-track PHD filter although
could be incorporated into future implementations. In the next two images, both
techniques have similar target trajectories, although the Kalman filter tracking is smoother.
This is due to the weight of the model on the Kalman filter.
GM PHD filter
Figures 9.13 and 9.14 show results of the Particle PHD and GM-PHD filters respectively.
The first of these uses a more complex pre-processing procedure for determining the
measurements to reduce clutter levels and the second uses simple thresholding which gives
more clutter points. The average number of clutter points with the first method was less
than 1 and with the second around 5.
Figure 9.10: Tracking results using Kalman filters on simulated data. Frames 34, 85 and 97 in a sequence of 100 frames.
While the empirical distribution from the Particle PHD filter can handle high clutter
levels and estimate the correct number of targets, estimating the target states relies on clus-
tering techniques which can lead to inaccurate and false estimates being obtained. The
target states are determined from the GM-PHD filter by taking the Gaussians with the high-
est weights. In simulations, it has been shown that individual Gaussians can accurately
track the correct targets [46] in high clutter levels. This has a number of advantages over
the particle implementation. First, the complexity of the algorithm is lower, since fewer
terms are used (a maximum of 200 Gaussians after pruning and merging, compared with 1000
particles per target). Second, the means of the Gaussians are known and don’t need to be determined
through clustering techniques. Finally, individual Gaussians determine the target states and
are tracked more reliably through the labelling process compared to track continuity tech-
niques developed for the Particle PHD filter. For these reasons, the GM-PHD filter can
perform better under higher clutter levels.
The number of clutter points in the sequence ranged between 0 and 12 points. Figure
9.15 gives an example of the original, filtered and segmented images used. There are 5 false
alarms in this example.
Figure 9.11: Tracking results using PHD filter on simulated data. Frames 34, 85 and 97 in a sequence of 100 frames.
Figure 9.12: Tracking results using Kalman filters. From left, frames 39, 58, and 98 in a sequence of 100 frames.
Figure 9.13: Tracking results using Particle PHD filter. From left, frames 39, 58, 83 in a sequence of 100 frames.
Figure 9.14: Tracking results using GM PHD filter. From left, frames 39, 58, 83 in a sequence of 100 frames.
Figure 9.15: Example with clutter using GM PHD filter. From left, raw, filtered and segmented image with tracking superimposed.
9.5 Conclusions
The multiple target tracking techniques developed for the Particle PHD filter and GM PHD
filter have been demonstrated on real forward-scan sonar data. The algorithms are
compared on both simulated and real forward-looking sonar data obtained from an
Autonomous Underwater Vehicle (AUV). The results demonstrate that the PHD filter can be
used effectively for practical multiple target tracking applications and compares well with
conventional approaches for multiple target tracking. The identities of the individual target tracks have
been maintained using the methods presented in this thesis.
A comparison of the Particle PHD filter with the traditional Nearest Neighbour ap-
proach with Kalman filters has shown that the two methods developed in this thesis give
comparable performance to conventional methods of multiple target tracking. Furthermore,
it is shown that the GM PHD filter multi-target tracker can successfully track the correct
targets in reasonably high levels of clutter without the need for data association, since this
is an inherent property of the algorithm.
The performance of the algorithms in chapters 7 and 8 shows that the Gaussian
mixture PHD filter can operate in higher levels of clutter than the Particle PHD filter, because
the latter requires clustering to determine the target states. Whilst the convergence properties
of chapter 4 are not affected by this, the ability to use the algorithm for target tracking
in clutter is. Since clustering is not required in the Gaussian mixture version, it is much
easier to extract the correct target states by taking the Gaussian components with the largest
weights.
Chapter 10
Conclusions
10.1 Thesis Summary
The random-set framework for multiple target tracking developed by Mahler [76] offers
a distinct alternative to the traditional approach to multiple target tracking by treating the
collection of individual targets as a set-valued state and the collection of individual obser-
vations as a set-valued observation. The set-valued state is predicted and updated at each
time-step based on the set-valued observation. The multiple target posterior can be esti-
mated using a generalisation of the single target Bayesian filtering equations to a multiple
target scenario. This model can also incorporate clutter, or false measurements, into the
framework.
The complexity of computing this recursion grows exponentially with the number of
targets and so the optimal filter must be approximated. To alleviate the complexity of
computing the multi-target posterior, a recursion for the first-order moment of the
multi-target posterior distribution, known as the PHD filter, was derived.
The Sequential Monte Carlo implementation of the PHD filter [17], known as the Particle
PHD filter, demonstrated that practical applications of the filter were possible. In
chapter 4, a study of the convergence of this algorithm was conducted, showing that the
empirical measure approximating the PHD converges weakly to the true density. An example
of this algorithm was illustrated with the application of sequentially estimating targets in
forward-looking sonar data in chapter 6.
The closed-form version of the PHD filter for linear-Gaussian target dynamics, called the
Gaussian Mixture (GM) PHD filter, was developed recently [34] to provide a multi-target
tracker without the complexity of the particle filtering approach. Convergence properties
of the GM-PHD filter were shown in chapter 5, and bounds were found for the approximation
stages used to alleviate the computational complexity.
In the Sequential Monte Carlo version of the PHD filter [17], the target estimates needed
to be determined from the particle distribution by using clustering techniques such as the
EM algorithm and k-means. A comparison of the accuracy and time complexity of these
algorithms is given in chapter 7, which showed empirically that the k-means algorithm can
outperform the EM algorithm in both of these properties. In addition to estimating the num-
ber of targets and their states at each point in time, it is also important in tracking scenarios
to know the trajectories of the targets and to be able to distinguish between different targets.
Two novel methods for incorporating track continuity into the Particle PHD filter were
proposed. These methods are less complex than other reported techniques [29] [30]
and have been illustrated using simulated data with clutter in chapter 7.
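The clustering step summarised in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in chapter 7: the number of clusters is taken as the rounded sum of the particle weights, the deterministic farthest-point initialisation is a choice made here for reproducibility, at least one target is assumed to be present, and the function names and simulated particle cloud are hypothetical.

```python
import numpy as np

def farthest_point_init(points, k):
    """Deterministic initial centres: first point, then farthest remaining points."""
    centres = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen centre.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centres], axis=0)
        centres.append(points[np.argmax(d)])
    return np.array(centres, dtype=float)

def kmeans(points, k, iters=20):
    """Plain k-means (Lloyd iterations); assumes k >= 1."""
    centres = farthest_point_init(points, k)
    for _ in range(iters):
        dist = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centres[j] = members.mean(axis=0)
    return centres, labels

# Hypothetical particle cloud representing two targets (illustrative values).
rng = np.random.default_rng(0)
particles = np.vstack([
    rng.normal([0.0, 0.0], 0.3, size=(200, 2)),
    rng.normal([10.0, 5.0], 0.3, size=(200, 2)),
])
weights = np.full(len(particles), 2.0 / len(particles))

k = int(round(weights.sum()))  # estimated number of targets
centres, labels = kmeans(particles, k)
print(k, centres)
```

The cluster centres serve as the target state estimates, so an error in the estimated number of targets carries over directly into the extracted states.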
The Gaussian mixture multi-target tracker, developed in chapter 8, showed that individual
Gaussians within the mixture are able to track targets successfully. Hence the ability
to track is an inherent part of the GM-PHD filter, and this technique can operate with a
high number of false alarms.
The methods developed in this thesis for state estimation and track continuity for both
of the implementations of the PHD filter are demonstrated in chapter 9 for multiple target
tracking in sequences of forward scan sonar. It is shown that these methods can be used for
practical tracking applications and compare well with conventional approaches to multiple
target tracking.
10.2 Current Research
A recent study on the PHD filter showed that the estimate of the number of targets provided
by taking the integral of the PHD over the state space is potentially unstable in the presence
of high clutter density and missed detections, that is, when the probability of detection is
less than one [74]. It was argued that the estimate is unstable due to the linearisation
formula for the expected number of targets. Examples of this phenomenon can be seen in
chapter 7. One possible way of resolving this difficulty is to take the average of the number
over several iterations, although the obvious difficulty with this is that the update of the
target number may be delayed and short tracks may be missed. Since the GM-PHD filter
multi-target tracker presented in chapter 8 did not rely on the target number estimate, this
problem was not encountered.
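The averaging idea mentioned above, and the delay it introduces, can be illustrated with a short Python sketch. For a weighted particle or Gaussian mixture representation, the integral of the PHD over the state space is the sum of the weights. The window length of three iterations, the function names and the sequence of raw estimates are assumptions chosen only for illustration.

```python
import numpy as np

def estimated_target_number(weights):
    """Integral of a weighted particle or Gaussian-mixture PHD: sum of weights."""
    return float(np.sum(weights))

def smoothed_counts(raw_counts, window=3):
    """Running mean of the last `window` estimates, rounded to an integer."""
    out = []
    for k in range(len(raw_counts)):
        recent = raw_counts[max(0, k - window + 1): k + 1]
        out.append(int(round(float(np.mean(recent)))))
    return out

# Hypothetical raw estimates: a dip at step 2 (a missed detection) and a
# genuine increase from two to three targets at step 5.
raw = [2.1, 1.9, 0.9, 2.0, 2.2, 3.1, 3.0, 2.9]
smooth = smoothed_counts(raw, window=3)
print([int(round(r)) for r in raw])  # [2, 2, 1, 2, 2, 3, 3, 3]
print(smooth)                        # [2, 2, 2, 2, 2, 2, 3, 3]
```

In this example the averaging suppresses the spurious dip at step 2, but the genuine increase at step 5 is only reported at step 6, which is the delay referred to above.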
Mahler [80] [81] derived an extension to the PHD filter in which the probability
distribution of the target number is propagated in addition to the PHD. This filter was named
the Cardinalised PHD (CPHD) filter. A Gaussian mixture approximation to the CPHD filter
has been implemented [82] which extends the GM-PHD filter given in chapter 5. It is
anticipated that the method for multiple-target tracking with the GM-PHD filter proposed
in chapter 8 could also be applied to this implementation of the filter to determine target
tracks. Furthermore, particle filter approximations have been implemented by Mahler et al.
and the state estimation and track continuity methods presented in chapter 7 could prove to
be useful for this implementation.
10.3 Future Work
Techniques developed in this thesis have enabled track continuity for the two PHD filter
implementations, either by labelling the different clusters or Gaussians which are propagated
with the filter, or by estimating the set of targets at the next time step and gating. It is
anticipated that these techniques will be useful for future implementations of PHD filters.
These techniques consider only the time-step immediately preceding the current one and have
not incorporated methods for backtracking. Backtracking methods, such as those used in
multiple hypothesis tracking (MHT) [75], or probabilistic MHT (PMHT) [7], would allow
the previous trajectories of the targets to be considered and hence would discriminate
between crossing and closely spaced targets more accurately. A recent technique developed
for particle filters, fixed-lag SMC data association [83], uses measurements collected over
several time-steps to determine the correct trajectories of multiple targets, and could be
used for the Particle PHD filter.
Possible theoretical developments with PHD filters could consider higher-order multi-target
moment approximations to improve the accuracy of the PHD filter. The CPHD filter
is a partial second-order filter, which is first order in the states of individual targets but
second order in the target number. The current PHD filter framework does not extend to a
second-order approximation and an alternative approach would be required [81], which may
not necessarily be computationally tractable, although this remains a future research topic.
Bibliography
[1] R. E. Kalman. A new approach to linear filtering and prediction problems. Transac-
tions of the ASME–Journal of Basic Engineering, 82(Series D):35–45, 1960.
[2] N.J. Gordon, D.J. Salmond, and A.F.M. Smith. Novel approach to nonlinear/non-
Gaussian Bayesian state estimation. IEE Proceedings on Radar and Signal Process-
ing, 140:107–113, 1993.
[3] Y. Petillot, I. Tena Ruiz, and D. M. Lane. Underwater vehicle obstacle avoidance and
path planning using a multi-beam forward looking sonar. IEEE Journal of Oceanic
Engineering, Vol. 26, No. 2, 240-251, April 2001.
[4] I. Tena Ruiz, Y. Petillot, and D. M. Lane. AUV navigation using a forward looking
sonar. Unmanned Underwater Vehicle Symposium. Rhode Island, USA., 2000.
[5] Y. Bar-Shalom and T.E. Fortmann. Tracking and Data Association. Academic Press,
1988.
[6] D. Reid. An algorithm for tracking multiple targets. IEEE Trans. Automatic Control,
24 no. 6, 1979.
[7] R.L. Streit and T.E. Luginbuhl. Probabilistic Multi-Hypothesis Tracking. NUWC-NPT
Technical Report 10 428, Naval Undersea Warfare Center, Newport, Rhode Island,
1995.
[8] D. Schulz, W. Burgard, D. Fox, and A. B. Cremers. People tracking with a mobile
robot using sample-based Joint Probabilistic Data Association Filters. International
Journal of Robotics Research, pages 99–116, 2003.
[9] J. Vermaak, S. J. Godsill, and P. Perez. Monte Carlo filtering for multi target tracking
and data association. IEEE Transactions on Aerospace and Electronic Systems, 41,
1:309 – 332, 2005.
[10] I. R. Goodman, R. P. S. Mahler, and H. T. Nguyen. Mathematics of Data Fusion.
Kluwer Academic Publishers, 1997.
[11] R. P. S. Mahler. An introduction to multisource-multitarget statistics and its applica-
tions. Technical monograph, Lockheed Martin, March 2000.
[12] R. Mahler. Multitarget Bayes filtering via first-order multitarget moments. IEEE
Transactions on Aerospace and Electronic Systems, 39, No.4:1152–1178, 2003.
[13] G. Matheron. Random sets and integral geometry. J. Wiley, 1975.
[14] R. Mahler. Global integrated data fusion. in Proc. 7th Nat. Symp. on Sensor Fu-
sion, 1, (Unclassified) Sandia National Laboratories, Albuquerque, ERIM Ann Arbor
MI:187–199, 1994.
[15] D.J. Daley and D. Vere-Jones. An introduction to the theory of point processes.
Springer, 1988.
[16] R. Mahler. A theoretical foundation for the Stein-Winter Probability Hypothesis Den-
sity (PHD) multi-target tracking approach. in Proc. 2002 MSS Nat’l Symp. on Sensor
and Data Fusion, 1, (Unclassified) Sandia National Laboratories, San Antonio TX,
2000.
[17] B. Vo, S. Singh, and A. Doucet. Sequential Monte Carlo methods for Multi-target
Filtering with Random Finite Sets. IEEE Trans. Aerospace Elec. Systems, 41,
No.4:1224–1245, 2005.
[18] B. Vo, S. Singh, and A. Doucet. Sequential Monte Carlo Implementation of the PHD
filter for Multi-target Tracking. Proc. FUSION 2003, pages 792–799, 2003.
[19] T. Zajic and R. Mahler. A particle-systems implementation of the PHD multitarget
tracking filter. SPIE Vol. 5096 Signal Processing, Sensor Fusion and Target Recogni-
tion, pages 291–299, 2003.
[20] H. Sidenbladh. Multi-target particle filtering for the Probability Hypothesis Density.
International Conference on Information Fusion, pages 800–806, 2003.
[21] D. E. Clark and J. Bell. Convergence Results for the Particle PHD Filter. IEEE
Transactions on Signal Processing, 54, No.7:2652–2661, 2006.
[22] A. M. Johansen, S. S. Singh, A. Doucet, and B. Vo. Convergence of the SMC imple-
mentation of the PHD filter. Methodology and Computing in Applied Probability, to
appear., 2006.
[23] M. Tobias and A. D. Lanterman. Probability Hypothesis Density-based multi-target
tracking with bistatic range and Doppler observations. IEE Radar, Sonar and Naviga-
tion, Volume 152, Issue 3 , p. 195-205., 2005.
[24] D.E. Clark and J. Bell. Bayesian Multiple Target Tracking in Forward Scan Sonar
Images Using the PHD Filter. IEE Radar, Sonar and Navigation, Volume 152, Issue
5, p. 327-334, 2005.
[25] D. E. Clark, J. Bell, Y. de S.-Pern, and Y. Petillot. PHD Filter Multi-target Tracking in
3D Sonar. IEEE Oceans Europe Conference, Brest June 2005. Volume 1, June 20-23,
2005 p265 - 270.
[26] D. E. Clark, I. Tena-Ruiz, Y. Petillot, and J. Bell. Multiple target tracking and data
association in sonar images. IEE Seminar on Target Tracking: Algorithms and Appli-
cations. Birmingham, UK. March 2006., pages 149–154, 2006.
[27] N. Ikoma, T. Uchino, and T. Maeda. Tracking of feature points in image sequence by
SMC implementation of PHD filter. ICE 2004 Annual Conference, 4-6 Aug, 2004. p
1696 - 1701 vol. 2.
[28] B. Vo, W. K. Ma, and S. Singh. Locating an unknown time-varying number of speak-
ers: A Bayesian random finite set approach. in Proc. 2005 IEEE Int. Conf. Acoust.,
Speech, Signal Processing, Philadelphia, 4:1073–1076, 2005.
[29] K. Panta, B. Vo, S. Singh, and A. Doucet. Probability hypothesis density filter versus
multiple hypothesis tracking. Proceedings of SPIE – Volume 5429 Signal Processing,
Sensor Fusion, and Target Recognition XIII, Ivan Kadar, Editor, August 2004, pp.
284-295.
[30] Lin Lin. Parameter estimation and data association for multitarget tracking. PhD
Thesis, The University of Connecticut, 2004.
[31] D. E. Clark and J. Bell. Data Association for the PHD Filter. ISSNIP, Melbourne,
Australia. 5th-8th December 2005., pages 217 – 222.
[32] K. Panta, B. Vo, and S. Singh. Improved probability hypothesis density filter (PHD)
for multitarget tracking. Proceedings ICISIP, Bangalore 12th-15th December 2005.
[33] B. Vo and W. K. Ma. The Gaussian Mixture Probability Hypothesis Density Filter.
IEEE Transactions on Signal Processing, to appear, 2006.
[34] B. Vo and W. K. Ma. A closed-form solution to the Probability Hypothesis Density
filter. in Proc. Int’l Conf. on Information Fusion, Philadelphia, 2005.
[35] D. E. Clark and B. Vo. Convergence Analysis of the Gaussian Mixture PHD Filter.
IEEE Transactions on Signal Processing, to appear, 2006.
[36] D. Clark, B. Vo, and J. Bell. GM-PHD Filter Multi-target Tracking in Sonar Images.
Proc. SPIE Defense and Security Symposium. Orlando, Florida [6235-29], 2006.
[37] Y. C. Ho and R. C. K. Lee. A Bayesian approach to problems in stochastic estimation
and control. IEEE Trans. AC, AC-9:333–339, 1964.
[38] A. Jazwinski. Stochastic processes and filtering theory. Academic Press, 1970.
[39] S. J. Julier and J. K. Uhlmann. A General Method for Approximating Nonlinear Trans-
formations of Probability Distributions. Technical Report, RRG, Dept. of Engineering
Science, University of Oxford., 1996.
[40] H. W. Sorenson and D. L. Alspach. Recursive Bayesian estimation using Gaussian
sum. Automatica, 7:465–479, 1971.
[41] A. Doucet, N. de Freitas, and N. Gordon. Sequential Monte Carlo Methods in Practice.
Springer-Verlag, 2001.
[42] B. Vo and S. Singh. Technical aspects of the Probability Hypothesis Density recursion.
Tech. Rep. TR05-006 EEE Dept. The University of Melbourne, Australia, 2005.
[43] D. L. Alspach. A Bayesian Approximation Technique for Estimation and Control of
Discrete Time Systems. PhD thesis, University of California, San Diego, 1970.
[44] B. D. Anderson and J. B. Moore. Optimal Filtering. Prentice-Hall, New Jersey, 1979.
[45] D. E. Clark and J. Bell. Multi-target State Estimation and Track Continuity for the
Particle PHD Filter. IEEE Transactions on Aerospace and Electronic Systems, 43 no
3, July 2007.
[46] D. Clark, K. Panta, and B. Vo. The GM-PHD Filter Multiple Target Tracker. Proc.
International Conference on Information Fusion. Florence., July 2006.
[47] D. E. Clark, I. Tena-Ruiz, Y. Petillot, and J. Bell. Multiple Target Tracking in Sonar
Images. IEEE Transactions on Aerospace and Electronic Systems, 43 no 3, July 2007.
[48] B. Oksendal. Stochastic differential equations, 6th edition. Springer Verlag, Heidel-
berg, 2003.
[49] Venkatarama Krishnan. Nonlinear Filtering and Smoothing : An Introduction to Mar-
tingales, Stochastic Integrals and Estimation. Dover, 2005.
[50] S. Julier and J. Uhlmann. A new extension of the Kalman filter to nonlinear systems.
In Int. Symp. Aerospace/Defense Sensing, Simul. and Controls, Orlando, FL., 1997.
[51] J. T.-H. Lo. Finite-dimensional sensor orbits and optimal non-linear filtering. IEEE
Trans. IT, IT-18(5):583–588, 1972.
[52] S. Arulampalam, S. Maskell, N. J. Gordon, and T. Clapp. A tutorial on particle filters
for on-line non-linear/non-Gaussian Bayesian tracking. IEEE Trans. SP, 50(2):174–
188, 2002.
[53] A. N. Shiryaev. Probability. Number 95 in Graduate Texts in Mathematics. Springer
Verlag, New York, second edition, 1995.
[54] D. Crisan and A. Doucet. A survey of convergence results on particle filtering for
practitioners, 2002.
[55] D. Crisan and A. Doucet. Convergence of sequential Monte Carlo methods, 2000.
[56] D. Crisan. Sequential Monte Carlo Methods in Practice, chapter 2, pages 17–41.
Springer-Verlag, 2001.
[57] J. Jacod and P. Protter. Probability Essentials. Springer, 2000.
[58] D.L. Hall and J. Llinas, editors. Handbook of Multisensor Data Fusion, chapter 7.
CRC Press, 2001.
[59] S. K. Srinivasan. Stochastic Point Processes and Their Applications. Griffin’s Statis-
tical Monographs and Courses, 1973.
[60] D. R. Cox and V. Isham. Point Processes. Chapman & Hall, 1980.
[61] H. Sidenbladh and S.L. Wirkander. Tracking random sets of vehicles in terrain. IEEE
Workshop on Multi-Object Tracking, Madison, WI, USA, 2003.
[62] M. Tobias and A.D. Lanterman. A Probability Hypothesis Density-based multitarget
tracker using multiple bistatic range and velocity measurements. System Theory, 2004.
Proceedings of the Thirty-Sixth Southeastern Symposium on , March 14-16, 2004,
pages 205–209, 2004.
[63] P. Billingsley. Convergence of probability measures. Wiley, New-York, 1968.
[64] B. Rynne and M. Youngson. Linear Functional Analysis. Springer-Verlag, 2000.
[65] D. Salmond. Tracking in Uncertain Environments. PhD thesis, University of Sussex,
1989.
[66] J. L. Williams. Gaussian mixture reduction for tracking multiple maneuvering targets
in clutter. Master’s thesis, Air Force Institute of Technology, 2003.
[67] I. Tena Ruiz, S. Raucourt, Y. Petillot, and D. M. Lane. Concurrent mapping and
localisation using side-scan sonar for autonomous navigation. Oceanic Engineering,
IEEE Journal of, 29, Issue 2:442–456, 2004.
[68] I. Tena Ruiz, D. M. Lane, and M. J. Chantler. A comparison of inter-frame feature
measures for robust object classification in sector scan sonar image sequences. IEEE
Journal of Oceanic Engineering, 24, No.4:458–469, 1999.
[69] J.M. Bell. A model for the simulation of side scan sonar. PhD Thesis. Heriot-Watt
University, 1995.
[70] B. S. Everitt and G. Dunn. Applied Multivariate Data Analysis. Arnold, 2nd edition,
2001.
[71] T. Kanungo, D. M. Mount, N. Netanyahu, C. Piatko, R. Silverman, and A. Y. Wu.
A Local Search Approximation Algorithm for k-means Clustering. Proc. of the 18th
Annual ACM Symp. on Computational Geometry, pages 10–18, 2002.
[72] J. Hoffman and R. Mahler. Multitarget miss distance via optimal assignment. IEEE
Trans. Sys., Man, and Cybernetics-Part A, 34(3):327–336, 2004.
[73] C. A. Bouman. Cluster: An unsupervised algorithm for modeling Gaussian mixtures.
Available from http://www.ece.purdue.edu/~bouman, April 1997.
[74] O. Erdinc, P. Willett, and Y. Bar-Shalom. Probability Hypothesis Density Filter for
Multitarget Multisensor Tracking. Proc. FUSION 2005.
[75] T. Kurien. Issues in the design of practical multi-target tracking algorithms. Multi-
target Multi-sensor Tracking: Advanced Applications, pages 43–83, 1990.
[76] R. Mahler. Multi-target Bayes filtering via first-order multi-target moments. IEEE
Trans. AES, 39(4):1152–1178, 2003.
[77] I. Tena Ruiz, Y. Petillot, D. M. Lane, and C. Salson. Feature Extraction and Data
Association for AUV Concurrent Mapping and Localisation. Proceedings of the 2001
IEEE Conference on Robotics and Automation. Seoul, Korea. May 2001.
[78] I. Tena Ruiz, Y. Petillot, D. Lane, and J. Bell. Tracking objects in underwater multi-
beam sonar images. Motion Analysis and Tracking (Ref. No. 1999/103), IEE Collo-
quium on , 10 May 1999, pages 11/1 – 11/7, 1999.
[79] C. Haworth, Y. de Saint-Pern, D. Clark, E. Trucco, and Y. Petillot. Detection and
tracking of multiple metallic objects in millimetre-wave images. International Journal
of Computer Vision, 71, no. 2:183–196, February 2007.
[80] R. Mahler. A Theory of PHD Filters of Higher Order in Target Number. SPIE Defense
and Security Symposium, Orlando, Florida, 2006.
[81] R. Mahler. PHD Filters of Higher Order in Target Number. submitted to IEEE Trans.
AES., 2005.
[82] B. T. Vo, B. Vo, and A. Cantoni. The CPHD Filter for linear Gaussian multi-target
models. Proc. 40th Annual Conf. on Info. Sciences and Systems (CISS’06), Stanford,
2006.
[83] M. Briers, A. Doucet, S. Maskell, and P. Horridge. Fixed-lag sequential Monte Carlo
data association. SPIE Defense and Security Symposium, Orlando, Florida, 2006.