A COMPARISON OF STATISTICAL ALGORITHMS FOR LANDMINE DETECTION by Peter Acerbo Torrione Department of Electrical and Computer Engineering Duke University Date: Approved: Dr. Leslie Collins, Supervisor Dr. Gary Ybarra Dr. Gregg Trahey A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in the Department of Electrical and Computer Engineering in the Graduate School of Duke University 2002


  • A COMPARISON OF STATISTICAL ALGORITHMS FOR

    LANDMINE DETECTION

    by

    Peter Acerbo Torrione

    Department of Electrical and Computer Engineering

    Duke University

    Date:

    Approved:

    Dr. Leslie Collins, Supervisor

    Dr. Gary Ybarra

    Dr. Gregg Trahey

    A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science

    in the Department of Electrical and Computer Engineering in the Graduate School of

    Duke University

    2002

  • Contents

    List of Tables
    List of Figures
    1 Introduction
    2 Background
        2.1 Electromagnetic Induction Systems
            2.1.1 Physics of EMI Systems
            2.1.2 The GEM-3 Sensor
        2.2 Data Collection
        2.3 Parameter Estimation and the Cramer-Rao Lower Bound
        2.4 The Detection Problem: Likelihood Ratios and Generalized Likelihood Ratios
            2.4.1 The Likelihood Ratio Test
            2.4.2 The Generalized Likelihood Ratio Test
            2.4.3 The Matched Filter
        2.5 Linear Algebra Preliminaries and Matched Subspace Detectors
            2.5.1 Linear Algebra Preliminaries
            2.5.2 Invariance of Hypothesis Testing Problems
            2.5.3 Invariance Tests and Maximal Invariant Statistics
            2.5.4 Matched Subspace Detectors
        2.6 Support Vector Machines
            2.6.1 Problem Statement and the Vapnik-Chervonenkis Dimension
            2.6.2 Kernel Functions and Avoiding the Complexities of a High Dimensional Space
            2.6.3 Finding the Optimal Hyperplane
    3 The Cramer-Rao Lower Bound
        3.1 Additive White Noise
        3.2 Additive White Noise and DC Term (in-phase)
        3.3 Additive White Noise and Additive Function of Frequency (model 1 quadrature)
        3.4 Additive White Noise and Multiplicative Term (model 2 quadrature)
    4 Signal Processing Using Matched Subspace Detectors
        4.1 Properties of Estimated Landmine Responses
        4.2 Basis Estimation
        4.3 Designing the Matched Subspace Filter
        4.4 Matched Subspace Results
    5 Decay Rate Estimation
        5.1 Decay Rates
        5.2 Estimation Procedure
        5.3 Gaussian Models and Detection
        5.4 Decay Rate Estimation Results
    6 Support Vector Machine Algorithms
        6.1 Building the Support Vector Machine
        6.2 Model and Parameter Selection and Implementation
        6.3 Support Vector Machine Results
    7 Conclusions and Future Work
    Bibliography

  • List of Tables

    2.1 Calibration grid landmine type and depth specifications

  • List of Figures

    2.1 Calibration Lane Data Collection
    2.2 Blind Lane Data Collection
    2.3 Data separation in 2 Dimensions
    2.4 Data separation in 3 Dimensions
    3.1 Typical in-phase and quadrature background measurements versus log-frequency
    3.2 Typical in-phase background measurements visibly shifted by some constant
    3.3 Typical quadrature background measurements corrupted by some multiplicative constant, or some additive term which increases with frequency
    3.4 Plots of the Cramer-Rao lower bound, calculated, and sample estimator variances versus the standard deviation of $k$. Parameters: $b_i = 10$, $\sigma_n^2 = 1$.
    4.1 Signatures of VS-50 landmines versus log-frequency
    4.2 Signatures of M-14 landmines versus log-frequency
    4.3 Actual, mean, and estimated signatures of M-14 landmines
    4.4 Comparison of filter bank outputs resulting from landmine and clutter responses. Note that the sum across the filter banks from the clutter response is larger than from the landmine response.
    4.5 Comparison of in-phase and quadrature matched subspace receiver operating characteristics from the calibration grid
    4.6 Comparison of quadrature matched subspace detector and baseline energy detector receiver operating characteristics from the blind and calibration grids
    5.1 Estimation of VS-50 Response
    5.2 Estimation of M-14 Response
    5.3 Estimated landmine decay rates plotted against $\omega_1$ and $\omega_2$ in Hz. Each landmine type is represented by a different shape.
    5.4 Estimated landmine decay rates plotted against $\omega_1$ and $\omega_2$ in Hz (close-up). Each landmine type is represented by a different shape. Note the high degree of spatial correlation between landmines of each type.
    5.5 Estimated clutter decay rates plotted against $\omega_1$ and $\omega_2$ in Hz. Note that the estimated decay rates for clutter objects are spread throughout a wide frequency range.
    5.6 Gaussian PDF contours with scattered landmine and clutter decay rates
    5.7 ROC for Gaussian-PDF estimated decay rate-based detector operating in the calibration grid.
    6.1 Support Vector Machine decision boundaries for non-rejecting SVMs and relevant landmine and clutter parameter locations from the calibration grid.
    6.2 Receiver operating characteristics of non-rejecting support vector machines trained on decay rates, matched subspace outputs, and full signal responses operating in the calibration grid
    6.3 Receiver operating characteristics of rejecting support vector machines trained on decay rates, matched subspace outputs, and full signal responses operating in the calibration grid
    6.4 Receiver operating characteristics for three different support vector machines operating in the blind grid
    7.1 Comparison of detector operating characteristics for matched subspace and support vector machines

  • Chapter 1

    Introduction

    Although estimates vary, agencies including the Red Cross and the United Nations

    concede that there are between 60 and 70 million active landmines in the ground,

    buried across 70 countries around the globe. Every year approximately 26,000 people

    are maimed or killed by landmines and 8,000 to 10,000 of these victims are children

    [1].

    Currently, there are approximately 340 different models of anti-personnel land-

    mines. Although these landmines cost as little as three dollars to produce, their

    presence inflicts a tremendous cost - especially in developing areas. Firstly, the

    cost to safely detect and remove each landmine can range between $300 and $1000.

    Furthermore, many surviving landmine victims require prosthetic limbs. These

    artificial limbs can cost between $100 and $3000, and they must be regularly replaced

    (every 3-5 years in adults, and every 6 months in children) [1]. It is impossible to

    measure the damage landmines inflict upon productivity, emotional well being, and

    the peaceful reconciliation of neighbors after years of war.

    As of 2002, the landmine crisis primarily affects poorer countries for which the

    economic impact of landmines is especially devastating. There are an estimated 22.5

    million landmines in Egypt, 16 million in Iran, and 10 million in Iraq, to list only

    some of the most egregiously affected countries [1].

    The primary contributor to the large cost of landmine removal is a high false alarm

    rate stemming from large amounts of anthropic clutter that pervades minefields. Until

    it is excavated and determined to not pose a threat, this clutter must be considered

    as dangerous as an actual landmine. On occasion, false alarm rates as high as 95

    percent have been reported when clearing minefields [2].

    There are two distinct categories of landmine remediation: military and humani-

    tarian [1]. The primary goal of military landmine removal is to clear a path through

    a suspected minefield to allow troop movement in the area. Generally, this must

    be accomplished quickly, usually at night, to avoid exposure to the enemy. Military

    landmine clearance is often accomplished by driving large rollers, flails, or plows over

    landmines to detonate them and clear a path [1]. Unfortunately, these techniques may

    only achieve clearance rates of 80 percent, which is not an acceptable detection level

    for humanitarian situations (the regulations for humanitarian de-mining are described

    by the International Standards for Humanitarian Mine Clearance Operations, see

    reference [3]). Humanitarian landmine removal is a much more arduous process gen-

    erally involving indigenous workers using hand-held devices to locate possible targets

    and safely remove them.

    Landmines are indiscriminate killers, and while the UN is lobbying for a world-

    wide ban on their use, concerted efforts are underway to remove landmines in areas

    where they can cause harm to civilians. The goal of the research presented in this

    thesis is to develop signal processing techniques that expedite accurate detection

    of potential threats by decreasing the false alarm rates associated with currently

    deployed landmine detectors while maintaining high detection rates.

    Several novel sensor modalities have been investigated for discerning the locations

    of buried landmines. Possible ground querying techniques include neutron backscat-

    tering [4], ground penetrating radar [5, 6, 7, 8], seismic detectors [9], and acoustic-to-

    seismic coupling [10]. While many of these technologies hold great promise for future

    application, currently almost all fielded or nearly-fielded landmine detection systems

    use electromagnetic induction (EMI) sensors, which operate on the same principles

    as a standard metal detector.


  • A large body of literature exists dealing with the applications of EMI sensors and

    processing of EMI data to the detection of buried landmines and unexploded ordnance

    (UXO). Some of this work has focused on determining the EMI responses from rota-

    tionally symmetric bodies [11, 12] and development of simplified phenomenological

    models to fit such responses [13]. Several researchers have explored processing

    time-domain EMI responses for landmine detection using estimated decay rates

    [14, 15, 16, 17, 18, 19]. Other work by Won et al. has indicated that the wideband

    EMI spectral responses from different landmines are unique [20]. Gao et al. have

    derived the complicated optimal wideband EMI detector and have compared its re-

    sults to sub-optimal detectors [21, 22]. Additional signal processing research on the

    detection and classification of low-metallic content landmines via EMI data has been

    performed by Collins et al. [23, 24].

    In this thesis, we build on this body of work in three ways. First, we will address

    the problem of landmine response estimation via soil (background) removal and

    show that our proposed estimator achieves the Cramer-Rao lower bound under specific

    statistical models of the received data. Second, we will apply the theory of matched

    subspace detectors [25, 26, 27] to the detection and classification of landmines versus

    clutter. Third, we will explore the possible applications of support vector machines

    (SVMs) [28, 29, 30, 31, 32, 33] to the landmine detection problem.

    The remainder of this thesis is organized as follows.

    In chapter two we review some of the information fundamental to the rest of

    the thesis. We begin with a brief overview of electromagnetic induction sensors, the

    data collection procedure used, and the particular EMI sensor used in this study:

    the GEM-3. This is followed by a review of the Cramer-Rao lower bound and some

    linear algebra preliminaries to the matched subspace detector. A full treatment of

    matched subspace detectors is given prior to discussing the derivation and properties

    of support vector machines.

    Chapter three focuses on applying the Cramer-Rao lower bound to background

    response estimation. A method of estimating the received signal by subtracting an

    estimate of the background signal is proposed. The performance of the estimation

    procedure is considered under four different models of the received data. The esti-

    mation procedure is shown to achieve the Cramer-Rao bound for three of the models

    and to approach the bound for the fourth model.

    Chapter four discusses the particulars of the matched subspace detector as ap-

    plied to the landmine detection problem. This includes subspace basis estimation

    procedures and energy pre-screening. Results from a blind field trial are presented.

    Chapter five deals with the problems of decay rate estimation from frequency-

    domain EMI data. We briefly discuss a simple detection technique based on multiple

    Gaussian probability density functions.

    Chapter six describes the application of support vector machines to the landmine

    detection problem. Three different support vector machines are presented, and their

    receiver operating characteristics from blind field trials are discussed.

    In chapter seven we will review the research covered by this thesis and present thoughts

    on the results. A comparison of the results of the support vector machine and matched

    subspace landmine detection techniques is presented. Possible avenues for future work

    are also discussed.


  • Chapter 2

    Background

    2.1 Electromagnetic Induction Systems

    In 1831 Michael Faraday made the discovery that a changing magnetic field can

    generate or induce a current in a nearby conductor. Building upon Faraday's work,

    Maxwell formulated the four equations upon which all classical electromagnetics is

    based. The phenomenology associated with EMI sensors (like hobby metal detectors)

    is based directly on these equations.

    2.1.1 Physics of EMI Systems

    A standard EMI sensor has a primary coil, or transmitter coil, composed of wire

    through which alternating current flows. This current flow generates a changing

    magnetic field around the sensor that penetrates the ground. As Faraday noted, the

    changing magnetic field from the transmitter coil induces current flow in the ground.

    The current flowing through the earth (and any contaminants therein) generates

    another magnetic field. Thus, it is possible to use a receiver coil to listen for the

    magnetic field that results from the induced current flow in the earth. Of course, care

    must be taken in the placement of and recording of measurements from the receiver

    coil since the magnetic field of the transmitter will, in general, be much stronger than

    the secondary field resulting from the earth's response. The magnitude and phase of

    the measured wideband EMI responses can be used to discern the amount, type, and

    shape of buried metal objects [20, 34, 35].

    Although Maxwell's equations completely govern the responses of conducting materials

    in any shape and orientation, solving these equations for shapes of arbitrary

    complexity is mathematically problematic.

    It has been shown [12, 36] that the frequency-domain response of a buried highly

    conducting object subject to EMI radiation can be modeled as:

    $$H(\omega) = a + \sum_n \frac{b_n\,\omega}{\omega - j\omega_n} \tag{2.1}$$

    Furthermore, the initial term a has been shown to be non-zero only for ferrous

    targets [37]. Similarly, the time-domain response of such a system has been shown

    [14, 38, 39] to be the weighted sum of exponentials:

    $$S(t) = a\,\delta(t) + \sum_n A_n e^{-\omega_n t} \tag{2.2}$$

    where the decay rates $\omega_n$ are treated as real. In practice, the actual

    responses of buried targets are well approximated by the first few terms in each of

    the above summations. The primary parameters of interest are often assumed to be

    the first few decay rates: $\omega_1$ and $\omega_2$. A significant amount of work has focused on the

    application of estimated decay rates to landmine detection [14, 15, 11, 19, 17, 18, 40].

    For high metal-content objects, the primary decay rates are generally fairly small,

    resulting in slowly decaying exponential responses. Such responses are relatively easy

    to sample in the time-domain. However, for objects containing small amounts of

    metal (like most modern landmines), the decay rate parameters are very large and

    the resulting exponential signature decays very rapidly. This makes time-domain

    measurement of the decay rates difficult.

    In this work, a wideband frequency-domain EMI sensor is utilized. Since a wideband

    frequency-domain sensor's responses are not time dependent, these sensors are

    advantageous when measuring quickly decaying exponential signals.


  • 2.1.2 The GEM-3 Sensor

    In this work, data from a Geophex GEM-3 sensor was used [41]. This section describes

    the GEM-3 sensor.

    The GEM-3 is a wideband digital electromagnetic sensor weighing about 10

    pounds. The sensor head of the GEM-3 consists of three concentric coils. The

    inner coil is the receiver coil, and the two outer coils comprise the transmitter coil.

    The combination of the magnetic fields induced by the outer coils creates a magnetic

    cavity (area with zero magnetic field) at the receiving coil. This prevents interference

    between the transmitted and induced magnetic fields [42].

    When operating as a wideband sensor the GEM-3 prompts for a set of frequencies

    at which to collect the induced EMI response. The GEM-3 can operate at frequencies

    between 30 Hz and 24 kHz. In this work, the GEM-3 was programmed to collect data

    at the following ten frequencies:

    750 1410 2370 4050 6030 8250 10890 14430 19450 23970 Hz
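    To give a feel for what a response sampled at these ten frequencies might look like, the following minimal sketch evaluates a one-term version of the frequency-domain model of Section 2.1.1, assuming the form $H(\omega) = a + \sum_n b_n \omega / (\omega - j\omega_n)$. The function name `emi_response` and the values chosen for $a$, $b_1$, and $\omega_1$ are hypothetical placeholders, not measured quantities.

```python
import math

# The ten GEM-3 operating frequencies listed above, in Hz.
GEM3_FREQS_HZ = [750, 1410, 2370, 4050, 6030, 8250, 10890, 14430, 19450, 23970]

def emi_response(omega, a, poles):
    """Evaluate H(omega) = a + sum_n b_n * omega / (omega - 1j * omega_n).

    `poles` is a list of (b_n, omega_n) pairs; all values here are illustrative.
    """
    return a + sum(b * omega / (omega - 1j * w_n) for b, w_n in poles)

# Hypothetical non-ferrous target (a = 0) with one decay rate at 2*pi*5 kHz.
poles = [(1.0, 2 * math.pi * 5000.0)]
response = [emi_response(2 * math.pi * f, 0.0, poles) for f in GEM3_FREQS_HZ]

# Real part = in-phase component, imaginary part = quadrature component.
for f, h in zip(GEM3_FREQS_HZ, response):
    print(f"{f:6d} Hz  in-phase={h.real:+.3f}  quadrature={h.imag:+.3f}")
```

    Under these assumptions the in-phase component rises monotonically with frequency, while the quadrature component peaks near the assumed decay rate.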

    A sensor that operates at multiple frequencies has the advantage of being able

    to see at multiple depths into the medium, since a low-frequency signal will penetrate

    further into the medium than a high-frequency signal. It has been previously

    shown that the GEM-3 performs significantly better for discriminating landmines

    from clutter than several other sensors at blind government run test sites [43].

    It has also been established that different types of landmines generate unique

    frequency-domain signatures, which are relatively independent of target-sensor ori-

    entation and distance for high metal content objects [20, 44]. However, the signatures

    are dependent on target-sensor orientation and distance if the object's metal content

    is low [43]. Recent work has also shown that these signatures change when the objects

    are buried [43]. The goal of this research is to develop algorithms that reduce the effects

    of the soil on the measured signal and maximize the detection and classification

    of landmines using their frequency-domain EMI signatures.

    2.2 Data Collection

    The GEM-3 data used in this work was taken from a government test site in Virginia.

    The site is segmented into a large (50 m x 20 m) grid consisting of squares measuring

    1 meter per side. Before the site was used as a testing ground, all of the anthropic clutter

    was systematically removed from it. Some clutter was subsequently replaced to

    provide discrete opportunities for clutter-induced false alarms. At the center of each

    1 m x 1 m grid square, a landmine, a clutter item, or nothing is emplaced. Ground

    truth, i.e. the object buried in each square, is sequestered for this area and is known

    only to the government sponsor. A separate area measuring 25 meters by 5 meters

    was designated for sensor calibration and algorithm testing. The ground truth for the

    calibration section is available to the public so that algorithms can be tested prior to

    application on the blind grid.

    The calibration data used for algorithm training in this work was recorded from

    various spots throughout the calibration grid. In all, 20 clutter responses and 27

    landmine signatures from 12 different landmine types at varying depths were col-

    lected from the calibration lanes. Data from 980 potential targets was measured in

    the blind grid. In the calibration lanes, where the ground truth is known, two back-

    ground measurements were taken from either side of the center target location as

    shown in Fig. 2.1. In the blind grid, measurements alternated between background

    and potential targets at locations shown in Fig. 2.2. All of the central and background

    measurements were taken by human operators. Although the sensor height is

    approximately constant across all measurements, variations are bound to exist due to

    uneven ground, operator height and posture, and other factors. Thus, sensor height

    is essentially a random variable.

    Figure 2.1: Calibration Lane Data Collection (diagram dimensions: 1 m x 1 m)

    In summary, for each calibration-grid data point two unique background signals

    were measured on each side of the possible target location. For each blind grid data

    point there are two shared background signals for each square with the exception of

    the first and last squares from each column.
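    As a rough illustration of how the two background measurements associated with each target location could be used, the sketch below subtracts their average from the target measurement, frequency by frequency. Averaging the two backgrounds is an assumption made here for illustration; the function name `remove_background` and the numerical values are hypothetical.

```python
def remove_background(target, background_a, background_b):
    """Subtract the mean of two background measurements, frequency by frequency.

    Each argument is a list of complex values (in-phase + 1j * quadrature),
    one per measurement frequency.
    """
    return [t - 0.5 * (a + b) for t, a, b in zip(target, background_a, background_b)]

# Made-up complex responses at three frequencies.
target = [5 + 2j, 6 + 3j, 7 + 4j]
bg_left = [1 + 1j, 1 + 1j, 2 + 2j]
bg_right = [3 + 1j, 1 + 3j, 2 + 0j]

estimate = remove_background(target, bg_left, bg_right)
print(estimate)
```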

    Table 2.1 indicates the depths and number of occurrences of each landmine type

    in the calibration area. In the table, "HE" means high explosive present. For additional

    information regarding the data collection, please see the Hand Held Metallic

    Mine Detector Performance Baselining Collection Plan [45, 46].

    Figure 2.2: Blind Lane Data Collection (diagram dimensions: 1 m x 1 m)

    Mine type     Number of measurements    Depth range (in)
    VS-50         5                         0 - 2.25
    TS-50         3                         0 - 1.75
    M-14          3                         0.25 - 1.75
    M-14 (HE)     2                         0.5 - 1.125
    PMA-3         2                         0 - 1.5
    VAL69         1                         0
    VS-2.2        2                         1.50 - 3
    M-19          2                         1.25 - 2.5
    TMA-4         2                         1.75 - 3
    TM62P3        2                         1.50 - 3
    T-72          1                         1.25
    TM-46         1                         3
    VS1.6         1                         1

    Table 2.1: Calibration grid landmine type and depth specifications

    2.3 Parameter Estimation and the Cramer-Rao Lower Bound

    Estimating an unknown parameter from data is a research topic that has been studied

    extensively [47, 27, 48]. This section discusses two standard approaches to parameter

    estimation and the Cramer-Rao bound, which places limits on the best possible unbiased

    estimator.

    Consider a data set $x$ consisting of data points $x_i$ drawn from some distribution

    $F$ with parameter $\theta$: $F(x, \theta)$. The goal of an estimator is to predict the value of

    $\theta$ using only the set of data given and (possibly) some prior knowledge of $F$. The

    estimated value is then referred to as $\hat{\theta}$. $\hat{\theta}$ is said to be an unbiased estimator of

    $\theta$ if $E(\hat{\theta}|x) = \theta$ (where $E$ represents the expected value). $\hat{\theta}$ is said to be a consistent

    estimator of $\theta$ if the variance of $\hat{\theta}$, $E((\hat{\theta} - \theta)^2|x)$, tends toward zero with probability

    one as the size of the data set grows to infinity.

    There are two common approaches to parameter estimation: Bayesian and Maxi-

    mum Likelihood [47]. In Bayesian estimation one assumes a prior distribution on the

    parameter of interest, $F(\theta)$. One then considers the distribution $F_x(x|\theta)$, and

    $$\hat{\theta} = E(\theta|x) = \int \theta\, f(\theta|x)\, d\theta \tag{2.3}$$

    where:

    $$f(\theta|x) = \frac{f(x|\theta)\, f(\theta)}{f(x)} \tag{2.4}$$

    In maximum likelihood estimation one considers the density $f(x, \theta)$ and, given a

    set of data $x$, chooses $\hat{\theta}$ to maximize $f(x, \theta)$.
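    The difference between the two approaches can be seen in a small worked case. The sketch below assumes Gaussian data with known variance and, for the Bayesian estimate, a Gaussian prior on the mean; under those assumptions the maximum likelihood estimate is the sample mean and the posterior mean has a closed form. The function names and all numbers are illustrative only.

```python
def ml_estimate(x):
    """ML estimate of a Gaussian mean with known variance: the sample mean."""
    return sum(x) / len(x)

def bayes_estimate(x, sigma2, prior_mean, prior_var):
    """Posterior mean E(theta | x) for a N(prior_mean, prior_var) prior on theta
    and i.i.d. N(theta, sigma2) observations."""
    precision = 1.0 / prior_var + len(x) / sigma2
    return (prior_mean / prior_var + sum(x) / sigma2) / precision

x = [9.0, 11.0, 10.5, 9.5]   # made-up measurements
theta_ml = ml_estimate(x)
theta_bayes = bayes_estimate(x, sigma2=1.0, prior_mean=0.0, prior_var=4.0)
print(theta_ml, theta_bayes)
```

    With a prior centered at zero, the Bayesian estimate is pulled slightly toward zero relative to the sample mean; the pull weakens as more data are collected.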

    Often it is difficult to derive, implement, or show that the optimal estimator ex-

    ists for a given problem [47]. Although consistency guarantees that the variance of

    an estimate tends to zero, the variance of some estimators approaches zero more

    quickly than that of others. It is useful to determine whether a given estimator

    approaches or achieves the performance of the best possible estimator; the Cramer-Rao

    lower bound (CRLB) provides such a tool [47, 49]. The CRLB is a measure of the

    smallest variance that an unbiased estimator can achieve on a given set of data. If an

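    estimator achieves this bound, the estimator is the best unbiased estimator. Consider

    an estimator $\hat{\theta}$ of some parameter $\theta$. Further, consider a set of data $X = \{x_i\}$ drawn

    from the density $f(x_i, \theta)$. In mathematical terms, the CRLB states that the variance

    of an unbiased estimator satisfies:

    $$\mathrm{VAR}(\hat{\theta}) \ge \frac{1}{J(\theta)} \tag{2.5}$$

    where $J$ is the Fisher information, defined as:

    $$J(\theta) = E\left[\left(\frac{\partial}{\partial\theta} \ln f(x; \theta)\right)^2\right] \tag{2.6}$$

    An alternative formulation of $J(\theta)$ is given in [48] as:

    $$J(\theta) = -E\left[\frac{\partial^2}{\partial\theta^2} \ln f(x|\theta) \,\Big|\, \theta\right] \tag{2.7}$$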
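    A short numerical check can make the bound concrete. For $N$ independent samples from $\mathcal{N}(\theta, \sigma^2)$ with known $\sigma^2$, the Fisher information is $J(\theta) = N/\sigma^2$, so the bound is $\sigma^2/N$, and the sample mean attains it. The sketch below compares the bound with a Monte Carlo estimate of the sample mean's variance; the sample size, trial count, seed, and function names are arbitrary choices made for illustration.

```python
import random
import statistics

def crlb_gaussian_mean(sigma, n):
    """CRLB for the mean of n i.i.d. N(theta, sigma^2) samples: 1/J = sigma^2 / n."""
    return sigma ** 2 / n

def sample_mean_variance(theta, sigma, n, trials, seed=0):
    """Monte Carlo estimate of the variance of the sample-mean estimator."""
    rng = random.Random(seed)
    estimates = [
        statistics.fmean(rng.gauss(theta, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.pvariance(estimates)

theta, sigma, n = 10.0, 1.0, 25
bound = crlb_gaussian_mean(sigma, n)
empirical = sample_mean_variance(theta, sigma, n, trials=20000)
print(f"CRLB = {bound:.4f}, empirical variance = {empirical:.4f}")
```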

    2.4 The Detection Problem: Likelihood Ratios and Generalized Likelihood Ratios

    This thesis is primarily concerned with the detection of signals in noise. This

    section reviews the optimal solution to the hypothesis testing problem, the likelihood

    ratio test, and a sub-optimal version of this test, the generalized likelihood ratio test.

    2.4.1 The Likelihood Ratio Test

    In most binary decision problems, one has a set of data and wishes to determine

    which of two separate distributions the data was drawn from. The two hypotheses

    are generally termed $H_0$ and $H_1$, or the null and alternative hypotheses, respectively.

    The likelihood ratio is the optimal decision statistic for a wide range of decision

    problems [48] and is defined as:

    $$\Lambda(x) = \frac{p(x|H_1)}{p(x|H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \eta \tag{2.8}$$

    The null hypothesis is accepted if $\Lambda(x)$ is less than a certain threshold, $\eta$; otherwise

    the alternative hypothesis is accepted.

    Determining the optimal threshold value to use depends on the performance criterion

    chosen. The two most commonly used performance criteria are the Neyman-Pearson

    criterion and the Bayes criterion [48].
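    As a minimal sketch of threshold selection under the Neyman-Pearson criterion, consider a scalar observation that is $\mathcal{N}(0, \sigma^2)$ under the null hypothesis and $\mathcal{N}(\mu, \sigma^2)$ with $\mu > 0$ under the alternative. In this case the likelihood ratio is monotonically increasing in $x$, so thresholding the ratio is equivalent to thresholding $x$ itself, and the threshold can be set directly from a desired false alarm probability. The function names and numerical values below are assumptions chosen for illustration.

```python
from statistics import NormalDist

def neyman_pearson_threshold(sigma, p_fa):
    """Threshold on x such that P(x > threshold | H0) = p_fa, with x ~ N(0, sigma^2) under H0."""
    return NormalDist(0.0, sigma).inv_cdf(1.0 - p_fa)

def detection_probability(mu, sigma, threshold):
    """P(x > threshold | H1), with x ~ N(mu, sigma^2) under H1."""
    return 1.0 - NormalDist(mu, sigma).cdf(threshold)

sigma, mu, p_fa = 1.0, 2.0, 0.05   # illustrative values only
eta = neyman_pearson_threshold(sigma, p_fa)
p_d = detection_probability(mu, sigma, eta)
print(f"threshold = {eta:.3f}, P_FA = {p_fa:.2f}, P_D = {p_d:.3f}")
```

    Sweeping the false alarm probability from 0 to 1 and recording the resulting detection probability traces out a receiver operating characteristic for this simple test.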

    2.4.2 The Generalized Likelihood Ratio Test

    The standard likelihood ratio test assumes that the conditional distributions of the

    data under the two hypotheses are known. Often this assumption is invalid. When

    the two probability density functions are not known or are difficult to estimate,

    the Generalized Likelihood Ratio Test (GLRT) is often utilized. The GLRT is an

    intuitive (although not optimal) mechanism by which to approach the problem of

    unknown distributions in a two-hypothesis decision scenario. Consider again the two

    probability density functions, except assume that some parameter, denoted $\theta$,

    associated with the probability density function $p$ is unknown:

    $$p(x|H_1) \rightarrow p(x|\theta, H_1) \tag{2.9}$$

    $$p(x|H_0) \rightarrow p(x|\theta, H_0) \tag{2.10}$$

    The likelihood ratio is [47]:

    $$\Lambda(x) = \frac{\int p(x|\theta, H_1)\, p(\theta|H_1)\, d\theta}{\int p(x|\theta, H_0)\, p(\theta|H_0)\, d\theta} \tag{2.11}$$

    In practice, the calculation of this integral is often difficult, or if $p(\theta|H_1)$ is unknown,

    impossible. One sub-optimal solution results from substituting estimates of

    the unknown parameter into the density functions. This formulation is termed the generalized

    likelihood ratio test [48]:

    $$\Lambda_G(x) = \frac{p(x|\theta, H_1)\big|_{\theta = \hat{\theta}_1}}{p(x|\theta, H_0)\big|_{\theta = \hat{\theta}_0}} \tag{2.12}$$
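    A simple case where the substitution can be carried out in closed form is a known signal shape with unknown amplitude in white Gaussian noise: $H_0: x = n$ versus $H_1: x = \theta s + n$. The maximum likelihood estimate of the amplitude is $\hat{\theta} = s^T x / s^T s$, and twice the log of the generalized likelihood ratio reduces to $(s^T x)^2 / (\sigma^2 s^T s)$. The sketch below implements this special case; it is an illustrative example under these stated assumptions, with a hypothetical function name and made-up data.

```python
def glrt_unknown_amplitude(x, s, sigma2):
    """GLRT for H0: x = n versus H1: x = theta*s + n with unknown amplitude theta.

    n is white Gaussian noise with known variance sigma2. theta is replaced by
    its maximum likelihood estimate (s.x)/(s.s).
    Returns (theta_hat, statistic), where statistic = 2*ln(generalized LR).
    """
    s_dot_x = sum(sj * xj for sj, xj in zip(s, x))
    s_dot_s = sum(sj * sj for sj in s)
    theta_hat = s_dot_x / s_dot_s
    statistic = s_dot_x ** 2 / (sigma2 * s_dot_s)
    return theta_hat, statistic

s = [1.0, 2.0, 3.0, 2.0, 1.0]               # made-up signal shape
x_signal = [0.5 * sj for sj in s]           # noiseless H1 example, theta = 0.5
x_orthogonal = [1.0, 0.0, 0.0, 0.0, -1.0]   # orthogonal to s: no evidence for H1

print(glrt_unknown_amplitude(x_signal, s, sigma2=1.0))
print(glrt_unknown_amplitude(x_orthogonal, s, sigma2=1.0))
```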


  • 2.4.3 The Matched Filter

    One simple and commonly encountered hypothesis testing problem involves deter-

    mining the presence of a known signal s in the presence of additive zero-mean white

    noise. In this case, the likelihood ratio reduces to a filter known as a correlation

    detector or matched filter [48].

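    Let $s$ and $n$ be length-$i$ vectors consisting of the known signal and statistically

    independent $\mathcal{N}(0, I\sigma_n^2)$ noise, respectively. Consider a received data vector $x$. Under

    the null and alternative hypotheses:

    $$H_0 : x = n$$

    $$H_1 : x = s + n$$

    The distributions of $x$ under $H_0$ and $H_1$ are:

    $$f(x|H_0) = \prod_{j=1}^{i} \frac{1}{\sqrt{2\pi\sigma_n^2}} \exp\left(-\frac{x_j^2}{2\sigma_n^2}\right)$$

    $$f(x|H_1) = \prod_{j=1}^{i} \frac{1}{\sqrt{2\pi\sigma_n^2}} \exp\left(-\frac{(x_j - s_j)^2}{2\sigma_n^2}\right)$$

    The likelihood ratio (equation 2.8) is

    $$\Lambda(x) = \prod_{j=1}^{i} \exp\left(\frac{2 x_j s_j - s_j^2}{2\sigma_n^2}\right)$$

    Taking the natural logarithm and incorporating the known values ($\sigma_n^2$, $s_j$) into the

    threshold ($\eta$) yields:

    $$\Lambda(x) = \sum_{j=1}^{i} x_j s_j \underset{H_0}{\overset{H_1}{\gtrless}} \eta$$

    which is the well-known matched filter.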
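    The resulting detector is only a few lines of code. The sketch below computes the correlation statistic $\sum_j x_j s_j$ for a noise-only vector and a signal-plus-noise vector. The signal shape, noise level, and seed are made up, and the midpoint threshold $s^T s / 2$ (which corresponds to a likelihood ratio threshold of one) is just one possible choice.

```python
import random

def matched_filter(x, s):
    """Correlation statistic sum_j x_j * s_j for received data x and known signal s."""
    return sum(xj * sj for xj, sj in zip(x, s))

rng = random.Random(1)
s = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]   # hypothetical known signal
sigma_n = 0.5                                     # assumed noise standard deviation

noise_only = [rng.gauss(0.0, sigma_n) for _ in s]                 # H0: x = n
signal_plus_noise = [sj + rng.gauss(0.0, sigma_n) for sj in s]    # H1: x = s + n

# Midpoint threshold s.s / 2, equivalent to a likelihood ratio threshold of 1.
eta = 0.5 * matched_filter(s, s)
for name, x in [("H0 data", noise_only), ("H1 data", signal_plus_noise)]:
    t = matched_filter(x, s)
    print(name, round(t, 3), "-> decide", "H1" if t > eta else "H0")
```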


  • 2.5 Linear Algebra Preliminaries and Matched Subspace Detectors

    The common matched filter is a special case of a more general class of filters termed

    matched subspace detectors [27]. Scharf's derivation of the matched subspace detectors

    (see [27, 25]) requires some linear algebra preliminaries, which allow him to

    show that the matched subspace detector has many interesting and powerful properties,

    including invariance to rotations in certain subspaces and optimal performance

    under certain assumptions. This section first discusses the linear algebra associated

    with projection matrices (which are an integral part of matched subspace filters). It

    then summarizes Scharf's definitions of invariance and maximal invariant statistics (closely

    following the discussion in [27]), his application of these ideas to the development

    of the matched subspace filter, and his proof that the matched subspace detector

    is a uniformly most powerful test.

    2.5.1 Linear Algebra Preliminaries

    Before discussing matched subspace filters, it is important to review the formation

    and properties of projection matrices.

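    The span of a set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N\}$ is defined as the set of all linear combinations of $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N\}$. A vector $\mathbf{b}$ is then an element of the span of $\{\mathbf{v}_i\}$ if and only if the equation:

    $$\mathbf{b} = a_1 \mathbf{v}_1 + a_2 \mathbf{v}_2 + \ldots + a_N \mathbf{v}_N \tag{2.13}$$

    has a solution. When the vectors $\{\mathbf{v}_i\}$ are considered columns in a matrix $H$, the span of $\{\mathbf{v}_i\}$ is equivalent to the subspace denoted by $\langle H \rangle$. The orthogonal complement of $\langle H \rangle$ is denoted $\langle H \rangle^{\perp}$.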

  • A projection matrix $E$ is a square matrix that gives a projection onto a given subspace. The projection onto a subspace $\langle H \rangle$ is denoted as $E_H$. A projection matrix must be idempotent (equal to its own square):

    $$E^2 = E \tag{2.14}$$

    An orthogonal projection matrix has the additional constraint of being Hermitian (equal to its Hermitian transpose). Such projections are denoted with the letter $P$:

    $$P^H = P \tag{2.15}$$

    The most common orthogonal projection matrices are the Cartesian coordinate pro-

    jections in

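  • An orthogonal projection onto $\langle H \rangle$ maps vectors contained in the subspace $\langle H \rangle$ to themselves, and maps vectors lying in $\langle H \rangle^{\perp}$ to the zero vector. This can be seen using the Cartesian projections:

    $$P_x \begin{bmatrix} c \\ 0 \end{bmatrix} = \begin{bmatrix} c \\ 0 \end{bmatrix} \tag{2.22}$$

    $$P_x \begin{bmatrix} 0 \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{2.23}$$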
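    The projection properties described in this section can be checked numerically. The sketch below is an illustration under stated assumptions: it builds an orthonormal basis for the span of the columns of a hypothetical matrix H by Gram-Schmidt and applies the orthogonal projection as a sum of basis coordinates, rather than forming the projection matrix explicitly. The function names and the example vectors are not from the thesis.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(columns, tol=1e-12):
    """Gram-Schmidt: orthonormal vectors spanning the same subspace as `columns`."""
    basis = []
    for v in columns:
        w = list(v)
        for u in basis:
            c = dot(u, w)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = dot(w, w) ** 0.5
        if norm > tol:                       # skip linearly dependent columns
            basis.append([wi / norm for wi in w])
    return basis


def project(x, basis):
    """Orthogonal projection of x onto the span of an orthonormal basis."""
    out = [0.0] * len(x)
    for u in basis:
        c = dot(u, x)
        out = [oi + c * ui for oi, ui in zip(out, u)]
    return out


# Hypothetical H whose two columns span the x-y plane of a 3-D space.
H_columns = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
basis = orthonormal_basis(H_columns)

p = project([3.0, 4.0, 5.0], basis)
print(p)                                     # component inside the subspace
print(project(p, basis))                     # projecting twice changes nothing
print(project([2.0, -1.0, 0.0], basis))      # vector already in the subspace
print(project([0.0, 0.0, 7.0], basis))       # vector in the orthogonal complement
```

    Applying `project` twice gives the same result as applying it once, which is the idempotence property of equation 2.14.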

    2.5.2 Invariance of Hypothesis Testing Problems

    In many decision problems, there are parameters associated with the probability

    distribution functions of the measured signals which are considered nuisance pa-

    rameters. In these cases it is desirable to reduce the set of viable decision rules to

    those which are (in some sense) invariant to changes in the nuisance parameters.

    As Scharf states:

    This leads to the key idea behind invariance in hypothesis testing: When

    presented with nuisance parameters that are extraneous to the hypothesis

    test, look for transformations of the measured data that would introduce

    these nuisance parameters and then look for a decision rule that is invari-

    ant to these transformations. [27] pg. 128

    Consider the hypothesis testing problem of determining if $X$ was drawn from $F_{\theta_1}(x)$ or $F_{\theta_0}(x)$. If for every $g$ in $G$:

    $$x : F_\theta(x) \tag{2.24}$$

    $$y = g(x) \tag{2.25}$$

    $$F_\theta(y) = P_\theta[g(X) \leq y] \tag{2.26}$$

    where $F_\theta(y)$ is the distribution of $y$ with parameter $\theta$, and

    $$F_\theta(y) = F_{g(\theta)}(y) \tag{2.27}$$

    (that is, if the only effect of the function $g(x)$ on the distribution $F_\theta(x)$ is to change the parameter from $\theta$ to $g(\theta)$) then the family of distributions for which equation 2.27 holds is said to be invariant to $G$. Also, if the transformation $g$ maintains the dichotomy between $H_1$ and $H_0$, the hypothesis testing problem is said to be invariant to $G$.

    2.5.3 Invariance Tests and Maximal Invariant Statistics

    A hypothesis test $\phi$ is invariant to $G$ if $\phi(g(x)) = \phi(x)$ [27]. Furthermore, a statistic $M$ is maximally invariant if

    $$M(g(x)) = M(x) \text{ for all } g \text{ in } G \quad \text{(invariant)} \tag{2.28}$$

    $$M(x_1) = M(x_2) \text{ implies } x_1 = g(x_2) \text{ for some } g \text{ in } G \quad \text{(maximal)} \tag{2.29}$$

    Thus, all invariant tests may be written as a function of a maximally invariant statistic [27]:

    $$\phi(x) = \phi(M(x)) \tag{2.30}$$

    These results are important for the landmine detection problem because they show that when deriving a decision rule for all invariant hypothesis testing problems, it is possible to consider only functions of a maximal invariant statistic.

    2.5.4 Matched Subspace Detectors

    In this section a review of Scharf's work is presented which shows that the problem statement leading up to the matched filter is naturally invariant to a set of transformations and that the matched subspace detector is a maximal invariant statistic. Scharf's explanation of why the matched subspace detector is uniformly most powerful is also reviewed.

    In a detection problem, the exact form of the signal of interest is often unknown. The signal may be subject to an arbitrary gain, or it may be a random (unknown) combination of a set of basis vectors. As has been previously noted, a vector $\mathbf{x}$ which lies in the subspace $\langle H \rangle$ can always be represented by a linear combination of a set of vectors comprising the matrix $H$. The signal $\mathbf{x}$ can then be represented as:

    $$\mathbf{x} = \sum_n \theta_n \mathbf{h}_n = H\theta \tag{2.31}$$

    where $H$ is an $N \times P$ matrix and $\theta$ is a $P \times 1$ vector containing the coordinates of $\mathbf{x}$ in $\langle H \rangle$. If the weight vector $\theta$ is known a priori then, since the subspace $\langle H \rangle$ is known, the vector $\mathbf{x}$ is completely determined, and the optimal detector is the matched filter. However, if $\theta$ is unknown, then all that is known about the vector $\mathbf{x}$ is that it lies somewhere in the space spanned by $H$. Under these assumptions, if $\mathbf{x}$ is corrupted by white noise and biased in $\langle H \rangle^{\perp}$, Scharf [25] has shown that the optimal test statistic is:

    $$\chi^2 = \mathbf{x}^T P_H \mathbf{x} \tag{2.32}$$

    Here we summarize his proof.

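    Let $\mathbf{X} = H\theta + \mathbf{N}$ with $\mathbf{N} : N[\mathbf{0}, \sigma^2 \mathbf{I}]$. If a channel also rotates the signal in $\langle H \rangle$ and adds a bias $\mathbf{v}$ in the orthogonal subspace $\langle H \rangle^{\perp}$, this can be described mathematically as:

    $$Q_H(\mathbf{X} + \mathbf{v}) \tag{2.33}$$

    where $Q_H$ is a rotation matrix in $\langle H \rangle$ and $\mathbf{v}$ lies in the subspace $\langle H \rangle^{\perp}$. Note that the rotation of $\mathbf{v}$ leaves $\mathbf{v}$ unchanged (since we are rotating in $\langle H \rangle$), and $H\theta$ is mapped to $H\theta'$, another vector in $\langle H \rangle$. Let

    $$\mathbf{y} = Q_H(\mathbf{X} + \mathbf{v}) \tag{2.34}$$

    $$\mathbf{y} : N[H\theta' + \mathbf{v}, \sigma^2 \mathbf{I}] \tag{2.35}$$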

    The hypothesis test is then to discern between the null hypothesis ($\theta = 0$) and the alternative ($\theta > 0$). As mentioned above, since $Q_H$ and $\mathbf{v}$ are unknown, they are

    considered nuisance parameters and the matched subspace detector should ideally be

    invariant to them. To show that the matched subspace detector is uniformly most

    powerful, Scharf shows that the distribution of y is invariant to these parameters,

    the matched subspace detector is a maximal invariant statistic, and the matched

    subspace detector has a monotone likelihood ratio.

    It can be shown that the hypothesis testing problem in this case is invariant to

    the set of functions

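    $$G = \{g : g(\mathbf{y}) = Q_H(\mathbf{y} + \mathbf{w})\} \tag{2.36}$$

    since the distribution of $Q_H(\mathbf{y} + \mathbf{w})$ is

    $$N[H\theta'' + \mathbf{v} + \mathbf{w}, \sigma^2 \mathbf{I}] \tag{2.37}$$

    and the distribution of $\mathbf{y}$ is given by eq. 2.35 (primes denote coordinates after a rotation within $\langle H \rangle$). Note that the form of the distribution has not changed (only the mean parameter has been altered), thus the distribution of $\mathbf{y}$ is invariant to $G$. Also, since the transformation of the parameter ($H\theta' + \mathbf{v}$) is:

    $$g(H\theta' + \mathbf{v}) = H\theta'' + \mathbf{v} + \mathbf{w} \tag{2.38}$$

    the transformations of the hypotheses are:

    $$g(H_0) = \mathbf{v} + \mathbf{w} = H_0 \tag{2.39}$$

    and

    $$g(H_1) = H\theta'' + \mathbf{v} + \mathbf{w} = H_1 \tag{2.40}$$

    the dichotomy of the original parameter space is maintained, and the hypothesis testing problem is $G$-invariant.

    To show that the matched subspace statistic

    $$\chi^2 = M(\mathbf{y}) = \mathbf{y}^T P_H \mathbf{y} \tag{2.41}$$

    is maximal invariant to the group $G$, Scharf shows that eq. 2.28 and 2.29 hold with:

    $$g(\mathbf{y}) = Q_H(\mathbf{y} + \mathbf{v}) \tag{2.42}$$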

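    For eq. 2.28:

    $$(Q_H(\mathbf{y} + \mathbf{v}))^T P_H (Q_H(\mathbf{y} + \mathbf{v})) \tag{2.43}$$

    since $Q_H^T Q_H = I$ and $Q_H$ rotates only within $\langle H \rangle$ (so it commutes with $P_H$):

    $$= (\mathbf{y} + \mathbf{v})^T P_H (\mathbf{y} + \mathbf{v}) \tag{2.44}$$

    and since $\mathbf{v}$ is in $\langle H \rangle^{\perp}$:

    $$= \mathbf{y}^T P_H \mathbf{y} \tag{2.45}$$

    For eq. 2.29:

    $$\mathbf{y}_1^T P_H \mathbf{y}_1 = \mathbf{y}_2^T P_H \mathbf{y}_2 \tag{2.46}$$

    note that the quadratic form involving $P_H$ is the energy of the vectors in the subspace $\langle H \rangle$. Since the energies of both $\mathbf{y}_1$ and $\mathbf{y}_2$ in the subspace $\langle H \rangle$ are the same, $\mathbf{y}_2$ must be a rotation of $\mathbf{y}_1$ and/or differ only in the subspace $\langle H \rangle^{\perp}$. Thus:

    $$\mathbf{y}_1 = Q_H(\mathbf{y}_2 + \mathbf{v}) \tag{2.47}$$

    for some $Q_H$ and $\mathbf{v}$.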
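    The invariance argument can be checked on a small example. The sketch below assumes a three-dimensional space in which the signal subspace is the x-y plane, so that a rotation within the subspace is a rotation about the z axis and a bias in the orthogonal complement is a shift along z. The function names and numbers are hypothetical.

```python
import math


def energy_in_subspace(y, basis):
    """y^T P y for an orthonormal basis: the sum of squared coordinates."""
    return sum(sum(ui * yi for ui, yi in zip(u, y)) ** 2 for u in basis)


def rotate_in_xy_plane(y, angle):
    """Rotation within the x-y plane; the z component is untouched."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * y[0] - s * y[1], s * y[0] + c * y[1], y[2]]


def g(y, angle, v):
    """The transformation g(y) = Q(y + v) from the text."""
    shifted = [yi + vi for yi, vi in zip(y, v)]
    return rotate_in_xy_plane(shifted, angle)


basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # orthonormal basis of the subspace
y = [1.0, 2.0, 3.0]

before = energy_in_subspace(y, basis)
# Bias along z lies in the orthogonal complement: the statistic is unchanged.
after = energy_in_subspace(g(y, 0.7, [0.0, 0.0, -4.5]), basis)
# Bias along x lies inside the subspace: the statistic changes.
changed = energy_in_subspace(g(y, 0.7, [1.0, 0.0, 0.0]), basis)
print(before, after, changed)
```

    The third value differs from the first two because a bias inside the signal subspace is not one of the transformations the statistic is invariant to.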

    Since the statistic $\chi^2/\sigma^2$ ($\chi^2$ from eq. 2.41) is essentially the squared norm of a Gaussian-distributed vector, it can be shown (see [27]) that it is distributed as a chi-squared random variable. By the Karlin-Rubin theorem, since all $\chi^2$ random variables have monotone likelihood ratios, the $\chi^2$ test is uniformly most powerful [27].

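    In the above discussions, the variance of the noise ($\sigma^2$) has been assumed to be known. If this is not the case, then the maximal invariant statistic becomes:

    $$F = \frac{\mathbf{x}^T P_H \mathbf{x}}{\mathbf{x}^T P_H^{\perp} \mathbf{x}} \tag{2.48}$$

    or

    $$F = \frac{\mathbf{x}^T P_H \mathbf{x}}{\mathbf{x}^T (I - P_H) \mathbf{x}} \tag{2.49}$$

    Furthermore, note that the constant false alarm rate matched filter can be described using a cosine statistic as [50]:

    $$\cos^2 = \frac{\mathbf{x}^T P_H \mathbf{x}}{\mathbf{x}^T \mathbf{x}} \tag{2.50}$$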

    Although matched subspace detectors are significantly more complicated than the

    special case of the matched filter, they provide a wide range of invariances and are

    significantly more robust than matched filters when the signal of interest is not known

    exactly, as is the case in the particular problem of landmine detection.
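    As a rough illustration of equations 2.32, 2.49 and 2.50, the sketch below evaluates the three statistics for a rank-one subspace, that is, a subspace spanned by a single template vector h. The rank-one restriction, the function name, and the numbers are assumptions made to keep the example short. It also shows that F and the cosine statistic do not change when the data is scaled by an unknown gain, while the energy statistic does.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def subspace_statistics(x, h):
    """Energy, F and cosine statistics for the subspace spanned by h."""
    energy_in = dot(h, x) ** 2 / dot(h, h)    # x^T P_H x for rank-one <H>
    energy_total = dot(x, x)                  # x^T x
    energy_out = energy_total - energy_in     # x^T (I - P_H) x, zero if x is in <H>
    return {
        "chi2": energy_in,
        "F": energy_in / energy_out,
        "cos2": energy_in / energy_total,
    }


h = [1.0, 2.0, 2.0]          # hypothetical template spanning the subspace
x = [2.0, 1.0, 3.0]          # hypothetical measurement

stats = subspace_statistics(x, h)
scaled = subspace_statistics([3.0 * xi for xi in x], h)
print(stats)
print(scaled)
```

    Scaling the measurement by 3 multiplies the energy statistic by 9 but leaves the two ratio statistics untouched, which is the gain invariance that matters when the burial depth is unknown.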

    2.6 Support Vector Machines

    Support vector machines (SVMs) are a relatively new type of learning machine that

    have many interesting properties [29, 32, 28, 31]. Support vector machines operate

    by mapping the data of interest to a high dimensional space and generating a sep-

    arating hyperplane in that space. The high dimensional separating hyperplane can

    then be used for hypothesis testing. In this section, we describe the mathematics

    associated with SVMs and review how they avoid the complexities usually associated

    with decision making in a high dimensional space.

  • 2.6.1 Problem Statement and the Vapnik-Chervonenkis Dimension

    Assume that a set of training vectors $\{\mathbf{x}_i\}$ is available, drawn from some probability density function $P(\mathbf{x}, y)$ where $y \in Y : \{-1, 1\}$. Here, $y$ represents the classification of the training data into one of two sets or hypotheses. Let $y = -1$ correspond to $H_0$ and $y = 1$ correspond to $H_1$. Then consider the set of training data:

    $$(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_N, y_N)$$

  • Unfortunately, the equations governing the VC dimension are complicated and

    usually not of practical value [29]. If the search for f is restricted to linear forms:

    $$f(\mathbf{x}) = (\mathbf{w} \cdot \mathbf{x}) + b \tag{2.54}$$

    (hyperplanes in some space), it can be shown [28] that the VC dimension is bounded

    by the minimal distance from the hyperplane to a data point; this distance is called

    the margin.

    2.6.2 Kernel Functions and Avoiding the Complexities of a High Dimensional Space

    Although the linear restrictions suggested above appear to be somewhat limiting,

    this apparent shortcoming can be overcome by mapping the observed data into high

    dimensional spaces. Consider a function :

  • A simple example from [32] and [29] illustrates this point. Consider a set of data

    distributed in

  • [Figure 2.3: Data separation in 2 Dimensions (axes: X1, X2)]

    [Figure 2.4: Data separation in 3 Dimensions (axes: Z1, Z2, Z3)]

  • Special rules can be applied to determine if a function is a valid kernel. In this thesis, we restrict ourselves to polynomial and Gaussian functions of the 2-norm of the data. These are valid kernel functions by Mercer's theorem [29].
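    The idea that a kernel returns a dot product in the mapped space without performing the mapping can be illustrated with a standard textbook case. The sketch below assumes the degree-2 map from two to three dimensions with components x1^2, sqrt(2) x1 x2, x2^2; this specific map and the kernel width parameter are assumptions for illustration and are not taken from the thesis.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Assumed explicit degree-2 feature map from 2-D to 3-D."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


def poly_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: (x . y)^2."""
    return dot(x, y) ** 2


def gaussian_kernel(x, y, width=1.0):
    """Gaussian kernel of the 2-norm of (x - y); `width` is a free parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))


x = [1.0, 2.0]
y = [3.0, -1.0]

explicit = dot(phi(x), phi(y))    # map to 3-D first, then take the dot product
implicit = poly_kernel(x, y)      # same number, computed entirely in 2-D
print(explicit, implicit, gaussian_kernel(x, y))
```

    The first two printed values agree, so the 3-D dot product never has to be formed explicitly. For the Gaussian kernel the corresponding feature space is infinite-dimensional, so only the kernel form is practical.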

    2.6.3 Finding the Optimal Hyperplane

    Previous work, including the illustrative example above, has shown that in some cases

    mapping data into higher dimensions may decrease the complexity of the data separa-

    tion problem. Furthermore, kernel functions provide a tool to obtain dot products of

    vectors in high-dimensional spaces without actually performing the high-dimensional

    mapping. However, a technique for determining the optimal hyperplane, so as to achieve the best possible performance, has not yet been presented. In order to find the optimal hyperplane, the discussion given in [29] is reviewed.

    Optimal performance, and thus the optimal hyperplane, can be found by minimizing the expected risk. Since the expected risk is generally unknown, the optimal hyperplane is found by minimizing the upper bound on the expected risk via [28]:

    $$R[f] \leq R_{emp}[f] + \sqrt{\frac{h\left(\ln\frac{2n}{h} + 1\right) - \ln\frac{\delta}{4}}{n}} \tag{2.59}$$

    with probability of at least $1 - \delta$ for $n > h$, where $h$ is the VC dimension of the function class $F$.

    If the training data is assumed to be perfectly separable by $f$, then $R_{emp}[f]$ is zero, and the risk is bounded by a monotonic function of the VC dimension $h$ [29].

    Furthermore, Vapnik has shown [28] that for linear classifiers (like the one determined by the optimal hyperplane) the VC dimension itself is bounded by a monotonic function of $\|\mathbf{w}\|$. Thus, one can find the optimal hyperplane by minimizing $\|\mathbf{w}\|$ while maintaining perfect training data separation:

    $$y_i((\mathbf{w} \cdot \Phi(\mathbf{x}_i)) + b) \geq 1, \quad i = 1, \ldots, n. \tag{2.60}$$

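  • This minimization is complicated, but through Lagrange multipliers, it is possible to arrive at the following quadratic programming formula [29, 32, 28]:

    $$\max_{\alpha} \; \alpha^T \mathbf{1} - \frac{1}{2} \alpha^T D \alpha \tag{2.61}$$

    subject to:

    $$\alpha^T Y = 0 \tag{2.62}$$

    $$\alpha \geq 0 \tag{2.63}$$

    where:

    $$\mathbf{1}^T = [1, \ldots, 1] \tag{2.64}$$

    $$\alpha^T = [\alpha_1, \ldots, \alpha_n] \tag{2.65}$$

    $$\mathbf{w} = \sum_{i=1}^{n} \alpha_i y_i \Phi(\mathbf{x}_i) \tag{2.66}$$

    $$D_{ij} = y_i y_j k(\mathbf{x}_i, \mathbf{x}_j) \tag{2.67}$$

    $k$ being the kernel function.

    The decision statistic is then:

    $$f(\mathbf{x}) = \text{sign}\left[\sum_{i=1}^{n} y_i \alpha_i (\Phi(\mathbf{x}) \cdot \Phi(\mathbf{x}_i)) + b\right] \tag{2.68}$$

    or:

    $$f(\mathbf{x}) = \text{sign}\left[\sum_{i=1}^{n} y_i \alpha_i k(\mathbf{x}, \mathbf{x}_i) + b\right] \tag{2.69}$$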

    In the above discussion it is assumed that the available training data is perfectly separable by a hyperplane in $F$. If this is not the case, a hyperplane that is a solution to:

    $$\max_{\alpha} \; \alpha^T \mathbf{1} - \frac{1}{2}\left[\alpha^T D \alpha + \frac{\alpha_{max}^2}{C}\right] \tag{2.70}$$

    (subject to the same constraints) must be determined.

    There is a substantial body of literature on solving the quadratic programming problem (for a list of references, see [29]). In this work, we use Cawley's SVM package (available from [30] or [51]). It achieves good performance by splitting the quadratic optimization problem into mini-problems of size two using the sequential minimal optimization technique [29].
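    To make the quadratic program concrete, the sketch below solves its smallest possible instance: one training point per class. With two points the equality constraint forces both multipliers to be equal, and the maximum of equation 2.61 has a closed form. This is only an illustration of the dual and of the kernel decision statistic in equation 2.69; it is not the sequential minimal optimization algorithm and not Cawley's package, and all names and values are hypothetical.

```python
def linear_kernel(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_two_point_svm(x_neg, x_pos, kernel):
    """Closed-form dual solution for one point labelled -1 and one labelled +1."""
    k_nn = kernel(x_neg, x_neg)
    k_pp = kernel(x_pos, x_pos)
    k_np = kernel(x_neg, x_pos)
    # alpha^T Y = 0 forces alpha_neg = alpha_pos = alpha, so the objective is
    # 2*alpha - 0.5 * alpha^2 * d2, with d2 the squared distance in feature space.
    d2 = k_nn - 2.0 * k_np + k_pp
    alpha = 2.0 / d2                       # maximizer; positive, so alpha >= 0 holds
    b = 1.0 - alpha * (k_pp - k_np)        # puts the +1 point exactly on the margin
    return alpha, b


def decision_value(x, x_neg, x_pos, alpha, b, kernel):
    """Argument of sign[...] in the kernel form of the decision statistic."""
    return alpha * (kernel(x, x_pos) - kernel(x, x_neg)) + b


def classify(x, x_neg, x_pos, alpha, b, kernel):
    return 1 if decision_value(x, x_neg, x_pos, alpha, b, kernel) >= 0 else -1


x_neg, x_pos = [0.0, 0.0], [2.0, 0.0]
alpha, b = train_two_point_svm(x_neg, x_pos, linear_kernel)
print(alpha, b)
print(decision_value(x_pos, x_neg, x_pos, alpha, b, linear_kernel))
print(decision_value(x_neg, x_neg, x_pos, alpha, b, linear_kernel))
print(classify([3.0, 1.0], x_neg, x_pos, alpha, b, linear_kernel))
```

    Both training points land exactly on the margin (decision values +1 and -1), so both are support vectors. Swapping `linear_kernel` for another valid kernel changes only the three kernel evaluations.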

  • Chapter 3

    The Cramer-Rao Lower Bound

    The response of the ground to wideband EMI sensors is a random vector b which

    depends upon the makeup of the soil and the height of the sensor above the ground.

    When measuring the EMI responses of buried targets in the earth, the variability in

    the background response degrades our received signal. Thus, the measured response

    from a buried M-14 landmine will differ significantly depending on the composition

    of the soil under which the landmine is buried [43]. Since landmines are found

    throughout the world in varying environments, background interference adversely

    affects one's ability to define a robust non-adaptive decision algorithm.

    One approach to reducing the effect of the background response is to take mea-

    surements near the potential target and use these measurements to estimate the

    background signal at the target location. In this chapter we discuss several models of

    the received background data and show that under certain assumptions the Cramer-

    Rao lower bound can be achieved by using the available background measurements to

    remove an estimate of the background signature from the potential target location.

    In the measurements from the site in Virginia, two background signals were taken

    for each potential target (see figures 2.1 and 2.2). We will assume that the background

    response at the site is constant over a distance of one meter. This allows us to

    model the background response as constant over the potential target location and

    two neighboring background measurements. The assumption that the background

    response is constant is reasonable since the composition of the soil is not expected

    to change substantially over one meter and it has been shown [43] that sensor drift

    occurs over a longer time span than would be required to take EMI readings over a one meter square. For examples of in-phase and quadrature background signals, see figure 3.1.

    [Figure 3.1: Typical in-phase and quadrature background measurements versus log-frequency. Plot title: "Typical Inphase and Quadrature Background Signals vs. LogFrequency"; x-axis: LogFrequency; y-axis: Response; legend: Quadrature, Inphase.]

    This chapter is divided into four sections each considering a different model of

    the background response: additive zero mean white Gaussian noise, additive white

    Gaussian noise with an additive constant term across frequencies, additive white

    Gaussian noise with an additive non-constant variance term across frequencies, and

    additive white Gaussian noise with a multiplicative term across frequencies.

    3.1 Additive White Noise

    For each target we have three measurements from the GEM-3. They will be denoted

    $\mathbf{s}_i$ and are modeled as:

    $$\mathbf{s}_1 = \mathbf{n}_1 + \mathbf{b} \tag{3.1}$$

    $$\mathbf{s}_2 = \mathbf{n}_2 + \mathbf{b} + \theta\mathbf{r} \tag{3.2}$$

    $$\mathbf{s}_3 = \mathbf{n}_3 + \mathbf{b} \tag{3.3}$$

    where

    $\mathbf{b}$ is some unknown (but constant across the three measurements) vector representing the ground response

    $\mathbf{n}_i$ is additive zero-mean white Gaussian noise [43]

    $\mathbf{r}$ is the response of a buried target

    $\theta$ represents an arbitrary (non-negative) gain affecting the target response due to the target's depth beneath the ground and the sensor's height above the ground

    The hypothesis test will be to decide between $\theta > 0$ and $\theta = 0$. First, we are concerned with obtaining the best estimate of $\mathbf{b}$ so that we can estimate $\mathbf{r}$ via

    $$\hat{\mathbf{r}} = \mathbf{s}_2 - \hat{\mathbf{b}}. \tag{3.4}$$

    We propose the estimator

    $$\hat{\mathbf{b}} = \frac{\mathbf{s}_1 + \mathbf{s}_3}{2}. \tag{3.5}$$

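    This estimator is widely used in practice [43], but little analysis has been performed to evaluate its statistical properties. First we must show that $\hat{\mathbf{b}}$ is unbiased. This is easily shown by:

    $$E[\hat{\mathbf{b}}] = E\left[\frac{\mathbf{s}_1 + \mathbf{s}_3}{2}\right] \tag{3.6}$$

    $$= \frac{1}{2} E[\mathbf{n}_1 + \mathbf{b} + \mathbf{n}_3 + \mathbf{b}] \tag{3.7}$$

    $$= \mathbf{b}. \tag{3.8}$$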

    Note that $\mathbf{b}$ is a vector. In the following mathematical treatment we exploit the assumption that the interfering noise is always white [43], so the measurements between data points are uncorrelated. We use $b_i$ to represent the $i$th element of $\mathbf{b}$ and show that our estimators satisfy our criteria for general $b_i$ and thus for $\mathbf{b}$ (also, $x_{ji}$ represents the $i$th data point in vector $\mathbf{x}$ from ground measurement $j \in \{1, 2, 3\}$).

    The variance of $\hat{b}_i$ is:

    $$\text{VAR}[\hat{b}_i] = E[(\hat{b}_i - b_i)^2] \tag{3.9}$$

    $$= E[\hat{b}_i^2] - b_i^2 \tag{3.10}$$

    $$= \frac{1}{4} E[(s_{1i} + s_{3i})^2] - b_i^2 \tag{3.11}$$

    $$= \frac{1}{4} E[n_{1i}^2 + n_{3i}^2 + 4 n_{1i} b_i + 4 n_{3i} b_i + 2 n_{1i} n_{3i} + 4 b_i^2] - b_i^2 \tag{3.12}$$

    $$= \frac{\sigma_n^2}{2} \tag{3.13}$$
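    The unbiasedness and the variance result can be checked by simulation. The following is a minimal Monte Carlo sketch under the additive white noise model of this section; the background values, the noise level, the number of trials, and the function names are illustrative assumptions, not measured data.

```python
import random
import statistics


def estimate_background(s1, s3):
    """Average of the two neighboring background measurements."""
    return [(a + c) / 2.0 for a, c in zip(s1, s3)]


def simulate_estimator(b, sigma_n, trials, seed=0):
    """Return the per-frequency sample mean and variance of the estimate."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        s1 = [bi + rng.gauss(0.0, sigma_n) for bi in b]   # s1 = n1 + b
        s3 = [bi + rng.gauss(0.0, sigma_n) for bi in b]   # s3 = n3 + b
        estimates.append(estimate_background(s1, s3))
    per_frequency = list(zip(*estimates))
    means = [statistics.mean(col) for col in per_frequency]
    variances = [statistics.pvariance(col) for col in per_frequency]
    return means, variances


b = [-40.0, -35.0, -20.0]    # hypothetical background response at three frequencies
sigma_n = 2.0                # noise standard deviation, so sigma_n^2 / 2 = 2.0

means, variances = simulate_estimator(b, sigma_n, trials=20000)
print(means)
print(variances)
```

    The sample means should sit close to the true background values and the sample variances close to half the noise variance.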

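  • To determine optimality, we must show that the variance of $\hat{b}_i$ achieves the CRLB (eq. 2.5), using eq. 2.7 for the Fisher information. Since $\mathbf{s}_1$ and $\mathbf{s}_3$ are distributed as $N(\mathbf{b}, \sigma_n^2 \mathbf{I})$, we have:

    $$J(b_i) = -E_{b_i}\left[\left(\frac{\partial}{\partial b_i}\right)^2 \ln(f(s_{1i}, s_{3i}|b_i))\right] \tag{3.14}$$

    Simplifying from the inside out:

    $$f(s_{1i}, s_{3i}|b_i) = C \exp\left(\frac{-(s_{1i} - b_i)^2 - (s_{3i} - b_i)^2}{2\sigma_n^2}\right) \tag{3.15}$$

    $$\ln(f(s_{1i}, s_{3i}|b_i)) = \ln(C) - \frac{1}{2\sigma_n^2}(s_{1i}^2 - 2 s_{1i} b_i + b_i^2 + s_{3i}^2 - 2 s_{3i} b_i + b_i^2) \tag{3.16}$$

    $$\frac{\partial}{\partial b_i} \ln(f(s_{1i}, s_{3i}|b_i)) = -\frac{1}{2\sigma_n^2}(-2 s_{1i} + 2 b_i - 2 s_{3i} + 2 b_i) \tag{3.17}$$

    differentiating again yields:

    $$-\frac{2}{\sigma_n^2} \tag{3.18}$$

    Finally, taking the expected value and multiplying by $-1$, we have

    $$J(b_i) = \frac{2}{\sigma_n^2} \tag{3.19}$$

    And the CRLB is satisfied:

    $$\frac{1}{J(b_i)} = \frac{\sigma_n^2}{2} = \text{VAR}(\hat{b}_i) \tag{3.20}$$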

    Thus we have the optimal estimator of $\mathbf{b}$ given $\mathbf{s}_1$ and $\mathbf{s}_3$.

    [Figure 3.2: Typical in-phase background measurements visibly shifted by some constant. Plot title: "Typical Inphase Background Signals vs. LogFrequency"; x-axis: LogFrequency; y-axis: Response.]

    In analyzing the experimental data, we noted that the data received from the

    GEM-3 processor for adjacent background measurements was more variable than

    could be accounted for by additive zero-mean Gaussian noise. The in-phase readings

    appeared to be shifted by some additive constant across the frequency range, and

    the quadrature readings appeared to either be corrupted by an additive term with

    variance that increases across the frequency range, or have some small multiplicative

    noise effects. For examples of these effects, see figures 3.2 and 3.3.

    For clarity, we will refer to the additive-noise quadrature model as quadrature

    model 1, and the multiplicative-noise quadrature model as quadrature model 2. Intuitively, it is reasonable to assume that the in-phase and quadrature signals should be subject to the same noise effects (additive, multiplicative, etc.). However, it is unclear which statistical assumptions better model the background interference. For completeness, we present the Cramer-Rao lower bound derivations for both cases. We proceed to determine whether the previously posed estimator is still optimal when the assumptions regarding the statistics of the noise are modified.

    [Figure 3.3: Typical quadrature background measurements corrupted by some multiplicative constant, or some additive term which increases with frequency. Plot title: "Typical Quadrature Background Signals vs. LogFrequency"; x-axis: LogFrequency; y-axis: Response.]

    3.2 Additive White Noise and DC Term (in-phase)

    For the in-phase case we will model the extra interference as a random DC term $c_j$ with variance $\sigma_c^2$:

    $$\mathbf{s}_1 = \mathbf{n}_1 + \mathbf{b} + c_1 \tag{3.21}$$

    $$\mathbf{s}_2 = \mathbf{n}_2 + \mathbf{b} + \theta\mathbf{r} + c_2 \tag{3.22}$$

    $$\mathbf{s}_3 = \mathbf{n}_3 + \mathbf{b} + c_3 \tag{3.23}$$

    We assume that the $c_j$ are distributed as $N(0, \sigma_c^2)$. Note that while the $\mathbf{b}$ and $\mathbf{n}$ vectors are functions of frequency, the DC terms $c_j$ are constant across frequency. Under these assumptions, the $\mathbf{s}_i$ are distributed $N(\mathbf{b}, \mathbf{I}(\sigma_n^2 + \sigma_c^2))$. It is easy to show that the estimator $\hat{\mathbf{b}}$ is unbiased, and that its variance is $\sigma_{\hat{b}}^2 = \frac{\sigma_n^2 + \sigma_c^2}{2}$. From the distribution of $f(s_{1i}, s_{3i}|b_i)$, we can show that the form of the CRLB corresponding to equation 3.16 is:

    $$\ln(f(s_{1i}, s_{3i}|b_i)) = \ln(C) - \frac{1}{2(\sigma_n^2 + \sigma_c^2)}[(s_{1i} - b_i)^2 + (s_{3i} - b_i)^2] \tag{3.24}$$

    Differentiating twice with respect to $b_i$ yields the equation corresponding to 3.18:

    $$-\frac{2}{\sigma_n^2 + \sigma_c^2}. \tag{3.25}$$

    Multiplying by negative one and taking the inverse, we again find the CRLB equal

    to the variance of the estimator and the estimator is thus optimal under the in-phase

    hypothesis.

    3.3 Additive White Noise and Additive Function of Frequency (model 1 quadrature)

    This derivation is very similar to the in-phase model. In fact, the in-phase model of

    an additive DC term is really a special case of the general additive vector encountered

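    here. In this model, the extra interference is modeled as a vector $\mathbf{c}_j$ whose individual terms $c_{ji}$ have variance $\sigma_{c_i}^2$:

    $$\mathbf{s}_1 = \mathbf{n}_1 + \mathbf{b} + \mathbf{c}_1 \tag{3.26}$$

    $$\mathbf{s}_2 = \mathbf{n}_2 + \mathbf{b} + \theta\mathbf{r} + \mathbf{c}_2 \tag{3.27}$$

    $$\mathbf{s}_3 = \mathbf{n}_3 + \mathbf{b} + \mathbf{c}_3 \tag{3.28}$$

    From the observed data, we can see that the variance of the $c_{ji}$ increases with frequency. We assume that the $c_{ji}$ are distributed as $N(0, \sigma_{c_i}^2)$. Let $\boldsymbol{\sigma}_c^2$ be the vector of $c_i$ variances.

    $$\boldsymbol{\sigma}_c^2 = \left[\sigma_{c_1}^2, \ldots, \sigma_{c_i}^2, \ldots, \sigma_{c_n}^2\right]^T \tag{3.29}$$

    The $\mathbf{c}_j$ vectors are distributed as $N(\mathbf{0}, \mathbf{I}\boldsymbol{\sigma}_c^2)$. Under these assumptions, the $\mathbf{s}_j$ are distributed $N(\mathbf{b}, \mathbf{I}(\sigma_n^2 + \boldsymbol{\sigma}_c^2))$. It is easy to show that the estimator $\hat{\mathbf{b}}$ is unbiased, and that its variance is $\sigma_{\hat{b}_i}^2 = \frac{\sigma_n^2 + \sigma_{c_i}^2}{2}$. The distribution of the individual $s_{ji}$ is:

    $$f(s_{1i}, s_{3i}|b_i) = C \exp\left(\frac{-(s_{1i} - b_i)^2 - (s_{3i} - b_i)^2}{2(\sigma_n^2 + \sigma_{c_i}^2)}\right) \tag{3.30}$$

    $$\ln(f(s_{1i}, s_{3i}|b_i)) = \ln(C) - \frac{1}{2(\sigma_n^2 + \sigma_{c_i}^2)}(s_{1i}^2 - 2 s_{1i} b_i + b_i^2 + s_{3i}^2 - 2 s_{3i} b_i + b_i^2) \tag{3.31}$$

    $$\frac{\partial}{\partial b_i} \ln(f(s_{1i}, s_{3i}|b_i)) = -\frac{1}{2(\sigma_n^2 + \sigma_{c_i}^2)}(-2 s_{1i} + 2 b_i - 2 s_{3i} + 2 b_i) \tag{3.32}$$

    differentiating again yields:

    $$-\frac{2}{\sigma_n^2 + \sigma_{c_i}^2} \tag{3.33}$$

    Taking the expected value and multiplying by $-1$, we have

    $$J(b_i) = \frac{2}{\sigma_n^2 + \sigma_{c_i}^2} \tag{3.34}$$

    And the CRLB is satisfied:

    $$\frac{1}{J(b_i)} = \frac{\sigma_n^2 + \sigma_{c_i}^2}{2} = \text{VAR}(\hat{b}_i) \tag{3.35}$$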
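    A short simulation can illustrate the frequency-dependent case. The sketch below assumes three frequency bins with interference standard deviations that grow with frequency, and compares the sample variance of the averaged estimate with the bound in equation 3.35. All values and function names are hypothetical. Setting every interference standard deviation to the same value gives the per-element variance of the in-phase DC model.

```python
import random
import statistics


def crlb_additive(sigma_n, sigma_c_i):
    """Bound for one frequency bin: (sigma_n^2 + sigma_ci^2) / 2."""
    return (sigma_n ** 2 + sigma_c_i ** 2) / 2.0


def sample_variances(b, sigma_n, sigma_c, trials, seed=1):
    """Sample variance of (s1 + s3) / 2 in each frequency bin."""
    rng = random.Random(seed)
    columns = [[] for _ in b]
    for _ in range(trials):
        for i, (bi, sc) in enumerate(zip(b, sigma_c)):
            s1 = bi + rng.gauss(0.0, sigma_n) + rng.gauss(0.0, sc)
            s3 = bi + rng.gauss(0.0, sigma_n) + rng.gauss(0.0, sc)
            columns[i].append((s1 + s3) / 2.0)
    return [statistics.pvariance(col) for col in columns]


b = [-10.0, -30.0, -60.0]     # hypothetical background response
sigma_n = 1.0
sigma_c = [0.0, 1.0, 2.0]     # assumed to increase with frequency

bounds = [crlb_additive(sigma_n, sc) for sc in sigma_c]
observed = sample_variances(b, sigma_n, sigma_c, trials=20000)
for bound, obs in zip(bounds, observed):
    print(bound, obs)
```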

    3.4 Additive White Noise and Multiplicative Term (model 2 quadrature)

    We now consider the quadrature case and assume that multiplicative Gaussian noise

    is affecting the measured background signals. In this model, the multiplicative scaling

    effects known to affect target responses are also assumed to affect the background

    responses. This makes this model perhaps the most intuitively satisfying of all the

    statistical models presented.

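    The multiplicative noise terms affecting the background responses are denoted $k_j$ and are assumed to be distributed as $N(1, \sigma_k^2)$. The received signals are modeled as:

    $$\mathbf{s}_1 = \mathbf{n}_1 + k_1 \mathbf{b} \tag{3.36}$$

    $$\mathbf{s}_2 = \mathbf{n}_2 + k_2 \mathbf{b} + \theta\mathbf{r} \tag{3.37}$$

    $$\mathbf{s}_3 = \mathbf{n}_3 + k_3 \mathbf{b} \tag{3.38}$$

    Note that $\mathbf{s}_1$ and $\mathbf{s}_3$ are distributed $N(\mathbf{b}, (b^2 \sigma_k^2 + \sigma_n^2)\mathbf{I})$. Furthermore, the estimator $\hat{\mathbf{b}} = \frac{\mathbf{s}_1 + \mathbf{s}_3}{2}$ is still unbiased.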

    Since the mean value (bi) enters the signal distribution in the variance as well as

    the mean, the calculations are more complicated. Since we assume that the noise

    interference is white, we can consider the scalar equivalents of the pdf. The variance

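    of $\hat{b}_i$ is given by:

    $$\text{VAR}[\hat{b}_i] = E[\hat{b}_i^2] - b_i^2 \tag{3.39}$$

    $$= E\left[\left(\frac{n_{1i} + k_{1i} b_i + n_{3i} + k_{3i} b_i}{2}\right)^2\right] - b_i^2 \tag{3.40}$$

    $$= \frac{1}{4} E[n_{1i}^2 + 2 n_{1i} k_{1i} b_i + 2 n_{1i} n_{3i} + 2 n_{1i} k_{3i} b_i + k_{1i}^2 b_i^2 + 2 k_{1i} b_i n_{3i} + 2 k_{1i} b_i^2 k_{3i} + n_{3i}^2 + 2 n_{3i} k_{3i} b_i + k_{3i}^2 b_i^2] - b_i^2 \tag{3.41}$$

    Taking the expected value, we obtain:

    $$\text{VAR}[\hat{b}_i] = \frac{\sigma_{s_i}^2}{2} \tag{3.42}$$

    with

    $$\sigma_{s_i}^2 = (b_i^2 \sigma_k^2 + \sigma_n^2) \tag{3.43}$$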

    To determine optimality, we begin with the conditional probability density function:

    $$f(s_{1i}, s_{3i}|b_i) = \frac{1}{2\pi\sigma_{s_i}^2} \exp\left[-\frac{1}{2\sigma_{s_i}^2}((s_{1i} - b_i)^2 + (s_{3i} - b_i)^2)\right] \tag{3.44}$$

    and apply equation 3.16. After taking the natural logarithm, the equation can be separated into two terms from the coefficient and exponential portions of equation 3.44:

    $$\ln\left(\frac{1}{2\pi\sigma_{s_i}^2}\right) - \frac{1}{2\sigma_{s_i}^2}((s_{1i} - b_i)^2 + (s_{3i} - b_i)^2) \tag{3.45}$$

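  • Differentiating equation 3.45 twice with respect to $b_i$ yields:

    $$\frac{\begin{aligned} & 6\sigma_n^2 b_i^2 \sigma_k^2 - 2\sigma_n^4 + 2 s_{1i} b_i^3 \sigma_k^4 - 6 s_{1i} b_i \sigma_k^2 \sigma_n^2 + 2 s_{3i} b_i^3 \sigma_k^4 - 6 s_{3i} b_i \sigma_k^2 \sigma_n^2 \\ & - 3 s_{3i}^2 \sigma_k^4 b_i^2 + s_{3i}^2 \sigma_k^2 \sigma_n^2 - 3 \sigma_k^4 s_{1i}^2 b_i^2 + s_{1i}^2 \sigma_k^2 \sigma_n^2 + 2 b_i^4 \sigma_k^6 - 2 \sigma_k^2 \sigma_n^4 \end{aligned}}{(b_i^2 \sigma_k^2 + \sigma_n^2)^3} \tag{3.46}$$

    The expected value operator then replaces $s_{ji}^2$ with $\sigma_{s_i}^2 + b_i^2$ and $s_{ji}$ with $b_i$, yielding:

    $$\frac{-2(2 b_i^2 \sigma_k^4 + b_i^2 \sigma_k^2 + \sigma_n^2)}{(b_i^2 \sigma_k^2 + \sigma_n^2)^2} \tag{3.47}$$

    The Cramer-Rao lower bound is given by:

    $$\frac{1}{\dfrac{2(2 b_i^2 \sigma_k^4 + b_i^2 \sigma_k^2 + \sigma_n^2)}{(b_i^2 \sigma_k^2 + \sigma_n^2)^2}} \tag{3.48}$$

    or:

    $$\frac{1}{2} \cdot \frac{\sigma_{s_i}^4}{2 b_i^2 \sigma_k^4 + \sigma_{s_i}^2} \tag{3.49}$$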

    Note that in this case, our estimator does not achieve the Cramer-Rao lower bound. In order to determine how close the variance of the proposed estimator is to the variance of the optimal estimator, consider the term:

    $$2 b_i^2 \sigma_k^4 \tag{3.50}$$

    in the denominator. Since this term differentiates the CRLB from the variance of the proposed estimator, as the term approaches zero, the difference between the variances becomes negligible.

  • [Figure 3.4: Plots of the Cramer-Rao lower bound, calculated, and sample estimator variances versus the standard deviation of $k$. Parameters: $b_i = 10$, $\sigma_n^2 = 1$. Plot title: "Comparison of CRLB, Sample, and Calculated variances vs. $\sigma_k$"; x-axis: $\sigma_k$ (from 1e-005 to 0.3); y-axis: estimator variance (0 to 5); legend: CRLB, Sample Variance, $(\sigma_k^2 b^2 + \sigma_n^2)/2$.]

    To determine how well the proposed estimator performs compared to the CRLB,

    a set of data was generated under the proposed assumptions and the actual (sample)

    variance of the estimator was compared with the theoretically calculated variance

    of the estimator and the Cramer-Rao lower bound. Figure 3.4 shows the Cramer-

    Rao lower bound, the sample variance from a set of ten thousand data points, and

    the calculated variance of the estimator ($\sigma_s^2/2$). Note that the difference between the CRLB and the sample and computed variances is small, especially for small $\sigma_k^2$ values. In experiments, almost all estimated $\sigma_k^2$ values were found to be below 0.1 (except for the lowest frequency measurement which, due to near-zero average magnitude, had a high estimated $\sigma_k^2$). Thus, despite not achieving the CRLB, the proposed estimator

    is expected to perform well on this data set.
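    A comparison of this kind can be sketched as follows. The code evaluates the bound of equation 3.49 and the estimator variance of equation 3.42 for the parameters quoted in the figure caption ($b_i = 10$, $\sigma_n^2 = 1$) and draws ten thousand simulated estimates per setting. The grid of $\sigma_k$ values, the random seed, and the function names are assumptions of this sketch, and it makes no claim to reproduce the exact experiment behind the figure.

```python
import random
import statistics


def crlb_multiplicative(b, sigma_k, sigma_n):
    """0.5 * sigma_s^4 / (2 b^2 sigma_k^4 + sigma_s^2)."""
    var_s = b * b * sigma_k ** 2 + sigma_n ** 2
    return 0.5 * var_s ** 2 / (2.0 * b * b * sigma_k ** 4 + var_s)


def estimator_variance(b, sigma_k, sigma_n):
    """sigma_s^2 / 2 with sigma_s^2 = b^2 sigma_k^2 + sigma_n^2."""
    return (b * b * sigma_k ** 2 + sigma_n ** 2) / 2.0


def sample_variance(b, sigma_k, sigma_n, trials, seed=2):
    """Sample variance of (s1 + s3) / 2 under the multiplicative model."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        s1 = rng.gauss(0.0, sigma_n) + rng.gauss(1.0, sigma_k) * b
        s3 = rng.gauss(0.0, sigma_n) + rng.gauss(1.0, sigma_k) * b
        estimates.append((s1 + s3) / 2.0)
    return statistics.pvariance(estimates)


b_i, sigma_n = 10.0, 1.0
rows = []
for sigma_k in (0.0, 0.1, 0.2, 0.3):
    rows.append((
        sigma_k,
        crlb_multiplicative(b_i, sigma_k, sigma_n),
        estimator_variance(b_i, sigma_k, sigma_n),
        sample_variance(b_i, sigma_k, sigma_n, trials=10000),
    ))
for row in rows:
    print(row)
```

    At $\sigma_k = 0$ the bound and the estimator variance coincide, and the gap between them widens as $\sigma_k$ grows, consistent with the role of the term in equation 3.50.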

  • We have shown that the intuitive estimation procedure that involves subtracting

    the mean of the received background signals is optimal under three different assump-

    tions regarding the underlying stochastic nature of the received signals:

    1. if the signal is corrupted by additive white noise

    2. if the signal is corrupted by additive white noise and a Gaussian-distributed

    additive DC term (in-phase)

    3. if the signal is corrupted by additive white noise and a Gaussian-distributed

    additive vector (quadrature model 1)

    and although not optimal, the intuitive procedure is a low-variance estimate when the

    signal is corrupted by additive white noise and a Gaussian-distributed multiplicative

    term (quadrature model 2). In the following chapters we will utilize the proposed

    estimation technique to obtain estimates of the actual target responses for use in our

    detection algorithms.

  • Chapter 4

    Signal Processing Using Matched

    Subspace Detectors

    In chapter 3 we proposed an estimator of the background signal b which is an optimal

    or low-variance estimator under several models of the underlying stochastic processes.

    Using this estimator, we can now estimate the target response via

    $$\hat{\mathbf{r}} = \mathbf{s}_2 - \hat{\mathbf{b}}. \tag{4.1}$$

    Using this target response estimate, a detection algorithm that distinguishes between

    landmines and clutter and between different landmine types can be developed. In this

    section the application of matched subspace filters to correctly identify and classify

    landmines is presented.

    4.1 Properties of Estimated Landmine Responses

    We begin by inspecting the responses of different landmine types. As expected, the

    landmines all have unique wideband EMI signatures [20].

    Figure 4.1 shows the estimated in-phase and quadrature responses of five VS-50

    landmines which were obtained by subtracting the estimated background as suggested

    in Chapter 3. These five landmines were buried at depths from 0 to 1.875 inches.

    Figure 4.2 shows the estimated in-phase and quadrature responses of three M-14

    landmines which were obtained in the same manner. These three landmines were

    buried at depths from 0.25 to 1.75 inches.

    The responses of different landmine types are distinguishable from one another

    and, as has been shown (see [20, 43]), the general shape of the responses stays constant across measurements despite differences in target-sensor orientation and mine depth.

    [Figure 4.1: Signatures of VS-50 landmines versus log-frequency. Plot title: "Estimated VS50 Landmine Responses vs. LogFrequency"; x-axis: LogFrequency; y-axis: Response; legend: Inphase, Quadrature.]

    Note that the final data point, corresponding to 23,970 Hz, in the estimated sig-

    nals appears to be markedly out of place - especially in the quadrature measurements.

    Comparisons to previous work on landmine responses and the theoretical treatment

    of responses given in chapter 2 led us to believe that the final data points are dis-

    torted. Whether this corruption is a function of the sensor (it is operating at the

    very limit of its frequency range), the additional noise inherent to measurements at

    these frequencies, or user error is unclear. Because the highest frequency measurement appears to be erroneous, the final data point is excluded in the work that

    follows.

    Although the landmine responses are discernible from one another and maintain their approximate shape despite differences in their depth, it is clear that the energies of the responses from any particular landmine type vary widely. This problem is inherent in real-fielded landmine detection: the depth at which a landmine is buried substantially alters the energy of the received signal [36]. This is particularly evident in the quadrature responses of high metal-content mines like the VS-50 (see figure 4.1). This signal distortion can be modeled as an uncertainty parameter in the distributions of our data. Consider an unknown parameter $\theta$ which acts as a multiplicative gain on the received data. Physically, $\theta$ represents the depth at which the landmine is buried. An effective detector should be robust or invariant to changes in the uncertainty parameter $\theta$. The matched subspace detector is such a detector [52].

    [Figure 4.2: Signatures of M-14 landmines versus log-frequency. Plot title: "Estimated M14 Landmine Responses vs. LogFrequency"; x-axis: LogFrequency; y-axis: Response; legend: Inphase, Quadrature.]

  • 4.2 Basis Estimation

    In order to apply a matched subspace detector, a linear subspace containing the received signals is needed. Equivalently, a set of basis functions that spans the responses from a particular landmine type must be found.

    Estimating a signal subspace is a well-studied problem [27], but the maximum likelihood solution was not appropriate in this situation. The maximum likelihood estimate of a signal subspace consists of the p eigenvectors of the sample covariance matrix that correspond to its p largest eigenvalues [27]. However, the calibration data available often contained only between one and three instances of any particular landmine type. The sample covariance matrix in this case would clearly be inaccurate. Furthermore, if it is assumed that variation in target-sensor distance leads primarily to a change in the gain of the received signals, we can very easily model the subspace in a much simpler fashion: as scaled versions of a mean vector.

    Figure 4.3 shows an actual M-14 landmine quadrature response, the mean of all M-14 landmine responses, and an estimate of the actual M-14 response formed from a scaled version of the mean. The error in the resulting signal estimate is about 0.7% of the original signal's energy. In this particular case the estimation of a landmine response as a scaled version of the mean of all landmine responses is very accurate, and this result holds across all different landmine types (although the technique performs significantly better on the quadrature data).

    The decision to model the different responses as scaled versions of a single re-

    sponse is also intuitively satisfying, since it applies a simple law to account for

    distance-induced differences in measurements. Furthermore, the scaling relationship

    associated with target-sensor distance is well known [36, 35].
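    The scaled-mean model described above can be illustrated with a short sketch. This is not the code used in this work: the function names and the toy numbers are hypothetical, and the fit is assumed to be an ordinary least-squares fit, in which the gain for a response x and a mean vector m is g = (m·x)/(m·m). The residual energy is reported as a fraction of the energy of x, which is the kind of error figure quoted above.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def mean_response(responses):
    """Element-wise mean of several responses of one landmine type."""
    n = len(responses)
    return [sum(col) / n for col in zip(*responses)]


def scaled_mean_fit(x, m):
    """Least-squares fit of x by g * m.

    Returns the gain g, the estimate g * m, and the residual energy
    expressed as a fraction of the energy of x.
    """
    g = dot(m, x) / dot(m, m)
    estimate = [g * mi for mi in m]
    residual = [xi - ei for xi, ei in zip(x, estimate)]
    return g, estimate, dot(residual, residual) / dot(x, x)


# Toy data (made up): three responses that are roughly scaled copies.
responses = [
    [2.0, 4.1, 6.0, 8.0],
    [1.0, 2.0, 3.1, 4.0],
    [4.0, 8.0, 12.0, 15.8],
]
m = mean_response(responses)
g, estimate, rel_err = scaled_mean_fit(responses[1], m)
print(f"gain = {g:.3f}, relative residual energy = {100 * rel_err:.3f}%")
```

    When the responses really are close to scaled copies of one another, the relative residual energy is small, which is the property the one-dimensional subspace model relies on.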

    [Figure 4.3: Actual, mean, and estimated signatures of M-14 landmines. Plot title: Mean, Actual, and Estimated M14 Responses vs. LogFrequency; x-axis: LogFrequency; y-axis: Response; curves: Mean M14 Response, Actual M14 Response, Estimated M14 Response.]

    4.3 Designing the Matched Subspace Filter

    The clutter present in the blind grid poses a unique problem to traditional subspace detection techniques. Clutter is by nature difficult to classify (it is generally made up of anthropic and natural conductors with an enormous range of sizes and shapes). Also, the calibration data set contained only 20 clutter responses. One approach considered was to model the clutter as a set of basis functions and have a clutter-detection algorithm to compare against our landmine detection algorithm. However, attempts to formulate a basis to model clutter are inherently limited since clutter is composed of an infinite set of possible shapes, sizes, and materials. Despite the wide range of clutter, which impedes most detection techniques, a matched subspace detector should be somewhat naturally robust to clutter interference. Consider a piece of random clutter whose response is some vector x. Our decision statistic is the cosine statistic (equation 2.50):

    \[ \lambda(x) = \frac{x^T P_H x}{x^T x}. \tag{4.2} \]

    The numerator can be considered a matched-energy detector, since its output is the amount of the energy in x which lies in the subspace spanned by the columns of H. We have assumed that there is only one basis vector in H, corresponding to the mean of the landmine responses for a given landmine type. Therefore, for clutter to register a large response in the detector, it must look much like a scaled version of our landmine response (i.e., lie in the subspace spanned by the mean vector of the landmine responses).
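    The behaviour of the cosine statistic for a single basis vector can be checked numerically. The sketch below is illustrative only and assumes a rank-one subspace, for which the projection is P_H = h h^T / (h^T h) and the statistic reduces to (h^T x)^2 / ((h^T h)(x^T x)). The vectors are made up, and returning 0 for an all-zero input is a convention chosen here, not something stated in the text.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cosine_statistic(x, h):
    """Fraction of the energy of x that lies along the basis vector h."""
    energy = dot(x, x)
    if energy == 0.0:
        return 0.0  # convention of this sketch for an all-zero input
    return dot(h, x) ** 2 / (dot(h, h) * energy)


h = [1.0, 2.0, 3.0]                 # hypothetical mean landmine response
deep_mine = [0.01 * v for v in h]   # same shape, much lower gain
clutter = [3.0, 0.0, -1.0]          # orthogonal to h

print(cosine_statistic(h, h))
print(cosine_statistic(deep_mine, h))
print(cosine_statistic(clutter, h))
```

    A response that is a scaled copy of h scores (numerically) one whatever its gain, while a response orthogonal to h scores zero; this is the gain invariance that motivates the detector.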

    The standard matched subspace detector is appropriate for finding a single landmine type amongst background or clutter (binary hypothesis test). However, the blind grid is populated with various landmine types. In the multiple hypothesis test case our detector must decide between H0 and all the alternative hypotheses {H1, H2, ..., Hn}. The standard likelihood ratio then becomes:

    \begin{align}
    \lambda(x) &= \frac{p(x \mid \{H_1, H_2, \ldots, H_n\})}{p(x \mid H_0)} \tag{4.3} \\
    &= \frac{p(x \mid H_1)\,p(H_1) + p(x \mid H_2)\,p(H_2) + \cdots + p(x \mid H_n)\,p(H_n)}{p(x \mid H_0)} \tag{4.4} \\
    &= \sum_i \lambda_i(x)\, p(H_i) \tag{4.5}
    \end{align}

    where p(Hi) represents the a priori probability of mine type i. Since all mine types are considered equally likely a priori, this reduces (up to a constant factor, which does not affect a threshold test) to:

    \[ \lambda(x) = \sum_i \lambda_i(x) \tag{4.6} \]

    Equation 4.6 suggests implementing a bank of matched subspace filters and summing their outputs to form a decision statistic. However, this formulation also assumes that the distribution p(x|H0) is known, but in this work, the distribution of clutter is unknown and difficult to estimate. As an illustrative example of the problems encountered when p(x|H0) is unknown, consider n matched subspace filters each tuned to a specific landmine type. When a landmine response is presented to the bank of filters, a typical set of outputs contains one large response coinciding with the matched subspace filter tuned to that landmine type. When a clutter response is fed to the same bank of filters, although no filter bank produces a particularly large result, the clutter vector generates significant responses from several different filter banks because the clutter model in the denominator, which would normally offset the numerator, is missing. That clutter induces significant responses from several filter banks makes intuitive sense, since all of the landmine responses, when taken together, span a large subspace and clutter will undoubtedly have some energy in the span of this space. For typical examples of the matched subspace filter bank outputs for clutter and landmine data, see figure 4.4. Note that the sum of the outputs across filter banks for the input clutter vector is larger than the sum for the landmine vector. In this case, a better (although sub-optimal) decision statistic than the summation across the filter banks is the maximum value across the filter banks. Although this technique is not equivalent to the Bayesian solution to the multiple hypothesis test problem, the similarities are evident. The Bayesian solution to the multiple hypothesis testing problem is to choose Hi such that Hi maximizes the a posteriori probability p(Hi|x) [48].

    [Figure 4.4: Comparison of filter bank outputs resulting from landmine and clutter responses. Note that the sum across the filter banks from the clutter response is larger than from the landmine response. Plot title: Matched Subspace Outputs vs. Filter Bank for Landmine and Clutter Responses; x-axis: Filter Bank Number; y-axis: Filter Bank Output; legend: Landmine Filter Bank Outputs, sum = 1.0110; Clutter Filter Bank Outputs, sum = 1.2803.]

    A bank of matched subspace detectors was thus generated, with each filter tuned to a specific landmine type. The decision statistic chosen was the maximum value across the bank of filters:

    \[ \lambda(x) = \max_i \lambda_i(x) \tag{4.7} \]

    Although this statistic is not optimal, we shall see that its performance is very good. Furthermore, the maximum value across the filter banks provides an intuitive method to perform landmine classification: the landmine type corresponding to the largest filter bank output is considered our best guess of the underlying landmine type. This is different from the maximum a posteriori Bayesian solution, which chooses Hi to maximize p(Hi|x); here Hi is chosen to maximize the percent energy of x in the subspace associated with $P_{H_i}$, which is an intuitive measure of p(x|Hi).

    Note that since equation 4.2 contains a normalization term in the denominator ($x^T x$), the maximum output from the detector is one, regardless of the energy of the input vector. This is important in a bank of detectors, since the numerator ($x^T P_H x$) alone will very often produce a large result for a high-energy input x.
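    The filter bank, the maximum-value statistic, and the resulting classification rule described above can be sketched as follows. The basis vectors, labels, and test vectors are hypothetical toy values chosen so that the effect is visible; each filter is assumed to hold a single basis vector, as in the text.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cosine_statistic(x, h):
    """Normalized energy of x along h; at most one (up to rounding)."""
    energy = dot(x, x)
    if energy == 0.0:
        return 0.0  # convention of this sketch for an all-zero input
    return dot(h, x) ** 2 / (dot(h, h) * energy)


def filter_bank(x, bases):
    """Return the per-type outputs, their maximum, and the arg-max label.

    `bases` maps a landmine-type label to its single basis vector.
    """
    outputs = {name: cosine_statistic(x, h) for name, h in bases.items()}
    best = max(outputs, key=outputs.get)
    return outputs, outputs[best], best


# Hypothetical, deliberately non-orthogonal basis vectors.
bases = {
    "type_A": [1.0, 0.0, 0.0],
    "type_B": [0.0, 1.0, 0.0],
    "type_C": [0.0, 1.0, 1.0],
}

mine = [0.2, 0.0, 0.0]      # low-gain copy of type_A
clutter = [0.5, 1.0, 0.6]   # partial overlap with several subspaces

mine_out, mine_stat, mine_type = filter_bank(mine, bases)
clut_out, clut_stat, clut_type = filter_bank(clutter, bases)
print(mine_type, mine_stat, sum(mine_out.values()))
print(clut_type, clut_stat, sum(clut_out.values()))
```

    In this toy case the summed output is larger for the clutter vector than for the landmine vector, while the maximum output is larger for the landmine vector, which mirrors the argument for preferring the maximum over the sum.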

    The invariance that matched subspace detectors provide to gain occasionally has

    some drawbacks. The primary drawback in this work stems from very low energy

    clutter which often looks like deeply buried high-energy landmines. Consider the

    VS-50 landmine (see fig. 4.1) which has a relatively high energy and rather flat fre-

    quency response. A substantial amount of low energy clutter also has a flat frequency

    response. As a result, scaled low energy clutter often looks like a VS-50 landmine

    to a matched subspace detector.

    However, our prior knowledge regarding the depths at which landmines can be buried leads us to conclude that very low energy flat signatures are not landmines buried meters in the ground; rather, they are small pieces of clutter. In this work we assume that landmines will not be buried beyond their tactical depths. We further assume that the distribution of landmine depths in the blind grid is uniform and commensurate with the depths found in the calibration grid. Under these assumptions, we implemented an energy pre-screener that evaluates the energy of each potential target vector to ensure that it is commensurate with the landmine type of the current filter bank (within one order of magnitude of the lowest and highest energies from the calibration grid for that particular landmine type). If the energy is within limits, the subspace detector proceeds normally; otherwise, that particular bank of the subspace detector (wherever the input energy was found to be outside the reasonable range of energies for that landmine type) is manually assigned a low output value.
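    A minimal sketch of such an energy pre-screener is given below. It reflects one reading of the rule above, namely that the allowed range runs from one decade below the lowest calibration energy to one decade above the highest. The function names, the `decades` parameter, and the choice of 0.0 as the "low output value" are assumptions made for illustration.

```python
def energy(x):
    """Energy of a response vector (sum of squared samples)."""
    return sum(v * v for v in x)


def energy_limits(calibration_responses, decades=1.0):
    """Allowed energy range for one landmine type.

    The range runs from `decades` orders of magnitude below the lowest
    calibration energy to `decades` orders above the highest.
    """
    energies = [energy(r) for r in calibration_responses]
    factor = 10.0 ** decades
    return min(energies) / factor, max(energies) * factor


def prescreened_output(x, detector_output, limits, low_value=0.0):
    """Keep the subspace output only if the input energy is plausible.

    `low_value` stands in for the low output assigned otherwise; the
    value 0.0 is an assumption of this sketch.
    """
    lo, hi = limits
    return detector_output if lo <= energy(x) <= hi else low_value


# Toy calibration responses for one hypothetical landmine type.
calibration = [[10.0, 10.0, 10.0], [30.0, 30.0, 30.0]]
limits = energy_limits(calibration)

faint_flat_clutter = [0.1, 0.1, 0.1]
plausible_target = [20.0, 20.0, 20.0]
print(prescreened_output(faint_flat_clutter, 0.97, limits))
print(prescreened_output(plausible_target, 0.97, limits))
```

    Both test vectors have the same flat shape and would receive the same subspace output, so only the energy check separates them.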

    Besides discriminating between clutter and landmines, detection algorithms must also discriminate between empty ground signatures and landmines. While the blind grid contains several blank squares containing neither anthropic clutter nor landmines, no such squares were measured in the calibration grid, so our detector may be subject to false alarms caused by empty grid squares. We did not consider this a serious problem because background-corrected responses from blank grid squares should contain very little energy and be automatically rejected by the energy pre-screener.

    4.4 Matched Subspace Results

    To determine the effectiveness of our matched subspace detector in discriminating landmines from clutter, receiver operating characteristic (ROC) curves were generated for the calibration and blind grids. The calibration grid ROCs were generated manually, and the blind grid ROCs were generated by the government sponsor of the test site. We expect our calibration lane ROCs to be very good since the filter was trained on that data, while good results from the blind grid would be an indicator of the algorithm's robustness.

    Before sending our results to be scored, the algorithm was run on the calibration grid to determine its effectiveness. Two separate detectors utilizing the in-phase and quadrature data were created and tested. As can be seen in figure 4.5, the algorithm performs significantly better on the quadrature data than on the in-phase data. In fact, the in-phase results are not significantly better than those of a simple energy detector. We believe the poor in-phase performance is due to the relatively high amount of noise inherent in the in-phase readings. Alternatively, the in-phase data may be more difficult to model as a linear combination of a set of vectors. Future efforts that may improve the in-phase processor results are discussed in chapter 7.

    [Figure 4.5: Comparison of in-phase and quadrature matched subspace receiver operating characteristics from the calibration grid. x-axis: Pf; y-axis: Pd; curves: Quadrature MSS ROC, InPhase MSS ROC.]

    Figure 4.6 shows the ROCs of the matched subspace filter operating on the blind and calibration data as well as a simple baseline energy detector operating on the blind grid data. The matched subspace detector is nearly as effective on the blind grid as on the calibration grid. This indicates that the algorithm is fairly robust and that the assumptions made regarding the interfering noise statistics are reasonable. Further, we note the substantial decrease in the false alarm rate as compared to the simple energy detector. The matched subspace detector achieves a false alarm rate of 11% (at a probability of detection of 95%) in the blind grid, which is an improvement of more than a factor of 6 over the energy detector.

    The major difference between the two matched subspace curves appears in the 60% to 95% probability of detection range. We believe the difference between the two curves here stems from the vast amount of clutter present in the blind grid as compared to the calibration lanes. The smoothness of the blind-grid ROC stems from the 800 or so pieces of clutter present therein, and the discrete-jump nature of the calibration ROC stems from the 20 pieces of clutter found there.
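    The link between the number of clutter items and the step size of an empirical ROC can be seen in a short sketch. The code below is not the scoring procedure used for either grid; it is a generic threshold sweep over hypothetical scores and labels, and the helper `pf_at_pd` is an illustrative name.

```python
def roc_points(scores, labels):
    """Empirical ROC from detector scores and truth labels.

    labels: 1 for landmine, 0 for clutter. Returns (Pf, Pd) pairs
    obtained by sweeping the threshold over the observed scores.
    """
    n_mines = sum(labels)
    n_clutter = len(labels) - n_mines
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        hits = sum(1 for s, lab in zip(scores, labels)
                   if s >= threshold and lab == 1)
        alarms = sum(1 for s, lab in zip(scores, labels)
                     if s >= threshold and lab == 0)
        points.append((alarms / n_clutter, hits / n_mines))
    return points


def pf_at_pd(points, target_pd):
    """Smallest false alarm rate at which Pd reaches target_pd."""
    return min(pf for pf, pd in points if pd >= target_pd)


# Made-up detector outputs: three landmines and three clutter items.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
for pf, pd in points:
    print(f"Pf = {pf:.2f}, Pd = {pd:.2f}")
```

    With only three clutter items the false alarm rate can move only in steps of one third; with 20 items the steps are 5%, and with roughly 800 items they are small enough for the curve to look smooth.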

    [Figure 4.6: Comparison of quadrature matched subspace detector and baseline energy detector receiver operating characteristics from the blind and calibration grids. Plot title: Matched Subspace Detector ROCs in Calibration and Blind Grids; x-axis: Pf; y-axis: Pd; curves: Calibration MSS ROC, Blind MSS ROC, Blind Energy Detector ROC.]

  • Chapter 5

    Decay Rate Estimation

    As discussed in chapter 1, a popular method of discriminating landmines from clutter

    is through characteristic decay rate estimates. In this chapter we discuss why decay

    rates may be useful for target identification and discrimination, the estimation proce-

    dure that has been utilized, the relative locations of poles from the calibration lanes,

    and a simple method of discrimination using Gaussian probability density functions.

    5.1 Decay Rates

    The EMI responses of a highly conducting body are given by equations 2.1 and 2.2 which are repeated here for convenience.

    \[ H(\omega) = a + \sum_n \frac{b_n \omega}{\omega - j\omega_n} \tag{5.1} \]

    \[ S(t) = a\,\delta(t) + \sum_n A_n e^{-\omega_n t} \tag{5.2} \]

    There is a substantial amount of work in the literature pertaining to estimating decay rates from time-domain signals (see [14, 15, 17, 18, 19, 53, 54]). Decay rates have been investigated for several reasons. First, they provide a compact space with which to model landmine responses. In our experiments two decay rates are used to model a signal which has eighteen data points (9 in-phase and 9 quadrature), thus the computational load on our detection algorithm is reduced (note that obtaining these decay rates is, however, computationally very expensive). Furthermore, as has been noted [37], the decay rates should be purely target dependent at the frequencies at which the GEM-3 sensor operates. Arguments against using decay rates cite the computational load required to estimate these parameters and the fact that decay rates do not provide a sufficient statistic [15].

    5.2 Estimation Procedure

    In this work, we focused on estimating the two primary decay rates from our EMI data. In order to estimate $\omega_1$ and $\omega_2$, an objective function was formed from the mean-square error between our estimated responses and the data. The MATLAB function FMINUNC (in the Optimization Toolbox) was then used to find the five parameters that minimize this error for each landmine. (The five parameters are the DC term a, two gains b1 and b2, and two decay rates $\omega_1$ and $\omega_2$.)
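    The objective function described above can be sketched in a few lines. The sketch assumes that the frequency-domain response has the form a + Σ b_n ω / (ω − j ω_n), that the in-phase and quadrature samples are the real and imaginary parts of the complex response, and that frequencies and decay rates share the same units. The function names, the nine frequencies, and the parameter values are hypothetical, and no optimizer is included.

```python
def emi_model(omega, a, gains, decay_rates):
    """Complex response a + sum_n b_n * omega / (omega - 1j * w_n)."""
    return a + sum(b * omega / (omega - 1j * w)
                   for b, w in zip(gains, decay_rates))


def mse_objective(params, omegas, data):
    """Mean-square error between the model and complex measurements.

    params = (a, b1, b2, w1, w2): DC term, two gains, two decay rates.
    The real part of each datum is taken as the in-phase sample and
    the imaginary part as the quadrature sample (an assumption).
    """
    a, b1, b2, w1, w2 = params
    total = 0.0
    for omega, d in zip(omegas, data):
        err = emi_model(omega, a, (b1, b2), (w1, w2)) - d
        total += err.real ** 2 + err.imag ** 2
    return total / len(omegas)


# Nine made-up, log-spaced frequencies and a synthetic noise-free target.
omegas = [1.0e3 * 20.0 ** (k / 8.0) for k in range(9)]
true_params = (1.0, 20.0, -5.0, 3.0e3, 2.0e4)
data = [emi_model(w, 1.0, (20.0, -5.0), (3.0e3, 2.0e4)) for w in omegas]

error_at_truth = mse_objective(true_params, omegas, data)
error_perturbed = mse_objective((1.0, 20.0, -5.0, 4.0e3, 2.0e4), omegas, data)
print(error_at_truth, error_perturbed)
```

    A function of this shape is what a general-purpose minimizer would be handed, together with an initial guess for the five parameters.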

    Often (especially when modeling clutter), the algorithm used by FMINUNC could

    not find potential solutions any significant distance from the initial values provided.

    This may be due to a local minimum in the objective function near the initial guess.

    In these cases (when the resulting parameters were deemed too close to the initial

    guesses), the initial decay rates were varied over a wide range and the optimization

    was carried out at each point. The resulting estimate with the lowest error was chosen

    as the best estimate of the target decay rates.
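    The restart strategy described above can be sketched independently of the optimizer. In the code below, a simple compass (pattern) search stands in for FMINUNC purely for illustration; it is not the algorithm FMINUNC uses. The objective is a made-up two-parameter function with one local and one global minimum, and all names are hypothetical.

```python
def compass_search(objective, x0, step=1.0, tol=1e-6):
    """Derivative-free local minimizer (a stand-in local optimizer)."""
    x = list(x0)
    fx = objective(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                ft = objective(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= 0.5  # refine the mesh once no neighbour is better
    return x


def multistart(objective, local_minimize, starts):
    """Run the local optimizer from every start; keep the lowest error."""
    best_err, best_x = None, None
    for x0 in starts:
        x = local_minimize(objective, x0)
        err = objective(x)
        if best_err is None or err < best_err:
            best_err, best_x = err, x
    return best_err, best_x


def toy_objective(p):
    """Double well in p[0]: local minimum near +2, global near -2."""
    u, v = p
    return (u * u - 4.0) ** 2 + u + v * v


single = compass_search(toy_objective, [3.0, 1.0])
starts = [[u0, 1.0] for u0 in (-3.0, 0.0, 3.0)]
best_err, best = multistart(toy_objective, compass_search, starts)
print(single, toy_objective(single))
print(best, best_err)
```

    A single run started at u = 3 stays in the shallower well, whereas sweeping the starting point and keeping the lowest-error result recovers the deeper one.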

    The error in these models was very low across a wide range of mine energies.

    Figures 5.1 and 5.2 show the parametrized fits to the data for one high-energy and

    one low-energy landmine.

    Since the responses modeled with the estimated decay rates approximate the actual responses well and the estimated response shapes are highly correlated, it is intuitive to suppose that the decay rates estimated from different responses from the same landmine type would be clustered together to some degree. Such clustering would indicate that the estimated decay rates are drawn from some target-dependent distribution and could facilitate the formulation of a detector based on them.

    [Figure 5.1: Estimation of VS-50 Response. Plot title: Estimated and Fitted VS50 Responses; x-axis: LogFrequency; y-axis: Fitted Response (Error = 0.33042%); curves: Quadrature Data, Inphase Data, Quadrature Fit, Inphase Fit.]

    Several attempts were made to use clustering algorithms available in MATLAB to

    group the different landmines automatically. However, the results obtained seemed

    slightly counter-intuitive and did not take into account our a priori knowledge of

    which estimates were from which landmine types. Figure 5.3 illustrates a clustering

    of decay rates by landmine type made manually, and figure 5.4 provides a closeup of

    the same figure.

    Note the high degree of intra-mine-type correlation. The majority of landmines of a given type were grouped together. The only two clusters that do not contain all the responses from a particular landmine type are those of the M-14 HE and non-HE landmines. In the calibration lanes, two of the M-14 landmines were measured with their primary high-explosive fills present. This altered the responses enough to warrant the separation of these M-14s from their counterparts (the difference between HE and non-HE landmine responses is documented in [43]).

    [Figure 5.2: Estimation of M-14 Response. Plot title: Estimated and Fitted M14 Responses; x-axis: LogFrequency; y-axis: Fitted Response (Error = 0.047565%); curves: Quadrature Data, Inphase Data, Quadrature Fit, Inphase Fit.]

    The decay rate estimates for the clutter from the calibration grid are shown in figure 5.5. As can be seen from the figure, the clutter decay rates are distributed throughout the range of frequencies but are more densely concentrated at low values of the first decay rate, $\omega_1$.

    5.3 Gaussian Models and Detection

    One of the simplest approaches to incorporating the estimated decay rates into a detection algorithm is to model their statistical distribution with a two-dimensional Gaussian probability density function (PDF) and generate detectors based on these PDFs. By combining the probability density functions for the different landmine types, a mixture of Gaussian densities is formed. For each cluster of decay rates (clusters do not necessarily represent all landmines of a given landmine type) the sample mean and variance were calculated using standard techniques. However, with so few data points for each landmine type, these estimates are suspect. For example, the calibration grid contains only one instance of certain landmines. These solitary landmines are clustered alone. To estimate the variance of their decay rate distribution functions, an estimate of the average decay rate variance across landmine types was used. Also, no attempt was made to generate estimated correlation matrices, since there was rarely enough data to produce a reliable estimate. Contours of some of the resulting estimated Gaussian distributions are shown in figure 5.6. The combination of the separate decay rate PDFs results in a mixture of Gaussian PDFs across the range of $\omega_i$ values.

    [Figure 5.3: Estimated landmine decay rates plotted against $\omega_1$ and $\omega_2$ in Hz. Each landmine type is represented by a different shape. Plot title: Clustering of Mine Decay Rate Estimates by Mine Type.]

    [Figure 5.4: Estimated landmine decay rates plotted against $\omega_1$ and $\omega_2$ in Hz (close-up). Each landmine type is represented by a different shape. Note the high degree of spatial correlation between landmines of each type. Plot title: Clustering of Mine Decay Rate Estimates by Mine Type.]

    We assumed that the clutter decay rates were totally random (uniform across the range of frequencies), since we had little information upon which to base any general clutter model. Under this assumption the optimal detector for each landmine type is a threshold on the mixture of Gaussian PDFs (or a monotonic function thereof). Since we have estimated the means and variances of the landmine clusters, this decision statistic is a GLRT. To make the detector capable of discerning between all landmine types and clutter, we followed the filter bank procedure outlined in Chapter 4. Thus, our results could be used to discriminate between landmine types by choosing the filter bank with the highest response to an estimated set of decay rates.
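    The Gaussian decay-rate detector described above can be sketched as follows. The sketch assumes a diagonal covariance (no cross-correlation term, in line with the decision not to estimate correlation matrices), the unbiased n - 1 form of the sample variance, and a caller-supplied fallback variance for one-point clusters. The cluster labels and decay-rate values are made up.

```python
import math


def fit_cluster(points, fallback_var=None):
    """Sample mean and per-axis variance of (w1, w2) decay-rate pairs.

    A one-point cluster has no sample variance, so `fallback_var`
    (a pair of variances, e.g. an average over other types) is used.
    """
    n = len(points)
    mean = tuple(sum(p[k] for p in points) / n for k in range(2))
    if n < 2:
        if fallback_var is None:
            raise ValueError("single-point cluster needs fallback_var")
        return mean, tuple(fallback_var)
    var = tuple(sum((p[k] - mean[k]) ** 2 for p in points) / (n - 1)
                for k in range(2))
    return mean, var


def gaussian_pdf(w, mean, var):
    """Two-dimensional Gaussian density with a diagonal covariance."""
    density = 1.0
    for k in range(2):
        z = (w[k] - mean[k]) ** 2 / var[k]
        density *= math.exp(-0.5 * z) / math.sqrt(2.0 * math.pi * var[k])
    return density


def detect(w, clusters):
    """Return the largest cluster density and the matching label."""
    label = max(clusters, key=lambda name: gaussian_pdf(w, *clusters[name]))
    return gaussian_pdf(w, *clusters[label]), label


# Hypothetical decay-rate estimates (Hz) for two landmine types.
clusters = {"type_A": fit_cluster([(2000.0, 9000.0), (2200.0, 9500.0),
                                   (2100.0, 8800.0)])}
clusters["type_B"] = fit_cluster([(6000.0, 20000.0)],
                                 fallback_var=clusters["type_A"][1])

near_stat, near_label = detect((2100.0, 9100.0), clusters)
far_stat, far_label = detect((50000.0, 50000.0), clusters)
print(near_label, near_stat, far_stat)
```

    Thresholding the returned density gives the detection decision, and the returned label gives the classification, by analogy with the maximum-output rule used for the matched subspace filter bank.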

    [Figure 5.5: Estimated clutter decay rates plotted against $\omega_1$ and $\omega_2$ in Hz. Note that the estimated decay rates for clutter objects are spread throughout a wide frequency range. Plot title: Clustering of Estimated Decay Rates for Clutter.]

    5.4 Decay Rate Estimation Results

    In this section we briefly discuss the ROC curves generated from the parameter based

    detector discussed above. Figure 5.7 shows the ROC generated from the calibration

    data.

    Note that the algorithm does not achieve a 95% detection rate until its false alarm

    rate approaches 35%, and the algorithm only achieves a 100% detection rate at a 45%

    false alarm rate. Furthermore, we have good reason to believe that the detector will

    n