

Contextual Learning in Ground-Penetrating Radar Data Using Dirichlet Process Priors

Christopher R. Ratto, Kenneth D. Morton, Jr., Leslie M. Collins, Peter A. Torrione

Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708

ABSTRACT

In landmine detection applications, fluctuation of environmental and operating conditions can limit the performance of sensors based on ground-penetrating radar (GPR) technology. As these conditions vary, the classification and fusion rules necessary for achieving high detection and low false alarm rates may change. Therefore, context-dependent learning algorithms that exploit contextual variations of GPR data to alter decision rules have been considered for improving the performance of landmine detection systems. Past approaches to contextual learning have used both generative and discriminative methods to learn a probabilistic mixture of contexts, such as a Gaussian mixture, fuzzy c-means clustering, or a mixture of random sets. However, in these approaches the number of mixture components is pre-defined, which could be problematic if the number of contexts in a data collection is unknown a priori. In this work, a generative context model is proposed which requires no a priori knowledge of the number of mixture components. This was achieved by modeling the contextual distribution in a physics-based feature space with a Gaussian mixture, while also incorporating a Dirichlet process prior to model uncertainty in the number of mixture components. This Dirichlet process Gaussian mixture model (DPGMM) was then incorporated in the previously developed Context-Dependent Feature Selection (CDFS) framework for fusion of multiple landmine detection algorithms. Experimental results suggest that when the DPGMM was incorporated into CDFS, the degree of performance improvement over conventional fusion was greater than when a conventional fixed-order context model was used.

Keywords: Context-dependent learning, Dirichlet process, ground-penetrating radar, landmine detection

1. INTRODUCTION

In recent years, ground-penetrating radar (GPR) has emerged as a useful subsurface sensing technology in landmine detection [1], and has been successfully deployed in both humanitarian [2] and military [3] demining systems. A major advantage of GPR over conventional metal detectors is that nonmetal targets often have strong GPR signatures. However, this property also causes nonmetal clutter to be confused with targets in GPR data. To reduce the false alarm rate of GPR systems, a variety of statistical signal processing solutions have been developed. These include real-time prescreening techniques [4] as well as feature-based classification algorithms [5–7].

It is important for landmine detection systems to have robust performance in varying environmental conditions. Unfortunately, GPR is susceptible to the scattering and propagation effects of environmental factors such as soil hydrology [8] and rough-surface scattering [9]. Variations in soil moisture alter the electromagnetic properties of the subsurface environment, potentially rendering a target indistinguishable from the surrounding background, and rough surfaces are a potential source of additional clutter. Furthermore, although many of the leading algorithms for GPR-based landmine detection are motivated by environmentally invariant features, comparisons show that their relative performance is “context-dependent” [10, 11].

Context-dependent fusion has been proposed in recent years to improve system robustness in a changing environment (e.g., [11–13]). While conventional machine learning techniques are generally used to train a classifier that performs well on aggregate data, context-dependent learning can be used to learn a mixture of classifiers that are better suited for specific conditions (i.e., contexts). Context-dependent learning consists of two phases: context identification and target classification. Some of these approaches may be described as discriminative, in that contexts are learned to best discriminate targets and false alarms [11, 12]. An alternative approach is generative context learning, in which physically motivated contextual factors are inferred from the raw data without regard to overall classification performance [13–15]. However, in learning a statistical model for context identification, both approaches have historically required the user to specify the model order (i.e., the number of contexts to consider). This is a major disadvantage for classifying data from uncertain environments.

Further author information: Send correspondence to P. Torrione ([email protected])

This work focuses on the generative approach to contextual learning. In previous work, supervised techniques were used for training a probabilistic context model with qualitative soil/moisture labels corresponding to the lanes over which data was collected [13, 14]. However, supervised context models are somewhat impractical because heuristic labels may not accurately characterize the relevant environmental factors affecting the data. Therefore, unsupervised context models were also considered [15]. While performance improvements were achieved by using unsupervised context models, the degree of improvement varied substantially as a function of model order. In this work, the unsupervised context modeling technique was improved upon by incorporating Dirichlet process (DP) priors into the learning procedure. The use of DP priors promotes sparseness in the model’s parameters, and automates learning of the model order [16]. Experimental results show that DP priors allow context-dependent learning to be performed without specifying the number of contexts, while still achieving improved performance over conventional learning.

The remainder of this paper is organized as follows. Section 2 presents a summary of the GPR system and data under consideration in this paper. The context-dependent feature selection (CDFS) framework used in this paper, as well as in previous work, is described in Section 3. Background information regarding the DP and variational Bayes inference is reviewed in Section 4. Experimental results are presented in Section 5. Finally, concluding remarks and plans for future work are given in Section 6.

2. GPR SYSTEM AND DATA

The GPR used to collect data for this work was manufactured by NIITEK, Inc. [17] for vehicular route clearance applications. The antenna array consists of 51 independent channels, spanning about 3 m across the front of a countermine vehicle. Measurements are collected from each channel at increments of roughly 5 cm as the vehicle moves forward. The GPR signal for this system is a differentiated Gaussian pulse, with an approximate bandwidth of 200 MHz to 7 GHz.

The GPR data used in this work, as well as in previous investigations [14, 15], was collected with the NIITEK vehicle-mounted GPR system between June 2004 and March 2007. Data was collected at three US government test sites in geographically distinct locations: a temperate Eastern site, a temperate Central site, and an arid Western site. Data was collected over 12 different test lanes across the three sites. In total, data was collected over an area of 16,770 square meters. Each lane was prepared with emplaced targets, and one lane included emplaced clutter. The targets consisted of 13 types of antitank landmines and several large artillery shells. A prescreening algorithm [4] was run on the raw GPR data to record the locations of subsurface anomalies (i.e., alarms) to be classified by subsequent processing. In total, 1,843 alarms were recorded.

Qualitative context labels were available from the data collection logs: the lane construction was either dirt, gravel, or asphalt, and moisture conditions were recorded as “dry”, “mid”, or “wet”. The distribution of alarms over the 7 recorded combinations of lane type and moisture is summarized by the pie chart in Figure 1.

Examples of GPR B-scans collected at prescreener alarms are shown in Figure 2. Each image represents data from a single channel, at several locations in the downtrack direction (the direction of forward motion), over a subsurface anomaly. The top row illustrates the signature of a low-metal anti-tank (AT) mine, buried 3 inches below the surface, in 4 different contexts. The bottom row shows examples of false alarms in each of the same contexts. False alarms may be due to subsurface clutter, such as rocks, subsurface layering, or emplaced objects. False alarms may also be due to errors in removing the ground response (the strong reflection occurring around time index 100) during prescreening.

3. CONTEXT-DEPENDENT FEATURE SELECTION

The approach to context-dependent classification of prescreener alarms was originally presented as context-dependent feature selection (CDFS) [14]. CDFS consists of two phases: context identification followed by target classification. Each phase is briefly described in the following subsections, and a flowchart outlining the CDFS framework is presented in Figure 3.

Figure 1. Pie chart summarizing the distribution of prescreener alarms across the 7 labeled contexts.

Figure 2. Example GPR B-scans of a low-metal AT landmine (top row) and false alarms (bottom row) in four different contexts.

Figure 3. Flowchart illustrating the specific procedures taken by the CDFS framework [15].

3.1 Context Identification

In the context identification step, contextual features ($f_C$) are extracted from the GPR background, preprocessed using normalization and principal components analysis (PCA), and fed to a probabilistic C-ary classifier (a GMM, in this case). The contextual features include 10 measurements of the ground bounce arrival time ($t_{GB}$), an estimate of the air/ground reflection coefficient ($\Gamma$), and the mean-square error of 10 autoregressive (AR) models fit to time-slices of GPR data ($P_{AR}$). The details of extracting these features are published in previous work [13, 15]. A 30-dimensional feature vector was formed from these features for each alarm, i.e., $f_C = [t_{GB}, \Gamma, P_{AR}]$.

The GMM context model may be trained through supervised or unsupervised means. If supervised, with each Gaussian component representing a single context, the GMM reduces to a simple Bayesian hypothesis test. If unsupervised, the GMM parameters may be learned through expectation-maximization or variational Bayes techniques. However, both approaches require the number of contexts to be pre-determined. Instead, this work uses the DPGMM (described in Section 4) to automatically learn the number of contexts, $C$. The DPGMM was trained and implemented on the 3-dimensional principal components projection of $f_C$. The DPGMM outputs $C$ context posteriors, $p(c_i \mid f_C)$, $i = 1, 2, \ldots, C$, for each prescreener alarm.

3.2 Target Classification

In the target classification phase, features ($f_T$) are extracted from the data to characterize the prescreener alarm as a target signature or as clutter. As in previous work [13, 14], the target classification features were extracted using multiple algorithms from the recent literature. The features are summarized in Table 1, and the target classification feature vector had 72 dimensions in total.

The target features were fed to a mixture of relevance vector machines (RVMs), which were trained through variational Bayes inference [19]. RVMs are useful in context-dependent learning because the sparseness-promoting priors force many of the discriminant weights to zero. If the features are only transformed by a DC bias kernel (i.e., $\phi(f_T) = [1, f_T]$), individual columns of the Gram matrix correspond to individual features. Therefore, implementing an RVM will also perform feature selection [20]. In context-dependent learning, this property allows different contexts to potentially utilize different subsets of $f_T$ in classifying targets from clutter. Each RVM outputs its own within-context target posterior, $p(H_1 \mid f_T, c_i)$, $i = 1, 2, \ldots, C$.

Table 1. Target Features

| Algorithm | Features Used |
|---|---|
| Prescreener [4] | Confidence value (scalar) |
| EHD [6] | Mean confidence value (scalar) |
| | Mean edge histogram (40-D) |
| HMM [5] | Confidence value (scalar) |
| | Observation sequence (8-D) |
| SPSCF [7] | Confidence value (scalar) |
| | Subspace projection coefficients (20-D) |
| GEOM [18] | Mean confidence value (scalar) |
| | Mean area/filled area ratio (scalar) |
| | Mean fixed compactness (scalar) |
| | Mean adaptive compactness (scalar) |
| | Mean eccentricity (scalar) |
| | Mean solidity (scalar) |

3.3 Calculation of Posterior Confidence

After performing context identification and target classification, a final confidence value can be calculated by integrating the RVM output over the posterior uncertainty in context:

$$p(H_1 \mid x = [f_T, f_C]) = \sum_{i=1}^{C} p(H_1 \mid f_T, c_i)\, p(c_i \mid f_C) \tag{1}$$

This value is ultimately thresholded for decision-making purposes. Alarms with $p(H_1 \mid x)$ greater than the threshold are declared targets, and alarms with $p(H_1 \mid x)$ lower than the threshold are dismissed. By thresholding $p(H_1 \mid x)$ at different values, a pseudo-ROC curve can be plotted illustrating probability of detection (PD) as a function of false alarm rate (FAR), measured in false alarms per square meter. In calculating PD, alarms within a radius of 25 cm of a target were considered successful detections. To calculate FAR, the total number of false alarms was divided by the total collection area.
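The fusion rule in (1) and the threshold sweep behind a pseudo-ROC curve can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `fused_confidence` and `pseudo_roc` are hypothetical, the toy numbers are invented for the example, and PD is simplified here to the fraction of target alarms that are declared rather than the per-target scoring with a 25 cm halo described above.

```python
def fused_confidence(target_post, context_post):
    """Equation (1): sum over contexts of p(H1 | f_T, c_i) * p(c_i | f_C)."""
    if len(target_post) != len(context_post):
        raise ValueError("one within-context target posterior is needed per context")
    return sum(t * c for t, c in zip(target_post, context_post))


def pseudo_roc(confidences, is_target, area_m2):
    """Sweep a threshold over the confidences and return (FAR, PD) pairs.

    confidences : fused confidence value for each alarm
    is_target   : True where the alarm has already been matched to a target
    area_m2     : total collection area in square meters
    """
    n_targets = sum(is_target)
    points = []
    for threshold in sorted(set(confidences), reverse=True):
        # Simplification: an alarm is declared once it reaches the threshold.
        declared = [c >= threshold for c in confidences]
        detections = sum(d and t for d, t in zip(declared, is_target))
        false_alarms = sum(d and not t for d, t in zip(declared, is_target))
        points.append((false_alarms / area_m2, detections / n_targets))
    return points


# Toy example with C = 2 contexts (values are illustrative only).
confidence = fused_confidence([0.9, 0.2], [0.75, 0.25])
curve = pseudo_roc([0.9, 0.8, 0.3, 0.1], [True, False, True, False], area_m2=10.0)
```

Because the context posteriors sum to one, the fused confidence is a convex combination of the per-context RVM outputs, so it always stays within their range.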

4. DIRICHLET PROCESS MIXTURE MODELS

As mentioned previously, the GMM can be used for inferring a mixture distribution through unsupervised learning. The likelihood function for a GMM is a weighted sum of $C$ Normal densities, where component $i$ has mean $\mu_i$, precision matrix $\Gamma_i$, and proportion $\pi_i$:

$$p(x \mid \Theta) = \sum_{i=1}^{C} \pi_i\, \mathcal{N}(x \mid \mu_i, \Gamma_i^{-1}) \tag{2}$$
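As a concrete reading of (2), the following Python sketch evaluates the mixture density and the per-component posteriors that a context model of this form would report. It assumes diagonal precision matrices purely to stay dependency-free; the function names are hypothetical and the paper does not state that diagonal precisions were used.

```python
import math


def diag_normal_pdf(x, mean, precision):
    """Normal density N(x | mean, precision^-1) for a diagonal precision (assumed)."""
    log_pdf = 0.0
    for x_d, m_d, prec_d in zip(x, mean, precision):
        log_pdf += 0.5 * (math.log(prec_d) - math.log(2.0 * math.pi))
        log_pdf -= 0.5 * prec_d * (x_d - m_d) ** 2
    return math.exp(log_pdf)


def gmm_density(x, weights, means, precisions):
    """Equation (2): weighted sum of the component densities."""
    return sum(w * diag_normal_pdf(x, m, p)
               for w, m, p in zip(weights, means, precisions))


def component_posteriors(x, weights, means, precisions):
    """Posterior probability of each component given x, by Bayes' rule."""
    joint = [w * diag_normal_pdf(x, m, p)
             for w, m, p in zip(weights, means, precisions)]
    total = sum(joint)
    return [j / total for j in joint]


# Two equally weighted 1-D components centered at -1 and +1 (toy values).
posteriors = component_posteriors([0.0], [0.5, 0.5], [[-1.0], [1.0]], [[1.0], [1.0]])
```

At the midpoint between two identical, equally weighted components the posteriors are equal, which is a quick sanity check on the normalization.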

The model given by (2) is of fixed order ($C$). To express uncertainty in the GMM model order, DP priors can be used [16]. The DP serves as a prior over the distribution ($G$) of model parameters, and is characterized by a base distribution ($G_0$) and sparseness parameter ($\gamma$):

$$G \sim \mathrm{DP}(G_0, \gamma) \tag{3}$$

In a nonparametric Bayesian model, there can potentially be as many unique parameters as there are data points. Therefore, the parameters are independently drawn from $G$:

$$\theta_n \sim G, \quad n = 1, 2, \ldots, N \tag{4}$$

Figure 4. Example of a DPGMM learned on a mixture of 9 Gaussian distributions. The predictive density at iterations 1-9 of VB learning is shown on the top row, and the matrix of component posterior probabilities is shown on the bottom row.

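For the GMM, let each $\theta_n = \{\mu_n, \Gamma_n^{-1}\}$, and let $\theta^*_i$, $i = 1, 2, \ldots$ be the unique values of $\theta$. The prior, $G$, is given by the stick-breaking representation of the DP [21]:

1. Draw $v_i \mid \gamma \sim \mathrm{Beta}(1, \gamma)$

2. Draw $\theta^*_i \mid G_0 \sim \mathcal{N}(\mu^*_i \mid m_0, (\beta_0 \Gamma^*_i)^{-1})\, \mathcal{W}(\Gamma^*_i \mid W_0, \nu_0)$, $i = 1, 2, \ldots$

3. Calculate mixture proportions $\pi_i(v) = v_i \prod_{j=1}^{i-1} (1 - v_j)$, $i = 1, 2, \ldots$

4. For $n = 1, 2, \ldots, N$

(a) Draw indicator variable $c_n \mid v \sim \mathrm{Multi}(\pi(v))$

(b) Draw data $x_n \mid c_n = i \sim \mathcal{N}(x_n \mid \theta^*_i)$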

Although this model permits an infinite number of unique parameters, the DP prior exhibits a clustering effect in $\theta^*_i$. This is analogous to breaking a stick into an infinite number of pieces, where the vast majority of the pieces are of negligible size. Therefore, the DP prior will enforce a de facto model order ($C$) of mixture components with non-negligible proportions $\pi_i(v)$, $i = 1, 2, \ldots, C$.
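The stick-breaking construction of the mixture proportions (steps 1 and 3 above) is easy to simulate. The sketch below truncates the infinite sequence at a finite number of breaks, which is an assumption made only so the loop terminates; the names `stick_breaking_weights` and `effective_order`, and the 1% cutoff used to call a piece "non-negligible", are likewise hypothetical.

```python
import random


def stick_breaking_weights(gamma, truncation, rng):
    """Draw v_i ~ Beta(1, gamma) and return pi_i = v_i * prod_{j<i} (1 - v_j).

    Also returns the length of stick left over after `truncation` breaks.
    """
    weights = []
    remaining = 1.0  # portion of the unit stick not yet assigned
    for _ in range(truncation):
        v = rng.betavariate(1.0, gamma)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining


def effective_order(weights, min_weight=0.01):
    """Count the pieces whose proportion is at least min_weight (assumed cutoff)."""
    return sum(1 for w in weights if w >= min_weight)


rng = random.Random(0)  # fixed seed so the draw is repeatable
weights, leftover = stick_breaking_weights(gamma=1.0, truncation=50, rng=rng)
order = effective_order(weights)
```

Smaller values of the sparseness parameter concentrate the stick in the first few pieces, while larger values spread it over more components.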

Because closed-form Bayesian inference cannot be performed with the DP prior, variational inference was used in this work to train the DPGMM [16]. This approach begins with an arbitrarily high truncation level, $T$, and calculates a variational lower bound on the log-evidence at each iteration. The variational parameters are then updated to maximize the variational lower bound for the next iteration. As the expectations of certain values of $\pi(v)$ become small, the corresponding mixture components are effectively pruned from the model. Consider the example illustrated by Figure 4, in which a DPGMM is trained on a mixture of 9 Gaussian distributions arranged in a diamond shape. The VB algorithm was initialized with $T = 20$ components using c-means clustering, and extraneous clusters were pruned from the model at each iteration. By the ninth iteration, the 9 clusters were correctly identified.
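The pruning effect can be illustrated in isolation from the rest of the variational algorithm. The sketch below assumes the mean-field form used in [16], in which each stick fraction has an independent Beta posterior whose parameters are updated from the expected number of points assigned to each component. It covers only that one update and the resulting expected proportions, not the full DPGMM training loop, and the pruning threshold and function names are hypothetical.

```python
def beta_updates(expected_counts, gamma):
    """Variational Beta parameters for each stick fraction.

    a_i = 1 + N_i and b_i = gamma + sum_{j > i} N_j, where N_i is the
    expected number of points assigned to component i.
    """
    params = []
    tail = sum(expected_counts)
    for n_i in expected_counts:
        tail -= n_i  # expected count assigned to components after i
        params.append((1.0 + n_i, gamma + tail))
    return params


def expected_weights(beta_params):
    """E[pi_i] = E[v_i] * prod_{j<i} E[1 - v_j] under independent Beta factors."""
    weights = []
    remaining = 1.0
    for a, b in beta_params:
        mean_v = a / (a + b)
        weights.append(mean_v * remaining)
        remaining *= 1.0 - mean_v
    return weights


def surviving_components(beta_params, min_weight=0.01):
    """Indices of components whose expected proportion is not negligible."""
    return [i for i, w in enumerate(expected_weights(beta_params)) if w >= min_weight]


# Truncation level T = 4, but only two components hold any data (toy counts).
params = beta_updates([50.0, 50.0, 0.0, 0.0], gamma=1.0)
kept = surviving_components(params)
```

In this toy case the two empty components receive expected proportions below half a percent, so only the first two indices remain.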

5. EXPERIMENTAL RESULTS

CDFS was evaluated on the GPR data described in Section 2. Two CDFS techniques were compared, with the only difference between the two being the choice of context model. One technique incorporated the DPGMM context model (CDFS-DPGMM) and the other used a supervised context model (CDFS-Supervised). The supervised context model was a simple Bayesian hypothesis test, with each context described by a single Gaussian distribution in feature space.

Figure 5. Scatter plots of 3-D PCA of GPR context features, with points colored according to context label (left) and MAP DPGMM context (right).

5.1 Context Identification Results

Context identification was performed on the 3-D principal components projection of the context features. This reduced-dimensionality space is plotted in Figure 5. The left plot illustrates points colored according to the context label obtained at data collection, and the right plot illustrates points colored according to their maximum a posteriori DPGMM component. There appears to be some similarity between the context labels and the DPGMM result. One technique for measuring the similarity between different clusterings is the adjusted mutual information (AMI) measure [22], which calculates the mutual information between two different partitions of a data set and corrects for chance similarity. An AMI of zero corresponds to two random partitions, and an AMI of one corresponds to two identical partitions. The AMI of the context labels and the DPGMM result for this data is 0.42.
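For readers who want to reproduce this kind of comparison, a minimal pure-Python sketch of an adjusted mutual information computation follows. It uses the hypergeometric expected mutual information and normalizes by the larger of the two partition entropies, which is one of the variants discussed in [22]; which variant produced the 0.42 reported here is not stated, so that choice is an assumption, as are the function names.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (in nats) of a partition with the given cluster sizes."""
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def mutual_information(labels_a, labels_b):
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))
    rows, cols = Counter(labels_a), Counter(labels_b)
    return sum((nij / n) * math.log(n * nij / (rows[a] * cols[b]))
               for (a, b), nij in joint.items())


def expected_mutual_information(row_counts, col_counts, n):
    """Expected MI of two random partitions with fixed cluster sizes."""
    def log_fact(k):
        return math.lgamma(k + 1)

    emi = 0.0
    for a in row_counts:
        for b in col_counts:
            for nij in range(max(1, a + b - n), min(a, b) + 1):
                log_prob = (log_fact(a) + log_fact(b) + log_fact(n - a) + log_fact(n - b)
                            - log_fact(n) - log_fact(nij) - log_fact(a - nij)
                            - log_fact(b - nij) - log_fact(n - a - b + nij))
                emi += (nij / n) * math.log(n * nij / (a * b)) * math.exp(log_prob)
    return emi


def adjusted_mutual_information(labels_a, labels_b):
    n = len(labels_a)
    rows = list(Counter(labels_a).values())
    cols = list(Counter(labels_b).values())
    mi = mutual_information(labels_a, labels_b)
    emi = expected_mutual_information(rows, cols, n)
    upper = max(entropy(rows, n), entropy(cols, n))  # assumed normalization
    return (mi - emi) / (upper - emi)


same = adjusted_mutual_information([0, 0, 1, 1, 2, 2], ["a", "a", "b", "b", "c", "c"])
crossed = adjusted_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Relabeling the clusters does not change the score, so two partitions that agree up to cluster names score one, while partitions that agree less than chance would predict can score below zero.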

Details of the similarity between the labeled and DPGMM contexts are summarized by the similarity matrix in Figure 6. DPGMM context 1 mostly consists of alarms from wet dirt, with the remaining alarms split between mid dirt and wet gravel. Context 2 almost entirely consists of wet asphalt alarms. Context 3 is split between dry dirt and dry asphalt. Virtually all of the alarms in context 5 are from dry asphalt, and virtually all of the alarms in context 6 are from mid dirt. Context 4 is somewhat evenly split across 5 of the labeled contexts.

These results suggest that the DPGMM may have identified physically relevant contexts in this data set. One criticism of the supervised approach to context identification is that different soil labels may have similar electromagnetic properties, so GPR would behave similarly in both contexts. Another criticism is that the labels are too qualitative to describe the differences in electromagnetic properties across soils with different hydrological properties. The modest AMI between the DPGMM result and the context labels suggests that the unsupervised contexts may be informative of factors beyond the scope of the available context labels.

5.2 Target Classification Results

After performing context identification, RVMs were trained on target classification features from each context. As mentioned in Section 3.2, RVM training essentially performs feature selection by enforcing sparseness in the weights. Stem plots of the RVM weights for each target classification feature in each context are shown in Figure 7. As shown by the plot, each context required a unique subset of target classification features. No features, including the confidence values of the fused algorithms, appear to be universally relevant. Some algorithms, such as the prescreener and the HMM, only appear to be relevant in a few contexts. This is a similar result to other approaches [10, 11], which adapt fusion based on the algorithms’ local performance.

Figure 6. Similarity matrix illustrating overlap of labeled and DPGMM-identified contexts.

Figure 7. Stem plot of RVM weights assigned to target classification features for each DPGMM context.

5.3 Overall Classification Performance

CDFS was evaluated on the alarm set using 10-fold cross-validation, in which all alarms corresponding to the same physical object (target or clutter) were included in the same cross-validation fold. Each cross-validation fold maintained roughly equal ratios of target types from each lane. The performance of CDFS-DPGMM was compared to CDFS-Supervised, a single RVM incorporating no contextual information, and the EHD, SPSCF, HMM, and prescreener algorithms.

Figure 8. Pseudo-ROC curve comparing classification performance of CDFS with DPGMM context model (blue) to CDFS with supervised context model (green) and RVM (red). Performance of EHD (orange dashed), SPSCF (magenta dashed), HMM (cyan dashed), and Prescreener (black dashed) also shown.
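The constraint that all alarms from one physical object share a fold can be sketched as a grouped fold assignment. The Python below is a simplified illustration with a hypothetical `grouped_folds` helper: it balances only the number of alarms per fold and does not attempt the additional balancing of target types per lane described above.

```python
from collections import defaultdict


def grouped_folds(object_ids, n_folds=10):
    """Assign each alarm to a fold so that alarms sharing an object id stay together."""
    groups = defaultdict(list)
    for alarm_index, obj in enumerate(object_ids):
        groups[obj].append(alarm_index)

    fold_of_alarm = [None] * len(object_ids)
    fold_sizes = [0] * n_folds
    # Place the objects with the most alarms first, always into the smallest fold.
    for obj in sorted(groups, key=lambda o: (-len(groups[o]), str(o))):
        fold = fold_sizes.index(min(fold_sizes))
        for alarm_index in groups[obj]:
            fold_of_alarm[alarm_index] = fold
        fold_sizes[fold] += len(groups[obj])
    return fold_of_alarm


# Six alarms over four objects (ids are made up), split into two folds.
folds = grouped_folds(["m1", "m1", "c1", "m2", "c1", "c2"], n_folds=2)
```

Keeping an object's alarms together prevents a classifier from being tested on a second look at something it already saw during training.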

The pseudo-ROC curves for this experiment are shown in Figure 8. The best performance was achieved by CDFS-DPGMM, indicated by the blue curve. The error bars indicate its 90% confidence region, calculated by assuming a Bernoulli distribution for successful detections. The ROC curves of CDFS-Supervised and the RVM are outside of the error bars at PD ≤ 0.90, indicating a significant improvement in performance at these levels. The legend indicates the FAR at benchmark PDs of 0.85, 0.90, and 0.95. CDFS-DPGMM improved over the RVM and CDFS-Supervised at each of these levels. Compared to the RVM, CDFS-DPGMM reduced the FAR by 33.2% at PD = 0.85, 33.8% at PD = 0.90, and 22.7% at PD = 0.95. Compared to CDFS-Supervised, FAR was improved by 44.2% at PD = 0.85, 36.6% at PD = 0.90, and 13.0% at PD = 0.95.
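The two quantities discussed in this paragraph, a confidence region on PD and a relative FAR reduction, can be computed as follows. The sketch uses a normal approximation to the Bernoulli (binomial) proportion, which is an assumption since the exact interval construction is not described, and the counts and FAR values in the example are invented placeholders rather than results from this experiment.

```python
import math


def pd_confidence_interval(n_detected, n_targets, z=1.645):
    """Approximate two-sided interval for PD; z = 1.645 gives roughly 90% coverage."""
    p_detect = n_detected / n_targets
    half_width = z * math.sqrt(p_detect * (1.0 - p_detect) / n_targets)
    return max(0.0, p_detect - half_width), min(1.0, p_detect + half_width)


def far_reduction_percent(far_baseline, far_new):
    """Relative reduction in false alarm rate, in percent of the baseline."""
    return 100.0 * (far_baseline - far_new) / far_baseline


# Placeholder numbers: 90 of 100 targets detected, FAR lowered from 0.010 to 0.0075.
low, high = pd_confidence_interval(90, 100)
reduction = far_reduction_percent(0.010, 0.0075)
```

The interval narrows as the number of targets grows, which is why differences near the top of the curve are harder to call significant with a fixed target count.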

These results indicate that CDFS-DPGMM exploited more relevant contextual information than CDFS-Supervised in performing context-dependent learning. Although CDFS-Supervised shows some improvement over the RVM, the CDFS-DPGMM ROC curve illustrates more consistent performance gains. Furthermore, CDFS-DPGMM shows significant reductions in FAR at PD ≥ 0.9. This illustrates that the contexts learned through the DPGMM were relevant in improving the classification of difficult alarms, in addition to easy ones.

6. CONCLUSION

This work presented a technique for performing context-dependent fusion without specifying the number of contexts a priori. Automatic determination of the number of contexts was achieved by expressing uncertainty in the order of a statistical context model through a DP prior. To illustrate the efficacy of this approach, a DPGMM context model was incorporated into the CDFS framework developed in previous work. The DPGMM was trained on physics-based GPR features that were developed in prior work to characterize sensor phenomenology. Analysis of the performance of context identification showed that the contexts identified by the DPGMM may be physically motivated, but are beyond the scope of the available context labels. Analysis of the performance of target classification verified the results of previous studies, in which certain algorithms are found to be more relevant than others in classifying alarms in certain contexts. Finally, a comparison of the ROC curves of CDFS-DPGMM and CDFS-Supervised illustrated that CDFS-DPGMM achieved more consistent performance improvements over conventional RVM-based fusion.

In order to better complement the existing landmine detection literature [11, 12, 23], future work must explore the application of DP priors to discriminative context-dependent learning, which has been explored in several landmine detection papers. However, this application should also incorporate the physics-based features that were used in this work and shown to aid in improving classification performance. Furthermore, the DP should also be considered as an additional means for exploiting spatial variations in GPR context. This was originally accomplished by modeling context with a fixed-order HMM [24], and a natural extension of this work would use the DP to express uncertainty in the number of HMM states [25]. This potential application of DP priors in context-dependent learning may yield additional improvements to classification, as spatial dependencies and physical characteristics of the data would be incorporated into inference of the underlying contextual factors.

ACKNOWLEDGMENTS

This work was sponsored by the U.S. Army RDECOM CERDEC Night Vision and Electronic Sensors Directorate, via a grant administered by the Army Research Office (#W911NF-09-1-0487 and #W911NF-06-1-0357).

REFERENCES

[1] Daniels, D. J., [Unexploded Ordnance Detection and Mitigation], ch. Ground Penetrating Radar for Buried Landmine and IED Detection, 89–111, Springer (2009).

[2] Sato, M., “GPR evaluation test for humanitarian demining in Cambodia,” in [Geoscience and Remote Sensing Symposium (IGARSS), 2010 IEEE International], 4322–4325 (2010).

[3] Sherbondy, K., “Status of vehicular mounted mine detection (VMMD) program,” in [Second International Conference on the Detection of Abandoned Land Mines], 203–207 (October 1998).

[4] Torrione, P. A., Throckmorton, C. S., and Collins, L. M., “Performance of an adaptive feature-based processor for a wideband ground penetrating radar system,” IEEE Transactions on Aerospace and Electronic Systems 42(2), 644 (2006).

[5] Gader, P. D. and Zhao, M. Y., “Landmine detection with ground penetrating radar using hidden Markov models,” IEEE Transactions on Geoscience and Remote Sensing 39(6), 1231–1244 (2001).

[6] Frigui, H. and Gader, P., “Detection and discrimination of land mines in ground-penetrating radar based on edge histogram descriptors and a possibilistic k-nearest neighbor classifier,” IEEE Transactions on Fuzzy Systems 17(1), 185–199 (2009).

[7] Ho, K. C., Gader, P. D., Wilson, J. N., and Frigui, H., “On improving subspace spectral feature technique for the detection of weak scattering plastic antitank landmines,” in [Proceedings of the SPIE Detection and Sensing of Mines, Explosive Objects, and Obscured Targets XIV], 7303, 73032D, SPIE (2009).

[8] Miller, T. W., Hendrickx, J. M. H., and Borchers, B., “Radar detection of buried landmines in field soils,” Vadose Zone Journal 3(4), 1116–1127 (2004).

[9] El-Shenawee, M. and Rappaport, C. M., “Quantifying the effects of different rough surface statistics for mine detection using the FDTD technique,” in [Proceedings of the SPIE Detection and Remediation Technologies for Mines and Minelike Targets V], 4038, 966–975 (2000).

[10] Wilson, J. N., Gader, P., Lee, W. H., Frigui, H., and Ho, K. C., “A large-scale systematic evaluation of algorithms using ground-penetrating radar for landmine detection and discrimination,” IEEE Transactions on Geoscience and Remote Sensing 45(8), 2560–2572 (2007).

[11] Frigui, H., Zhang, L., and Gader, P., “Context-dependent multisensor fusion and its application to land mine detection,” IEEE Transactions on Geoscience and Remote Sensing 48(6), 2528–2543 (2010).

[12] Bolton, J. and Gader, P., “Random set framework for context-based classification with hyperspectral imagery,” IEEE Transactions on Geoscience and Remote Sensing 47(11), 3810–3821 (2009).

[13] Ratto, C. R., Torrione, P. A., and Collins, L. M., “Exploiting ground-penetrating radar phenomenology in a context-dependent framework for landmine detection and discrimination,” Geoscience and Remote Sensing, IEEE Transactions on PP(99), 1–12 (2010).

[14] Ratto, C. R., Torrione, P. A., and Collins, L. M., “Context-dependent feature selection for landmine detection with ground-penetrating radar,” in [Proceedings of the SPIE: Detection and Sensing of Mines, Explosive Objects, and Obscured Targets XIV], 7303, 730327 (2009).

[15] Ratto, C. R., Torrione, P. A., and Collins, L. M., “Context-dependent feature selection using unsupervised contexts applied to GPR-based landmine detection,” in [Proceedings of the SPIE: Detection and Sensing of Mines, Explosive Objects, and Obscured Targets XV], Harmon, R. S., Broach, J. T., and Holloway, Jr, J. H., eds., 7664, 7664–88 (April 2010).

[16] Blei, D. and Jordan, M., “Variational inference for Dirichlet process mixtures,” Bayesian Analysis 1(1), 121–144 (2006).

[17] NIITEK, Inc., NIITEK Web Site. Available: http://www.niitek.com.

[18] Gader, P., Lee, W. H., and Wilson, J. N., “Detecting landmines with ground-penetrating radar using feature-based rules, order statistics, and adaptive whitening,” IEEE Transactions on Geoscience and Remote Sensing 42(11), 2522–2534 (2004).

[19] Bishop, C. M. and Tipping, M. E., “Variational relevance vector machines,” in [Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence], 46–53 (2000).

[20] Li, Y., Campbell, C., and Tipping, M., “Bayesian automatic relevance determination algorithms for classifying gene expression data,” Bioinformatics 18, 1332–1339 (Oct. 2002).

[21] Ishwaran, H. and James, L. F., “Gibbs sampling methods for stick-breaking priors,” Journal of the American Statistical Association 96(453), 161–173 (2001).

[22] Vinh, N. X., Epps, J., and Bailey, J., “Information theoretic measures for clusterings comparison: is a correction for chance necessary?,” in [Proceedings of the 26th Annual International Conference on Machine Learning], 1073–1080, ACM, Montreal, Quebec, Canada (2009).

[23] Xue, Y., Liao, X., Carin, L., and Krishnapuram, B., “Multi-task learning for classification with Dirichlet process priors,” Journal of Machine Learning Research 8, 35–63 (2007).

[24] Ratto, C., Torrione, P., Morton, K., and Collins, L., “Context-dependent landmine detection with ground-penetrating radar using a hidden Markov context model,” in [IEEE International Geoscience and Remote Sensing Symposium (IGARSS)], (2010).

[25] Paisley, J. and Carin, L., “Hidden Markov models with stick-breaking priors,” Signal Processing, IEEE Transactions on 57(10), 3905–3917 (2009).