nathalie japkowicz, colin bellinger, shiven sharma, rodney berg, kurt ungar university of ottawa,...

Anomaly Detection in Gamma Ray Spectra: A Machine Learning Perspective

Nathalie Japkowicz, Colin Bellinger, Shiven Sharma, Rodney Berg, Kurt Ungar

University of Ottawa, Northern Illinois University

Radiation Protection Bureau, Health Canada

Goal and Methodology

Goal: To identify people concealing radioactive material that may represent a threat to attendees at public gatherings.

Methodology: Analysis of gamma-ray spectra produced by spectrometers at short time intervals, with an on-the-fly decision on whether a threat is present.

General idea: To place spectrometers in strategic locations (e.g., the entry points to the event) and detect whether each incoming spectrum is similar to or different from a normal spectrum for that particular location.

Gamma-Ray Spectroscopy (Wikipedia)

The gamma-ray spectrum of natural uranium, showing about a dozen discrete lines superimposed on a smooth continuum, allows the identification of the nuclides 226Ra, 214Pb, and 214Bi of the uranium decay chain.

The quantitative study of the energy spectra of gamma-ray sources.

Most radioactive sources produce gamma rays of various energy levels and intensities.

http://en.wikipedia.org/wiki/File:Gammaspektrum_Uranerz.jpg


http://en.wikipedia.org/wiki/Uranium

http://en.wikipedia.org/wiki/Nuclide

http://en.wikipedia.org/wiki/Radium

http://en.wikipedia.org/wiki/Lead

http://en.wikipedia.org/wiki/Bismuth

http://en.wikipedia.org/wiki/Decay_chain



The data

I = Iodine, Tc = Technetium, Th = Thallium, Cs = Cesium, Co = Cobalt

Approach

To apply Machine Learning/Pattern Recognition techniques to the data.

Issue 1: There is a lot of background data, but very few alarms. E.g., for one station: 24,712 background spectra versus 6 alarms.

Data was augmented with simulated Cobalt entries (though we only used that data for testing).

We used one-class learning/anomaly detection algorithms to deal with this extreme class imbalance.

Issue 2: We discovered that rain was a problem, as it masked the presence of isotopes in the spectra.

Since we had labelled data for both the rain and non-rain classes, we used binary classification on this problem.

The effect of rain

Hypothesis

Separating rain from non-rain data in a first phase and applying an anomaly detection system on each group of data separately in a second phase could help us improve the results.
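
The two-phase idea in the hypothesis can be sketched as a simple routing step: a binary rain/non-rain classifier runs first, and its answer selects which anomaly detector judges the spectrum. The sketch below is illustrative only. The function name `make_two_phase_detector`, the toy rain rule, and the toy thresholds are all assumptions made for the example and are not taken from the slides.

```python
from typing import Callable, Dict, Sequence

Spectrum = Sequence[float]  # counts per energy channel (illustrative representation)


def make_two_phase_detector(
    is_rain: Callable[[Spectrum], bool],
    detectors: Dict[str, Callable[[Spectrum], bool]],
) -> Callable[[Spectrum], dict]:
    """Return a function that routes each spectrum to the detector for its group."""

    def detect(spectrum: Spectrum) -> dict:
        # Phase 1: binary classification into rain / non-rain.
        group = "rain" if is_rain(spectrum) else "non-rain"
        # Phase 2: the anomaly detector trained for that group decides on an alarm.
        return {"group": group, "alarm": detectors[group](spectrum)}

    return detect


# Hypothetical stand-ins for the trained models (assumptions, not the authors' models).
def toy_is_rain(spectrum: Spectrum) -> bool:
    # Pretend rain raises the counts in the first two channels.
    return sum(spectrum[:2]) > 10


toy_detectors = {
    # A higher alarm threshold when rain has raised the background.
    "rain": lambda s: max(s[2:]) > 8,
    "non-rain": lambda s: max(s[2:]) > 4,
}

detect = make_two_phase_detector(toy_is_rain, toy_detectors)
result_dry = detect([2, 3, 6, 1])  # routed to the non-rain detector
result_wet = detect([7, 8, 6, 1])  # routed to the rain detector
print(result_dry, result_wet)
```

In this toy setup the same count in the third channel raises an alarm in dry conditions but not in rain, which is the kind of group-specific behaviour the hypothesis is aiming for.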

Approach (cont’d)

Experiments

Experiments (Cont’d)

We experimented with different classifiers in both phases.

Phase 1:

Classifiers tried: SVM, J48, NB, MLP and IBL.

Winner: NB

Phase 2:

Classifiers tried: oc-SVM, AA, Mahalanobis Distance.

Winner: Mahalanobis Distance
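
The slides name Mahalanobis distance as the winning Phase 2 method but give no implementation details. As a rough illustration of how such a one-class detector can work, the sketch below fits a mean and covariance to background spectra only, then scores new spectra by their distance from that background. The synthetic three-channel data, the small ridge term, and the 99th-percentile threshold are assumptions chosen for the example, not settings reported by the authors.

```python
import numpy as np


def fit_background(X):
    """Estimate the mean and inverse covariance of background spectra (rows = spectra)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Small ridge term (an assumption) to keep the covariance safely invertible.
    cov = cov + 1e-6 * np.eye(X.shape[1])
    return mean, np.linalg.inv(cov)


def mahalanobis(x, mean, inv_cov):
    """Mahalanobis distance of one spectrum from the background distribution."""
    d = np.asarray(x, dtype=float) - mean
    return float(np.sqrt(d @ inv_cov @ d))


# Synthetic background: 500 spectra with 3 energy channels (illustrative only).
rng = np.random.default_rng(0)
background = rng.normal(loc=[100.0, 50.0, 20.0], scale=[5.0, 3.0, 2.0], size=(500, 3))

mean, inv_cov = fit_background(background)

# Alarm threshold: 99th percentile of the background's own distances (an assumption).
scores = np.array([mahalanobis(x, mean, inv_cov) for x in background])
threshold = float(np.quantile(scores, 0.99))

typical = mahalanobis([100.0, 50.0, 20.0], mean, inv_cov)  # looks like background
unusual = mahalanobis([100.0, 50.0, 40.0], mean, inv_cov)  # excess counts in channel 3

print(f"threshold={threshold:.2f} typical={typical:.2f} unusual={unusual:.2f}")
```

Because only background data is needed to fit the model, this kind of detector suits the extreme class imbalance described earlier, where alarms are too rare to train a conventional binary classifier.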

Experiments (Cont’d)

Results

Conclusions and report on further experiments
