Anomaly Detection in Gamma Ray Spectra: A Machine Learning Perspective
Nathalie Japkowicz, Colin Bellinger, Shiven Sharma, Rodney Berg, Kurt Ungar
University of Ottawa, Northern Illinois University
Radiation Protection Bureau, Health Canada
Goal and Methodology
Goal: To identify people concealing radioactive material that may represent a threat to attendees at public gatherings.
Methodology: Analysis of gamma-ray spectra produced by spectrometers at short time intervals, with an on-the-fly decision on whether a threat is present.
General idea: To place spectrometers in strategic locations (e.g., the entry points to the event) and detect whether each incoming spectrum is similar to or different from a normal spectrum for that particular location.
Gamma-Ray Spectroscopy (Wikipedia)
The gamma-ray spectrum of natural uranium, showing about a dozen discrete lines superimposed on a smooth continuum, allows the identification of the nuclides 226Ra, 214Pb, and 214Bi of the uranium decay chain.
The quantitative study of the energy spectra of gamma-ray sources.
Most radioactive sources produce gamma rays of various energy levels and intensities.
The data
I = Iodine, Tc = Technetium, Th = Thallium, Cs = Cesium, Co = Cobalt
Approach
To apply machine learning/pattern recognition techniques to the data.
Issue 1: There is a lot of background data, but very few alarms. E.g., for one station: 24,712/6
Data was augmented with simulated Cobalt entries (though we only used that data for testing).
We used one-class learning/anomaly detection algorithms to deal with this extreme class imbalance.
Issue 2: We discovered that rain was a problem as it masked the presence of isotopes in the spectra.
Since we had labelled data of both the rain and non-rain classes, we used binary classification on this problem.
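The slides do not say how the simulated Cobalt entries were generated. As a rough illustration only, one plausible way to build such test spectra is to add Poisson-distributed counts around the two well-known Co-60 gamma lines (1173.2 keV and 1332.5 keV) to a measured background spectrum. The function name `inject_cobalt`, the linear energy calibration (`kev_per_channel`), the peak width (`sigma_kev`) and the source strength (`counts_per_line`) are all assumptions made for this sketch, not values from the study.

```python
import numpy as np

CO60_LINES_KEV = (1173.2, 1332.5)  # Co-60 gamma-ray energies


def inject_cobalt(background, kev_per_channel=3.0, counts_per_line=200.0,
                  sigma_kev=30.0, rng=None):
    """Return a copy of `background` with simulated Co-60 peaks added.

    All parameters are hypothetical: a linear calibration, Gaussian peak
    shapes, and an arbitrary expected number of counts per line.
    """
    if rng is None:
        rng = np.random.default_rng()
    background = np.asarray(background)
    # Energy at the centre of each channel under the assumed calibration.
    energies = (np.arange(background.size) + 0.5) * kev_per_channel
    expected = np.zeros(background.size)
    for line in CO60_LINES_KEV:
        shape = np.exp(-0.5 * ((energies - line) / sigma_kev) ** 2)
        expected += counts_per_line * shape / shape.sum()
    # Counting noise: the injected counts are Poisson around the expectation.
    return background + rng.poisson(expected)


# Synthetic stand-in for one measured background spectrum (1024 channels).
rng = np.random.default_rng(1)
background = rng.poisson(5.0, size=1024)
test_spectrum = inject_cobalt(background, rng=rng)
```

Consistent with the slide, spectra produced this way would be held out for testing only; the one-class detector itself is fitted on background data alone.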
The effect of rain
Hypothesis
Separating rain from non-rain data in a first phase and
applying an anomaly detection system on each
group of data separately in a second phase could help
us improve the results.
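A minimal sketch of this two-phase idea follows, assuming spectra are fixed-length count vectors with a rain/non-rain label. Phase 1 is a hand-written Gaussian naive Bayes classifier; phase 2 is a deliberately simple per-channel z-score detector fitted separately on each group, used here only as a stand-in for whatever anomaly detector is chosen. All function names, the `z_max` threshold, and the synthetic data are hypothetical.

```python
import numpy as np


def fit_gaussian_nb(X, y):
    """Phase 1 model: per-class mean, variance and prior for each channel."""
    model = {}
    for label in (False, True):
        rows = X[y == label]
        model[label] = (rows.mean(axis=0), rows.var(axis=0) + 1e-9,
                        len(rows) / len(X))
    return model


def predict_gaussian_nb(model, x):
    """Return the label (False = non-rain, True = rain) with highest posterior."""
    def log_posterior(params):
        mean, var, prior = params
        return np.log(prior) - 0.5 * np.sum(
            np.log(2 * np.pi * var) + (x - mean) ** 2 / var)
    return max(model, key=lambda label: log_posterior(model[label]))


def fit_group_detector(X, z_max=6.0):
    """Phase 2 stand-in: flag a spectrum if any channel is > z_max sigmas off."""
    mean, std = X.mean(axis=0), X.std(axis=0) + 1e-9
    return lambda x: bool(np.max(np.abs((x - mean) / std)) > z_max)


def fit_two_phase(X, is_rain):
    X = np.asarray(X, dtype=float)
    is_rain = np.asarray(is_rain, dtype=bool)
    nb = fit_gaussian_nb(X, is_rain)
    detectors = {flag: fit_group_detector(X[is_rain == flag])
                 for flag in (False, True)}

    def classify(x):
        x = np.asarray(x, dtype=float)
        group = predict_gaussian_nb(nb, x)   # phase 1: rain or non-rain?
        return group, detectors[group](x)    # phase 2: anomaly within group?

    return classify


# Synthetic data: rain is modelled (hypothetically) as a higher count rate.
rng = np.random.default_rng(0)
dry = rng.poisson(50, size=(500, 8))
rain = rng.poisson(90, size=(500, 8))
classify = fit_two_phase(np.vstack([dry, rain]), [False] * 500 + [True] * 500)

spiked = np.full(8, 90.0)
spiked[3] = 400.0
print(classify(np.full(8, 50.0)), classify(np.full(8, 90.0)), classify(spiked))
```

Each call returns a pair: the predicted rain label and whether the spectrum is anomalous relative to the background of that group.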
Approach (cont’d)
Experiments
Experiments (Cont’d)
We experimented with different classifiers in both phases.
Phase 1:
Classifiers tried: SVM, J48, NB, MLP and IBL.
Winner: NB
Phase 2:
Classifiers tried: oc-SVM, AA, Mahalanobis Distance
Winner: Mahalanobis Distance
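For reference, a Mahalanobis-distance detector can be sketched as follows: fit a mean and covariance on background spectra only, then flag any spectrum whose distance from that mean exceeds a threshold. The slides do not describe the actual implementation, so the class name, the small ridge term added to keep the covariance invertible, and the choice of a background quantile as the threshold are assumptions of this sketch.

```python
import numpy as np


class MahalanobisDetector:
    """One-class detector: fitted on background spectra only."""

    def fit(self, background, quantile=0.999, ridge=1e-6):
        X = np.asarray(background, dtype=float)
        self.mean_ = X.mean(axis=0)
        # Ridge term (an assumption) guards against a singular covariance.
        cov = np.cov(X, rowvar=False) + ridge * np.eye(X.shape[1])
        self.precision_ = np.linalg.inv(cov)
        # Threshold: a high quantile of the background's own distances.
        self.threshold_ = np.quantile(self.distance(X), quantile)
        return self

    def distance(self, spectra):
        d = np.atleast_2d(np.asarray(spectra, dtype=float)) - self.mean_
        # sqrt(d^T * Sigma^-1 * d) for each row of d.
        return np.sqrt(np.einsum("ij,jk,ik->i", d, self.precision_, d))

    def is_anomaly(self, spectra):
        return self.distance(spectra) > self.threshold_


# Synthetic background: 2000 spectra with 16 channels each.
rng = np.random.default_rng(0)
background = rng.poisson(lam=50.0, size=(2000, 16))
detector = MahalanobisDetector().fit(background)

normal = rng.poisson(lam=50.0, size=16)
spiked = normal.copy()
spiked[10] += 200  # hypothetical isotope peak in one channel
```

Unlike a per-channel check, the distance accounts for correlations between channels, so a shift that is small in every single channel can still score as unusual when it runs against the usual joint behaviour of the background.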
Experiments (Cont’d)
Results
Conclusions and report on further experiments