Automatic detection of microchiroptera echolocation calls from field recordings
using machine learning algorithms
Mark D. Skowronski and John G. Harris
Computational Neuro-Engineering Lab
Electrical and Computer Engineering
University of Florida, Gainesville, FL, USA
May 19, 2005
Overview
• Motivations for acoustic bat detection
• Machine learning paradigm
• Detection experiments
• Conclusions
Bat detection motivations
• Bats are among the most diverse yet least studied mammals (~25% of all mammal species are bats).
• Bats affect agriculture and carry diseases (directly or through parasites).
• The acoustical domain is significant for echolocating bats and is non-invasive.
• Recorded data can be voluminous, so automated algorithms for objective and repeatable detection & classification are desired.
Conventional methods
• Conventional bat detection/classification parallels the acoustic-phonetic paradigm of automatic speech recognition (ASR) from the 1970s.
• Characteristics of acoustic phonetics:
– Originally mimicked human expert methods
– First, boundaries between regions are determined
– Second, features for each region are extracted
– Third, features are compared using decision trees, DFA
• Limitations:
– Boundaries are ill-defined and sensitive to noise
– Many feature extraction algorithms, with varying degrees of noise robustness
Machine learning
• Acoustic phonetics gave way to machine learning for ASR in the 1980s.
• Advantages:
– Decisions based on more information
– Mature statistical foundation for algorithms
– Frame-based features, informed by expert knowledge
– Improved noise robustness
• For bats: increased detection range
Detection experiments
• Database of bat calls
– 7 different recording sites, 8 species
– 1265 hand-labeled calls (from spectrogram readings)
• Detection experiment design
– Discrete events: 20-ms bins
– Discrete outcomes: Yes or No (does a bin contain any part of a bat call?)
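The binning scheme above can be sketched as follows; `label_bins` is a hypothetical helper, and the call intervals and recording length are made-up values for illustration only.

```python
def label_bins(call_intervals, duration, bin_width=0.020):
    """Return one Yes/No label per 20-ms bin: does the bin overlap any call?

    call_intervals: list of (start, end) times in seconds for hand-labeled calls
    duration: total recording length in seconds
    """
    n_bins = round(duration / bin_width)
    labels = [False] * n_bins
    for start, end in call_intervals:
        first = int(start / bin_width)
        last = min(int(end / bin_width), n_bins - 1)
        for i in range(first, last + 1):
            labels[i] = True  # bin contains some part of a call
    return labels

# Example: one 30-ms call starting at 25 ms in a 100-ms recording
# covers parts of bins 1 and 2 -> [False, True, True, False, False]
print(label_bins([(0.025, 0.055)], 0.100))
```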
Detectors
• Baseline
– Threshold on frame energy
• Gaussian mixture model (GMM)
– Models the probability distribution of call features
– Threshold on model output probability
• Hidden Markov model (HMM)
– Similar to GMM, but includes temporal constraints through piecewise-stationary states
– Threshold on model output probability along the Viterbi path
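A minimal sketch of the GMM detector, assuming scikit-learn is available; the synthetic feature arrays, mixture size, and decision threshold are placeholders for illustration, not the authors' actual configuration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
call_features = rng.normal(loc=5.0, size=(500, 6))  # frames from labeled calls
test_features = rng.normal(loc=0.0, size=(10, 6))   # frames to be scored

# Fit a mixture model to the probability distribution of call features
gmm = GaussianMixture(n_components=4, random_state=0).fit(call_features)

# Per-frame log-likelihood under the call model
log_likelihood = gmm.score_samples(test_features)

# Detection decision: threshold the model output probability
threshold = -20.0  # placeholder; tuned on held-out data in practice
detections = log_likelihood > threshold
```

The HMM detector follows the same pattern but scores frame sequences along the Viterbi path rather than frames independently.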
Feature extraction
• Baseline
– Normalization: session noise floor set to 0 dB
– Feature: frame power
• Machine learning
– Blackman window, zero-padded FFT
– Normalization: log amplitude mean subtraction
• From ASR: analogous to cepstral mean subtraction
• Removes the transfer function of the recording environment
• Mean taken across time for each FFT bin
– Features:
• Maximum FFT amplitude, dB
• Frequency at maximum amplitude, Hz
• First and second temporal derivatives (slope, concavity)
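The machine-learning feature chain above can be sketched with NumPy; the frame length, FFT size, and sample rate are illustrative assumptions, not the authors' exact settings.

```python
import numpy as np

def extract_features(frames, fs, nfft=1024):
    """frames: (n_frames, frame_len) array of audio frames; fs: sample rate, Hz."""
    window = np.blackman(frames.shape[1])
    spec = np.fft.rfft(frames * window, n=nfft, axis=1)   # zero-padded FFT
    log_amp = 20.0 * np.log10(np.abs(spec) + 1e-12)       # log amplitude, dB
    log_amp -= log_amp.mean(axis=0)                       # mean subtraction across time, per FFT bin
    power = log_amp.max(axis=1)                           # maximum FFT amplitude, dB
    freq = log_amp.argmax(axis=1) * fs / nfft             # frequency at maximum, Hz
    d_power, d_freq = np.gradient(power), np.gradient(freq)        # first derivatives (slope)
    dd_power, dd_freq = np.gradient(d_power), np.gradient(d_freq)  # second derivatives (concavity)
    return np.column_stack([power, freq, d_power, d_freq, dd_power, dd_freq])
```

Each frame thus yields the six features listed above: power, frequency, and their first and second temporal derivatives.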
Feature extraction examples
Six features: power (P), frequency (F), and their first and second temporal derivatives (ΔP, ΔF, ΔΔP, ΔΔF)
Detection example
Experiment results
Conclusions
• Machine learning algorithms improve detection when specificity is high (>0.6).
• The HMM is slightly superior to the GMM and uses more temporal information, but is slower to train/test.
• Hand labels were determined using spectrograms and are biased towards high-power calls.
• Machine learning models are applicable to other species.
Bioacoustic applications
• To apply machine learning to other species:
– Determine ground-truth training data through expert hand labels
– Extract relevant frame-based features, considering domain-specific noise sources (echoes, propeller noise, other biological sources)
– Train models of the features from hand-labeled data
– Consider training “silence” models for discriminant detection/classification
Further information
• http://www.cnel.ufl.edu/~markskow
• [email protected]
Acknowledgements
Bat data kindly provided by:
Brock Fenton, U. of Western Ontario, Canada