Detecting Intoxicated Speech
Daniel Wilkey, John Graham, CS6998
Given speech, was the speaker intoxicated?
Interspeech 2011 Intoxication Challenge
Application for field sobriety testing, ignition-guards
Background
ALC – Alcohol Language Corpus
162 total participants: 84 male, 78 female
Participants reached a BAC of 0.28 – 1.75 per mille
Read 15 minutes of intoxicated speech
Returned 2 weeks later
Read 30 minutes of sober speech
The Corpus
5400 samples in total, 75 per person
Divided into 3 sets:
Development, Training, Test
Development and Training sets are labeled, with 4368 features per sample
Used cross validation to obtain results
The Corpus p2
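A minimal sketch of k-fold cross validation, to make the "cross validation" bullet concrete. The function name, the seed, and the round-robin fold assignment are illustrative assumptions, not the setup used in the experiments. Note that this simple split can place samples from the same speaker in both training and test folds; the slides do not say whether folds were speaker-disjoint.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at offset i.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        # Training data is everything outside the held-out fold.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Usage: 5 folds over 20 samples; each sample is held out exactly once.
for train_idx, test_idx in k_fold_indices(n_samples=20, k=5):
    print(len(train_idx), len(test_idx))
```

The reported score would then be the average of the per-fold scores.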
Shrikanth Narayanan of USC
Global speaker normalization
Normalizing by the sober class
Relative improvement of 7.04% overall
Professor Hirschberg
Phonotactic and phonetic cues
Experiment tests unweighted average recall… why?
We chose f-measure
Includes recall and precision
Prior Research
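The two metrics named on this slide, plus the "relative improvement" figure quoted for normalization, can be sketched from binary confusion counts. This is a sketch under assumptions: "intoxicated" is treated as the positive class, the function names are illustrative, and the counts in the usage lines are made up for the example rather than taken from the experiments.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of true positives that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1)."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def unweighted_average_recall(tp, fp, fn, tn):
    """Mean of the per-class recalls, ignoring class sizes."""
    return 0.5 * (recall(tp, fn) + recall(tn, fp))


def relative_change(baseline, new):
    """Relative improvement (positive) or decrease (negative) over a baseline."""
    return (new - baseline) / baseline


# Usage with made-up counts.
print(f_measure(tp=40, fp=20, fn=10))
print(unweighted_average_recall(tp=40, fp=20, fn=10, tn=30))
print(relative_change(baseline=0.50, new=0.55))
```

Unweighted average recall weights both classes equally regardless of how many samples each has, which matters on an unbalanced corpus; F-measure instead trades off precision and recall on the positive class only.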
Remove extraneous features with WEKA
Info-gain ratio algorithm
MFCC features performed well
No F0-based features near the top
Experiment Preparation
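A minimal sketch of ranking features by information-gain ratio, i.e. the information gain of a feature about the class divided by the entropy of the feature itself. It assumes each feature has already been discretized into bins (the corpus features are numeric, so some discretization step would be needed first); the function names and the toy data are illustrative and this is not WEKA's implementation.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by H(feature)."""
    total = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Conditional entropy H(labels | feature), weighted by bin size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(columns, labels):
    """Return feature names sorted from highest to lowest gain ratio."""
    return sorted(columns, key=lambda name: gain_ratio(columns[name], labels),
                  reverse=True)


# Usage with toy data: 1 = intoxicated, 0 = sober.
labels = [1, 1, 0, 0]
columns = {"noise": ["a", "b", "a", "b"], "useful": ["hi", "hi", "lo", "lo"]}
print(rank_features(columns, labels))
```

Features at the bottom of the ranking are the candidates for removal.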
Ignore test set
unlabeled
Down-sampling the training set
Achieved 50/50 ratio of alcoholised to non-alcoholised speech
Experiment Preparation
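The down-sampling step can be sketched as random under-sampling of the larger class until both classes have the same size. The function name, the seed, and the toy labels are assumptions for illustration; the slides do not say how the samples to drop were chosen.

```python
import random


def downsample_to_balance(samples, labels, seed=0):
    """Randomly drop samples so every class is as small as the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        # Keep `target` randomly chosen samples from each class.
        balanced.extend((sample, label) for sample in rng.sample(items, target))
    rng.shuffle(balanced)
    return balanced


# Usage: 7 sober vs. 3 alcoholised samples become 3 vs. 3.
samples = list(range(10))
labels = ["sober"] * 7 + ["alcoholised"] * 3
balanced = downsample_to_balance(samples, labels)
print(len(balanced))
```

Only the training data would be balanced this way; the trade-off is that some of the majority-class data is discarded.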
Global speaker normalization (Narayanan): insignificant negative change
Sober class normalization (Narayanan): insignificant negative change
Gender class normalization: insignificant positive change
Combining global speaker with gender normalization: 10.75% relative improvement in f-measure
Poor performance potentially related to some F0 features being filtered out
Normalization Attempts
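A hedged sketch of the normalization variants listed above, written as z-score normalization of a single feature column. Grouping by speaker gives a global speaker normalization, grouping by gender gives a gender normalization, and the second function normalizes each speaker by the statistics of that speaker's sober samples only. The use of z-scores, the function names, and the assumption that every speaker has at least one sober sample are illustrative choices; the slides do not give the exact formulas or how the combined variant was composed.

```python
from statistics import mean, pstdev


def zscore_by_group(values, groups):
    """Z-normalize one feature column within each group (speaker or gender)."""
    stats = {}
    for g in set(groups):
        members = [v for v, gg in zip(values, groups) if gg == g]
        stats[g] = (mean(members), pstdev(members))
    return [
        (v - stats[g][0]) / stats[g][1] if stats[g][1] > 0 else 0.0
        for v, g in zip(values, groups)
    ]


def zscore_by_sober_reference(values, speakers, is_sober):
    """Normalize each speaker's samples by that speaker's sober mean and std."""
    stats = {}
    for spk in set(speakers):
        # Assumes every speaker has at least one sober sample.
        sober = [v for v, s, ok in zip(values, speakers, is_sober)
                 if s == spk and ok]
        stats[spk] = (mean(sober), pstdev(sober))
    return [
        (v - stats[s][0]) / stats[s][1] if stats[s][1] > 0 else 0.0
        for v, s in zip(values, speakers)
    ]


# Usage with toy values for two speakers.
print(zscore_by_group([1.0, 3.0, 10.0, 20.0], ["a", "a", "b", "b"]))
print(zscore_by_sober_reference([1.0, 3.0, 5.0], ["a", "a", "a"],
                                [True, True, False]))
```

Sober-reference normalization needs the class label of each sample, so it cannot be applied as written to unlabeled data.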
Tried retesting data with fringe cases omitted
Fringe-case BAC between 0.08% and 0.16%, proposed by Batliner
We tried 0.02% to 0.08%
Difference in data set and threshold
Relative decrease of F-measure by 3.25%
On the Fringe
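Omitting fringe cases amounts to filtering out samples whose BAC falls inside a band around the decision threshold before retraining. A minimal sketch, assuming each sample is a dict with a hypothetical "bac" key holding the value in percent, and assuming inclusive bounds:

```python
def drop_fringe(samples, low, high):
    """Remove samples whose BAC lies in the closed interval [low, high]."""
    return [s for s in samples if not (low <= s["bac"] <= high)]


# Usage with toy BAC values in percent and the 0.02 to 0.08 band.
samples = [{"id": 1, "bac": 0.00}, {"id": 2, "bac": 0.05},
           {"id": 3, "bac": 0.08}, {"id": 4, "bac": 0.12}]
kept = drop_fringe(samples, low=0.02, high=0.08)
print([s["id"] for s in kept])
```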
Machine Learning Optimizations
Optimizing the SVM
Varied polynomial kernels
Radial basis function (RBF)
Varying number of:
Folds
Iterations
Optimization Techniques
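For reference, the two kernel families mentioned above can be written out directly. This is a sketch of the standard definitions, not the classifier configuration used in the experiments: the parameter names `degree`, `coef0`, and `gamma` and their default values are assumptions, and toolkits differ on whether the polynomial kernel includes the constant term.

```python
from math import exp


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """Polynomial kernel: (<x, y> + coef0) ** degree."""
    return (dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * squared_distance)


# Usage on toy 2-dimensional vectors.
print(polynomial_kernel([1.0, 2.0], [3.0, 4.0], degree=3))
print(rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5))
```

Varying the polynomial degree changes how high-order the feature interactions can be, while `gamma` controls how quickly the RBF similarity falls off with distance.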
Configuration
SVM kernel n=3
10-fold cross validation
Gender normalization
Sober class normalization
Final Results: difficult to compare!
Difficult to compare results
Need better corpus
Extend with GMM super-vectors
Conclusions / Extensions