Rapid and Accurate Spoken Term Detection
Owen Kimball
BBN Technologies
15 December 2006

Overview of Talk
• BBN Levantine system description
• Evaluation results
• Diacritics
• Out-of-vocabulary issues

BBN Evaluation Team
Core Team
• Chia-lin Kao
• Owen Kimball
• Michael Kleber
• David Miller
Additional assistance
• Thomas Colthurst
• Herb Gish
• Steve Lowe
• Rich Schwartz

BBN System Overview
[Block diagram: audio → Byblos STT → lattices, phonetic transcripts → indexer → index → detector (input: search terms) → scored detection lists → decider (input: ATWV cost parameters) → final output with YES/NO decisions]

Levantine STT Configuration
• STT generates a lattice of hypotheses and a phonetic transcript for each input file.
• Word-based system:
  – Orthography based on Modern Standard Arabic (MSA), no short vowel diacritics
  – Acoustic: 57.3 hours LDC (noise words, no mixture exponents)
  – Language: 250 hours of data, 1.3M words
• 38.5K dictionary, grapheme-as-phoneme based plus 100 manual pronunciations
  – unknown short vowel (U), 39 phonemes
• 42.32% WER on STD Dev06 CTS data

Levantine CTS Results
Data      ATWV
Dev06     0.515
DryRun    0.410
Eval06    0.3467
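For reference, ATWV (actual term-weighted value) in the NIST STD 2006 evaluation is one minus the average, over terms, of P_miss + β·P_FA, with β = 999.9. The sketch below only illustrates that definition; the function name and the per-term count format are assumptions made here, and it is not the scoring tool behind the figures above.

```python
def atwv(term_counts, speech_seconds, beta=999.9):
    """Actual term-weighted value over a set of search terms.

    term_counts: iterable of (n_true, n_correct, n_false_alarm), one per term.
    speech_seconds: duration of the searched audio, in seconds.
    beta: false-alarm weight; 999.9 = 0.1 * (1 / 1e-4 - 1) in NIST STD 2006.
    """
    penalties = []
    for n_true, n_correct, n_false_alarm in term_counts:
        if n_true == 0:
            continue  # terms with no reference occurrences are not scored
        p_miss = 1.0 - n_correct / n_true
        # Non-target trials: one per second of speech, minus the true ones.
        p_fa = n_false_alarm / (speech_seconds - n_true)
        penalties.append(p_miss + beta * p_fa)
    if not penalties:
        raise ValueError("no term has reference occurrences")
    return 1.0 - sum(penalties) / len(penalties)
```

Because β is large, a single false alarm on a rare term can cost about as much as several misses, which is why the YES/NO decision stage matters so much for this metric.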

OOV Pipeline: Detector
• Word-based STT produces a 1-best transcript; pronouncing it gives a 1-best phonetic transcript.
• Query is OOV if it contains any OOV word.
• OOV query detection:
  – Pronounce query (grapheme-as-phoneme)
  – Find minimal edit-distance alignments (agrep)
  – Score = 1 − % error = 1 − (edit distance / # phonemes)
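The detection steps above can be sketched in Python as follows. This is a minimal illustration, not the BBN implementation: the slides use agrep for the alignment, whereas this sketch substitutes a plain dynamic-programming approximate substring search. All function names are hypothetical, the score is assumed to be normalized by the number of query phonemes, and `pronounce` ignores the manual pronunciations and the unknown short vowel (U).

```python
def is_oov(query_words, vocabulary):
    """A query is OOV if any of its words is missing from the STT vocabulary."""
    return any(word not in vocabulary for word in query_words)


def pronounce(words):
    """Grapheme-as-phoneme: every written character is treated as one phoneme."""
    return [ch for word in words for ch in word]


def best_match(query, transcript):
    """Best approximate occurrence of `query` inside `transcript`.

    Both arguments are phoneme lists. Returns (score, end), where
    score = 1 - edit_distance / len(query) and `end` is the transcript
    position just past the best-matching substring.
    """
    m = len(query)
    if m == 0:
        raise ValueError("empty query")
    # prev[i]: cheapest alignment of query[:i] with a transcript substring
    # ending at the previous position (initially: the empty transcript).
    prev = list(range(m + 1))
    best_dist, best_end = prev[m], 0
    for j, phone in enumerate(transcript, start=1):
        cur = [0]  # a match may start anywhere, so the empty prefix is free
        for i in range(1, m + 1):
            substitute = prev[i - 1] + (query[i - 1] != phone)
            extra_in_transcript = prev[i] + 1
            missing_from_transcript = cur[i - 1] + 1
            cur.append(min(substitute, extra_in_transcript, missing_from_transcript))
        if cur[m] < best_dist:
            best_dist, best_end = cur[m], j
        prev = cur
    return 1.0 - best_dist / m, best_end
```

Scanning the whole phonetic transcript this way costs time proportional to query length times transcript length for every query, since nothing is indexed.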

OOV Pipeline: Decider
• Need different Yes/No decision procedure: IV-decider requires posterior probabilities.
• Simple OOV decision procedure:
  – Constant threshold on score (~ 0.7)
  – Cap on maximum number of hits (0-3)
  – Values set to maximize ATWV on Dev06 data.
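A minimal sketch of such a decision rule, in Python. The slides do not say exactly how the cap is applied; this sketch assumes it limits the number of YES decisions per query and keeps the highest-scoring hits. The function name, the hit format, and the default values are illustrative only.

```python
def decide(hits, threshold=0.7, max_hits=3):
    """Attach YES/NO decisions to the scored hits of one OOV query.

    hits: list of (score, location) pairs from the detector.
    A hit gets YES if its score reaches the threshold and fewer than
    `max_hits` better-scoring hits have already been accepted.
    """
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)
    decisions = []
    accepted = 0
    for score, location in ranked:
        is_yes = score >= threshold and accepted < max_hits
        if is_yes:
            accepted += 1
        decisions.append((score, location, "YES" if is_yes else "NO"))
    return decisions
```

With `max_hits=0` every hit is answered NO, which corresponds to the low end of the 0-3 range quoted above.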

OOV Pipeline: Results
• ATWV remained good:
  – 0.3450 IV
  – 0.3635 OOV
• Searches take longer: roughly 10-15x the IV search time on Dev06 and DryRun06, with no attempt at indexing.

OOV Directions for Improvement
• Score substitutions using phoneme confusion matrix instead of flat edit distance
• Speed: indexing phonetic transcripts for approximate matching
• Search lattices beyond 1-best transcripts
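As an illustration of the first direction, the sketch below replaces the flat substitution cost of 1 with a per-pair cost looked up in a table. It is an assumption-laden sketch rather than a proposed design: how costs would be derived from a phoneme confusion matrix is left open, and the function name, the `sub_cost` dictionary format, and the example cost are hypothetical.

```python
def weighted_match_distance(query, transcript, sub_cost, indel_cost=1.0):
    """Cheapest alignment of `query` against any substring of `transcript`.

    query, transcript: phoneme lists.
    sub_cost: dict mapping (query_phoneme, transcript_phoneme) to a cost
        in [0, 1]; pairs not listed fall back to the flat cost 1.0.
    """
    m = len(query)
    prev = [i * indel_cost for i in range(m + 1)]
    best = prev[m]
    for phone in transcript:
        cur = [0.0]  # a match may start at any transcript position
        for i in range(1, m + 1):
            q = query[i - 1]
            cost = 0.0 if q == phone else sub_cost.get((q, phone), 1.0)
            cur.append(min(prev[i - 1] + cost,
                           prev[i] + indel_cost,
                           cur[i - 1] + indel_cost))
        best = min(best, cur[m])
        prev = cur
    return best


# Hypothetical cost: "s" misrecognized as "z" is cheaper than other errors.
example_costs = {("s", "z"): 0.25}
```

Under these example costs, a query "s a" matched against "z a" has distance 0.25, while against "t a" it keeps the flat distance of 1.0.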

Levantine Diacritic Issues
• Originally looked at diacritized Levantine
• Trained STT engine using LDC 45 hour set
• Ran STD without knowing WER (no diacritized STT test set to measure WER).
  – Found very high false alarm rate
• Examining FAs found hits that were legitimate alternate spellings

Levantine Diacritics - Alternate Spellings
• Examining query words found more of same:
  – In first 22 terms of dry run term list, 14 are “alternate diacritic” spellings of 5 underlying words, i.e. there were just 13 unique words in the first 22 terms
  – Min~ahumo vs Minohumo
  – AlHayaApi vs AlHayaAp
  – Waliko vs Walika
  – qabilo vs qabola vs qabolo
• LDC training and STD test set had additional pervasive differences
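To see how the variants above collapse, the sketch below strips diacritic symbols and groups terms by what remains. It assumes the terms are written in Buckwalter transliteration, where a, u, i mark short vowels, o sukun, ~ shadda, F, N, K tanween, and ` dagger alif. The function names are hypothetical, and this is an illustration of the spelling problem rather than a step the BBN system is described as performing.

```python
from collections import defaultdict

# Buckwalter symbols for Arabic diacritics (assumed transliteration scheme).
BUCKWALTER_DIACRITICS = set("auio~FNK`")


def strip_diacritics(term):
    """Remove diacritic symbols, leaving the undiacritized spelling."""
    return "".join(ch for ch in term if ch not in BUCKWALTER_DIACRITICS)


def group_by_undiacritized(terms):
    """Map each undiacritized form to the diacritized variants seen for it."""
    groups = defaultdict(list)
    for term in terms:
        groups[strip_diacritics(term)].append(term)
    return dict(groups)


variants = ["Min~ahumo", "Minohumo", "AlHayaApi", "AlHayaAp",
            "Waliko", "Walika", "qabilo", "qabola", "qabolo"]
groups = group_by_undiacritized(variants)
```

The nine example spellings reduce to four undiacritized forms. Stripping diacritics would not, by itself, address the other train/test differences mentioned above.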

No-Diacritic Levantine Issues
• A quick look turned up a smaller number of problems for no-diacritic Levantine
  – Looking at 7 top-FA terms in dev set, found
    • “bHky” vs “b>Hky” but no other spelling confusions
    • One ref instance of term with 0 duration
• It would be interesting to QC test sets for inconsistent spellings and other issues