12.5 - Low Power Speech Enhancement
David Halupka, Ph.D. Candidate, Electronics Group
June 24th, 2005
June 24th, 2005 University of Toronto 2 of 6
Motivation
Today’s recognition systems can achieve a 95%+ recognition accuracy after extensive training
Research systems: same accuracy with no training
Typically: 10% accuracy in the presence of noise, reverberations, and conflicting conversations
Humans are equipped to deal with noisy environments
Two ears → let us localize and focus on a single speaker
Complex noise: one sensor doesn’t cut it
Multiple microphones → superhuman noise filtering
Step 1: Sound Localization
[Figure: Time-Based Cross-Correlation. Microphone signals m1(t) and m2(t) plotted against time t; diagram labels: d, x, x + τν.]
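The time-based cross-correlation named on this slide can be illustrated with a minimal sketch. It assumes two sampled microphone signals and estimates the delay between them by scoring every candidate lag and keeping the best one. The function names, the `max_lag` parameter, and the synthetic test burst are hypothetical and do not come from the slides; this is not the chip's implementation.

```python
import math


def cross_correlation(m1, m2, lag):
    """Sum of m1[n] * m2[n + lag] over the samples where both signals exist."""
    total = 0.0
    for n in range(len(m1)):
        k = n + lag
        if 0 <= k < len(m2):
            total += m1[n] * m2[k]
    return total


def estimate_delay(m1, m2, max_lag):
    """Score every candidate lag in [-max_lag, max_lag] and return the best one.

    Each lag is scored independently of the others, so the scores could
    also be computed in parallel.
    """
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cross_correlation(m1, m2, lag))


# Hypothetical test data: a decaying burst that reaches the second
# microphone three samples after the first.
burst = [math.sin(0.9 * n) * math.exp(-0.05 * n) for n in range(64)]
true_delay = 3
m1 = burst + [0.0] * 8
m2 = [0.0] * true_delay + burst + [0.0] * (8 - true_delay)

estimated = estimate_delay(m1, m2, max_lag=8)
print("estimated delay in samples:", estimated)
```

A positive result means the second signal lags the first. Turning a sample delay into a direction would also require the sampling rate, the microphone spacing, and the speed of sound, none of which are given here.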
Step 2: Speech Enhancement
A Hard Case for Hardware
Localization is an exhaustive linear search
Gradient search, etc. not applicable
Each time delay must be checked
Each likelihood can be evaluated in parallel
1 GHz Intel Pentium III needed just for real-time localization → consumes 35 W
Speech interface is beneficial for handheld devices, but battery life is limited
Palm M100 → 150 mW
Results – 0.18 μm CMOS
Die Size: 2.51 mm x 2.51 mm
Power Utilization: 29 mW
Die Size: 1.51 mm x 1.38 mm
Power Utilization: 3.45 mW
FPGA: 184 mW
DSP: 650 mW