
12.5 - Low Power Speech Enhancement

David Halupka, Ph.D. Candidate, Electronics Group

June 24th, 2005

June 24th, 2005 University of Toronto 2 of 6

Motivation

Today’s recognition systems can achieve a 95%+ recognition accuracy after extensive training

Research systems: same accuracy with no training

Typically: 10% accuracy in the presence of noise, reverberations, and conflicting conversations

Humans are equipped to deal with noisy environments

Two ears → let us localize and focus on a single speaker

Complex noise: one sensor doesn’t cut it

Multiple microphones → superhuman noise filtering

Step 1: Sound Localization

[Figure: Time-Based Cross-Correlation of the microphone signals m1(t) and m2(t) over time t; diagram labels: d, x, x+τν]
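The time-based cross-correlation named on this slide can be illustrated with a minimal sketch: the lag at which the cross-correlation of the two microphone signals peaks is taken as the arrival-time difference. This is not the author's hardware implementation. The function names, the `sample_rate_hz` parameter, and the synthetic white-noise test signal are all assumptions made for illustration, and NumPy stands in for whatever arithmetic the chip performs.

```python
import numpy as np

def estimate_delay_samples(m1, m2):
    """Return how many samples m2 lags behind m1 (negative if m2 leads)."""
    # Entry i of the full cross-correlation corresponds to lag i - (len(m1) - 1).
    corr = np.correlate(m2, m1, mode="full")
    return int(np.argmax(corr)) - (len(m1) - 1)

def estimate_delay_seconds(m1, m2, sample_rate_hz):
    """Convert the peak lag to seconds (sample_rate_hz is a hypothetical parameter)."""
    return estimate_delay_samples(m1, m2) / sample_rate_hz

# Synthetic check (assumed data): the second microphone hears the same
# white-noise source 7 samples later than the first.
rng = np.random.default_rng(0)
source = rng.standard_normal(1200)
m1 = source[100:1100]
m2 = source[93:1093]  # m2[i] == m1[i - 7]

print(estimate_delay_samples(m1, m2))
```

With a known microphone spacing, the estimated delay constrains the direction of the speaker; that geometric step is not shown here because the slide does not spell it out.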

Step 2: Speech Enhancement

A Hard Case for Hardware

Localization is an exhaustive linear search

Gradient search, etc. not applicable

Each time delay must be checked

Each likelihood can be evaluated in parallel

1 GHz Intel Pentium III needed just for real-time localization → consumes 35 W

Speech interface is beneficial for handheld devices, but battery life is limited

Palm M100 → 150 mW
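The point about exhaustive search can be made concrete with a small sketch. The slides do not say why gradient search is not applicable; one plausible reason, assumed here, is that the per-delay score has many local peaks, so a local climb stops at the wrong delay. The tone-burst signal, the function names, and the use of a plain dot product as the "likelihood" are all illustrative assumptions, not the author's design.

```python
import numpy as np

def lag_score(m1, m2, lag):
    """Correlation score for one candidate delay; depends on no other delay."""
    n = len(m1)
    if lag >= 0:
        return float(np.dot(m1[:n - lag], m2[lag:]))
    return float(np.dot(m1[-lag:], m2[:n + lag]))

def exhaustive_search(m1, m2, max_lag):
    """Score every candidate delay and return the best one."""
    lags = list(range(-max_lag, max_lag + 1))
    # Each score is independent, so this loop could run fully in parallel.
    scores = [lag_score(m1, m2, lag) for lag in lags]
    return lags[int(np.argmax(scores))]

def hill_climb(m1, m2, start, max_lag):
    """Local search: step to the better neighbouring delay until none improves."""
    lag = start
    while True:
        neighbours = [l for l in (lag - 1, lag + 1) if abs(l) <= max_lag]
        best = max(neighbours, key=lambda l: lag_score(m1, m2, l))
        if lag_score(m1, m2, best) <= lag_score(m1, m2, lag):
            return lag
        lag = best

# Assumed test signal: a windowed tone burst (period 20 samples), which gives
# a score with one peak per period. m2 is m1 delayed by 30 samples.
m1 = np.zeros(400)
m1[100:301] = np.hanning(201) * np.sin(2 * np.pi * np.arange(201) / 20)
m2 = np.roll(m1, 30)  # wrapped-around samples are zeros, so this is a pure delay

print(exhaustive_search(m1, m2, 60))   # finds the true delay
print(hill_climb(m1, m2, 0, 60))       # stops on a neighbouring local peak
```

Because every call to `lag_score` is independent of the others, the exhaustive version maps naturally onto parallel hardware, which is the argument the slide makes for a dedicated chip.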

Results – 0.18 μm CMOS

Die Size: 2.51 mm x 2.51 mm
Power Utilization: 29 mW

Die Size: 1.51 mm x 1.38 mm
Power Utilization: 3.45 mW
FPGA: 184 mW
DSP: 650 mW
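The power figures on these slides can be put side by side with a little arithmetic. The sketch below makes two assumptions that the slides do not state: that the FPGA and DSP figures implement the same function as the 3.45 mW design they are listed with, and that both chips would run together inside the 150 mW Palm M100 budget quoted earlier. The function names are illustrative.

```python
def power_ratio(baseline_mw, design_mw):
    """How many times more power the baseline draws than the design."""
    return baseline_mw / design_mw

def budget_fraction(design_powers_mw, budget_mw):
    """Fraction of a power budget consumed by a set of designs."""
    return sum(design_powers_mw) / budget_mw

# Figures quoted on the slides, in milliwatts.
asic_small_mw = 3.45
asic_large_mw = 29.0
fpga_mw = 184.0
dsp_mw = 650.0
handheld_budget_mw = 150.0  # Palm M100 figure from the previous slide

print(round(power_ratio(fpga_mw, asic_small_mw), 1))  # FPGA vs. 3.45 mW design
print(round(power_ratio(dsp_mw, asic_small_mw), 1))   # DSP vs. 3.45 mW design
print(round(budget_fraction([asic_large_mw, asic_small_mw], handheld_budget_mw), 3))
```

Under those assumptions the FPGA draws roughly 53 times and the DSP roughly 188 times the power of the 3.45 mW design, and the two chips together use about a fifth of the handheld budget.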
