Advances in WP1, Trento Meeting, 11-12 January 2007
WP1: Environment & Sensor Robustness
T1.2 Noise Independence
Noise Reduction:
– Spectral Subtraction (YEAR 1) and Spectral Attenuation (YEAR 2)
– Evaluation of feature normalization techniques in Loquendo ASR:
  HEQ study + Revision 2 (Q3/4 YEAR 2)
  PEQ study (YEAR 3)
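As an illustration of the feature normalization family studied here, the sketch below shows histogram equalization (HEQ) in its generic textbook form: each frame of one feature trajectory is replaced by the standard normal quantile of its rank within the utterance. This is not the Loquendo or UGR implementation; the function name, the per-utterance scope, and the Gaussian reference are assumptions made for the example.

```python
from statistics import NormalDist


def histogram_equalize(values):
    """Map one feature trajectory onto a standard normal reference.

    Hypothetical helper: textbook rank-based HEQ, applied to a single
    cepstral coefficient over the frames of one utterance.
    """
    n = len(values)
    reference = NormalDist(0.0, 1.0)
    # Frame indices sorted by feature value (ties keep their time order).
    order = sorted(range(n), key=lambda i: values[i])
    equalized = [0.0] * n
    for rank, i in enumerate(order):
        # Mid-point empirical CDF stays strictly inside (0, 1).
        cdf = (rank + 0.5) / n
        equalized[i] = reference.inv_cdf(cdf)
    return equalized


trajectory = [3.0, 1.0, 2.0]  # toy values for one coefficient
print(histogram_equalize(trajectory))
```

The mapping is monotonic, so the ordering of frames is preserved while the distribution of the feature is forced toward the reference.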
Y1,2+PEQ Front End configuration
Front End        Configuration
RPLP             Rasta-PLP frame with energy plus 12 CEP, plus first (D) and second (DD) derivatives
+ PEQ            Parameter Equalization (UGR) applied to the RPLP frame, plus D, DD
+ WIE SNR        Modified Rasta-PLP with Wiener-based denoising technique (PowSpec subtraction), plus D, DD
+ EM SNR         Modified Rasta-PLP with Ephraim-Malah-based denoising technique (PowSpec attenuation), plus D, DD
+ EM SNR + PEQ   Modified Rasta-PLP with Ephraim-Malah plus Parameter Equalization, plus D, DD
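The two denoising styles named in the configurations above differ in how they act on the power spectrum: subtraction removes a noise estimate from each bin, attenuation scales each bin by a gain. The sketch below shows both in their generic textbook form for a single frame. It is not the Loquendo front end: the over-subtraction factor `alpha`, the spectral `floor`, and all function names are assumptions, and the plain Wiener gain stands in for the Ephraim-Malah rule, which is more involved and is not reproduced here.

```python
def spectral_subtract(noisy_power, noise_power, alpha=1.0, floor=0.01):
    """Subtract a scaled noise power estimate from each bin of one frame.

    alpha and floor are illustrative parameters, not values from the slides.
    """
    cleaned = []
    for y, n in zip(noisy_power, noise_power):
        residual = y - alpha * n
        # The floor keeps every bin positive so a logarithm can follow.
        cleaned.append(max(residual, floor * y))
    return cleaned


def wiener_gain(snr):
    """Classic Wiener gain for a linear (not dB) SNR estimate."""
    return snr / (1.0 + snr)


def attenuate(noisy_power, gains):
    """Spectral attenuation: scale each bin by a gain between 0 and 1."""
    return [g * y for g, y in zip(gains, noisy_power)]


frame = [10.0, 1.0]  # noisy power spectrum, two bins (toy values)
noise = [2.0, 2.0]   # noise power estimate for the same bins
print(spectral_subtract(frame, noise))
print(attenuate(frame, [wiener_gain(4.0), wiener_gain(0.25)]))
```

In both cases the second bin, where noise dominates, is strongly suppressed; the SNR-dependent variants on the slides would additionally adapt the rule to the estimated SNR.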
Y1,2+PEQ Performance evaluations
Performance in terms of Word Accuracy (%), with Error Reduction (%) relative to the RPLP experiment in parentheses
CLEAN Models     Test A        Test B        Test C        A-B-C
RPLP             75.6          77.5          75.3          76.3
+ PEQ            87.1 (47.1)   87.3 (43.5)   86.9 (47.0)   87.1 (45.6)
+ WIE SNR        84.0 (34.4)   84.4 (30.7)   83.3 (32.4)   84.0 (32.5)
+ EM SNR         85.3 (39.7)   84.2 (29.5)   84.8 (34.5)   84.8 (35.9)
+ EM SNR + PEQ   86.4 (44.3)   85.8 (36.9)   86.6 (45.7)   86.2 (41.8)
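The parenthesized figures can be checked with a short calculation. The sketch below assumes the usual definition of relative error reduction, i.e. the drop in word error (100 minus Word Accuracy) divided by the baseline word error; the slides do not spell the formula out, but this definition reproduces the Test A entries for + PEQ in the clean and multi-condition tables. The function name is hypothetical.

```python
def error_reduction(baseline_wa, system_wa):
    """Relative word error reduction (%) of a system over the baseline.

    Both arguments are Word Accuracy values in percent.
    """
    baseline_err = 100.0 - baseline_wa
    system_err = 100.0 - system_wa
    return 100.0 * (baseline_err - system_err) / baseline_err


# Clean models, Test A: RPLP 75.6, + PEQ 87.1
print(round(error_reduction(75.6, 87.1), 1))
# Multi-condition models, Test A: RPLP 93.5, + PEQ 92.8 (a degradation)
print(round(error_reduction(93.5, 92.8), 1))
```

A negative value means the configuration makes more errors than the RPLP baseline.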
Y1,2+PEQ Performance evaluations
MULTI Models     Test A        Test B        Test C        A-B-C
RPLP             93.5          91.1          90.2          91.9
+ PEQ            92.8 (-10.8)  91.4 (3.4)    92.2 (20.4)   92.1 (2.5)
+ WIE SNR        93.9 (6.1)    92.1 (11.2)   90.5 (3.1)    92.5 (7.4)
+ EM SNR         94.0 (7.7)    92.0 (10.1)   91.1 (9.2)    92.6 (8.6)
+ EM SNR + PEQ   92.7 (-12.3)  91.1 (0.0)    91.7 (15.3)   91.9 (0.0)
Y1,2+PEQ Performance evaluations
Aurora3 8kHz     Ita WM        Ita HM        Spa WM        Spa HM
RPLP             98.2          46.6          97.3          74.6
+ PEQ            97.6 (-33.3)  79.7 (62.0)   97.3 (0.0)    87.1 (49.2)
+ WIE SNR        98.3 (5.5)    77.5 (59.4)   97.6 (11.1)   89.8 (59.8)
+ EM SNR         98.4 (11.1)   82.2 (66.7)   97.7 (14.8)   88.8 (55.8)
+ EM SNR + PEQ   98.0 (-11.0)  87.0 (75.6)   97.8 (18.5)   90.8 (63.8)
Y1,2+PEQ Performance evaluations (Sennheiser microphone)
CLEAN 8kHz       Clean         Car           Babble        Rest.         Street        Airport       Train Station   Noise Avg.
RPLP             85.2          54.3          23.1          29.4          34.0          29.3          32.3            33.7
+ PEQ            84.8 (-2.7)   69.5 (33.2)   54.3 (40.6)   50.2 (29.5)   52.7 (28.3)   53.5 (34.2)   53.8 (31.7)     55.7 (33.2)
+ WIE SNR        85.2 (0.0)    67.0 (27.8)   36.6 (17.5)   30.7 (1.8)    43.1 (13.8)   31.9 (3.7)    48.8 (24.4)     43.0 (14.0)
+ EM SNR         85.5 (2.0)    70.4 (35.2)   37.1 (18.2)   31.6 (3.1)    45.8 (13.8)   31.6 (3.2)    53.7 (31.6)     45.0 (17.0)
+ EM SNR + PEQ   85.4 (1.3)    70.9 (36.3)   53.9 (40.0)   49.5 (28.5)   54.9 (31.7)   51.3 (31.1)   59.1 (39.6)     56.6 (34.5)
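The Noise Avg. column appears to be the plain mean of the six noisy conditions, with the Clean column excluded. This is an inference from the numbers rather than something the slides state; the sketch below checks it against the RPLP and + PEQ rows of the Sennheiser table. The variable names are hypothetical.

```python
from statistics import fmean

# Word Accuracy (%) for Car, Babble, Rest., Street, Airport, Train Station
rplp_noisy = [54.3, 23.1, 29.4, 34.0, 29.3, 32.3]
peq_noisy = [69.5, 54.3, 50.2, 52.7, 53.5, 53.8]

# Unweighted mean over the six noisy conditions (Clean is left out).
rplp_avg = round(fmean(rplp_noisy), 1)
peq_avg = round(fmean(peq_noisy), 1)
print(rplp_avg, peq_avg)
```

Both values match the Noise Avg. entries reported for those rows (33.7 and 55.7).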
Y1,2+PEQ Performance evaluations (second microphone)
CLEAN 8kHz       Clean         Car           Babble        Rest.         Street        Airport       Train Station   Noise Avg.
RPLP             59.4          35.7          16.2          21.4          22.9          19.3          22.8            23.1
+ PEQ            72.4 (33.0)   58.2 (34.9)   46.5 (36.1)   41.9 (26.1)   43.2 (26.3)   45.1 (31.9)   45.4 (29.3)     46.7 (30.7)
+ WIE SNR        60.1 (1.7)    50.2 (22.5)   25.7 (11.3)   23.6 (2.8)    29.2 (8.2)    22.6 (4.1)    35.1 (15.9)     31.1 (10.4)
+ EM SNR         60.7 (3.2)    52.3 (25.8)   27.8 (13.8)   23.4 (2.5)    31.0 (10.5)   23.5 (5.2)    39.1 (21.1)     32.9 (12.7)
+ EM SNR + PEQ   73.6 (34.9)   58.0 (34.6)   46.0 (35.5)   39.4 (22.9)   42.7 (25.7)   43.2 (29.6)   49.4 (34.4)     46.4 (30.3)
WP1: Workplan
• Selection of suitable benchmark databases (m6)
• Completion of LASR baseline experimentation of Spectral Subtraction (Wiener SNR dependent) (m12)
• Discriminative VAD (training + AURORA3 testing) (m16)
• Experimentation of Spectral Attenuation rule (Ephraim-Malah SNR dependent) (m21)
• Preliminary results on spectral subtraction and HEQ techniques (m24)
• Integration of denoising and normalization techniques (PEQ) (m33)