anechoic signal bathroom center 16 bathroom corner 16 ping−pong...

Post on 07-Aug-2020

0 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Effects of Reverberant Energy on Statistics of SpeechMadhu Shashanka1, Barbara Shinn-Cunningham1, and Martin Cooke2

1Boston University, 2University of Sheffield{ shashanka,shinn} @cns.bu.edu, m.cooke@dcs.shef.ac.uk

Anechoic Signal

Anechoic

Bathroom Center 16

Bath Ctr 16

Bathroom Corner 16

Bath Cnr 16

Ping−pong Center 16

PP Ctr 16

Ping−pong Corner 16

PP Cnr 16

Ping−pong Side 16

PP Side 16

Anechoic Signal

Anechoic

Bathroom Center 50

Bath Ctr 50

Bathroom Corner 50

Bath Cnr 50

Ping−pong Center 50

PP Ctr 50

Ping−pong Corner 50

PP Cnr 50

Ping−pong Side 50

PP Side 50

Anechoic Signal

Anechoic

Bathroom Center 100

Bath Ctr 100

Bathroom Corner 100

Bath Cnr 100

Ping−pong Center 100

PP Ctr 100

Ping−pong Corner 100

PP Cnr 100

Ping−pong Side 100

PP Side 100

Anechoic Signal

Anechoic

Bathroom Center 200

Bath Ctr 200

Bathroom Corner 200

Bath Cnr 200

Ping−pong Center 200

PP Ctr 200

Ping−pong Corner 200

PP Cnr 200

Ping−pong Side 200

PP Side 200

Objective

Study how reverberation affects sparsity andstatistics of speech.

Measurements: We measured impulse responses in five dif ferent room/positionconditions and at four dif ferent distancesfrom the sour ce.

• Bathr oom

– Center

– Corner

• PingPong Room

– Center

– Corner

– Side

At every location/condition, responses wer erecorded at two mics (left and right) whichwer e 0.4 m apart.

How does reverb distort a speechsignal?Reverberation smears energy across time. Areverb signal is less sparse in time andfrequency than the anechoic signal. Thewaveforms and spectrograms of an anechoicsignal and a reverb signal for a particularcondition are shown below.

Waveforms

ANECHOIC BR CENTER 2m

Spectrograms

Approach

We compar ed energy dif ferences betweensignals using a time-fr equencyrepresentation (spectr ogram). Reverberantsignals wer e first time-aligned with theanechoic signal and levels wer e corr ected sothat RMS energy of the dir ect sound wasequal to that of the anechoic signal.

−100 −50 0 500

5

10

15

20

Energy (dB)

stinu FT fo eg atn ec r eP

Clean0.16 m0.50 m1.0 m2.0 m

BR CENTER

−100 −50 0 500

5

10

15

20

Energy (dB)

stinu FT fo e gatn e cre P

CleanBr CenterBr CornerPr CenterPr CornerPr Side

2.0 m

Speech and Silence

We consider another measur e of the effectsof reverberation – percentage of TF units forwhich anechoic energy was within ± 3 dB ofreverb energy.

• Speech - if energy is within 20 dB of maxenergy.

• Silence - if energy is less than max energyby more than 20 dB.

Below, we plot these percentages separatelyfor the bathroom and the pingpong room.

Bathroom :

16 50 100 2000

10

20

16 50 100 2000

20

40

)ciohcena(E fo Bd 3 ni htiw )b re ver(E

Speech

16 50 100 2000

5

10

15 Silence

CenterCorner

Pingpong room :

16 50 100 2000

20

40

20

40

60

80

0

20

40

CenterCornerSide

)ciohcena(E f o B d 3 n iht iw ) brev er(E

Speech Silence

Below, we plot the histogram of the energydif ference between S1 and S2 in the anechoicand reverberant cases.To visualize thedif ference more clearly , we also plot thedif ference between the two curves (right).

Energy(S1-S2) Histograms and theirdif ference

−100 −50 0 50 1000

5

10

15

20

Energy(S1) − Energy(S2)

stinu FT fo %

AnechoicReverb

BathroomCenter2.0m

−100 −50 0 50 100−3

−2

−1

0

1

2

3

4

)ciohcenA(% − )br ev e

R(%

Energy(S1) − Energy(S2)

BathroomCenter2.0m

We can also see to what extent energies inthe two signals overlapped. Below, we plotthe percentage of TF units for which energyin speech sample 1 was within 3 dB ofenergy in the other sample.

Bathroom :

16 50 100 20018

19

20

20

25

30

35

( ygrene ni pal rev o hcihw stinu FT

) Bd 3

Speech

18

19

20

21 Silence

CenterCornerAnechoic

Pingpong room :

16 50 100 20018

19

20CenterCornerSideAnechoic

20

25

30

35

19

20

21

( ygrene ni pa lrev o hcihw stinu FT

) Bd 3

Speech Silence

Energy of S2 in the gaps of S1

We consider how much energy of speech sample2 is present in the silence regions of speech sample 1. The plot below shows the energy of speech sample 2 and dark blue regions representTF units where sample 1 has high energy.

Sample 2 in the "silence" of Sample 1

Below, we plot energy histograms.

Silence of S1

−100 −80 −60 −40 −20 00

5

10

15

20

Energy

stinu FT fo %

AnechoicReverb

Energy of S1

−100 −80 −60 −40 −20 00

5

10

15

20

Energy

stinu FT fo %

AnechoicReverb

Energy of S2

Energy of (S1+S2)

−100 −80 −60 −40 −20 00

5

10

15

20

Energy

stinu FT fo %

AnechoicReverb

Silence of S1

−100 −80 −60 −40 −20 00

5

10

15

20

Energy

stinu FT fo %

Silence of S2AnechoicReverb

Reverberation More EnergyBelow are histograms of the energy intime frequency (TF) units.

The relative amount of energy fromreverberation grows with distance.Histogram of TF energy for thebathroom center condition.

The hard-walled bathroom (BR)produces more reverberant energythan a classroom (the ping-pong room, PR). Histogram of TF energy for different room conditions, for a 2-m distance.

Silence is filled in by reverberantenergy, especially in the hard-walled bathroom. As distanceincreases, the effects of reverberationincrease, both for silence and forspeech. However, filling in the silencedestroys the normal modulations that convey speech meaning. Plots show the percentage of TF unitsfor which the energy in the reverberantcondition is within 3 dB of what it was in the clean speech.

16 50 100 200 16 50 100 200

Simultaneous SpeechWe now consider the effect of reverberationon simultaneous speech signals.

We analyzed all TF units togethe, but alsobroke the TF units into two categories:

16 50 100 200 16 50 100 200

16 50 100 200 16 50 100 200

Distance (cm)

Distance (cm)

Distance (cm)

Distance (cm)

All TF units

All TF units

All TF units

All TF units

Histograms of energy of signal 1 (left panel) andsignal 2 (right panel) in the silence regions of signal 1.

Histograms of energy of both signals in silenceregions of signal 1 (left panel) andsignal 2 (right panel).

Acknowledgements1. Tim Streeter for helping with measu-rements of impulse responses.2. Portions of this work were supportedby grants from (a) Air Force Office of Scientific Research(b) National Science Foundation.

top related