© Fraunhofer IIS, 08.09.2017

Methods for Low Bitrate Coding Enhancement Part II: Spatial Enhancement

Christian Uhle¹,², Patrick Gampp¹, Oliver Hellmuth¹, Peter Prokein¹, Jürgen Herre²,¹, Sascha Disch¹,², Julia Havenstein¹, Antonios Karampourniotis¹

¹ Fraunhofer IIS, Erlangen, Germany
² International Audio Laboratories Erlangen, Germany


1. Introduction

2. System Overview

3. Ambient Sound Enhancement

4. Stereo Width Enhancement

5. Evaluation

6. Conclusion


1. Introduction: Motivation

Perceptual Audio Coding (PAC) is applied for storage and transmission of audio signals.

Perceptual transparency is achieved when the bitrate is high enough.
• Original and coded/decoded signals are indistinguishable when listening in an optimal listening environment.

At low bitrates, artifacts can be introduced and the sound quality is reduced.
• The width of the stereo image is reduced, e.g. due to
  • a decreased difference signal (M/S coding),
  • an increased correlation between channel signals (intensity stereo coding).


The aim is to apply post-processing to improve the sound quality.
• Single-ended, i.e. without information about the coding (codec, bitrate).
• The criterion is pleasantness, not transparency.


2. System Overview: Integration into the Automotive Sound System

[Figure: two block diagrams, "Operation in the Head Unit" and "Operation in the Amplifier". Blocks: Audio Decoder, Audio Source Manager, Low Bitrate Coding Enhancement Suite (Spectral Restoration, Spatial Enhancement, ...) and Car Sound Processing, grouped into Car Head Unit and Car Amplifier. Signals: (Compressed) Audio Bitstream, (Degraded) PCM Audio Signal, (Enhanced) PCM Audio Signal, PCM Loudspeaker Signal.]


3. Ambient Sound Enhancement: Overview

Improve the perceived stereo image by applying artificial decorrelation to the background signal components.
• Background sounds: ambient sounds, background music (radio broadcast) and musical accompaniment.
• Foreground sounds: singers, talkers, soloists, loud instruments (drums).

Maintains the timbral qualities without introducing coloration and artifacts.
• Decorrelation can impair the sound quality when applied to foreground sounds (e.g. speech, drums).
• Decorrelation is not required for foreground sounds (directional sounds are locatable).

The intensity of the decorrelation is controlled using a model of reverberance (a perceptual attribute that relates to the intensity of reverberation).


3. Ambient Sound Enhancement: Block Diagram

Background sounds are separated by attenuating transient and tonal signals.


3. Ambient Sound Enhancement: Separation of the Background Sounds

• STFT,
• Spectral weighting, i.e. scaling of the spectral coefficients:
  • Spectral weights (for each time-frequency bin) to attenuate transient signal components,
  • Spectral weights for attenuating tonal signal components,
  • Combination of these spectral weights (by taking the minimum of both),
• Inverse STFT.
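The combination step can be sketched as follows. This is a minimal illustration, assuming the STFT is already available as a list of frames with one complex coefficient per frequency bin; the function names `combine_weights` and `apply_weights` are hypothetical and not taken from the slides.

```python
def combine_weights(g_transient, g_tonal):
    """Combine two weight matrices (frames x bins) by the element-wise minimum."""
    return [
        [min(gt, gn) for gt, gn in zip(row_t, row_n)]
        for row_t, row_n in zip(g_transient, g_tonal)
    ]


def apply_weights(spectrum, weights):
    """Scale every spectral coefficient by its real-valued weight."""
    return [
        [g * x for g, x in zip(w_row, x_row)]
        for w_row, x_row in zip(weights, spectrum)
    ]


# Toy data: one frame, two bins (values are illustrative only).
spectrum = [[1 + 1j, 2 + 0j]]
g = combine_weights([[1.0, 0.2]], [[0.5, 0.9]])
background = apply_weights(spectrum, g)
```

Taking the minimum means a bin is attenuated as soon as either detector flags it, so only components that are neither transient nor tonal pass through unchanged.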


3. Ambient Sound Enhancement: Attenuation of Transient Signals

Signal model: the input signal is an additive mixture of a transient signal component and a sustained signal component (in the STFT domain, with time frame index k and frequency bin index m):

$$|X(k,m)| = |X_t(k,m)| + |X_s(k,m)|$$

The transient signal is attenuated by spectral weighting:

$$|Y_{\mathrm{trns}}(k,m)| = G_{\mathrm{trns}}(k,m)\,|X(k,m)|$$

The spectral weights are computed from estimates of the sustained signal and the transient signal:

$$G_{\mathrm{trns}} = \frac{|\hat{X}_s| + |\hat{X}_t|}{|X|}$$

The sustained signal magnitude is estimated by low-pass filtering the sub-band magnitudes along time and limiting the sustained signal by the input.
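A minimal sketch of this estimation, under several assumptions that are not stated on the slide: the low-pass is a one-pole recursive filter with coefficient `alpha`, the transient estimate is the remainder of the additive model, and the transient part is kept with a small residual gain `residual` so that the weight becomes (|X̂s| + residual·|X̂t|) / |X|. All names and default values are hypothetical.

```python
def sustained_estimate(mags, alpha=0.9):
    """Low-pass each sub-band magnitude along time and limit it by the input.

    mags: list of frames, each a list of non-negative bin magnitudes.
    """
    prev = list(mags[0])  # start the recursion from the first frame
    estimate = []
    for frame in mags:
        cur = [min(x, alpha * p + (1.0 - alpha) * x) for x, p in zip(frame, prev)]
        estimate.append(cur)
        prev = cur
    return estimate


def transient_weights(mags, alpha=0.9, residual=0.1, eps=1e-12):
    """Spectral weights that keep the sustained part and scale down the rest."""
    weights = []
    for frame, sus in zip(mags, sustained_estimate(mags, alpha)):
        row = []
        for x, s in zip(frame, sus):
            t = x - s  # transient estimate from |X| = |Xt| + |Xs|
            row.append((s + residual * t) / x if x > eps else 1.0)
        weights.append(row)
    return weights


# One sub-band with a short burst in the third frame.
w = transient_weights([[1.0], [1.0], [10.0], [1.0]])
```

In this toy input the weight stays close to 1 for the steady frames and drops for the burst frame, because the smoothed magnitude cannot follow the sudden rise.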


3. Ambient Sound Enhancement: Attenuation of Transient Signals (2)

Sound example: input signal (black) overlaid by output signal (red).

[Figure: waveform plot, amplitude (about -0.3 to 0.4) over time (0 to 5 s); legend: Input, Output.]


3. Ambient Sound Enhancement: Attenuation of Tonal Signals

Attenuate spectral components that exceed an estimate of the noise floor, i.e. a locally flat magnitude spectrum.
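One way to sketch this for a single frame, assuming the noise floor is estimated with a running median across neighbouring frequency bins (the slide does not say how the floor is obtained; the median and the window size `half_width` are assumptions made here for illustration):

```python
from statistics import median


def noise_floor(mags, half_width=3):
    """Locally flat floor estimate: median magnitude in a window around each bin."""
    n = len(mags)
    return [
        median(mags[max(0, m - half_width): min(n, m + half_width + 1)])
        for m in range(n)
    ]


def tonal_weights(mags, half_width=3, eps=1e-12):
    """Weights that pull bins exceeding the floor estimate down to the floor."""
    floor = noise_floor(mags, half_width)
    return [min(1.0, f / x) if x > eps else 1.0 for x, f in zip(mags, floor)]


# Flat spectrum with one strong tonal peak in the middle bin.
frame = [1.0] * 5 + [20.0] + [1.0] * 5
weights = tonal_weights(frame)
```

Bins at or below the floor keep a weight of 1, so noise-like background content is left untouched while narrow peaks are reduced.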


3. Ambient Sound Enhancement: Decorrelation of Background Sounds

Linear time-invariant processing in the time domain with a dense and short impulse response.

The decorrelation filter structure is a trade-off between sound quality and complexity (computational load, memory requirements and tuning effort).
• Here: 3 nested all-pass filters in parallel per output channel.
• The tuning of the parameters (delays and gains of the all-pass filters) is of crucial importance.
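To illustrate the building block, the following sketch implements one Schroeder all-pass filter whose delay line is followed by an inner all-pass (one level of nesting). The delays and gains are placeholder values, not the tuned parameters of the presented system, and the parallel arrangement of three such filters per channel is not shown.

```python
class Delay:
    """Fixed-length delay line implemented as a circular buffer."""

    def __init__(self, length):
        self.buf = [0.0] * length
        self.pos = 0

    def read(self):
        return self.buf[self.pos]  # sample written `length` steps ago

    def write(self, value):
        self.buf[self.pos] = value
        self.pos = (self.pos + 1) % len(self.buf)


class NestedAllPass:
    """Schroeder all-pass; an optional inner all-pass sits after the delay."""

    def __init__(self, delay, gain, inner=None):
        self.delay = Delay(delay)
        self.gain = gain
        self.inner = inner

    def process(self, x):
        w = self.delay.read()
        if self.inner is not None:
            w = self.inner.process(w)
        v = x + self.gain * w          # feedback path
        self.delay.write(v)
        return -self.gain * v + w      # feedforward path


def impulse_response(filt, length):
    """Feed a unit impulse through the filter and collect the output."""
    return [filt.process(1.0 if n == 0 else 0.0) for n in range(length)]


# Placeholder parameters (assumed, not from the slides).
allpass = NestedAllPass(delay=7, gain=0.5, inner=NestedAllPass(delay=3, gain=0.4))
h = impulse_response(allpass, 4000)
energy = sum(v * v for v in h)
```

Because the structure is all-pass, the impulse response carries the same energy as the input impulse while spreading it over many taps, which is what makes it usable for decorrelation without changing the magnitude spectrum.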


3. Ambient Sound Enhancement: Decorrelator Gain Control

The perceived level of decorrelation (and reverberation) depends on both the processing (impulse response) and the input signal.
• Lower effect intensity for stationary input signals than for transient signals or frequency-modulated signals (e.g. speech).

The level of decorrelation is controlled using a model for the perceived intensity of decorrelation.
• Modified version of a model of reverberance (Uhle et al., 2011),
• Based on a model for partial loudness (Moore et al., 1997).
• Partial loudness difference = partial loudness of the decorrelated signal (masked by the dry input) - partial loudness of the dry input (masked by the decorrelated signal).


4. Stereo Width Enhancement: Overview

Extending the width of the stereo image by enhancing inter-channel level differences of direct sound components:
1. Stereo Mid/Side Decomposition,
2. Boost the stereo side signal.

[Figure: block diagram. x(t) → STFT → X(k, m) → Stereo M/S Decomposition → M(k, m) and S(k, m); the side signal S(k, m) is scaled by a weight w; output Y(k, m) → iSTFT → y(t).]
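The two steps can be illustrated with the conventional sum/difference form of mid/side processing in the time domain. This is a textbook simplification, not the STFT-domain decomposition used in the presented method; the function name `widen` and the weight value are assumptions.

```python
def widen(left, right, w=1.5):
    """Boost the side signal of a stereo pair by the factor w.

    mid = (L + R) / 2, side = (L - R) / 2, output = mid +/- w * side.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)
        side = 0.5 * (l - r)
        out_left.append(mid + w * side)
        out_right.append(mid - w * side)
    return out_left, out_right


# A source panned hard left: the inter-channel difference grows with w.
wide_left, wide_right = widen([1.0], [0.0], w=2.0)
```

With w = 1 the input is returned unchanged, and a signal that is identical in both channels has no side component, so it is not affected by any choice of w.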


4. Stereo Width Enhancement: Stereo Mid/Side Decomposition

Stereo side signal:

$$S_1 = G_1 X_1, \qquad S_2 = G_2 X_2$$

with spectral weights

$$G_i = \max\!\left(0,\ \frac{|X_i|^{\alpha} - \kappa\,|D|^{\alpha}}{|X_i|^{\alpha}}\right)^{\beta}$$

• D: downmix of the input signal
• α, β, κ: tuning parameters for controlling the attenuation
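The weight formula can be evaluated per time-frequency bin as in the sketch below. The downmix is assumed here to be D = (X1 + X2) / 2, and the parameter defaults are placeholders; neither is specified on the slide. Bins with zero magnitude get a weight of 0 to avoid a division by zero.

```python
def side_gain(x_i, d, alpha=2.0, beta=1.0, kappa=1.0, eps=1e-12):
    """Spectral weight G_i for one channel coefficient x_i and downmix d."""
    x_pow = abs(x_i) ** alpha
    if x_pow <= eps:
        return 0.0
    return max(0.0, (x_pow - kappa * abs(d) ** alpha) / x_pow) ** beta


def side_signals(x1, x2, **params):
    """Side signals S_1 = G_1 X_1 and S_2 = G_2 X_2 for one bin."""
    d = 0.5 * (x1 + x2)  # assumed downmix
    return side_gain(x1, d, **params) * x1, side_gain(x2, d, **params) * x2


# A bin that is identical in both channels versus one present only on the left.
center = side_signals(1 + 0j, 1 + 0j)
hard_left = side_signals(2 + 0j, 0j)
```

Under these assumptions a bin that is identical in both channels yields no side signal, whereas a bin present in only one channel is largely kept, which matches the idea of isolating the components that carry inter-channel level differences.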


5. Evaluation: Listening Test

Listening test with multiple stimuli using loudspeakers.

Conditions:
• Coded signal without any postprocessing, as known and hidden “reference”,
• Stereo Width Enhancement (SWE),
• Ambient Sound Enhancement (ASE),
• SWE + ASE.

5 test signals of length between 8 s and 30 s each, loudness normalized (ITU-R BS.1770).

Codecs: MP3 at 64 kbps, AAC at 48 kbps.

12 listeners.


5. Evaluation: Listening Test (2)

1. “How well the spatial image has been improved?”
2. “Sound quality?”



6. Conclusion

In perceptual audio coding, audible artifacts can be introduced when the bitrate is too low.

We have proposed a suite of algorithms, each designed for mitigating a common type of artifact.

Listening test:
• Both methods achieved a significant improvement,
• The combination of both methods is rated higher than the methods in isolation (“slightly better”).

These tools can be used to implement a Low Bitrate Coding Enhancement system.

Future work:
• Assessment of the performance obtained with a combination of all proposed enhancement tools (presented in Part 1 and Part 2).


Thank you for your attention!


Sonamic Enhancement Sound Demo in Regency Ballroom

Listen also to:
• Symphoria 3D
• Sonamic Loudness