Digital Audio Signal Processing
Lecture 6: Reverberation & Dereverberation

Toon van Waterschoot / Marc Moonen
Dept. E.E./ESAT-STADIUS, KU Leuven

Upload: werner

Post on 24-Feb-2016


TRANSCRIPT


Outline
- Introduction
  - Problem statement
  - Application scenarios
- Room acoustics
- Dereverberation
  - Method 1: Beamforming
  - Method 2: Speech enhancement
  - Method 3: Blind system identification & inversion
- Conclusion & open issues

Introduction: Problem statement
- Clean sound > Room acoustics > Reverberant sound
- desired: music example [clean] [reverberant]
- undesired: speech example [clean] [reverberant] [very reverberant]
- Reverberation has a desired/undesired impact on sound quality and speech intelligibility
- Research problems:
  - artificial reverberation synthesis
  - reverberation control/enhancement
  - dereverberation

Introduction: Application scenarios
Scenario-1: Sound reproduction
- goal: sound control in acoustic environment (improved listening comfort/experience for audience)
- preprocessing strategy
- single-point > multiple-point > area (increasingly difficult)
- applications: public address, home/automotive audio systems
- Note: in a sound reproduction scenario, dereverberation is often referred to as equalization

Introduction: Application scenarios
Scenario-2: Sound acquisition
- goal: sound control in electric environment (improved sound quality of microphone recordings)
- postprocessing strategy
- single-microphone > multi-microphone
- applications: speech recognition, hearing aids, recording

Note: in contrast to AEC/AFC problems, the (de)reverberation problem is not related to the concurrent use of loudspeakers and microphones in the same acoustic environment.

Room acoustics: Overview
- Acoustic waves
- Key characteristics
- Non-parametric models
  - Finite difference method
  - Finite/boundary element method
  - Image source method
  - Ray tracing method
- Parametric models
  - (Digital waveguide mesh)
  - Impulse response
  - Room transfer function
  - Pole-zero model

Room acoustics: Acoustic waves
Acoustic wave equation: a valid sound field always satisfies

$$\nabla^2 p(\mathbf{r},t) - \frac{1}{c^2}\,\frac{\partial^2 p(\mathbf{r},t)}{\partial t^2} = 0$$

- $p(\mathbf{r},t)$ = sound pressure (function of space and time)
- $c$ = speed of sound
- $\nabla^2$ is the Laplacian operator (Cartesian coordinates)
- subject to boundary conditions, e.g. rigid wall: $\partial p / \partial \mathbf{n} = 0$ (zero normal pressure gradient)
- single point source: a source term at the source position replaces the zero right-hand side

Room acoustics: Acoustic waves

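Acoustic wave equation > Helmholtz equation

$$\nabla^2 P(\mathbf{r},\omega) + k^2 P(\mathbf{r},\omega) = 0, \qquad k = \omega / c$$

- obtained from the acoustic wave equation by applying a Fourier transform over the time variable
- $k$ is the wave number
- the sound field is composed as a sum of room modes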
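As a worked step, the Fourier transform over time turns each time derivative into a multiplication by $j\omega$. The sketch below assumes the $e^{j\omega t}$ convention and the symbol $P(\mathbf{r},\omega)$ for the transformed pressure; neither is confirmed by the slides.

```latex
\begin{align*}
p(\mathbf{r},t) &= \frac{1}{2\pi}\int_{-\infty}^{\infty} P(\mathbf{r},\omega)\, e^{j\omega t}\, d\omega
  &&\Rightarrow\quad \frac{\partial^2 p}{\partial t^2} \;\longleftrightarrow\; (j\omega)^2 P = -\omega^2 P \\
\nabla^2 p - \frac{1}{c^2}\frac{\partial^2 p}{\partial t^2} &= 0
  &&\Rightarrow\quad \nabla^2 P + \frac{\omega^2}{c^2}\, P = 0 \\
k &= \frac{\omega}{c}
  &&\Rightarrow\quad \nabla^2 P + k^2 P = 0
\end{align*}
```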

Room acoustics: Acoustic waves

Example: 2-D room, 6 x 10 m, rigid walls

- mode 1: 17.1 Hz = 0.5*(343 m/s)/(10 m)
- mode 2: 28.5 Hz = 0.5*(343 m/s)/(6 m)
- mode 3 (1&2): 33.3 Hz = sqrt((17.1)^2+(28.5)^2)
- mode 4: 34.3 Hz = (343 m/s)/(10 m)
- mode 5 (2&4): 44.6 Hz = sqrt((28.5)^2+(34.3)^2)
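The example frequencies are consistent with the standard rigid-wall mode formula for a rectangular room, f = (c/2) * sqrt((nx/Lx)^2 + (ny/Ly)^2). A minimal sketch under that assumption follows; the helper names `mode_frequency` and `lowest_modes` are hypothetical, not taken from the lecture.

```python
import math

def mode_frequency(nx, ny, lx, ly, c=343.0):
    """Resonance frequency (Hz) of mode (nx, ny) in an lx-by-ly rigid-wall room."""
    return 0.5 * c * math.sqrt((nx / lx) ** 2 + (ny / ly) ** 2)

def lowest_modes(lx, ly, count, max_index=4, c=343.0):
    """Return the `count` lowest modes as (frequency, nx, ny), sorted by frequency."""
    modes = [
        (mode_frequency(nx, ny, lx, ly, c), nx, ny)
        for nx in range(max_index + 1)
        for ny in range(max_index + 1)
        if (nx, ny) != (0, 0)  # (0, 0) is the constant-pressure solution, not a resonance
    ]
    return sorted(modes)[:count]

# 10 m x 6 m room from the example
for freq, nx, ny in lowest_modes(10.0, 6.0, 5):
    print(f"mode ({nx},{ny}): {freq:.2f} Hz")
```

The computed values differ from the slide in the last digit for some modes, since the slide figures appear to be truncated rather than rounded.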

[Figure: mode shapes of modes 1 to 5]

Room acoustics: Key characteristics
- Reverberation time (Sabine's formula):
  - depends on room volume, total surface area of room, average absorption coefficient of surfaces (*)
  - time needed for 60 dB squared sound pressure decay
- Critical distance:
  - depends on source directivity, room constant
  - distance at which direct = reverberant sound energy
- Direct-to-reverberant ratio:
  - depends on source-observer distance
  - ratio of direct vs. reverberant sound energy
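The slide equations did not survive extraction, so the sketch below uses the common textbook forms as an assumption: Sabine's T60 = 0.161 V / (alpha S), room constant R = alpha S / (1 - alpha), critical distance d_c = sqrt(Q R / (16 pi)), and DRR = (d_c / d)^2. All function and parameter names are hypothetical.

```python
import math

def sabine_t60(volume, surface, alpha):
    """Reverberation time (s) from volume (m^3), surface (m^2), mean absorption."""
    return 0.161 * volume / (alpha * surface)

def room_constant(surface, alpha):
    """Room constant R (m^2), assumed textbook definition."""
    return alpha * surface / (1.0 - alpha)

def critical_distance(directivity, r_const):
    """Distance (m) at which direct and reverberant energy are equal."""
    return math.sqrt(directivity * r_const / (16.0 * math.pi))

def drr_db(distance, directivity, r_const):
    """Direct-to-reverberant ratio (dB) at a given source-observer distance."""
    return 20.0 * math.log10(critical_distance(directivity, r_const) / distance)

# Hypothetical 10 m x 6 m x 3 m room with mean absorption 0.2
volume, surface, alpha = 180.0, 216.0, 0.2
r_const = room_constant(surface, alpha)
print(sabine_t60(volume, surface, alpha), critical_distance(1.0, r_const))
```

Under these assumed formulas the DRR is 0 dB at the critical distance and drops by about 6 dB per doubling of distance.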

(*) The average absorption coefficient lies between 0 and 1: 0 for a rigid wall (mirror), 1 for an open window.

Room acoustics: Non-parametric models (1)
Finite difference time domain (FDTD) method
- spatio-temporal sampling on a regular grid
- partial derivatives (spatial & temporal) in the wave equation are approximated by finite difference operators
- result: FDTD wave equation, with boundary conditions
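To illustrate the finite-difference idea, here is a minimal 1-D sketch, not the lecture's 3-D scheme. It assumes a second-order leapfrog update, rigid walls modeled by mirrored ghost nodes, and a hypothetical function name `fdtd_1d`. The Courant number c*dt/dx is given directly instead of dx and dt.

```python
def fdtd_1d(n_points=101, src=20, obs=50, steps=40, courant=1.0):
    """Leapfrog FDTD for the 1-D wave equation with rigid (mirror) boundaries.

    Returns the pressure at grid index `obs` after each time step.
    """
    lam2 = courant ** 2  # (c * dt / dx)^2, must be <= 1 for stability

    def laplacian(p, i):
        # Rigid wall: zero pressure gradient, implemented by mirroring the neighbour.
        left = p[i - 1] if i > 0 else p[1]
        right = p[i + 1] if i < n_points - 1 else p[n_points - 2]
        return left - 2.0 * p[i] + right

    curr = [0.0] * n_points
    curr[src] = 1.0  # initial pressure impulse
    # Zero initial velocity: previous state from a second-order Taylor step.
    prev = [curr[i] + 0.5 * lam2 * laplacian(curr, i) for i in range(n_points)]

    history = []
    for _ in range(steps):
        nxt = [2.0 * curr[i] - prev[i] + lam2 * laplacian(curr, i)
               for i in range(n_points)]
        prev, curr = curr, nxt
        history.append(curr[obs])
    return history

print(fdtd_1d()[25:32])
```

With a Courant number of 1 the impulse splits into two half-amplitude pulses that each travel one grid cell per step, so the observer 30 cells away sees the pulse after 30 steps.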

Room acoustics: Non-parametric models (2)
Finite element method (FEM)
- 4-step procedure to discretize the boundary value problem:
  1. weak formulation of boundary value problem
  2. integration by parts to relax differentiability requirements
  3. subspace approximation of field and source functions
  4. enforce orthogonality of approximation error to subspace
- subspace approximation relies on FEM basis functions:
  - defined on arbitrarily constructed tetrahedral mesh
  - having small spatial support
- result: FEM wave equation

Boundary element method (BEM)
- numerical approximation of Green's function

(Skip this part)

Room acoustics: Non-parametric models (3)
Ray tracing method
- sound waves represented by rays
- assumption of specular reflections (no diffraction), i.e. mirror-like reflection in which a ray from a single incoming direction is reflected into a single outgoing direction
- rays can be traced from sound source to observer

Room acoustics: Non-parametric models (4)
Image source method
- reflections modeled as direct rays from image sources
- image sources = virtual sources located outside the room
- multiple reflections modeled as high-order image sources
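The image source idea can be sketched for first-order reflections in a 2-D rectangular room. The example assumes walls at x = 0, x = Lx, y = 0, y = Ly, a single frequency-independent reflection coefficient `beta`, and 1/distance spreading. These are simplifying assumptions and hypothetical names, not the lecture's model.

```python
import math

def first_order_images(src, room):
    """Mirror the source across each of the four walls of an lx-by-ly room."""
    xs, ys = src
    lx, ly = room
    return [(-xs, ys), (2.0 * lx - xs, ys), (xs, -ys), (xs, 2.0 * ly - ys)]

def arrivals(src, rcv, room, beta=0.8, c=343.0):
    """Return sorted (delay in s, amplitude) pairs for direct path + 1st-order images."""
    paths = [(src, 1.0)] + [(img, beta) for img in first_order_images(src, room)]
    result = []
    for (x, y), reflection in paths:
        dist = math.hypot(x - rcv[0], y - rcv[1])
        result.append((dist / c, reflection / dist))
    return sorted(result)

# Source at (2, 3), observer at (7, 3), in a 10 m x 6 m room
for delay_s, amp in arrivals((2.0, 3.0), (7.0, 3.0), (10.0, 6.0)):
    print(f"{1000.0 * delay_s:6.2f} ms  amplitude {amp:.3f}")
```

Higher-order image sources would be obtained by mirroring the images again, with the reflection coefficient applied once per wall hit.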

Room acoustics: Parametric models (1)
Impulse response
- room response to gunshot source (impulse function)
- conceptually simple model, straightforward interpretation
- poor modeling efficiency (~10^3 params), high spatial variation
- [Figure: impulse response showing direct coupling, early reflections, diffuse sound field]

Room acoustics: Parametric models (2)
Room transfer function (RTF)
- assumptions: shoe-box shaped room / rigid walls
- assumed-modes solution of the Helmholtz equation, expressed in terms of:
  - set of (non-negligible) room modes
  - resonance frequency of m-th mode
  - damping factor of m-th mode
  - eigenfunction of m-th mode
  - normalization constant of m-th mode

Room acoustics: Parametric models (3)
Pole-zero model
- RTF suggests use of pole-zero model
- RTF denominator independent of source/observer positions
- model components: gain factor, minimum-phase zeros, non-minimum-phase zeros, common acoustical poles
- special cases:
  - all-zero model = impulse response
  - all-pole model: represents room resonances only
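A small numerical sketch of the "common acoustical poles" idea: two hypothetical source/observer positions share the same pole pair but have different zeros, so both responses peak at the same resonance frequency. The factored form H = g * prod(1 - q z^-1) / prod(1 - p z^-1) and all numerical values are assumptions chosen for illustration.

```python
import cmath
import math

def pole_zero_response(gain, zeros, poles, omega):
    """Evaluate a pole-zero model on the unit circle at normalized frequency omega."""
    z_inv = cmath.exp(-1j * omega)
    num = gain
    for q in zeros:
        num *= (1.0 - q * z_inv)
    den = 1.0
    for p in poles:
        den *= (1.0 - p * z_inv)
    return num / den

# One resonance at omega = pi/4, shared by both positions (hypothetical values)
common_poles = [0.95 * cmath.exp(1j * math.pi / 4), 0.95 * cmath.exp(-1j * math.pi / 4)]
zeros_position_a = [0.5]   # position-dependent zeros
zeros_position_b = [-0.5]

for zeros in (zeros_position_a, zeros_position_b):
    peak = abs(pole_zero_response(1.0, zeros, common_poles, math.pi / 4))
    print(f"|H| at resonance: {peak:.2f}")
```

Passing an empty pole list gives the all-zero special case, and an empty zero list gives the all-pole case.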

Dereverberation: problem & overview
- PS: measurement noise not considered
- Reverberation as an additive signal degradation:
  - Method 1: beamforming approach to dereverberation: spatial separation of clean and reverberant sound
  - Method 2: speech enhancement approach to dereverberation: transform-domain separation of clean and reverberant sound
- Reverberation as a convolutive signal degradation:
  - Method 3: blind system identification and inversion approach to dereverberation: deconvolution of reverberant sound
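The two views can be related through the convolutive signal model. The sketch below assumes hypothetical symbols: $s$ for the clean source, $h$ for the room impulse response of length $L$, $y$ for the microphone signal, and a split $h = h_d + h_r$ into a direct part and a reverberant part.

```latex
\begin{align*}
y(t) &= \sum_{n=0}^{L-1} h(n)\, s(t-n)
  && \text{(convolutive view)} \\
     &= \underbrace{\sum_{n} h_d(n)\, s(t-n)}_{\text{direct sound}}
      + \underbrace{\sum_{n} h_r(n)\, s(t-n)}_{\text{reverberant sound}}
  && \text{(additive view)}
\end{align*}
```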

Method 1: Introduction
- concept: spatial separation of direct and reverberant sound (cf. multi-microphone noise reduction)
- difficulties compared to noise reduction:
  - spatial separation of direct sound and room reflections requires knowledge of reflection DOAs (~ room acoustics model)
  - reverberant sound is diffuse (comes from "all possible" directions, including source direction)
- two distinct approaches:
  - fixed delay-and-sum beamformer (DSB)
  - adaptive filter-and-sum beamformer (FSB)

Method 1: Fixed DSB
- fixed DSB structure (cf. Topic-2, Lecture-2)
- fixed DSB = matched filter (maximizing WNG) in the case of:
  - spatially white noise (not entirely true for reverberation!)
  - known sound source position
  - ideal omni-directional microphones

Method 1: Fixed DSB
- expected DRR improvement of fixed DSB, expressed in terms of:
  - source to m-th microphone distance
  - wave number
  - m-th microphone position vector
- computed using statistical room acoustics (SRA) (with the assumption that the direct & (diffuse) reverberant components are uncorrelated, etc.)
- depends on source-array distance + microphone separation
- independent of reverberation time (!) (cf. improvement of DRR)
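A minimal delay-and-sum sketch, assuming integer sample delays, three microphones, and one reflection whose arrival pattern across the array differs from that of the direct sound. The delay values and helper names are hypothetical. Aligning on the direct-path delays adds the direct sound coherently, while the misaligned reflection is spread out and attenuated.

```python
def delay(signal, d):
    """Delay a signal by d samples, keeping its length."""
    return [0.0] * d + signal[:len(signal) - d]

def delay_and_sum(mics, delays):
    """Compensate per-microphone delays of the target and average the channels."""
    n_out = len(mics[0]) - max(delays)
    return [sum(x[t + d] for x, d in zip(mics, delays)) / len(mics)
            for t in range(n_out)]

n = 64
source = [0.0] * n
source[10] = 1.0                      # unit impulse as test signal
direct_delays = [0, 3, 5]             # direct-path delay per microphone (samples)
reflection_delays = [20, 18, 27]      # one reflection, different arrival pattern

mics = [[a + 0.5 * b for a, b in zip(delay(source, d), delay(source, r))]
        for d, r in zip(direct_delays, reflection_delays)]

out = delay_and_sum(mics, direct_delays)
print(out[10], max(v for t, v in enumerate(out) if t != 10))
```

In this toy case the reflection amplitude drops from 0.5 per microphone to 0.5/3 at the output, while the direct impulse keeps amplitude 1.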

Method 1: Adaptive FSB
- adaptive FSB structure (cf. Topic-2, Lecture-2)
- optimal solution (matched filter) depends on room model
- ~ blind system identification & inversion (cf. below)

Method 2: Introduction
- concept: enhancement of reverberant speech by modeling & reducing reverberant sound in a transform domain
- applicable to single- & multi-microphone sound acquisition
- choice of transform domain results in three approaches:
  - cepstrum-based
  - LPC-based
  - spectrum-based

Method 2: Cepstrum-based
- concept: convolution in time domain ~ addition in cepstral (*) domain
- reverberation can be subtracted in cepstral domain
- cepstral subtraction:
  - speech = low-quefrency
  - room acoustics = high-quefrency
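The convolution-to-addition property can be checked numerically. This sketch uses the real cepstrum (magnitude only) as a simplification, whereas the slides call for the invertible complex cepstrum. The short "speech" and "room" sequences are toy assumptions, and the FFT length is chosen long enough to hold the full linear convolution.

```python
import numpy as np

def real_cepstrum(x, n_fft=64):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x, n_fft)
    return np.real(np.fft.ifft(np.log(np.abs(spectrum))))

speech = np.array([1.0, 0.5, 0.25])        # toy "speech" sequence
room = np.array([1.0, 0.0, 0.0, 0.6])      # direct path + one echo at lag 3
reverberant = np.convolve(speech, room)    # convolution in the time domain

c_speech = real_cepstrum(speech)
c_room = real_cepstrum(room)
c_reverberant = real_cepstrum(reverberant)

# Addition in the cepstral domain
print(np.max(np.abs(c_reverberant - (c_speech + c_room))))
# The echo shows up as a peak at quefrency 3
print(c_room[:8])
```

Subtracting an estimate of the room cepstrum from the reverberant cepstrum is then the cepstral counterpart of deconvolution.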

Cepstrum-based block diagram: cepstral analysis > cepstral subtraction > cepstral synthesis
(*) the complex cepstrum is used (= invertible)

Method 2: LPC-based
- linear predictive coding of reverberant speech:
  - reverberation hardly affects speech LPC coefficients
  - reverberation largely affects LPC residual
- dereverberation reduces to LPC residual enhancement
  - based on knowledge of speech production process + spatial averaging (using multiple microphones)
- block diagram: LPC analysis > LPC residual enhancement > LPC synthesis, with LPC coefficients

Method 2: Spectrum-based
- concept: late reverberation ~ (broadband) additive noise
- spectral subtraction: estimate noise energy & compute subtractive gain function
- spectral subtraction assumes noise stationarity (cf. Lecture-3): not valid for reverberation!
- estimation of "noise" energy based on a statistical model for late reverberation
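The slides do not name the statistical model, so the sketch below assumes an exponential-decay model in which late reverberant energy is a delayed, attenuated copy of the observed energy with decay rate 3 ln(10) / T60. It works on per-frame powers of a single frequency bin and applies a floored power-domain subtraction gain. The function name and parameters are hypothetical.

```python
import math

def late_reverb_gain(power, t60, frame_shift, late_frames, g_min=0.1):
    """Subtractive gain per frame for one frequency bin.

    power       : observed power per frame
    t60         : reverberation time (s)
    frame_shift : time between frames (s)
    late_frames : number of frames after which reverberation counts as 'late'
    """
    decay_rate = 3.0 * math.log(10.0) / t60
    decay = math.exp(-2.0 * decay_rate * late_frames * frame_shift)
    gains = []
    for n, p_obs in enumerate(power):
        # Late reverberation estimate: attenuated power from `late_frames` ago.
        p_late = decay * power[n - late_frames] if n >= late_frames else 0.0
        gain = 1.0 - p_late / p_obs if p_obs > 0.0 else g_min
        gains.append(max(gain, g_min))
    return gains

# A pure reverberant tail is suppressed down to the gain floor
t60, shift = 0.5, 0.016
tail = [math.exp(-2.0 * (3.0 * math.log(10.0) / t60) * n * shift) for n in range(20)]
print(late_reverb_gain(tail, t60, shift, late_frames=3))
```

Because the estimate follows the observed signal frame by frame, it does not rely on the stationarity assumption of classical noise spectral subtraction.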

Spectrum-based block diagram: TF analysis > spectral subtraction > TF synthesis, with late reverberation energy estimator
Note: straightforwardly extendable to combined dereverberation & noise suppression

Method 3: Introduction
- concept: two-step procedure
  - step 1: identify room model (source > multiple microphones)
  - step 2: invert room model
- highly non-trivial; difficulties:
  - source signal unknown > blind identification
  - (non-)invertibility of room model
  - model inversion sensitive to identification & numerical errors
- two approaches based on different room models:
  - all-zero model
  - all-pole model

Method 3: Blind system identification
- starting point: cross-relation error / nullifying filters
- batch identification using EVD/SVD
- PS: vector of stacked & filtered RIRs lies in null space of microphone array covariance matrix
- filters denote erroneous zeros (which can be removed)
- PS: zeros C(z) common to all RIRs cannot be identified
- high & unknown RIR order / poor conditioning
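The cross-relation idea can be sketched for two microphones: since x1 = h1 * s and x2 = h2 * s, it follows that x2 * h1 - x1 * h2 = 0, so the stacked RIR vector lies in the null space of a data matrix built from the microphone signals. The example assumes noise-free signals, a known RIR length, and RIRs without common zeros. The function names and toy RIRs are hypothetical.

```python
import numpy as np

def conv_matrix(x, length):
    """Rows [x[n], x[n-1], ..., x[n-length+1]] so that (matrix @ h)[i] = (x * h)[n]."""
    return np.array([x[n - length + 1:n + 1][::-1]
                     for n in range(length - 1, len(x))])

def cross_relation_identify(x1, x2, length):
    """Estimate two RIRs (up to a common scale factor) from microphone signals only."""
    data = np.hstack([conv_matrix(x2, length), -conv_matrix(x1, length)])
    _, _, vt = np.linalg.svd(data, full_matrices=False)
    null_vector = vt[-1]  # right singular vector of the smallest singular value
    return null_vector[:length], null_vector[length:]

rng = np.random.default_rng(0)
source = rng.standard_normal(200)          # unknown to the algorithm
h1 = np.array([1.0, 0.5, -0.3])
h2 = np.array([0.8, -0.4, 0.2])
x1, x2 = np.convolve(source, h1), np.convolve(source, h2)

est1, est2 = cross_relation_identify(x1, x2, length=3)
scale = h1[0] / est1[0]                    # resolve the inherent scale ambiguity
print(est1 * scale, est2 * scale)
```

If both RIRs shared a common factor C(z), the null space would no longer be one-dimensional, which matches the remark that common zeros cannot be identified.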

Method 3: Inversion
Multiple-input/output inverse theorem (MINT):
- exact solution exists if the RIRs share no common zeros and the inverse filters are sufficiently long
- poor conditioning for near-common zeros
- inversion sensitive to system identification errors
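A minimal MINT sketch for two channels: the inverse filters g1, g2 are chosen so that h1 * g1 + h2 * g2 equals a unit impulse, which is a linear system in the stacked filter coefficients. The example assumes known RIRs without common zeros and inverse filters of length Lh - 1, which makes the system square for two channels. Names and toy RIRs are hypothetical.

```python
import numpy as np

def convolution_matrix(h, n_cols):
    """Matrix C such that C @ g equals np.convolve(h, g) for len(g) == n_cols."""
    matrix = np.zeros((len(h) + n_cols - 1, n_cols))
    for j in range(n_cols):
        matrix[j:j + len(h), j] = h
    return matrix

def mint_inverse(rirs, inv_length, delay=0):
    """Solve sum_m h_m * g_m = delayed unit impulse for the inverse filters g_m."""
    system = np.hstack([convolution_matrix(h, inv_length) for h in rirs])
    target = np.zeros(system.shape[0])
    target[delay] = 1.0
    stacked, *_ = np.linalg.lstsq(system, target, rcond=None)
    return np.split(stacked, len(rirs))

h1 = np.array([1.0, 0.5, -0.3])
h2 = np.array([0.8, -0.4, 0.2])
g1, g2 = mint_inverse([h1, h2], inv_length=2)

equalized = np.convolve(h1, g1) + np.convolve(h2, g2)
print(np.round(equalized, 12))
```

When the RIRs have nearly common zeros the system matrix becomes close to singular, which is the poor conditioning mentioned above.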


Method 3: Inversion
Matched filtering:
- can be interpreted as multiple-beam beamformers, having beams in the direction of the direct sound and 1st-order reflections
- (note that the overall response has a peak at time = 0, corresponding to a constructive addition of all multi-path components)
- matched filter = non-causal filter > pre-echo effect (can be alleviated by filter truncation)
- [Figure: overall response with pre-echo]
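Filtering each channel with its time-reversed RIR and summing gives an overall response equal to the sum of the RIR autocorrelations. The sketch below computes that response for two toy RIRs (hypothetical values) to show the peak at lag 0 and the non-zero values at negative lags that cause pre-echo.

```python
def matched_filter_response(rirs):
    """Sum over channels of each RIR convolved with its own time reversal.

    Returns values for lags -(L-1) ... (L-1); lag 0 sits at index L-1.
    """
    length = len(rirs[0])
    response = []
    for lag in range(-(length - 1), length):
        total = 0.0
        for h in rirs:
            for n in range(length):
                if 0 <= n + lag < length:
                    total += h[n] * h[n + lag]
        response.append(total)
    return response

h1 = [1.0, 0.0, 0.5, 0.0, 0.25]
h2 = [0.9, 0.4, 0.0, 0.3, 0.0]
response = matched_filter_response([h1, h2])
zero_lag = len(h1) - 1

print("peak at lag 0:", response[zero_lag])
print("pre-echo (negative lags):", response[:zero_lag])
```

The response is symmetric around lag 0, so every later echo has a mirrored counterpart before the main peak; truncating the matched filters shortens that pre-echo.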

Method 3: All-pole model
- starting point: all-pole model with common acoustical poles
- a priori identification of all-pole model:
  - multi-channel LPC of estimated RIRs
  - spatial averaging of single-channel LPC coefficients
- model inversion > fixed FIR filter (!)

Conclusion
- reverberation is a complex physical phenomenon that can be modeled in a variety of ways
- research problems related to reverberation:
  - artificial reverberation synthesis
  - reverberation control/enhancement
  - dereverberation
- dereverberation is still a challenging problem!
  - Method 1: beamforming
  - Method 2: speech enhancement
  - Method 3: blind system identification & inversion
