
Marc Moonen Dept. E.E./ESAT-STADIUS, KU Leuven

[email protected] homes.esat.kuleuven.be/~moonen/

Digital Audio Signal Processing

DASP

Lecture-3: Noise Reduction-II Fixed Beamforming

Digital Audio Signal Processing, Version 2017-2018, Lecture-3: Fixed Beamforming

Overview

•  Introduction & beamforming basics
•  Data model & definitions
•  Filter-and-sum beamformer design
•  Matched filtering
–  White noise gain maximization
–  Ex: Delay-and-sum beamforming
•  Superdirective beamforming
–  Directivity maximization
•  Directional microphones (delay-and-subtract)


Introduction

•  Directivity pattern of a microphone
–  A microphone (*) is characterized by a `directivity pattern’, which specifies the gain & phase shift that the microphone gives to a signal coming from a certain direction (i.e. `angle-of-arrival’)
–  In general the directivity pattern is a function of frequency (ω)
–  In a 3D scenario the `angle-of-arrival’ is an azimuth + elevation angle
–  Will consider only 2D scenarios for simplicity, with one angle-of-arrival (θ), hence the directivity pattern is H(ω,θ)
–  The directivity pattern is fixed and defined by the physical microphone design

(*) We do digital signal processing, so this includes front-end filtering/A-to-D/..

[Figure: |H(ω,θ)| for 1 frequency]


Introduction

•  Virtual directivity pattern
–  By weighting or filtering (= frequency-dependent weighting) and then summing signals from different microphones, a (software-controlled) virtual directivity pattern (= weighted sum of the individual patterns) can be produced:

$$H_{\mathrm{virtual}}(\omega,\theta) = \sum_{m=1}^{M} F_m(\omega)\, H_m(\omega,\theta)$$

–  This assumes all microphones receive the same signals (so are all in the same position). However…

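[Figure: virtual directivity pattern, magnitude (0 to 1) versus angle (0 to 180 deg) and frequency (0 to 3000 Hz)]

[Figure: filter-and-sum structure; microphone signals y1[k], y2[k], …, yM[k] are filtered by F1(ω), F2(ω), …, FM(ω) and summed into z[k]]

$$F_m(\omega) = \sum_{n=0}^{N-1} f_{m,n}\, e^{-j\omega n}$$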


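Introduction

[Figure: filter-and-sum structure on a linear array; for a plane wave arriving at angle θ, the path-length difference between microphone 1 and microphone M is dM cosθ]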

•  However, in a microphone array different microphones are in different positions/locations, hence also receive different signals

•  Example: uniform linear array, i.e. microphones placed on a line & uniform inter-microphone distances (d) & ideal microphone characteristics (p.10). For a far-field source signal (i.e. plane wave), each microphone receives the same signal, up to an angle-dependent delay:

$$y_m[k+\tau_m] = y_1[k], \qquad \tau_m(\theta) = \frac{d_m \cos\theta}{c}\, f_s, \qquad d_m = (m-1)\,d$$

with fs = sampling rate and c = propagation speed, so that

$$H_{\mathrm{virtual}}(\omega,\theta) = \sum_{m=1}^{M} F_m(\omega)\, e^{-j\omega\tau_m(\theta)}$$
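A minimal sketch of the uniform-linear-array delays and the resulting far-field steering vector for ideal omni-directional microphones. The function names, the value c = 340 m/s and the example numbers are assumptions for illustration only; ω is in rad/sample, τ in samples, and the phase convention e^{-jωτm(θ)} follows the Hvirtual expression.

```python
import numpy as np

def ula_delays(M, d, theta, fs, c=340.0):
    """Delays tau_m(theta) in samples for a uniform linear array, m = 1..M."""
    d_m = np.arange(M) * d                 # d_m = (m-1) d
    return d_m * np.cos(theta) / c * fs

def ula_steering_vector(omega, M, d, theta, fs, c=340.0):
    """Phase terms exp(-j omega tau_m(theta)) for ideal omni-directional microphones."""
    return np.exp(-1j * omega * ula_delays(M, d, theta, fs, c))

# Hypothetical example: M=5, d=3 cm, fs=16 kHz, source at 60 degrees, f=1 kHz
fs = 16000.0
omega = 2 * np.pi * 1000.0 / fs            # rad/sample
tau = ula_delays(5, 0.03, np.deg2rad(60.0), fs)
dvec = ula_steering_vector(omega, 5, 0.03, np.deg2rad(60.0), fs)
print(tau)
print(np.angle(dvec))
```

Microphone 1 is the reference here, so its delay is zero and its phase term equals 1.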



Introduction

•  Beamforming = `spatial filtering’ based on microphone characteristics (directivity patterns) AND microphone array configuration (`spatial sampling’)

•  Classification:
–  Fixed beamforming: data-independent, fixed filters Fm, e.g. delay-and-sum, filter-and-sum (= this lecture)
–  Adaptive beamforming: data-dependent filters Fm, e.g. LCMV-beamformer, generalized sidelobe canceler (= next lecture)


Introduction

•  Background/history: ideas borrowed from antenna array design and processing for radar & (later) wireless communications

•  Microphone array processing is considerably more difficult than antenna array processing:
–  narrowband radio signals versus broadband audio signals
–  far-field (plane waves) versus near-field (spherical waves)
–  pure-delay environment versus multi-path environment

•  Applications: voice controlled systems (e.g. Xbox Kinect), speech communication systems, hearing aids,…




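Data model & definitions 1/5

Data model: source signal in far-field (see p.14 for near-field)

•  Microphone signals are filtered versions of the source signal S(ω) at angle θ:

$$Y_m(\omega,\theta) = \underbrace{H_m(\omega,\theta)}_{\text{dir. pattern}} \cdot \underbrace{e^{-j\omega\tau_m(\theta)}}_{\text{pos.-dep. phase shift}} \cdot\, S(\omega)$$

•  Stack all microphone signals (m=1..M) in a vector:

$$\mathbf{Y}(\omega,\theta) = \mathbf{d}(\omega,\theta)\, S(\omega), \qquad \mathbf{d}(\omega,\theta) = \begin{bmatrix} H_1(\omega,\theta)\, e^{-j\omega\tau_1(\theta)} & \dots & H_M(\omega,\theta)\, e^{-j\omega\tau_M(\theta)} \end{bmatrix}^T$$

d is the `steering vector’

•  Output signal after `filter-and-sum’ is

$$Z(\omega,\theta) = \sum_{m=1}^{M} F_m^*(\omega)\, Y_m(\omega,\theta) = \mathbf{F}^H(\omega)\, \mathbf{Y}(\omega,\theta) = \{\mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\theta)\}\, S(\omega)$$

(H instead of T for convenience (**))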


Data model & definitions 2/5

Data model: source signal in far-field

$$\mathbf{Y}(\omega,\theta) = \mathbf{d}(\omega,\theta)\, S(\omega)$$

•  If all microphones have the same directivity pattern Ho(ω,θ), the steering vector can be factored as

$$\mathbf{d}(\omega,\theta) = \underbrace{H_o(\omega,\theta)}_{\text{dir. pattern}} \cdot \underbrace{\begin{bmatrix} 1 & e^{-j\omega\tau_2(\theta)} & \dots & e^{-j\omega\tau_M(\theta)} \end{bmatrix}^T}_{\text{spatial positions}}$$

(microphone-1 is used as a reference (= arbitrary))

•  Will often consider arrays with ideal omni-directional microphones: Ho(ω,θ)=1. Example: uniform linear array, see p.5

•  Will use microphone-1 as reference (e.g. defining input SNR): d1(ω,θ) = 1



Definitions: (1)

•  In a linear array (p.5): θ = 90° = broadside direction, θ = 0° = end-fire direction

•  Array directivity pattern (compare to p.3) = `transfer function’ for source at angle θ ( -π<θ< π ):

$$H(\omega,\theta) = \frac{Z(\omega,\theta)}{S(\omega)} = \mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\theta)$$

•  Steering direction = angle θ with maximum amplification (for 1 freq.):

$$\theta_{\max}(\omega) = \arg\max_{\theta} |H(\omega,\theta)|$$

•  Beamwidth (BW) = region around θmax with amplification > (max. amplification − 3 dB) (for 1 freq.)
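The three definitions above can be evaluated numerically on a grid of angles. The following is only a sketch under stated assumptions: ideal omni-directional microphones on a uniform linear array, c = 340 m/s, and hypothetical helper names (`directivity_pattern`, `steering_and_beamwidth`); the beamwidth is taken as the width of the contiguous region around θmax where |H| stays within 3 dB of its maximum.

```python
import numpy as np

def directivity_pattern(F, omega, thetas, tau_fn):
    """H(omega, theta) = F^H d(omega, theta), evaluated on a grid of angles."""
    D = np.exp(-1j * omega * np.array([tau_fn(th) for th in thetas]))  # (n_theta, M)
    return D @ np.conj(F)

def steering_and_beamwidth(H, thetas):
    """Return (theta_max, width of the contiguous -3 dB region around theta_max)."""
    mag = np.abs(H)
    i_max = int(np.argmax(mag))
    above = mag > mag[i_max] / np.sqrt(2.0)      # within 3 dB of the maximum
    lo = hi = i_max
    while lo > 0 and above[lo - 1]:
        lo -= 1
    while hi < len(thetas) - 1 and above[hi + 1]:
        hi += 1
    return thetas[i_max], thetas[hi] - thetas[lo]

# Hypothetical example: M=5, d=3 cm, fs=16 kHz, f=3 kHz, equal weights 1/M
M, d, fs, c = 5, 0.03, 16000.0, 340.0
tau_fn = lambda th: np.arange(M) * d * np.cos(th) / c * fs
omega = 2 * np.pi * 3000.0 / fs
thetas = np.linspace(0.0, np.pi, 1801)
F = np.ones(M) / M
H = directivity_pattern(F, omega, thetas, tau_fn)
theta_max, bw = steering_and_beamwidth(H, thetas)
print(np.degrees(theta_max), np.degrees(bw))
```

With equal weights the array is steered to broadside, so θmax comes out at 90°.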



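Data model: source signal + noise

•  Microphone signals are corrupted by additive noise:

$$\mathbf{Y}(\omega,\theta) = \mathbf{d}(\omega,\theta)\, S(\omega) + \mathbf{N}(\omega), \qquad \mathbf{N}(\omega) = \begin{bmatrix} N_1(\omega) & N_2(\omega) & \dots & N_M(\omega) \end{bmatrix}^T$$

•  Define noise correlation matrix as

$$\mathbf{\Phi}_{\mathrm{noise}}(\omega) = E\{\mathbf{N}(\omega)\, \mathbf{N}^H(\omega)\}$$

•  Will assume noise field is homogeneous, i.e. all diagonal elements of noise correlation matrix are equal:

$$\Phi_{ii}(\omega) = \phi_{\mathrm{noise}}(\omega), \quad \forall i$$

•  Then noise coherence matrix is

$$\mathbf{\Gamma}_{\mathrm{noise}}(\omega) = \frac{1}{\phi_{\mathrm{noise}}(\omega)}\, \mathbf{\Phi}_{\mathrm{noise}}(\omega) = \begin{bmatrix} 1 & \dots & \dots \\ \vdots & \ddots & \vdots \\ \dots & \dots & 1 \end{bmatrix}$$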



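Definitions: (2)

•  Array Gain = improvement in SNR for source at angle θ ( -π<θ< π ):

$$G(\omega,\theta) = \frac{\mathrm{SNR}_{\mathrm{output}}}{\mathrm{SNR}_{\mathrm{input}}} = \frac{|\mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\theta)|^2}{\mathbf{F}^H(\omega)\, \mathbf{\Gamma}_{\mathrm{noise}}(\omega)\, \mathbf{F}(\omega)}$$

numerator = |signal transfer function|^2 (with micr-1 used as reference: d1 = 1), denominator = |noise transfer function|^2

•  White Noise Gain = array gain for spatially uncorrelated noise (e.g. sensor noise), i.e. with $\mathbf{\Gamma}^{\mathrm{white}}_{\mathrm{noise}} = \mathbf{I}$:

$$\mathrm{WNG}(\omega,\theta) = \frac{|\mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\theta)|^2}{\mathbf{F}^H(\omega)\, \mathbf{F}(\omega)}$$

ps: often used as a measure for robustness

•  Directivity = array gain for diffuse noise (= coming from all directions):

$$\mathrm{DI}(\omega,\theta) = \frac{|\mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\theta)|^2}{\mathbf{F}^H(\omega)\, \mathbf{\Gamma}^{\mathrm{diffuse}}_{\mathrm{noise}}(\omega)\, \mathbf{F}(\omega)}, \qquad \Gamma^{\mathrm{diffuse}}_{ij}(\omega) = \mathrm{sinc}\!\left(\frac{\omega f_s\,(d_j - d_i)}{c}\right) \ \text{(skip this formula)}$$

DI and WNG evaluated at θmax are often used as performance criteria

PS: ω is rad/sample ( -π ≤ ω ≤ π ), ω fs is rad/sec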
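A small sketch of how array gain, WNG and DI could be computed for a given weight vector. The function names, c = 340 m/s and the example geometry are assumptions; the diffuse coherence uses sinc(x) = sin(x)/x, whereas `np.sinc` is the normalized sinc, hence the division by π.

```python
import numpy as np

def diffuse_coherence(omega, positions, fs, c=340.0):
    """Gamma_ij = sinc(omega * fs * (d_j - d_i) / c), with sinc(x) = sin(x)/x."""
    x = omega * fs * (positions[None, :] - positions[:, None]) / c
    return np.sinc(x / np.pi)            # np.sinc(t) computes sin(pi t)/(pi t)

def array_gain(F, d, Gamma):
    """G = |F^H d|^2 / (F^H Gamma F); Gamma = I gives WNG, diffuse Gamma gives DI."""
    return np.abs(np.vdot(F, d)) ** 2 / np.real(np.vdot(F, Gamma @ F))

# Hypothetical example: weights F = d/M on a 5-microphone uniform linear array
M, spacing, fs, c = 5, 0.03, 16000.0, 340.0
positions = np.arange(M) * spacing
omega = 2 * np.pi * 3000.0 / fs
psi = 0.0                                # end-fire direction
d_vec = np.exp(-1j * omega * positions * np.cos(psi) / c * fs)
F = d_vec / M
G_diffuse = diffuse_coherence(omega, positions, fs, c)
wng = array_gain(F, d_vec, np.eye(M))
di = array_gain(F, d_vec, G_diffuse)
print(wng, di)
```

Passing the identity matrix gives the WNG and passing the diffuse coherence matrix gives the DI, so one function covers all three definitions.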


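PS: Near-field beamforming

•  Far-field assumptions not valid for sources close to microphone array
–  spherical waves instead of plane waves
–  include attenuation of signals
–  2 coordinates θ,r (= position q) instead of 1 coordinate θ (in 2D case)

•  Different steering vector (e.g. with Hm(ω,θ)=1, m=1..M), replacing d(ω,θ):

$$\mathbf{d}(\omega,\mathbf{q}) = \begin{bmatrix} a_1\, e^{-j\omega\tau_1(\mathbf{q})} & a_2\, e^{-j\omega\tau_2(\mathbf{q})} & \dots & a_M\, e^{-j\omega\tau_M(\mathbf{q})} \end{bmatrix}^T$$

$$\tau_m(\mathbf{q}) = \frac{\|\mathbf{q}-\mathbf{p}_m\| - \|\mathbf{q}-\mathbf{p}_{\mathrm{ref}}\|}{c}\, f_s, \qquad a_m = \left(\frac{\|\mathbf{q}-\mathbf{p}_{\mathrm{ref}}\|}{\|\mathbf{q}-\mathbf{p}_m\|}\right)^{e}$$

with q = position of source, pref = position of reference microphone, pm = position of m-th microphone, and e=1 (3D)…2 (2D)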
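A sketch of a near-field steering vector under stated assumptions: Hm = 1, delays counted relative to the reference microphone (positive when microphone m is farther from the source, matching the e^{-jωτ} convention), c = 340 m/s, and a hypothetical function name and geometry. For a very distant source the result should approach the far-field (plane-wave) steering vector.

```python
import numpy as np

def near_field_steering(omega, q, mic_pos, ref_pos, fs, c=340.0, e=1.0):
    """d(omega, q) with entries a_m * exp(-j omega tau_m(q)), for Hm = 1."""
    r_ref = np.linalg.norm(q - ref_pos)
    r_m = np.linalg.norm(q - mic_pos, axis=1)
    a = (r_ref / r_m) ** e               # attenuation relative to the reference
    tau = (r_m - r_ref) / c * fs         # delay relative to the reference, in samples
    return a * np.exp(-1j * omega * tau)

# Hypothetical geometry: 4 microphones on the x-axis, microphone 1 is the reference
M, d, fs, c = 4, 0.04, 16000.0, 340.0
mic_pos = np.stack([-np.arange(M) * d, np.zeros(M)], axis=1)
omega = 2 * np.pi * 1000.0 / fs
theta = np.deg2rad(60.0)
direction = np.array([np.cos(theta), np.sin(theta)])

d_near = near_field_steering(omega, 0.3 * direction, mic_pos, mic_pos[0], fs, c)
d_far = near_field_steering(omega, 1e4 * direction, mic_pos, mic_pos[0], fs, c)
d_plane = np.exp(-1j * omega * np.arange(M) * d * np.cos(theta) / c * fs)
print(np.abs(d_near))
print(np.max(np.abs(d_far - d_plane)))
```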



PS: Multipath propagation

•  In a multipath scenario, acoustic waves are reflected against walls, objects, etc..

•  Every reflection may be treated as a separate source (near-field or far-field)

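•  A more realistic data model is then

$$\mathbf{Y}(\omega,\mathbf{q}) = \mathbf{d}(\omega,\mathbf{q})\, S(\omega) + \mathbf{N}(\omega), \qquad \mathbf{d}(\omega,\mathbf{q}) = \begin{bmatrix} H_1(\omega,\mathbf{q}) & H_2(\omega,\mathbf{q}) & \dots & H_M(\omega,\mathbf{q}) \end{bmatrix}^T$$

with q = position of source and Hm(ω,q) = complete transfer function from source position to m-th microphone (incl. microphone characteristic, position, and multipath propagation)

`Beamforming’ aspect vanishes here, see also Lecture-5 (`multi-channel noise reduction’)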




Filter-and-sum beamformer design

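•  Basic: procedure based on page 11. Array directivity pattern

$$H(\omega,\theta) = \frac{Z(\omega,\theta)}{S(\omega)} = \mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\theta)$$

to be matched to given (desired) pattern $H_d(\omega,\theta)$ over frequency/angle range of interest, with

$$\mathbf{F}(\omega) = \begin{bmatrix} F_1(\omega) & \dots & F_M(\omega) \end{bmatrix}^T, \qquad F_m(\omega) = \sum_{n=0}^{N-1} f_{m,n}\, e^{-jn\omega}$$

•  Non-linear optimization for FIR filter design (= ignore phase response):

$$\min_{f_{m,n},\ m=1..M,\ n=0..N-1} \int_{\theta}\int_{\omega} \left( |H(\omega,\theta)| - |H_d(\omega,\theta)| \right)^2 d\omega\, d\theta$$

•  Quadratic optimization for FIR filter design (= co-design phase response):

$$\min_{f_{m,n},\ m=1..M,\ n=0..N-1} \int_{\theta}\int_{\omega} \left| H(\omega,\theta) - H_d(\omega,\theta) \right|^2 d\omega\, d\theta$$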


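Filter-and-sum beamformer design

•  Quadratic optimization for FIR filter design (continued)

$$\min_{f_{m,n},\ m=1..M,\ n=0..N-1} \int_{\theta}\int_{\omega} \left| H(\omega,\theta) - H_d(\omega,\theta) \right|^2 d\omega\, d\theta$$

With (for real-valued coefficients $f_{m,n}$)

$$\mathbf{f}_m = \begin{bmatrix} f_{m,0} & \dots & f_{m,N-1} \end{bmatrix}^T, \qquad \mathbf{f} = \begin{bmatrix} \mathbf{f}_1^T & \dots & \mathbf{f}_M^T \end{bmatrix}^T$$

$$H(\omega,\theta) = \mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\theta) = \begin{bmatrix} F_1^*(\omega) & \dots & F_M^*(\omega) \end{bmatrix} \mathbf{d}(\omega,\theta) = \mathbf{f}^T \underbrace{\left( \mathbf{I}_{M\times M} \otimes \begin{bmatrix} e^{j\cdot 0\cdot\omega} \\ \vdots \\ e^{j(N-1)\omega} \end{bmatrix} \right)}_{NM \times M} \mathbf{d}(\omega,\theta) = \mathbf{f}^T\, \bar{\mathbf{d}}(\omega,\theta)$$

(⊗ = Kronecker product)

optimal solution is

$$\mathbf{f}_{\mathrm{optimal}} = \mathbf{Q}^{-1}\, \mathbf{p}, \qquad \mathbf{Q} = \int_{\theta}\int_{\omega} \bar{\mathbf{d}}(\omega,\theta)\, \bar{\mathbf{d}}^H(\omega,\theta)\, d\omega\, d\theta, \qquad \mathbf{p} = \int_{\theta}\int_{\omega} \bar{\mathbf{d}}(\omega,\theta)\, H_d^*(\omega,\theta)\, d\omega\, d\theta$$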
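The quadratic design can be sketched numerically by replacing the integrals with sums over a frequency/angle grid. This is only an illustration under stated assumptions: ideal omni-directional microphones on a uniform linear array, real-valued coefficients (so the real parts of Q and p are used), c = 340 m/s, hypothetical function names, and a desired pattern that is generated from a known filter set so that the fit can be checked.

```python
import numpy as np

def dbar(w, th, tau_fn, N):
    """dbar(omega, theta) = (I_M kron e(omega)) d(omega, theta) = kron(d, e)."""
    d = np.exp(-1j * w * tau_fn(th))
    e = np.exp(1j * w * np.arange(N))
    return np.kron(d, e)

def ls_filter_and_sum(Hd, omegas, thetas, tau_fn, M, N):
    """Grid approximation of f = Q^{-1} p for real-valued FIR coefficients."""
    Q = np.zeros((M * N, M * N))
    p = np.zeros(M * N)
    for iw, w in enumerate(omegas):
        for it, th in enumerate(thetas):
            v = dbar(w, th, tau_fn, N)
            Q += np.real(np.outer(v, np.conj(v)))
            p += np.real(v * np.conj(Hd[iw, it]))
    f = np.linalg.lstsq(Q, p, rcond=None)[0]   # tolerant of an ill-conditioned Q
    return f.reshape(M, N)

def response(f, omegas, thetas, tau_fn):
    """H(omega, theta) = f^T dbar(omega, theta) on the grid."""
    M, N = f.shape
    return np.array([[f.reshape(-1) @ dbar(w, th, tau_fn, N) for th in thetas]
                     for w in omegas])

# Hypothetical setup: M=3 microphones, N=4 taps, d=5 cm, fs=8 kHz
M, N, fs, d, c = 3, 4, 8000.0, 0.05, 340.0
tau_fn = lambda th: np.arange(M) * d * np.cos(th) / c * fs
omegas = np.linspace(0.1, np.pi, 24)
thetas = np.linspace(0.0, np.pi, 19)
f_true = np.random.default_rng(0).standard_normal((M, N))
Hd = response(f_true, omegas, thetas, tau_fn)      # realizable desired pattern
f_ls = ls_filter_and_sum(Hd, omegas, thetas, tau_fn, M, N)
fit_error = np.max(np.abs(response(f_ls, omegas, thetas, tau_fn) - Hd))
print(fit_error)
```

Because the target is realizable here, the fit error should be negligible; for a general desired pattern the same code returns the least-squares compromise over the grid.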



Filter-and-sum beamformer design

•  Design example: M=8, logarithmic array, N=50, fs=8 kHz

[Figure: designed directivity pattern, magnitude (0 to 1) versus angle (0 to 180 deg) and frequency (0 to 3000 Hz)]




Matched filtering: WNG maximization

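•  Basic: procedure based on page 13

•  Maximize White Noise Gain (WNG) for given steering angle ψ:

$$\mathbf{F}_{\mathrm{MF}}(\omega) = \arg\max_{\mathbf{F}(\omega)} \{\mathrm{WNG}(\omega,\psi)\} = \arg\max_{\mathbf{F}(\omega)} \left\{ \frac{|\mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\psi)|^2}{\mathbf{F}^H(\omega)\, \mathbf{F}(\omega)} \right\}$$

•  A priori knowledge/assumptions:
–  angle-of-arrival ψ of desired signal + corresponding steering vector
–  noise scenario = white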


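Matched filtering: WNG maximization

•  Maximization in

$$\mathbf{F}_{\mathrm{MF}}(\omega) = \arg\max_{\mathbf{F}(\omega)} \left\{ \frac{|\mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\psi)|^2}{\mathbf{F}^H(\omega)\, \mathbf{F}(\omega)} \right\}$$

is equivalent to minimization of noise output power (under white input noise), subject to unit response for steering angle (**):

$$\min_{\mathbf{F}(\omega)} \mathbf{F}^H(\omega)\, \mathbf{F}(\omega) \quad \text{s.t.} \quad \mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\psi) = 1$$

•  Optimal solution (`matched filter’) is

$$\mathbf{F}_{\mathrm{MF}}(\omega) = \frac{1}{\|\mathbf{d}(\omega,\psi)\|^2}\, \mathbf{d}(\omega,\psi)$$

•  [FIR approximation]

$$\min_{f_{m,n},\ m=1..M,\ n=0..N-1} \int_{\omega} \left\| \mathbf{F}(\omega) - \mathbf{F}_{\mathrm{MF}}(\omega) \right\|^2 d\omega$$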
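The matched-filter solution can be checked with the Cauchy-Schwarz inequality. The short derivation below is a sketch added for clarity; β is an auxiliary scalar introduced here and does not appear in the slides.

```latex
\begin{aligned}
|\mathbf{F}^H\mathbf{d}|^2 &\le (\mathbf{F}^H\mathbf{F})\,(\mathbf{d}^H\mathbf{d})
  \quad\Rightarrow\quad
  \mathrm{WNG}(\omega,\psi) = \frac{|\mathbf{F}^H\mathbf{d}|^2}{\mathbf{F}^H\mathbf{F}}
  \le \|\mathbf{d}(\omega,\psi)\|^2, \\
\text{with equality iff}\quad \mathbf{F} &= \beta\,\mathbf{d}(\omega,\psi), \\
\mathbf{F}^H\mathbf{d} = 1 \;&\Rightarrow\; \beta^{*}\,\|\mathbf{d}(\omega,\psi)\|^2 = 1
  \;\Rightarrow\;
  \mathbf{F}_{\mathrm{MF}}(\omega) = \frac{\mathbf{d}(\omega,\psi)}{\|\mathbf{d}(\omega,\psi)\|^2}.
\end{aligned}
```

For ideal omni-directional microphones every entry of d has unit magnitude, so ‖d‖² = M and the maximum WNG equals M.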



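Matched filtering example: Delay-and-sum

•  Basic: Microphone signals are delayed and then summed together:

$$z[k] = \frac{1}{M} \sum_{m=1}^{M} y_m[k+\Delta_m], \qquad F_m(\omega) = \frac{e^{-j\omega\Delta_m}}{M}$$

•  Fractional delays implemented with truncated interpolation filters (= FIR)

•  Consider array with ideal omni-directional micr’s:

$$\mathbf{d}(\omega,\psi) = \begin{bmatrix} 1 & e^{-j\omega\tau_2(\psi)} & \dots & e^{-j\omega\tau_M(\psi)} \end{bmatrix}^T$$

Then array can be steered to angle ψ with

$$\Delta_m = \tau_m(\psi), \qquad \mathbf{F}(\omega) = \frac{\mathbf{d}(\omega,\psi)}{M}, \qquad H(\omega,\psi) = 1$$

Hence (for ideal omni-dir. micr.’s) this is the matched filter solution

[Figure: uniform linear array with inter-microphone distance d and a plane wave from angle ψ; path-length difference (m−1) d cosψ at microphone m; delays Δ1, Δ2, …, ΔM followed by summation Σ and scaling 1/M]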
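A time-domain sketch of delay-and-sum with fractional delays realized by truncated (Hann-windowed) sinc interpolation filters. Everything here is an illustrative assumption: the function name, the filter half-length L, c = 340 m/s, the test signal, and the common bulk delay of L samples that is added to keep the filters causal, so the output approximates s[k − L] rather than s[k].

```python
import numpy as np

def frac_delay_fir(delay, L=16):
    """FIR approximating a delay of (L + delay) samples, |delay| < L:
    sinc interpolator truncated by a Hann window centred on the delay."""
    n = np.arange(2 * L + 1)
    t = n - (L + delay)
    half = L - abs(delay)
    w = np.where(np.abs(t) < half, 0.5 * (1.0 + np.cos(np.pi * t / half)), 0.0)
    return np.sinc(t) * w

# Hypothetical scenario: M=4, d=5 cm, fs=16 kHz, 500 Hz source at psi=60 degrees
fs, f0, c = 16000.0, 500.0, 340.0
M, d, psi, L = 4, 0.05, np.deg2rad(60.0), 16
k = np.arange(512)
tau = np.arange(M) * d * np.cos(psi) / c * fs     # Delta_m = tau_m(psi), in samples

def s(t):
    """Source signal, evaluated at (fractional) sample times t."""
    return np.sin(2 * np.pi * f0 * t / fs)

y = [s(k - tau[m]) for m in range(M)]             # y_m[k] = s[k - tau_m]

z = np.zeros(len(k))
for m in range(M):
    h = frac_delay_fir(-tau[m], L)                # realizes y_m[k + Delta_m - L]
    z += np.convolve(y[m], h)[:len(k)]
z /= M

err = np.max(np.abs(z[2 * L:] - s(k[2 * L:] - L)))  # skip the filter transient
print(err)
```

The windowed-sinc filters are accurate at low frequencies and degrade towards half the sampling rate, which is one reason the choice of interpolation filter matters in practice.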



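•  Array directivity pattern H(ω,θ):

$$H(\omega,\theta) = \frac{1}{M}\, \mathbf{d}^H(\omega,\psi)\, \mathbf{d}(\omega,\theta)$$

$$|H(\omega,\theta)| \le 1 \ \text{(= destructive interference)}, \qquad H(\omega,\theta=\psi) = 1 \ \text{(= constructive interference, } \theta_{\max} = \psi)$$

•  White noise gain (ideal omni-dir. micr.’s):

$$\mathrm{WNG}(\omega,\theta=\psi) = \frac{|\mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\psi)|^2}{\mathbf{F}^H(\omega)\, \mathbf{F}(\omega)} = \dots = M \quad \text{(independent of } \omega)$$

For ideal omni-dir. micr. array, delay-and-sum beamformer provides WNG equal to M for all freqs (in the direction of steering angle ψ).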


•  Array directivity pattern H(ω,θ) for uniform linear array:

$$H(\omega,\theta) = \frac{1}{M} \sum_{m=1}^{M} e^{-j(m-1)\gamma} = \frac{e^{-jM\gamma/2}}{M\, e^{-j\gamma/2}} \cdot \frac{\sin(M\gamma/2)}{\sin(\gamma/2)}, \qquad \gamma = \frac{\omega f_s\, d\, (\cos\theta - \cos\psi)}{c}$$

H(ω,θ) has sinc-like shape and is frequency-dependent, with H(ω,θ=ψ) = 1

[Figure: |H(ω,θ)| (0 to 1) versus angle (0 to 180 deg) and frequency (0 to 8000 Hz)]

[Figure: spatial directivity pattern (polar plot, -20 to 0 dB) for f=5000 Hz; 0° = endfire]

M=5 microphones, d=3 cm inter-microphone distance, ψ=60° steering angle, fs=16 kHz sampling frequency, wavelength=4cm
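The closed-form (sinc-like) expression can be checked against the direct sum over microphones. This sketch uses the example values quoted for the figure (M=5, d=3 cm, ψ=60°, fs=16 kHz, f=5000 Hz); the value c = 340 m/s and the variable names are assumptions.

```python
import numpy as np

M, d, fs, c = 5, 0.03, 16000.0, 340.0
psi = np.deg2rad(60.0)
omega = 2 * np.pi * 5000.0 / fs                   # rad/sample
theta = np.linspace(0.0, np.pi, 721)              # 0.25 degree grid

gamma = omega * fs * d * (np.cos(theta) - np.cos(psi)) / c

# Direct sum: H = (1/M) sum_m exp(-j (m-1) gamma)
H_sum = np.exp(-1j * np.outer(gamma, np.arange(M))).sum(axis=1) / M

# Closed form: exp(-j (M-1) gamma / 2) * sin(M gamma / 2) / (M sin(gamma / 2)),
# with the limit value 1 inserted where gamma = 0 (theta = psi)
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = np.sin(M * gamma / 2) / (M * np.sin(gamma / 2))
ratio = np.where(np.isclose(np.sin(gamma / 2), 0.0), 1.0, ratio)
H_closed = np.exp(-1j * (M - 1) * gamma / 2) * ratio

pattern_db = 20 * np.log10(np.maximum(np.abs(H_sum), 1e-6))
print(np.degrees(theta[np.argmax(np.abs(H_sum))]))
```

`pattern_db` is what a polar plot of the spatial directivity pattern would display; the maximum lies at the steering angle.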



[Figure: |H(ω,θ)| (0 to 1) versus angle (0 to 180 deg) and frequency (0 to 8000 Hz) for M=5, ψ=60°, fs=16 kHz, d=8 cm; spatial aliasing sets in at f = c / (d.(1+cosψ))]

For

$$f \ge \frac{c}{d\,(1+\cos\psi)}$$

an ambiguity, called spatial aliasing, occurs. This is analogous to time-domain aliasing, where now the spatial sampling (= d) is too large. Aliasing does not occur (for any ψ) if

$$d \le \frac{c}{f_s} = \frac{c}{2\, f_{\max}} = \frac{\lambda_{\min}}{2}$$

Details... (with $\gamma = \omega f_s\, d\, (\cos\theta - \cos\psi)/c$ and $\omega f_s = 2\pi f$)
1) |H(ω,θ)| = 1 iff γ = p.2π for integer p; p = 0 for θ = ψ (for all ω)
2) if ψ ≤ π/2 then γ = −2π occurs for θ = π and f = c / (d.(1+cosψ))
3) if ψ ≥ π/2 then γ = 2π occurs for θ = 0 and f = c / (d.(1−cosψ))
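A short numeric check of the aliasing condition for the example values (M=5, ψ=60°, fs=16 kHz, d=8 cm). The function names and c = 340 m/s are assumptions for illustration.

```python
import numpy as np

c = 340.0   # assumed propagation speed in m/s

def first_alias_frequency(d, psi, c=c):
    """Lowest frequency (Hz) at which |H| = 1 also occurs away from theta = psi."""
    return c / (d * (1.0 + abs(np.cos(psi))))

def max_spacing(fs, c=c):
    """Largest spacing d (m) without spatial aliasing for any psi: d <= c / fs."""
    return c / fs

M, d, psi, fs = 5, 0.08, np.deg2rad(60.0), 16000.0
f_alias = first_alias_frequency(d, psi)

# Delay-and-sum response towards theta = pi at the aliasing frequency
omega = 2 * np.pi * f_alias / fs
gamma = omega * fs * d * (np.cos(np.pi) - np.cos(psi)) / c
H_back = np.exp(-1j * gamma * np.arange(M)).sum() / M

print(f_alias, abs(H_back), max_spacing(fs))
```

For this example the first aliasing frequency comes out near 2.8 kHz, and the spacing that avoids aliasing up to fs/2 is about 2.1 cm.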




•  Beamwidth for a uniform linear array:

$$\mathrm{BW} \approx \frac{\sqrt{96\,(1-\nu)}\; c}{\omega\, d\, M}\, \sec\psi \qquad \text{with e.g. } \nu = 1/\sqrt{2} \ \text{(-3 dB)}$$

hence large dependence on # microphones, distance (compare p.24 & 25) and frequency (e.g. BW infinitely large at DC)

•  Array topologies:
–  Uniformly spaced arrays
–  Nested (logarithmic) arrays (small d for high ω, large d for small ω)
–  2D- (planar) / 3D-arrays

[Figure: nested array with inter-microphone distances d, 2d, 4d]





Super-directive beamforming : DI maximization

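•  Basic: procedure based on page 13

•  Maximize Directivity (DI) for given steering angle ψ:

$$\mathbf{F}_{\mathrm{SD}}(\omega) = \arg\max_{\mathbf{F}(\omega)} \{\mathrm{DI}(\omega,\psi)\} = \arg\max_{\mathbf{F}(\omega)} \left\{ \frac{|\mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\psi)|^2}{\mathbf{F}^H(\omega)\, \mathbf{\Gamma}^{\mathrm{diffuse}}_{\mathrm{noise}}(\omega)\, \mathbf{F}(\omega)} \right\}$$

•  A priori knowledge/assumptions:
–  angle-of-arrival ψ of desired signal + corresponding steering vector
–  noise scenario = diffuse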


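•  Maximization in

$$\mathbf{F}_{\mathrm{SD}}(\omega) = \arg\max_{\mathbf{F}(\omega)} \left\{ \frac{|\mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\psi)|^2}{\mathbf{F}^H(\omega)\, \mathbf{\Gamma}^{\mathrm{diffuse}}_{\mathrm{noise}}(\omega)\, \mathbf{F}(\omega)} \right\}$$

is equivalent to minimization of noise output power (under diffuse input noise), subject to unit response for steering angle (**):

$$\min_{\mathbf{F}(\omega)} \mathbf{F}^H(\omega)\, \mathbf{\Gamma}^{\mathrm{diffuse}}_{\mathrm{noise}}(\omega)\, \mathbf{F}(\omega) \quad \text{s.t.} \quad \mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\psi) = 1$$

•  Optimal solution is

$$\mathbf{F}_{\mathrm{SD}}(\omega) = \frac{1}{\mathbf{d}^H(\omega,\psi)\, \{\mathbf{\Gamma}^{\mathrm{diffuse}}_{\mathrm{noise}}(\omega)\}^{-1}\, \mathbf{d}(\omega,\psi)}\; \{\mathbf{\Gamma}^{\mathrm{diffuse}}_{\mathrm{noise}}(\omega)\}^{-1}\, \mathbf{d}(\omega,\psi)$$

•  [FIR approximation]

$$\min_{f_{m,n},\ m=1..M,\ n=0..N-1} \int_{\omega} \left\| \mathbf{F}(\omega) - \mathbf{F}_{\mathrm{SD}}(\omega) \right\|^2 d\omega$$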
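A sketch comparing superdirective and delay-and-sum weights at a single frequency for end-fire steering. Function names, c = 340 m/s and the example values (M=5, d=3 cm, fs=16 kHz, f=3000 Hz) are assumptions for illustration; no regularization is applied, so the computation is only reliable where the diffuse coherence matrix is reasonably well conditioned.

```python
import numpy as np

def diffuse_coherence(omega, positions, fs, c=340.0):
    """Gamma_ij = sinc(omega * fs * (d_j - d_i) / c), with sinc(x) = sin(x)/x."""
    x = omega * fs * (positions[None, :] - positions[:, None]) / c
    return np.sinc(x / np.pi)

def gain(F, d, Gamma):
    """|F^H d|^2 / (F^H Gamma F)."""
    return np.abs(np.vdot(F, d)) ** 2 / np.real(np.vdot(F, Gamma @ F))

def superdirective_weights(d, Gamma):
    """F_SD = Gamma^{-1} d / (d^H Gamma^{-1} d)."""
    g = np.linalg.solve(Gamma, d)
    return g / np.vdot(d, g)

M, spacing, fs, c = 5, 0.03, 16000.0, 340.0
positions = np.arange(M) * spacing
omega = 2 * np.pi * 3000.0 / fs
d_vec = np.exp(-1j * omega * positions / c * fs)   # end-fire steering, psi = 0
Gamma = diffuse_coherence(omega, positions, fs, c)

F_sd = superdirective_weights(d_vec, Gamma)
F_ds = d_vec / M
di_sd, wng_sd = gain(F_sd, d_vec, Gamma), gain(F_sd, d_vec, np.eye(M))
di_ds, wng_ds = gain(F_ds, d_vec, Gamma), gain(F_ds, d_vec, np.eye(M))
print("SD: DI=%.2f WNG=%.3f" % (di_sd, wng_sd))
print("DS: DI=%.2f WNG=%.3f" % (di_ds, wng_ds))
```

The superdirective weights can never have a lower DI than delay-and-sum, and never a higher WNG, which is the robustness trade-off.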




•  Directivity patterns for end-fire steering (ψ=0):

[Figure: polar directivity patterns (-20 to 0 dB) at f=3000 Hz for the superdirective beamformer and the delay-and-sum beamformer; M=5, d=3 cm, fs=16 kHz]

Superdirective beamformer has highest DI, but very poor WNG (at low frequencies, where diffuse noise coherence matrix becomes ill-conditioned), hence problems with robustness (e.g. sensor noise)!

Maximum directivity = M.M obtained for end-fire steering and for frequency → 0 (no proof)

[Figure: directivity (linear, 0 to 25) versus frequency (0 to 8000 Hz) for superdirective and delay-and-sum; annotations: DI = M² = 25, DI = WNG = 5]

[Figure: white noise gain (dB, -60 to 10) versus frequency (0 to 8000 Hz) for superdirective and delay-and-sum; annotation: WNG = M = 5]

PS: diffuse noise ≈ white noise for high frequencies (cfr. ω → π and c/fs = λmin/2 ≈ min(dj−di) in diffuse noise coherence matrix)





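Differential microphones : Delay-and-subtract

•  First-order differential microphone = directional microphone: 2 closely spaced microphones, where one microphone signal is delayed (= hardware) and then subtracted from the other microphone signal

[Figure: two microphones at distance d, source at angle θ; one microphone signal is delayed by τ and subtracted (+/−) from the other in the summation Σ]

•  Array directivity pattern:

$$H(\omega,\theta) = 1 - e^{-j\omega\left(\tau + \frac{d}{c}\cos\theta\right)}$$

For ωd/c << π, ωτ << π:

$$H(\omega,\theta) \approx j\omega\left(\tau + \frac{d}{c}\cos\theta\right) = \underbrace{j\omega}_{\text{high-pass}} \cdot \left(\tau + \frac{d}{c}\right) \cdot \underbrace{P(\theta)}_{\text{angle dependence}}$$

$$P(\theta) = \alpha_1 + (1-\alpha_1)\cos\theta, \qquad \alpha_1 = \frac{\tau}{\tau + d/c}$$

–  First-order high-pass frequency dependence
–  P(θ) = freq.-independent (!) directional response
–  0 ≤ α1 ≤ 1 : P(θ) is scaled cosine, shifted up with α1 such that θmax = 0° (= end-fire) and P(θmax) = 1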
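A sketch comparing the exact delay-and-subtract response with its low-frequency approximation jω(τ + d/c)P(θ). The assumptions here: ω is taken in rad/s so that τ and d/c are in seconds, c = 340 m/s, d = 1 cm, and τ = d/c (which gives α1 = 0.5); the function names are hypothetical.

```python
import numpy as np

def diff_mic_exact(omega, theta, tau, d_over_c):
    """H = 1 - exp(-j omega (tau + (d/c) cos(theta)))."""
    return 1.0 - np.exp(-1j * omega * (tau + d_over_c * np.cos(theta)))

def diff_mic_approx(omega, theta, tau, d_over_c):
    """H ~ j omega (tau + d/c) P(theta), P = alpha1 + (1 - alpha1) cos(theta)."""
    alpha1 = tau / (tau + d_over_c)
    P = alpha1 + (1.0 - alpha1) * np.cos(theta)
    return 1j * omega * (tau + d_over_c) * P

c, d = 340.0, 0.01
d_over_c = d / c
tau = d_over_c                        # alpha1 = 0.5
omega = 2 * np.pi * 500.0             # rad/s, so omega * d / c << pi
theta = np.linspace(0.0, np.pi, 181)

H_exact = diff_mic_exact(omega, theta, tau, d_over_c)
H_approx = diff_mic_approx(omega, theta, tau, d_over_c)
max_mag_error = np.max(np.abs(np.abs(H_exact) - np.abs(H_approx)))
print(max_mag_error, abs(H_exact[-1]))
```

With τ = d/c the response vanishes at θ = 180°, and at 500 Hz the magnitudes of the exact and approximate responses agree closely.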


Differential microphones : Delay-and-subtract

•  Types: dipole, cardioid, hypercardioid, supercardioid (HJ84)

[Figure: polar patterns of the four types; 0° = endfire, 90° = broadside]

–  Dipole: α1 = 0 (τ=0), zero at 90°, DI = 4.8 dB
–  Cardioid: α1 = 0.5, zero at 180°, DI = 4.8 dB
–  Supercardioid: α1 = (√3 − 1)/2 ≈ 0.37, zero at 125°, DI = 5.7 dB, highest front-to-back ratio
–  Hypercardioid: α1 = 0.25, zero at 109°, highest DI = 6.0 dB
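The listed zero angles and DI values can be reproduced from P(θ) = α1 + (1 − α1) cosθ. The sketch below assumes a spherically isotropic (3D diffuse) noise field, for which the directivity at θ = 0 reduces to 1 / (α1² + (1 − α1)²/3); the function names are hypothetical.

```python
import numpy as np

def first_order_di_db(alpha1):
    """DI in dB at theta = 0 for P(theta) = alpha1 + (1 - alpha1) cos(theta)."""
    return 10.0 * np.log10(1.0 / (alpha1 ** 2 + (1.0 - alpha1) ** 2 / 3.0))

def null_angle_deg(alpha1):
    """Angle (degrees) where P(theta) = 0."""
    return np.degrees(np.arccos(-alpha1 / (1.0 - alpha1)))

types = {
    "dipole": 0.0,
    "cardioid": 0.5,
    "supercardioid": (np.sqrt(3.0) - 1.0) / 2.0,
    "hypercardioid": 0.25,
}
for name, a1 in types.items():
    print("%-14s alpha1=%.3f  zero at %.1f deg  DI=%.2f dB"
          % (name, a1, null_angle_deg(a1), first_order_di_db(a1)))
```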
