2011 beamforming regularization, scaling matrices and inverse problems for sound field extrapolation...
Audio Engineering Society
Convention Paper
Presented at the 131st Convention
2011 October 20-23 New York, USA
This paper was peer-reviewed as a complete manuscript for presentation at this Convention. Additional papers may be obtained by sending request and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org. All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the Audio Engineering Society.
Beamforming regularization, scaling matrices and inverse problems for sound field extrapolation and characterization: Part I - Theory
Philippe-Aubert Gauthier1,2, Eric Chambatte1,2, Cedric Camier1,2, Yann Pasco1,2, and Alain Berry1,2
1Groupe d'Acoustique de l'Université de Sherbrooke, Univ. de Sherbrooke, Sherbrooke, J1K 2R1 Canada
2Centre for Interdisciplinary Research in Music, Media and Technology, McGill Univ., Montreal, H3A 1E3 Canada
Correspondence should be addressed to Philippe-Aubert Gauthier
ABSTRACT
Sound field extrapolation (SFE) is aimed at the prediction of a sound field in an extrapolation region using a microphone array in a measurement region. For sound environment reproduction purposes, sound field characterization (SFC) aims at a more generic or parametric description of a measured or extrapolated sound field using different physical or subjective metrics. In this paper, a recently introduced SFE method is presented and further developed. The method is based on an inverse problem formulation combined with a beamforming matrix in the discrete smoothing norm of the cost function. The results obtained from the SFE method are applied to SFC for subsequent sound environment reproduction. A set of classification criteria is proposed to distinguish simple types of sound fields on the basis of two simple scalar metrics. A companion paper presents the experimental verifications of the theory presented in this paper.
1. INTRODUCTION
For spatial sound reproduction technologies based on physical simulation such as Wave Field Synthesis (WFS) [1, 2], the underlying hypothesis is that the immersion of a listener in a physical reconstruction of a target sound field will lead to an appropriate sound perception over a large listening area. In this area, the localization cues (interaural level difference, interaural time difference and spectral modifications) are naturally derived from the interaction of the listener's body and external ears with the recreated sound field. To reproduce or recreate a real sound field or real sound environment, WFS and other physical reproduction techniques require a complete physical description of the target sound field. Sound field extrapolation using microphone array technologies is appropriate for this purpose. In this paper, a sound field extrapolation and characterization methodology is presented. The experimental tests of the method are reported in a companion paper.
This work is part of a larger project which involves the entire sound field reproduction of an airplane cabin in a full-scale mock-up. The objective of the reported theory and experiments is to get preliminary insights about the efficiency and validity of the sound field extrapolation (SFE) and sound field characterization (SFC) methods in a practical situation. Preliminary experiments in laboratory conditions ensure the validation of the method before the realization of actual on-site measurements, SFE and SFC for subsequent sound environment reproduction in a mock-up of an airplane cabin.
Sound field extrapolation (SFE) finds many applications in various domains: acoustic imaging, source localization, sound field reproduction, etc. SFE relies primarily on the measurement of a sound field using a microphone array placed in a measurement region. The most common techniques are inverse problems [3, 4] and spatial transform methods (such as nearfield acoustical holography [5]). In this paper, we consider an inverse method since it can easily deal with any microphone array configuration, regular or not. However, the typically large condition number of the matrix that must be inverted signals that matrix-form inverse problems are sensitive to measurement noise [6]. Therefore, regularization of the inverse problem is mandatory. With conventional regularization methods, this usually comes at some expense: reduced spatial resolution and supplementary regularization errors. In a recent paper, a new measurement-data-dependent regularization method that suffers less from these issues was introduced [7].
The novelty of the method is that it applies a beamforming regularization matrix in the discrete smoothing norm of the cost function used to solve the inverse problem in the least-mean-square sense [3]. The advantages of this method are an increased spatial resolution of the solution and a reduced sensitivity to measurement noise. In the inverse problem, the beamforming regularization matrix simply penalizes more strongly the sources for which an a priori delay-and-sum beamformer gives a weaker amplitude. Recently, an experimental validation of the SFE method was reported [8]. The validation was based on the direct comparison of an extrapolated sound field with the exact sound field in an extrapolation region different from the measurement region. It was shown that the proposed SFE method is effective. In this paper, new theoretical developments for the interpretation of the beamforming regularization matrix are introduced on the basis of the transformation of the general-form inverse problem with the beamforming regularization matrix to a standard-form inverse problem [3]. A companion paper (Part II) discusses the results of a complete experimental verification of this recently developed method for SFE. This companion paper presents experiments in a hemi-anechoic room and in a reverberant chamber.
For sound environment and sound field reproduction purposes, SFE results can readily be applied to the derivation of multichannel signals using sound field reproduction technologies such as Wave Field Synthesis or Ambisonics [8, 10]. However, in some practical applications such as sound environment reproduction in vehicle mock-ups, the entire sound environment tends to be made of mostly stationary signals, at least for a finite period of time (corresponding to cruise speed, fixed altitude, stationary road condition, etc.). This very specific yet simplified nature of the sound environment encountered in most vehicles allows for the fragmentation of the sound environment into sound components or sound environment atoms [11, 12]. For such components of the entire sound environment, it is sometimes more useful to summarize the spatial property of the sound component by a few simple metrics using a general sound field characterization (SFC) method. For example, this is the hypothesis behind the Directional Audio Coding (DirAC) approach [13, 14, 15, 16] by Pulkki and coworkers, in which the spatial sound properties at a point in space are summarized as impinging directions and diffuseness as a function of frequency. In this paper, we develop these ideas further and apply them to the typical SFE results obtained by the proposed method for an extended spatial area. Moreover, we propose a simple classification method to distinguish simple and generic types of sound field. This classification is deduced from direct observation of the metrics' efficiency to distinguish these generic types of sound field. Supplementary methods for virtual acoustics and simulations from microphone array measurement are also possible and are discussed in [8].
AES 131st Convention, New York, USA, 2011 October 20-23
Fig. 1: Illustration of the coordinate systems. Microphones are located in $\mathbf{x}_m$. Points that belong to the inverse problem equivalent source distribution are denoted by $\mathbf{y}$. Any field point is denoted by $\mathbf{x}$. The real sound sources are confined to $V_s$.
1.1. Paper structure
Section 2 presents the general theory behind SFE using the inverse problem approach and the beamforming regularization matrix as initially introduced in [7]. In this paper, the beamforming regularization theory introduced in [7] is further developed. The SFC metrics and methods are presented in Sec. 3 where several of these metrics are discussed and compared on the basis of simple yet archetypical theoretical cases. Based on the evaluation of these metrics, a sound field components classification tree is also proposed in Sec. 3. A short discussion and a conclusion gather the main concluding remarks.
2. SOUND FIELD EXTRAPOLATION
The generic microphone array and coordinate systems are shown in Fig. 1. The array includes $M$ microphones. For a given frequency, a sound pressure field measurement is stored in a complex vector $\mathbf{p}(\mathbf{x}_m) \in \mathbb{C}^{M}$. Although the method is developed in the frequency domain, it is possible to derive the resulting time-domain quantities using the inverse Fourier transform as long as the equivalence between circular and linear convolution is respected with proper zero-padding of the input data [9].
2.1. Direct problem
The discrete direct sound radiation problem is written in matrix form as

$$\mathbf{p}(\mathbf{x}_m) = \mathbf{G}(\mathbf{x}_m, \mathbf{y}_l)\,\mathbf{q}(\mathbf{y}_l), \tag{1}$$

with

$$\mathbf{p} \in \mathbb{C}^{M}, \quad \mathbf{G} \in \mathbb{C}^{M \times L} \quad \text{and} \quad \mathbf{q} \in \mathbb{C}^{L}, \tag{2}$$

where $\mathbf{q}$ is the source strength vector for sources located in $\mathbf{y}_l$, $\mathbf{G}$ is the transfer matrix that represents sound radiation and $\mathbf{p}$ is the resulting sound pressure vector at the microphone locations $\mathbf{x}_m$. In this paper, a simple model of the direct problem is used: $\mathbf{q}$ are amplitudes of elemental plane waves propagating in different directions $\theta$ and $\phi$ in Fig. 1. Therefore, we let $R \to \infty$. Then, in this more specific case, $G_{ml} = e^{i\mathbf{k}_l \cdot \mathbf{x}_m}$ with $\mathbf{k}_l$ being the wave vector for the $l$-th plane wave ($\mathbf{k}_l = k\mathbf{n}_l$, $k = \omega/c$, $\omega$ is the angular frequency [rad/s], $c$ is the sound speed [m/s], $\mathbf{n}_l$ is a unit vector aligned with $\mathbf{k}_l$, and $\theta_l$ and $\phi_l$ are the propagation azimuth and elevation). Many other types of sources or idealized waves could be used in the direct problem definition. Indeed, one may object that spherical or cylindrical harmonic waves could be more suitable for the inverse problem. This is only the case when the microphone array does not include the origin of the coordinate system. Indeed, the linear combination of spherical harmonics or cylindrical harmonics tends to numerically diverge in the immediate vicinity of the coordinate system origin. In our case, the microphone array a priori covers an extended area (as opposed to compact arrays such as the first-order Ambisonics Sound Field microphone [17]) and typically includes the origin, hence our interest in plane waves in the discrete problem. In all cases, it is possible to convert plane waves into spherical harmonics in a subsequent step.
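As an illustration of the plane wave direct problem, the following minimal sketch builds the transfer matrix with entries $e^{i\mathbf{k}_l \cdot \mathbf{x}_m}$ and evaluates Eq. (1). The function names, the random array geometry and the unit wavelength are assumptions made for the example only; the direction cosines follow the axis labels of Fig. 2, with the first angle taken as azimuth and the second as elevation.

```python
import numpy as np

def direction_vectors(azimuth, elevation):
    """Unit vectors n_l from the direction cosines (angles in radians)."""
    azimuth = np.asarray(azimuth, dtype=float)
    elevation = np.asarray(elevation, dtype=float)
    return np.stack([np.cos(azimuth) * np.cos(elevation),
                     np.sin(azimuth) * np.cos(elevation),
                     np.sin(elevation)], axis=-1)

def plane_wave_matrix(mic_positions, azimuth, elevation, k):
    """Transfer matrix G (M x L) with entries exp(i k_l . x_m)."""
    wave_vectors = k * direction_vectors(azimuth, elevation)   # (L, 3)
    return np.exp(1j * (mic_positions @ wave_vectors.T))       # (M, L)

# Hypothetical setup: 8 microphones, 16 horizontal plane waves, unit wavelength.
rng = np.random.default_rng(0)
k = 2.0 * np.pi
mics = rng.uniform(-1.0, 1.0, size=(8, 3))
azimuth = np.linspace(0.0, np.pi, 16)
elevation = np.zeros(16)
G = plane_wave_matrix(mics, azimuth, elevation, k)

# Direct problem, Eq. (1): a single unit-amplitude plane wave.
q = np.zeros(16, dtype=complex)
q[3] = 1.0
p = G @ q
```

With a single active plane wave, the pressure vector is simply the corresponding column of the transfer matrix, and every entry of the matrix has unit modulus.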
2.2. Sound field extrapolation outside the microphone array
The extrapolated sound pressure field [Pa] at any location $\mathbf{x}$ is then computed using a linear combination of plane waves

$$p(\mathbf{x}) = \sum_{l=1}^{L} e^{i\mathbf{k}_l \cdot \mathbf{x}} q_l, \tag{3}$$

where the complex plane wave distribution is centered around the coordinate system origin $\mathbf{x} = \mathbf{0}$. Indeed, one notes that the sound pressure at the origin is the direct linear combination of the plane wave complex amplitudes

$$p(\mathbf{0}) = \sum_{l=1}^{L} q_l. \tag{4}$$
For any field point $\mathbf{x}$ other than the array origin, it is convenient for notational purposes to obtain an expression similar to Eq. (4) where a new complex plane wave distribution $\mathbf{q}(\mathbf{x})$ is centered around the field point $\mathbf{x}$. This corresponds to a simple translation of the coordinate system origin. This is expressed as follows

$$p(\mathbf{x}) = \sum_{l=1}^{L} q_l(\mathbf{x}), \tag{5}$$

with

$$q_l(\mathbf{x}) = e^{i\mathbf{k}_l \cdot \mathbf{x}} q_l. \tag{6}$$
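The extrapolation of Eq. (3) and the translated distribution of Eqs. (5) and (6) can be checked numerically with a short sketch. The helper names, the random plane wave directions and the field point are hypothetical and serve only to illustrate the two expressions.

```python
import numpy as np

def extrapolate_pressure(points, wave_vectors, q):
    """Eq. (3): p(x) = sum_l exp(i k_l . x) q_l for each row x of points."""
    return np.exp(1j * (points @ wave_vectors.T)) @ q

def translate_distribution(x, wave_vectors, q):
    """Eq. (6): plane wave amplitudes re-centered on the field point x."""
    return np.exp(1j * (wave_vectors @ x)) * q

# Hypothetical plane wave set: 20 random directions, unit wavelength.
rng = np.random.default_rng(1)
directions = rng.normal(size=(20, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
wave_vectors = 2.0 * np.pi * directions
q = rng.normal(size=20) + 1j * rng.normal(size=20)

x = np.array([0.3, -0.2, 0.1])
p_x = extrapolate_pressure(x[None, :], wave_vectors, q)[0]
q_x = translate_distribution(x, wave_vectors, q)
p_origin = extrapolate_pressure(np.zeros((1, 3)), wave_vectors, q)[0]
```

Summing the translated amplitudes `q_x` reproduces `p_x`, which is Eq. (5), and the pressure at the origin equals the plain sum of the amplitudes, which is Eq. (4).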
2.3. Inverse problem: general-form and standard-form of Tikhonov regularization
For SFE, the goal of the inverse problem is to estimate the source amplitude $\mathbf{q}$ that best predicts the measured sound field $\mathbf{p}$, knowing the propagation operator $\mathbf{G}$. Put simply, as for most classical inverse problems in acoustics, we ask for the causes, i.e. the source amplitudes $\mathbf{q}$, that created the effect, i.e. the measurement data $\mathbf{p}$, for a given known system model $\mathbf{G}$, i.e. for an imposed geometry of source distribution. Note that for practical applications, the source geometry is imposed and specified, but for now, we keep an unspecified definition of the source distribution $\mathbf{y}_l$ to propose a general view of the method.
A typical approach to that problem is to cast it as a minimization problem with Tikhonov regularization [3]:

$$\mathbf{q} = \operatorname{argmin}\left\{ \|\mathbf{p} - \mathbf{G}\mathbf{q}\|_2^2 + \lambda^2\,\Omega(\mathbf{q})^2 \right\}. \tag{7}$$

In Eq. (7), $\|\cdot\|_2$ represents the vector 2-norm ($\|\mathbf{x}\|_2^2 = \mathbf{x}^H\mathbf{x}$, superscript $H$ denotes Hermitian transpose), $\lambda$ is the penalization parameter and $\Omega(\cdot)$ is a discrete smoothing norm [3]. The function $\Omega(\cdot)$ is termed a discrete smoothing norm because it smoothly regularizes the unknown solution $\mathbf{q}$. In classical Tikhonov regularization as reported in many papers [4, 6, 18, 19, 20], the discrete smoothing norm is the solution vector 2-norm [3]: $\Omega(\mathbf{q}) = \|\mathbf{q}\|_2$. The inverse problem solution $\mathbf{q}$ should approach the real sound source distribution or should, at least, be able to achieve SFE according to Eq. (3) within an extrapolation region for which the prediction error would be below a given threshold.
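In this paper, we will assume that the discrete smoothing norm is of the more general form

$$\Omega(\mathbf{q}) = \|\mathbf{L}\mathbf{q}\|_2, \tag{8}$$

where $\mathbf{L} \in \mathbb{C}^{N \times L}$ is a rectangular or square weighting matrix. Then, one writes the general-form inverse problem as

$$\mathbf{q} = \operatorname{argmin}\left\{ \|\mathbf{p} - \mathbf{G}\mathbf{q}\|_2^2 + \lambda^2\|\mathbf{L}\mathbf{q}\|_2^2 \right\}, \tag{9}$$

and the standard-form inverse problem as [3]

$$\bar{\mathbf{q}} = \operatorname{argmin}\left\{ \|\bar{\mathbf{p}} - \bar{\mathbf{G}}\bar{\mathbf{q}}\|_2^2 + \lambda^2\|\bar{\mathbf{q}}\|_2^2 \right\}. \tag{10}$$

In this equation, the new standard-form matrices and vectors $\bar{\mathbf{G}}$ and $\bar{\mathbf{p}}$ must be computed from $\mathbf{G}$ and $\mathbf{p}$ which, depending on the weighting matrix $\mathbf{L}$, might not be a trivial task. This must often be achieved using numerical methods [3].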
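However, when the weighting matrix is square ($\mathbf{L} \in \mathbb{C}^{L \times L}$) and when its inverse exists, one directly obtains the standard-form transformed quantities

$$\bar{\mathbf{G}} = \mathbf{G}\mathbf{L}^{-1}, \quad \bar{\mathbf{p}} = \mathbf{p} \quad \text{and} \quad \mathbf{q} = \mathbf{L}^{-1}\bar{\mathbf{q}}. \tag{11}$$

When $\mathbf{L} = \mathbf{I}$, as often reported in the literature, there is no difference between the general-form and standard-form problems. The optimal solution of the general-form problem Eq. (9) is [3, 4, 6]

$$\mathbf{q} = \frac{\mathbf{G}^H\mathbf{p}}{\mathbf{G}^H\mathbf{G} + \lambda^2\mathbf{L}^H\mathbf{L}}, \tag{12}$$

where the fraction notation denotes left multiplication by the inverse of the denominator matrix. The solution of the standard-form problem Eq. (10) is [3, 4, 6]

$$\bar{\mathbf{q}} = \frac{\bar{\mathbf{G}}^H\bar{\mathbf{p}}}{\bar{\mathbf{G}}^H\bar{\mathbf{G}} + \lambda^2\mathbf{I}}. \tag{13}$$

For the specific case of Eq. (11) (i.e. $\mathbf{L} \in \mathbb{C}^{L \times L}$ and $\det(\mathbf{L}) \neq 0$), Eq. (13) can be directly expressed as a function of $\mathbf{L}$

$$\mathbf{q} = \mathbf{L}^{-1}\bar{\mathbf{q}} = \mathbf{L}^{-1}\frac{[\mathbf{L}^{-1}]^T\mathbf{G}^H\mathbf{p}}{[\mathbf{L}^{-1}]^T\mathbf{G}^H\mathbf{G}\mathbf{L}^{-1} + \lambda^2\mathbf{I}} \tag{14}$$

with superscript $T$ denoting matrix transposition. Therefore, for the specific case of Eq. (11), it is possible to interpret the problem from two equivalent vantage points: a general-form problem with a regularization matrix $\mathbf{L}$ (Eq. (12)) or a standard-form problem with a transformation matrix $\mathbf{L}^{-1}$ that transforms the propagation operator $\mathbf{G}$ (Eq. (13)). In the even more specific case of a diagonal matrix $\mathbf{L}$, the regularization matrix $\mathbf{L}$ puts weights on the individual solution components $q_l$ and the transformation matrix is a diagonal matrix $\mathbf{L}^{-1}$ that scales the columns of the propagation operator $\mathbf{G}$.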
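The equivalence between the general-form solution of Eq. (12) and the standard-form route of Eqs. (11) and (13) can be verified numerically. The sketch below is only an illustration under assumed inputs: the random system matrix, the real positive diagonal weighting matrix and the value of the penalization parameter are arbitrary, and the function names are hypothetical.

```python
import numpy as np

def tikhonov_general(G, p, L, lam):
    """Eq. (12): general-form solution (G^H G + lam^2 L^H L)^-1 G^H p."""
    A = G.conj().T @ G + lam**2 * (L.conj().T @ L)
    return np.linalg.solve(A, G.conj().T @ p)

def tikhonov_standard_via_scaling(G, p, L, lam):
    """Eqs. (11) and (13): solve with G_bar = G L^-1, then map back with L^-1."""
    L_inv = np.linalg.inv(L)
    G_bar = G @ L_inv
    A = G_bar.conj().T @ G_bar + lam**2 * np.eye(G.shape[1])
    q_bar = np.linalg.solve(A, G_bar.conj().T @ p)
    return L_inv @ q_bar

# Arbitrary underdetermined test problem (6 microphones, 10 sources).
rng = np.random.default_rng(2)
n_mics, n_src = 6, 10
G = rng.normal(size=(n_mics, n_src)) + 1j * rng.normal(size=(n_mics, n_src))
p = rng.normal(size=n_mics) + 1j * rng.normal(size=n_mics)
L = np.diag(rng.uniform(0.5, 2.0, size=n_src))   # invertible diagonal weights
lam = 0.5

q_general = tikhonov_general(G, p, L, lam)
q_standard = tikhonov_standard_via_scaling(G, p, L, lam)
```

Both routes return the same source vector up to rounding errors, which is the two-vantage-point interpretation stated above for an invertible weighting matrix.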
in reference [7]. Moreover, initial experiments with the beamforming regularization matrix in a hemi-anechoic room, with a 96-microphone array and with a single omnidirectional source showed the practicability and validity of the method [8].
To illustrate the validity of the method, a simple theoretical example is given in Figs. 2, 3 and 4. The direct problem source distribution involves 642 plane waves coming from $4\pi$ steradians. The microphone array is shown in Fig. 3: it is made of 256 microphones. The original sound field is created by a dipole in $\mathbf{x} = [1.8\lambda_k, 2\lambda_k, 0]^T$ where $\lambda_k$ is the acoustic wavelength. Real parts of the original and extrapolated sound fields are shown in Fig. 4. The SFE was based on Eq. (20) with $\lambda = 0.01$. Clearly, SFE using the proposed method is effective.
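In the following, we illustrate the equivalence of the beamforming regularization matrix and the corresponding scaling matrix $\mathcal{L}$ (see Eqs. (11) to (13))

$$\mathcal{L} = \mathbf{L}^{-1} = \operatorname{diag}\left(|\mathbf{G}^H\mathbf{p}| / \|\mathbf{G}^H\mathbf{p}\|_\infty\right) \in \mathbb{C}^{L \times L}. \tag{21}$$

Then, Eq. (14) gives

$$\mathbf{q}_{BF} = \frac{\operatorname{diag}\left(|\mathbf{Q}_{BF}| / \|\mathbf{Q}_{BF}\|_\infty\right)^2\mathbf{G}^H\mathbf{p}}{\mathbf{G}^H\operatorname{diag}\left(|\mathbf{Q}_{BF}| / \|\mathbf{Q}_{BF}\|_\infty\right)^2\mathbf{G} + \lambda^2\mathbf{I}} \tag{22}$$

since $\mathcal{L}$ is diagonal. This solution is equivalent to Eq. (20). Therefore, it is possible to interpret the original beamforming regularization matrix as a standard-form problem using a data-dependent scaled system matrix $\bar{\mathbf{G}}$

$$\bar{\mathbf{G}} = \mathbf{G}\mathcal{L} = \mathbf{G}\operatorname{diag}\left(|\mathbf{G}^H\mathbf{p}| / \|\mathbf{G}^H\mathbf{p}\|_\infty\right), \tag{23}$$

with $\mathbf{q}_{BF} = \mathcal{L}\bar{\mathbf{q}}_{BF}$. Matrix $\mathcal{L}$ will be called the scaling matrix.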
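The scaling-matrix interpretation of Eqs. (21) and (23) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the beamformer output is the delay-and-sum map $\mathbf{G}^H\mathbf{p}$ normalized by its largest magnitude, and it borrows the linear-array configuration of Sec. 2.6 (32 microphones over 4 wavelengths, 256 horizontal plane waves) with an arbitrarily chosen incident direction. The function name and the test direction index are hypothetical.

```python
import numpy as np

def beamforming_scaled_solution(G, p, lam):
    """Standard-form Tikhonov solution with the data-dependent scaling matrix."""
    q_das = G.conj().T @ p                           # assumed delay-and-sum output G^H p
    scale = np.abs(q_das) / np.max(np.abs(q_das))    # diagonal of the scaling matrix, Eq. (21)
    G_bar = G * scale                                # Eq. (23): scales the columns of G
    A = G_bar.conj().T @ G_bar + lam**2 * np.eye(G.shape[1])
    q_bar = np.linalg.solve(A, G_bar.conj().T @ p)   # Eq. (13) applied to G_bar
    return scale * q_bar, scale                      # q_BF = scaling matrix times q_bar

# Linear array along x1, positions in wavelengths (unit wavelength assumed).
k = 2.0 * np.pi
x_m = np.linspace(-2.0, 2.0, 32)
theta = np.linspace(0.0, np.pi, 256)
G = np.exp(1j * k * np.outer(x_m, np.cos(theta)))

# Measurement created by one plane wave of the grid (index chosen arbitrarily).
p = G[:, 100].copy()
q_bf, scale = beamforming_scaled_solution(G, p, lam=0.01)
residual = np.linalg.norm(G @ q_bf - p)
```

The diagonal entries stored in `scale` peak at the incident direction, so the columns of the system matrix associated with weak beamformer output are attenuated before the standard-form inversion.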
2.6. Spatial resolution: equivalence between the beamforming regularization matrix and beamforming scaling matrix
As discussed in [7], since the beamforming regularization matrix involves a general-form inverse problem, one must rely on the generalized singular value decomposition (GSVD) of the matrix pair $\mathbf{G}$ and $\mathbf{L}$ to evaluate the possible spatial resolution of the problem. This is typically evaluated on the basis of the generalized singular vectors. In the case of the beamforming scaling matrix, the problem is written in standard form and the spatial resolution of the problem can be evaluated on the basis of the singular value decomposition (SVD) of the scaled system matrix $\bar{\mathbf{G}}$. In this paper, the equivalence of the two formulations in terms of spatial resolution will be illustrated. Furthermore, this demonstration will also illustrate the better spatial resolution obtained by the beamforming regularization or scaling matrices.

Fig. 2: Spherical distribution of $L = 642$ incoming plane waves. Each propagating direction $\theta_l$ and $\phi_l$ is shown as a black dot on the sphere with the corresponding direction cosines. [Axes: $\cos(\theta_l)\cos(\phi_l)$, $\sin(\theta_l)\cos(\phi_l)$ and $\sin(\phi_l)$.]

Fig. 3: Theoretical 256-microphone array geometry. Microphones are horizontally aligned with a uniform rectangular grid (shown in grey) and they are randomly positioned along $x_3$ on the basis of a two-layer geometry (with 0.1 wavelength as the vertical separation distance). Microphone acoustic centers are shown as black dots. The problem is dimensionless and the array spans two wavelengths along $x_1$ and $x_2$. [Axes: $x_1$, $x_2$ and $x_3$.]

Fig. 4: (a): Real part of the original sound field created by a dipole sound source in $\mathbf{x} = [1.8\lambda_k, 2\lambda_k, 0]^T$ in the direction $\pi/2.25$ radians. Field positions are normalized by the acoustical wavelength $\lambda_k$. (b): Real part of the extrapolated sound field using a plane wave source distribution obtained from Eq. (20) with $\lambda = 0.01$. The dipole is marked as a black and white dot. The dipole main and orthogonal axes are highlighted by dashed black and white lines. The white contour line indicates the region of 0.001 of local quadratic SFE error and the black contour line indicates the region of 0.1 of local quadratic SFE error ($\|p(\mathbf{x}) - \hat{p}(\mathbf{x})\|_2^2$).
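The SVD of $\bar{\mathbf{G}}$ is given by

$$\bar{\mathbf{G}} = \bar{\mathbf{U}}\bar{\boldsymbol{\Sigma}}\bar{\mathbf{V}}^H = \sum_{i=1}^{M} \bar{\mathbf{u}}_i\bar{\sigma}_i\bar{\mathbf{v}}_i^H, \tag{24}$$

with unitary matrices $\bar{\mathbf{U}} \in \mathbb{C}^{M \times M}$ and $\bar{\mathbf{V}} \in \mathbb{C}^{L \times L}$ ($\bar{\mathbf{U}}^H\bar{\mathbf{U}} = \bar{\mathbf{V}}^H\bar{\mathbf{V}} = \mathbf{I}$). In Eq. (24), the vectors $\bar{\mathbf{u}}_i$ and $\bar{\mathbf{v}}_i$ are the left and right singular vectors, respectively. They correspond to the columns of $\bar{\mathbf{U}}$ and $\bar{\mathbf{V}}$. Each singular vector pair corresponds to a singular value $\bar{\sigma}_i$ stored on the main diagonal of $\bar{\boldsymbol{\Sigma}} \in \mathbb{R}^{M \times L}$. It is assumed that the number of microphones is smaller than the number of unknown sources. The singular values are sorted in decreasing order ($\bar{\sigma}_1 \geq \bar{\sigma}_2 \geq \cdots > 0$). On the basis of this SVD, the solution of the standard form is written

$$\mathbf{q}_{BF} = \mathcal{L}\sum_{i=1}^{M} f_i\frac{\bar{\mathbf{u}}_i^H\mathbf{p}}{\bar{\sigma}_i}\bar{\mathbf{v}}_i \tag{25}$$

where the filter factors $f_i = \bar{\sigma}_i^2/(\bar{\sigma}_i^2 + \lambda^2)$ represent the regularization effect.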
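The filter-factor expansion of Eq. (25) can be compared with the closed-form standard-form solution of Eq. (13). In this hedged sketch the scaled system matrix is replaced by an arbitrary random matrix, the function name is hypothetical, and the final multiplication by the scaling matrix is left out, so the result corresponds to the standard-form unknown $\bar{\mathbf{q}}$.

```python
import numpy as np

def tikhonov_svd(G_bar, p, lam):
    """Standard-form Tikhonov solution written as a filtered SVD expansion."""
    U, s, Vh = np.linalg.svd(G_bar, full_matrices=False)
    f = s**2 / (s**2 + lam**2)                 # filter factors f_i
    coefficients = f * (U.conj().T @ p) / s    # f_i u_i^H p / sigma_i
    return Vh.conj().T @ coefficients, f       # sum over i of coefficient_i * v_i

# Arbitrary stand-in for the scaled system matrix (6 microphones, 10 sources).
rng = np.random.default_rng(3)
n_mics, n_src = 6, 10
G_bar = rng.normal(size=(n_mics, n_src)) + 1j * rng.normal(size=(n_mics, n_src))
p = rng.normal(size=n_mics) + 1j * rng.normal(size=n_mics)
lam = 0.3

q_svd, f = tikhonov_svd(G_bar, p, lam)
A = G_bar.conj().T @ G_bar + lam**2 * np.eye(n_src)
q_direct = np.linalg.solve(A, G_bar.conj().T @ p)   # Eq. (13)
```

The filter factors lie between 0 and 1 and decrease with the singular values, so the components associated with small singular values are the ones damped by the regularization.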
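The GSVD of $\mathbf{G}$ and $\mathbf{L}$ is given by [3, 21] with $\mathbf{U} \in \mathbb{C}^{M \times M}$, $\mathbf{V} \in \mathbb{C}^{L \times L}$, $\mathbf{C} \in \mathbb{R}^{M \times L}$, $\mathbf{M} \in \mathbb{R}^{L \times L}$ and $\mathbf{Z} \in \mathbb{C}^{L \times L}$. The columns of $\mathbf{U}$ and $\mathbf{V}$ are orthonormal ($\mathbf{U}^H\mathbf{U} = \mathbf{I}$ and $\mathbf{V}^H\mathbf{V} = \mathbf{I}$) and $\mathbf{Z}$ is nonsingular. The columns of $\mathbf{U}$, $\mathbf{V}$ and $\mathbf{Z}$ ($\mathbf{u}_i$, $\mathbf{v}_i$ and $\mathbf{z}_i$, respectively) form a new set of singular vectors that are used as independent basis vectors. The columns of $\mathbf{U}$ are used as basis vectors for the acoustic pressure $\mathbf{p}$ while the columns of $\mathbf{Z}$ are used as basis vectors for the source distribution $\mathbf{q}$. Note that $\mathbf{U}$ and $\mathbf{V}$ are not equal to those found from the standard SVD (namely, $\bar{\mathbf{U}}$ and $\bar{\mathbf{V}}$). Matrices $\mathbf{C}$ and $\mathbf{M}$ have their coefficients $c_i$ and $m_i$ stored in increasing order on their main diagonals. The generalized singular values $\gamma_i$ are given by

$$\gamma_i = c_i/m_i. \tag{26}$$

On the basis of this GSVD, Eq. (12) is written

$$\mathbf{q}_{BF} = \sum_{i=1}^{M} f_i\frac{\mathbf{u}_i^H\mathbf{p}}{c_i}\mathbf{z}_i \tag{27}$$

where $f_i = \gamma_i^2/(\gamma_i^2 + \lambda^2)$ represents the regularization effect on the solution.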
An example is now introduced to highlight the increased spatial resolution of the inverse problem approach with the beamforming regularization matrix in comparison with the inverse problem with the identity matrix as the weighting matrix. Moreover, this example will illustrate the fact that the spatial resolution obtained with the beamforming regularization matrix $\mathbf{L}$ written in general form is equal to the resolution obtained with the beamforming scaled system matrix $\bar{\mathbf{G}}$ written in standard form. To illustrate this property, we simply rely on the comparison of the spatial resolution of the generalized singular vectors $\mathbf{z}_i$ and the scaled singular vectors $\mathcal{L}\bar{\mathbf{v}}_i$ since they both form the orthogonal bases of the solutions of the standard form Eq. (25) and the general form Eq. (27). The $\mathcal{L}\bar{\mathbf{v}}_i$ and $\mathbf{z}_i$ are presented in Figs. 5(a) and 5(b), respectively. For comparison purposes, the right singular vectors of the matrix $\mathbf{G}$ are shown.
The example involves a linear microphone array of $M = 32$ microphones spanning 4 acoustical wavelengths $\lambda_k$ with a plane wave distribution of $L = 256$ plane waves ($\theta_l = 0, \ldots, \pi$ and $\phi_l = 0$) and for a plane wave incident from $\theta = \pi/2$. First note the effect of the beamforming regularization matrix and the corresponding scaling matrix by comparison with the standard system matrix $\mathbf{G}$: they provide a locally increased spatial resolution in the vicinity of the impinging sound wave. Moreover, the comparison of Fig. 5(a) and Fig. 5(b) illustrates the exact correspondence of the vector bases used for the standard form Eq. (25) and the general form Eq. (27). Therefore, on the basis of this example and Eqs. (20) and (22), the increased spatial resolution property associated with the beamforming regularization matrix method (as originally presented on the basis of the GSVD in [7]) is equivalent to the increased spatial resolution property for the scaled system matrix.
3. SOUND FIELD CHARACTERIZATION
In this section, several metrics and quantifiers are presented to characterize the measured and extrapolated sound fields for a given frequency on the basis of the inverse problem solutions $\mathbf{q}$, $\mathbf{q}_{BF}$ or $q(\theta, \phi)$, i.e. the plane wave distributions. In some cases, the metrics are computed directly from the inverse problem solution and in some other cases the metrics are computed from the SFE result, namely the sound pressure or the particle velocity. The presented metrics are either objective or subjective predictors. A distinction is also introduced between local and global metrics. It is known from the literature that some metrics are more effective at predicting the listener's sound localization in different frequency bands [24]; therefore, several metrics are presented and discussed before being exemplified using the SFE method presented earlier.
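3.1. Sound intensity and direction-of-arrival fields and averages
The extrapolated sound pressure field [Pa] as a function of $\mathbf{x}$ is given by the algebraic superposition of the $L$ harmonic plane waves used in the direct problem as expressed in Eq. (3).
The acoustic velocity field $\mathbf{u}(\mathbf{x})$ [m/s] is computed using the linearized Euler equation [25]

$$\mathbf{u}(\mathbf{x}) = \frac{\nabla p(\mathbf{x})}{i\rho\omega}, \tag{28}$$

with $\rho$ being the air density [kg/m3], $\omega$ the angular frequency [rad/s] and $\nabla$ the gradient operator given by

$$\nabla = \frac{\partial}{\partial x_1}\mathbf{e}_1 + \frac{\partial}{\partial x_2}\mathbf{e}_2 + \frac{\partial}{\partial x_3}\mathbf{e}_3 \tag{29}$$

where $\mathbf{e}_i$ is a canonical vector [21] pointing in the $x_i$ direction. Accordingly, for the problem at hand, one finds

$$\nabla p(\mathbf{x}) = \sum_{l=1}^{L} i\mathbf{k}_l e^{i\mathbf{k}_l \cdot \mathbf{x}} q_l, \tag{30}$$

and

$$\mathbf{u}(\mathbf{x}) = \sum_{l=1}^{L} \frac{\mathbf{n}_l}{\rho c} e^{i\mathbf{k}_l \cdot \mathbf{x}} q_l, \tag{31}$$

with $\mathbf{n}_l = \mathbf{k}_l/\|\mathbf{k}_l\|_2$ being a unit vector collinear with $\mathbf{k}_l$. For a given harmonic sound field, the time-averaged acoustic intensity $\mathbf{I}(\mathbf{x})$ [W/m2] is given by [25]

$$\mathbf{I}(\mathbf{x}) = \frac{1}{2}\Re\left[p(\mathbf{x})\,\mathbf{u}(\mathbf{x})^*\right], \tag{32}$$

which gives, in our specific case

$$\mathbf{I}(\mathbf{x}) = \frac{1}{2}\Re\left[\sum_{l=1}^{L} e^{i\mathbf{k}_l \cdot \mathbf{x}} q_l \left(\sum_{l=1}^{L} \frac{\mathbf{n}_l}{\rho c} e^{i\mathbf{k}_l \cdot \mathbf{x}} q_l\right)^*\right]. \tag{33}$$

Many metrics presented in the sequel are derived from the sound pressure, velocity and intensity fields.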
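The pressure, velocity and intensity fields of Eqs. (3), (31) and (33) follow directly from a plane wave distribution, as the next sketch illustrates. The function name, the default air density and sound speed, and the test points are assumptions made for the example; the intensity is taken as half the real part of the pressure times the conjugated velocity.

```python
import numpy as np

def fields_from_plane_waves(points, directions, q, k, rho=1.2, c=343.0):
    """Pressure (N,), velocity (N, 3) and time-averaged intensity (N, 3)."""
    phase = np.exp(1j * k * (points @ directions.T))      # (N, L), exp(i k_l . x)
    p = phase @ q                                         # Eq. (3)
    u = (phase * q) @ directions / (rho * c)              # Eq. (31)
    intensity = 0.5 * np.real(p[:, None] * np.conj(u))    # Eqs. (32) and (33)
    return p, u, intensity

# Hypothetical check: one plane wave travelling along x1 with amplitude 2.
rho, c = 1.2, 343.0
directions = np.array([[1.0, 0.0, 0.0]])
q = np.array([2.0 + 0.0j])
points = np.random.default_rng(4).uniform(-1.0, 1.0, size=(5, 3))
p, u, intensity = fields_from_plane_waves(points, directions, q, 2.0 * np.pi, rho, c)
expected = 0.5 * np.abs(q[0]) ** 2 / (rho * c) * directions[0]
```

For a single plane wave the intensity is uniform in space and points along the propagation direction, with magnitude equal to half the squared amplitude divided by the characteristic impedance.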
Fig. 5: Absolute value of the beamforming output $\mathbf{Q}_{BF}$ (Eq. (15)) (top) and the first six (from top) singular vectors $\mathcal{L}\bar{\mathbf{v}}_1$ to $\mathcal{L}\bar{\mathbf{v}}_6$ (black lines) (a) and generalized singular vectors $\mathbf{z}_1$ to $\mathbf{z}_6$ (black lines) (b), plotted against $\theta_l/\pi$, for a linear microphone array of 32 microphones spanning 4 acoustical wavelengths $\lambda_k$ with a plane wave distribution of 256 plane waves ($\theta_l = 0, \ldots, \pi$ and $\phi_l = 0$) and for a plane wave incident from $\theta = \pi/2$. The real parts of the vectors are shown as continuous lines and the imaginary parts of the vectors as dashed lines. For comparison purposes, the right singular vectors of the matrix $\mathbf{G}$ are shown in grey.
The direction of the sound intensity $\mathbf{I}(\mathbf{x})$ can also be used to predict a local indication of the DOA (direction-of-arrival). Indeed, the DOA is a unit vector in the opposite direction of the sound intensity vector. Then the DOA vector $\mathbf{n}_{DOA}(\mathbf{x})$ is given by

$$\mathbf{n}_{DOA}(\mathbf{x}) = -\frac{\mathbf{I}(\mathbf{x})}{\|\mathbf{I}(\mathbf{x})\|_2}. \tag{34}$$

For a set of $N$ SFE points $\mathbf{x}_n$, the average intensity vector is introduced for a spatially discrete extrapolation region:

$$\langle\mathbf{I}\rangle_N = \frac{1}{N}\sum_{n=1}^{N} \mathbf{I}(\mathbf{x}_n). \tag{35}$$

This averaging operation can also be computed for all the subsequent metrics, including the DOA. It will be systematically denoted by $\langle\cdot\rangle_N$.
Intensity and DOA fields are both objective and subjective metrics: they represent a directional transport of acoustical energy, but they are also sometimes used as indicators of sound localization by human hearing, especially the DOA. However, the sound intensity solely expresses the net flow of energy; it does not indicate the direction of particular simultaneous arrivals, and the same holds for the DOA.
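Equations (34) and (35) amount to a sign-reversed normalization and a plain mean over the extrapolation points. The sketch below uses hypothetical function names and a hand-written intensity array; it does not guard against points where the intensity vanishes.

```python
import numpy as np

def doa_field(intensity):
    """Eq. (34): unit vectors opposite to the intensity vectors (rows of intensity)."""
    norms = np.linalg.norm(intensity, axis=-1, keepdims=True)
    return -intensity / norms          # undefined where the intensity is exactly zero

def spatial_average(values):
    """Eq. (35): average over the N extrapolation points (first axis)."""
    return np.mean(values, axis=0)

# Hand-written intensity vectors at two extrapolation points.
intensity = np.array([[2.0, 0.0, 0.0],
                      [0.0, 0.0, 4.0]])
n_doa = doa_field(intensity)
mean_intensity = spatial_average(intensity)
```

Here the two DOA vectors point along the negative $x_1$ and negative $x_3$ axes, and the average intensity vector is $[1, 0, 2]$.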
3.2. Energy density field and average energy
The local time-averaged energy density field $E(\mathbf{x})$ of a harmonic acoustic sound field is a combination of kinetic and potential energy density fields, $E_c(\mathbf{x})$ and $E_p(\mathbf{x})$ [J/m3], respectively [25]:

$$E(\mathbf{x}) = E_c(\mathbf{x}) + E_p(\mathbf{x}) = \frac{\rho}{4}\left[\|\mathbf{u}(\mathbf{x})\|_2^2 + \frac{|p(\mathbf{x})|^2}{(\rho c)^2}\right]. \tag{36}$$

According to Eqs. (3) and (31), one obtains for the problem at hand

$$E(\mathbf{x}) = \frac{\rho}{4}\left[\left\|\sum_{l=1}^{L} \frac{\mathbf{n}_l}{\rho c} e^{i\mathbf{k}_l \cdot \mathbf{x}} q_l\right\|_2^2 + \frac{\left|\sum_{l=1}^{L} e^{i\mathbf{k}_l \cdot \mathbf{x}} q_l\right|^2}{(\rho c)^2}\right]. \tag{37}$$

The energy density field can provide some interesting insights about a measured sound field. For a completely diffuse sound field, the local spatial average energy density field $\langle E(\mathbf{x})\rangle_N$ (with $N$ neighboring points of $\mathbf{x}$) should be constant in space [25]. For a harmonic sound field, a local spatial average is an average over a volume with dimensions larger than the wavelength [25]. However, in practical situations, the local energy density field $E(\mathbf{x})$ is not spatially uniform and this introduces some issues. This will be discussed in Secs. 3.6 and 3.7.
In this paper, we also introduce the normalized standard deviation of the local energy density with respect to the average energy density ($\langle E\rangle_N$)

$$\sigma_N = \frac{1}{N\langle E\rangle_N}\sum_{n=1}^{N} \left|E(\mathbf{x}_n) - \langle E\rangle_N\right| \times 100\,\%, \tag{38}$$

in % of $\langle E\rangle_N$. A small $\sigma_N$ suggests a uniform distribution of the energy density while a large $\sigma_N$ suggests a heterogeneous distribution of the energy density.
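The energy density of Eq. (36) and the normalized deviation of Eq. (38) can be sketched in a few lines. The function names, the default air density and sound speed, and the sample values are assumptions for illustration; the deviation is computed as the mean absolute difference from the average, expressed in percent of that average.

```python
import numpy as np

def energy_density(p, u, rho=1.2, c=343.0):
    """Eq. (36): time-averaged energy density from pressure (N,) and velocity (N, 3)."""
    kinetic = np.sum(np.abs(u) ** 2, axis=-1)
    potential = np.abs(p) ** 2 / (rho * c) ** 2
    return rho / 4.0 * (kinetic + potential)

def normalized_deviation(E):
    """Eq. (38): mean absolute deviation of E in percent of its average."""
    E = np.asarray(E, dtype=float)
    E_mean = E.mean()
    return np.sum(np.abs(E - E_mean)) / (E.size * E_mean) * 100.0

# Single plane wave of amplitude q0 travelling along x1 (hypothetical values).
rho, c = 1.2, 343.0
q0 = 2.0 + 0.0j
p = np.array([q0])
u = np.array([np.array([1.0, 0.0, 0.0]) * q0 / (rho * c)])
E_plane = energy_density(p, u, rho, c)

dev_uniform = normalized_deviation([3.0, 3.0, 3.0])
dev_uneven = normalized_deviation([1.0, 3.0])
```

For the single plane wave, kinetic and potential parts are equal and the result matches the directional energy density expression used later in Sec. 3.3.2; a perfectly uniform energy density gives a deviation of 0 %.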
3.3. Directional pressure, energy density and diffusion
Most of the previously introduced SFC metrics and quantifiers (Secs. 3.1 and 3.2) rely on the computation of SFE and local quantities before being actually averaged over the SFE sampled region. It is possible to introduce classical metrics on the basis of the plane wave source distribution $\mathbf{q}$ without actual SFE.
3.3.1. Directional pressure
For the proposed SFE method, the output is a plane wave amplitude vector $\mathbf{q}$ which directly gives the directional pressure: $p_l(\theta_l + \pi, -\phi_l) = q_l(\theta_l, \phi_l)$. Indeed, if a localization algorithm could be designed to listen to a single direction $\theta_l + \pi, -\phi_l$ from the SFE results, it would only detect a sound pressure wave with $q_l$ as its complex amplitude. Therefore, the passage from $q_l$ to $p_l$ is direct. However, one should keep in mind the reversal of the propagation directions $\theta_l, \phi_l$ to corresponding listening directions $\theta_l + \pi, -\phi_l$.
3.3.2. Directional energy density
Since, for a single harmonic plane wave $q_l(\theta_l, \phi_l)$, the velocity field is related to the pressure field through the characteristic impedance $\rho c$, one can directly write the directional energy density [25] on the basis of Eq. (36) and the directional pressure $p_l$

$$E_l = \frac{|p_l|^2}{2\rho c^2} = \frac{|q_l|^2}{2\rho c^2}. \tag{39}$$

The directional energy density $E_l$, since it is based on directional pressure $p_l$, represents the energy density that comes from the listening direction $\theta_l + \pi, -\phi_l$. The average directional energy is given by

$$\langle E_l\rangle_L = \frac{1}{L}\sum_{l=1}^{L} E_l = \frac{1}{2\rho c^2 L}\sum_{l=1}^{L} |q_l|^2. \tag{40}$$
3.3.3. Directional diffusion
The previous metrics lead to the definition of directional
diffusion. The directional diffusion in % is defined as
follows [26]
d= (1 /o) 100 % (41)
where is the average of the absolutedifference betweenthe directional energy density and the spatial average of
the directional energy density andois the value offora single impinging plane wave. Therefore, d= 100% fora perfectly diffuse sound field and d=0 % in anechoicconditions. In this paper, we follow the propostion of
Goveret al.[26] and use the following definition for
= 1
El
L
l=1
|El El|. (42)
However, for the evaluation ofo, we rely on an averageof(according to Eq. (42)) over all the possible planewave directions:
o= 1L
L
l=1
1E
(l)l
L
l=1
|E(l
)l E(l
)l |. (43)
That is, the inverse problem is theoretically computed $L$ times, once for each possible harmonic plane wave direction (index $l'$ in the previous equation). The resulting solutions $q_l^{(l')}$ lead to the directional energy densities $E_l^{(l')}$ used in this definition of $\mu_o$. Indeed, the heterogeneous nature of the inverse problem solution $\mathbf{q}$ as a function of sound wave direction, due to the array geometry, requires the computation of a direction-averaged $\mu_o$ as shown in Eq. (43). Note that since the inverse problem solution $\mathbf{q}$ is obtained with regularization, we do not expect the directional diffusion to reach 0 % in practical situations. Indeed, the regularization introduces some spherical spreading of the solution, even for a single incoming plane wave.
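As a rough numerical sketch of Eqs. (39), (41) and (42), the directional diffusion can be computed from a vector of plane wave amplitudes as shown below. The air density and sound speed values, the function names, and the use of an ideal (unregularized) single plane wave to obtain $\mu_o$ are illustrative assumptions; the paper instead averages $\mu$ over the regularized inverse solutions as described for Eq. (43).

```python
RHO = 1.21   # air density [kg/m^3] (assumed value)
C = 343.0    # speed of sound [m/s] (assumed value)


def directional_energy(q):
    """Eq. (39): E_l = |q_l|^2 / (2 rho c^2) for each plane wave amplitude."""
    return [abs(ql) ** 2 / (2.0 * RHO * C ** 2) for ql in q]


def mu(energies):
    """Eq. (42): summed absolute deviation from the mean, divided by the mean."""
    mean = sum(energies) / len(energies)
    return sum(abs(e - mean) for e in energies) / mean


def mu_single_plane_wave(count):
    """Stand-in for mu_o: one ideal plane wave among `count` directions.

    Assumption: no regularization spreading, unlike Eq. (43) in the paper.
    """
    q = [1.0] + [0.0] * (count - 1)
    return mu(directional_energy(q))


def directional_diffusion(q, mu_o):
    """Eq. (41): d = (1 - mu / mu_o) * 100 %."""
    return (1.0 - mu(directional_energy(q)) / mu_o) * 100.0
```

With this idealized $\mu_o$, a single plane wave gives d = 0 % and equal-amplitude plane waves in all directions give d = 100 %, which matches the two limiting cases stated in the text.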
3.4. Incident directivity factor
Assuming that the plane wave distribution $q_l(\theta_l, \phi_l)$ uniformly covers $4\pi$ steradians, it is possible to quantify the directivity of the source distribution. Inspired by the definition of the directivity factor of sound sources, an incident directivity factor is accordingly introduced:
$$Q = \frac{\|\mathbf{q}\|_\infty^2}{\|\mathbf{q}\|_2^2}, \qquad (44)$$
with $\mathbf{q}$ taken as either solution of the inverse problem, q or qBF. The corresponding incident directivity index [dB ref 1] is
$$DI = 10 \log_{10}(Q). \qquad (45)$$
This type of incident directivity factor was also intro-
duced by Gover [26] for the analysis of transient sound
fields in rooms. As for the directional diffusion, the di-
rectivity index is an averaged parameter that expresses
the anisotropic character of a sound field.
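The two definitions can be checked numerically with a minimal sketch. It assumes that Eq. (44) is the ratio of the largest squared amplitude to the sum of squared amplitudes, a reading that agrees with the limiting values Q = 1 and Q = 1/642 quoted in Sec. 3.7.1; the function names are hypothetical.

```python
import math


def directivity_factor(q):
    """Assumed reading of Eq. (44): max_l |q_l|^2 divided by sum_l |q_l|^2."""
    peak = max(abs(ql) for ql in q) ** 2
    total = sum(abs(ql) ** 2 for ql in q)
    return peak / total


def directivity_index(q):
    """Eq. (45): DI = 10 log10(Q), in dB ref 1."""
    return 10.0 * math.log10(directivity_factor(q))
```

Since Q never exceeds one, DI is zero for a single plane wave and negative otherwise.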
3.5. Sound localization: Velocity and energy vectors, interaural time difference
Both the velocity and energy vectors are derived from
the audio engineering field where researchers look for
predictors of human sound localization in presence of
stereophonic sound systems.
The velocity vector was proposed as a sound localization predictor at low frequencies, i.e. typically below 700 Hz, where the interaural phase difference is a dominating cue for localization and where head diffraction is minimal [23, 24, 27]. It was originally defined as the normalized particle velocity at the center of the reproduction region, where the listener stands. More recently, the velocity vector definition was expanded to the entire sound field, and it is now given by Daniel et al. [27]
$$\mathbf{V}(\mathbf{x}) = \frac{\rho c\, \mathbf{u}(\mathbf{x})}{p(\mathbf{x})}. \qquad (46)$$
One notes that the velocity vector is the particle velocity vector $\mathbf{u}(\mathbf{x})$ normalized by the particle velocity amplitude $p(\mathbf{x})/\rho c$ that would be obtained for a purely propagating plane wave of sound pressure amplitude equal to the local sound pressure $p(\mathbf{x})$. Therefore, the velocity vector is a dimensionless metric. Note that, in contrast with the intensity and DOA vectors, the velocity vector is a complex quantity. The real part of the velocity vector is in the opposite direction of the DOA vector $\mathbf{n}_{DOA}(\mathbf{x})$ (see Eq. (34)) and is associated with precise sound localization [24] and active sound intensity. It is also generally accepted that the imaginary part of the velocity vector is associated with image broadening or perceived phasiness [24]. Typically, the imaginary part of the velocity vector is also related to reactive sound intensity. In [27], it is stressed that the velocity vector can also be used to predict the interaural time difference (ITD). In accordance with the information conveyed by the velocity vector, we introduce an equivalent expression of the velocity vector
$$\mathbf{V}(\mathbf{x}) = \mathbf{V}_R(\mathbf{x}) + i\,\mathbf{V}_I(\mathbf{x}), \qquad (47)$$
with $\mathbf{V}_R(\mathbf{x}) = \Re(\mathbf{V}(\mathbf{x}))$ and $\mathbf{V}_I(\mathbf{x}) = \Im(\mathbf{V}(\mathbf{x}))$. The magnitude, azimuth and elevation of $\mathbf{V}_R$, in the spherical coordinates shown in Fig. 1, are denoted $V_R$, $\theta_{V_R}$ and $\phi_{V_R}$, respectively. Since it is assumed that $\mathbf{V}_R$ is associated with sound localization, we derive the ITD as follows
$$\mathrm{ITD}(\mathbf{x}, \mathbf{V}_R, \theta_H) = \frac{H}{c} \sin(\theta_{V_R} - \theta_H) \cos(\phi_{V_R}), \qquad (48)$$
where it is assumed that the listener's head is oriented towards $\theta_H$ and $\phi_H$ without any roll movement of the listener's head, i.e. the two ears are always in the same horizontal plane. In Eq. (48), the listener's ear separation is $H$ [m].
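The chain from pressure and particle velocity to a predicted ITD can be sketched as follows. The sketch assumes that Eq. (46) reads V = rho c u / p and that Eq. (48) uses the azimuth difference between the real part of V and the head orientation. The air constants, the default ear separation of 0.18 m and the function names are illustrative assumptions, not values from the paper.

```python
import math

RHO = 1.21   # air density [kg/m^3] (assumed value)
C = 343.0    # speed of sound [m/s] (assumed value)


def velocity_vector(p, u):
    """Assumed Eq. (46): V = rho c u / p for complex pressure p and velocity u."""
    return [RHO * C * u_i / p for u_i in u]


def itd(v, theta_h, ear_separation=0.18):
    """Assumed Eq. (48): ITD from the real part of V and the head azimuth theta_h.

    ear_separation is the distance H between the ears in metres (assumed value).
    """
    v_r = [complex(v_i).real for v_i in v]
    azimuth = math.atan2(v_r[1], v_r[0])
    elevation = math.atan2(v_r[2], math.hypot(v_r[0], v_r[1]))
    return ear_separation / C * math.sin(azimuth - theta_h) * math.cos(elevation)
```

For a single plane wave travelling along x1, the velocity vector reduces to the unit vector e1, so the ITD vanishes when the head azimuth equals the azimuth of the real part of V and reaches H/c in magnitude when the head is turned by a quarter turn.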
For the higher frequency range, head diffraction has a strong effect, which makes the interaural level difference (ILD) one of the dominating localization cues [27]. Accordingly, the following energy vector is a more relevant predictor of sound localization above 700 Hz [27],
$$\mathbf{E} = \frac{\sum_{l=1}^{L} \mathbf{n}_l |q_l|^2}{\sum_{l=1}^{L} |q_l|^2}, \qquad (49)$$
or
$$\mathbf{E} = R_E \left[ \cos(\theta_E)\cos(\phi_E)\,\mathbf{e}_1 + \sin(\theta_E)\cos(\phi_E)\,\mathbf{e}_2 + \sin(\phi_E)\,\mathbf{e}_3 \right], \qquad (50)$$
where $0 \le R_E \le 1$ and $\theta_E$, $\phi_E$ are the spherical components of the energy vector. Note that the energy vector as defined in Eq. (49) can only predict sound localization at the origin of the coordinate system.
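A minimal sketch of Eqs. (49) and (50) follows, assuming the unit vectors n_l are given as Cartesian triples and that the azimuth is measured in the x1-x2 plane with the elevation measured from that plane. The function names are hypothetical.

```python
import math


def energy_vector(directions, q):
    """Eq. (49): mean of the unit vectors n_l weighted by |q_l|^2."""
    weights = [abs(ql) ** 2 for ql in q]
    total = sum(weights)
    return [
        sum(w * n[i] for w, n in zip(weights, directions)) / total
        for i in range(3)
    ]


def spherical_components(e):
    """Eq. (50): radius R_E, azimuth and elevation of the energy vector."""
    radius = math.sqrt(sum(c * c for c in e))
    azimuth = math.atan2(e[1], e[0])
    if radius > 0.0:
        elevation = math.asin(max(-1.0, min(1.0, e[2] / radius)))
    else:
        elevation = 0.0  # angles are arbitrary for a null vector
    return radius, azimuth, elevation
```

A single plane wave gives R_E = 1, while two plane waves of equal magnitude and opposite directions give R_E = 0 whatever their phases, because only |q_l|^2 enters the sum.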
3.6. Diffuseness field and average diffuseness
By combining the intensity field (Eq. (32)) and the
energy density field (Eq. (36)), Merimaa and Pulkki
[13, 14, 15, 16] introduced the definition of diffuseness
for a single frequency and a single point in space. For
any point $\mathbf{x}$, it is possible to write the diffuseness field as follows:
$$\psi(\mathbf{x}) = 1 - \frac{\|\mathbf{I}(\mathbf{x})/c\|_2}{E(\mathbf{x})}. \qquad (51)$$
The diffuseness varies between zero and unity. Theoretically, $\psi(\mathbf{x}) = 1$ is expected in a completely diffuse field, while $\psi(\mathbf{x}) = 0$ is expected in a purely propagative field. One of the issues that arises with the diffuseness $\psi(\mathbf{x})$ is that it mostly depends on the sound intensity. Therefore, any situation that leads to a null intensity will be detected as a diffuse sound field. This can arise for two plane waves of identical amplitude propagating in opposite directions, which produce a standing wave. In this case, the net energy flow is zero and the intensity field is null. This situation is easily generalized to any sound field made of plane waves propagating in opposite directions with similar amplitudes. Therefore, the diffuseness, as defined above, cannot distinguish between a standing wave pattern and a diffuse sound field. This will be illustrated in Sec. 3.7.
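This limitation can be reproduced with a one-dimensional sketch. Eqs. (32) and (36) are not reproduced in this section, so the code assumes the standard time-averaged active intensity, Re(p u*)/2, and energy density, |p|^2/(4 rho c^2) + rho |u|^2/4; the air constants and function names are also assumptions.

```python
import cmath
import math

RHO = 1.21   # air density [kg/m^3] (assumed value)
C = 343.0    # speed of sound [m/s] (assumed value)


def fields_1d(a, b, k, x):
    """Pressure and particle velocity of two opposite plane waves along x1.

    a and b are the complex amplitudes of the +x1 and -x1 travelling waves.
    """
    forward = a * cmath.exp(-1j * k * x)
    backward = b * cmath.exp(1j * k * x)
    return forward + backward, (forward - backward) / (RHO * C)


def diffuseness(p, u):
    """Diffuseness 1 - |I| / (c E) with the assumed intensity and energy density."""
    intensity = 0.5 * (p * u.conjugate()).real
    energy = abs(p) ** 2 / (4.0 * RHO * C ** 2) + RHO * abs(u) ** 2 / 4.0
    return 1.0 - abs(intensity) / (C * energy)
```

Calling `diffuseness(*fields_1d(1.0, 0.0, 2 * math.pi, 0.13))` gives zero for the single travelling wave, whereas equal amplitudes for both waves give one at every position, although the field is a pure standing wave and not a diffuse field.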
Another limitation of this definition of the diffuseness field is that it may not be appropriate for a diffuse sound field, since the local energy density $E(\mathbf{x})$ tends to vary with position for a harmonic diffuse sound field (see Sec. 3.2). This strong variation of the local energy density, in marked contrast with the spatially uniform local average energy density of the theoretical harmonic diffuse sound field [25], makes it difficult to use the local diffuseness $\psi(\mathbf{x})$ as a quantifier of the overall anisotropy or diffusion of the sound field over the SFE area.
Again, for the discrete set of points $\mathbf{x}_n$, we can also introduce a discrete average
$$\langle \psi(\mathbf{x}) \rangle_N = \frac{1}{N} \sum_{n=1}^{N} \psi(\mathbf{x}_n). \qquad (52)$$
In subjective terms, the diffuseness is often related to the
listener envelopment or the sensation of surrounding and
enveloping sound.
3.7. Theoretical test cases
In order to evaluate the capability of the previous characterization metrics (computed on the basis of SFE) to distinguish or characterize several types of sound fields, several test cases are reported: a single source in free field, multiple sources in free field, a standing wave and a diffuse sound field. As the test cases are presented, the relevance of these metrics is discussed for the characterization of sound environments in vehicles or other situations. Finally, the quantifiers most able to distinguish between these archetypical test cases will be identified. As will be shown, two simple scalar metrics are the most appropriate for a classification of such sound field components. It is also made clear that other field metrics are useful to visualize and understand the behavior of the sound field over a large area.
3.7.1. Single source in free field
For this first test case, the SFE results reported in Fig. 4 are used. This corresponds to a single dipole of strength 0.1 [5] in free field. The intensity field computed at N = 625 locations in the horizontal plane is shown in Fig. 6(a). Clearly, the intensity field is stronger in the vicinity of the exact dipole position. Also note that the average intensity $\langle \mathbf{I}(\mathbf{x}) \rangle_N$ (computed from the N SFE points, see Eq. (35)) is correctly oriented. The DOA field is shown in Fig. 6(b). This result clearly highlights the effectiveness of the SFE DOA vector as a predictor of perceived sound localization in a free-field situation over the entire SFE region. The average DOA orientation, although slightly different from the average intensity orientation, is also correctly aligned. The slight orientation difference between the average intensity and the average DOA arises because all N points contribute equally to the average DOA, whereas the contribution of each of the N points to the average intensity is proportional to the local intensity magnitude. Also, DOA more efficiently predicts the perceived direction of the incoming sound. Note that proper intensity and DOA results are not expected outside the effective SFE region shown by the contour lines in Figs. 6(a) and 6(b).
The corresponding energy density $E(\mathbf{x})$ and diffuseness $\psi(\mathbf{x})$ fields are shown in Figs. 7(a) and 7(b). The energy density is confined to the vicinity of the true dipole position. Moreover, one notes that the local diffuseness is zero nearly everywhere in the SFE region except along the dipole null-axis. This is expected since the null pressure observed on the dipole null-axis makes the intensity null in this region. Therefore, the diffuseness approaches unity. This result explains the observed DOA in that region. Indeed, a closer look at the DOA (Fig. 6(b)) along the dipole null-axis shows that the DOA varies strongly with position in that area. This variation is an artifact: since the intensity approaches zero in that region, the DOA is not defined ($\mathbf{n}_{DOA} \to 0/0$). Interestingly, the DOA clearly goes through strong or erroneous variations when it crosses the limit of the effective SFE region illustrated by the contour lines in Fig. 6(b). The local variation of the DOA also suggests a potential local sensation of diffuseness, which also corresponds to actual perception on the null-axis of a dipole in free field. This will be confirmed by the velocity vectors.

Fig. 6: (a): Intensity field $\mathbf{I}(\mathbf{x})$ (Eq. (32)) [W/m2], average intensity vector $\langle \mathbf{I}(\mathbf{x}) \rangle_N$ [W/m2], (b): direction-of-arrival $\mathbf{n}_{DOA}(\mathbf{x})$ (Eq. (34)) and average DOA $\langle \mathbf{n}_{DOA} \rangle_N$ (N = 625) for a single dipole in free field for the SFE shown in Fig. 4. The microphone array is shown in light grey. The average vectors (computed for the SFE points) are centered at the origin and shown as a large arrow. Local SFE errors are shown as contour lines (see Fig. 4 for more details). Panel annotations: Max(|I(x)|) = 0.087649 W/m2, average intensity magnitude 0.0067485; axes x1 and x2 are normalized by the acoustical wavelength.
The directional pressure $p_l$ and the energy vector $\mathbf{E}$ obtained from SFE are shown in Fig. 8(a) as a spherical plot in linear scale (radius) and logarithmic scale (color) [dB ref 1]. Clearly, the directional pressure, which is supposed to predict the perceived sound direction above 700 Hz, is precise and well aligned with the energy vector, which points from the origin of the coordinate system towards the dipole. The directional energy and the energy vector are shown in Fig. 8(b). As expected from the directional energy definition (Eq. (39)), the directional energy is much more sharply focused than the directional pressure, since it scales with the squared amplitude.
The velocity vector $\mathbf{V}(\mathbf{x})$, the real part of which is a predictor of sound source localization below 700 Hz, is reported in Fig. 9. As expected from the definition of the velocity vector (see Eq. (46)), the orientation of the real part of the velocity vector matches the orientation of the DOA vector (see Eq. (34)). In addition, the imaginary part of the velocity vector highlights the regions where the sound is perceived as diffuse or not localized. Clearly, for the reported test case, the imaginary part of the velocity vector is non-negligible on the dipole null-axis, the region where the diffuseness $\psi(\mathbf{x})$ approaches one (see Fig. 7(b)). Therefore, as shown by this example, the velocity vector is an interesting metric since it combines the information carried by the DOA vector $\mathbf{n}_{DOA}(\mathbf{x})$ and the diffuseness $\psi(\mathbf{x})$. The ITD predicted from the real part of the velocity vector is shown in Fig. 10, where one clearly notes the transition from negative to positive ITD when the listener passes from one side of the sound source to the other while the head azimuth is fixed to the angle of the energy vector. Therefore, SFE seems to correctly predict the velocity vector and the ITD in the SFE region where the local SFE errors are low.

Fig. 7: (a): Energy density $E(\mathbf{x})$ (Eq. (36)) [J/m3]104, average energy $\langle E(\mathbf{x}) \rangle_N$ [J/m3], (b): diffuseness $\psi(\mathbf{x})$ (Eq. (51)) and average diffuseness $\langle \psi(\mathbf{x}) \rangle_N$ (N = 14400) for a single dipole in free field for the extrapolated sound field shown in Fig. 4.

Fig. 8: Spherical plots of (a): the directional pressure $p_l(\theta_l, \phi_l)$ (linear (radius) and dB ref 1 (color) scales), (b): the directional energy density $E_l$ (linear (radius) and dB ref 1 (color) scales) and the energy vector $\mathbf{E}$ (shown as a large arrow) for a single dipole in free field for the extrapolated sound field shown in Fig. 4.

Fig. 9: Real (top) and imaginary (bottom) parts of the velocity vector $\mathbf{V}(\mathbf{x})$ (Eq. (46)) for a single dipole in free field for the extrapolated sound field shown in Fig. 4. Local SFE errors are shown as contour lines (see Fig. 4 for more details). Panel annotation: Max(|V(x)|) = 36.289; axes x1 and x2 are normalized by the acoustical wavelength.

Fig. 10: Predicted ITD [ms] (Eq. (48)) for a single dipole in free field for the extrapolated sound field shown in Fig. 4. The listener head orientation is fixed to $\theta_H = \theta_E$ over the entire SFE region. The head orientation is shown as a large black arrow. Local SFE errors are shown as contour lines (see Fig. 4 for more details).

Metrics                                    Values
Deviation of E(x) ($\sigma_N$)             110.3190 %
Directional diffusion ($d$)                13.6152 %
Directivity factor ($Q$)                   0.2920
Directivity index ($DI$)                   -5.3462 dB ref 1
Energy vector azimuth ($\theta_E$)         1.0985 rad
Energy vector elevation ($\phi_E$)         0.0014 rad
Energy vector radius ($R_E$)               0.9174

Table 1: Scalar metrics for the single dipole in free field (see Fig. 4) with N = 14400.

The scalar metrics related to this test case are reported in Tab. 1. We recall that these scalar metrics are directly derived from the plane wave amplitudes obtained from the inverse problem solution: these scalar metrics are representative of the sound field as a whole. Interestingly, even if they are intuitively understood as origin-centered, they are in fact the same for any SFE point $\mathbf{x}$. Indeed, all the scalar metrics, except the directional pressure $p_l$, are based on absolute and squared values of the plane wave amplitudes $\mathbf{q}$. Therefore, the phase shift of the solution for a translation of the origin (Eq. (6)) does not affect the computed scalar metrics. This is an important property of these scalar metrics.
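The invariance can be illustrated with a short sketch. Eq. (6) is not reproduced in this section, so the code assumes that a translation of the origin by x0 multiplies each plane wave amplitude by a unit-modulus phase factor exp(-i k n_l . x0). That assumed form, the wavenumber of 2 pi (lengths in wavelengths) and the function names are hypothetical.

```python
import cmath
import math


def translate_origin(directions, q, x0, k=2.0 * math.pi):
    """Assumed effect of an origin shift x0: a pure phase factor on each q_l."""
    return [
        ql * cmath.exp(-1j * k * sum(n_i * x_i for n_i, x_i in zip(n, x0)))
        for n, ql in zip(directions, q)
    ]


def directivity_factor(q):
    """Largest squared amplitude divided by the sum of squared amplitudes."""
    return max(abs(ql) for ql in q) ** 2 / sum(abs(ql) ** 2 for ql in q)


def energy_vector_radius(directions, q):
    """Norm of the |q_l|^2-weighted mean of the unit vectors n_l (Eq. (49))."""
    weights = [abs(ql) ** 2 for ql in q]
    total = sum(weights)
    e = [sum(w * n[i] for w, n in zip(weights, directions)) / total for i in range(3)]
    return math.sqrt(sum(c * c for c in e))
```

Because both metrics depend only on |q_l|, they return the same values before and after `translate_origin`, whereas the complex amplitudes themselves change.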
For this reported test case, the values of the directivity factor and directivity index suggest a moderately directive sound field. This is not really the case; therefore, these two quantifiers might not be the most appropriate or should, at least, be modified for the case of a single source in free field. For comparison purposes, with $L = 642$ plane waves in the source distribution of the direct problem, a single plane wave would give $Q = 1/1 = 1$ and $DI = 0$ dB ref 1, while a totally diffuse sound field with $L = 642$ equal-amplitude plane waves would give $Q = 1/642 = 0.0016$ and $DI = -28.0754$ dB ref 1. However, as will be shown for the following, more immersive sound field situations, the $Q$ and $DI$ values reported in Tab. 1 are in the highest observable range.

Comparison of these $Q$ and $DI$ directivity metrics with the directional diffusion $d$ suggests that the latter is a more effective directivity metric. Indeed, $d$ is much closer to its lowest value (0 %) than $Q$ (or $DI$) is to its highest value. Moreover, a high standard deviation of the energy density ($\sigma_N$) suggests a heterogeneous distribution of the energy density through space, which supports the idea that the sound field is anything but diffuse.

From the scalar metrics, one also notes that the energy vector magnitude $R_E = 0.9174$ is relatively high. This is a direct consequence of the definition of the energy vector $\mathbf{E}$ (Eq. (49)), which implies that the energy vector is high if and only if the directional pressure shows a strong spherical polarity. Indeed, in the case of a spherically symmetrical $q_l$, the vector sum in Eq. (49) is null. This will be further discussed for the upcoming test cases.
3.7.2. Two sources in free field
This test case corresponds to a free-field 2-channel
stereophonic sound reproduction situation. As shown in
Fig. 11, two in-phase monopole sources are located in
x1/k=1 and x2/k=2. The monopole amplitude[25] of each source is 0.5.
For this test case, the SFE results are presented in Fig. 11. The intensity field computed at N = 625 points in the horizontal plane is shown in Fig. 12(a). Clearly, the intensity field is stronger in the vicinity of the monopole positions. Also note that the average intensity is correctly oriented in terms of stereophonic sound perception. The DOA field is shown in Fig. 12(b). This result clearly highlights the effectiveness of the SFE-based DOA vector as a predictor of perceived sound localization in a stereophonic free-field situation (two coherent sound sources). This prediction is valid over the entire effective SFE region. Indeed, for the extended central sweet spot (x1/k=0, x2 2), the reported test case corresponds exactly to stereophonic listening with a centered phantom image created by level-difference stereophony. The average DOA is also correctly oriented in that sweet spot. For an off-axis listening position, the phantom image predicted by the DOA and SFE deviates towards the closest sound source.
The energy density field $E(\mathbf{x})$ and the diffuseness field $\psi(\mathbf{x})$ for the same extrapolated sound field are shown in Figs. 13(a) and 13(b). Again, the energy density is well localized in the vicinity of the exact source positions. One notes that the local diffuseness is zero nearly everywhere in the SFE region except in the region where a wrong or imprecise phantom image position is expected from two-channel stereophonic systems (for a given frequency). Source localization cues from the local diffuseness $\psi(\mathbf{x})$ agree with the perceived sound direction predicted from the DOA in these regions (see Fig. 12(b)). Indeed, a closer look at the DOA (Fig. 12(b)) in this region reveals a DOA that varies strongly with position.
The directional pressure, directional energy and energy
vectors are presented in Figs. 14(a) and 14(b). One notes
that the energy vector is a good predictor of sound lo-
calization for a listener at the center of the array while
the directional pressure and energy density reveal both
the presence of the two real sources and the presence of
the perceived central sound image. The complex velocity
vector field is shown in Fig. 16. Again, we note that the
real part of the velocity vectors predict both the sound
localization created by the stereophonic image and the
diffuse curved-regions. The ITD predicted from the real
part of the velocity vector is shown in Fig. 15 where one
clearly notes that the ITD is zero in the central region.
Moreover, one can observe the expected passage from
negative to positive ITD on the left and on the right sides
of the central position. Again, SFE seems to correctly
predict the velocity vector and the ITD for the reported
test case.
The scalar metrics for this test case are presented in Tab. 2. The directivity factor $Q$ and the corresponding directivity index $DI$ are lower than for the case of a single source in free field. These values suggest a less directive sound field, which is the case. The directional diffusion $d$ also gives a higher value. The deviation of the energy density is very similar to the value obtained for the single dipole test case. Therefore, it seems that this deviation $\sigma_N$ might be a good indicator of free-field situations with localized sound sources. Again, the energy vector azimuth and elevation angles agree with the expected sound perception.

Fig. 11: (a): Real part of the original sound field created by two monopole sound sources. (b): Real part of the extrapolated sound field using a plane wave source distribution obtained from Eq. (20) with a regularization parameter of 0.001. The monopoles are marked as black and white dots. The white contour line indicates the region of 0.001 of local quadratic SFE error and the black contour line indicates the region of 0.1 of local quadratic SFE error.

Fig. 12: (a): Intensity field $\mathbf{I}(\mathbf{x})$ (Eq. (32)) [W/m2], average intensity vector $\langle \mathbf{I}(\mathbf{x}) \rangle_N$ [W/m2], (b): direction-of-arrival $\mathbf{n}_{DOA}(\mathbf{x})$ (Eq. (34)) and average DOA vector $\langle \mathbf{n}_{DOA}(\mathbf{x}) \rangle_N$ (N = 625) for two monopoles in free field for the SFE shown in Fig. 11. The microphone array is shown in light grey. The average vectors (computed for the SFE points) are centered at the origin and shown as a large arrow. Local SFE errors are shown as contour lines (see Fig. 11 for more details). Panel annotations: Max(|I(x)|) = 0.15636 W/m2, average intensity magnitude 0.011408; axes x1 and x2 are normalized by the acoustical wavelength.

Fig. 13: (a): Energy density $E(\mathbf{x})$ [J/m3] 104 (Eq. (36)), average energy $\langle E(\mathbf{x}) \rangle_N$ [J/m3], (b): diffuseness $\psi(\mathbf{x})$ (Eq. (51)) and average diffuseness $\langle \psi(\mathbf{x}) \rangle_N$ (N = 14400) for two monopoles in free field for the extrapolated sound field shown in Fig. 11.

Fig. 14: Spherical plots of (a): the directional pressure $p_l$ (linear (radius) and dB ref 1 (color) scales), (b): the directional energy density $E_l$ (linear (radius) and dB ref 1 (color) scales) and the energy vector $\mathbf{E}$ for the two monopole sources in free field (see Fig. 11). The energy vector is shown as a large arrow (aligned with x2 and fused with the main lobe).

Fig. 15: Predicted ITD [ms] (Eq. (48)) for the two monopole sources in free field (see Fig. 11). The listener head orientation is fixed to $\theta_H = \theta_E$ over the entire SFE region. The head orientation is shown as a large black arrow.

Metrics                                    Values
Deviation of E(x) ($\sigma_N$)             108.2139 %
Directional diffusion ($d$)                22.9553 %
Directivity factor ($Q$)                   0.1997
Directivity index ($DI$)                   -6.9959 dB ref 1
Energy vector azimuth ($\theta_E$)         1.5570 rad
Energy vector elevation ($\phi_E$)         0.0261 rad
Energy vector radius ($R_E$)               0.7387

Table 2: Scalar metrics for the two monopole sources in free field (see Fig. 11) with N = 14400.
Fig. 16: Real (top) and imaginary (bottom) parts of the velocity vector $\mathbf{V}(\mathbf{x})$ (Eq. (46)) for the two monopole sources in free field (see Fig. 11). The monopoles are marked as black and white dots. The microphone array is shown in light grey. Local SFE errors are shown as contour lines (see Fig. 11 for more details). Panel annotation: Max(|V(x)|) = 35.0474; axes x1 and x2 are normalized by the acoustical wavelength.

3.7.3. Standing wave in rectangular coordinates
In this case, the sound field is a standing wave in rectangular coordinates created by eight propagating waves in three-dimensional space. Low-frequency standing wave patterns can be found in small closed spaces such as vehicle cabins. Although not very often investigated by the spatial audio community, the identification, characterization and subsequent reproduction of standing wave patterns represent a specific challenge encountered in sound environment reproduction of closed spaces. The reported theoretical case corresponds to an oblique mode of a rigid-walled rectangular cavity. For this test case, the sound field is given by
$$p(\mathbf{x}) = \cos(k_{x_1} x_1) \cos(k_{x_2} x_2) \cos(k_{x_3} x_3), \qquad (53)$$
with $k_{x_1} = 2\pi\cos(\theta_s)\cos(\phi_s)$, $k_{x_2} = 2\pi\sin(\theta_s)\cos(\phi_s)$ and $k_{x_3} = 2\pi\sin(\phi_s)$. For the reported case, the standing wave angles were set to $\theta_s = \pi/7$ and $\phi_s = \pi/6$. The comparison of the original sound field and the SFE results with a regularization parameter of 0.0001 is shown in Fig. 17: SFE is effective over a large region. This is perhaps because the plane wave model used in the direct problem definition is more appropriate for that type of sound field.
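The statement that this oblique mode is made of eight propagating waves follows from writing each cosine as half the sum of two complex exponentials. A minimal numerical check is sketched below, assuming wavenumber components of the form 2 pi cos(theta_s) cos(phi_s), 2 pi sin(theta_s) cos(phi_s) and 2 pi sin(phi_s) with theta_s = pi/7 and phi_s = pi/6 (lengths expressed in wavelengths); the function names are hypothetical.

```python
import cmath
import math
from itertools import product

THETA_S = math.pi / 7
PHI_S = math.pi / 6
K = (
    2.0 * math.pi * math.cos(THETA_S) * math.cos(PHI_S),
    2.0 * math.pi * math.sin(THETA_S) * math.cos(PHI_S),
    2.0 * math.pi * math.sin(PHI_S),
)


def mode_pressure(x):
    """Eq. (53): product of three cosines."""
    return math.cos(K[0] * x[0]) * math.cos(K[1] * x[1]) * math.cos(K[2] * x[2])


def plane_wave_vectors():
    """The eight wave vectors (+-k_x1, +-k_x2, +-k_x3)."""
    return [tuple(s * k for s, k in zip(signs, K)) for signs in product((1, -1), repeat=3)]


def plane_wave_sum(x):
    """Sum of eight plane waves, each of amplitude 1/8."""
    return sum(
        cmath.exp(1j * sum(k_i * x_i for k_i, x_i in zip(kv, x)))
        for kv in plane_wave_vectors()
    ) / 8.0
```

The eight wave vectors all have norm 2 pi and sum to zero, which is why the amplitude-weighted vector sum of Eq. (49) is expected to nearly vanish for this test case.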
As expected, the corresponding intensity field (not shown here) is numerically null over the entire region. The energy density field $E(\mathbf{x})$ and the diffuseness field $\psi(\mathbf{x})$ for the extrapolated sound field are shown in Figs. 18(a) and 18(b). Clearly, the energy distribution over the entire SFE region corresponds to the modal pattern. However, the local diffuseness $\psi(\mathbf{x})$ is one nearly everywhere. On the basis of the definition of the diffuseness $\psi(\mathbf{x})$ (see Eq. (51)), this was to be expected. Indeed, according to that definition, this metric will attribute full diffuseness to a standing wave field since its net intensity is null over the entire SFE domain.
The directional pressure and energy density are shown in
Figs. 19(a) and 19(b). In the second of these two figures,
the identification of the eight propagating waves that cre-
ate the three-dimensional standing wave is clear.
For this standing wave test case, the scalar metrics are reported in Tab. 3. The directivity factor $Q$ (Eq. (44)) is 0.2530 and the corresponding directivity index $DI$ (Eq. (45)) is -5.9682 dB ref 1. These values suggest a moderately directive incident sound field. This is not really the case; therefore, these two quantifiers might not be the most appropriate or should, at least, be modified to detect standing wave patterns. Some indication of the standing-wave nature of the sound field is provided by the fact that the directional diffusion $d$ is very low: 12.3839 %, which is the lowest directional diffusion observed for the four reported test cases. Moreover, the energy vector radius $R_E$ is very small: 0.0379, in comparison with $R_E$ close to 1 for the single dipole and two monopole test cases. Notably, this test case illustrates a very interesting property of the energy vector magnitude. Indeed, as soon as two plane waves of opposite directions share a similar amplitude $|q_l|$, they tend to cancel each other in the computation of the energy vector. Most interesting is the fact that this happens for plane wave distributions such as the one shown in Fig. 19(a) for an oblique standing wave, but also for any other stationary waves such as cylindrical or spherical harmonics. Moreover, this same cancellation also arises for a diffuse sound field where sound energy travels in all directions. Therefore, the energy vector magnitude seems to be a good predictor of directive (few sources in free space) or non-directive (standing waves or partly diffuse sound fields) sound fields. This will be further discussed in the case of the diffuse sound field.

Fig. 17: (a): Real part of the original sound field defined as a standing wave in rectangular coordinates. The dimensionless problem is normalized by the acoustical wavelength. (b): Real part of the extrapolated sound field using a plane wave source distribution obtained from Eq. (20) with a regularization parameter of 0.0001. The white contour line indicates the region of 0.001 of local quadratic SFE error. The nodal lines are shown as dashed black lines.

Fig. 18: (a): Energy density $E(\mathbf{x})$ (Eq. (36)) [J/m3]106, average energy $\langle E(\mathbf{x}) \rangle_N$ [J/m3], (b): diffuseness $\psi(\mathbf{x})$ (Eq. (51)) and average diffuseness $\langle \psi(\mathbf{x}) \rangle_N$ (N = 14400) for a standing wave and for the extrapolated sound field shown in Fig. 17. The nodal lines are shown as dashed lines.

Fig. 19: (a): Directional pressure $p_l$ (linear (radius) and dB ref 1 (color) scales) and (b): directional energy density $E_l$ (linear (radius) and dB ref 1 (color) scales) for a standing wave for the extrapolated sound field shown in Fig. 17.

Metrics                                    Values
Deviation of E(x) ($\sigma_N$)             63.1460 %
Directional diffusion ($d$)                12.3839 %
Directivity factor ($Q$)                   0.2530
Directivity index ($DI$)                   -5.9682 dB ref 1
Energy vector azimuth ($\theta_E$)         1.5727 rad
Energy vector elevation ($\phi_E$)         0.0208 rad
Energy vector radius ($R_E$)               0.0379

Table 3: Scalar metrics for the stationary wave (see Fig. 17) with N = 14400.
3.7.4. Diffuse sound field
For this test case, a harmonic diffuse sound field is created using a limited set of 642 plane waves coming from random directions covering a solid angle of $4\pi$ steradians. Both the amplitude and phase of the plane waves were random. For more details about the definition and properties of harmonic diffuse sound fields, the reader is referred to [25]. The original diffuse sound field and the corresponding SFE result are shown in Fig. 20 for a regularization parameter of 0.0001. Again, the SFE method performs very well over a large effective area, even for the specific case of a diffuse sound field. The sound field characterization metrics, namely sound intensity, DOA, energy density, diffuseness, directional pressure and directional energy density, are shown in Figs. 21(a) to 23(b).
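A plane wave set of this kind can be generated with a few lines of code. The sketch below is only an illustration under stated assumptions: the paper does not specify the random distributions, so directions are drawn uniformly over the sphere, magnitudes uniformly in [0, 1) and phases uniformly in [0, 2 pi), with a fixed seed. The function names are hypothetical, and the metrics are evaluated on the generating distribution rather than on an SFE solution.

```python
import cmath
import math
import random


def random_plane_waves(count, seed=0):
    """Random unit directions and complex amplitudes (assumed distributions)."""
    rng = random.Random(seed)
    directions, amplitudes = [], []
    for _ in range(count):
        z = rng.uniform(-1.0, 1.0)                  # uniform in z gives a uniform sphere
        azimuth = rng.uniform(0.0, 2.0 * math.pi)
        r = math.sqrt(1.0 - z * z)
        directions.append((r * math.cos(azimuth), r * math.sin(azimuth), z))
        phase = rng.uniform(0.0, 2.0 * math.pi)
        amplitudes.append(rng.random() * cmath.exp(1j * phase))
    return directions, amplitudes


def energy_vector_radius(directions, q):
    """Norm of the |q_l|^2-weighted mean of the unit vectors n_l (Eq. (49))."""
    weights = [abs(ql) ** 2 for ql in q]
    total = sum(weights)
    e = [sum(w * n[i] for w, n in zip(weights, directions)) / total for i in range(3)]
    return math.sqrt(sum(c * c for c in e))


def directivity_factor(q):
    """Largest squared amplitude divided by the sum of squared amplitudes."""
    return max(abs(ql) for ql in q) ** 2 / sum(abs(ql) ** 2 for ql in q)
```

For 642 such plane waves, the energy vector radius of the generating distribution is small because contributions from all directions largely cancel, which is the behavior the text attributes to diffuse sound fields. The values obtained from a regularized SFE solution, as reported in the paper, need not coincide with those of the generating distribution.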
Both the sound intensity and the DOA fields shown in
Figs. 21(a) and 21(b) suggest a diffuse situation. Indeed,
the sound intensity average is very low and the DOA and
sound intensity spatial variations are large. This diffuse
character will be supported by the corresponding scalar
metrics.
The acoustical energy density $E(\mathbf{x})$ and the diffuseness $\psi(\mathbf{x})$ are shown in Figs. 22(a) and 22(b), respectively.
Fig. 20: (a): Real part of the original diffuse sound field. The
dimensionless problem is normalized by the acoustical wavelength.
(b): Real part of the extrapolated sound field using a plane wave
source distribution obtained from Eq. (20) with a regularization
parameter of 0.0001. The white contour line marks a local quadratic
SFE error of 0.001 and the black contour line marks a local quadratic
SFE error of 0.1.
[Fig. 21 panels: (a) intensity field, panel title "I(x) [W/m2], Max(|I(x)|) = 0.00085613 W/m2", with an average intensity of 2.7693e-005; (b) DOA field nDOA(x) and its average. Both panels use axes x1 and x2 normalized by the wavelength, spanning -2 to 2.]
Fig. 21: (a): Intensity field $I(x)$ [W/m2] (Eq. (32)), average
intensity vector $\langle I(x) \rangle_N$ [W/m2], (b): DOA
$n_{DOA}(x)$ (Eq. (34)) and average DOA vector
$\langle n_{DOA}(x) \rangle_N$ ($N = 625$) for a diffuse sound field
(SFE shown in Fig. 20). The microphone array is shown in light grey.
The average vectors are centered at the origin and shown as large
arrows. Local SFE errors are shown as contour lines (see Fig. 20 for
more details).
Fig. 22: (a): Energy density $E(x)$ [J/m3] 106 (Eq. (36)), average
energy $\langle E(x) \rangle_N$ [J/m3], (b): diffuseness field
(Eq. (51)) and average diffuseness ($N = 14400$) for a diffuse sound
field (SFE shown in Fig. 20).
Metrics                         Values
Deviation of E(x) (N)           43.7181 %
Directional diffusion (d)       53.9748 %
Directivity factor (Q)          0.1548
Directivity index (DI)          -8.1022 dB ref 1
Energy vector azimuth (E)       2.4254 rad
Energy vector elevation (E)     0.5296 rad
Energy vector radius ($R_E$)    0.0788
Table 4: Scalar metrics for the diffuse sound field (see
Fig. 20) with $N = 14400$.
The energy density distribution is not homogeneous and the
diffuseness goes through strong spatial variations from zero to
unity. By itself, the average diffuseness of 0.43666 suggests a
moderately diffuse sound field, which is not the case. Since the
diffuseness is close to unity at every point $x$ for the stationary
wave (see Fig. 18(b)) but much less than unity for the true diffuse
sound field, the diffuseness might not be the most appropriate and
illustrative quantifier to distinguish a harmonic standing wave from
a harmonic diffuse sound field.
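The behaviour of the diffuseness for a stationary wave can be reproduced with a small numerical sketch. It assumes the common intensity-based definition, one minus the ratio of the active intensity magnitude to the sound speed times the energy density, evaluated pointwise for a harmonic field; Eq. (51) is not reproduced in this section and may differ. The air density, sound speed, unit wavelength and function name are illustrative assumptions.

```python
import numpy as np

RHO, C = 1.2, 343.0  # assumed air density [kg/m3] and sound speed [m/s]

def diffuseness(amplitudes, directions, points, k=2 * np.pi):
    """Pointwise diffuseness of a plane-wave superposition (assumed definition)."""
    a = np.asarray(amplitudes, dtype=complex)            # (L,) complex amplitudes
    n = np.asarray(directions, dtype=float)              # (L, 3) unit propagation vectors
    x = np.asarray(points, dtype=float)                  # (M, 3) evaluation points
    phase = np.exp(-1j * k * x @ n.T)                    # (M, L)
    p = phase @ a                                        # pressure, (M,)
    v = (phase * a) @ n / (RHO * C)                      # particle velocity, (M, 3)
    intensity = 0.5 * np.real(p[:, None] * np.conj(v))   # active intensity, (M, 3)
    energy = 0.25 * RHO * np.sum(np.abs(v) ** 2, axis=1) + np.abs(p) ** 2 / (4 * RHO * C ** 2)
    return 1.0 - np.linalg.norm(intensity, axis=1) / (C * energy)

points = np.array([[0.1, 0.0, 0.0], [0.3, 0.2, 0.0]])
psi_single = diffuseness([1.0], [[1, 0, 0]], points)
psi_standing = diffuseness([1.0, 1.0], [[1, 0, 0], [-1, 0, 0]], points)
print(psi_single, psi_standing)
```

With this definition, a single plane wave yields a diffuseness of zero while two opposing plane waves of equal amplitude yield unity everywhere, because the active intensity vanishes although the energy density does not. This is consistent with the observation that the diffuseness alone cannot separate a standing wave from a diffuse field.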
The directional pressure and directional energy density are shown in
Figs. 23(a) and 23(b). Since the energy vector E is very small, it is
not shown in these two figures. By comparison with the previously
reported test cases, these directional quantifiers show a
distribution that covers the $4\pi$ steradian solid angle more
uniformly. However, one can observe that the directional energy shows
some sort of principal directions, something that would not be the
case for a true description of a diffuse sound field. This
heterogeneity explains why the scalar metrics reported in the
subsequent paragraphs do not reach the expected ideal, extreme
theoretical values. In fact, due to the finite size and geometry of
the array, it might not be possible to obtain an entirely filled
plane wave distribution for a true diffuse sound field.
The scalar metrics for the diffuse sound field test case, shown in
Tab. 4, are very relevant. Attention will be directed to the two most
relevant scalar metrics. First, one notes that the directional
diffusion $d$ is higher than for all the other test cases. Second,
the energy vector radius $R_E$ is, as expected, very low, hence
suggesting a poorly perceived sound source position. As will be shown
in the next section, these two scalar metrics can be used to derive a
classification tree that might be able to distinguish between the
archetypical situations of a precise sound source, a standing wave
pattern and a diffuse sound field.
Fig. 23: (a): Directional pressure $p_l$ (linear (radius) and dB ref 1
(color) scale) and (b): directional energy density $E_l$ (linear
(radius) and dB ref 1 (color) scale) for a diffuse sound field (SFE
shown in Fig. 20). Both panels are drawn on the $x_1$, $x_2$, $x_3$
axes.
3.8. Transition between test cases
The previous sections and subsections highlighted the effectiveness
of the field and scalar metrics in predicting various characteristics
of the sound field computed by SFE. Although physical metrics such as
the sound intensity field, DOA field, diffuseness field and expected
ITD are very useful, they provide a very detailed description of the
sound field. Sometimes, it is more useful to classify or characterize
the sound field in broader categories, atoms or terms, so that the
most appropriate sound reproduction technique can be selected for
that sound environment component. In this section, we present the
transitions between the archetypical test cases reported earlier: it
will be shown that a simple classification could be based on two
scalar metrics, namely the energy vector magnitude $R_E$ and the
directional diffusion $d$.
Transitions between the single dipole and the two-monopole case
(hereafter named #1 to #2), the single dipole and the standing wave
(#1 to #3), the standing wave and the diffuse sound field (#3 to #4)
and the single dipole and the diffuse sound field (#1 to #4) were
computed for 20 interpolation points. For the interpolation, the
original test cases were first scaled to ensure a similar vector
2-norm of the measured pressure $p(x_m)$ at the microphone array.
Note that the transitions between the two monopoles in free field and
the standing wave or diffuse sound field are not reported. Indeed, it
was sufficient to keep a single free-field case, namely the single
dipole, for the transition between free field and standing wave or
diffuse sound field. Otherwise, the transition graphics would have
been too dense. Next, for the interpolation, a linear amplitude fade
is applied between the two limiting test cases and the inverse
problem is solved for each of the interpolation points. The energy
vector and the directional diffusion are then computed directly as
above. Note that for the interpolations the regularization parameter
in Eq. (22) is chosen as follows: 0.01 for #1 to #2, and 0.0001 for
#1 to #3, #1 to #4 and #3 to #4.
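The scaling and fading steps of the interpolation can be sketched as follows. This is only an illustration under stated assumptions: the second case is rescaled to the 2-norm of the first, the 20 interpolation points include both end points, and the function name `fade_between` and the toy pressure vectors are hypothetical. Solving the regularized inverse problem for each faded pressure vector is left out.

```python
import numpy as np

def fade_between(p_a, p_b, n_points=20):
    """Linear amplitude fade between two microphone pressure vectors."""
    p_a = np.asarray(p_a, dtype=complex)
    p_b = np.asarray(p_b, dtype=complex)
    # Scale the second case so that both vectors share the same 2-norm.
    p_b = p_b * (np.linalg.norm(p_a) / np.linalg.norm(p_b))
    alphas = np.linspace(0.0, 1.0, n_points)              # fade from case A to case B
    faded = np.array([(1.0 - a) * p_a + a * p_b for a in alphas])
    return alphas, faded

# Toy pressure vectors standing in for two limiting test cases.
case_a = np.array([1.0 + 0.0j, 0.0, 0.0])
case_b = np.array([0.0, 2.0j, 0.0])
alphas, faded = fade_between(case_a, case_b)
print(faded.shape)
```

Each row of `faded` would then be passed to the inverse problem, and the energy vector magnitude and directional diffusion computed from the resulting plane wave distribution.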
The results, in terms of energy vector magnitude and directional
diffusion, are shown in Fig. 24.
Let us first examine the transition #1 to #2, which goes from the
single dipole to the two-monopole case. As one could expect, both
cases occupy a similar region of the $d$-$R_E$ plane and, most
notably, they share a very high $R_E$, which indicates a very
directive sound field.
Next, consider the transition #1 to #3, which goes from the single
dipole to the standing wave. The major difference between the two
limiting points of this transition is along $R_E$: the standing wave
involves, as explained earlier, a very low $R_E$. When the transition
fade is at 50 %, $R_E \approx 0.6$. The curvature (along axis $d$) of
this transition curve is easily explained by the fact that the stand-
simply assume that a point that tends toward this limiting corner
will approach a single propagating wave in free field. The other
limiting case occurs at $R_E = 0$, $d/100 = 1$ (shown as a thick + in
Fig. 24). This case is even more difficult to reach, both physically
and as a solution of the inverse problem. Indeed, it is only possible
if the plane wave amplitudes are all exactly equal in magnitude.
Because of their imperfections, the array and the SFE algorithm
cannot reach such an ideal case. This explains why the directional
diffusion seems to be limited to 0.6 (that is, $d \approx 60$ %) for
the reported cases. The last possible limiting case is at $R_E = 0$,
$d/100 = 0$ (shown as a thick marker in Fig. 24). This could only
occur if two plane waves of exactly the same amplitude were exactly
facing each other. That would be the indication of an axial mode. The
other corner of the classification plot, $R_E = 1$, $d/100 = 1$,
corresponds to an impossible case of both spherically-polarized and
spherically-uniform plane wave amplitudes. Therefore, one should not
expect any points in the top right triangular part of the plot.
Before actually proceeding to the experimental valida-
tion of the proposed SFE and SFC methods, the sug-
gested classification criteria must be discussed.
First, one should be aware that this is a preliminary proposal and
that it could be refined. Indeed, it is easy to object that the
methodology behind the definition of this classification is some sort
of manual multidimensional analysis. We note that a systematic
multidimensional analysis could be performed. However, the simple
classification, as proposed in this paper, has the great advantage of
involving two simple metrics rather than a linear combination of
metrics. Consequently, the classification is easy to understand and
interpret.
Second, it could be objected that the proposed transition values of
$R_E$ and $d$ that circumscribe the directive, non-directive,
stationary and diffuse regions are derived for specific test cases,
for a specific microphone array and for dimensionless simulations.
Further verifications for various cases should be performed. In fact,
we suggest that these transition values should be verified for other
microphone arrays. In all cases, the experimental results will show
that the proposed classification and transition values are convenient
for the reported experiments.
3.10. Sound-field type scores
In real applications, a sharp classification tree might not always be
the most appropriate approach to quantify the measured sound field.
Therefore, we propose the introduction of sound-field type scores.
The free-field score is given by
$$S_{ff} = R_E. \qquad (54)$$
The modal or standing-wave score is given by
$$S_m = (1 - R_E)(1 - d/60)^2, \qquad (55)$$
and the diffuse-field score is given by
$$S_d = (1 - R_E)(d/60)^2. \qquad (56)$$
As one notes, the division of the directional diffusion by 60 is
inspired by the observations that stem from the transition plot shown
in Fig. 24. Further refinements of these scores are possible. The
advantage of the scores over a sharp classification scheme is that
they can deal with intermediate cases. Moreover, for broadband noise
or signals, it would be possible to plot the scores as a function of
frequency. This is the topic of current verifications.
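Eqs. (54) to (56) are simple enough to evaluate directly. The short sketch below applies them to the energy vector radius and directional diffusion reported for the diffuse sound field in Tab. 4, with $d$ expressed in percent; the function name and dictionary keys are illustrative choices.

```python
def sound_field_scores(r_e, d_percent):
    """Sound-field type scores of Eqs. (54)-(56); d_percent is d in percent."""
    s_ff = r_e                                          # free-field score, Eq. (54)
    s_m = (1.0 - r_e) * (1.0 - d_percent / 60.0) ** 2   # modal score, Eq. (55)
    s_d = (1.0 - r_e) * (d_percent / 60.0) ** 2         # diffuse-field score, Eq. (56)
    return {"free_field": s_ff, "modal": s_m, "diffuse": s_d}

# Values taken from Tab. 4 (diffuse sound field test case).
scores = sound_field_scores(r_e=0.0788, d_percent=53.9748)
best = max(scores, key=scores.get)
print(scores, best)
```

For these inputs the diffuse-field score is the largest, at roughly 0.75, while the free-field and modal scores both stay below 0.1.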
To illustrate the capability of these sound-field scores to deal with
the reported test cases, Fig. 26 shows the scores of the four
transitions already reported in Fig. 24. For Fig. 26(a), the
transition from the single dipole to the two-monopole case
systematically gives $S_{ff}$ as the highest score. For the
transition from the single dipole to the diffuse sound field, the
scores shown in Fig. 26(b) are able to distinguish the free-field and
diffuse-field situations. For the two other transitions reported in
Figs. 26(c) and (d), the scores are also good detectors of the
standing waves and diffuse sound field.
4. CONCLUSION
The aim of this paper was twofold: 1) develop and describe a method
of spatial sound field extrapolation based on microphone array
measurements of arbitrary geometry and 2) develop and define a sound
field characterization method and a sound field classification based
on known objective and subjective metrics.
To achieve SFE, a recently developed method was proposed and further
analyzed. This method is based on the combination of classical
least-squares inverse problems in matrix form with a beamforming
regularization matrix used as a discrete smoothing norm in the
regularization. In this paper, we extended the analysis of this
regularization method, which was compared to the application of a
beamforming scaling matrix in the inverse problem
Page 27 of 32
-
8/11/2019 2011 Beamforming regularization, scaling matrices and inverse problems for sound field extrapolation and charact
28/32
Gauthier et al. Sound field extrapolation and characterization I
[Fig. 26 panels: bar charts of the scores $S_{ff}$, $S_m$ and $S_d$ (bars labelled ff, m and d; vertical axis from 0 to 1) for (a) transition from #1 to #2, (b) transition from #1 to #4, (c) transition from #3 to #4 and (d) transition from #1 to #3, each at 0 %, 50 % and 100 %.]
Fig. 26: Sound type scores ($S_{ff}$, $S_m$, $S_d$) for the four
transitions shown in Fig. 24. The scores are only shown for 0 %, 50 %
and 100 % (from left to right) of the transitions; these points
correspond to the markers in Fig. 24. The highest scores are
highlighted as black bars.
with classical Tikhonov regularization. On the basis of the
generalized singular value decomposition of the transfer matrix and
beamforming regularization matrix pair and the singular value
decomposition of the transfer matrix, it was shown that the
beamforming regularization matrix approach is equivalent to an
inverse problem where the original transfer matrix is scaled by a
beamforming scaling matrix. This is a new result that clarifies the
original proposal on the use of a beamforming regularization
matrix [7].
Once SFE was achieved, the inverse problem solution was used to
evaluate different sound field metrics, namely: energy density, sound
intensity, direction of arrival, diffuseness, velocity vector, energy
vector, directional energy, interaural time difference, incident
directivity factor, incident directivity index and directional
diffusion. Using theoretical simulations, they were compared in terms
of their capability to simply characterize archetypical sound field
types: a small number of sources in free-field situations, standing
wave patterns