3d audio: pt3

3D Audio: pt3 Sound synthesis and spatial processing Politecnico di Milano – Polo Regionale di Como

Upload: buihanh

Post on 01-Jan-2017


Page 1: 3D Audio: pt3

3D Audio: pt3

Sound synthesis and spatial processing Politecnico di Milano – Polo Regionale di Como

Page 2: 3D Audio: pt3

SUMMARY
•  Wavefield Synthesis
   •  strict foundation on the basic laws of acoustical wave propagation
•  Ambisonics
   •  represents the sound field by an expansion into three-dimensional basis functions
•  Microphone Acquisition

Page 3: 3D Audio: pt3

WAVEFIELD SYNTHESIS

Page 4: 3D Audio: pt3

Wavefield synthesis
•  Basic idea:
   •  reconstruct the soundfield within a volume by recreating the wavefront on a closed surface that contains it (through spatial sampling)

Page 5: 3D Audio: pt3

Wavefield synthesis
•  Huygens principle:
   “every point of a wavefront can be considered as the source of an elementary wave that interacts with the surrounding environment”
•  The wavefield produced by a primary source Ψ can be reconstructed using a distribution of secondary sources, for example by placing speakers at P1, P2, …, PM

Page 6: 3D Audio: pt3

Wavefield synthesis

Page 7: 3D Audio: pt3

Wavefield synthesis
•  Let:
   •  Q(z,ω) be the Fourier transform of the emitting source;
   •  z be the vector of spherical polar coordinates (r,θ,φ), also referred to as the listener position;
   •  P(z,ω) be the Fourier transform of the acoustic pressure at any point z;
   •  V be the volume that encloses the listening area;
   •  S be the surface that encloses the volume V.

Page 8: 3D Audio: pt3

Wavefield synthesis
•  The Huygens principle is mathematically described by the Kirchhoff-Helmholtz integral

where:
•  G is the Green’s function. It describes the contribution of a point source at position z’ to the soundfield at position z.
•  Point source Green’s function:

•  P(z’,ω) is the pressure along the enclosing surface S
•  ∂/∂n P(z’,ω) is the pressure gradient along the surface normal
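For reference, a common way of writing the Kirchhoff-Helmholtz integral and the point-source Green’s function is sketched below in LaTeX. The sign conventions are assumptions, not taken from this transcript: the normal n is assumed to point into V, the sign of the exponent in G depends on the chosen time convention, and c denotes the speed of sound.

```latex
P(\mathbf{z},\omega) = -\oint_S \left[
    G(\mathbf{z}|\mathbf{z}',\omega)\,\frac{\partial}{\partial n}P(\mathbf{z}',\omega)
  - P(\mathbf{z}',\omega)\,\frac{\partial}{\partial n}G(\mathbf{z}|\mathbf{z}',\omega)
\right] dS', \qquad \mathbf{z}\in V

G(\mathbf{z}|\mathbf{z}',\omega) = \frac{1}{4\pi}\,
  \frac{e^{-j\frac{\omega}{c}\,|\mathbf{z}-\mathbf{z}'|}}{|\mathbf{z}-\mathbf{z}'|}
```

In this form the pressure gradient weights the monopole term G, while the pressure weights the dipole term ∂G/∂n.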

Page 9: 3D Audio: pt3

Wavefield synthesis

The wavefield in a volume V is completely described by the pressure P(z’, ω) along the enclosing surface S and the pressure gradient ∇P(z’, ω) along the surface normal n.

According to the Huygens principle, we can interpret the above integral as a set of secondary sources along the surface S

Page 10: 3D Audio: pt3

Wavefield synthesis
•  Choose G as the Green’s function of a point source (monopole)
   •  ∂/∂n G is then the Green’s function of a dipole
•  Approximate the distribution of monopoles and dipoles on the surface S by suitable types of loudspeakers.
•  Excite the loudspeakers with suitable driving signals, which are derived from P(z’, ω) and ∇P(z’, ω)

Page 11: 3D Audio: pt3

Acquisition system
•  The holophonic acquisition system is derived from the Huygens principle and the Kirchhoff-Helmholtz integral, and consists of the spatial sampling of
   •  1) the pressure P(z’, ω)
   •  2) the pressure gradient ∇P(z’, ω)
•  Pairs of microphones
   •  omnidirectional for the pressure
   •  directional (figure-eight) for the gradient
•  Pairs of speakers
   •  monopolar (closed speakers)
   •  dipolar (open speakers)
•  As shown in the K-H integral, the omnidirectional microphone feeds the dipolar (open) speaker, while the directional one feeds the monopolar (closed) speaker

Page 12: 3D Audio: pt3

acquisition system

Page 13: 3D Audio: pt3

acquisition system

Page 14: 3D Audio: pt3

Wavefield reconstruction
•  Microphones are now replaced by speakers
   •  the driving signals are the same ones that were acquired by the microphones
•  Speakers must be used in pairs, with polar characteristics similar to those of the microphones used for the acquisition
•  Perfect reconstruction is obtained only with a continuous distribution on S
•  The distance between microphones and between speakers determines the minimum wavelength that can be recorded and reproduced (Shannon sampling theorem)
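A minimal sketch of the spatial Shannon criterion, assuming the usual half-wavelength rule (spacing ≤ λ/2, i.e. the worst case of a wave travelling along the array) and a speed of sound of 343 m/s. The function names are illustrative only.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C


def max_frequency(spacing_m, c=SPEED_OF_SOUND):
    """Highest frequency (Hz) free of spatial aliasing for a given spacing.

    Assumes the half-wavelength rule: spacing <= wavelength / 2.
    """
    return c / (2.0 * spacing_m)


def required_spacing(f_max_hz, c=SPEED_OF_SOUND):
    """Largest spacing (m) that still samples f_max_hz without aliasing."""
    return c / (2.0 * f_max_hz)


for spacing in (0.05, 0.10, 0.20):
    print(f"spacing {spacing * 100:4.0f} cm -> alias-free up to "
          f"{max_frequency(spacing):6.0f} Hz")
```

Under these assumptions, halving the spacing doubles the usable bandwidth, which is the trade-off between the number of transducers and the maximum frequency.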

Page 15: 3D Audio: pt3

Wavefield reconstruction

Page 16: 3D Audio: pt3

Wavefield reconstruction
•  Elimination of dipoles:
   •  admit a certain sound pressure also for z′ outside V
   •  use a modified Green’s function G1 that satisfies the condition ∂/∂n G1 = 0 on S
   •  only monopoles are then needed
•  cons:
   •  speaker selection: windowing artifacts
   •  a soundfield is generated outside V

Page 17: 3D Audio: pt3

Wavefield reconstruction Reproduction of monochromatic plane wave …

Page 18: 3D Audio: pt3

Wavefield reconstruction
•  From 3D to 2D:
   •  reconstruct the wavefield on a plane (at the height of the listener’s head)
•  Spatial sampling:
   •  Sampling the curve:
   •  Trade-off: number of speakers vs. maximum sampled frequency
   •  We limit our bandwidth to frequencies below 1.5 kHz (above 1.5 kHz the damage is quite contained)

Page 19: 3D Audio: pt3

Wavefield reconstruction Spatial sampling artifacts:

Page 20: 3D Audio: pt3

Example: 8-channel circular rig
•  Wavefield synthesis requires an anechoic rendering environment
•  When the environment is not anechoic we can adopt a scheme with filters that are designed to cancel (or minimize) the impact of the environment

Multitrack recording

Page 21: 3D Audio: pt3

Non-anechoic environments
•  G(z) can be determined through a calibration procedure
•  Adapt a filter so as to minimize the error between the signal y[n] acquired in the rendering room (using microphones similar to those used during the recording) and the reference signal d[n]
•  The calibration signal is pseudo-random noise
•  Filter lengths are proportional to the reverberation time
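The adaptation step can be illustrated with a plain LMS (least mean squares) update. This is a swapped-in, simplified technique: the slides do not say which adaptive algorithm is used, and the actual scheme may place the filter differently in the chain. The 3-tap response `target` that generates d[n], the step size `mu`, and all names are hypothetical.

```python
import random


def lms(x, d, num_taps, mu):
    """Adapt FIR weights so the filtered x tracks d; returns (weights, errors)."""
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Latest num_taps input samples, newest first (zeros before the start).
        frame = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * xk for wk, xk in zip(w, frame))  # filter output y[n]
        e = d[n] - y                                  # error against d[n]
        w = [wk + mu * e * xk for wk, xk in zip(w, frame)]  # LMS update
        errors.append(e)
    return w, errors


random.seed(0)
# Pseudo-random calibration noise.
x = [random.uniform(-1.0, 1.0) for _ in range(4000)]
# Hypothetical 3-tap response producing the reference d[n] (toy stand-in).
target = [0.8, 0.3, -0.1]
d = [sum(target[k] * x[n - k] for k in range(len(target)) if n - k >= 0)
     for n in range(len(x))]

weights, errors = lms(x, d, num_taps=3, mu=0.05)
print([round(wk, 4) for wk in weights])
```

In a real room the filter would need many more taps, in line with the remark that filter length grows with reverberation time.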

Page 22: 3D Audio: pt3

Wavefield synthesis: applications
•  Example of application: recreating the soundfield of a different environment

Page 23: 3D Audio: pt3

Wavefield synthesis: applications
•  improving the acoustics of an auditorium by acquiring the sound and re-synthesizing it in the same environment

Page 24: 3D Audio: pt3

Wavefield synthesis
•  using WFS together with beamforming, it is possible to silence some areas

Page 25: 3D Audio: pt3

AMBISONICS

Page 26: 3D Audio: pt3

Ambisonics - overview
•  Originally developed in the ’70s by Gerzon, Barton and Fellgett as a novel multi-channel recording-rendering technique with the aim of enhancing immersion
•  With Ambisonics the sound is encoded together with its directional components in a vector, as a set of spherical harmonics
•  Similarly to holophonic systems, we can apply a transform matrix (decoder) to this vector in order to produce the signals that feed a speaker array that surrounds the environment

Page 27: 3D Audio: pt3

Ambisonics - overview

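[Figure: Ambisonics chain. Recording with a Soundfield microphone → W, X, Y, (Z) coding → decoding and conversion to N channels, 2-channel L/R, 2-channel headphone, or 3/2 formats, rendered around a sweet spot]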

Page 28: 3D Audio: pt3

Pros & cons
•  Main advantages
   •  Simple recording/coding method. We need multiple microphones (at least 4 for 3D, 3 for 2D) placed at the same point, at the center of the acoustic scene
   •  Independent coding/decoding methods
      •  We can decode the sound to produce a stereo signal, or a signal that is compatible with a standard 3/2 or with an N-channel system (N=4, 6, 8, …)
•  Main disadvantages
   •  Sources can only produce planar wavefronts (fundamental assumption of Ambisonics)
      •  Not restrictive, as any wavefield can be modeled as a combination of planar wavefronts
   •  The rendered wavefield must be a combination of planar wavefronts (i.e. speakers must be sufficiently far from the listener)
      •  Not a negligible problem for small environments such as cars

Page 29: 3D Audio: pt3

Ambisonics
•  Ambisonics systems are based on transducers (mics) that sample the directional components of
   •  order zero (pressure microphones)
   •  order one (pressure-gradient microphones)
   •  ...
•  Higher-order components correspond to sampling the acoustic field with many more (at least 9) super-directional microphones

Page 30: 3D Audio: pt3

Ambisonics
•  The directional characteristics of the wavefield are reconstructed by summing the spherical components of the field for orders m=0,1,2, each acquired with a microphone having a matching directivity characteristic
   •  m=0: from an omnidirectional microphone
   •  m=1: from a microphone with a figure-eight response

Page 31: 3D Audio: pt3

Coding formats
•  A-format – used for coincident directional microphones (symmetrically oriented, as in the Soundfield microphone)
•  B-format – developed for studio equipment (most widespread)
   •  Also used when the microphones have directional characteristics similar to the spherical harmonics
   •  Signals are coded into W, X, Y, Z
•  C-format – for transmission
•  UHJ-format – for multi-channel coding (compatible with mono and stereo formats)
•  D-format – for decoding and rendering

Page 32: 3D Audio: pt3

Coding formats
•  When using a microphone for each harmonic component, the signals W, X, Y, Z (B-format) can be encoded into any other format
•  Z is omitted in 2D systems

Page 33: 3D Audio: pt3

Theory
•  Hypotheses
   •  Sources are considered point sources
   •  Sources emit planar wavefronts
•  Notation
   •  We adopt cylindrical coordinates
   •  Let us consider (for simplicity) the 2D case (z=0)
   •  Let k be the wave vector, of magnitude k=2π/λ (wave number) and oriented along Ψ
      •  k = [kx ky] = [k cosΨ  k sinΨ]
   •  Let r = (r,φ) be the position vector
•  The soundfield produced by a planar wave is
$$P_R(\mathbf{r}) = P_\Psi\, e^{j\mathbf{k}\cdot\mathbf{r}}$$
   where PΨ is the pressure

Page 34: 3D Audio: pt3

Theory
•  The pressure of the reference field at a specific point is
$$P_R(\mathbf{r}) = P_\Psi\, e^{j\mathbf{k}\cdot\mathbf{r}} = P_\Psi\, e^{jkr\cos(\varphi-\Psi)}$$
•  Goal: to reproduce the planar wave at the listening reference point by replacing the real source with a rendering system made of N speakers symmetrically arranged around the listening point

[Figure: plane wave arriving from a source in direction Ψ at the listening point; x and y axes, position vector r at angle φ]

Page 35: 3D Audio: pt3

Theory

proof


Page 36: 3D Audio: pt3

Rendering
•  Each speaker generates a wavefield
•  At the listening point we have
•  To reproduce the original field we need

Page 37: 3D Audio: pt3

Rendering
•  The planar wave is characterized by a constant pressure and a phase component that can be expanded using a Fourier-Bessel series
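The expansion can be checked numerically. The sketch below assumes the series is the Jacobi-Anger form e^{jkr cos(φ−ψ)} = J0(kr) + 2 Σ j^m Jm(kr) cos[m(φ−ψ)], which matches the closed form reached in the proof slides. The helper `bessel_j` (power series for Jm) and the test values are illustrative choices.

```python
import cmath
import math


def bessel_j(m, x, terms=40):
    """Bessel function of the first kind J_m(x), from its power series."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.factorial(m + k))
        * (x / 2.0) ** (2 * k + m)
        for k in range(terms)
    )


def plane_wave_series(kr, phi, psi, order):
    """Truncated Fourier-Bessel expansion of exp(j*kr*cos(phi - psi))."""
    total = complex(bessel_j(0, kr))
    for m in range(1, order + 1):
        total += 2 * (1j ** m) * bessel_j(m, kr) * math.cos(m * (phi - psi))
    return total


kr, phi, psi = 2.0, math.radians(70), math.radians(20)
exact = cmath.exp(1j * kr * math.cos(phi - psi))
for order in (1, 2, 5, 20):
    err = abs(plane_wave_series(kr, phi, psi, order) - exact)
    print(f"order {order:2d}: |error| = {err:.2e}")
```

The error shrinks as more orders are kept, which is the behaviour the truncation slides rely on.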

Page 38: 3D Audio: pt3

Synthesis phase
•  Truncate the series, as we have only N speakers

Matching conditions

Page 39: 3D Audio: pt3

Facts •  As N tends to infinity

•  the order-N approximation tends to an exact matching, so the synthesized wavefront is exactly the same as the reference one

•  the sweet spot expands

Page 40: 3D Audio: pt3

Vector-form expression for analysis
•  We decoupled the direction information of the plane wavefront (ψ) from the information on the soundfield (φ)
•  The information on the spatial distribution of the plane wavefront is all in c
•  P_R(r) can be expressed in the vector form
$$P_R(\mathbf{r}) = P_\psi\,\mathbf{c}^T\mathbf{h}$$

Page 41: 3D Audio: pt3

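Vector-form expression for analysis: proof

$$\mathbf{c}^T\mathbf{h} = J_0(kr) + \sqrt{2}\cos\psi\cdot j\sqrt{2}\,J_1(kr)\cos\varphi + \sqrt{2}\sin\psi\cdot j\sqrt{2}\,J_1(kr)\sin\varphi + \dots + \sqrt{2}\cos m\psi\cdot j^m\sqrt{2}\,J_m(kr)\cos m\varphi + \sqrt{2}\sin m\psi\cdot j^m\sqrt{2}\,J_m(kr)\sin m\varphi$$

$$\mathbf{c}^T\mathbf{h} = J_0(kr) + 2jJ_1(kr)\cos\psi\cos\varphi + 2jJ_1(kr)\sin\psi\sin\varphi + \dots + 2j^mJ_m(kr)\cos m\psi\cos m\varphi + 2j^mJ_m(kr)\sin m\psi\sin m\varphi$$

$$= J_0(kr) + 2\sum_{m\geq 1} j^m J_m(kr)\left(\cos m\psi\cos m\varphi + \sin m\psi\sin m\varphi\right)$$

$$\mathbf{c}^T\mathbf{h} = J_0(kr) + 2\sum_{m\geq 1} j^m J_m(kr)\cos[m(\varphi-\psi)]$$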

Page 42: 3D Audio: pt3

Vector-form expression for analysis

•  The first components of vector c, multiplied by PΨ
•  These are the components for m=0 (zero order) and m=1 (first order)

Page 43: 3D Audio: pt3

Vector-form expression for analysis
•  In the acquisition phase (mics) we need to identify the coefficients cm of vector c
   •  In principle we need m=0,…,+∞
   •  In practice, we need to truncate the expression
•  In order to measure cm we would need mics whose directivity matches cos(mΨ) and sin(mΨ). Matching mics exist for m=0 (omnidirectional mic) and m=1 (bidirectional mics), but not for m≥2
•  In practice, we stop at order 1 or 2
•  For higher orders we need to determine the coefficients cm mathematically

Page 44: 3D Audio: pt3
Page 45: 3D Audio: pt3

Vector-form expression for synthesis
•  In the synthesis phase

Page 46: 3D Audio: pt3

Vector-form expression for synthesis
•  Let $P_n(\mathbf{r},\varphi_n)$ be the soundfield produced by the n-th speaker
•  We have
$$P_A(\mathbf{r}) = \sum_{n=1}^{N} P_n(\mathbf{r},\varphi_n) = \sum_{n=1}^{N} P_n\,\mathbf{c}_n^T\mathbf{h} = \mathbf{A}^T\mathbf{C}^T\mathbf{h}$$
   where
$$\mathbf{A}^T = [\,P_1\ P_2\ \dots\ P_N\,],\qquad \mathbf{C} = [\,\mathbf{c}_1\ \mathbf{c}_2\ \dots\ \mathbf{c}_N\,]$$

Page 47: 3D Audio: pt3

Coding
•  The recorded sound contains all the information required for the coding
•  So the Ambisonic signal contains all the information (magnitude and direction) about the sources
•  E.g.: the 2D Ambisonic signal for a source PΨ arriving from direction Ψ is given by
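A minimal encoding sketch, assuming the signal vector is b = PΨ·c with c = [1, √2 cos Ψ, √2 sin Ψ, √2 cos 2Ψ, √2 sin 2Ψ, …], i.e. the normalization implied by the c and h vectors of the analysis proof. Classic B-format tools scale the channels differently (for example W attenuated by 1/√2), so the √2 factors here are an assumption tied to this derivation. The function name is illustrative.

```python
import math


def encode_2d(pressure, psi, order):
    """Return [W, X, Y, U, V, ...] for a plane wave of amplitude `pressure`
    arriving from angle `psi` (radians), up to the given 2D order."""
    b = [pressure]  # m = 0: omnidirectional component W
    for m in range(1, order + 1):
        b.append(pressure * math.sqrt(2.0) * math.cos(m * psi))  # X, U, ...
        b.append(pressure * math.sqrt(2.0) * math.sin(m * psi))  # Y, V, ...
    return b


w, x, y = encode_2d(1.0, math.radians(30), order=1)
print(f"W = {w:.3f}, X = {x:.3f}, Y = {y:.3f}")
```

A source straight ahead (Ψ = 0) puts all of its directional energy in X and none in Y, as expected from the cosine and sine weights.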

Page 48: 3D Audio: pt3

Notes
•  In Ambisonics recordings the order M is extremely limited
   •  The typical choice is M=1, but sometimes M=2 or M=3 are used for experimental reasons
•  Truncation implies a limited listening region: as M increases, the sweet spot grows
•  Truncation implies limited localization accuracy
•  The notation (W,X,Y,U,V) is the typical one for the 2D case. The harmonics along z are, in fact, zero
   •  m=1 implies Z=0
   •  m=2 implies R=S=T=0
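The link between the order M and the size of the sweet spot can be illustrated numerically. The sketch assumes the 2D plane-wave expansion J0(kr) + 2 Σ j^m Jm(kr) cos(mθ) truncated at order M, and defines the sweet spot (an arbitrary choice made here) as the range of kr over which the worst-case truncation error stays below 0.1. All names and thresholds are illustrative.

```python
import cmath
import math


def bessel_j(m, x, terms=40):
    """Bessel function of the first kind J_m(x), from its power series."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.factorial(m + k))
        * (x / 2.0) ** (2 * k + m)
        for k in range(terms)
    )


def worst_error(kr, order, num_angles=72):
    """Largest |truncated series - exact plane wave| over all angles at kr."""
    j = [bessel_j(m, kr) for m in range(order + 1)]
    worst = 0.0
    for i in range(num_angles):
        theta = 2.0 * math.pi * i / num_angles  # theta stands for phi - psi
        approx = j[0] + sum(2 * (1j ** m) * j[m] * math.cos(m * theta)
                            for m in range(1, order + 1))
        exact = cmath.exp(1j * kr * math.cos(theta))
        worst = max(worst, abs(approx - exact))
    return worst


def sweet_spot_kr(order, tolerance=0.1, step=0.05, kr_max=10.0):
    """Largest kr on a grid for which the worst-case error stays below tolerance."""
    kr = 0.0
    while kr + step <= kr_max and worst_error(kr + step, order) < tolerance:
        kr += step
    return kr


for order in (1, 2, 3):
    print(f"M = {order}: error < 0.1 up to kr ~ {sweet_spot_kr(order):.2f}")
```

Since kr = 2πr/λ, a fixed kr limit means the usable radius r shrinks as frequency rises, for any given order.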

Page 49: 3D Audio: pt3

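Decoding
•  We start from the matching conditions
$$P_A(\mathbf{r}) = P_R(\mathbf{r})$$
   where
$$P_A(\mathbf{r}) = \mathbf{A}^T\mathbf{C}^T\mathbf{h},\qquad P_R(\mathbf{r}) = P_\psi\,\mathbf{c}^T\mathbf{h}$$
   and we want
$$\mathbf{A}^T\mathbf{C}^T\mathbf{h} = P_\psi\,\mathbf{c}^T\mathbf{h}$$
•  We can define
$$\mathbf{b}^T = P_\psi\,\mathbf{c}^T$$
   so that
$$\mathbf{b}^T = \mathbf{A}^T\mathbf{C}^T$$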

Page 50: 3D Audio: pt3

Decoding
•  We thus have
$$\mathbf{b} = \mathbf{C}\mathbf{A}$$
   where
$$\mathbf{A}^T = [\,P_1\ P_2\ \dots\ P_N\,]$$
•  C depends on the speaker positions
•  The only element we do not know is A, the vector of the unknown magnitudes that drive the speakers
•  As C is not square, we need to resort to pseudo-inversion (least-squares solution)

Page 51: 3D Audio: pt3

Speaker placement
•  Speaker signals are determined by the Ambisonics weights according to their spatial location
•  When speakers are placed on a circle (2D case) we have
•  Likewise, a closed-form expression can be found for speakers placed on a spherical surface (3D case)
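A sketch of decoding for the circular case, under stated assumptions: N equally spaced speakers with N ≥ 2M+1, and direction vectors c = [1, √2 cos mφ, √2 sin mφ, …]. With that normalization the rows of C are orthogonal (C·Cᵀ = N·I), so the pseudo-inverse solution of b = C·A reduces to A = (1/N)·Cᵀ·b. This is one possible closed form, not necessarily the one shown on the original slide; all names are illustrative.

```python
import math


def harmonics(angle, order):
    """Vector c for a direction: [1, sqrt2*cos(m*a), sqrt2*sin(m*a), ...]."""
    c = [1.0]
    for m in range(1, order + 1):
        c.append(math.sqrt(2.0) * math.cos(m * angle))
        c.append(math.sqrt(2.0) * math.sin(m * angle))
    return c


def decode_regular_circle(b, num_speakers):
    """Speaker gains A = (1/N) * C^T b for N equally spaced speakers."""
    order = (len(b) - 1) // 2
    if num_speakers < 2 * order + 1:
        raise ValueError("need at least 2M+1 speakers for order M")
    gains = []
    for n in range(num_speakers):
        c_n = harmonics(2.0 * math.pi * n / num_speakers, order)
        gains.append(sum(ci * bi for ci, bi in zip(c_n, b)) / num_speakers)
    return gains


def reencode(gains, order):
    """Compute C * A, which should give back b if the decoding is exact."""
    n_spk = len(gains)
    total = [0.0] * (2 * order + 1)
    for n, g in enumerate(gains):
        c_n = harmonics(2.0 * math.pi * n / n_spk, order)
        total = [t + g * ci for t, ci in zip(total, c_n)]
    return total


# First-order signal for a unit plane wave from 30 degrees, decoded to 8 speakers.
b = harmonics(math.radians(30), 1)
gains = decode_regular_circle(b, 8)
print([round(g, 3) for g in gains])
```

For irregular layouts C·Cᵀ is no longer diagonal and a general pseudo-inverse is needed instead of this shortcut.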

Page 52: 3D Audio: pt3

Remarks
•  The coding (recording) format (W, X, Y, (Z), …) is independent of the number and the configuration of the rendering speakers
•  A large number of equidistant and uniformly distributed speakers guarantees a larger sweet spot and a more stable localization of sound
   •  Many speakers guarantee better realism
•  Asymmetric speaker placement requires a different weighting of components coming from
•  Speaker characteristics must be as similar as possible

Page 53: 3D Audio: pt3

Remarks
•  For a good 2D rendering we need at least 4 speakers
•  For correct localization we need 5 or more speakers
•  For 3D Ambisonics (complete sphere) we need at least 6-8 speakers
•  There are currently systems based on over 128 speakers

Page 54: 3D Audio: pt3

Comparative remarks
•  Both holophonic systems (wavefield synthesis) and Ambisonics attempt a complete characterization of the soundfield, but they differ profoundly in their approach
•  We consider a number of speakers N equal to the number of mics S, and an Ambisonics order M giving 2M+1 components
•  Holophony is based on spatial sampling of the wavefield
   •  Spatial aliasing
   •  For small S, due to aliasing, the error is uniformly distributed over the volume
   •  As S increases, the error decreases quickly
•  Ambisonics truncates the spherical harmonic series
   •  An approximation that narrows the sweet spot
   •  For low M, the error is minimal at the center of the sweet spot and increases as we move away from it
   •  As M increases, the sweet spot grows and the error decreases

Page 55: 3D Audio: pt3

Comparative remarks
•  For small S, Ambisonics works better than the holophonic approach.
•  For high orders they are almost equivalent
•  The difference is in the microphone acquisition:
   •  In Ambisonics, high orders require specific directional mics (which may not even exist)
   •  In holophony, we can simply increase the number of omnidirectional and figure-eight mics

Page 56: 3D Audio: pt3

References
•  P. Fellgett, “Ambisonics. Part one: General system description”, Studio Sound, pp. 20-22 & 40, August 1975.
•  M.A. Gerzon, “Periphony: With-Height Sound Reproduction”, JAES, vol. 21, no. 1, Jan-Feb 1973.
•  M.A. Gerzon, “The Design of Precisely Coincident Microphone Arrays for Stereo and Surround Sound”, AES preprint L-20, 50th Convention, March 1975.
•  M.A. Gerzon, “Ambisonics. Part two: Studio Techniques”, Studio Sound, pp. 24-30, October 1975.
•  M.A. Gerzon, “Design of Ambisonic Decoders for Multispeaker Surround Sound”, 58th Audio Engineering Society Convention, November 1977.
•  D. G. Malham, A. Myatt, “3-D Sound Spatialization using Ambisonic Techniques”, Computer Music J., vol. 19, no. 4, pp. 58-70, 1995.
•  J. S. Bamford, “An Analysis of Ambisonic Sound Systems of First and Second Order”, Master’s thesis, Waterloo University, Ontario, Canada, 1995.
•  Jérôme Daniel, “Représentation de champs acoustiques, application à la transmission et à la reproduction de scènes sonores complexes dans un contexte multimédia”, PhD thesis, CCETT - France Telecom, 1999.

Page 57: 3D Audio: pt3

Microphone acquisitions

Page 58: 3D Audio: pt3

Microphones
•  The mic is a transducer that produces an output voltage v(t) proportional to some scalar or vector acoustic variable g(t)
•  Microphones can be classified according to the law of proportionality between the acoustic and the electric variable:
   •  Microphones that are sensitive to the (scalar) acoustic pressure
      •  Non-directional (omnidirectional) microphones
   •  Microphones that are sensitive to the (vector) pressure gradient
      •  Directional microphones whose characteristics depend on the order m of the gradient

Page 59: 3D Audio: pt3

Directional characteristics
•  Radiation diagrams
   a)  Pressure microphone (omnidirectional)
   b)  Low-order pressure-gradient microphone (directional)
   c)  High-order pressure-gradient microphone (very directional)

NB: radiation diagrams depend on frequency, so in general we have different diagrams at different frequencies

Page 60: 3D Audio: pt3

Combining radiation diagrams
•  It is possible to construct alternative radiation patterns by
   •  combining and connecting several microphone capsules (possibly with different characteristics) within the same device
   •  adopting particular physical geometries around the capsule(s)
•  E.g. a cardioid pattern can be obtained through the series connection of an omnidirectional capsule and a figure-eight capsule in the same enclosure.
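The cardioid example can be written as a weighted sum of an ideal omnidirectional response (1) and an ideal figure-eight response (cos θ). The sketch assumes ideal, frequency-independent capsules; the weight parameter `omni_weight` is an illustrative name.

```python
import math


def first_order_pattern(theta, omni_weight):
    """Sensitivity at angle theta (radians) for a mix of omni and figure-eight.

    omni_weight = 1.0 -> omnidirectional
    omni_weight = 0.5 -> cardioid
    omni_weight = 0.0 -> figure-eight
    """
    return omni_weight + (1.0 - omni_weight) * math.cos(theta)


def cardioid(theta):
    """Equal mix of omni and figure-eight: 0.5 * (1 + cos(theta))."""
    return first_order_pattern(theta, 0.5)


for deg in (0, 90, 180):
    print(f"{deg:3d} deg: cardioid sensitivity = {cardioid(math.radians(deg)):.2f}")
```

Other weights give the intermediate patterns (subcardioid, hypercardioid) from the same two capsules.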

Page 61: 3D Audio: pt3

Acquisition techniques
•  Three different strategies
   •  Directional microphones in close proximity (coincident)
      •  intensity depends on orientation only
      •  referred to as intensity difference ΔI
   •  Omnidirectional microphones placed at a distance of tens of centimeters from one another
      •  referred to as time difference ΔT
      •  as the microphones are omnidirectional, the intensity difference is negligible
   •  Mixed: directional microphones (cardioid or figure-eight) placed at a certain distance from one another
      •  we have both a ΔT and a ΔI

Page 62: 3D Audio: pt3

Intensity stereophony
•  XY techniques (intensity stereophony)
   •  two directional microphones at the same place, typically pointing at an angle of 90° or more to each other
   •  A stereo effect is achieved through differences in sound pressure level between the two microphones.
   •  Due to the lack of time-of-arrival differences and phase ambiguities, XY recordings generally sound less spacious and have less depth than recordings employing an A-B setup
   •  When the microphones are bidirectional and placed facing ±45° with respect to the sound source, the XY setup is called a Blumlein pair.
      •  The sonic image produced by this configuration is considered by many authorities to create a more realistic, almost holographic soundstage

Page 63: 3D Audio: pt3

Intensity stereophony

(a) ΔI pair (XY config.) with cardioid mics
(b) ΔI pair (XY config.) with figure-eight mics (Blumlein pair)
(c) ΔI pair with one cardioid and one figure-eight mic (Mid-Side configuration)

Page 64: 3D Audio: pt3

Time-of-Arrival Stereophony
•  A-B techniques (time-of-arrival stereophony)
   •  two parallel omnidirectional microphones some distance apart, capturing time-of-arrival stereo information as well as some level (amplitude) difference information, especially if employed in close proximity to the sound source(s)
   •  At a spacing of about 50 cm, the time delay for a signal from the side reaching first one and then the other microphone is approximately 1.5 ms (1 to 2 ms)
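The quoted delay follows from the spacing divided by the speed of sound. The sketch assumes a distant source (plane-wave arrival), a speed of sound of 343 m/s, and an angle measured from the forward axis of the pair, so that 90° is fully lateral; the function name is illustrative.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C


def ab_delay_ms(spacing_m, angle_deg, c=SPEED_OF_SOUND):
    """Inter-microphone delay (ms) for a far source at angle_deg off-axis."""
    path_difference = spacing_m * math.sin(math.radians(angle_deg))
    return 1000.0 * path_difference / c


for angle in (0, 30, 90):
    print(f"{angle:2d} deg: delay = {ab_delay_ms(0.5, angle):.2f} ms")
```

For a 50 cm spacing and a fully lateral source this gives a delay just under 1.5 ms, consistent with the figure above.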

Page 65: 3D Audio: pt3

Acquisition techniques

(e) ΔΤ pair (A-B) with omni-directional mics

Page 66: 3D Audio: pt3

Near-coincident technique: mixed stereophony
•  The ORTF stereo technique of the Office de Radiodiffusion Télévision Française (Radio France) calls for a pair of cardioid microphones placed 17 cm apart, at a total angle of 110° between the microphones, which, according to Eberhard Sengpiel, results in a stereophonic pickup angle of 96°
•  Notice that the spacing of 17 cm has nothing to do with human ear distance. The recorded signals are generally intended for playback over stereo loudspeakers and not over headphones.

Page 67: 3D Audio: pt3

Acquisition techniques

(f) ΔT−ΔI pair (A-B) with cardioids (ORTF - Office de Radiodiffusion Télévision Française)

Page 68: 3D Audio: pt3

Examples of ORTF rigs

Page 69: 3D Audio: pt3

Acquisition techniques
•  For multi-channel applications
   •  ΔT and mixed techniques use one mic for each reproduction channel (holophonic)
   •  ΔI (coincident) techniques can use fewer mics than speakers. The signals fed to the speakers are obtained as linear combinations of the signals captured by the mics, through a coding matrix

where gi is a constant, ci is the signal fed to the i-th speaker, mi is the signal captured by the i-th mic, and M is the number of mics

Page 70: 3D Audio: pt3

Surround Atmo-Mikrofon
•  “Surround-Atmo-Mikrofon”: composed of 4 cardioid mics
•  Widely used to record concerts

Page 71: 3D Audio: pt3

Multi-channel Mid-Side
•  Mid-Side (MS)
•  It uses two cardioid mics (Mf and Mb) and one figure-eight mic (S)
•  It is possible to build a 5-channel system (C center, L left, R right, Ls left back, Rs right back)

Page 72: 3D Audio: pt3

Multi-channel Mid-Side

With and constant

Page 73: 3D Audio: pt3

Microphone arrays •  Discrete arrays

Page 74: 3D Audio: pt3

Discrete microphone arrays

•  Sets of discrete microphones with various geometries and appropriate directivity patterns

Page 75: 3D Audio: pt3

3D arrays •  Soundfield microphone

Page 76: 3D Audio: pt3

High-order arrays
•  Trinnov’s 3D array with 24 microphones arranged in a pseudo-random fashion

Page 77: 3D Audio: pt3

3rd-order directivity responses

Page 78: 3D Audio: pt3

High Order Ambisonics (HOA)
•  France Telecom also developed spherical arrays with 32 microphones (4th order)

Page 79: 3D Audio: pt3

Ambisonics and WFS acquisition and rendering

Page 80: 3D Audio: pt3

Ambisonics array
•  The 3D directional diagram of first-order Ambisonics is made of an omnidirectional pattern and three figure-eight patterns, one per axis
   •  One pressure microphone (omnidirectional)
   •  Three pressure-gradient microphones
•  The signals coming from the mics are those of the B-format
   •  Omnidirectional mic: W signal
   •  Directional mics: X, Y and Z signals

Page 81: 3D Audio: pt3

Soundfield microphone
•  The Ambisonics acquisition system is a coincident array. In order to avoid mic cluttering, we can use the so-called Soundfield microphone, of type ΔI
   •  As a first-order approximation, the microphone is made of 4 directional cardioid capsules placed on a tetrahedron (one mic per direction)
•  The Soundfield mic picks up 4 signals: RF right-front, RB right-back, LF left-front and LB left-back
•  This set of signals is called A-format

Page 82: 3D Audio: pt3

Soundfield microphone
•  The B-format can be derived from the A-format as
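A sketch of the usual sum-and-difference conversion, given as an assumption because the slide’s own matrix is not reproduced in this transcript. It assumes the common tetrahedral layout in which LF and RB point upward while RF and LB point downward, and an arbitrary overall gain of 0.5; real converters also apply frequency-dependent correction filters. The function name is illustrative.

```python
def a_to_b(lf, rf, lb, rb, scale=0.5):
    """Convert one A-format sample (LF, RF, LB, RB) to B-format (W, X, Y, Z).

    Assumed capsule tilt: LF up, RF down, LB down, RB up.
    """
    w = scale * (lf + rf + lb + rb)  # omnidirectional sum
    x = scale * (lf + rf - lb - rb)  # front minus back
    y = scale * (lf - rf + lb - rb)  # left minus right
    z = scale * (lf - rf - lb + rb)  # up minus down
    return w, x, y, z


# Equal signals on all capsules carry no directional information.
print(a_to_b(1.0, 1.0, 1.0, 1.0))
# Signal only on the two front capsules: energy appears in W and X only.
print(a_to_b(1.0, 1.0, 0.0, 0.0))
```

With a different capsule tilt convention only the signs in the Z row change.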

Page 83: 3D Audio: pt3

Limits and other formats
•  Because of the discrete distribution of microphones, there is an upper bandwidth limit on the 4 components of the B-format
•  From [W X Y Z] it is possible to derive other formats.
•  With the so-called UHJ encoding matrix it is possible to obtain various broadcast formats from the B-format.

[Figure: UHJ encoder. Inputs: W, X, Y, Z. Outputs: 4 channel: L, R, T, Q; 3 channel: L, R, T (T band-limited to 5 kHz); 2 channel superstereo; 2 channel stereo; 2 channel summed]

Page 84: 3D Audio: pt3

Second-order ambisonics
•  First-order Ambisonics with coincident microphones yields a panning law that is quite far from ideal
•  In order to better approximate the ideal directivity patterns we can adopt a second-order approximation
   •  The second-order coincident array is made of 12 cardioid or hypercardioid microphones arranged on a dodecahedron

Page 85: 3D Audio: pt3

Problems and limits
•  Coincident microphone arrays are used for stereo or surround sound recording, but some factors affect the correct reconstruction of the spatial acoustic image
   •  The distance between microphones is 3-10 cm instead of less than 5 mm
   •  Microphone directivity diagrams are not ideal (especially at high frequencies)

Page 86: 3D Audio: pt3

Altering directivity
•  In the first-order case we can construct a transfer-function matrix G to perform beamforming, so that the new signals X’, Y’, Z’, W’ exhibit the desired polar diagram

Page 87: 3D Audio: pt3

Altering directivity
•  Correction of the tetrahedron response
   •  G1(z) for the omnidirectional signal W
   •  G2(z) for the figure-eight signals X, Y, Z