Improve Your Surroundings - AES Scotland Presentation

Post on 16-Jan-2016

DESCRIPTION

The presentation given by Andrew Horsburgh and Robert Davis for the Audio Engineering Society, Scotland branch, on the subject of spatial audio.

TRANSCRIPT

Improve Your Surroundings

Andrew J. Horsburgh, BSc Hons
Robert Davis, BSc Hons

This presentation will cover the basic principles of spatial audio, surround sound and production techniques. It focuses on the spatial audio format Ambisonics, specifically the production and reproduction of Ambisonic soundfields.

A Feature Presentation!

How to define your surroundings?

Any sound that has ever been made has identifiable characteristics belonging to the sound and its environment. These characteristics are minute, in all directions and contain specific identifiable spatial or acoustical properties. Reproducing these minute characteristics is one aim of the Ambisonic format.

Buzz-word Bingo

The goal of many artists, producers and engineers is the ability to create an exact acoustical replica of a space. Buzz-words like 'immersion', 'depth' and 'realism' are often used to describe this. Widely adopted current technologies are not able to reproduce the audible characteristics that ensure 'immersion' every time.

Localisation of Sounds

Lord Rayleigh's 'Duplex theory' defined the localisation method that the human auditory system uses to identify sound locations. It relies on two cues: Interaural Time Differences (ITD) and Interaural Level Differences (ILD). Minute differences in time of arrival and pressure between the two ears allow for precise localisation of sound. Placement of speakers is related to these principles.

Ambisonic Sound Localisation

Drawing heavily on Lord Rayleigh's Duplex Theory and Blumlein Stereo, Ambisonics creator Gerzon developed a Meta-Theory as the basis of his isotropic format. Using two methods of mathematical representation, covered later, audio across the spectrum can be localised with stereophonic principles but at higher precision.

Stereophony

Current standards use two discrete audio files with a numerically applied position, implying stereo-field width but not featuring the essential delay time or amplitude differences. The same applies when extending beyond stereo to surround formats such as Dolby 5.1 and THX: this method of encoding audio does not give a true impression of space, nor does it contain the information needed to reproduce an environment realistic enough to fool the listener.

Multichannel Choices Today:

● 'Dot One' Discrete (includes 5.1, 7.1, 11.2, 22.2)

● 'With Height' systems (Dolby PLIIz, Dolby Atmos)

● Wave Field Synthesis (WFS+)

● VBAP

● Ambiophonics (Binaural on speakers)

● Ambisonics

● Auro 3D

Why Ambisonics?

Advantages:

● Isotropic - Speakers are all treated 'equally'

● Mono, Stereo, Quad, 5.1 etc mix compatible

● Stable phantom images between each pair of speakers

● Improving resolution does not mean increasing speaker count by x^2!

● Recording is independent of the reproduction system

Ambisonic Birth

Development of the format began during the 1970s, led by mathematician M. Gerzon. The creation of an omni-directional, isotropic wavefront recording and reproduction format was acknowledged by many to be ahead of its time, and has remained so until now, 40 years later. With the growth of viable technologies labelled '3D' and 'immersive', adoption of Ambisonics can deliver an audio experience like no other.

That's no Moon

Soundfield Microphones

Reproduction and recording of three dimensional soundfields is easy, provided you have chosen the correct set of microphone patterns.

[Figure: microphone patterns - Omni, F/B Fig 8, L/R Fig 8]

CoreSound Microphones

Several companies have created single-unit microphones that capture information similar to the Omni + Fig8 + Fig8 + Fig8 configuration. One of these is CoreSound's TetraMic.

Ambisonic Research In Practice

Previously, only simulations of speaker configurations, hypothetical microphone patterns and synthetic soundfields were available. Empirical and software research is now conducted at York, Derby, Huddersfield and Queens University, Dublin, each of which has taken a different approach to Ambisonics.

Excellent - How Do I Get It?

Google 'Ambisonics' and you'll find obscure literature relating to mathematical functions such as Bessel and Hankel, or build-it-yourself software in an unfamiliar package. Making music using Ambisonics isn't as difficult as it first seems. But first you need plug-ins.

Ambisonic Production Tools

● Overview

● Encoding (Panning)

● Decoding (Playback/Monitoring)

● Effects

● 2D (Horizontal) systems

● 3D (With height) systems

2D Definitions

● Polar/Cartesian co-ordinates

● Radial distance (r)

● Azimuth (θ)

○ Measured anti-clockwise from due front

● Unit circle (r = 1)

Encoding

● Directional encoding of virtual sound sources (Panning)

● Mono source input -> Ambisonic sound field

● Applies to multi-track mixing techniques, live performance or sound design

● Simulates microphone pick-up patterns

● W signal is -3dB: W = 0.707 (constant, independent of θ)

● X signal is cosine of source direction: X = cos(θ)

● Y signal is sine of source direction: Y = sin(θ)
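The encoding rules above can be sketched for a single sample as follows. This is a minimal illustration rather than the presenters' plug-in code; the function name `encode_first_order_2d` and the use of degrees for the azimuth are assumptions made here.

```python
import math

def encode_first_order_2d(sample, azimuth_deg):
    """Pan one mono sample to horizontal first-order channels (W, X, Y).

    azimuth_deg is measured anti-clockwise from due front, as on the slides.
    The function name and degree units are illustrative assumptions.
    """
    theta = math.radians(azimuth_deg)
    w = sample * (1 / math.sqrt(2))  # omni component, -3 dB (0.707)
    x = sample * math.cos(theta)     # front/back figure-of-eight
    y = sample * math.sin(theta)     # left/right figure-of-eight
    return w, x, y

# A unit source panned hard left (90 degrees anti-clockwise from front).
w, x, y = encode_first_order_2d(1.0, 90.0)
print(round(w, 3), round(x, 3), round(y, 3))
```

For a source at 90° almost all of the directional energy lands in Y, while W stays at 0.707 whatever the direction.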

2D Systems - First Order

[Figure: W, X and Y components]

2D Systems - Second Order

[Figure: W, X, Y, U and V components]

● Higher order 2D components are simply multiples of source angle i.e. U = cos(2θ), V = sin(2θ)

2D Systems - Third Order

[Figure: W, X, Y, U, V, P and Q components]

● And again, P = cos(3θ), Q = sin(3θ)
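Because each higher-order pair is just the cosine and sine of a multiple of the source angle, horizontal encoding to any order fits in one loop. The sketch below assumes the channel ordering W, X, Y, U, V, P, Q used on the slides; the function name `encode_2d` is hypothetical.

```python
import math

def encode_2d(sample, azimuth_deg, order):
    """Encode one mono sample to horizontal Ambisonics of the given order.

    Returns [W, X, Y, U, V, P, Q, ...]: W first, then a cos/sin pair
    for each order m = 1..order. The name and ordering are assumptions.
    """
    theta = math.radians(azimuth_deg)
    channels = [sample / math.sqrt(2)]                 # W at -3 dB
    for m in range(1, order + 1):
        channels.append(sample * math.cos(m * theta))  # X, U, P, ...
        channels.append(sample * math.sin(m * theta))  # Y, V, Q, ...
    return channels

# Third-order encode of a unit source at 30 degrees: 7 channels.
third = encode_2d(1.0, 30.0, 3)
print(len(third), [round(c, 3) for c in third])
```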

Sound Field Reconstruction (Decoding)

● Required for playback of Ambisonic material over loudspeakers

● A linear combination of Ambisonic channels

● Creates speaker feeds

● Cons: Material cannot be correctly monitored before decoding

● Pros: Allows any Ambisonic encoded material to be played back on various speaker arrays

First Order - Quad System

● Each channel is multiplied by a coefficient (ratio) and summed to give the speaker feed

● Changing the coefficients creates the other speaker feeds

Speaker feed = (W × 0.471) + (X × 0.667) + (Y × 0.000)

Basic Decode

First Order - Quad System

● Adjusting the ratio between channels allows different polar patterns to be reproduced

Speaker feed = (W × 0.586) + (X × 0.586) + (Y × 0.000)

Max-rE Decode

Speaker feed = (W × 0.707) + (X × 0.500) + (Y × 0.000)

In-Phase Decode

Second Order - Hexagon System

● Adding higher order components narrows the reproduced polar pattern

Speaker feed = (W × 0.283) + (X × 0.400) + (Y × 0.000) + (U × 0.400) + (V × 0.000)

Basic Decode
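The coefficient sets shown for these decodes can be reproduced with a short calculation. The sketch below rests on assumptions not stated on the slides: each decode is described by per-order weights g0, g1, ... (all 1 for Basic, g1 = cos 45° for first-order Max-rE, g1 = 0.5 for first-order In-Phase), and the result is normalised so that a source in the speaker's own direction arrives with unity gain. The function name `speaker_coefficients` is hypothetical.

```python
import math

def speaker_coefficients(order_gains, speaker_azimuth_deg):
    """Coefficients applied to [W, X, Y, U, V, ...] for one speaker feed.

    order_gains: assumed per-order weights [g0, g1, ...] with g0 = 1.
    Normalised so a source at the speaker's azimuth gets a gain of 1.
    """
    phi = math.radians(speaker_azimuth_deg)
    norm = order_gains[0] + 2 * sum(order_gains[1:])
    coeffs = [math.sqrt(2) * order_gains[0] / norm]  # sqrt(2) undoes W's -3 dB
    for m, g in enumerate(order_gains[1:], start=1):
        coeffs.append(2 * g * math.cos(m * phi) / norm)
        coeffs.append(2 * g * math.sin(m * phi) / norm)
    return coeffs

# Feeds for a speaker at due front (azimuth 0), matching the Y = 0.000 columns.
basic_1 = speaker_coefficients([1, 1], 0)
max_re_1 = speaker_coefficients([1, math.cos(math.pi / 4)], 0)
in_phase_1 = speaker_coefficients([1, 0.5], 0)
basic_2 = speaker_coefficients([1, 1, 1], 0)
for name, c in [("Basic 1st", basic_1), ("Max-rE 1st", max_re_1),
                ("In-Phase 1st", in_phase_1), ("Basic 2nd", basic_2)]:
    print(name, [round(v, 3) for v in c])
```

Passing a different `speaker_azimuth_deg` rotates the X/Y (and U/V) coefficients, which is how the feeds for the other speakers in the array are obtained.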

Second Order - Hexagon System

● Further adjustment of channel ratios

[Figure: Max-rE and In-Phase decodes]

Third Order - Octagon System

● Higher directional resolution is achieved with extra loudspeakers and audio channels

[Figure: Basic, Max-rE and In-Phase decodes]

Velocity and Energy Vectors

● Analysis of localisation quality by measuring the contributions from all loudspeakers

● An approximation of how humans localise sound sources

● Velocity vector relevant for low frequencies (<700Hz)

● Energy vector relevant for mid/high frequencies (>700Hz)

rV & rE – First Order

Max-rE: |rV| = 0.707, |rE| = 0.707

In-Phase: |rV| = 0.500, |rE| = 0.667

Basic: |rV| = 1.000, |rE| = 0.667

rV & rE – Second Order

Max-rE: |rV| = 0.866, |rE| = 0.866

In-Phase: |rV| = 0.667, |rE| = 0.800

Basic: |rV| = 1.000, |rE| = 0.800

rV & rE – Third Order

Max-rE: |rV| = 0.924, |rE| = 0.924

In-Phase: |rV| = 0.750, |rE| = 0.857

Basic: |rV| = 1.000, |rE| = 0.857
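The figures in these tables can be checked numerically by panning a source, decoding it to a regular ring of speakers and summing the contributions from all loudspeakers. The sketch below is an illustration under stated assumptions: the ring has 2 × order + 2 speakers, the velocity vector is the gain-weighted mean of the speaker directions, the energy vector is the same with squared gains, and the Max-rE and In-Phase weight formulas are the ones commonly quoted for horizontal arrays, not taken from the slides. All function names are hypothetical.

```python
import math

def basic_gains(order):
    """Basic decode: every order weighted equally."""
    return [1.0] * (order + 1)

def max_re_gains(order):
    """Assumed 2D Max-rE weights: g_m = cos(m * pi / (2 * order + 2))."""
    return [math.cos(m * math.pi / (2 * order + 2)) for m in range(order + 1)]

def in_phase_gains(order):
    """Assumed 2D In-Phase weights: g_m = (M!)^2 / ((M + m)! (M - m)!)."""
    f = math.factorial
    return [f(order) ** 2 / (f(order + m) * f(order - m)) for m in range(order + 1)]

def vector_magnitudes(order_gains, source_deg=20.0):
    """Return (|rV|, |rE|) for a regular ring of 2 * order + 2 speakers."""
    order = len(order_gains) - 1
    n_speakers = 2 * order + 2
    theta = math.radians(source_deg)
    angles = [2 * math.pi * i / n_speakers for i in range(n_speakers)]

    # Gain reaching each speaker for a unit source panned to theta.
    feeds = []
    for phi in angles:
        feed = order_gains[0]
        for m in range(1, order + 1):
            feed += 2 * order_gains[m] * math.cos(m * (theta - phi))
        feeds.append(feed)

    def magnitude(weights):
        # Length of the weighted mean of the speaker direction vectors.
        total = sum(weights)
        vx = sum(w * math.cos(phi) for w, phi in zip(weights, angles)) / total
        vy = sum(w * math.sin(phi) for w, phi in zip(weights, angles)) / total
        return math.hypot(vx, vy)

    return magnitude(feeds), magnitude([g * g for g in feeds])

for name, gains_for in [("Basic", basic_gains), ("Max-rE", max_re_gains),
                        ("In-Phase", in_phase_gains)]:
    for order in (1, 2, 3):
        r_v, r_e = vector_magnitudes(gains_for(order))
        print(f"{name} order {order}: |rV| = {r_v:.3f}, |rE| = {r_e:.3f}")
```

On a regular ring the result does not depend on `source_deg`, so the default of 20° is arbitrary.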

Dual Band Decoding

● Crossover filters are used to split the frequency spectrum into two bands

● Allows different decodes at high and low frequencies

● For psychoacoustic optimisation

○ Max rV at Low Freqs.

○ Max rE at High Freqs.

Near Field Compensation

● A physical effect created by close proximity to loudspeakers (within a few meters)

● In microphone recording a similar effect is called the 'proximity effect'

● Compensated with high-pass filtering

3D Definitions

● Spherical/Cartesian co-ordinates

● Radial distance (r)

● Azimuth (θ)

○ Measured anti-clockwise from due front

● Elevation (ϕ)

○ Measured up or down from horizontal

● Unit sphere (r = 1)

Spherical Harmonics

● m = degree/mode, n = order [note: sometimes reversed]

Higher Order Systems

● Channel count Vs. Ambisonic order
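The relationship between order and channel count can be written down directly. A horizontal system adds one cos/sin pair per order, which matches the W X Y, W X Y U V and W X Y U V P Q sets shown earlier; a full 3D system of order M uses (M + 1)² spherical harmonic channels. The function names below are hypothetical.

```python
def channels_2d(order):
    """Horizontal-only system: W plus one cos/sin pair per order."""
    return 2 * order + 1

def channels_3d(order):
    """Full-sphere system: one channel per spherical harmonic up to 'order'."""
    return (order + 1) ** 2

for order in range(1, 6):
    print(f"order {order}: 2D = {channels_2d(order)}, 3D = {channels_3d(order)}")
```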

Ambisonic Productions and Standardisation

Proposed Array Standards

Adoption and implementation of regular arrays is key to attracting large companies and markets. Calibration of the array is crucial; however, the decoding stage has yet to be agreed upon. Standardising the environment (placement & number of speakers), loudness and production tool kits is possible.

An Array Standard

Now we have plug-ins that work with easily accessible software. What now? Create an array, calibrate the array and then produce in Ambisonics!

How many speakers?

The most elemental array for Ambisonics is a First Order Horizontal one, using 4 speakers.

First Order Layouts

How many speakers?

Moving up to second order harmonics we need to use 6 speakers.

Second Order

How many speakers?

Going to third order horizontal resolution we need to use 8 speakers.
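The speaker counts quoted for first, second and third order (4, 6 and 8) are consistent with a rule of 2 × order + 2 speakers for a regular horizontal array. Taking that rule as an assumption, the sketch below lists evenly spaced speaker azimuths; the function name and the `first_speaker_deg` parameter are hypothetical.

```python
def regular_array_azimuths(order, first_speaker_deg=0.0):
    """Azimuths (degrees, anti-clockwise from front) of a regular ring.

    Assumes 2 * order + 2 speakers, matching the 4/6/8 counts quoted
    for orders 1/2/3. first_speaker_deg rotates the whole ring.
    """
    n_speakers = 2 * order + 2
    spacing = 360.0 / n_speakers
    return [first_speaker_deg + i * spacing for i in range(n_speakers)]

print(regular_array_azimuths(1, first_speaker_deg=45.0))  # quad layout
print(len(regular_array_azimuths(2)), len(regular_array_azimuths(3)))
```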

Third Order Layout

A large array, featuring a 25th Order regular horizontal layout.

Loudness

The Loudness war needs no introduction. Hyper-compression is a problem that has resulted in several industry-wide recommendations from professional bodies.

● European Broadcast Union R-128

● ITU-R BS 2054.2

LUFS!

R-128 specifies the use of Loudness Units relative to Full Scale (LUFS) in broadcasting. Material is normalised to -23 LUFS +/- 1 LU, conforming with the calibrated metering of ITU-R BS.1770. The recommendation applies not only in the box: levels are also calibrated at the speakers.
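Once a programme's integrated loudness has been measured with a BS.1770-style meter, the gain needed to reach the target is simple arithmetic. The sketch below assumes the measurement is already available (it does not implement the meter itself), and the function names are hypothetical.

```python
TARGET_LUFS = -23.0  # R-128 target level
TOLERANCE_LU = 1.0   # tolerance quoted on the slide

def normalisation_gain_db(measured_lufs, target_lufs=TARGET_LUFS):
    """Gain in dB that moves the measured loudness onto the target."""
    return target_lufs - measured_lufs

def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)

def within_tolerance(measured_lufs, target_lufs=TARGET_LUFS, tolerance_lu=TOLERANCE_LU):
    """True if the programme already sits inside the permitted window."""
    return abs(measured_lufs - target_lufs) <= tolerance_lu

# A hyper-compressed mix measured at -16 LUFS needs 7 dB of attenuation.
gain_db = normalisation_gain_db(-16.0)
print(gain_db, round(db_to_linear(gain_db), 3), within_tolerance(-16.0))
```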

Desired Listening figures

Taking into consideration most of the recommendations from the ITU, EBU and K-Scale, the desired level is between 83dB and 85dB SPL at the listening position. This is equivalent to a target of 83dB SPL @ 1 meter for each speaker in the array, irrespective of the number of speakers.

Metering and Placement

In adhering to R-128, compatibility with the Fletcher-Munson curves is preserved and the acoustical integrity of the material is kept. Placement of the speakers in regular form ensures stereophonic phantom images between speakers. Ambisonics allows for a higher tolerance in placement.

Research Project

A main research project is under way to compare the qualitative aspects of Ambisonics against other 'surround' formats. Technical accuracy has been shown to be a benefit of Ambisonics, but the data from listening tests so far are small in scale and controlled.

Production Tools

Guides are available for the DAW 'Reaper' using existing plug-in suites; one of the more popular is Dr Wiggins's 'WigWare'. PC-exclusive suites are available (Dave Malham's B-Dec), as is the Linux-exclusive AmbDec.

Delay Plug-In

● Allows delayed sounds to be ‘scattered’ throughout the sound field

● To create abstract spatial effects

● Or studio delay

Delay Plug-In – System Overview

● The processing system

Delay Plug-In – Features

● Offset control centres source direction

● Range specifies amount the source will deviate (modulate) from centre point

Delay Plug-In – Output

● Randomisation setting

● Sample and hold is used to hold position of signal for duration of playback

Delay Plug-In – Output

● Oscillation setting

● Multiple delays allow for time overlap of signals
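The randomisation behaviour described for the delay plug-in can be sketched roughly as follows. This is not the plug-in's actual code: it assumes that Range means a deviation of up to ± that many degrees around the Offset, that each delay tap draws one random azimuth which is then held for the whole playback of that tap, and that taps are panned with first-order horizontal gains. Every name in the block is hypothetical.

```python
import math
import random

def scattered_delay_taps(delay_times_s, offset_deg, range_deg, seed=None):
    """Assign each delay tap a held random azimuth and first-order pan gains.

    offset_deg: centre direction of the scatter (assumed meaning of Offset).
    range_deg:  maximum deviation either side of the centre (assumed).
    """
    rng = random.Random(seed)
    taps = []
    for delay in delay_times_s:
        # Sample once per tap and hold: the position does not move afterwards.
        azimuth = offset_deg + rng.uniform(-range_deg, range_deg)
        theta = math.radians(azimuth)
        gains = (1 / math.sqrt(2), math.cos(theta), math.sin(theta))  # W, X, Y
        taps.append({"delay_s": delay, "azimuth_deg": azimuth, "wxy_gains": gains})
    return taps

taps = scattered_delay_taps([0.125, 0.25, 0.375], offset_deg=90.0,
                            range_deg=30.0, seed=1)
for tap in taps:
    print(tap["delay_s"], round(tap["azimuth_deg"], 1))
```

An oscillating variant would replace the random draw with a periodic function of time; that is left out here because the slides do not describe its shape.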

Audio Demo

In the demo today we are using a custom Third Order Ambisonic Decoder attached to 8 speakers. Using Reaper to host the plug-ins, there is a mixture of up-mixed monophonic to Ambisonic material and First Order CoreSound recordings.

Conclusions

In this talk we have covered:

● Elementary description of auditory perception

● Ambisonic functions (FOA, HOA)

● Encoding and Decoding Process

● Issues associated with Ambisonics

● Production Standard Guidelines

Thank you for listening, any questions?
