

Multichannel Audio Technologies: Lecture 17

More About Ambisonics

A Little More Theory: Spherical Harmonics

In the last class we mentioned that the Ambisonic Soundfield microphone uses an omnidirectional component and three figure-of-eight components to record the sound field in three dimensions. These components are known in the Ambisonic world as spherical harmonic components. The physics behind spherical harmonics is well beyond the scope of this course, but we can think of them here as the directional responses of the Soundfield microphone in three dimensions (i.e. the directional responses of 4 coincident microphones). In fact, what we have been describing thus far (the B-Format) is a 1st order Ambisonic system: 1st order Ambisonics has 4 spherical harmonic components, namely W, X, Y and Z.

[Figure: polar patterns of the W, X, Y and Z components]

Now imagine that we add even more coincident microphones to describe the soundfield, except that this time the new microphones have the polar patterns shown below. For a 2nd order Ambisonic system we have a total of 9 spherical harmonic components (9 coincident microphones!). The five added spherical harmonics are called R, S, T, U and V.

This means that the higher the order of the system, the more accurately we can describe the sound field. Previously we had only been capturing the soundfield along the up-down, left-right and front-back axes. In a second order system we also describe components at all 45° angles in three dimensions. This means that when we sum the components to create our ‘virtual microphones’ on decoding, the virtual microphones formed are even more directive, as shown below.

The polar plot below shows how the ‘virtual microphone’ naturally becomes more directive the higher the order we go. (Notice the small side lobes that result from adding in the higher order spherical harmonics!)

In fact, on reproduction, if we go to higher orders, interesting things happen over the playback area: The sweet spot gets (theoretically) LARGER! Shown below is an example of sound field reconstruction as we increase in order (from order 1 to 5, left to right) adding even more complex spherical harmonics. We can start to see a sound field that looks like the reference plane wave over the entire audience area.

[Figure: sound field reconstruction for orders 1 to 5 (left to right), alongside the reference plane wave]

However, it is currently very difficult to create the desired polar patterns for second order (and higher) Ambisonic microphones, and thus Ambisonic recordings are almost always B-Format (1st order). Some panning plugins do work in second order, though, so there is software out there that allows you to render your mono or stereo sources into higher order Ambisonics.

Ambisonics Decoding for Speaker Layouts

A major advantage of Ambisonic surround sound is that recording and studio processing are decoupled from reproduction. B-Format production operates on the W, X, Y and Z channels, but these can be reproduced through any number of speakers. The more speakers that are used the better, as this gives a larger listening area and more stable sound localisation. Using more speakers also improves the illusion that the speakers have vanished; that is to say, the listeners hear a single seamless sound field.

It is important to understand that the Ambisonic-encoded signals do not feed any speakers directly, but carry the directional information of an entire soundfield. That means that they are completely independent of the loudspeaker layout chosen for decoding the soundfield. Ambisonics does not imply a particular number of loudspeakers for reproduction. There is, however, a minimum: at least as many loudspeakers as there are Ambisonic channels, and the number of channels is given by:

$N = (M+1)^2$ for 3D reproduction

where N is the number of Ambisonic channels, and M is the order of the system. For horizontal-only reproduction we end up with:

$N = 2M+1$ for 2D reproduction

In the case of B-Format, this means four speakers for a periphonic (i.e. three-dimensional) loudspeaker setup. For a horizontal-only setup, where the Z channel can be neglected, three speakers are sufficient. However, it is fine and even desirable to use more speakers than the number of Ambisonic channels, since this can improve the overall quality of sound localization.
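The two channel-count formulas are easy to tabulate. The short sketch below does so for the first few orders; the function name `ambisonic_channels` and its `periphonic` flag are illustrative choices made here, not names used by any particular tool.

```python
def ambisonic_channels(order: int, periphonic: bool = True) -> int:
    """Number of Ambisonic channels N for a system of order M.

    periphonic=True  -> 3D reproduction, N = (M + 1)^2
    periphonic=False -> horizontal-only, N = 2M + 1
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2 if periphonic else 2 * order + 1


# Tabulate orders 1 to 3: order, 3D channel count, 2D channel count.
for m in range(1, 4):
    print(m, ambisonic_channels(m), ambisonic_channels(m, periphonic=False))
```

For order 1 this gives the four B-Format channels (W, X, Y, Z) in 3D and three channels (W, X, Y) in the horizontal-only case, and for order 2 it gives the nine components mentioned earlier.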

An Ambisonic decoder does, however, place certain demands on the layout of the loudspeaker array: it is supposed to be as regular as possible. The more regular it is, the better the results in terms of localization of audio sources will be. In other words, the decoder will do its job as well as it can with the speaker layout offered to it.

UHJ and Mono/Stereo compatibility

In the mid-seventies, most of the then ‘surround’ systems were compatible, to a greater or lesser extent, with conventional stereo and mono. Yet B-Format consisted of sum-and-difference signals, like Blumlein M-S recording, which could not be listened to directly. Ambisonics needed a stereo/mono-compatible matrix as well, and UHJ was developed to provide it. Like Dolby Surround, it was a matrix system that had excellent mono/stereo compatibility. (UHJ doesn’t actually stand for anything; the name comes from an amalgamation of other matrix encoding systems!) In fact, the stereo was so good that it almost appeared ‘super-stereo’, with the image extending beyond the speakers. UHJ material was also perfectly mono compatible!

The essential idea was to encode the B-Format channels such that normal mono and stereo players could play back the audio, and dedicated Ambisonic decoders could extract the B-Format information. The equations that govern the matrix sum and difference operations for horizontal-only extraction are:

Encoding:
S = 0.9396926*W + 0.1855740*X
D = j(-0.3420201*W + 0.5098604*X) + 0.6554516*Y
(where j is a 90 degree phase shift)
Left = (S + D)/2.0
Right = (S - D)/2.0

Decoding:
S = (Left + Right)/2.0
D = (Left - Right)/2.0
W = 0.982*S + j*0.164*D
X = 0.419*S - j*0.828*D
Y = 0.763*D + j*0.385*S

Ambisonics quickly attracted the attention of audiophile label Nimbus Records and, apart from a few initial stereo and QS releases, every album ever released by Nimbus was recorded Ambisonically and released in UHJ.
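To get a feel for the UHJ matrix, the equations can be tried out on single-frequency signals. The sketch below is only an illustration under stated assumptions: each channel is represented as a complex phasor (amplitude and phase at one frequency), so that the 90 degree phase shift j becomes multiplication by Python's `1j`. Real audio would need a wideband 90 degree phase-shift network instead. The helper `encode_source` and its convention (W = 1/sqrt(2), X = cos, Y = sin, positive azimuth to the left) are assumptions added here for the example.

```python
import math


def uhj_encode(w, x, y):
    """Two-channel UHJ encode of horizontal B-Format phasors (W, X, Y)."""
    s = 0.9396926 * w + 0.1855740 * x
    d = 1j * (-0.3420201 * w + 0.5098604 * x) + 0.6554516 * y
    return (s + d) / 2.0, (s - d) / 2.0  # Left, Right


def uhj_decode(left, right):
    """Recover approximate horizontal B-Format phasors from UHJ stereo."""
    s = (left + right) / 2.0
    d = (left - right) / 2.0
    w = 0.982 * s + 1j * 0.164 * d
    x = 0.419 * s - 1j * 0.828 * d
    y = 0.763 * d + 1j * 0.385 * s
    return w, x, y


def encode_source(azimuth_deg):
    """Assumed convention: unit source, W = 1/sqrt(2), positive azimuth = left."""
    az = math.radians(azimuth_deg)
    return 1 / math.sqrt(2), math.cos(az), math.sin(az)


# Print the level of each stereo channel for sources around the listener.
for az in (0, 90, 180, -90):
    left, right = uhj_encode(*encode_source(az))
    print(f"{az:5d} deg  |L|={abs(left):.3f}  |R|={abs(right):.3f}")
```

A source straight ahead produces equal levels in Left and Right, and a source to the left produces a louder Left channel, which is what a stereo listener needs. Note also that Left + Right equals S, so the mono sum contains only W and X.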

Unfortunately, when DVD came around, there were several major drawbacks to promoting UHJ, or B-Format for that matter, as a standard surround-sound encoding scheme for DVD. Theoretically, 5.1 surround carried by a 6-channel digital medium does not need a decoder of any kind. You put six channels in one end and get them out of the other. To get surround sound information out of a UHJ or B-Format signal, however, you have to decode it. So there has to be a decoder somewhere, whereas in the 5.1 environment you don’t need one. This means that the transmission path is more expensive than for 5.1, and in a commercial environment where you want to minimize your costs, it won’t fly.

In fact, the situation is not as simple as that. DVD players did adopt decoders, namely DTS and Dolby AC-3, but not Ambisonics. So on the face of it, it looked as though UHJ and Ambisonics, as the solution to all the problems of surround sound, had been thwarted at the starting gate.

But not so fast…We may not be able to get UHJ into the equation anymore, but there is still room for Ambisonics. If you have to have a 5.1 signal path… well, give them a 5.1 signal path. Enter the G-Format!

G-Format for 5.1

In 1992 Gerzon and company tackled one of the major problems of Ambisonics: decoding for irregular speaker arrays. This meant that speaker layouts no longer had to be perfectly symmetrical, but could be user-defined and specified in the decoder (although, in practice, symmetrical arrays still work best!). These new B-Format decoders were named Vienna decoders, and Ambisonics could now be used with the ITU 5.1 speaker layout. This is known as the G-Format. It also meant that, since the Ambisonic system was now being used with a standard layout, decoding of the soundfield could be performed in the studio or mastering suite and the result encoded directly to DTS or Dolby Digital. No Ambisonics decoder is necessary at the consumer stage!

Using Ambisonic Recording Equipment

The Soundfield MKV microphone that we use in MMT is shown below. The picture on the right shows the tetrahedral array of capsules required for A-Format pickup. There is also a small decal on the front of the microphone that indicates the microphone’s forward orientation. If you do not notice this during recording setup, your soundfield will be rotated wrongly on decoding. Don’t worry though, it’s easy to rotate the soundfield to the correct position at the mixing stage.
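Rotating a B-Format soundfield about the vertical axis is just a 2D rotation of the X and Y channels; W and Z are untouched. The sketch below shows this for a single sample. It assumes the usual B-Format axes (X pointing forward, Y pointing left), and the function name `rotate_bformat` is a label chosen for this example rather than the name of any plugin control.

```python
import math


def rotate_bformat(w, x, y, z, angle_deg):
    """Rotate one B-Format sample about the vertical (Z) axis.

    Assumes X points forward and Y points left, so a positive angle
    turns the whole scene towards the left (anticlockwise from above).
    W (omni) and Z (up-down) are unaffected by a horizontal rotation.
    """
    a = math.radians(angle_deg)
    x_rot = x * math.cos(a) - y * math.sin(a)
    y_rot = x * math.sin(a) + y * math.cos(a)
    return w, x_rot, y_rot, z


# A source recorded straight ahead (X = 1, Y = 0), rotated 90 degrees,
# ends up on the left (X close to 0, Y close to 1).
print(rotate_bformat(0.7071, 1.0, 0.0, 0.0, 90))
```

In practice the same rotation is applied to every sample of the recording, so a microphone that was accidentally pointed, say, 30 degrees off can be corrected by rotating the scene back by the same amount.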

The microphone is connected to a directional processor/ B-Format encoder unit. On the front of the unit are controls for various parameters of the soundfield pickup:

Input Gain: Controls the sensitivity of the microphone over a 30dB range in 10dB steps. As the first gain stage, this is the most important in ensuring a non-hissy recording. Set this as high as possible without clipping.

Fine Gain: Should normally be operated as near to the zero mark as possible, but allows a +10dB to -10dB fine trim.

LB, LF, RB, RF Mute Switches: These refer to the four capsules on the tetrahedron. None of these should be muted!

Osc.test: The test oscillator produces a 1kHz tone at the B-Format record outputs, which should be used to align the multi-track record inputs. It is good practice to record these test tones prior to the recording session so as to have alignment references.

Invert: If the Soundfield microphone is to be hung upside down, use this switch to invert the polarity of the Z component.

Tape and Dub: These buttons reverse the roles of the B-Format inputs and outputs and should never be used during recording. (The facility allows B-Format to B-Format and/or B-Format to stereo dubbing with fine gain control and SoundField control if required.)

Azimuth: Complete electronic rotation of the microphone. Should be set to 0.

Elevation: Allows plus or minus 45° of continuous variation in the vertical alignment of the actual microphone. Should be set to 0.

Dominance: This is the frontal dominance control and allows us to focus on the front or rear of the soundfield as desired.

SoundField In Button: Routes the B-Format signal through the SoundField control section. If no SoundField correction is to be made, the section should be switched out of circuit to avoid accidental adjustment.

SoundField Rec Button: In normal or dub operation the SoundField controls are inserted into the B-Format signal after the B-Format record outputs and only affect the stereo output. Rec allows their insertion into the B-Format record outputs, enabling corrections to be made onto 4 track tape as well as the stereo output.

Metering: The 4 bargraph LED meters show the signal levels of the 4 components of the B-Format signal W, X, Y and Z as they appear either at the B-Format output or off tape at the B-Format input. In either case they show the effect on signal level of any SoundField adjustments and directly represent the signal level at the B-Format outputs.

Stereo Microphone controls: The polar pattern control is graduated from omni-directional (0) at the anti-clockwise end through cardioid at ‘12 o’clock’ to figure-of-eight at the clockwise end, and smoothly adjusts the polar pattern of the generated microphone(s) through all the intermediate sub-cardioid and hyper-cardioid positions. The capsule angle control is graduated from 0° to 180° and smoothly adjusts the angle between the generated microphones between the two extremes. With the control set to 0° the two outputs would be those of two microphones pointing in exactly the same direction from exactly the same point in space, and would therefore be identical mono signals.

Monitor: Controls the signal level of the stereo output to the headphone socket. With this microphone, please always take care to monitor what is coming from the headphones on the control unit!

Recording B-Format to Cubase

The following setup assumes use of the Motu Traveler.

1. Connect the Soundfield microphone to the B-Format processor unit. Be careful when connecting the cable to the microphone that you don’t force the threads.
2. Connect the record output XLR cable loom to the sound field multichannel input.
3. Set up 4 monophonic inputs in your VST connections.
4. Set up 4 monophonic tracks and label them W, X, Y and Z respectively.
5. Route inputs 1-4 to Cubase channels W, X, Y and Z respectively.
6. Arm the tracks and press record. Press the oscillator button on the Soundfield processor. Record 10 seconds of test tone.
7. Turn off the oscillator and proceed with program recording.

Decoding (Horizontal) B-Format in Cubase for 5.1.

1. Set up a 5.1 group channel and label it AmbisonicDec.
2. Pan the W channel to the Front Left Speaker.
3. Pan the X channel to the Front Right Speaker.
4. Pan the Y channel to the Centre Speaker. Mute the Z channel.
5. On the 5.1 group add in the plugin Visual Virtual Mic (from the Ambisonics folder). Note that we add this in on a group as opposed to a 5.1 output so that we may, if desired, mix our Ambisonic material with other 5.1 surround material.
6. Load the preset for 5.1.
7. Add in a 5.1 output bus in VST connections.
8. Route the AmbisonicDec group channel to the 5.1 output bus. You should now have a decoded Ambisonic soundfield.
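The plugin preset hides the arithmetic, so here is a rough idea of what a horizontal decoder does with W, X and Y. This is a minimal sketch under several assumptions: it uses a regular square of four speakers rather than the 5.1 layout, it derives each speaker feed from a virtual cardioid pointed at that speaker, and it assumes the convention W = 1/sqrt(2) for a unit source with positive azimuth to the left. It is not the algorithm used by Visual Virtual Mic or by a Vienna decoder, and the names `SPEAKERS_DEG`, `encode` and `decode` are made up for the example.

```python
import math

# Assumed regular square layout: speaker name -> azimuth in degrees (left positive).
SPEAKERS_DEG = {"FL": 45, "RL": 135, "RR": -135, "FR": -45}


def encode(sample, azimuth_deg):
    """Horizontal B-Format (W, X, Y) for a mono sample at the given azimuth."""
    az = math.radians(azimuth_deg)
    return sample / math.sqrt(2), sample * math.cos(az), sample * math.sin(az)


def decode(w, x, y, speakers=SPEAKERS_DEG):
    """One feed per speaker: a virtual cardioid aimed at that speaker."""
    feeds = {}
    for name, deg in speakers.items():
        az = math.radians(deg)
        feeds[name] = 0.5 * (math.sqrt(2) * w + x * math.cos(az) + y * math.sin(az))
    return feeds


# A source panned to front-left is loudest in FL and silent in the
# diagonally opposite speaker RR.
for name, gain in decode(*encode(1.0, 45)).items():
    print(f"{name}: {gain:.3f}")
```

The same B-Format signals could be fed to `decode` with a different dictionary of speaker angles, which is the sense in which the encoded material is independent of the loudspeaker layout.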

VVMicVST calculates up to 8 virtual microphone signals from a B-Format input. The controls are:

Elev: Elevation from -90 to +90 degrees, with positive pointing up.

Polar Pattern Display: Not really a control, but shows the current polar patterns of all virtual microphones. The number in the colored box in the lower right corner shows which microphone is currently being set by the sliders. The lighter colored parts of the patterns represent inverted response. The black polar pattern shows the sum of all the microphones, scaled down to fit.

Azi: Azimuth from -180 to +180 degrees, with positive pointing left.

Width: Separation of two virtual microphones, from 5 to 180 degrees.

Dir: Directionality. 0.0 for omni, 1.0 for cardioid, 2.0 for figure-of-eight.

Number of Outputs: Determines how many microphone signals will be calculated.

Current Output: Determines which microphone the Elev, Azi, Width and Dir sliders are currently affecting. The current output can also be changed up and down by using the square bracket keys '[' and ']'.

Link Pairs: If checked, the Elev, Azi, Width and Dir sliders affect pairs of microphones, like 1&2 or 3&4, together.

Link All Directivities: If checked, the Dir slider sets the directivity for all microphones.
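The Elev, Azi, Width and Dir controls map naturally onto the standard first-order virtual microphone equation, which mixes the pressure signal (W) with the velocity signals (X, Y, Z) resolved along the pointing direction. The sketch below is a common textbook formulation chosen so that 0.0, 1.0 and 2.0 give omni, cardioid and figure-of-eight; it is offered as an illustration only and is not taken from the VVMicVST source. The function names, and the assumption that W carries a 1/sqrt(2) gain, belong to this example.

```python
import math


def virtual_mic(w, x, y, z, azi_deg=0.0, elev_deg=0.0, directivity=1.0):
    """One virtual microphone output from a B-Format sample.

    directivity: 0.0 = omni, 1.0 = cardioid, 2.0 = figure-of-eight.
    Assumes W was recorded with the usual 1/sqrt(2) gain.
    """
    azi, elev = math.radians(azi_deg), math.radians(elev_deg)
    pressure = math.sqrt(2) * w
    velocity = (x * math.cos(azi) * math.cos(elev)
                + y * math.sin(azi) * math.cos(elev)
                + z * math.sin(elev))
    return 0.5 * ((2.0 - directivity) * pressure + directivity * velocity)


def stereo_pair(w, x, y, z, width_deg=90.0, directivity=1.0):
    """Two horizontal virtual mics separated by width_deg, centred on the front."""
    half = width_deg / 2.0
    left = virtual_mic(w, x, y, z, +half, 0.0, directivity)
    right = virtual_mic(w, x, y, z, -half, 0.0, directivity)
    return left, right


# A unit source directly behind the microphone: W = 1/sqrt(2), X = -1.
rear = (1 / math.sqrt(2), -1.0, 0.0, 0.0)
for d in (0.0, 1.0, 2.0):
    print(d, round(virtual_mic(*rear, directivity=d), 3))
```

For a source directly behind a forward-facing microphone, the omni setting passes it at full level, the cardioid rejects it, and the figure-of-eight picks it up with inverted polarity, which corresponds to the lighter colored lobe in the polar pattern display.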

Panning Mono in the Ambisonics Scene

In order to pan a monophonic source in an Ambisonics soundfield, the source must be converted to B-Format. Two plugins which can do this are Panorama (works in Cubase 4) and BPan (not supported in Cubase 4, but it does work in Nuendo 2 and is a better panner in my opinion). Shown below are some steps for panning in Nuendo, but the procedure is exactly the same for Cubase.

1. Set up a 5.1 decoder group as before. Name this Ambidec.
2. Set up a 5.1 panning group just for your monophonic source (i.e. if your mono file is called Speech, then call this track SpeechPan). Load in the plugin BPan. Remember, we can’t put this plugin directly onto a mono channel, as it generates B-Format signals and needs a channel with at least 4 outputs. Consistently keeping with 5.1 groups here is a good idea in my opinion, since you are sure the channels will map correctly (we are really only using the first 4 channels of the 5.1 group).
3. Route the panning group to the Ambidec decoder group.
4. On the mono channel in the Cubase Surround Panner, pan the monophonic audio completely to the left channel. This is where BPan takes its input from.
5. Pan your audio to the desired position in BPan.
6. If we are using multiple mono sources, then a new panning group should be created for each source.

[Figure: signal flow from Mono Track to Panning Group to Decoder Group]
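What a B-Format panner does to the mono signal can be sketched with the standard first-order encoding equations: the source is weighted by the W, X, Y and Z polar patterns evaluated at the chosen direction. The code below is a minimal illustration under the usual B-Format convention (W attenuated by 1/sqrt(2), azimuth positive to the left, elevation positive upwards). It is not a description of the internals of BPan or Panorama, and the function name `pan_to_bformat` is invented for the example.

```python
import math


def pan_to_bformat(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-Format (W, X, Y, Z)."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    w = sample / math.sqrt(2)                   # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)    # front-back figure-of-eight
    y = sample * math.sin(az) * math.cos(el)    # left-right figure-of-eight
    z = sample * math.sin(el)                   # up-down figure-of-eight
    return w, x, y, z


# Pan a short mono snippet hard left: each sample becomes a (W, X, Y, Z) tuple.
mono = [0.0, 0.5, 1.0, 0.5]
bformat = [pan_to_bformat(s, azimuth_deg=90) for s in mono]
for frame in bformat:
    print(tuple(round(c, 3) for c in frame))
```

This also shows why the panner needs a track with at least four channels: every mono input sample turns into four output samples, one per B-Format component.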