

AES 130th Convention Program

Audio Engineering Society 130th Convention Program, 2011 Spring 1

The AES has launched a new opportunity to recognize student members who author technical papers. The Student Paper Award Competition is based on the preprint manuscripts accepted for the AES convention. A number of student-authored papers were nominated.

The excellent quality of the submissions has made the selection process both challenging and exhilarating. The award-winning student paper will be honored during the Convention, and the student-authored manuscript will be considered for publication in a timely manner for the Journal of the Audio Engineering Society.

Nominees for the Student Paper Award were required to meet the following qualifications:

(a) The paper was accepted for presentation at the AES 130th Convention.

(b) The first author was a student when the work was conducted and the manuscript prepared.

(c) The student author’s affiliation listed in the manuscript is an accredited educational institution.

(d) The student will deliver the lecture or poster presentation at the Convention.

* * * * *

The Winner of the 130th AES Convention Student Paper Award is:

Design of a Compact Cylindrical Loudspeaker Array for Spatial Sound Reproduction — Mihailo Kolundzija, Christof Faller, Martin Vetterli, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Convention Paper 8336

To be presented on Friday, May 13 in Session P4 — Multichannel and Spatial Signal Processing, Part 1

* * * * *

Friday, May 13    09:00    Room Saint Julien
Technical Committee Meeting on Transmission and Broadcasting

Session P1    Friday, May 13    09:30 – 12:30    Room 4

SPEECH AND HEARING

Chair: Bob Schulein, Asius Technologies LLC, Longmont, CO, USA

09:30

P1-1 The Evolution of the Speech Transmission Index — Herman J. M. Steeneken,1,2 Sander J. van Wijngaarden,2 Jan A. Verhave2

1 TNO Human Factors (retired), Soesterberg, The Netherlands
2 Embedded Acoustics, Delft, The Netherlands

This year, the Speech Transmission Index celebrates its 40th anniversary. While the first measuring device built in the 1970s could barely fit inside a car, inexpensive pocket-size STI measuring solutions are now available to the world. Meanwhile, the STI method has continually evolved in order to deal with an increasing array of measuring challenges. This paper investigates how the STI kept up with these challenges and analyzes possible room for further improvement. A roadmap for further development of the STI is also proposed.
Convention Paper 8315

10:00

P1-2 Prosody Generation Module for Macedonian Text-to-Speech Synthesis — Branislav Gerazov, Zoran Ivanovski, Faculty of Electrical Engineering and Information Technologies, Skopje, Macedonia

The paper presents a fully functional prosody generation module developed for Macedonian text-to-speech (TTS) synthesis. The module is based on research into prosody generation modules in high-end TTS synthesis systems, previous experience with prosody in Macedonian TTS, as well as original research on prosody carried out by the authors. The paper starts with an overview of the basic tasks, problems, and solutions in prosody generation modules. It then gives a detailed account of the workings of the developed module. The module first segments the input speech into intonation phrases and determines their intonation type. Next, it generates durations for each of the units that will be used to synthesize the speech output. It then determines the positions of the lexical stresses and modifies these units’ durations. After determining the intonation phrase’s pitch accent location, it generates an adequate pitch contour and calculates the pitch targets needed for unit modification. The synthesis module uses this data to generate prosody in the output speech. Generated prosody patterns in the output speech are of satisfactory quality for arbitrary text input. The presented results are of significant value for Macedonian TTS and can be applied to other under-represented languages.
Convention Paper 8316

AES 130th Convention Program
May 13 – 16, 2011

Novotel London West, London, UK


10:30

P1-3 The Influence of Transmission Channel on the Admissibility of Speech Sample for Forensic Speaker Identification — Andrey Barinov, Speech Technology Center Ltd., St. Petersburg, Russia

This paper is an extension of and addition to papers previously published and presented during the AES 39th Conference and AES 129th Convention, regarding voice sample quality requirements and compensation for the influence of transmission channels for further forensic or automatic speaker identification. In this paper we provide an analysis of different types of transmission channels, such as land line, GSM, radio, and VoIP. We analyze the important parameters of voice samples obtained from these channels and compare the influence of different channels on important speaker identification voice biometric features. At the end of the paper, some conclusions are provided regarding the circumstances under which a particular recording can be accepted or rejected for forensic speaker identification.
Convention Paper 8317

11:00

P1-4 Optimizing the Acoustic and Intelligibility Performance of Assistive Audio Systems and Program Relay for the Hard of Hearing and Other Users — Peter Mapp, Peter Mapp Associates, Colchester, UK

Around 10% of the general population suffer from a noticeable degree of hearing loss and would benefit from some form of hearing assistance or deaf aid. DDA legislation and requirements mean that many more hearing assistive systems are being installed—yet there is continuing evidence to suggest that many of these systems fail to perform adequately and provide the benefit expected. This paper reports on the results of acoustic performance testing of a number of trial ALS systems. The use of STIPA as a practical measure for assessing the potential intelligibility of ALS systems is discussed. The ALS microphone type, distance, and angular location from the target acoustic source are shown to have a significant effect on the resultant potential intelligibility performance. The effects of typical ALS signal processing have been investigated and are shown to have a small but significant effect on the STIPA result. The requirements for a suitable acoustic test source to mimic a human talker are discussed, as is the need to adequately assess the effects of both reverberation and noise. The findings of this paper are also relevant to the installation and testing of educational sound field systems as well as boardroom and conference room systems.
Convention Paper 8318

11:30

P1-5 Sound Reproduction within a Closed Ear Canal: Acoustical and Physiological Effects — Samuel Gido,1,2 Robert B. Schulein,1 Stephen Ambrose1

1 Asius Technologies LLC, Longmont, CO, USA
2 University of Massachusetts, Amherst, MA, USA

When a sound-producing device such as an insert earphone or a hearing aid is sealed in the ear canal, the fact that only a tiny segment of the sound wave can exist in this small volume at any given instant produces an oscillation of the static pressure in the ear canal. This effect can greatly boost the SPL in the ear canal, especially at low frequencies, a phenomenon that we call Trapped Volume Insertion Gain (TVIG). In this study the TVIG has been found, by numerical modeling as well as direct measurements using a Zwislocki coupler and the ear of a human subject, to be as much as 50 dB greater than sound pressures typically generated while listening to sounds in an open environment. Even at moderate listening volumes, the TVIG can increase the low-frequency SPL in the ear canal to levels where it produces excursions of the tympanic membrane that are 100 to 1000 times greater than in normal open-ear hearing. Additionally, the high SPL at low frequencies in the trapped volume of the ear canal can easily exceed the threshold necessary to trigger the stapedius reflex, a stiffening response of the middle ear, which reduces its sensitivity and may lead to audio fatigue. The addition of a compliant membrane-covered vent in the sound tube of an insert ear tip was found to reduce the TVIG by up to 20 dB, such that the stapedius reflex would likely not be triggered.
Convention Paper 8319

12:00

P1-6 Can We Compare the Sound Quality of Noise Reduction between Hearing Aids? A Method to Level the Ground between Devices — Rolph Houben, Inge Brons, Wouter A. Dreschler, Academic Medical Center, Amsterdam, The Netherlands

This paper proposes the application of an equalization filter to remove unwanted differences in frequency response between recordings from different hearing aids. The filter makes it possible to compare the perceptual effects (such as user preference) of a specific signal processing feature (e.g., noise reduction) between different hearing aids, without the dominant influence of differences in their frequency responses. Both an objective quality measure (PESQ) and a listening experiment have shown that the filter was able to “level the ground” between the devices included. The potential application of the inverse filter is to use it on recordings from hearing aids so that the noise reduction of different devices can be compared directly. This allows one to determine whether users prefer a certain noise reduction over another, which could lead to improved rehabilitation of hearing-impaired listeners.
Convention Paper 8320
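As a rough sketch of how such a “leveling” equalization filter could be constructed (an illustrative reconstruction, not the authors’ method; the FFT size, zero-phase FIR design, and regularization floor are assumptions), one might estimate each device’s spectral envelope from recordings of the same stimulus and filter one recording with the magnitude ratio:

```python
import numpy as np

def leveling_filter(ref_a, ref_b, n_fft=4096):
    """Design an FIR that maps recording ref_b's spectral envelope onto ref_a's.

    ref_a, ref_b: recordings of the SAME stimulus through two devices.
    n_fft and the 1e-9 regularization floor are illustrative choices.
    """
    a_mag = np.abs(np.fft.rfft(ref_a, n_fft))
    b_mag = np.abs(np.fft.rfft(ref_b, n_fft))
    correction = a_mag / np.maximum(b_mag, 1e-9)   # desired magnitude ratio
    # A real, even spectrum yields a symmetric (zero-phase) impulse response;
    # rolling by n_fft // 2 turns it into a causal linear-phase FIR.
    h = np.roll(np.fft.irfft(correction, n_fft), n_fft // 2)
    return h

def apply_filter(x, h):
    """Filter a recording with the designed FIR (same-length output)."""
    return np.convolve(x, h, mode="same")
```

With the correction applied to one device’s recordings, remaining audible differences between devices should be dominated by the processing feature under test rather than by their frequency responses.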

Session P2    Friday, May 13    09:30 – 12:30    Room 1

LOUDSPEAKERS

Chair: John VanderKooy, University of Waterloo, Waterloo, Ontario, Canada & BW Group, Steyning, UK

Technical Program


09:30

P2-1 Vibrations in the Loudspeaker Enclosure Evaluated by Hot Wire Anemometry and Laser Interferometry — Danijel Djurek,1 Nazif Demoli2

1 AVAC (Alessandro Volta Applied Ceramics) Laboratory for Nonlinear Dynamics, Zlatar-Bistrica, Croatia
2 Institute of Physics, Zagreb, Croatia

Vibration states of the loudspeaker enclosure were examined by laser interferometry and hot wire anemometry. According to simple expressions, air pressure in the enclosure is independent of air density. This statement has been tested by the use of gases with very different densities (air, He, SF6), and interferometric data show a strong dependence of the vibration states of the enclosure walls on density when the driving frequency increases from 500 Hz up to 1 kHz. The main reason for the deviation resides in the imaginary part of the Morse impedance exerted by the vibrating gas within the enclosure.
Convention Paper 8321

10:00

P2-2 Losses and Coupling in Long Multi-Wire Loudspeaker Cables — Xavier Meynial, Guennolé Gapihan, Active Audio, Saint-Herblain, France

It is quite frequent in PA systems to use very long loudspeaker cables. These PA systems sometimes carry several audio signals in adjacent wires of the same cable, or several cables may be neighboring over very long distances, such as several hundreds of meters. This paper deals with losses and coupling in these long cables. It is shown that even though losses and coupling may introduce significant crosstalk between wires and affect the amplifier loads, it is possible to use very long loudspeaker cables to connect line arrays in PA systems. Coupling between cables is also investigated, and strategies for reducing losses and coupling are presented.
Convention Paper 8322

10:30

P2-3 Subwoofers in Rooms: Modal Analysis for Loudspeaker Placement — Juha Backman, Nokia Corporation, Espoo, Finland and Aalto University, Espoo, Finland

Use of multiple subwoofers in a room is known to help in reducing the variation of response both as a function of frequency and as a function of place, but simple geometry-based placement rules guarantee good results only in symmetrical cases. The paper discusses the use of experimental modal analysis and numerical optimization based on modal behavior to determine the optimal placement of single or multiple subwoofers in rooms with arbitrary geometry and surface properties.
Convention Paper 8323

11:00

P2-4 Losses in Loudspeaker Enclosures — Claus Futtrup, Scan Speak A/S, Videbæk, Denmark

In recent papers a lumped parameter model that can accurately simulate the impedance of conventional electro-dynamic transducers has been presented. The new model includes frequency-dependent damping, which calls into question traditional engineering practices in simulations of loudspeaker enclosures and, in particular, associated losses. In this paper the consequences of frequency-dependent damping are evaluated to aid the development of simulations and models of loudspeaker enclosures.
Convention Paper 8324

11:30

P2-5 Comparison of Measurement Methods for the Equalization of Loudspeaker Panels Based on Bending Wave Radiation — Lars Hörchens, Diemer de Vries, Delft University of Technology, Delft, The Netherlands

Spatially extended panel loudspeakers based on bending wave radiation, such as distributed mode loudspeakers or multi-actuator panels, exhibit a complex radiation pattern with frequency-dependent directivity characteristics. This paper seeks to determine the kind, position, and number of measurements needed to obtain an average radiation spectrum enabling efficient equalization. To this end, three approaches are compared: wave field extrapolation of the panel surface normal velocity, extrapolation of a near-field pressure measurement, and actual in-situ measurements at a number of random positions inside the listening space. The choice of the positions and the required number of measurements are discussed. Measurements taken on a multi-actuator panel are used to compare the different approaches and present numerical results.
Convention Paper 8325

12:00

P2-6 The Effect of Finite-Sized Baffles on Mobile Device Personal Audio — Jordan Cheer,1 Stephen J. Elliott,1 Youngtae Kim,2 Jung-Woo Choi2

1 University of Southampton, Southampton, UK
2 Samsung Electronics Co. Ltd., Korea

To reduce the annoyance from the use of loudspeakers on mobile devices, previous work has investigated the use of acoustic contrast control to optimize the performance of small arrays of loudspeakers. These investigations have assumed that the baffle dimensions are negligible so that the loudspeakers are omnidirectional, which is reasonable at low frequencies; however, in practice the effect of a finite-sized baffle on the optimized performance is important at higher frequencies. This paper reports the results of using a finite-element model of a two-source array, positioned on a mobile-phone-sized baffle, to investigate the influence of the baffle on the predicted array performance. The baffle is shown to reduce the performance of the array at frequencies greater than around 1 kHz, but the directivity of the individual drivers then enhances performance at these higher frequencies.
Convention Paper 8326


Friday, May 13    10:00    Room Saint Julien
Technical Committee Meeting on Audio Recording and Mastering Systems

Workshop 1    Friday, May 13    10:30 – 12:30    Room 5

MUSIC AND SEMANTIC WEB

Co-Chairs: David De Roure, University of Oxford
           Yves Raimond, BBC

Panelists: David Bretherton, Southampton University
           Gregg Kellogg, Connected Media Experience
           Alexandre Passant, Seevl
           Evan Stein, Decibel

This workshop will provide an inter-disciplinary forum to explore and promote the combination of music and Semantic Web technologies—specifically, to develop an understanding of how Semantic Web technologies can contribute to the growth of music-related data on the Web, as well as applications of this data. Activities in this area are becoming well established, driven by projects in both research and industry, and they span personal applications together with all aspects of the creation, management, discovery, delivery, and analysis of musical content. The ambitions of this inaugural workshop are to bring together this emerging and energetic community, to share information and practice, and especially to articulate the research agenda in Music and the Semantic Web. Through this we aim to facilitate innovation, inform Semantic Web research, and establish the basis and momentum for future events.

Session P3    Friday, May 13    11:00 – 12:30    Foyer

POSTERS: SOUND FIELD ANALYSIS

11:00

P3-1 Localization of Multiple Speech Sources Using Distributed Microphones — Maximo Cobos, Amparo Marti, Jose J. Lopez, Universidad Politecnica Valencia, Valencia, Spain

Source localization is an important task in many speech processing systems. There are many microphone array techniques intended to provide accurate source localization, but their performance is severely affected by noise and reverberation. The Steered Response Power Phase Transform (SRP-PHAT) algorithm has been shown to perform very robustly in adverse acoustic environments; however, its computational cost can be an issue. Recently, the authors presented a modified version of the SRP-PHAT algorithm that improves its performance without adding significant cost. However, the performance of the modified algorithm has only been analyzed in single-source localization tasks. This paper explores further the possibilities of this localization method by considering multiple simultaneously active speech sources. Experiments considering different numbers of sources and acoustic environments are presented using simulations and real data.
Convention Paper 8327

11:00

P3-2 Detection of “Solo Intervals” in Multiple Microphone, Multiple Source Audio Applications — Elias Kokkinis,1 Joshua Reiss,2 John Mourjopoulos1

1 University of Patras, Patras, Greece
2 Queen Mary University of London, London, UK

In this paper a simple and effective method is proposed to detect time intervals where only a single source is active (solo intervals) for multiple microphone, multiple source settings commonly encountered in audio applications, such as live sound reinforcement. The proposed method is based on the short-term energy ratios between all available microphone signals, and a single threshold value is used to determine if and which source is solely active. The method is computationally efficient, and results indicate that it is accurate and fairly robust with respect to reverberation time and amount of source interference.
Convention Paper 8328
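The energy-ratio test described above can be sketched roughly as follows (an illustrative reconstruction, not the authors’ implementation; the frame length and threshold are assumed values):

```python
import numpy as np

def detect_solo_intervals(mic_signals, frame_len=1024, threshold=10.0):
    """Flag frames where a single source dominates all microphones.

    mic_signals: 2-D array, shape (n_mics, n_samples), one row per mic.
    Returns a list of (frame_index, dominant_mic) pairs. The frame
    length and ratio threshold are illustrative assumptions.
    """
    n_mics, n_samples = mic_signals.shape
    solo = []
    for start in range(0, n_samples - frame_len + 1, frame_len):
        frame = mic_signals[:, start:start + frame_len]
        energies = np.sum(frame ** 2, axis=1) + 1e-12  # avoid division by zero
        top = np.argmax(energies)
        others = np.delete(energies, top)
        # "Solo" if the loudest mic exceeds every other mic by the threshold
        if np.all(energies[top] / others > threshold):
            solo.append((start // frame_len, int(top)))
    return solo
```

The sketch assumes the close-mic setup the abstract implies: each source is loudest in its own microphone, so single-source activity shows up as one channel’s short-term energy dominating all the others.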

11:00

P3-3 A Real-Time Sound Source Localization and Enhancement System Using Distributed Microphones — Amparo Marti, Maximo Cobos, Jose J. Lopez, Universidad Politecnica Valencia, Valencia, Spain

The Steered Response Power - Phase Transform (SRP-PHAT) algorithm has been shown to be one of the most robust sound source localization approaches operating in noisy and reverberant environments. A recently proposed modified SRP-PHAT algorithm has been shown to provide robust localization performance in indoor environments without the need for a very fine spatial grid, thus reducing the computational cost required in a practical implementation. Sound source localization methods are commonly employed in many sound processing applications. In our case, we use the modified SRP-PHAT functional for improving noisy speech signals. The estimated position of the speaker is used to calculate the time delay for each microphone, and the speech is then enhanced by correctly aligning the microphone signals.
Convention Paper 8329

11:00

P3-4 Binaural Moving Sound Source Localization by Joint Estimation of ITD and ILD — Cheng Zhou, Ruimin Hu, Weiping Tu, Xiaochen Wang, Li Gao, Wuhan University, Wuhan, Hubei, China

Spatial cues ITD and ILD, which provide sound localization information, play a very important role in the binaural localization system. An efficient improvement of binaural moving sound source localization by joint estimation of ITD and ILD based on the Doppler effect is investigated. By removing the influence of the Doppler effect, results show that the proposed binaural moving sound source localization method achieves 0.3% (velocity = 1 m/s), 5.7% (velocity = 5 m/s), and 10.5% (velocity = 10 m/s) accuracy improvement in silent conditions. The method becomes more effective as the sound source moves faster.
Convention Paper 8330 [Paper not presented but is available for purchase]

11:00

P3-5 Perceived Level of Late Reverberation in Speech and Music — Jouni Paulus,1 Christian Uhle,1 Jürgen Herre1,2

1 Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
2 International Audio Laboratories Erlangen, Erlangen, Germany

This paper presents experimental investigations on the perceived level of running reverberation in various types of monophonic audio signals. The design and results of three listening tests are discussed. The tests focus on the influence of the input material, the direct-to-reverberation ratio, and the reverberation time, using artificially generated impulse responses for simulating the late reverberation. Furthermore, a comparison between mono and stereo reverberation is conducted. It can be observed that with equal mixing levels, the input material and the shape of the reverberation tail have a prominent effect on the perceived level. The results suggest that mono and stereo reverberation with identical reverberation times and mixing ratios are perceived as having equal level regardless of the material.
Convention Paper 8331

11:00

P3-6 Reverberation Enhancement in a Modal Sound Field — Hugh Hopper, David Thompson, Keith Holland, University of Southampton, Southampton, UK

The reverberation time of a room can be increased by using a reverberation enhancement system. These electronic systems have generally been installed in large rooms, where diffuse field assumptions are sufficiently accurate. Novel applications of the technology can be found by applying it to smaller spaces where isolated modal resonances will dominate at low frequency. An analysis of a multichannel feedback system within a rectangular enclosure is presented. To assess the performance of this system, metrics are defined based on the spatial and frequency variations of a diffuse field. These metrics are then used to optimize the parameters of the system using a genetic algorithm. It is shown that optimization significantly improves the performance of the system.
Convention Paper 8332

11:00

P3-7 An Advanced Implementation of a Digital Artificial Reverberator — Andrea Primavera, Stefania Cecchi, Laura Romoli, Paolo Peretti, Francesco Piazza, Universita Politecnica delle Marche, Ancona, Italy

Reverberation is a well-known effect particularly important for listening to recorded and live music. In this paper we propose a real implementation of an enhanced approach for a digital artificial reverberator. Starting from a preliminary analysis of the mixing time, the selected impulse response is decomposed in the time domain considering the early and late reflections. A short FIR filter is then used to synthesize the first part of the impulse response, and a generalized recursive structure is used to synthesize the late reflections, exploiting a minimization criterion in the cepstral domain. Several results are reported taking into consideration different real impulse responses and comparing the results with those obtained with previous techniques in terms of computational complexity and reverberation quality.
Convention Paper 8333

11:00

P3-8 Evaluation of Spatial Impression Comparing 2-Channel Stereo, 5-Channel Surround, and 7-Channel Surround with Height Channels for 3-D Imagery — Toru Kamekawa,1 Atsushi Marui,1 Toshikiko Date,2 Masaaki Enatsu3

1 Tokyo University of the Arts, Tokyo, Japan
2 AVC Networks Company, Panasonic Corporation, Osaka, Japan
3 marimoRECORDS Inc., Tokyo, Japan

Three-dimensional (3-D) imagery is now spreading widely as one of the next visual formats for Blu-ray and other future media. Since more audio channels are available with future media, the authors aim to find a suitable sound format for 3-D imagery. A pairwise comparison test was carried out comparing combinations of 3-D and 2-D imagery with 2-channel stereo, 5-channel surround, and 7-channel surround sound (5-channel surround plus 2 height channels), asking for better depth sense and better match between visual and audio images. The results show that 3-D imagery with 7-channel surround gives the highest sense of depth and match of visual and audio images.
Convention Paper 8334

Tutorial 1    Friday, May 13    11:00 – 12:30    Room 2

WHAT'S YOUR EQ IQ?

Presenter: Alex Case, University of Massachusetts at Lowell, Lowell, MA, USA

Equalization—we reach for it track by track and mix by mix, as often as any other effect. Easy enough, at first. EQ becomes more intuitive when you have a deep understanding of the parameters, types, and technologies used, plus deep knowledge of the spectral properties and signatures of the most common pop and rock instruments. Alex Case shares his approach for applying EQ and strategies for its use: fixing frequency problems, fitting the spectral pieces together, enhancing flattering features, and more.

Friday, May 13    11:00    Room Saint Julien
Technical Committee Meeting on Automotive Audio


Special Event
AWARDS PRESENTATION AND KEYNOTE ADDRESS
Friday, May 13, 12:45 – 13:30
Room 1

Opening Remarks:
• Executive Director Roger Furness
• President Jim Kaiser
• Convention Chair Peter Mapp

Program:
• AES Awards Presentation
• Introduction of Keynote Speaker by Convention Chair Peter Mapp
• Keynote Address by Trevor Cox

Awards Presentation
Please join us as the AES presents Special Awards to those who have made outstanding contributions to the Society in such areas as research, scholarship, and publications, as well as other accomplishments that have contributed to the enhancement of our industry. The awardees are:

Board of Governors Award
• Jan Berg
• Kimio Hamasaki
• Toru Kamekawa
• Michael Kelly

Fellowship Award
• Andrzej Brzoska
• Christof Faller
• Masami “Sam” Toyoshima

Bronze Medal Award
• C. Robin Caine
• Yoshizo “Steve” Sohma

Keynote Speaker

This year’s Keynote Speaker is Trevor Cox. Trevor Cox is Professor of Acoustic Engineering at the University of Salford, a Senior Media Fellow funded by the Engineering and Physical Sciences Research Council, and President of the Institute of Acoustics (IOA).

One major strand of his research is room acoustics for intelligible speech and quality music production and reproduction. Cox’s diffuser designs can be found in rooms around the world. He has co-authored a research book entitled Acoustic Absorbers and Diffusers. He was awarded the IOA’s Tyndall Medal in 2004.

Cox has a long track record of communicating acoustic engineering to the public and has been involved in engagement projects worth over £1M. He was given the IOA award for promoting acoustics to the public in 2009. He has developed and presented science shows to 15,000 pupils, including performing at the Royal Albert Hall, Purcell Rooms, and the Royal Institution. Cox has presented fifteen documentaries for BBC radio, including Life’s Soundtrack, Save Our Sounds, and Science vs the Strad. His latest BBC Radio 4 program, due for broadcast in June, is titled “Green Ears.” The title of Cox’s keynote speech is “Sounding Places, Past and Present: How Spaces Affect the Sounds We Make and Hear.”

Architectural acoustics affect the perceived quality of live and recorded performances. Caves with ringing speleothems show that mankind has been exploiting venues with distinctive acoustics since the Upper Palaeolithic era. This talk will explore unconventional places, both ancient and modern, with remarkable acoustics: a spherical room that allows you to whisper into one of your own ears; a water cistern with a reverberation time of over forty seconds; and the whispering arches of Grand Central in New York. What do these spaces say about our hearing abilities and the importance of sound to our lives?

Session P4 Friday, May 13 14:00 – 17:00 Room 1

MULTICHANNEL AND SPATIAL SIGNAL PROCESSING, PART 1

Chair: Francis Rumsey, Logophon, Ltd., Witney, UK

14:00

P4-1 3-D-Sound in Car Compartments Based on Loudspeaker Reproduction Using Crosstalk Cancellation—Andre Lundkvist, Arne Nykänen, Roger Johnsson, Luleå University of Technology, Luleå, Sweden

One way to enhance driving safety is to use signal sounds. Driver attention may further be improved by placing sounds in a 3-D space, using binaural synthesis. For correct loudspeaker reproduction of binaural signals, crosstalk between the channels needs to be cancelled out. In this study, a crosstalk cancellation algorithm was developed and tested. The algorithm was applied in a car compartment, and three loudspeaker positions were compared. A listening test was performed to determine the subjects’ ability to correctly localize sounds. It was shown that loudspeakers placed behind the listener correctly reproduced sound sources in the back hemisphere. Loudspeakers located in front of and above the listener gave a high number of front/back confusions for all angles.
Convention Paper 8335
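The abstract does not give the authors' algorithm; as background, a common form of crosstalk cancellation inverts the 2×2 matrix of loudspeaker-to-ear transfer functions at each frequency bin, with regularization to keep the filters realizable. A minimal sketch under that assumption (the plant data `H` is hypothetical):

```python
import numpy as np

def crosstalk_canceller(H, beta=0.005):
    """Per-frequency regularized inversion of the 2x2 plant matrix.

    H: array of shape (n_freqs, 2, 2), acoustic transfer functions from
    the two loudspeakers to the two ears (hypothetical measured data).
    Returns C of shape (n_freqs, 2, 2) such that H @ C approximates the
    identity at each bin for small beta.
    """
    C = np.empty_like(H, dtype=complex)
    I = np.eye(2)
    for k in range(H.shape[0]):
        Hk = H[k]
        # Tikhonov-regularized inverse: (H^H H + beta I)^-1 H^H
        C[k] = np.linalg.solve(Hk.conj().T @ Hk + beta * I, Hk.conj().T)
    return C
```

With `beta` near zero this reduces to a plain matrix inverse; in practice a nonzero `beta` trades cancellation depth for robustness to head movement and measurement error.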

14:30

P4-2 Design of a Compact Cylindrical Loudspeaker Array for Spatial Sound Reproduction—Mihailo Kolundzija,1 Christof Faller,1 Martin Vetterli1,2
1Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
2University of California at Berkeley, Berkeley, CA, USA

Building acoustic beamformers is a problem whose solution is hindered by the wide-band nature of audible sound. In order to achieve a consistent directional response over a wide range of frequencies, a conventional acoustic beamformer needs a high number of discrete loudspeakers and must be large enough to achieve the desired low-frequency performance. The acoustic beamformer design described in this paper uses measurement-based optimal beamforming for loudspeakers mounted on a rigid cylindrical baffle. Super-directional beamforming enables the desired directivity to be achieved with multiple loudspeakers at low frequencies. High frequencies are reproduced with a single loudspeaker, whose highly directional reproduction—due to the cylindrical baffle—matches the design goals. In addition to the beamformer filter design procedure, it is shown how such a loudspeaker array can be used for spatial sound reproduction.
Convention Paper 8336
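Measurement-based optimal beamforming of the kind mentioned here is often posed, one frequency at a time, as a regularized least-squares fit of driver weights to a desired directional response at a set of control points. A sketch under that assumption (not the paper's exact formulation; `G` and `d` are hypothetical):

```python
import numpy as np

def beamformer_weights(G, d, lam=1e-3):
    """Regularized least-squares driver weights for one frequency.

    G:   (n_points, n_drivers) measured transfer functions from each
         loudspeaker to each control point (hypothetical data).
    d:   (n_points,) desired directional response at those points.
    lam: Tikhonov regularization, limits super-directional gain.
    Returns w minimizing ||G w - d||^2 + lam ||w||^2.
    """
    n_drv = G.shape[1]
    A = G.conj().T @ G + lam * np.eye(n_drv)
    return np.linalg.solve(A, G.conj().T @ d)
```

Larger `lam` tames the white-noise gain that makes super-directional low-frequency beamforming fragile, at the cost of a less exact fit to the target pattern.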

15:00

P4-3 A New Multichannel Microphone Technique for Effective Perspective Control—Hyunkook Lee, University of Huddersfield, Huddersfield, UK

This paper introduces a new multichannel microphone technique that was designed to produce multiple listener perspectives. A listener perspective paradigm and the related spatial attributes are also proposed for the evaluation of spatial quality in acoustic recording and reproduction. The proposed microphone array employs five coincident microphone pairs, which can be transformed into virtual microphones with different polar patterns and directions depending on the mixing ratio. The results of interchannel correlation and informal subjective evaluations suggest that the proposed technique is able to offer effective control over various spatial attributes.
Convention Paper 8337
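As background to the coincident-pair idea: mixing a coincident omnidirectional and figure-of-eight signal yields a virtual first-order microphone whose polar pattern depends on the mixing ratio. A sketch of that principle only (illustrative; not Lee's actual array or mixing law):

```python
import numpy as np

def virtual_mic(omni, fig8, a):
    """Mix a coincident omni and figure-of-eight capsule into a virtual
    first-order microphone with pattern a + (1 - a)*cos(theta).

    a = 1.0 -> omni, a = 0.5 -> cardioid, a = 0.0 -> figure-of-eight.
    omni, fig8: time-aligned sample arrays from the coincident pair.
    """
    return a * np.asarray(omni) + (1.0 - a) * np.asarray(fig8)

def pattern(a, theta):
    """Polar sensitivity of the resulting virtual mic at angle theta."""
    return a + (1.0 - a) * np.cos(theta)
```

For example, `a = 0.5` gives a cardioid with a null at the rear (`pattern(0.5, pi) == 0`), which is the mechanism that lets a fixed coincident array be re-aimed and re-shaped after the recording.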

15:30

P4-4 Experimental Analysis of Spatial Properties of the Sound Field Inside a Car Employing a Spherical Microphone Array—Marco Binelli, Andrea Venturi, Alberto Amendola, Angelo Farina, University of Parma, Parma, Italy

A 32-capsule spherical microphone array was employed for analyzing the spatial properties of the sound field inside a car. Both the background noise and the sound generated by the car’s sound system were spatially analyzed by superimposing false-color sound pressure level maps over a panoramic 360 × 180 degree image obtained with a parabolic-mirror camera. The analysis of the noise field revealed the parts of the car body where most noise leaks in, providing guidance for better soundproofing. The analysis of the impulse responses generated by the loudspeakers showed useful information on the reflection patterns, providing guidance for adding absorbent material in selected locations and for optimizing the position and orientation of the loudspeakers.
Convention Paper 8338

16:00

P4-5 Improved ITU and Matrix Surround Downmixing—Christof Faller,1 Pieter Schillebeeckx2
1Illusonic LLC, St-Sulpice, Switzerland
2Soundfield Ltd., Wakefield, UK

Improvements to ITU and matrix surround downmixing are proposed. The surround channels are often mixed into the downmix with reduced gain to prevent the resulting stereo signal from being overly ambient. A method is proposed that allows control of the amount of ambience in the downmix signal independently of the direct sound. In a conventional matrix surround downmix, ambience from the surround channels appears impaired (negatively correlated). To address this issue, a technique is proposed that separates direct and ambient sound. A matrix surround downmix is then applied only to the direct sound, while the ambient sound is treated as in an ITU downmix.
Convention Paper 8339
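For reference, the conventional ITU-style stereo downmix of a 5.0 signal (after ITU-R BS.775) mixes the center at -3 dB into both channels and the surrounds at a gain that is often reduced, as the abstract notes. The sketch below shows only this baseline; the paper's direct/ambient separation is not attempted here:

```python
import numpy as np

def itu_downmix(L, C, R, Ls, Rs, surround_gain=0.7071):
    """Stereo downmix of 5.0 in the style of ITU-R BS.775.

    The center channel is mixed at -3 dB (0.7071); surround_gain is
    the (often reduced) gain applied to the surround channels.
    All inputs are equal-length sample arrays.
    """
    left = L + 0.7071 * C + surround_gain * Ls
    right = R + 0.7071 * C + surround_gain * Rs
    return left, right
```

Lowering `surround_gain` globally is the blunt tool the paper improves on: it attenuates ambience and surround-panned direct sound alike, whereas the proposed method controls the ambience level independently.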

16:30

P4-6 Spaciousness Rating of 8-Channel Stereophony-Based Microphone Arrays—Laurent Simon,1,2 Russell Mason1
1University of Surrey, Guildford, Surrey, UK
2now with INRIA – Rennes, Bretagne Atlantique, France

In previous studies, the localization accuracy and the spatial impression of 3-2 stereo microphone arrays were discussed. These showed that 3-2 stereo cannot produce stable images to the side and to the rear of the listener. An octagon loudspeaker array was therefore proposed. Microphone array design for this loudspeaker configuration was studied in terms of localization accuracy, locatedness, and image width. This paper describes an experiment conducted to evaluate the spaciousness of 10 different microphone arrays used in different acoustical environments. Spaciousness was analyzed as a function of sound signal, acoustical environment, and the microphone array’s characteristics. The results showed that the height of the microphone array and the original acoustical environment are the two variables that have the most influence on the perceived spaciousness, but that microphone directivity and the position of the sound sources are also important.
Convention Paper 8340

Session P5 Friday, May 13 14:00 – 17:30 Room 4

PRODUCTION AND BROADCAST

Chair: Natanya Ford

14:00

P5-1 Frequency Weighting and Ballistics for Program Loudness Modeling—Ian M. Dash,1 Benjamin Smith,2 Densil Cabrera2
1Australian Broadcasting Corporation, Sydney, NSW, Australia
2University of Sydney, Sydney, NSW, Australia

There has been both experimental and anecdotal evidence that the low-frequency performance of the ITU-R Recommendation BS.1770 program loudness measurement algorithm could be improved. A listening test with an emphasis on low-frequency content was therefore conducted. An attempt was made to analyze the results in octave bands by multiple regression, but larger than expected variability precluded any useful outcome from this method. A simpler regression analysis was therefore performed using several fixed weighting curves and asymmetric integration with a range of time constants. Although the results largely support the present low-frequency weighting curve, they also indicate that asymmetric integration provides better program loudness assessment than symmetric integration or high-level gating.
Convention Paper 8341

14:30

P5-2 Evaluation of Live Meter Ballistics for Loudness Control—Scott Norcross,1 Félix Poulin,2 Michel Lavoie1


1Communications Research Centre, Ottawa, Ontario, Canada
2CBC/Radio-Canada, Montreal, Quebec, Canada

The broadcast community is transitioning from practices that revolved around audio peak normalization to ones where the focus is on loudness consistency. Central to this effort is the loudness measurement algorithm described in ITU-R BS.1770. To assist in the mixing of audio that targets specific long-term loudness levels, a need has been identified for some form of loudness-based live audio metering. The EBU has proposed meter specifications to indicate “momentary” and “short-term” loudness, whose ballistics are defined, respectively, by a 400 ms and a 3000 ms integration window. In parallel with the EBU efforts, CBC/Radio-Canada, in collaboration with the CRC, has since 2008 been studying various loudness meter ballistics that would be suitable in a production environment. This paper reports on a series of tests carried out by CBC/Radio-Canada in which various momentary meter ballistics were evaluated; it was found that an IIR-based meter ballistic with a 400 ms time constant was preferred.
Convention Paper 8342
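For orientation, a "momentary" meter of the kind discussed above evaluates the BS.1770 energy measure over a sliding 400 ms window. A minimal mono sketch, with the K-weighting filter deliberately omitted (so the value is only indicative, not a compliant LUFS reading):

```python
import math

def momentary_loudness(samples, fs, window_s=0.4):
    """Momentary loudness over the last 400 ms, using the
    ITU-R BS.1770 formula: -0.691 + 10*log10(mean square).

    Simplification: mono input and no K-weighting, so the figure is
    exact only for signals whose K-weighted power equals their raw
    power; a real meter filters each channel first.
    """
    n = int(window_s * fs)
    block = samples[-n:]
    ms = sum(x * x for x in block) / len(block)  # mean square power
    return -0.691 + 10.0 * math.log10(ms)
```

A full-scale square wave (mean square 1.0) reads -0.691 under this formula; the EBU "short-term" meter is the same computation over a 3 s window.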

15:00

P5-3 Adaptive Dynamics Enhancement—Martin Walsh, Edward Stein, Jean-Marc Jot, DTS, Inc., Scotts Valley, CA, USA

Modern recordings are being mastered with more and more aggressive dynamic range compression in an attempt to generate content that is louder than previous releases. This can often lead to large discrepancies in perceived loudness between tracks that were mastered at different periods of recent history. A commonly proposed solution to this problem involves the use of loudness normalization. While such normalization techniques help to reduce discrepancies in loudness, they cannot resurrect dynamics that were removed by extreme levels of dynamic range compression. This paper outlines a technique for restoring the dynamics of modern music by continuously monitoring transient signal behavior together with the associated dynamic range levels. When dynamic range compression is likely, transients are restored to levels more typical for the type of material being played.
Convention Paper 8343

15:30

P5-4 Describing the Transparency of Mixdowns: The Masked-to-Unmasked Ratio—Philipp Aichinger,1,2 Alois Sontacchi,2 Berit Schneider-Stickler1
1Medical University of Vienna, Vienna, Austria
2University of Music and Performing Arts Graz, Graz, Austria

In this paper a model that predicts the transparency of mixdowns is proposed. The masked-to-unmasked ratio relates the original loudness of an instrument to its loudness in the mix. In order to assess this new measure, a listening test was conducted. It is shown that instruments with a masked-to-unmasked ratio of 10% or smaller are critical in mixdowns because most of them cannot be identified adequately. The newly suggested model is intended for use in automatic mixdown algorithms and as an evaluation measure in future development whenever masking scenarios are to be described.
Convention Paper 8344
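Given the abstract's definition, the measure itself reduces to a simple ratio of two loudness estimates; a sketch under that reading (the loudness-model inputs and their units are assumptions, not specified by the abstract):

```python
def masked_to_unmasked_ratio(loudness_in_mix, loudness_solo):
    """Masked-to-unmasked ratio as described in the abstract: the
    partial loudness of an instrument inside the mix relative to its
    loudness heard solo (both in linear loudness units, e.g. sones,
    from some loudness model).

    Per the abstract, values of 10% or less are critical: the
    instrument is unlikely to be identified in the mixdown.
    """
    return loudness_in_mix / loudness_solo

# e.g. an instrument whose model loudness drops from 8.0 sone solo to
# 0.6 sone in the mix has a ratio of 7.5%, below the 10% threshold
```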

16:00

P5-5 The “Digital Solution”: The Answer to a Lot of Challenges within New Production Routines at Today’s Broadcasting Stations—Stephan Peus, Georg Neumann GmbH, Berlin, Germany

The introduction of networked production systems allows a highly topical and efficient workflow, especially within broadcast and TV applications. As a result, production is moving closer to editorial work. Sound editing such as voice-over has to be done more and more by editors who naturally lack the specialist knowledge of a sound engineer. Live recordings and interactive TV will in future call for further steps to simplify production processes and to enhance reliability. We will present practical examples of challenges from daily production routines, give answers that solve these problems using digital technology, and show how they lead to a simpler and more reliable workflow.
Convention Paper 8345

16:30

P5-6 Automatic Mixing and Tracking of On-Pitch Football Action for Television Broadcasts—Robert G. Oldfield, Benjamin G. Shirley, University of Salford, Salford, UK

For the television broadcast of football in Europe, the sound engineer will typically have an arrangement of 12 shotgun microphones around the pitch to pick up on-pitch sounds such as whistle blows, players talking, and ball kicks. Typically, during a match, the sound engineer will increase and decrease the levels of these microphones manually in accordance with where the action is on the pitch at a given time, to prevent the final mix being awash with crowd noise. As part of the EU-funded project FascinatE, we have developed an automatic mixing algorithm that intelligently seeks key events on the pitch and turns on the corresponding microphones; the algorithm picks out the key events and automatically tracks the action, eliminating the need for manual tracking.
Convention Paper 8346
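The gain-switching step of such a system can be caricatured as opening the microphones nearest a detected event and closing the rest. A toy sketch of that step only (event detection and tracking, the paper's actual contribution, are not modeled; positions and counts are hypothetical):

```python
import math

def mic_gains(event_xy, mic_positions, n_open=2):
    """Open the n_open pitch-side microphones nearest a detected
    event; mute the others.

    event_xy:      (x, y) position of the detected on-pitch event.
    mic_positions: list of (x, y) microphone positions.
    Returns a gain (0.0 or 1.0) per microphone. A real mixer would
    cross-fade rather than hard-switch.
    """
    dists = [math.dist(event_xy, m) for m in mic_positions]
    order = sorted(range(len(mic_positions)), key=lambda i: dists[i])
    gains = [0.0] * len(mic_positions)
    for i in order[:n_open]:
        gains[i] = 1.0
    return gains
```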

17:00

P5-7 High Quality “Radio” Broadcasting over the Internet—David Errock, Wave Science Technology Ltd., London, UK

Broadcasting audio over the internet has dramatically improved the experience for listeners, including removing the geographic boundaries of terrestrial transmission, allowing playback on portable devices, and enabling time-shifting of content. Internet audio has primarily increased choice, with traditional broadcasters now competing alongside “internet only” music stations. Internet distribution can also increase the audio quality beyond the constraints of terrestrial and satellite transmission channels, unlike video streaming, which is of lower perceived quality than traditional transmission. With the sustainable data capacity of internet connections increasing, lossless audio can be streamed live to the consumer. This paper discusses the issues relating to these distribution techniques and the challenges that will be experienced by broadcasters, the distribution network, receiver manufacturers, and listeners.
Convention Paper 8347

Session EB1 Friday, May 13 14:00 – 15:30 Foyer

DESIGN AND ASSESSMENT

14:00

EB1-1 Queen Mary's “Media and Arts Technology Studios” Audio System Design—Martin James Morrell, Christopher A. Harte, Joshua D. Reiss, Queen Mary University of London, London, UK

The Media and Arts Technology Studios is a new facility at Queen Mary University of London, linking a previous Listening Room with our new rooms: Control Room, Performance Lab, and Plant Room. This engineering report discusses our design philosophy for our given brief to create a “world-class facility” for a space that is a “blank canvas” for researchers. We detail considerations for making an audio system that is simple for standard recording and playback while having a tremendous amount of routing options for users to create unique projects between all the connected spaces, featuring two separate spatial audio reproduction systems. The result is a 96 kHz/24-bit MADI-based system using a multimode fiber optic network and dedicated wordclock throughout.
Engineering Brief 1

14:00

EB1-2 Revitalizing the Denis Arnold Hall for Multichannel Electroacoustic Sound Diffusion and Recording—Duncan Williams, University of Oxford, Oxford, UK

The Denis Arnold Hall, the flagship lecturing and performance space in the Faculty of Music, Oxford University, has recently been the beneficiary of a complete refurbishment, including dedicated design and specification for the performance and diffusion of electroacoustic composition and multichannel sonic art. The new configuration includes acoustic absorption and insulation, low frequency management, and eight flexible full range satellite loudspeakers. This diffusion system is complemented by a full range of playback formats (from 1/4" reel-to-reel to Blu-ray, with line level patching for discrete or “stemmed” multichannel performance/playback), as well as 16 small diaphragm capacitor microphones, hung in 8 stereo pairs from the ceiling on an integrated winch system. The design and configuration of the space necessitated a lengthy consultation with composers, acousticians, electricians, and audio-visual specialists, and lessons learned along the way might be useful for those interested in adapting their own space for similar purposes.
Engineering Brief 2

14:00

EB1-3 Time Alignment of Subwoofers in Large PA Systems—Natàlia Milán, Joan Amate, Master Audio

In common PA systems the frequency range is divided into different ranges that are reproduced using different cabinets (subwoofers for the bass range and top cabinets for the mid-high range). This means different locations and positions of the sound sources and therefore notches and peaks in the crossover range. Time alignment is needed to adjust the arrival time of frequencies in the crossover area. But it’s not only a matter of distance, since the crossover modifies the phase, too. In this brief, a downscaled model is used to show how to measure the phase difference of a two-source PA system and correct it using FFT measuring software. Two situations are treated: two sources with overlapped frequency response and two sources sharing a crossover frequency.
Engineering Brief 3
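The distance part of the alignment described above reduces to a path-length delay computation; a sketch of that step only (as the brief stresses, the crossover's own phase shift still has to be measured and corrected separately):

```python
def alignment_delay_ms(distance_sub_m, distance_top_m, c=343.0):
    """Delay (ms) to apply to the nearer source so that wavefronts
    from the subwoofer and the top cabinet arrive together at the
    listening position.

    distance_sub_m, distance_top_m: source-to-listener path lengths.
    c: speed of sound in m/s (343 m/s at roughly 20 degrees C).
    Pure path-length correction only.
    """
    dt = (distance_sub_m - distance_top_m) / c
    return abs(dt) * 1000.0
```

For example, a subwoofer 3.43 m farther from the listener than the top cabinet needs the top delayed by 10 ms before any crossover phase correction is considered.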

[Engineering Brief #4 was withdrawn]

14:00

EB1-5 Sound System Documentation for Construction Projects—Thomas Knauss, Marshall-KMK Acoustics, Ltd., Chappaqua, NY, USA

This engineering brief will provide sound system engineers, users, and operators with the opportunity to become familiar with the information, processes, and documentation required to successfully coordinate the electrical, rough-in, and general construction requirements of a professionally installed sound system with the needs of architects, engineers, and contractors responsible for construction projects. The brief will include a categorized list of the 24 points of information that every architect, engineer, and construction professional needs to know about each and every sound system device to be implemented on a construction or renovation project. The presenter has 30 years of audio experience, ranging from front-of-house engineering for national recording artists and large-scale integration as a sound systems contractor to complete professional design services for recital halls, performing arts centers, corporate, entertainment, and house of worship facilities.
Engineering Brief 5

14:00

EB1-6 A Free Database of Head-Related Impulse Response Measurements in the Horizontal Plane with Multiple Distances—Hagen Wierstorf, Matthias Geier, Alexander Raake, Sascha Spors, T-Labs, Technische Universität Berlin, Berlin, Germany



A freely available database of Head-Related Impulse Response (HRIR) measurements is introduced. The impulse responses were measured in an anechoic chamber using a KEMAR manikin at four different loudspeaker distances—3 m, 2 m, 1 m, and 0.5 m—spanning the far field to the near field. The loudspeaker was positioned at ear height and the manikin was rotated with a step motor in one-degree increments. For the 3 m distance, additional measurements were carried out in which the torso stayed fixed and only the artificial head was rotated. In addition to the raw impulse responses, data sets are also available with frequency responses compensated for the use of several different headphones.
Engineering Brief 6
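Once an HRIR pair from such a database has been loaded, binaural rendering of a mono source at the measured position is plain convolution; a sketch with hypothetical arrays (the database's actual file layout is not described in the abstract):

```python
import numpy as np

def binaural_render(mono, hrir_left, hrir_right):
    """Render a mono signal at the position where an HRIR pair was
    measured, by convolving with the left- and right-ear impulse
    responses. Returns (left, right) ear signals.

    mono, hrir_left, hrir_right: 1-D sample arrays (hypothetical).
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return left, right
```

With one-degree azimuth steps, rendering a moving source amounts to cross-fading between convolutions with neighboring HRIR pairs.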

14:00

EB1-7 Loudness Measurement and Human Interpretation in Television Program Quality Checking—Matthieu Parmentier, France Television

Alongside the standardization of loudness measurement (EBU R128 and the ITU-R BS.1770 update), France Television introduced loudness metering in 2010 to improve human interpretation during quality checking processes. Beyond the defined measurement itself, performed by a machine conforming to the standard, studies have been conducted to set objective limits according to the network's capabilities and viewer environments. Using the tools offered by EBU R128, we have introduced specifications for audio acceptance beyond the loudness target of –23 LUFS, which is already taken into account. This work and its results have been extended in different ways for common programs and short commercials. The measurement tools, developed according to R128, have also been developed graphically to facilitate this additional reading.
Engineering Brief 7

14:00

EB1-8 Do Young People Actually Care about the Quality of Their MP3s?—Ainslie Harris, Robert Gordon University, Aberdeen, Scotland, UK

Ten focus groups were conducted in 2010 in Toronto, New York, and northern England, with participants ranging in age from 15 to 32 years old. Participants were asked about their file quality preferences for downloading music from online services, in the context of their existing positive/negative experiences, and were invited to design a new service that would offer them the format(s) of their choice. This brief will outline some of the key qualitative findings, including: (1) Contrary to popular opinion, young people can tell the difference between a 128 kbps and a 320 kbps MP3, but there appears to be an age threshold below which listeners either do not notice or do not care. (2) Listening context and environment are important and affect consumers’ file quality preferences, even when only considering what type of MP3 to download. (3) Consumers still find that they must weigh their personal quality preferences against other practical requirements.
Engineering Brief 8

Workshop 2 Friday, May 13 14:00 – 16:00 Room 2

WHAT'S THE USE OF RECORDING?

Chair: Sean Davies, S.W. Davies Ltd., Aylesbury, UK

Panelists: Ted Kendall, Independent Engineer, Wales, UK
Robert Philip, Open University, Milton Keynes, UK
Gordon Reid, Cedar Audio Ltd., Cambridge, UK

As engineers we tend not to think too much about the eventual purpose of what we do, but this can influence not only our own techniques but can also have ramifications far afield. A panel of specialists will consider recording as evidence in legal cases; as propaganda; as a true record of musical styles for different periods; as a way of delivering drama; and as a means of transmitting secret intelligence.

Student/Career Development Event
THE CHANGING AUDIO PROFESSIONAL
Friday, May 13, 14:00 – 15:00
Room 5

Moderator: Melvyn Toms, JAMES (Joint Audio Media Education Services)

Panelists: Samantha Bennett, University of Westminster, UK
Dennis Weinreich, APRS/UK Screen/JAMES

Over the previous 30 years, the technical and operational demands placed upon working professionals in the audio industry have grown exponentially. Equally, over this period the opportunities to learn and gain the necessary experience to become accomplished by sitting and working alongside established practitioners have fallen. To a large extent, public and private education has taken on this role, but has it been successful? Audio production and technology courses attract a large intake of industry hopefuls, but will they be “sitting and working alongside established practitioners,” and how can they be assured their course meets industry requirements?

Friday, May 13 14:00 Room Saint Julien
Technical Committee Meeting on Archiving, Restoration, and Digital Libraries

Friday, May 13 14:00 Room Mouton Cadet
Standards Committee Meeting SC-02-02 on Digital Input/Output Interfacing

Friday, May 13 15:00 Room Saint Julien
Technical Committee Meeting on Human Factors in Audio Systems

Tutorial 2 Friday, May 13 16:00 – 18:00 Room 5

FORENSIC AUDIO ENHANCEMENT

Chair: Eddy B. Brixen, EBB Consult, Denmark

Panelists: Andrey Barinov, Speech Technology Center Ltd., Russia
Robin P. How, Metropolitan Police, London, UK
Gordon Reid, CEDAR Audio Ltd., Cambridge, UK


Attendees of this tutorial should expect to learn more about issues surrounding the enhancement of forensic audio recordings. Topics to be presented and discussed will include evidence handling and processing, problem-specific approaches, and the increasing prevalence of GSM-modulated voice signals. The panel of presenters will include practitioners from law enforcement, software manufacturers, and the scientific community. Before-and-after audio examples will be demonstrated, focusing on specific problems.

Friday, May 13 16:00 Room Saint Julien
Technical Committee Meeting on Audio for Telecommunications

Session P6 Friday, May 13 16:30 – 18:00 Foyer

POSTERS: AUDIO EQUIPMENT

16:30

P6-1 Study and Analysis of the Carrier Distortion Sources in a PWM DCI-NPC Modulator—Vicent M. Sala, Luis Romeral, UPC-Universitat Politecnica de Catalunya, Terrassa, BCN, Spain

This paper studies and analyzes an analog PWM modulator for a Half-Bridge DCI-NPC (Diode Clamped Inverter – Neutral Point Clamped) amplifier as one of the sources of distortion in the switching amplification chain. Four types of error or distortion sources are studied and analyzed: Carriers Offset Error (COE), Carriers Phase Error (CPE), Carriers Symmetry Error (CSE), and Carriers Amplitude Error (CAE). This paper concludes that the two major sources of error or distortion in the PWM modulation process for the DCI-NPC topology are the Amplitude (CAE) and Offset (COE) errors, the latter being the largest contributor to the total distortion.
Convention Paper 8348
[Paper not presented but is available for purchase]

16:30

P6-2 Experimental Verification of an Electrostatic Transducer with a Partitioned Back Electrode—Libor Husník, Czech Technical University in Prague, Prague, Czech Republic

Electroacoustic transducers based on the condenser principle usually have a planar back electrode that is a one-piece entity. Previous work studied theoretically the possibility of partitioning the back electrode. Such an arrangement can be used in the design of the so-called digital loudspeaker, in which the sizes of the partitioned electrode segments are in ratios of powers of 2, but this is only a special case. The aim of this work is to study the performance of a system modeling the electrostatic transducer with the partitioned back electrode by way of measurement on an experimental sample. Various combinations of signals are applied to the partitioned electrode and the transducer response is measured.
Convention Paper 8349

16:30

P6-3 Capacitor “Sound” in Microphone Preamplifier DC Blocking and HPF Applications: Comparing Measurements to Listening Tests—Robert-Eric Gaskell, McGill University, Montreal, Quebec, Canada

The sonic effect of capacitors in various aspects of audio electronics design has long been discussed and speculated upon. Recent publications have tested many of these theories through rigorous distortion measurements of a variety of capacitor types under several test conditions. One particularly interesting result is a measurable increase in 2nd-harmonic distortion for electrolytic and PET type capacitors when a DC bias is applied. This paper repeats these measurements for a set of capacitors commonly used in +48V blocking applications and high-pass filters in microphone preamplifier designs. These physical measurements are then compared to the results of double-blind listening tests in order to examine the audibility of these capacitor distortions, as well as to explore their sonic effect on various program materials.
Convention Paper 8350

16:30

P6-4 1W 104 dBA SNR Filterless Fully-Digital Class D Audio Amplifier with EMI Reduction Technique—Rossella Bassoli, Carlo Crippa, Federico Guanziroli, Germano Nicollini, ST-Ericsson, Agrate Brianza, Monza Brianza, Italy

A 1W filterless power DAC featuring 104 dBA SNR and EMI spreading is presented. Pre-distortion algorithms are used to reduce harmonic distortion inherent in the employed modulation process, and an oversampling noise shaper allows the modulator clock speed to be reduced to facilitate hardware implementation while keeping high-fidelity quality. No analog circuits exist from the I2S interface to the speaker, leading to zero output offset and good efficiency even at medium/low power levels. 1.2 V digital and 2.7–5 V output supplies are used. The active area is 0.94 mm2 in a 0.13-micron CMOS process. Total harmonic distortion at maximum level is about 0.2%.
Convention Paper 8351

16:30

P6-5 Ultra-Low Power Audio Architecture for Portable Devices—Kangeun Lee, Changyong Son, Dohyung Kim, Sihwa Lee, Samsung Advanced Institute of Technology, Suwon, Korea

Current portable devices demand not only higher performance but also lower power consumption. For this reason, this paper describes a low-power System-on-Chip (SoC) architecture that targets multimedia playback. To significantly reduce the power consumption of the SoC, the system exploits a DSP core to decode compressed audio. A key technique is that a pre-buffer holds the compressed audio data, so the DSP can play back audio independently while the other subsystems, including the CPU and DRAM, can be powered off until all audio data remaining in the pre-buffer have been consumed by the DSP. For seamless operation, we designed a new kernel driver that controls the DSP and CPU, embedded into the Linux kernel of Google’s Android 2.2. Energy efficiency was evaluated using fourteen audio sequences encoded as 128 kbps stereo MP3. The experimental results show that the proposed system required only 25% of the power of the conventional DVFS algorithm and guaranteed Quality of Service (QoS) for MP3 playback.
Convention Paper 8352

16:30

P6-6 Design of a Passive DGRC Column Loudspeaker with Wave Front Synthesis—Xavier Meynial, Gilles Grégoire, Active Audio, Saint-Herblain, France

The DGRC (Digital and Geometric Radiation Control) principle allows one to assign a large number of loudspeakers of a line array to a limited number of electronic channels. It was introduced in 2006 and is now used in digitally steerable column loudspeakers. In this paper we propose a very simple and straightforward implementation of this principle in a passive column, where delays are approximated with all-pass passive circuits. As a result, the column is placed vertically (not tilted) and radiates a wave front that ensures high speech intelligibility and homogeneous SPL coverage. We present the design of this column, as well as experimental results. Finally, we discuss its advantages in terms of visual integration, acoustic performance, and cost effectiveness.
Convention Paper 8353

16:30

P6-7 Design and Realization of a Reference Class Loudspeaker Panel for Wave Field Synthesis—Stephan Mauer, Frank Melchior, IOSONO GmbH, Erfurt, Germany

This paper describes the requirements, the design, and the realization of a reference loudspeaker panel for Wave Field Synthesis (WFS). Besides the algorithm and the loudspeakers' acoustical performance, the quality of Wave Field Synthesis depends strongly on the spatial aliasing frequency: only below that frequency is the synthesized wave field physically correct. The spatial aliasing frequency is related to the distance between adjacent speakers in the loudspeaker array. To raise the aliasing frequency to 2.8 kHz, a loudspeaker panel was designed with a tweeter spacing of 6 cm. The transducers, electronics, and directivity were designed to obtain excellent sound quality and SPL coverage. Simulations regarding the influence of speaker spacing and wave field artifacts have been made, and measurement results of the panel are given. The overall system design is shown as an application example. Convention Paper 8354
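As a sanity check on the numbers above: the quoted aliasing frequency follows from the common rule of thumb f_alias ≈ c / (2·d) for driver spacing d (the exact limit also depends on source and listening geometry, which this sketch ignores).

```python
# Rule-of-thumb spatial aliasing frequency for a linear loudspeaker array:
# f_alias ≈ c / (2 * d), with c the speed of sound and d the driver spacing.

def aliasing_frequency(spacing_m: float, c: float = 343.0) -> float:
    """Approximate spatial aliasing frequency in Hz for driver spacing d."""
    return c / (2.0 * spacing_m)

print(f"{aliasing_frequency(0.06):.0f} Hz")  # 6 cm spacing -> roughly 2.8 kHz
```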

16:30

P6-8 Study and Analysis of Demodulation Filter Losses in DCI-NPC Multilevel Power Amplifiers—Vicent M. Sala, Luis Romeral,

UPC-Universitat Politecnica de Catalunya, Terrassa, BCN, Spain

This paper presents a less-studied source of efficiency losses in Multilevel Diode-Clamped-Inverter or Neutral-Point-Converter (DCI-NPC) power switching amplifiers. Filter inductors generally add another significant contribution to the total power loss in power switching amplifier systems. This contribution is generally comparable to that of the switching power stage, so it is important to obtain a reasonably accurate estimate of the losses. The four possible sources of losses in the demodulation filter are studied and analyzed, and expressions are defined to calculate the value of each loss. Using these loss expressions, the paper analyzes the contribution of each source and concludes with simulation results. Convention Paper 8355 [Paper not presented but is available for purchase]

Student/Career Development Event
OPENING AND STUDENT DELEGATE ASSEMBLY MEETING – PART 1
Friday, May 13, 16:30 – 18:00
Room 2

Chair: Daniel Deboy

Vice Chair: Magdalena Plewa

The first Student Delegate Assembly (SDA) meeting is the official opening of the Convention's student program and a great opportunity to meet with fellow students from all corners of the world. This opening meeting of the Student Delegate Assembly will introduce new events and election proceedings, announce candidates for the coming year's election for the European and International Regions, announce the finalists in the four new recording competition categories, and announce any upcoming events of the Convention. Students and student sections will be given the opportunity to introduce themselves and their activities, in order to stimulate international contacts. All students and educators are invited to participate in

this meeting. Election results and Recording Competition Awards will be given at the Student Delegate Assembly Meeting–Part 2 on Monday, May 16.

Friday, May 13 17:00 Room Mouton Cadet
Standards Committee Meeting SC-02-08 on Audio-File Transfer and Exchange

Special Event
WELCOME RECEPTION AND PRESENTATION
Friday, May 13, 18:00 – 19:30
Room 1

Fraunhofer IIS invites all visitors to the AES 130th Convention to meet in a social atmosphere to catch up with friends and colleagues from the world of audio. There will be a short presentation of the “International Audio Laboratories Erlangen—The New Center of Audio Research” followed by a Welcome Drinks Reception.

Special Event
ORGAN RECITAL BY GRAHAM BLYTH
Friday, May 13, 20:00 – 21:30
Lincoln's Inn Chapel, London

Graham Blyth’s traditional organ recital will be given on


the new organ of the Lincoln's Inn Chapel. The Chapel has had a historic role in the life of the Inn. The present building was consecrated on Ascension Day, 1623, and there are services every Sunday in law terms. Lincoln's Inn Chapel had to wait almost 200 years until the first organ was installed in 1820. The modest organ of 1819-20 by Flight and Robson was replaced in 1856 by a three-manual and pedal organ by William Hill, one of the finest organ builders in the country. This organ was rebuilt on no less than nine occasions, most recently in 1969. As the organ approached its 150th birthday it became

clear that its increasing unreliability meant that a decision needed to be taken about its future. After exhaustive research it was concluded that so little of the original Hill organ had remained unaltered that it was impossible to restore it to any original condition. After a lengthy tender process, a contract was signed

with Kenneth Tickell in 2005 for a new three-manual organ. It was assembled in the Northampton workshops of Tickell in 2008-9, and arrived on site in July 2009. The case, of European oak, has been stained to match the woodwork of the Chapel, and the highly polished front pipes are complemented by shades of lime wood carved to a foliage design based on the surviving example of early sixteenth-century mural painting displayed in the room that now forms the passage between Gatehouse Court and Hardwicke Building. The new organ was installed by Kenneth Tickell in 2009-10. The recital will include Bach's Great C major Prelude &

Fugue, Franck's Fantasie in A, two recent pieces inspired by words of Sir John Betjeman by Dennis Wickens, and a complete performance of Widor's 5th Symphony. Programs and maps will be available from the Special Events Desk. Graham Blyth was born in 1948, began playing the

piano aged 4 and received his early musical training as a Junior Exhibitioner at Trinity College of Music in London, England. Subsequently, at Bristol University, he took up conducting, performing Bach's St. Matthew Passion before he was 21. He holds diplomas in Organ Performance from the Royal College of Organists, the Royal College of Music, and Trinity College of Music. In the late 1980s he renewed his studies with Sulemita Aronowsky for piano and Robert Munns for organ. He gives numerous concerts each year, principally as organist and pianist, but also as a conductor and harpsichord player. He made his international debut with an organ recital at St. Thomas Church, New York in 1993 and since then has played in San Francisco (Grace Cathedral), Los Angeles (Cathedral of Our Lady of Los Angeles), Amsterdam, Copenhagen, Munich (Liebfrauendom), Paris (Madeleine and St. Etienne du Mont), and Berlin. He has lived in Wantage, Oxfordshire, since 1984, where he is currently Artistic Director of Wantage Chamber Concerts and Director of the Wantage Festival of Arts. He divides his time between being a designer of professional audio equipment (he is a co-founder and Technical Director of Soundcraft) and organ-related activities. In 2006 he was elected a Fellow of the Royal Society of Arts in recognition of his work in product design relating to the performing arts.

Session P7 Saturday, May 14 09:00 – 11:00 Room 1

LIVE AND INTERACTIVE SOUND

Chair: Michael Kelly, Sony Computer Entertainment Europe, London, UK

09:00

P7-1 User Driven, Local Model, Reclassification of Drum Loop Audio Slices—Henry Lindsay-Smith,1 Skot McDonald,2 Mark Sandler1
1Queen Mary University of London, London, UK
2FXpansion Audio Ltd., London, UK

We present a method for significantly improving the results of drum loop slice classification. An onset detector is used to slice loops of percussion-only audio. Low-level features are extracted from the audio slices and the slices are classified into one of seven percussion classes by a previously trained PART decision table. This general classification algorithm shows only an adequate performance. The user is then allowed to correct incorrect classifications. Each corrected classification is combined with a subset of the original classifications and a nearest neighbor algorithm reclassifies the remaining slices according to the corrected local model. The resultant algorithm converges on a 100% correct solution, with nearly 40% fewer re-classifications than a non-assisted approach. Convention Paper 8356
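The user-assisted local-model idea can be sketched minimally as follows. This is not the authors' code: the paper uses low-level audio features and a PART decision table, whereas this toy uses invented 2-D feature vectors and a plain 1-nearest-neighbor pass over user-corrected examples.

```python
# Minimal sketch of user-assisted reclassification: start from an imperfect
# global classification, let the user correct a few slices, then reclassify
# the rest with 1-nearest-neighbor against the corrected examples.

def nn_reclassify(features, labels, corrections):
    """Return labels after propagating user corrections via 1-NN.

    features   : list of feature vectors (tuples of floats)
    labels     : initial (possibly wrong) class labels
    corrections: dict {slice_index: true_label} supplied by the user
    """
    model = [(features[i], lab) for i, lab in corrections.items()]
    out = list(labels)
    for i, f in enumerate(features):
        if i in corrections:
            out[i] = corrections[i]
            continue
        # the nearest user-verified example decides the class
        _, out[i] = min(model, key=lambda m: sum((a - b) ** 2
                                                 for a, b in zip(m[0], f)))
    return out

feats = [(0.1, 0.9), (0.2, 0.8), (0.9, 0.1), (0.8, 0.2)]
initial = ["kick", "snare", "kick", "snare"]      # global model got two wrong
print(nn_reclassify(feats, initial, {0: "kick", 2: "snare"}))
```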

09:30

P7-2 Kick-Drum Signal Acquisition, Isolation, and Reinforcement Optimization in Live Sound—Adam J. Hill,1 Malcolm O. J. Hawksford,1 Adam P. Rosenthal,2 Gary Gand2
1University of Essex, Colchester, Essex, UK
2Gand Concert Sound, Glenview, IL, USA

A critical requirement for popular music in live-sound applications is the achievement of a robust kick-drum sound presented to the audience and the drummer while simultaneously achieving a workable degree of acoustic isolation for other on-stage musicians. Routinely a transparent wall is placed in parallel to the kick-drum heads to attenuate sound from the drummer's monitor loudspeakers, although this can cause sound quality impairment from comb-filter interference. Practical optimization techniques are explored, embracing microphone selection and placement (including multiple microphones in combination), isolation-wall location, drum-monitor electronic delay, and echo cancellation. A system analysis is presented, augmented by real-world measurements and relevant simulations using a bespoke Finite-Difference Time-Domain (FDTD) algorithm. Convention Paper 8357

10:00

P7-3 Development of a Virtual Performance Studio with Application of Virtual Acoustic Recording Methods—Iain Laird,1 Damian Murphy,2 Paul Chapman,1 Seb Jouan3
1Glasgow School of Art, Glasgow, Scotland, UK
2University of York, Heslington, York, UK
3Arup, Glasgow, Scotland, UK

A Virtual Performance Studio (VPS) is a space that allows a musician to practice in a virtual version of a real performance space in order to acclimatize to the acoustic feedback received on stage before physically performing there. Traditional auralization techniques allow this by convolving the direct sound from the instrument with the appropriate impulse response on stage. In order to capture only the direct sound from the instrument, a directional microphone is often used at small distances from the instrument. This can give rise to noticeable tonal distortion due to proximity effect and spatial sampling of the instrument's directivity function. This work reports on the construction of a prototype VPS system and goes on to demonstrate how an auralization can be significantly affected by the placement of the microphone around the instrument, contributing to a reported “PA effect.” Informal listening tests have suggested that there is a general preference for auralizations that process multiple microphones placed around the instrument. Convention Paper 8358
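At its core, the auralization step described above is a convolution of the close-miked direct sound with a stage impulse response. A minimal direct-form sketch (real-time systems would use partitioned FFT convolution instead):

```python
# Direct-form convolution: out[n] = sum_k dry[k] * ir[n - k].
# Plain-Python sketch for illustration only; O(N*M) and far too slow
# for real-time auralization, where partitioned FFT convolution is used.

def convolve(dry, ir):
    out = [0.0] * (len(dry) + len(ir) - 1)
    for n, x in enumerate(dry):
        for k, h in enumerate(ir):
            out[n + k] += x * h
    return out

# Sanity check: a unit impulse through an IR returns the IR itself.
print(convolve([1.0, 0.0], [0.5, 0.25, 0.1]))  # -> [0.5, 0.25, 0.1, 0.0]
```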

10:30

P7-4 Interactive Audio Realities: An Augmented/Mixed Reality Audio Game Prototype—Nikos Moustakas, Andreas Floros, Nicolas Grigoriou, Ionian University, Corfu, Greece

Audio-games represent a game alternative based on audible rather than visual feedback. They may benefit from parametric sound synthesis and advanced audio technologies (i.e., augmented reality audio) in order to effectively realize complex scenarios. In this paper a multiplayer game prototype is introduced that employs the concept of controlled mixed reality in order to augment the sound environment of each player. The prototype is realized as multiple user audiovisual installations, which are interconnected in order to communicate the status of the selected control parameters in real time. The prototype reveals significant relevance to well-known online multiplayer games, with its novelty originating from the fact that user interaction is realized in augmented reality audio environments. Convention Paper 8359

Session P8 Saturday, May 14 09:00 – 12:30 Room 4

AUDIO EQUIPMENT

Chair: John Dawson

09:00

P8-1 Signal Level and Frequency Dependent Losses Inside Audio Signal Transformers and How to Prevent Those—Menno van der Veen, ir. bureau Vanderveen, Zwolle, The Netherlands

In an earlier work (Convention Paper 7125) a model was presented that explains how low-level audio signals are additionally weakened when they are fed through a transformer. This extra weakening is caused by the signal-level- and frequency-dependent inductance of the transformer. Combining this extra weakening with the threshold-of-hearing curves showed that noticeable loss of micro details occurs in the

frequency band from 20 Hz to 1 kHz. This paper expands the previous work with measurements on several valve amplifiers, refines the model, and makes it applicable to macro signal levels close to saturation of the transformer. Methods are also given to minimize this extra weakening in transformers. Convention Paper 8360

09:30

P8-2 Diaphonic Pump: A Sound-Activated Alternating to Static Pressure Converter—Stephen D. Ambrose,1 Robert Schulein,1 Samuel Gido1,2
1Asius Technologies LLC, Longmont, CO, USA
2University of Massachusetts, Amherst, MA, USA

This paper discusses the operating principles and basic construction of a diaphonic pump, a newly invented device for harvesting the energy inherent in sound waves and using it to pump air, thereby pressurizing a vessel. Although this device is of general utility, the embodiment discussed in this paper is used to harvest sound energy from the speaker (balanced armature transducer) of a personal listening device (headset or hearing aid) and use this as a power source to inflate a bubble in the listener's ear, thereby creating an acoustic seal. The diaphonic pump utilizes a natural asymmetry in the flow pattern when fluid is alternately pushed and pulled back and forth through a small orifice known as a “synthetic jet.” Sound waves provide the alternating flow pattern across the synthetic jet orifice. Prototype diaphonic pumps were built, which attach to a back volume of a balanced armature transducer and are small enough that the whole assembly, transducer and pump, can fit in a human ear canal. Convention Paper 8361 [Paper presented by Bob Schulein]

10:00

P8-3 Scanning the Magnetic Field of Electro-Dynamical Transducers—Wolfgang Klippel, University of Technology, Dresden, Germany; Klippel GmbH, Dresden, Germany

The magnetic flux density in the magnetic gap and the geometry of the moving coil determine the force factor Bl, which is an important parameter of the electro-dynamical transducer. The paper presents a new measurement technique for scanning the flux density B(z, φ) on a cylindrical surface within and outside the magnetic gap, using a Hall sensor and robotics to change the position of the sensor versus vertical position z and angle φ. The results derived from the scanning process reveal the real B field in the gap, considering the fringe field and irregularities in the magnetization, which may initiate a rocking mode and rubbing of the voice coil at higher amplitudes. Using the geometry of the coil, the static force factor Bl(x, i=0) can be calculated as a function of voice coil displacement x and compared with the dynamic force factor Bl(x, i) measured by a dynamic system identification technique. Discrepancies between dynamic and static force factor characteristics are discussed, and conclusions for loudspeaker design and manufacturing are derived. Convention Paper 8362
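The static force-factor computation mentioned above can be sketched as a sum over the coil turns of the scanned field times the wire length per turn. This is a hedged illustration, not the paper's method: the field profile, coil geometry, and function names below are invented for the example.

```python
# Hedged sketch: Bl(x) as the sum over coil turns of the flux density at each
# turn's displaced position, times the wire length of one turn. Field values
# and coil geometry here are toy numbers, not measured data.

import math

def bl_from_scan(b_of_z, turn_positions_mm, coil_diameter_mm, x_mm=0.0):
    """Static force factor Bl(x) in T*m from an axial field profile.

    b_of_z            : function mapping axial position (mm) to B (tesla)
    turn_positions_mm : rest positions of the coil turns along the gap axis
    coil_diameter_mm  : mean voice-coil diameter
    """
    turn_length_m = math.pi * coil_diameter_mm * 1e-3
    return sum(b_of_z(z + x_mm) for z in turn_positions_mm) * turn_length_m

# Toy bell-shaped gap field, 1 T peak at z = 0; 5 turns at 1 mm pitch
field = lambda z: math.exp(-(z / 4.0) ** 2)
turns = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(f"Bl(0) = {bl_from_scan(field, turns, 25.4):.3f} T*m")
```

Evaluating the same sum at shifted x values yields the Bl(x) curve whose asymmetry reveals coil offset or magnetization irregularities.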

10:30

P8-4 Comparison of Anemometric Probe and Tetrahedral Microphones for Sound Intensity Measurements—Giulio Cengarle, Toni Mateos, Fundació Barcelona Media, Barcelona, Spain

The measurement of sound intensity requires the acquisition of sound pressure and acoustic velocity at a coincident position. Various transducer topologies can be used to measure the acoustic velocity directly or indirectly. In this paper three transducers are compared: a pressure-velocity anemometric probe and two tetrahedral B-Format microphones from different manufacturers. The comparison has been carried out in different fields, ranging from anechoic to diffuse, reverberant field conditions. The analysis and comparison are based on intensimetric quantities such as the radiation index and the sound intensity vector. Strengths and limitations of the various approaches are reported, to suggest the preferred applications for each transducer. Convention Paper 8363
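The intensimetric quantities compared above all derive from coincident pressure and velocity: the active intensity along one axis is the time average of p(t)·v(t). A minimal sketch on sampled signals (names and signals are illustrative):

```python
# Active sound intensity from coincident pressure and velocity samples:
# I = time average of p(t) * v(t) along one axis.

import math

def active_intensity(p, v):
    """Time-averaged product of pressure and particle velocity (one axis)."""
    return sum(pi * vi for pi, vi in zip(p, v)) / len(p)

# Ideal plane wave: p and v in phase, so all energy is active and the
# intensity equals the mean of sin^2 over one period, i.e. 0.5.
n = 1000
p = [math.sin(2 * math.pi * i / n) for i in range(n)]
print(round(active_intensity(p, p), 3))  # -> 0.5
```

A purely reactive field (p and v in quadrature) would average to zero under the same formula, which is what distinguishes the two field conditions mentioned in the abstract.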

11:00

P8-5 Prediction of Perceived Width of Stereo Microphone Setups—Hans Riekehof-Böhmer,1 Helmut Wittek2
1HAW-Hamburg, Hamburg, Germany
2Schoeps Mikrofone, Karlsruhe, Germany

The diffuse-field correlation of the two signals generated by a stereophonic microphone setup has an effect on the perception of spatial width. A correlation meter is often used to measure the correlation coefficient. However, due to the frequency dependence of the correlation function, the correlation coefficient is not an appropriate value for predicting the perceived width when it comes to time-delay stereophony. By using the newly defined “Diffuse-Field-Image-Predictor” (DFI-Predictor) presented in this paper, an attempt is made to reliably predict perceived width. Listening tests show that the DFI-Predictor is fairly suitable for this task. The aim of the study is to compare the spatial properties of different stereophonic microphone techniques by one calculated value. Convention Paper 8364

11:30

P8-6 Synthesis of Polar Patterns as a Function of Frequency with a Twin Microphone: Audio Examples and Applications within the Creative Process of Music Mixing—Matthias Kock,1 Markus Kock,2 Rainer Maillard,3 Malte Kob1
1Erich-Thienhaus-Institute, Detmold, Germany
2Leibniz Universität Hannover, Hannover, Germany
3Emil-Berliner Studios, Berlin, Germany

The directivity of a twin microphone can be chosen by variable weighting of the two output signals. In addition, the polar pattern can be adjusted as a function of frequency when controlled with a VST plug-in in a modern DAW

environment. A number of recordings were performed in rooms of variable size and quality. Presets with beneficial frequency-dependent directivities are compared to settings with constant directivity. It is discussed to what extent recordings can be further improved using the plug-in. Convention Paper 8365
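The weighting principle the paper exploits can be sketched with the textbook model of a dual-diaphragm microphone: two back-to-back cardioid outputs combined with per-band weights yield any first-order pattern. The weights below are generic illustrations, not the paper's presets.

```python
# First-order pattern synthesis from a twin microphone's two cardioid
# outputs (front-facing and back-facing), combined by variable weights.

import math

def pattern(theta_deg, w_front, w_back):
    """Sensitivity of w_front*front-cardioid + w_back*back-cardioid."""
    c = math.cos(math.radians(theta_deg))
    front = 0.5 * (1 + c)        # cardioid facing 0 degrees
    back = 0.5 * (1 - c)         # cardioid facing 180 degrees
    return w_front * front + w_back * back

print(pattern(0, 1, 0))    # front cardioid alone, on-axis
print(pattern(90, 1, 1))   # equal weights = omni: full side sensitivity
print(pattern(90, 1, -1))  # difference = figure-8: side null
```

Applying different (w_front, w_back) pairs per frequency band is what makes the polar pattern a function of frequency.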

12:00

P8-7 Digital Microphones—What's it All About?—John Willett, Circle Sound Services, Oxfordshire, UK

It’s now ten years since the first AES42 specification was published (AES42-2001) and the first AES42-compliant digital microphone came to the market. So this seems an opportune moment to look at AES42 digital microphones, their history, what they offer, the current market situation, and what the future may hold. Convention Paper 8366

Session P9 Saturday, May 14 09:00 – 10:30 Foyer

POSTERS: PERCEPTION AND EVALUATION

09:00

P9-1 The Effect of Loudness Overflow on Equal-Loudness-Level Contours—Andrew J. R. Simpson, Joshua D. Reiss, Queen Mary University of London, London, UK

This paper presents a formal derivation of the Loudness Overflow Effect (LOE), which describes the impact of nonlinear distortion on loudness. Computational analysis is then performed, comprised of two experiments involving two compressive static nonlinearities and using two well-known time-varying loudness models. The results characterize the nonlinearities in terms of LOE as a function of frequency and of listening level in the case of 250-ms pure-tone stimuli, and in terms of the traditional equal-loudness-level contours. The analysis is then extended to synthesized wind instruments for one of the nonlinearities. The effect of the nonlinearity on loudness as a function of musical note fundamental frequency and listening level is described for various synthesized instruments. Convention Paper 8367

09:00

P9-2 Evaluating the Use of Audio Smartphone Apps for Higher Education—Anne Nortcliffe, Andrew Middleton, Ben Woodcock, Sheffield Hallam University, Sheffield, UK

Digital audio technology has garnered interest in education recently, being deployed by early-adopter academics to provide audio feedback. Students have also used it, gathering audio notes on their personal devices to enhance their learning. However, the sharing and distribution of the recordings is time-consuming and requires separate technology. Smartphones with audio apps are able to support recording and distribution/sharing of learning conversations more effectively because of their additional customizable and integrated functionality. This is attractive to education now that it is clear that smartphones are becoming ubiquitous on campus. This paper describes an evaluation of audio apps for recording learning conversations by an academic and students and their experience in using smartphone audio apps to date. Convention Paper 8368

09:00

P9-3 A Study of Human Perception of Temporal and Spectral Distortion Caused by Subwoofer Arrays—Elena Shabalina,1 Janko Ramuscak,2 Michael Vorländer1
1RWTH Aachen University, Aachen, Germany
2d&b audiotechnik GmbH, Backnang, Germany

The key task for a sound reinforcement system is to provide an even sound pressure level distribution over the whole listening area, ideally with the same frequency response everywhere, while reducing radiation in unwanted directions. To achieve this, a system should exhibit a certain directivity, which for low frequencies can be achieved only by using multiple sound sources, for example placed in a row in front of the stage. This technique can help to avoid the strong interference, and the corresponding spatial sound pressure level variations, of the conventional left/right setup of subwoofers. On the other hand, multiple sources cause changes in the impulse and frequency response of an array. Listening tests showed that these changes are audible for experienced listeners. Convention Paper 8369

09:00

P9-4 Evaluation of the Psychoacoustic Perception of Geometric Acoustic Modeling-Based Auralization—Aglaia Foteinou, Damian T. Murphy, University of York, Heslington, York, UK

The subjective evaluation of the auralization of a simulated acoustic space, with a view to establishing the success or otherwise of the results obtained, is usually best achieved using listening tests comparing the virtual environment with the actual measured space. As existing modeling methods still need to be improved, it is of critical importance to focus on the human perception of the acoustics of the given space, rather than optimizing room acoustic parameters based only on objective measures. This paper uses a much simplified representation of a space, with the resulting computer model giving the capability to control all of the examined acoustic or simulation parameters independently. A 3-D shoebox-shaped room is created and a variety of factors are changed each time in order to investigate their relevance and influence on human perception. These results are obtained from listening tests, and conclusions for the psychoacoustic perception of such a space are given. Convention Paper 8370

09:00

P9-5 A Comparative Perceptual Evaluation of the Timbral Variations in Choral Location Recordings Created by Four Common Stereo Microphone Techniques—Duncan Williams, University of Oxford (Wolfson College), Oxford, UK

Choral recordings created on location were evaluated perceptually to determine the nature of the variations in timbre that might be elicited by the use of different stereo microphone techniques. Four stereo recordings were made simultaneously with coincident, near-coincident, and spaced stereo microphone techniques. Listeners were invited to describe any perceived changes through a verbal elicitation experiment, informing an adjective “pool” of possible attributes. These attributes were reduced in number to six by verbal protocol analysis. The six remaining attributes were then scaled in a second listening experiment. Mean and standard deviation values in the results suggested that there was variation in three timbral attributes. This illustrated that the manipulation of timbral attributes by microphone technique, combined with perceptual analysis, is possible. Convention Paper 8371

09:00

P9-6 Anchor Signals Validation for Two Dimensions of a Four-Dimensional Perceptive Space—Yves Zango,1,2,3 Régine Le Bouquin-Jeannès,2,3 Nathalie Costet,2,3 Catherine Quinquis1
1Orange Labs Lannion Tech/Opera, Lannion Cedex, France
2INSERM U642, Rennes, France
3Université de Rennes, Rennes, France

The subjective assessment of speech and sound codecs requires anchor signals to ensure its reliability. The reference system currently used is the Modulated Noise Reference Unit (MNRU), which simulates only quantization noise, whereas new generations of codecs present other impairments. In this study we consider speech quality as a multidimensional phenomenon and use dimensionality reduction techniques to project codecs' impairments into a four-dimensional space, each axis of the perceptive space corresponding to one of them. A verbalization test characterized two of these dimensions by the attributes “muffle” and “background noise.” Anchor signals were designed for these two dimensions, and a statistical analysis validated the accuracy of at least one of these signals. Convention Paper 8372

09:00

P9-7 Auditory Distance Perception: Criteria and Listening Room—Jean-Christophe Messonnier,1 Alban Moraud2
1Conservatoire de Paris CNSMDP, Paris, France
2Altia, Paris, France

This paper is the result of a series of listening experiments carried out to investigate the correlation between auditory distance and two criteria: the ratio of direct to reverberant sound energy and the clarity C80. In the first section of this paper we determine which of the two criteria is more efficient. The second section compares the values of these criteria when the same signal


is played on a well-damped control room loudspeaker system and when it is played on a domestic stereophonic loudspeaker system. A second series of listening experiments shows how the auditory distance is perceived in both cases. Convention Paper 8373
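The clarity criterion used above has a standard definition: C80 compares the energy in the first 80 ms of an impulse response with the energy arriving later, C80 = 10·log10(E(0..80 ms) / E(80 ms..∞)). A minimal sketch on a sampled impulse response (the toy exponential decay below is illustrative, not measured data):

```python
# Clarity index C80 from a sampled impulse response:
# C80 = 10 * log10(early energy (0..80 ms) / late energy (80 ms..end)).

import math

def c80(ir, fs):
    """Clarity index in dB for impulse response ir sampled at fs Hz."""
    split = int(0.080 * fs)
    early = sum(x * x for x in ir[:split])
    late = sum(x * x for x in ir[split:])
    return 10 * math.log10(early / late)

fs = 1000
ir = [math.exp(-i / (0.5 * fs)) for i in range(2 * fs)]  # toy ~0.5 s decay
print(f"C80 = {c80(ir, fs):.1f} dB")
```

Higher (less negative) C80 values correspond to a drier, closer-sounding signal, which is why the criterion is a candidate predictor of auditory distance.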

09:00

P9-8 Subjective Comparison between Stereo and Binaural Processing from B-Format Ambisonic Raw Audio Material—Fábio W. Sousa, University of York, York, North Yorkshire, UK

Using audio recorded in Ambisonic B-format from a sound field microphone and processed through both stereo and binaural tools, a subjective comparison is made. Hearing tests were performed taking into account personal experience and preference, as well as some spatial attribute concepts defined in previous works. Aiming to evaluate the real effectiveness of binaural processing, this paper considers the possibility of distributing contemporary music, originally developed for Ambisonic reproduction, through either conventional stereo or a method of high-fidelity spatial processing directed specifically at reproduction through headphone systems, based on binaural technology. The sound images represented in both binaural and stereo processing are examined. Spatial attributes like wideness, depth, naturalness, and presence are evaluated. Convention Paper 8374

Workshop 3 Saturday, May 14 09:00 – 11:00 Room 3

PANNING FOR MULTICHANNEL LOUDSPEAKER SYSTEMS

Chair: Ville Pulkki, Aalto University School of Science and Technology, Aalto, Finland

Panelists: Filippo Fazi, ISVR, UK
Matthias Frank, IEM, Austria
Florian Keiler, Technicolor DE, Germany

Amplitude panning is the most widely used method for positioning virtual sources over layouts where the number of loudspeakers is between two and about forty. The method is really simple, it provides a nice spatial effect, and it does not color the sound prominently. This workshop reviews the working principle and psychoacoustic facts of amplitude panning for stereophony and for multichannel layouts. Panelists will describe some recently suggested improvements to amplitude panning that target shortcomings in spatial accuracy. A lively discussion is expected on the pros and cons of such processing.
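The working principle for the two-loudspeaker case can be sketched with the textbook tangent law, which relates the intended direction θ to the two loudspeaker gains. This is a generic formulation for a standard ±30° stereo setup, not code from the workshop.

```python
# Stereo amplitude panning via the tangent law:
# tan(theta) / tan(theta_base) = (gL - gR) / (gL + gR),
# with gains normalized to constant power (gL^2 + gR^2 = 1).

import math

def stereo_gains(theta_deg, base_deg=30.0):
    """Tangent-law gains (left, right) for a +/-base_deg stereo pair.
    Positive theta_deg pans toward the left loudspeaker."""
    t = math.tan(math.radians(theta_deg)) / math.tan(math.radians(base_deg))
    # (gL - gR)/(gL + gR) = t  is satisfied by  gL = 1 + t,  gR = 1 - t
    g_left, g_right = 1.0 + t, 1.0 - t
    norm = math.hypot(g_left, g_right)
    return g_left / norm, g_right / norm

gl, gr = stereo_gains(0.0)               # center: equal gains, -3 dB each
print(round(gl, 3), round(gr, 3))        # -> 0.707 0.707
print(stereo_gains(30.0))                # hard left -> (1.0, 0.0)
```

Multichannel methods such as VBAP generalize this idea by applying pairwise (or triplet-wise) gains on the loudspeakers nearest the target direction.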

Tutorial 3 Saturday, May 14 09:00 – 11:00 Room 2

MANAGING TINNITUS AS A WORKING AUDIO PROFESSIONAL

Presenters: Neil Cherian, Neurological Institute,

Cleveland Clinic, Cleveland, OH, USA
Michael Santucci, Sensaphonics Hearing Conservation, Chicago, IL, USA

Tinnitus is a common yet poorly understood disorder where sound is perceived in the absence of an external source. Significant sound exposure, with or without hearing loss, is the most common risk factor. Tinnitus can be debilitating and can impair quality of life. Anxiety, depression, and sleep disorders are potential consequences. Most importantly for those in the audio industry, it can significantly impair auditory perception. This tutorial will focus on methods of managing tinnitus

in the life of an audio professional. Background information will be provided regarding the basic concept of tinnitus, pertinent anatomy and physiology, audiologic parameters of tinnitus, and an overview of current research. Suggestions for identifying and mitigating high-risk behaviors will be covered. Elements of medical and audiologic evaluations of tinnitus will also be reviewed.

Student/Career Development Event
EDUCATION FORUM PANEL
Saturday, May 14, 09:00 – 10:45
Room 5

Moderator: Alex Case, University of Massachusetts at Lowell, Lowell, MA, USA

Teaching the Teachers—A Round Table Discussion Among Audio Educators

While audio itself—in all her disciplines—advances at breakneck speed, the educators supporting it must make equivalent progress. AES conventions are reliable catalysts for earnest discussions among audio educators. However the convention, rich with so many activities, always seems to end too soon. Curriculum, personnel, and facilities must offer both time-proven fundamentals and cutting-edge innovations. It happens in multiple modes: the classroom, the studio, the lab, online, campus committee meetings, and through industry relationships. We've all found solutions here, and nuggets of wisdom there; we wrestle with challenges and unknowns elsewhere. Join this discussion as we seek to define and prioritize the key issues facing educators and create a vision for the most effective way to address them in future AES activities—through conventions, conferences, online interactions, and more. Share your ideas for the most essential forms of research and the best ways to present the results: publications, tutorials, workshops, and other collaborations. What are the topics educators need to discuss, and what is the best format for sharing our advancements? AES provides the essential community for sharing and learning among audio educators. Help us design the next steps for increasing our productivity, accelerating our innovation, enriching our camaraderie, and enhancing our quality as educators.

Saturday, May 14 09:00 Room Saint Julien
Technical Committee Meeting on Audio Forensics

Student/Career Development Event
STUDENT SCIENCE SPOT
Saturday, May 14 through Monday, May 16
Exhibits Area

Check out the Student Science Spot for student designs and projects! See the young creative geniuses demonstrate their knowledge in the flesh. Learn about other student events or just hang and meet some new audio friends. This is an amazing opportunity to share your



hard work with convention participants! Hope to see you all there! http://www.aes.org/students/blog/2011/3/130th-student-science-spot-call

Saturday, May 14 10:00 Room Saint Julien
Technical Committee Meeting on Microphones and Applications

Session P10 Saturday, May 14 11:00 – 13:00 Room 1

AUDIO CONTENT MANAGEMENT

Chair: Jamie Angus, University of Salford, Salford, Greater Manchester, UK

11:00

P10-1 A Comprehensive and Modular Framework for Audio Content Extraction, Aimed at Research, Pedagogy, and Digital Library Management—Olivier Lartillot, University of Jyväskylä, Jyväskylä, Finland

We present a framework for audio analysis and the extraction of low-level features, mid-level structures, and high-level concepts, altogether studied as a fully interwoven complex system. Composite operations are constructed via an intuitive programming language on top of Matlab. Datasets of any size can be processed thanks to implicit memory management mechanisms. The data structure enables a tight articulation between signal and symbolic layers in a unified framework. The resulting technology can be used as a pedagogical tool for the understanding of audio, speech, and musical processes and concepts, and for content-based discovery of digital libraries. Other applications include intelligent browsing and structuring of digital libraries, information retrieval, and the design of content-based audio interfaces.
Convention Paper 8375

11:30

P10-2 Selected Playback Problems of Historical Grooved Media—Nadja Wallaszkovits,1 Franz Lechleitner,1 Heinrich Pichler2
1Phonogrammarchiv, Austrian Academy of Sciences, Vienna, Austria
2Audio Consultant, Vienna, Austria

The paper discusses some selected problems of the replay and high-quality archival transfer of historical grooved media, such as cylinders, instantaneous discs, and early coarse-groove records. The topics outline the problems of noise reduction and the compensation of the horizontal tracking angle by means of stereo playback and modification of the sum and differential signals. A comparison between existing noise reduction methods and analog as well as digital phase and group delay compensation methods is given and discussed. Finally, possible compensation methods are outlined for the change in noise spectrum caused by the decrease in groove velocity at inner diameters of early discs. The authors propose a radius equalization in the digital domain, using a digital high-pass filter without group delay distortion whose cut-off frequency changes with the groove diameter.
Convention Paper 8376

12:00

P10-3 Automatic Recognition of Events in Audio Data Using Supercomputer Cluster—Kuba Lopatka, Andrzej Czyzewski, Henryk Krawczyk, Gdansk University of Technology, Gdansk, Poland

The automatic recognition of dangerous events by audio analysis, employing parallel processing on a supercomputer cluster, is described in the paper. Sound files recorded by microphones operating in a security surveillance system are processed by a sound event detection and classification algorithm. Because of the large amount of data, parallel computation is employed to speed up the analysis. The sound file recorded by the surveillance system is divided into chunks and processed by separate threads or processes. Several strategies for such parallel computation are introduced and discussed. Results obtained in tests using a supercomputer cluster are presented.
Convention Paper 8377
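The chunking strategy described in the abstract can be sketched roughly as follows. This is not the authors' implementation: a hypothetical RMS-threshold detector stands in for their classification algorithm, worker threads stand in for cluster processes, and all names and the threshold value are illustrative.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

FRAME = 1024  # analysis frame length in samples (illustrative)

def detect_events(chunk, offset, threshold=0.5):
    """Stand-in detector: flag frames whose RMS energy exceeds a threshold,
    returning global sample positions of detected events."""
    n = len(chunk) // FRAME
    frames = chunk[:n * FRAME].reshape(n, FRAME)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    return [int(offset + i * FRAME) for i in np.flatnonzero(rms > threshold)]

def analyze_parallel(signal, n_workers=2, threshold=0.5):
    """Divide the recording into frame-aligned chunks and run the detector
    on each chunk concurrently, mirroring the chunk-wise parallel strategy."""
    bounds = np.linspace(0, len(signal), n_workers + 1, dtype=int)
    bounds = (bounds // FRAME) * FRAME  # keep chunk starts frame-aligned
    with ThreadPoolExecutor(n_workers) as ex:
        parts = ex.map(
            lambda ab: detect_events(signal[ab[0]:ab[1]], ab[0], threshold),
            zip(bounds[:-1], bounds[1:]))
    return sorted(p for part in parts for p in part)
```

Because the chunks are independent, the same map-over-chunks structure transfers directly to processes on a cluster; the interesting design questions (discussed in the paper) are how to size the chunks and handle events at chunk boundaries.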

12:30

P10-4 Using Support Vector Machines for Automatic Mood Tracking in Audio Music—Renato Panda, Rui Pedro Paiva, University of Coimbra, Coimbra, Portugal

In this paper we propose a solution for automatic mood tracking in audio music, based on supervised learning and classification. To this end, various music clips with a duration of 25 seconds, previously annotated with arousal and valence (AV) values, were used to train several models. These models were used to predict the Thayer-taxonomy quadrants and AV values of small segments from full songs, revealing the mood changes over time. The system accuracy was measured by calculating the matching ratio between predicted results and full-song annotations performed by volunteers. Different combinations of audio features, frameworks, and other parameters were tested, resulting in an accuracy of 56.3% and showing there is still much room for improvement.
Convention Paper 8378
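As a toy illustration of the supervised approach (not the authors' feature set or models): synthetic two-dimensional features and quadrant labels stand in for the real annotated clips, and scikit-learn's SVC is one possible SVM implementation. Classifying consecutive song segments yields the mood track.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Synthetic stand-ins for annotated 25-second training clips: two audio
# features per clip, labeled with a Thayer quadrant (1-4) derived from
# arousal/valence annotations. Cluster centers are illustrative.
quadrant_centers = {1: (1, 1), 2: (1, -1), 3: (-1, -1), 4: (-1, 1)}
X_train, y_train = [], []
for q, center in quadrant_centers.items():
    X_train.append(rng.normal(center, 0.3, size=(50, 2)))
    y_train += [q] * 50
X_train = np.vstack(X_train)

model = SVC(kernel="rbf").fit(X_train, y_train)

# "Mood tracking": classify features of consecutive segments of a full
# song; the sequence of predicted quadrants shows mood changes over time.
song_segments = np.array([(0.9, 0.8), (1.1, 0.9), (-0.8, -1.0), (-1.2, 0.9)])
mood_track = model.predict(song_segments)
```

The real difficulty, reflected in the 56.3% accuracy reported above, lies in extracting audio features that separate the quadrants anywhere near this cleanly.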

Workshop 4 Saturday, May 14 11:00 – 13:00 Room 3

PRODUCTION FOR UPCOMING SPATIAL AUDIO SYSTEMS

Chair: Frank Melchior, IOSONO, Erfurt, Germany

Panelists: Marc Emerit, Orange Labs, France
Kimio Hamasaki, NHK, Tokyo, Japan
Jeff Levison, IOSONO GmbH, USA
Jörn Loviscach, University of Applied Sciences Bielefeld, Bielefeld, Germany
Wieslaw Woszcyk, McGill University, Montreal, Quebec, Canada
Gregor Zielinsky, Sennheiser, Hannover, Germany



Audio Engineering Society 130th Convention Program, 2011 Spring 19

Driven by the force of 3-D pictures, the demand for spatial audio experiences is increasing. Several systems have been proposed that simply add more channels around the auditorium. While the integration of new channels seems simple from a workflow perspective, problems arise if the content needs to be adapted to different setups in different venues. On the other hand, and driven by new spatial audio reproduction methods, a paradigm shift toward object-based audio production is currently under discussion in the community. This workshop brings together the experience and tools in production for new multichannel formats and new spatial audio reproduction methods in general. Examples of related productions and installations are given.

Tutorial 4 Saturday, May 14 11:00 – 13:00 Room 5

IN-EAR MONITORING—PAST, PRESENT, AND EMERGING DEVELOPMENTS

Presenter: Stephen Ambrose, Asius Technologies

In the 1970s, sound engineer and performer Stephen Ambrose began pioneering work experimenting with the use of sound-isolating earphones as a better means of hearing himself and others during live performances. This form of monitoring was in sharp contrast to the norm at the time, which consisted of an array of dedicated on-stage loudspeakers. Unlike with loudspeakers, performers with isolating earphones could hear their own unique mix, separate from others. As an additional benefit of in-ear monitoring, stage monitors are no longer present to leak output into the house sound system or recording mixes. Early work of Ambrose, with a wide range of well-known performers, established the basics of in-ear monitoring but was challenged by technical barriers and acceptance issues. Today, however, IEM (in-ear monitoring) is in widespread use in many product forms. This tutorial, punctuated with a variety of demonstrations, will cover key developments responsible for the growing acceptance of IEM systems. Additionally, new improvements in the areas of sound isolation, comfort, occlusion effect suppression, and reduction of hearing fatigue, most of which are not yet commercially available, will be discussed.

Saturday, May 14 11:00 Room Saint Julien
Technical Committee Meeting on Coding of Audio Signals

Special Event
AES/MPG EVENT: WHERE DOES THE BUCK STOP? QC IN THE MANUFACTURING CHAIN
Saturday, May 14, 11:15 – 13:00
Room 2

Moderator: Tony Platt, MPG Director

Panelists: Ray Staff, award-winning chief mastering engineer, Air Mastering
Pieter Stenekes, founder of Sonoris Audio Engineering, responsible for the DDP delivery software

We are all human and can make mistakes. It is also said that we do not recognize our own mistakes. In the video and broadcast world it is standard for the client to pay for an independent QC check at the studio and again at the broadcaster's facility. With vinyl it has always been normal to have a test pressing. All these procedures have been bypassed in CD mastering, and, in general, the passage of a master to manufacture is out of step with any QC that takes place. This can and has resulted in sometimes major and costly mistakes.

Our discussion will focus on the challenges posed by an emerging new shape of the industry, where artists, small labels, and producers are dealing directly with manufacturers. Furthermore, these problems are also common to downloaded products. The incidents that have raised the issue of QC during the manufacturing stage of CDs can only be exacerbated if consumer delivery moves toward high-quality downloads.

The growing tendency for third-party agents to arrange manufacture and distribution also raises issues with respect to delivery formats for manufacturers and aggregators, and with the upsurge in delivery via FTP we need to establish good practice for checking that what goes to manufacture is correct, and most especially who is responsible.

This discussion launches the Mastering Section of the Music Producers Guild, which will be headed by Ray Staff. They propose to outline what quality control is and where the responsibilities lie, and to publish a specification on the MPG website that anyone involved in the process can refer to for good practice and guidance.

Student/Career Development Event
EDUCATION AND CAREER/JOB FAIR
Saturday, May 14, 11:30 – 13:00
Foyer

Institutions offering studies in audio (from short courses to graduate degrees) will be represented in a "table top" session. Information on each school's respective programs will be made available through displays and academic guidance. There is no charge for schools to participate.

The Career Fair will feature several companies from the exhibit floor. All attendees of the Convention, students and professionals alike, are welcome to come talk with representatives from the companies and find out more about job and internship opportunities in the audio industry. Bring your resume! Admission is free and open to all Convention attendees.

Saturday, May 14 12:00 Room Saint Julien
Technical Committee Meeting on Spatial Audio

Saturday, May 14 12:00 Room Mouton Cadet
Standards Committee Meeting SC-02-12 on Audio Applications of Networks

Saturday, May 14 13:00 Room Saint Julien
Technical Committee Meeting on Studio Practices and Production

Session P11 Saturday, May 14 14:00 – 17:30 Room 1

ROOM ACOUSTICS

Chair: Diemer de Vries, Delft University of Technology, Delft, The Netherlands

14:00

P11-1 DTS Multichannel Audio Playback System: Characterization and Correction—Zoran Fejzo,1 James Johnston2
1DTS, Inc., Calabasas, CA, USA
2DTS, Inc., Kirkland, WA, USA

Audio playback system correction methods are



now commonplace in audio-video receivers. One goal of these systems is to correct the deviation of the loudspeaker/room frequency response from some desired target curve. Unfortunately this correction may be inappropriate outside of a small area around the microphone location, and averaged measurements may produce unwanted timbre shifts. Some room correction algorithms capture the room responses at multiple locations and combine them to obtain a representative response that is used for frequency correction. We will present a loudspeaker/room correction system that attempts to achieve perceptually appropriate frequency correction in a wide listening area by using a closely spaced, non-coincident multi-microphone array placed in a single location in the room. By use of special probe signals, this is achieved within a short measurement period.
Convention Paper 8379
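The general multi-measurement idea mentioned in the abstract, combining responses from several microphone positions into one representative response and correcting toward a target curve, can be sketched generically. This is not DTS's algorithm; the dB averaging, band layout, and boost limit are illustrative assumptions.

```python
import numpy as np

def correction_filter(responses_db, target_db, max_boost_db=6.0):
    """Combine room magnitude responses (dB) measured at several microphone
    positions into one representative response, then derive per-band
    correction gains toward a target curve, limiting boost to avoid
    over-equalizing narrow dips."""
    representative = np.mean(responses_db, axis=0)  # dB average across mics
    gains = target_db - representative              # correction in dB
    return np.minimum(gains, max_boost_db)          # cap positive gain only

# Toy example: 3 microphone positions, 4 frequency bands, flat 0 dB target.
measured = np.array([[3.0, -2.0, 5.0,  -8.0],
                     [1.0, -4.0, 7.0,  -6.0],
                     [2.0, -3.0, 6.0, -10.0]])
gains = correction_filter(measured, np.zeros(4))
```

Capping boost is a common safeguard: a deep dip measured at one position is often position-dependent, so fully filling it would over-drive the loudspeaker elsewhere in the room.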

14:30

P11-2 Evaluating the Auralization of Performance Spaces and Its Effect on Singing Performance—Judith Brereton, Damian T. Murphy, David M. Howard, University of York, Heslington, York, UK

Musicians alter their performance according to the acoustic environment in which they perform, but a thorough parametric investigation of the effect of room acoustics on musical performance has not yet been achieved. A sufficiently "realistic" synthesized Room Impulse Response (RIR) will facilitate such a study, since this will allow the investigator greater control and knowledge of the room acoustic parameters involved. This paper reports the results of an experiment to evaluate a virtual acoustic space through the performance, interview, and audio analysis of the performance of a solo singer. Simulations of the same performance space using synthesized RIRs and measured RIRs were compared. In general, singers who took part in the trial could distinguish between the two simulations and rated the measured-RIR simulation more highly in terms of warmth and reverberance.
Convention Paper 8380

15:00

P11-3 Enhancing the Configuration and Design of Sound Systems through Simulation—Frederick Otten, Richard Foss, Rhodes University, Grahamstown, South Africa

Audio engineers are required to design and deploy large multichannel sound systems that meet a set of requirements and use networking technologies such as Firewire and Ethernet. Bandwidth utilization and latency need to be considered. Network simulation can be used to accurately model a network and return such information. This paper discusses a software system that has been developed to create a simulation of a network using the AES-X170 protocol for command and control. This system shows information about bandwidth and latency and is able to detect problems with parameter relationships. It also provides the ability to perform offline editing. These features significantly enhance the audio engineer's ability to effectively design, configure, and evaluate their sound systems.
Convention Paper 8381

15:30

P11-4 What's Wrong with Scattering Theory?—Ian M. Dash, Fergus R. Fricke, University of Sydney, Sydney, NSW, Australia

The theory of long-wave scattering from the side of a cylinder originated with Rayleigh and was extended to a general solution by Morse. Both solutions are based on an angular harmonic series expansion. This model is conceptually flawed. A number of physical paradoxes inherent in the model are outlined and discussed.
Convention Paper 8382

16:00

P11-5 New Proposals for the Calibration of Sound in Cinema Rooms—Philip Newell,1 Keith Holland,2 Julius Newell,3 Branko Neskov4
1Acoustics Consultant, Moaña, Spain
2ISVR, University of Southampton, Southampton, UK
3Electroacoustics Engineer, Lisbon, Portugal
4Loudness Films, Lisbon, Portugal

The current practices for calibrating cinema rooms date back to the early 1970s. Much has been learned since then about the perception of sound, and measurement techniques have advanced greatly. Evidence has been growing that the present degree of room-to-room compatibility leaves much to be desired, and complaints about loudness and intelligibility problems persist. This paper looks at reassessing the whole process of loudspeaker and room calibration from a modern perspective.
Convention Paper 8383
[Paper presented by Keith Holland]

16:30

P11-6 Some Preliminary Comparisons between the Diffusion Equation Model and Room-Acoustic Rendering Equation in Complex Scenarios—Juan M. Navarro,1 Jose Escolano,2 Jose J. López3
1San Antonio's Catholic University of Murcia, Guadalupe, Spain
2University of Jaén, Linares, Spain
3Universidad Politecnica de Valencia, Valencia, Spain

Recently, a model named the acoustic radiative transfer equation has been proposed as a general theory to expand geometrical room acoustic modeling algorithms. This room acoustic modeling technique establishes the basis of two recently proposed algorithms, the acoustic diffusion equation model and the room acoustic rendering equation. This paper presents comparisons of in-situ measurements of room-acoustic parameters with values predicted by both methods in a real room of complex shape, in order to clarify the advantages and limitations of each. Moreover, the memory requirements and computation time have been evaluated.
Convention Paper 8384




17:00

P11-7 Spatial Room Impulse Responses with a Hybrid Modeling Method—Alex Southern, Samuel Siltanen, Lauri Savioja, Aalto University, Aalto, Finland

The synthesis of an arbitrary enclosure's room impulse response (RIR) may be performed using acoustic modeling. A number of acoustic modeling methods have been proposed previously, each with its own advantages and limitations. This paper is concerned with mixing the RIRs from different modeling methods to synthesize a hybrid RIR: low frequencies are modeled using the finite difference time domain (FDTD) method, while high frequencies are treated with geometric methods. A practical implementation for forming a hybrid RIR is discussed and further demonstrated in the context of a 2nd-order B-Format spatial encode of the modeled sound field. The paper discusses the considerations and limitations of forming such hybrid RIRs using wave-based and geometric-based methods.
Convention Paper 8385
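One simple way to realize such a band-split merge is a pair of complementary linear-phase FIR filters. This is only a sketch of the general technique, not the paper's implementation; the crossover frequency, filter length, and sample rate below are illustrative assumptions.

```python
import numpy as np
from scipy.signal import fftconvolve, firwin

def hybrid_rir(rir_fdtd, rir_geo, fc=500.0, fs=48000, ntaps=513):
    """Merge a wave-based low-frequency RIR and a geometric high-frequency
    RIR with complementary linear-phase FIR filters crossing over at fc."""
    lp = firwin(ntaps, fc, fs=fs)  # linear-phase low-pass
    hp = -lp
    hp[ntaps // 2] += 1.0          # spectral inversion -> complementary high-pass
    low = fftconvolve(rir_fdtd, lp)
    high = fftconvolve(rir_geo, hp)
    out = np.zeros(max(len(low), len(high)))
    out[:len(low)] += low
    out[:len(high)] += high
    return out                     # both bands share the same ntaps//2 delay
```

Because the two filters sum exactly to a delayed unit impulse, the two bands recombine without magnitude or phase error at the crossover, provided both input RIRs are time-aligned, which is one of the practical considerations the paper addresses.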

Session P12 Saturday, May 14 14:00 – 17:30 Room 4

BINAURAL SOUND

Chair: Bozena Kostek, Gdansk University of Technology, Gdansk, Poland

14:00

P12-1 Comparison of Speech Intelligibility in Artificial Head and Jecklin Disc Recordings—Roger Johnsson, Arne Nykänen, Luleå University of Technology, Luleå, Sweden

Binaural recordings are often made using artificial heads but can also be made with a Jecklin disc. In this study an experiment was designed that allowed evaluation of noise and reverberation suppression based on speech intelligibility measurements. Recordings of a voice and disturbing noise were made in a reverberant environment using one artificial head and four Jecklin discs of various sizes. A listening experiment using headphones was conducted to determine the speech intelligibility in the recordings and in a real-life situation. It was found that there was no significant difference in speech intelligibility between the artificial head and a Jecklin disc with a diameter of 36 cm.
Convention Paper 8386

14:30

P12-2 A Comparison of Speech Intelligibility for In-Ear and Artificial Head Recordings—Arne Nykänen, Roger Johnsson, Luleå University of Technology, Luleå, Sweden

Good binaural reproductions should allow the listener to suppress noise and reverberation as when listening in real life. An experiment was designed in which room properties and reproduction techniques were varied in a way that allowed evaluation of noise and reverberation suppression based on speech intelligibility measurements. Artificial head recordings were compared to in-ear recordings and real-life listening. Artificial head recordings were found to be equivalent to real-life listening. The speech intelligibility for in-ear recordings surpassed real-life listening. A possible explanation may be inaccurate equalization, which is critical for correct reproduction of binaural cues. The procedure used is convenient for validating the performance of recording and reproduction equipment intended for sound quality studies.
Convention Paper 8387

15:00

P12-3 Perceptually Robust Headphone Equalization for Binaural Reproduction—Bruno Masiero, Janina Fels, RWTH Aachen University, Aachen, Germany

Headphones must always be adequately equalized when used for reproducing binaural signals if they are to deliver high perceptual plausibility. However, the transfer function between headphones and eardrums (HpTF) varies quite heavily with the headphone fit at high frequencies, so even small displacements of the headphone after equalization will lead to irregularities in the resulting frequency response. Keeping in mind that irregularities in the form of peaks are more disturbing than equivalent valleys, a new method for designing headphone equalization filters is proposed in which not the average but an upper variance limit of many measured HpTFs is inverted. Such a filter yields perceptually robust equalization, since the equalized frequency response will, with high probability, differ from the ideal response only by the presence of valleys in the high-frequency range.
Convention Paper 8388
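The core idea, inverting an upper variance limit rather than the mean of the measured HpTFs, can be sketched in a few lines of numpy. The mean-plus-one-standard-deviation limit, the toy magnitudes, and working purely on dB magnitudes are illustrative assumptions; the paper's filter design is more involved.

```python
import numpy as np

def robust_eq_magnitude(hptf_mags_db, k=1.0):
    """Equalization magnitude (dB) from repeated HpTF measurements: invert
    an upper variance limit (mean + k*std) instead of the mean, so a
    refitted headphone tends to err toward valleys rather than peaks."""
    upper = hptf_mags_db.mean(axis=0) + k * hptf_mags_db.std(axis=0)
    return -upper  # inverse filter magnitude in dB

# Toy data: 5 refittings, 4 frequency bins; the higher bins vary more,
# mimicking the strong fit dependence of the HpTF at high frequencies.
mags = np.array([[0., 1., 4., 9.],
                 [0., 1., 2., 5.],
                 [0., 1., 3., 7.],
                 [0., 1., 2., 5.],
                 [0., 1., 4., 9.]])
eq = robust_eq_magnitude(mags)
# equalized response = mags + eq: mostly at or below 0 dB, i.e., any
# residual error tends to appear as valleys rather than peaks
```

With `k = 0` this degenerates to ordinary mean inversion; raising `k` trades flatness on average for a lower chance of residual peaks.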

15:30

P12-4 Prediction of Perceived Elevation Using Multiple Pseudo-Binaural Microphones—Tommy Ashby, Russell Mason, Tim Brookes, University of Surrey, Guildford, Surrey, UK

Computational auditory models that predict the perceived location of sound sources in terms of azimuth are already available, yet little has been done to predict perceived elevation. Interaural time and level differences, the primary cues in horizontal localization, do not resolve source elevation, resulting in the "cone of confusion." In natural listening, listeners can make head movements to resolve such confusion. To mimic the dynamic cues provided by head movements, a multiple-microphone sphere was created, and a hearing model was developed to predict source elevation from the signals captured by the sphere. The prototype sphere and hearing model proved effective in both horizontal and vertical localization. The next stage of this research will be to rigorously test a more physiologically accurate capture device.
Convention Paper 8389



16:00

P12-5 BRTF (Body-Related Transfer Function) and Whole-Body Vibration Reproduction Systems—M. Ercan Altinsoy, Sebastian Merchel, Dresden University of Technology, Dresden, Germany

If binaurally recorded signals are played back via headphones, the transfer characteristic of the reproduction system has to be compensated for. Unfortunately, the transfer characteristic depends not only on the transducer itself but also on mounting conditions and individual properties of the respective ear. The situation is similar for whole-body vibration reproduction systems, whose transfer characteristic depends to a great extent on individual body properties, e.g., weight or body mass index. In this study, body-related transfer functions of 60 subjects are measured using an electrodynamic excitation system. In addition, anthropometric data of the subjects are collected. This paper reviews the existing whole-body vibration reproduction systems and discusses the importance of individual transfer functions for whole-body vibration reproduction.
Convention Paper 8390

16:30

P12-6 HRTF-Enabled Microphone Array for Binaural Synthesis—Malcolm O. J. Hawksford, University of Essex, Colchester, Essex, UK

A synthesis technique incorporating a circular phased-array microphone is described in which the horizontal polar response is matched to an arbitrary set of head-related transfer functions (HRTFs). The array can emulate the function of an artificial listener but without the need to embed physical anatomical features. Design techniques are described based upon polar response equalization computed over a discrete frequency space, which together with dynamic coefficient processing enables spatial image manipulation and bespoke multi-listener environments with individual head tracking. A method of 2-D spatial filtering is described to scale the number of microphone signals. Applications include binaural recording optimally matched to an arbitrary number of listeners, distributed gaming, teleconferencing, multi-user interactive virtual reality, and remote surveillance.
Convention Paper 8391

17:00

P12-7 Interpolation and Range Extrapolation of Head-Related Transfer Functions Using Virtual Local Wave Field Synthesis—Sascha Spors, Hagen Wierstorf, Jens Ahrens, Deutsche Telekom Laboratories, Technische Universität Berlin, Berlin, Germany

Virtual environments that are based on binaural sound reproduction require datasets of head-related transfer functions (HRTFs). Ideally, these HRTFs are available for every possible position of a virtual sound source. However, in order to reduce measurement effort, such datasets typically cover various source directions but only one or very few distances. This paper presents a method for extrapolating measured HRTF datasets from the source distance used in the measurements to other source distances. The method applies the concept of local Wave Field Synthesis to compute extrapolated HRTFs for almost arbitrary source positions with high accuracy. The method is computationally efficient and numerically stable.
Convention Paper 8392

Workshop 5 Saturday, May 14 14:00 – 15:00 Room 3

AURO-3D—HIGH QUALITY AND LOW LATENCY PCM ENCODING AND DATA REDUCTION

Chair: Wilfried Van Baelen, Galaxy Studios, Belgium

Auro-3D is first and foremost a family of surround formats including height information, starting with 4 overhead channels (and loudspeakers) and leading up to 13.1. The obvious use for such enlarged arrays is in the cinema, but the 9.1 Auro-3D solution is also applicable to the home theater, as the additional speakers sit exactly above the existing ones (L+R front, L+R surround). Besides the family of formats, a codec has also been developed that can fold down 9.1 to 5.1 or 5.1 to 2.0 purely in the PCM domain, with ultra-low-latency decoding (1 ms). The man behind Auro-3D, Wilfried Van Baelen of Galaxy Studios in Belgium, will give an introduction to the Auro-3D codec, talk about the latest developments, and play some examples using the codec.

Special Event
AES/APRS/DV247 EVENT: TALKBACK-PRO 1
ACOUSTIC TREATMENT FOR SMALL SPACES
Saturday, May 14, 14:00 – 15:45
Room 2

Moderator: Malcolm Atkin

Panelists: TBA

As 3-D visuals hit the screens, there are more opportunities to provide surround-sound mixes. How can small rooms best be set up to mix in surround, and if mixing/control room space is terminally limited, can non-acoustic solutions offer successful fixes? This perennial problem has plagued aspiring producers and production rooms since audio began its "democratizing" journey.

Hear studio acoustics experts and technologists debate and demonstrate the best way to exploit small production spaces for both conventional stereo and surround-sound mixes.

Saturday, May 14 14:00 Room Mouton Cadet
Standards Committee Meeting SC-05-02 on Audio Connectors

Student/Career Development Event
STUDENT RECORDING CRITIQUES
Saturday, May 14, 15:00 – 16:00
Sunday, May 15, 10:00 – 11:00
Monday, May 16, 11:00 – 12:00
Room Muscadet

Moderator: Ian Corbett, Kansas City Kansas Community College, Kansas City, KS, USA

Following the success of these events at the recent San Francisco convention—STUDENTS—bring a track to be




critiqued in London! These are non-competitive sessions, the idea being

that you get "another" set of ears to listen to your work and make some constructive suggestions. Students are encouraged to bring in their stereo or surround projects for feedback and comments from a panel and audience. You will be able to sign up for time slots at the first SDA meeting, on a first come, first served basis. Students who are finalists in the Recording Competition are excluded from participating in this event, to allow the non-finalists an opportunity for feedback on their hard work. Bring your stereo or surround work on CD, DVD, or hard disc as clearly labeled .wav files. The Student Recording Critiques are generously sponsored by PMC.

Saturday, May 14 15:00 Room Saint Julien
Technical Committee Meeting on Perception and Subjective Evaluation of Audio

Workshop 6 Saturday, May 14 15:15 – 16:15 Room 3

BOUNCE TO APP

Chair: Michael Hlatky, University of Applied Sciences Bremen, Bremen, Germany

Panelists: Jörn Loviscach, University of Applied Sciences, Bielefeld, Germany
Martin MacMillian, Bounce Mobile
Martin Roth, RjDJ

Recorded music used to be a lean-back experience, but mobile devices have changed the game: applications such as interactive 360-degree music videos begin to leverage the nonclassical man-machine interfaces of these devices—cameras, accelerometers, and GPS receivers—and employ their always-connectedness to access social music recommendation sites or music analysis and synthesis Web services. So far, the soundtracks of these apps have been produced from the final mixes. Can there be a better integration of audio production and the development of interactive music applications? What would that mean in terms of workflow and user interfaces? The issues that arise in this context are markedly different from those arising in the production of full-scale video games.

Workshop 7 Saturday, May 14 15:15 – 16:15 Room 5

ART OF RECORD PRODUCTION

Co-Chairs: Katia Isakoff
Simon Zagorski-Thomas

Panelists: Haydn Bendall, Engineer/Music Producer
Steve D'Agostino, Artist/Music Producer
Paschall de Paor, University of Glamorgan
Tony Platt, MPG and JAMES

This year will see the seventh international Art of Record Production (ARP) Conference in San Francisco and the re-launch of the ARP Journal. The Association for the Study of the Art of Record Production aims to provide an international and interdisciplinary organization for promoting the study of the production of recorded music, involving both academics and industry professionals. This workshop brings together the two directors of ARP with top record producers to discuss how the development of record production as an academic subject within the university system relates to the nuts and bolts of doing the job. What kinds of approaches are researchers engaging in? What do producers think about these types of research and the emergence of practice-led research in this field?

Saturday, May 14 15:30 Room Mouton Cadet
Standards Committee Meeting SC-05-05 on Grounding and EMC Practices

Workshop 8 Saturday, May 14 16:00 – 18:00 Room 2

HIGH RESOLUTION AUDIO PUBLISHING

Chair: Stefan Bock, MSM Studios, Germany

Panelists: Morten Lindberg, 2L, Norway
Darcy Proper, Wisseloord Studios, The Netherlands
Tokuyama Takeshi, Imagica, Japan
Neil Wilkes, Opus Productions
Yunichi Yoshio, Pioneer, Japan

There is a great number of formats suitable for high-resolution audio delivery. Which ones will succeed? What challenges do they need to overcome to become a standard? This panel will present a complete panorama of the available options, reviewing their advantages and disadvantages.

Saturday, May 14 16:00 Room Saint Julien
Technical Committee Meeting Advisory Group on Regulations

Session P13 Saturday, May 14 16:30 – 18:00 Foyer

POSTERS: PRODUCTION AND BROADCAST

16:30

P13-1 A Comparison of Kanun Recording Techniques as They Relate to Turkish Makam Music Perception—Can Karadogan, Istanbul Technical University, Istanbul, Turkey

This paper presents a quality comparison of microphone techniques applied to the kanun, a prominent traditional instrument of Turkish Makam music. Disregarding the effects of preamplifier color, A/D conversion, compression, equalization, mixing, and mastering, only the studio recording step of music production is taken into focus. Microphone techniques were applied with varying placements and microphone types, and in doing so, original Turkish Makam music etudes were recorded. Using short excerpts of these etudes, a survey comparing microphone techniques and placements as well as microphone types was prepared. Subjects were chosen from kanun players, sound engineers, and non-musicians, who showed different perspectives, preferences, and descriptions in response to the sound samples of the various microphone techniques.
Convention Paper 8393


24 Audio Engineering Society 130th Convention Program, 2011 Spring

16:30

P13-2 Objective Measurement of Produced Music Quality Using Inter-Band Dynamic Relationship Analysis — Steven Fenton,1 Bruno Fazenda,2 Jonathan Wakefield1
1University of Huddersfield, Huddersfield, UK
2University of Salford, Salford, UK

This paper describes and evaluates an objective measurement that grades the quality of a complex musical signal. The authors have previously identified a potential correlation between inter-band dynamics and the subjective quality of produced music excerpts. This paper describes the previously presented Inter-Band Relationship (IBR) descriptor and extends this work by testing with real-world music excerpts and a greater number of listening subjects. A high degree of correlation is observed between the Mean Subject Scores (MSS) and the objective IBR descriptor, suggesting it could be used as an additional model output variable (MOV) to describe produced music quality. The method lends itself to real-time implementation and therefore can be exploited within mixing, mastering, and monitoring tools.
Convention Paper 8394

16:30

P13-3 Evaluation of a New Algorithm for Automatic Hum Detection in Audio Recordings — Matthias Brandt,1 Thorsten Schmidt,2 Joerg Bitzer1
1Jade University of Applied Sciences, Oldenburg, Germany
2Cube-Tec International, Bremen, Germany

In this paper an evaluation of a recently published hum detection algorithm for audio signals is presented. To determine the performance of the method, large amounts of artificially generated and real-world audio data, containing a variety of music and speech recordings, are processed by the algorithm. By comparing the detection results with manually determined ground truth data, several error measures are computed: hit and false alarm rates, frequency deviation of the hum frequency estimation, offset of detected start and stop times, and the accuracy of the SNR estimation.
Convention Paper 8395

16:30

P13-4 Interactive Mixing Using Wii Controller — Rod Selfridge, Joshua Reiss, Queen Mary University of London, London, UK

This paper describes the design, construction, and analysis of an interactive gesture-controlled audio mixing system by means of a wireless video game controller. The concept is based on the idea that the mixing engineer can step away from the mixing desk and become part of the performance of the piece of audio. The system allows full, live control of gains, stereo panning, equalization, dynamic range compression, and a variety of other effects for multichannel audio. The system and its implementation are described in detail. Subjective evaluation and listening tests were performed to assess usability and performance of the system, and the test procedure and results are reported.
Convention Paper 8396

16:30

P13-5 The Quintessence of a Waveform: Focus and Context for Audio Track Displays — Jörn Loviscach, Fachhochschule Bielefeld (University of Applied Sciences), Bielefeld, Germany

Oscilloscope-style waveform plots offer great insight into the properties of the audio signal. However, their use is impeded by the huge spread of timescales, extending from fractions of a millisecond to several hours. Hence, waveform plots often require zooming in and out. This paper introduces a graphical representation through a synthesized quintessential waveform that shows the spectrum of the traditional waveform plot but does so at a much larger timescale. The quintessential waveform can reveal details about single periods at zoom levels where a regular waveform plot only indicates the signal's envelope. Compression renders the enormous ranges of frequencies and amplitudes more legible.
Convention Paper 8397

16:30

P13-6 Spatial Audio Processing for Interactive TV Services — Johann-Markus Batke,1 Jens Spille,1 Holger Kropp,1 Stefan Abeling,1 Ben Shirley,2 Rob G. Oldfield2
1Technicolor Research and Innovation, Hannover, Germany
2University of Salford, Salford, UK

FascinatE is a European-funded project that aims at developing a system to allow end users to interactively navigate around a video panorama showing a live event, with the accompanying audio automatically changing to match the selected view. The audiovisual content will be adapted to the user's particular kind of device, covering anything from a mobile handset equipped with headphones to an immersive panoramic display connected to a large loudspeaker setup. We describe how to handle audio content in the FascinatE context, covering simple stereo through to spatial sound fields. This will be performed by a mixture of Higher Order Ambisonics and Wave Field Synthesis. Our paper focuses on the greatest challenges for both techniques when capturing, transmitting, and rendering the audio scene.
Convention Paper 8398

16:30

P13-7 Wireless High Definition Multichannel Streaming Audio Network Technology Based on the IEEE 802.11 Standards — Seppo Nikkila, Tom Lindeman, Valentin Manea, ANT – Advanced Network Technologies Oy, Helsinki, Finland

A novel approach for the wireless distribution of uncompressed real-time multichannel streaming audio is presented. The technology is based on the IEEE 802.11 Point Coordination Function, Contention Free Medium Access Control with size-optimized multicast frames. The implementation supports eight independent audio streams with simultaneous 24-bit audio samples at the rate of 192 kHz. A frame length alignment algorithm is developed for smooth, low-jitter flow. An audio-specific Forward Error Correction scheme and a low system latency inter-channel synchronization method are described. Clock drift is solved by a sample stuffing/stripping algorithm. Implemented hardware and software structures are presented, and the technology is compared with other wireless audio networking concepts. Emerging multichannel content formats are briefly reviewed.
Convention Paper 8399
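The sample stuffing/stripping idea mentioned in the abstract can be sketched compactly. The following is an illustrative textbook sketch, not the authors' implementation; the function name, correction rule, and parameters are assumptions for demonstration only:

```python
def correct_clock_drift(samples, drift_ppm):
    """Compensate sender/receiver clock drift by stripping or stuffing samples.

    With a positive drift (sender clock fast relative to the receiver),
    samples accumulate at the receiver, so one sample is periodically
    dropped ("stripped"); with a negative drift one sample is periodically
    repeated ("stuffed"). One correction every 1e6 / |drift_ppm| samples.
    """
    if drift_ppm == 0:
        return list(samples)
    interval = int(1e6 / abs(drift_ppm))
    out = []
    for i, s in enumerate(samples, start=1):
        if drift_ppm > 0 and i % interval == 0:
            continue                  # strip: drop this sample
        out.append(s)
        if drift_ppm < 0 and i % interval == 0:
            out.append(s)             # stuff: repeat this sample
    return out
```

At a realistic drift of, say, 50 ppm and a 192 kHz sample rate this amounts to roughly ten corrections per second; practical systems typically interpolate around the correction point rather than dropping or repeating raw samples, to avoid audible artifacts.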

16:30

P13-8 Gestures to Operate DAW Software — Wincent Balin,1 Jörn Loviscach2
1Universität Oldenburg, Oldenburg, Germany
2Fachhochschule Bielefeld (University of Applied Sciences), Bielefeld, Germany

There is a noticeable absence of gestures—be they mouse-based or (multi-)touch-based—in mainstream digital audio workstation (DAW) software. As an example of such a gesture, consider a clockwise O drawn with the finger to increase the value of a parameter. The increasing availability of devices such as smartphones, tablet computers, and touchscreen displays raises the question of how far audio software can benefit from gestures. We describe design strategies to create a consistent set of gesture commands. The main part of this paper reports on a user survey on mappings between 22 DAW functions and 30 single-point as well as multi-point gestures. We discuss the findings and point out consequences for user-interface design.
Convention Paper 8456

Student/Career Development Event
RECORDING COMPETITION—PART 1
Saturday, May 14, 16:30 – 18:30
Room 3

The Student Recording Competition is a highlight at each convention. A distinguished panel of judges participates in critiquing finalists of each category in an interactive presentation and discussion. This event presents stereo and surround recordings in these categories:

• Sound for Visual Media 16:30 to 17:30

• Traditional Acoustic Recording 17:30 to 18:30

The top three finalists in each category, as identified by our judges, present a short summary of their production intentions and the key recording and mix techniques used to realize their goals. They then play their projects for all who attend. Meritorious awards are determined here and will be presented at the closing Student Delegate Assembly Meeting (SDA-2) on Monday afternoon. The competition is a great chance to hear the work of your fellow students at other educational institutions. Everyone learns from the judges' comments, even if your project isn't one of the finalists, and it's a great chance to meet other students and faculty members.

Saturday, May 14 17:00 Room Saint Julien
Technical Committee Meeting on Electromagnetic Compatibility

Saturday, May 14 17:00 Room Mouton Cadet
Standards Committee Meeting IEC 60268-3 ad-hoc meeting

Special Event
OPEN HOUSE OF THE TECHNICAL COUNCIL AND THE RICHARD C. HEYSER MEMORIAL LECTURE
Saturday, May 14, 18:30 – 20:00
Room 1

Lecturer: Karlheinz Brandenburg

The Heyser Series is an endowment for lectures by eminent individuals with outstanding reputations in audio engineering and its related fields. The series is featured twice annually, at both the United States and European AES Conventions. Established in May 1999, the Richard C. Heyser Memorial Lecture honors the memory of Richard Heyser, a scientist at the Jet Propulsion Laboratory who was awarded nine patents in audio and communication techniques and was widely known for his ability to clearly present new and complex technical ideas. Heyser was also an AES governor and AES Silver Medal recipient.

The Richard C. Heyser distinguished lecturer for the 130th AES Convention is Karlheinz Brandenburg. Brandenburg has been a driving force behind some of today's most innovative digital audio technology, notably the MP3 and MPEG audio standards. The research results of his dissertation are the basis of MPEG-1 Layer 3 (MP3), MPEG-2 Advanced Audio Coding (AAC), and most other modern audio compression schemes. He is acclaimed for pioneering work in digital audio coding, perceptual measurement techniques, Wave Field Synthesis (WFS), and psychoacoustics. His honors include the AES Silver Medal, the IEEE Masaru Ibuka Consumer Electronics Award, the German Future Award, which he shared with his colleagues, and the Cross of the Order of Merit of the Federal Republic of Germany. Furthermore, he is a member of the "Hall of Fame" of the Consumer Electronics Association and of the International Electrotechnical Commission. In 2009 he was appointed as Ambassador of the European Year of Creativity and Innovation. Brandenburg holds a Doctorate in Electrical Engineering from Friedrich-Alexander University Erlangen-Nuremberg and received honorary Doctorate degrees from the universities of Koblenz-Landau and Lüneburg for his outstanding research work in the field of audio coding. He is professor at the Institute for Media Technology at Ilmenau University of Technology and director of the Fraunhofer Institute for Digital Media Technology IDMT in Ilmenau, Germany. He is married to Ines Rein-Brandenburg. They share their home in Ilmenau with two lovely cats. The title of his speech is "How to Provide High Quality Audio Everywhere: The MP3 Story and More …"

Once upon a time there was the challenge to transmit high quality audio over phone lines. While this seemed impossible, ideas from psychoacoustics and signal processing—work by many researchers—helped the seemingly impossible become reality: MP3 and other audio codecs enabled seamless transport of audio over thin lines. However, the MP3 story did not end there. The Internet was changing shape, transforming from a text-based medium into a major carrier for sound of all kinds, including music. This meant changes not only for the payload (from text to audio), but it also meant new dangers for the audio quality delivered to music lovers, and it changed business models for music sales dramatically, shaking up the music industry.

Today the challenges are different: we have a multitude of (legal) sources of music, and many of us have access to terabytes of music. How do we find our way through this abundance of available content; how do we find the gems among the millions of medium-quality music tracks? Even then, audio reproduction is still far from perfect, so how can we really fulfill the dream of complete auditory illusion, and what are the problems even today? The talk will introduce some current work on MIR (Music Information Retrieval), on audio reproduction, and on the psychoacoustic research needed to get further along toward the dream of perfectly reconstructed sound.

Brandenburg's presentation will be followed by a reception hosted by the AES Technical Council.

Student/Career Development Event
STUDENT PARTY
Saturday, May 14, 20:00 – 22:00

Join us for a fun and exciting evening at the Student Social. On Saturday night, May 14, at 8 pm the fun begins! Don't miss this chance to meet and engage with students from all over the world! Bring your own tracks and we will play them! Later, the local Student Section will guide us through the night. More information and details will be provided at SDA-1.

Session P14 Sunday, May 15 09:00 – 12:30 Room 1

MULTICHANNEL AND SPATIAL SIGNAL PROCESSING, PART 2

Chair: Russell Mason, University of Surrey, Guildford, Surrey, UK

09:00

P14-1 Spatial Analysis of Room Impulse Responses Captured with a 32-Capsule Microphone Array — Angelo Farina, Alberto Amendola, Andrea Capra, Christian Varani, University of Parma, Parma, Italy

The authors developed a new measurement system, which captures 32-channel impulse responses by means of a spherical microphone array and a matrix of FIR filters, capable of providing frequency-independent directivity patterns. This allows for spatial analysis with resolution much higher than what was possible with obsolete sum-and-delay beamforming. The software developed for this application creates a false-color video of the spatial distribution of energy, changing with running time along the impulse response duration. A virtual microphone probe allows extraction of the sound coming from any specific direction. The method was successfully employed in three concert halls, providing guidance for correcting some acoustical problems (echo, focusing) and for placing sound reinforcement loudspeakers in optimal positions.
Convention Paper 8400

09:30

P14-2 Control of the Beamwidth of a Beamformer with a Fixed Array Configuration — Wan-Ho Cho,1 Marinus M. Boone,2 Jeong-Guon Ih,3 Takeshi Toi1
1Chuo University, Tokyo, Japan
2Delft University of Technology, Delft, The Netherlands
3Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea

The directional characteristic of the optimal beamformer of a transducer array depends not only on its hardware configuration but also on the stability factor. This parameter can be used to control the directivity of the array. In this paper a method, based on a proper selection strategy for the stability factor, is suggested to control the directional characteristics of the optimal beamformer without changing the array configuration. The selection method for the stability factor was investigated considering the trade-off between spatial resolution and noise amplification or array gain. The suggested method was applied to both microphone and loudspeaker arrays to obtain a specific directivity pattern of high resolution with a constant beamwidth.
Convention Paper 8401

10:00

P14-3 Design 3-D High Order Ambisonics Encoding Matrices Using Convex Optimization — Haohai Sun, U. Peter Svensson, Norwegian University of Science and Technology, Trondheim, Norway

In this paper we propose a convex optimization method for the design of 3-D High Order Ambisonics (3-D HOA) encoding matrices using spherical microphone arrays, which offers the possibility to impose spatial stop-bands in the directivity patterns of all the spherical harmonics while keeping the transformed audio channels compatible with the 3-D HOA reproduction sound format. Using the proposed convex optimization method, the globally optimal encoding matrices can be obtained, and the suitable trade-off among several design factors, e.g., response distortions in the obtained spherical harmonics, the dynamic range of matrix coefficients (i.e., the system robustness), and frequencies, can be analyzed and illustrated. The proposed convex optimization is formulated as a form of second-order cone programming that can be efficiently solved. Numerical results validate the proposed method. This method can be easily generalized to the 2-D HOA cases.
Convention Paper 8402

10:30

P14-4 Principles in Surround Recordings with Height — Günther Theile,1 Helmut Wittek2
1VDT, Geretsried, Germany
2Schoeps Mikrofone, Karlsruhe, Germany

New multichannel sound formats extending 5.1 with height channels are adding the third dimension to recordings. They provide a much wider range of spatial sound effects and allow more realism of spatial reproduction in terms of direct sound, early and late reflections, reverberation, and ambience sound. Using the example of two upper-layer front and two upper-layer surround complementary loudspeakers (5.1+4, known as "Auro 3D 9.1"), the psychoacoustic principles in the perception of elevated phantom sound sources, spatial depth, spatial impression, envelopment, ambient atmosphere, as well as directional stability within the sweet area are discussed. Concrete proposals for microphone configurations can evolve from these considerations.
Convention Paper 8403

11:00

P14-5 Efficient 3-D Sound Field Reproduction — Mincheol Shin,1 Filippo Fazi,1 Jeongil Seo,2 Philip A. Nelson1
1ISVR, University of Southampton, Southampton, UK
2Electronics and Telecommunications Research Institute, Daejeon, Korea

A method is presented for efficient sound field reproduction with a loudspeaker array constituted by multiple sound sources that are three-dimensionally distributed. The physical reproduction of several target sound fields is investigated when the target sound fields are surrounded by multiple sound sources. A cost function to be minimized has been developed to obtain the optimal solution with reasonable energy distribution and a sufficient sweet area when the distribution of loudspeakers is non-uniform. The performance of the proposed method is verified by the results of computer simulations and subjective tests in the cases of the NHK 22.2 and ETRI 10.2 channel configurations.
Convention Paper 8404

11:30

P14-6 On the Scattering of Synthetic Sound Fields — Jens Ahrens, Sascha Spors, Deutsche Telekom Laboratories, Technische Universität Berlin, Berlin, Germany

In sound field synthesis a given arrangement of elementary sound sources is employed in order to synthesize a sound field with given desired physical properties over an extended region. The calculation of the driving signals of these secondary sources typically assumes free-field propagation conditions. The present paper investigates the scattering of such synthetic sound fields from unavoidable scattering objects like the head and body of a person present in the target region. It is shown that the basic mechanisms are similar to the scattering of natural sound fields. However, synthetic sound fields can exhibit properties different from those of natural sound fields. Consequently, in such cases the scattered synthetic sound fields also exhibit properties different from those of scattered natural sound fields.
Convention Paper 8405

12:00

P14-7 Toward Mass-Customizing Up/Down Generic 3-D Sounds for Listeners: A Pilot Experiment Concerning Inter-Subject Variability — John Au, Richard So, Andrew Horner, The Hong Kong University of Science & Technology, Clear Water Bay, Kowloon, Hong Kong

The "delay-and-add" theory (Hebrank and Wright, 1974) was used to calculate 4068 matching scores between each of the 192 HRTFs and the dimensions of 12 pairs of ears for two incident sound directions (30 degrees up and 30 degrees down). Five HRTFs with 0th, 25th, 50th, 75th, and 100th percentile average matching scores were selected for each of the two incident directions. These 10 HRTFs were used to produce 10 sound cues (5 from 30 degrees up and 5 from 30 degrees down). Ten listeners participated in a sound localization experiment to localize the 10 sound cues, presented in random order and with 5 repetitions. Preliminary results indicated that matching scores can explain up to 22% of the inter-subject variations in localization errors. Potential applications are discussed.
Convention Paper 8406

Workshop 9 Sunday, May 15 09:00 – 11:00 Room 2

WHAT EVERY SOUND ENGINEER SHOULD KNOW ABOUT THE VOICE

Chair: Eddy B. Brixen, EBB Consult, Denmark, TC-MA and TC-AF

Panelists: Evelyn Abberton, Department of Phonetics and Linguistics, University College London, UK
Adrian Fourcin, Department of Phonetics and Linguistics, University College London, UK
Henrik Kjelin, Vocal Institute, Denmark
Julian McGlashan, Department of Otorhinolaryngology, Queen's Medical Centre Campus, Nottingham University Hospitals, UK
Mikkel Nymand, Timbre Music / DPA Microphones, Denmark
Cathrine Sadolin, Complete Vocal Institute, Denmark

The purpose of this workshop is to teach sound engineers how to listen to the voice before they even think of microphone picking and knob-turning.

The presentation and demonstrations are based on the "Complete Vocal Technique" (CVT), whose foundation is the classification of all human voice sounds into one of four vocal modes, named Neutral, Curbing, Overdrive, and Edge. The classification is used by professional singers within all musical styles, and over a period of 20 years it has proved easy to grasp both in real-life situations and in auditive and visual tests (sound examples and laryngeal images/Laryngograph® waveforms). These vocal modes are found in the speaking voice as well.

Cathrine Sadolin, the developer of CVT, will involve the audience in this workshop, while explaining and demonstrating how to work with the modes in practice to achieve any sound and solve many different voice problems, such as unintentional vocal breaks, too much or too little volume, hoarseness, and much more. Julian McGlashan, Adrian Fourcin, and Evelyn Abberton will explain the physical aspects of the voice and demonstrate laryngograph waveforms and analyses. Mikkel Nymand will explain essential parameters in the recording chain—especially the microphone—to ensure reliable and natural recordings.



Tutorial 5 Sunday, May 15 09:00 – 10:30 Room 3

AUDIO OVER IP— THE BASICS

Presenter: Peter Stevens, BBC R&D, London, UK

With the slow demise of ISDN connections and technology, audio over IP plays an increasingly important role in transporting audio signals from A to B. But this is only one area where audio over IP is used. Many more potential and existing applications show where we are heading. This tutorial looks at the basics of this technology and pinpoints the advantages as well as the pitfalls and how they can be avoided. Practical examples will underline the concepts.

Tutorial 6 Sunday, May 15 09:00 – 10:30 Room 4

NO—I SAID "INTERACTIVE" MUSIC

Co-Chairs: Dave Raybould, Leeds Metropolitan University, Leeds, UK
Richard Stevens, Leeds Metropolitan University, Leeds, UK

This session will recap a number of common approaches to so-called "interactive" music in games through a series of practical in-game demonstrations. We'll discuss what is understood by the terms reactive, adaptive, and interactive, and put forward the argument that there's actually very little about the current use of music in games that is truly "interactive." The session will conclude with further examples of some potential future solutions for how we might better align the time-based medium of music with the nonlinear medium of games. This session is intended to be accessible to complete beginners but also thought-provoking to old pros!

Sunday, May 15 09:00 Room Mouton Cadet
Standards Committee Meeting SC-04-04 on Microphone Measurement and Characterization

Sunday, May 15 10:00 Room Saint Julien
Technical Committee Meeting on Loudspeakers and Headphones

Workshop 10 Sunday, May 15 11:00 – 12:30 Room 5

ACOUSTICS FOR SOUND REINFORCEMENT

Chair: Peter Mapp, Peter Mapp Associates, Colchester, Essex, UK

Panelist: Ben Kok, Ben Kok – Acoustic Consulting

Traditionally, acoustic design for performance spaces focuses on optimum acoustics for non-reinforced performances. However, a fine concert hall for symphonic music can be very troublesome for reinforced sound. Even performance spaces designed with variable acoustics are mostly optimized for acoustic sources; little consideration is given to the needs of reinforced sound. This session addresses the different acoustic requirements for reinforced sound as opposed to non-reinforced sound. Topics to be discussed will include:

– Reverberation, support or interference?

– Source directivity, positioning and aiming

– Discrete reflections and how to avoid them

– What to include in a new venue for optimum adaptation for reinforced sound

– Practical measures in existing situations

– Workarounds in acoustically harsh environments.

Tutorial 7 Sunday, May 15 11:00 – 12:30 Room 3

LOUDNESS AND EBU R 128—A BASIC TUTORIAL OF THE WHY AND WHAT

Presenter: Florian Camerer, ORF, Austrian TV, and Chairman of EBU PLOUD

The EBU recommendation R 128 "Loudness normalization and permitted maximum level of audio signals" is a milestone for establishing a new way to level audio signals. Together with four supporting documents, it provides the framework for the transition into this new world of leveling audio based on loudness, not peaks. The chairman of the EBU group PLOUD will present R 128 in detail with the help of examples and will give an outlook on how and where the recommendation is already in practical use.
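As a minimal sketch of the core idea behind R 128 leveling: once a programme's loudness has been measured (per ITU-R BS.1770, which R 128 references), normalizing to the R 128 Programme Loudness target of −23.0 LUFS is a static gain offset. The function names below are illustrative, not taken from the recommendation:

```python
def r128_normalization_gain(measured_lufs, target_lufs=-23.0):
    """Static gain (in dB) that brings a measured programme loudness to
    the EBU R 128 target of -23.0 LUFS. Loudness Units and dB share the
    same scale, so the required gain is a simple difference."""
    return target_lufs - measured_lufs

def apply_gain_db(samples, gain_db):
    """Apply a static dB gain to linear sample values."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]
```

For example, a programme measured at −18.0 LUFS needs −5.0 dB of gain to meet the target; this is precisely the shift from peak-based to loudness-based leveling the tutorial describes.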

Sunday, May 15 11:00 Room Saint Julien
Technical Committee Meeting on Hearing and Hearing Loss Prevention

Sunday, May 15 11:00 Room Mouton Cadet
Standards Committee Meeting SC-02-01 on Digital Audio Measurement Techniques

Special Event
AES/APRS/DV247 EVENT: TALKBACK-PRO 2
LEAD ME DOWN THE SIGNAL PATH
Sunday, May 15, 11:15 – 13:00
Room 2

Moderator: Gil Limor

Panelists: TBA

"All you need is a great mic and a decent preamp and you can expect your home recordings to match any studio." A familiar claim but a pretty glib observation, especially when there are so many "affordable" options and such a wide diversity of "subjective" opinions.

"Listen, position, and condition" is the mantra of an engineer trying to capture a live sound accurately. How does each skill impact the others? Can the flaws in a cheap microphone be rectified by cunning signal manipulation, or does the notion of "less is more" apply to the signal path?

A group of key experts, from academics to manufacturers, discuss the fundamentals of audio capture and refinement.

Session P15 Sunday, May 15 11:30 – 13:00 Foyer

POSTERS: SPEECH AND CODING

11:30

P15-1 A New Robust Hybrid Acoustic Echo Cancellation Algorithm for Speech Communication in Car Environments — Yi Zhou, Chengshi Zheng, Xiaodong Li, Chinese Academy of Sciences, Beijing, China

This paper studies a new robust hybrid adaptive filtering algorithm for acoustic echo cancellation (AEC) in a car speech communication system. The proposed algorithm integrates the affine projection algorithm (APA) and the normalized least mean square (NLMS) algorithm. It can switch between them with a coefficient derivative-based technique to achieve overall optimum convergence performance. To help the algorithm combat the deteriorating impulsive interferences widely encountered in car environments and to enhance the system's robustness in double-talk periods, a robust statistics technique is also incorporated into the proposed algorithm. Experiments are conducted to verify the improved and robust overall AEC convergence performance achieved by the new algorithm.
Convention Paper 8407
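One of the two ingredients the paper combines, NLMS, admits a compact sketch. The following is a generic textbook NLMS echo canceller, not the authors' hybrid method; the filter length and step size are arbitrary illustration values:

```python
import numpy as np

def nlms_echo_canceller(far_end, mic, filter_len=64, mu=0.5, eps=1e-8):
    """Cancel the echo of the far-end signal from the microphone signal.

    At each sample the adaptive filter predicts the echo from the most
    recent far-end samples; the prediction error (the echo-free estimate)
    drives a weight update normalized by the input power, which is what
    distinguishes NLMS from plain LMS.
    """
    w = np.zeros(filter_len)          # adaptive echo-path estimate
    x_buf = np.zeros(filter_len)      # most recent far-end samples
    errors = np.zeros(len(mic))
    for n in range(len(mic)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = far_end[n]
        y = w @ x_buf                              # echo estimate
        e = mic[n] - y                             # residual signal
        w += mu * e * x_buf / (x_buf @ x_buf + eps)  # normalized update
        errors[n] = e
    return errors, w
```

With a white-noise far-end signal and a short true echo path, the residual decays toward zero as the weights converge; the paper's contribution lies in switching between this and APA and in making the adaptation robust to impulsive noise and double talk.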

11:30

P15-2 Speech Source Separation Using a Multi-Pitch Harmonic Product Spectrum-Based Algorithm — Rehan Ahmed, Roberto Gil-Pita, David Ayllón, Lorena Álvarez, University of Alcalá, Alcalá de Henares, Spain

This paper presents an efficient algorithm for separating speech signals by determining multiple pitches from mixtures of signals and assigning the sources to one of those estimated pitches. The pitch detection algorithm is based on the Harmonic Product Spectrum. Since the pitch of speech signals fluctuates readily, a frame-based algorithm is used to extract the multiple pitches in each frame. Then, the fundamental frequency (pitch) for each source is estimated and tracked after comparing all the frames. The estimated fundamental frequency of the sources is then used to generate a set of binary masks that allow separating the signals in the Short Time Fourier Transform domain. Results show a considerable separation of the speech signals, justifying the feasibility of the proposed method.
Convention Paper 8408
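The Harmonic Product Spectrum underlying the pitch detector is a standard technique: the magnitude spectrum is multiplied by downsampled copies of itself, so bins whose harmonics align are reinforced. A single-pitch, single-frame sketch (the window length, harmonic count, and search band are illustrative assumptions, not the paper's parameters):

```python
import numpy as np

def hps_pitch(frame, sample_rate, n_harmonics=4, fmin=60.0, fmax=400.0):
    """Estimate one pitch from one frame via the Harmonic Product Spectrum.

    Each factor spectrum[::h] compresses the spectrum by h, so a true
    fundamental at bin k is multiplied by its harmonics at 2k, 3k, ...,
    producing a strong product peak at the fundamental.
    """
    window = np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(frame * window))
    hps = spectrum.copy()
    for h in range(2, n_harmonics + 1):
        decimated = spectrum[::h]
        hps[:len(decimated)] *= decimated
    # restrict the search to a plausible speech-pitch range
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    band = (freqs >= fmin) & (freqs <= fmax)
    return freqs[band][np.argmax(hps[band])]
```

The paper's multi-pitch, frame-tracked variant then assigns time-frequency bins to whichever tracked fundamental they best match, forming the binary separation masks.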

11:30

P15-3 User-Orientated Subjective Quality Assessment in the Case of Simultaneous Interpreters — Judith Liebetrau,1 Thomas Sporer,1 Paolo Tosoratti,2 Daniel Fröhlich,1 Sebastian Schneider,1 Jens-Oliver Fischer1
1Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany
2DG Interpretation, Brussels, Belgium

A study was conducted to explore various subjective quality aspects of audio-visual systems for interpreters. The study was designed to bring the existing requirements for on-site simultaneous interpretation up to date and to specify new requirements for remote interpretation (teleconferencing). The feasibility of using objective measurement methods in this context was examined. Several parameters influencing perceived quality, such as audio coding, video coding, room characteristics, and audio-visual latency, were assessed. The results obtained are partially contradictory to previous studies. This leads to the conclusion that perceived quality is strongly linked to the focus, background, and abilities of the assessors. The test design, realization, and obtained results are shown, as well as a comparison to studies conducted with different types of users.
Convention Paper 8409

11:30

P15-4 Analysis of Parameters of the Speech Signal Loudness of Professional Television Announcers — Mirjana Mihajlovic,1 Dejan Todorovic,2 Iva Salom3
1Radio Television of Serbia, Belgrade, Serbia
2Radio Belgrade, Belgrade, Serbia
3Institute Mihajlo Pupin, Belgrade, Serbia

The paper analyzes the speech loudness of professional announcers, who produce a large percentage of television audio signals. Over 100 files of speech signals recorded with the same equipment were analyzed. For each file the authors calculated loudness and true peak, according to ITU-R BS.1770 and EBU R 128, as well as RMS. Criteria were adopted for the classification of the records, and the results were statistically analyzed. The analysis showed that the loudness of the speech signal depends on the gender of the speakers, intonation, and type of program material, and also that files of greater loudness may not always have higher peak values than files of lower loudness. It was noted that there was a significant overlap between the short-term loudness and RMS values calculated with the same time constant.
Convention Paper 8410

11:30

P15-5 Improved Prediction of Nonstationary Frames for Lossless Audio Compression— Florin Ghido, Tampere University of Technology, Tampere, Finland

We present a new algorithm for improved prediction of nonstationary frames for asymmetrical lossless audio compression. Linear prediction is very efficient for decorrelation of audio samples; however, it requires segmentation of the audio into quasi-stationary frames. Adaptive segmentation tries to minimize the total compressed size, including the quantized prediction coefficients for each frame, thus longer frames that are not quite stationary may be selected. The new algorithm for computing the linear prediction coefficients improves compressibility of nonstationary frames when compared with the least squares method. With adaptive segmentation, the proposed algorithm leads to small but consistent compression improvements of up to 0.56%, on average 0.11%. For faster encoding using fixed size frames, without adaptive segmentation, it reduces the penalty on compression by more than 0.21% on average.
Convention Paper 8411
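The baseline the paper improves upon — least-squares linear prediction over one frame — can be sketched as follows. The frame content here is a synthetic sinusoid rather than real audio, and this is the plain least-squares solve, not the paper's improved coefficient computation:

```python
import numpy as np

def lpc_ls(x, order):
    """Least-squares (covariance-method) linear prediction for one frame:
    minimize ||x[n] - sum_{k=1..p} a[k] * x[n-k]||^2 over n = p..N-1.
    Returns the coefficients and the prediction residual."""
    N = len(x)
    # Row n of A holds [x[n-1], x[n-2], ..., x[n-p]].
    A = np.column_stack([x[order - 1 - k : N - 1 - k] for k in range(order)])
    b = x[order:]
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return a, b - A @ a

# A pure sinusoid obeys x[n] = 2*cos(w)*x[n-1] - x[n-2] exactly, so an
# order-2 predictor drives the residual (what a lossless coder must
# actually transmit) to numerical zero.
x = np.sin(0.1 * np.arange(512))
a, e = lpc_ls(x, 2)
print(np.max(np.abs(e)) < 1e-8)  # True
```

Real frames are only quasi-stationary, so the residual stays nonzero; the compressed size then depends on how well the coefficients fit the whole frame, which is what the paper's method targets.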

11:30

P15-6 A Lossless/Near-Lossless Audio Codec for Low Latency Streaming Applications on Embedded Devices— Neil Smyth, Cambridge Silicon Radio, Belfast, Northern Ireland

Increasingly there is a need for high quality audio streaming in devices such as modular home audio networking systems, PMPs, and wireless loudspeakers. As wireless transmission capacities continue to increase it is desirable to utilize this increasing data bandwidth to perform real-time wireless streaming of audio content coded in a lossless or near-lossless format. In this paper we discuss the development of an adaptive audio coding algorithm that balances the design goals of low latency, low complexity, error robustness, and a dynamically-variable bit rate that scales to mathematically-lossless coding under suitable conditions. Particular emphasis is placed on the optimization of the algorithm structure for real-time audio processing applications and the mechanism by which “hybrid” lossless and near-lossless coding is achieved.
Convention Paper 8412

11:30

P15-7 Low-Delay Directional Audio Coding for Real-Time Human-Computer Interaction— Tapani Pihlajamäki, Ville Pulkki, Aalto University School of Electrical Engineering, Espoo, Finland

Games and virtual worlds require low-cost, good-quality, and low-delay audio engines. The Short-Time Fourier Transform-based (STFT) implementation of Directional Audio Coding (DirAC) for virtual worlds fulfills the first two demands. Unfortunately, the delay can be perceivably large. In this paper a modification to DirAC is introduced that uses different time-frequency resolutions for the STFT concurrently, which are not aligned in time in reproduction. This leads to the high-frequency non-diffuse content being reproduced as soon as possible and thus reduces the perceived delay. Informal tests show that the delay was short enough for a musician to play an instrument through the processing. Moreover, the results of formal listening tests show that the reduction in quality is perceptible but not annoying.
Convention Paper 8413

Workshop 11 Sunday, May 15 11:30 – 13:00 Room 4

EMERGING TRENDS IN AUDIO FOR GAMES

Presenters: Michael Kelly, Sony Computer Entertainment Europe, London, UK
Steve Martz, THX, CA, USA

This workshop looks at the current state of technology requirements for audio in game applications and discusses emerging trends and the technical requirements imposed by those trends. The event is presented by the Co-Chairs of the AES Technical Committee on Audio for Games. The workshop also summarizes some of the topics presented at the recent 41st International Conference on Audio for Games.

Sunday, May 15 12:00 Room Saint Julien
Technical Committee Meeting on Signal Processing

Sunday, May 15 13:00 Room Saint Julien
Technical Committee Meeting on High Resolution Audio

Sunday, May 15 13:00 Room Mouton Cadet
Standards Committee Meeting SC-04-03 on Loudspeaker Modeling and Measurement

Session P16 Sunday, May 15 14:00 – 17:30 Room 1

AUDIO SIGNAL PROCESSING AND ANALYSIS

Chair: Jayant Datta

14:00

P16-1 A New Approach to Designing Decimation Filters for Oversampled A/D Converters— Jamie A. S. Angus, University of Salford, Salford, Greater Manchester, UK

This paper presents a new approach to designing the necessary decimation filter in oversampled noise shaping analog to digital converters. These filters are still finite impulse response designs, but they are designed using novel window functions that produce a stop-band attenuation that increases with frequency, thus matching the out-of-band noise characteristics of the noise-shaping modulator. As a result, high quality decimation filters can be realized using shorter filter lengths.
Convention Paper 8414

14:30

P16-2 Warped IIR Filter Design with Custom Warping Profiles and its Application to Room Response Modeling and Equalization— Balázs Bank, Budapest University of Technology and Economics, Budapest, Hungary

In traditional warped FIR and IIR filters, the frequency-warping profile is adjusted by a single free parameter, leading to a less flexible allocation of frequency resolution. As an example, it is not possible to achieve a truly logarithmic frequency resolution, which is often desired in audio applications. In this paper a new approach is presented for warped IIR filter design where the filter specification is transformed by any desired (e.g., logarithmic) frequency transformation, and a standard IIR filter is designed to this transformed specification. Then, the poles and zeros of this transformed filter are found and mapped back to the original frequency scale. Due to the approximations in mapping back the poles and zeros, the resulting transfer function may show some discrepancies from its optimal version. This is resolved by an additional optimization of the zeros of the final filter. Examples of loudspeaker-room response modeling and equalization are presented.
Convention Paper 8415

15:00

P16-3 Computationally Efficient Nonlinear Chebyshev Models Using Common-Pole Parallel Filters with the Application to Loudspeaker Modeling— Balázs Bank, Budapest University of Technology and Economics, Budapest, Hungary

Many audio systems show some form of nonlinear behavior that has to be taken into account in modeling. For this, often a black-box model is identified, owing to the generality and simplicity of this approach. One such model is the simplified Volterra model, using parallel branches that have a polynomial-type nonlinearity and a linear filter in series. For example, Chebyshev models use Chebyshev polynomials as nonlinear functions, making the model identification a very straightforward procedure by using logarithmic sweep measurements. This paper proposes a highly efficient implementation of Chebyshev models by using fixed-pole parallel filters for the linear filtering part. The efficiency comes from the fact that parallel filters can have a logarithmic frequency resolution, which better fits the human hearing behavior than traditional FIR or IIR filters. Moreover, the branches can share the same denominators, leading to an additional performance benefit. The proposed model is particularly well suited for the real-time digital simulation of loudspeakers and other weakly nonlinear devices, such as tube guitar amplifiers.
Convention Paper 8416
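The branch structure can be sketched as follows. Plain FIR branch filters stand in for the paper's fixed-pole parallel IIR filters, and the identity that makes sweep-based identification convenient (the k-th Chebyshev polynomial maps cos(ωn) to cos(kωn)) is checked at the end:

```python
import numpy as np

def cheby_branches(x, n_branches):
    """Chebyshev polynomials T_1(x)..T_K(x) of the input signal, built
    with the recurrence T_k = 2*x*T_{k-1} - T_{k-2}."""
    T = [np.ones_like(x), x]
    for _ in range(2, n_branches + 1):
        T.append(2 * x * T[-1] - T[-2])
    return T[1:n_branches + 1]

def parallel_model(x, branch_filters):
    """y[n] = sum_k (h_k * T_k(x))[n]: each branch is a static Chebyshev
    nonlinearity followed by its own linear filter (plain FIR here,
    fixed-pole parallel IIR in the paper)."""
    y = np.zeros(len(x))
    for Tk, hk in zip(cheby_branches(x, len(branch_filters)), branch_filters):
        y += np.convolve(Tk, hk)[:len(x)]
    return y

# T_2 turns a sinusoid into its second harmonic, so a sweep excites each
# branch with exactly one harmonic order:
x = np.cos(0.2 * np.arange(256))
T2 = cheby_branches(x, 2)[1]
print(np.allclose(T2, np.cos(0.4 * np.arange(256))))  # True
```

With measured harmonic responses from a logarithmic sweep, each branch filter would be fitted to one harmonic order, which is the identification procedure the abstract refers to.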

15:30

P16-4 Event-Driven Real-Time Audio Processing with GPGPUs— Tiziano Leidi,1 Thierry Heeb,1 Marco Colla,1 Jean-Philippe Thiran2
1ICIMSI-SUPSI, Manno, Switzerland
2EPFL, Lausanne, Switzerland

Development of real-time audio processing applications for GPGPUs is not without challenges. Parallel processing of audio signals is often constrained by serial dependencies within or between the algorithms. On GPGPUs, insufficient data pressure further limits the attainable performance improvements, as it causes inactivity of the GPU cores. In this paper we analyze the limits of audio processing on GPGPUs and present an approach based on event-driven scheduling that maximizes data pressure to favor performance improvements. We also present recent enhancements of Audio n-Genie, an open-source development environment for audio-processing applications. By combining Audio n-Genie and the proposed approach, we show that it is possible to increase audio processing speed-up.
Convention Paper 8417

16:00

P16-5 A Comparison of Parametric Optimization Techniques for Tone Matching— Matthew Yee-King,1 Martin Roth2
1Goldsmiths, University of London, London, UK
2Reality Jockey Ltd., London, UK

Parametric optimization techniques are compared in their abilities to elicit parameter settings for sound synthesis algorithms, which cause them to emit sounds as similar as possible to target sounds. A hill climber, a genetic algorithm, a neural net, and a data driven approach are compared. The error metric used is the Euclidean distance in MFCC feature space. This metric is justified on the basis of its success in previous work. The genetic algorithm offers the best results with the FM and subtractive test synthesizers, but the hill climber and data driven approach also offer strong performance. The concept of sound synthesis error surfaces, allowing the detailed description of sound synthesis space, is introduced. The error surface for an FM synthesizer is described and suggestions are made as to the resolution required to effectively represent these surfaces. This information is used to inform future plans for algorithm improvements.
Convention Paper 8418
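A minimal version of the hill climber is sketched below. For self-containment the feature extractor is a raw magnitude spectrum rather than MFCCs, and the two-parameter "synthesizer" is an invented FM-style tone, so only the search logic mirrors the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def feature(params):
    """Stand-in for MFCC extraction: magnitude spectrum of a simple
    FM-style tone whose carrier frequency and modulation index are the
    two 'synthesizer' knobs (invented for illustration)."""
    t = np.arange(1024) / 1024
    x = np.sin(2 * np.pi * params[0] * t
               + params[1] * np.sin(2 * np.pi * 50 * t))
    return np.abs(np.fft.rfft(x))

def hill_climb(target_feat, start, steps=300, scale=2.0):
    """Greedy search: accept a random perturbation only if it reduces
    the Euclidean distance in feature space (the paper's error metric)."""
    best = np.array(start, float)
    best_err = np.linalg.norm(feature(best) - target_feat)
    for _ in range(steps):
        cand = best + rng.normal(0, scale, size=2)
        err = np.linalg.norm(feature(cand) - target_feat)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err

target = feature(np.array([120.0, 3.0]))
found, err = hill_climb(target, start=[100.0, 1.0])
print("final feature-space distance:", round(err, 2))
```

Because moves are only accepted downhill, the final error never exceeds the starting error; the ragged "error surfaces" the paper describes are exactly what can trap this kind of search in local minima.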

16:30

P16-6 On the Multichannel Sinusoidal Model for Coding Audio Object Signals— Toni Hirvonen,1 Athanasios Mouchtaris1,2
1Institute of Computer Science, Foundation for Research and Technology–Hellas (FORTH-ICS), Heraklion, Crete, Greece (now with Dolby Laboratories, Stockholm, Sweden)
2University of Crete, Heraklion, Crete, Greece

This paper presents two improvements on a recently proposed multichannel sinusoidal modeling system for coding multiple audio object signals. The system includes extracting the sinusoidal components and an LPC envelope for each object signal, as well as transform coding of the residuals’ downmix. The contributions of this paper are: (a) a psychoacoustic model for enabling the system to scale well with multiple object signals, and (b) an improved method to encode the common residual, tailored to the “white” nature of this signal. As a result, sound quality of around 90% on the MUSHRA scale is obtained for 10 simultaneous object signals coded with a total rate of 150 kbit/s, while retaining the individual object parametric representations.
Convention Paper 8419

17:00

P16-7 An Additive Synthesis Technique for Independent Modification of the Auditory Perceptions of Brightness and Warmth— Asteris Zacharakis, Joshua Reiss, Queen Mary University of London, London, UK

An algorithm that achieves independent modification of two low-level features that are correlated with the auditory perceptions of brightness and warmth was implemented. The perceptual validity of the algorithm was tested through a series of listening tests in order to examine whether the low-level modification was indeed perceived as independent and to investigate the influence of the fundamental frequency on the perceived modification. A Multidimensional Scaling (MDS) analysis on listener responses to pairwise dissimilarity comparisons, accompanied by a verbal elicitation experiment, examined the perceptual significance and independence of the two low-level features chosen. This is a first step for the future development of a perceptually based control of an additive synthesizer.
Convention Paper 8420


Session P17 Sunday, May 15 14:00 – 15:30 Foyer

POSTERS: BINAURAL AND SPATIAL AUDIO

14:00

P17-1 Repeatability of Localization Cues in HRTF Data Bases— Elena Blanco-Martin, Silvia Merino Saez-Miera, Juan José Gomez-Alfageme, Luis Ignacio Ortiz-Berenguer, Universidad Politecnica de Madrid, Madrid, Spain

The Head-Related Transfer Function (HRTF) represents the time-spectral filtering that the head and torso apply to a sound traveling from a sound source to the ears. These transfer functions carry the localization cues that vary according to sound source position (azimuth and elevation). The human auditory system uses such localization cues for estimating the sound direction. HRTFs are used in two ways. One way is for synthesizing binaural sound (virtual 3-D audio). The second way is for analyzing binaural sounds in order to estimate the localization of sound sources. Therefore HRTFs are important and useful data for researchers. There are few public HRTF data bases. The best known is the Kemar Data Base from MIT. There is also the Cipic Data Base from UC Davis and the Itakura Data Base. In this paper we present a new public data base for research use, available on the web. The data base has been measured on the Head and Torso Simulator 4100 by Brüel & Kjær. In addition, a comparative study of localization cues from the data bases is carried out to show their repeatability.
Convention Paper 8421
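One of the localization cues such databases encode, the interaural time difference (ITD), can be estimated from an HRIR pair via cross-correlation. The sketch below uses an artificial delay between two noise signals rather than measured HRIR data:

```python
import numpy as np

def lag_samples(a, b):
    """Lag (in samples) by which signal `a` trails signal `b`, taken as
    the peak of their cross-correlation -- the ITD cue carried by a
    left/right HRIR pair."""
    corr = np.correlate(a, b, mode="full")
    return int(np.argmax(corr)) - (len(b) - 1)

# Toy stand-in for a measured HRIR pair: the right-ear signal is a copy
# of the left-ear signal delayed by 8 samples.
left = np.random.default_rng(1).normal(size=2000)
right = np.r_[np.zeros(8), left[:-8]]
print(lag_samples(right, left))  # 8: the right ear receives the sound later
```

Comparing such lags (and the corresponding level differences) across databases for matching source positions is one way to quantify the repeatability the paper studies.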

14:00

P17-2 Dynamic Head-Related Transfer Function Measurement Using a Dual-Loudspeaker Array— Qinghua Ye, Qiujie Dong, Lingling Zhang, Xiaodong Li, Institute of Acoustics, Chinese Academy of Sciences, Beijing, China

A dynamic Head-Related Transfer Function (HRTF) measurement method using a dual-loudspeaker array is presented, which reduces measuring requirements and increases efficiency as well. First, the dual-loudspeaker array emits uncorrelated signals, while head size and head motion are obtained through short-time time-delay estimation. Second, under the approximation that the system is linear time-invariant (LTI) within the testing time, multi-point continuous HRTF measurement is accomplished. Comparison with a FASTRAK head-tracking system confirms the validity of the proposed method.
Convention Paper 8422

14:00

P17-3 Statistical Analysis of Binaural Room Impulse Responses— Eleftheria Georganti, Alexandros Tsilfidis, John Mourjopoulos, University of Patras, Patras, Greece

In previous work of the authors, the spectral magnitude of room transfer functions (RTFs) was analyzed using histograms and statistical quantities (moments), such as the kurtosis and skewness. In this paper the above analysis is extended to binaural room impulse responses (BRIRs), and the dependence of the statistical measures on the room acoustical properties, such as the reverberation time, the room size, and the source-receiver distance, is discussed. Emphasis is given to the binaural measure of the magnitude squared coherence (MSC), which is considered to be an important cue for binaural hearing related to perceptual aspects such as the source width, the envelopment, and the spaciousness. After a brief overview of the existing MSC models, a perceptually-motivated MSC implementation is examined, based on a gammatone filterbank. MSC results for various rooms and source-receiver positions are presented and related to the existing MSC models.
Convention Paper 8423
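A plain-FFT version of the MSC estimate can be sketched as follows (segment-averaged periodograms rather than the paper's gammatone filterbank; the ear signals are synthetic):

```python
import numpy as np

def msc(x, y, nseg=64):
    """Magnitude squared coherence |Sxy|^2 / (Sxx * Syy), with the
    cross- and auto-spectra averaged over `nseg` windowed segments.
    Averaging is essential: a single segment always yields MSC = 1."""
    L = len(x) // nseg
    w = np.hanning(L)
    X = np.array([np.fft.rfft(w * x[i * L:(i + 1) * L]) for i in range(nseg)])
    Y = np.array([np.fft.rfft(w * y[i * L:(i + 1) * L]) for i in range(nseg)])
    Sxy = np.mean(X * np.conj(Y), axis=0)
    Sxx = np.mean(np.abs(X) ** 2, axis=0)
    Syy = np.mean(np.abs(Y) ** 2, axis=0)
    return np.abs(Sxy) ** 2 / (Sxx * Syy)

rng = np.random.default_rng(2)
common = rng.normal(size=16384)                 # identical at both "ears"
c_coherent = msc(common, common)                # -> 1 at every frequency
c_diffuse = msc(rng.normal(size=16384),
                rng.normal(size=16384))         # independent noise -> near 0
print(np.mean(c_coherent), np.mean(c_diffuse) < 0.2)
```

In a BRIR the direct sound drives the MSC toward 1 while late reverberation drives it down, which is why the measure tracks reverberation time and source-receiver distance.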

14:00

P17-4 Spatial Sound and Stereoscopic Vision— Paul Mannerheim, University of California, Santa Barbara, Santa Barbara, CA, USA

This paper presents a technique for reproducing coherent audio visual images for multiple users wearing only 3-D glasses and without utilizing head tracking. The recent emergence of 3-D content has increased the demand for technology that can display visual images that are coherent with sound images for multiple users. Audio visual object difference is here investigated for analyzing the size of the sweet spot of a system that combines a visual display technique named stereoscopy with a sound reproduction technique called wave field synthesis. The sweet spot of such a configuration is limited due to differences in characteristics between the sound reproduction system and the visual display; however, as a consequence, it is found that the number of sources in the wave field synthesis array can be reduced.
Convention Paper 8424

14:00

P17-5 Designing Ambisonic Decoders for Improved Surround Sound Playback in Constrained Listening Spaces— David Moore,1 Jonathan Wakefield2
1Glasgow Caledonian University, Glasgow, Scotland, UK
2University of Huddersfield, Huddersfield, UK

Much research has been undertaken to optimize irregular 5-speaker Ambisonic decoders for idealized listening environments. In such environments loudspeaker placement is not restricted and can conform to the ITU 5.1 standard. In domestic settings, the room shape, furniture, and television positioning often restrict speaker placement. It is often the case that a compromised speaker layout is enforced by other domestic requirements. This paper seeks to derive Ambisonic decoders to optimize perceived localization performance for these constrained asymmetrical speaker layouts. This work uses a heuristic search algorithm to derive decoder coefficients and simultaneously optimize loudspeaker angle within specified bounds. Theoretical results are shown for different orders of newly derived Ambisonic decoders for typical domestic scenarios.
Convention Paper 8425

14:00

P17-6 Decoding for 3-D— Johannes Boehm, Technicolor, Research & Innovation, Hannover, Germany

Three dimensional spatial sound reproduction using irregular loudspeaker layouts requires a special decoder design. We present the fundamentals of Higher Order Ambisonics (HOA) decoding. Then we focus on beam forming techniques to derive solutions for irregularly spaced setups. Panning functions are created by vector base amplitude panning or least-squares methods as patterns and are modeled by spherical harmonics. The required HOA order for effective beam forming proves to be in inverse proportion to the minimal angular spacing of speakers. We demonstrate decoder design using an example setup of 16 speakers and evaluate objective performance criteria. We conclude that decoders for irregular setups require higher HOA orders compared to decoders for regular setups using the same number of speakers and discuss the consequences.
Convention Paper 8426
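The vector base amplitude panning step mentioned above has a compact closed form; the sketch below computes constant-power gains for a loudspeaker pair in the 2-D case (a simplification — the paper works with 3-D setups and spherical-harmonic modeling):

```python
import numpy as np

def vbap_2d(source_deg, spk1_deg, spk2_deg):
    """Gains g for a phantom source between two loudspeakers: solve
    p = L^T g for the source direction p, then normalize for constant
    power (Pulkki's 2-D vector base amplitude panning)."""
    p = np.array([np.cos(np.radians(source_deg)),
                  np.sin(np.radians(source_deg))])
    L = np.array([[np.cos(np.radians(a)), np.sin(np.radians(a))]
                  for a in (spk1_deg, spk2_deg)])   # rows = speaker unit vectors
    g = np.linalg.solve(L.T, p)
    return g / np.linalg.norm(g)

# A source midway between speakers at +/-45 degrees gets equal gains:
print(np.round(vbap_2d(0.0, 45.0, -45.0), 3))  # [0.707 0.707]
```

Such panning gains, sampled over many directions, give the target patterns that a least-squares HOA decoder design would then approximate with spherical harmonics.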

14:00

P17-7 Real-Time Reproduction of Moving Sound Sources by Wave Field Synthesis: Objective and Subjective Quality Evaluation— Michele Gasparini, Paolo Peretti, Stefania Cecchi, Laura Romoli, Francesco Piazza, Universita Politecnica delle Marche, Ancona, Italy

Wave Field Synthesis (WFS) is an audio rendering technique that allows the reproduction of an acoustic image over an extended listening area. In order to achieve a realistic sensation, true representation of moving sound sources is essential. In this paper a real-time algorithm that significantly reduces the computational effort in the synthesis of WFS driving functions in the presence of moving sound sources is presented. High efficiency can be obtained taking into account a model based on phase approximation and the Short Time Fourier Transform (STFT). The influence of the streaming frame size and of the source velocity on well-known sound field artifacts has been studied, considering PC simulations and listening tests.
Convention Paper 8427

14:00

P17-8 Assessing Diffuse Sound Field Reproduction Capabilities of Multichannel Playback Systems— Andreas Walther, Christof Faller, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

The generation of subjectively diffuse sound fields is an essential part of creating pleasing synthetic sound fields using loudspeaker playback. A number of studies have been published presenting subjective evaluations of the diffuse sound field reproduction capabilities of different loudspeaker setups. We present a model, based on interaural coherence and interaural level difference, for estimating the perceived diffuseness of synthetic sound fields evoked by an arbitrary number of transducers at different positions. The results for different loudspeaker setups and listener orientations are compared.
Convention Paper 8428

Workshop 12 Sunday, May 15 14:00 – 16:00 Room 2

DEAF AID LOOPS AND ASSISTIVE LISTENING SYSTEMS— UNDERSTANDING THE NEEDS AND THE TECHNOLOGY

Chair: Peter Mapp, Peter Mapp and Associates, Colchester, UK

Panelists: Conny Anderson, Univox
Doug Edworthy, DGE Associates
Ken Hollands, Ampetronic
John Woodgate, J M Woodgate & Associates

Although deaf aid loop systems have been around for a considerable period of time, they are still not well understood, and many systems do not achieve an acceptable standard. Deaf aid loops—or AFILS as they are formally known—are installed in all forms of venue or public building, ranging from churches, theaters, and cinemas to railway and underground stations. Loop systems can vary from large systems covering many hundreds of seats to counter-top systems intended for individual users. The workshop will review the theory and practice of Audio Frequency Induction Loop Systems (AFILS) as well as discussing other technologies such as infrared. Ways of optimizing system performance and day-to-day problems will be discussed.

Tutorial 8 Sunday, May 15 14:00 – 15:30 Room 4

VIDEO FOR AUDIO ENGINEERS

Presenter: David Tyas, Ikon, Worcester, UK

Incorporating video into a system can be a good additional source of revenue for audio professionals. It may be simply adding a projector to a theatre or digital signage to augment a VA system. But, with multiple analogue video formats and a transition to an equally confusing array of incompatible digital formats, what connects together and works, how do you convert between the formats, and how do you cable and connect? This short tutorial will cover the basics of video with the emphasis on the newer formats, how to interface these, and how to incorporate legacy products into systems.

Sunday, May 15 14:00 Room Saint Julien
Technical Committee Meeting on Audio for Games

Special Event
AES/APRS EVENT: THE RECORDING LEGACY OF PHIL SPECTOR
Sunday, May 15, 14:30 – 16:15
Room 3

Moderator: Barry Marshall


The First Producer? Phil Spector and the Legacy of the Svengali Producer

This lecture will explore the career of Phil Spector, the legendary figure who arguably created the role of the record producer as we know it today. It will trace Spector’s historical development from his early career as an artist, his apprenticeship with Atlantic Records and Leiber & Stoller (the first credited producers), his flowering as a producer-mogul with his Philles record label, his initial “retirement” in 1967, and his re-invention as the producer of John Lennon and George Harrison, to his later sporadic (yet sometimes still seminal) work with artists like the Ramones. The emphasis of the lecture will be on the development of the role of the producer—through Spector’s initial creation of his “Wall of Sound” and his Svengali approach to artist development—and then through his sublimation of the Svengali role into a more measured approach in his work with the gigantic talents of artists like Lennon and Harrison. The lecture will include lots of examples of Spector’s work.

Sunday, May 15 15:00 Room Saint Julien
Technical Committee Meeting on Acoustics and Sound Reinforcement

Sunday, May 15 15:00 Room Mouton Cadet
Standards Committee Meeting SC-03-06 and SC-03-07 on Digital Library and Archive Systems and Audio Metadata (combined meeting)

Workshop 13 Sunday, May 15 15:45 – 16:45 Room 4

IMPLEMENTATION OF INTERACTIVE MUSIC IN GAMES

Chair: Erasmus Talbot, Black Rock Studio, Brighton, UK

This workshop looks at the implementation of interactive music in current video games. It focuses on the approach taken in “Split/Second,” an arcade racing game packed with action of an epic scale. To reflect its allusions to Hollywood blockbusters in audio, the team turned their backs on licensed music tracks and opted for a bespoke score. Initially all music was delivered externally, but the talk explains how the team became increasingly involved in the music production process. This approach allowed the team to realize creative possibilities that could only emerge from strong familiarity with the game.
The workshop gives a detailed explanation of the novel production method and playback system used in Split/Second and highlights how bold choices, informed by a particular outlook on interactive music, can lead to a simple, yet effective smoke-and-mirrors type solution. It will also highlight some pitfalls that were encountered, which elicited future improvements to the system.

Session EB2 Sunday, May 15 16:00 – 17:30 Foyer

ENGINEERING TOOLS AND METHODS

16:00

EB2-1 Creative Abuse in Time Stretching— Justin Paterson, University of West London, London, UK

An area of digital audio manipulation currently in flux is that of time stretching. Following the emergence of real-time granular synthesis as a compositional tool, early sampler-based implementations were pushed beyond “authenticity” to create new timbres in the commercial music of the 1990s. As the algorithms improve, allowing more flexible and transparent implementation today, even more opportunities for a new “creative abuse” exist. This brief will first contextualize through consideration of the metaphor of authenticity in the tape recording of the 1940s and its soon-parallel abuse, which offered new pathways into multi-tracking and Musique Concrète. The brief will chronologize, and continue by examining the potential for exploitation of stretching artifacts in some contemporary algorithms, and propose a quantification of this effect.
Engineering Brief 9

16:00

EB2-2 Activity Flow in Music Equalization: The Cognitive and Creative Implications of Interface Design— Joshua Mycroft,1 Justin Paterson2
1Queen Mary University of London, London, UK
2University of West London, London, UK

The mixing desk metaphor found in many Digital Audio Workstations (DAWs) creates a quantitative visual display that is highly structured and segmented. While this is useful for transmitting large amounts of quantitative data, it can inhibit the more intuitive and performative aspects inherent in music mixing. This paper’s focus is the cognitive and creative issues encountered using current music production software equalizers and the influence they exert on the initial approach, task workflow, and final output of the user. Equalizers have been chosen to exemplify this, due to their pivotal balance between aural and visual modalities. The paper draws conclusions as to the effectiveness of current software equalizer designs and proposes modifications to design.
Engineering Brief 10

16:00

EB2-3 The Preset Is Dead; Long Live the Preset— Justin Paterson, University of West London, London, UK

The use of preset sounds in audio production has long been scorned by professional producers, some of whom have cited a lack of originality or integrity, or perhaps a proliferation of homogenization in productions. Despite this, manufacturers have continued to develop ever-larger ranges of presets, now extending beyond instruments and effects to EQ and even whole channel strips. Developmental work continues to further automate aspects of the mixing process itself. This brief will examine the implications of presets from yesterday to today and, using Logic Pro as a case study, offer some insight into the relevance of this evolving arena to the professional, and the implications for the enthusiast. It will conclude with some conjecture for the future.
Engineering Brief 11



16:00

EB2-4 Automated Pure-Tone Audiometry Software Tool with Extended Frequency Range— Thomas Bisitz,1 Andreas Silzle2
1HörTech GmbH, Oldenburg, Germany
2Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

In research institutes and other areas it is necessary to check the hearing of test subjects for listening tests (e.g., for assessing the quality of audio processing algorithms) by measuring pure tone audiograms. Often, neither the necessary skilled operator nor a special audiometer for high frequencies (>8 kHz, important for the evaluation of many audio processing algorithms) is available. As an alternative, software for automated audiogram measurements is presented here. This new kind of software runs on a standard PC with a high-quality sound card and audiological headphones and is operated by the test subject him/herself (self-screening). The implemented adaptive procedure allows fast standard audiogram measurements also for high frequencies up to 16 kHz. The challenges with respect to dynamic range and calibration are discussed.
Engineering Brief 12
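The adaptive procedure is not specified in detail in the abstract; a generic up-down staircase of the kind commonly used in automated audiometry might look like the sketch below (the step size, reversal count, and simulated listener are invented, not taken from the brief):

```python
def staircase(respond, start_db=40.0, step_db=5.0, reversals_needed=6):
    """Simple up-down staircase for self-screening: lower the tone level
    after a 'heard' response, raise it after a miss, and average the
    levels at the last reversals to estimate the threshold."""
    level, direction, reversals = start_db, -1, []
    while len(reversals) < reversals_needed:
        heard = respond(level)
        new_dir = -1 if heard else +1
        if new_dir != direction:        # response flipped: record a reversal
            reversals.append(level)
            direction = new_dir
        level += new_dir * step_db
    return sum(reversals[-4:]) / 4

# A deterministic 'listener' who hears everything at or above 22 dB HL:
print(staircase(lambda lvl: lvl >= 22))  # 22.5, bracketing the true threshold
```

A real implementation would additionally randomize inter-stimulus timing and include catch trials, and would repeat this per ear and per test frequency up to 16 kHz.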

16:00

EB2-5 A Simple Reliable Power Amplifier with Minimal Component Count— John Vanderkooy, Kevin B. Krauel, University of Waterloo, Waterloo, Ontario, Canada

We study an audio power amplifier that has three essential active components: two power MOSFETs and one operational amplifier. Such an amplifier will be reliable because MOSFETs have good safe-operating-area properties, and there are no small semiconductors that require high voltage ratings. The topology is that of an op-amp directly driving a grounded-source complementary Class-B MOSFET output stage. The center-tapped power supply for such an amplifier is floating, so each channel must have a separate supply, and there must be a small ±15-volt supply for the op-amp as well. We discuss the design and study the amplifier with simulations and an experimental prototype. It achieves good performance.
Engineering Brief 13

16:00

EB2-6 Parametric Study of Magnet System Topologies for Microspeakers— Holger Hiebel, NXP Semiconductors Austria GmbH, “Sound Solutions,” Vienna, Austria

A parametric simulation study was done in order to compare 3 different electrodynamic magnet system topologies: center-magnet, ring-magnet, and double-magnet (combined inner and outer magnet ring). The study is based on a series of simulations of the BL-factor (FEM and coil winding calculation), moving mass, and effective radiating area of a microspeaker design where the inner coil diameter was changed. The dependencies of the sound pressure level, the electrical quality factor, and the resonance frequency in a closed box on the inner coil diameter were derived to yield comparison charts for the 3 topologies.
Engineering Brief 14

16:00

EB2-7 Signal Processing Applications for Automotive Audio— Robert Cadena, Visteon Corporation, Van Buren Township, MI, USA

The automotive environment presents unique challenges to audio playback. Modern technology such as smart phones and high-efficiency power trains has created a new generation of audio sources and the need for audio systems capable of dealing with them. This Engineering Brief will introduce emerging automotive audio algorithms and their design considerations.
Engineering Brief 15

Sunday, May 15 16:00 Room Saint Julien
Technical Committee Meeting on Network Audio Systems

Workshop 14 Sunday, May 15
16:30 – 18:00 Room 2

CHAMPAGNE LIFESTYLE ON A BEER BUDGET

Chairs: Simon Bishop, UK
Richard Merrick, UK

Simon Bishop and Richard Merrick contrast location audio acquisition from both ends of the budget spectrum, discussing and comparing techniques and tricks collectively acquired over 60 man-years of experience, from being awash with money and equipment to begging, borrowing, and hunting on eBay. Light-hearted but informative, both will prove that it’s not the size of your nail but the skill of the guy with the hammer! Regardless of your budget, we’ll prove you still have to have STANDARDS!!

Student/Career Development Event
RECORDING COMPETITION—PART 2
Sunday, May 15, 16:30 – 18:30
Room 3

The Student Recording Competition is a highlight at each convention. A distinguished panel of judges participates in critiquing finalists of each category in an interactive presentation and discussion. This event presents stereo and surround recordings in these categories:

• Traditional Multi-Track Studio Recording 16:30 to 17:30

• Modern Multi-Track Studio Recording 17:30 to 18:30

The top three finalists in each category, as identified by our judges, present a short summary of their production intentions and the key recording and mix techniques used to realize their goals. They then play their projects for all who attend. Meritorious awards are determined here and will be presented at the closing Student Delegate Assembly Meeting (SDA-2) on Monday afternoon. The competition is a great chance to hear the work of your fellow students at other educational institutions. Everyone learns from the judges’ comments even if your project isn’t one of the finalists, and it’s a great chance to meet other students and faculty members.



Workshop 15 Sunday, May 15
17:00 – 18:00 Room 4

GREEN ISSUES AND THE FUTURE OF TOURING SOUND

Chair: Phil Anthony, Martin Audio

Panelists: Catherine Langabeer, Julie’s Bicycle
Claudio Lastrucci, Powersoft
Andy Mead, Firefly Solar

A 2007 study by the University of Oxford’s Environmental Change Institute, commissioned by music industry “greening” body Julie’s Bicycle, found that the annual overall greenhouse gas emissions for the UK music industry alone were approximately half a million tons of carbon. Of this total, three quarters of the impact derived from activities associated with live music performances.
With such a large share of the UK music industry’s emissions, finding environmentally friendly solutions to the live sector’s carbon and environmental impacts is an ongoing concern—and an opportunity to get ahead of artist and audience demand, more volatile energy costs with all the knock-on impacts, and emerging regulation.
Phil Anthony from Martin Audio, manufacturer of professional audio sound products, will chair this session and will also explain how they reduced the carbon footprint of operating their new touring sound system. Alongside Phil, panelists from Julie’s Bicycle, a non-profit company working with the creative industries to coordinate best practice in sustainability and greenhouse gas emission reduction; Powersoft, the developers of efficient “Green Audio Power” amplifiers; and Firefly Solar, whose provision of AV equipment and solar generators aims to move the industry towards sustainable power solutions, will present information on how members of the live community are implementing green initiatives in their events, tours, and technology, many at low or no additional cost.

Tutorial 9 Sunday, May 15
17:00 – 18:00 Room 5

POWER LIMITATIONS IN MICRO TRANSDUCERS

Presenters: Bo Rohde Pedersen, Rohde Consulting
Andreas Eberhardt Sørensen, Pulse HVT

Voice coil temperature is typically a HOT topic when discussing power handling of transducers, and many manufacturers use IEC tests to specify and rank the input power a specific transducer can endure. For micro speakers this is also the case, but as we are dealing with small coils and thin wires, the need to know and map the thermal factors is even more important. The factors are the input power, losses, and cooling effects that occur around the voice coil. In this session these factors will be mapped, investigated, and demonstrated. The first part of the session holds a mini tutorial on using an infrared thermal camera to measure the voice coil temperature, as well as a discussion of what must be in focus when setting up such a system to measure the thermal aspects of micro transducers.
As the temperature increases, our tests show temperatures of up to 200°C in the voice coil; as a result the Rdc (DC resistance) also increases and the resulting dissipated power decreases. The theoretical thermal model of the transducer will be examined and revised to fit the findings of our research. The thermal model used is of second order and includes nonlinearities to describe the change of the voice coil resistance and the velocity-dependent cooling of the voice coil.
Under test are two micro loudspeakers, one with a round and one with a rectangular voice coil, because the demand for higher output and lower resonance frequencies has moved development from small round coils to larger (relative to the transducer) rectangular coils. The effect of this transition will be documented, and the differences as well as the change in the thermal factors will be explained. We want to show with this session that the trend has an overall positive effect on the thermal behavior of the voice coil and the micro transducer’s capability to endure higher input power.
Finally, this tutorial should end with a discussion of the actual situation in the micro transducer market: customers are asking for more power from the driving amplifier, and the first question is whether the micro transducer will be able to handle that input power. Today the amplifiers deliver 1–1.5 W into 8 ohms with a built-in voltage step-up, but customers are asking for 2–3 W, and micro transducers will have to be designed to handle input abuse of this scale. Once this power handling is in place, the next question will be: what is gained on the output side of the micro transducer?

Sunday, May 15 17:00 Room Saint Julien
Technical Committee Meeting on Semantic Audio Analysis

Sunday, May 15 17:00 Room Mouton Cadet
Standards Committee Meeting SC-04-01 on Acoustics and Sound Source Modeling

Special Event
BANQUET
Sunday, May 15, 18:45 – 22:30
The Musical Museum, 399 High Street, Brentford, Middlesex

This year the Banquet will take place at The Musical Museum, which contains one of the world’s foremost collections of automatic instruments. From the tiny clockwork musical box to the self-playing “Mighty Wurlitzer,” the collection embraces an impressive and comprehensive array of sophisticated reproducing pianos, orchestrions, orchestrelles, residence organs, and violin players. On the ground floor there are 4 galleries to display working instruments; upstairs is a concert hall seating 230, complete with stage and, of course, an orchestra pit from which the Wurlitzer console will rise to entertain you, just as it did in the cinema in the 1930s.
While having a pre-dinner drink, you will be able to wander round the exhibits and even listen to some of them. There will also be time at the end of the evening to look round if you missed it before dinner.
The ticket price includes all food and drinks and the bus to the museum and back. There are limited spaces at this event, so book now to guarantee a place! The bus will leave the Novotel at 18:45 and return approximately half an hour later for the second group. The bus will be available at 22:30 for those needing an early night and will return to collect the final group approximately half an hour later.

£60 for AES members and nonmembers. Tickets will be available at the Special Events desk.




Session P18 Monday, May 16 09:00 – 13:00
Room 1

SOURCE ENHANCEMENT

Chair: John Mourjopoulos, University of Patras, Patras, Greece

09:00

P18-1 Dereverberation in the Spatial Audio Coding Domain— Markus Kallinger,1 Giovanni Del Galdo,1 Fabian Kuech,1 Oliver Thiergart2
1Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
2International Audio Laboratories, Erlangen, Germany

Spatial audio coding techniques are fundamental for recording, coding, and rendering spatial sound. Especially in teleconferencing, spatial sound reproduction helps make a conversation feel more natural, reducing the listening effort. However, if the acoustic sources are far from the microphone arrangement, the rendered sound may easily be corrupted by reverberation. This paper proposes a dereverberation technique that is integrated efficiently into the parameter domain of Directional Audio Coding (DirAC). Utilizing DirAC’s signal model, we derive a parametric method to reduce the diffuse portion of the recorded signal. Instrumental quality measures and informal listening tests confirm the efficiency of the proposed method in rendering a spatial sound scene less reverberant without introducing noticeable artifacts.
Convention Paper 8429
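The kind of parametric reduction of the diffuse portion described in the abstract can be sketched as a time-frequency gain derived from DirAC’s diffuseness parameter. This is a generic sketch under the standard DirAC signal model, not the authors’ exact method; the function name and the gain floor are illustrative assumptions.

```python
import numpy as np

def dirac_dereverb(W, psi, floor=0.1):
    """Attenuate the diffuse portion of a DirAC-coded signal.

    W    : complex STFT of the omnidirectional (W) channel, shape (frames, bins)
    psi  : DirAC diffuseness parameter in [0, 1], same shape
    floor: minimum gain, limits artifacts from over-suppression
    """
    # In the DirAC signal model the direct-sound energy fraction is (1 - psi),
    # so sqrt(1 - psi) is a gain that keeps the direct part and suppresses
    # the diffuse (reverberant) part.
    gain = np.maximum(np.sqrt(1.0 - psi), floor)
    return W * gain
```

The gain floor plays the role of a simple regularizer, trading residual reverberation against processing artifacts.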

09:30

P18-2 Blind Single-Channel Dereverberation for Music Post-Processing— Alexandros Tsilfidis, John Mourjopoulos, University of Patras, Patras, Greece

Although dereverberation can be useful in many audio applications, such techniques often introduce artifacts that are unacceptable in audio engineering scenarios. Recently, the authors have proposed a novel dereverberation approach, suitable for both speech and music signals, based on perceptual reverberation modeling. Here, the method is fine-tuned for sound engineering applications and tested for both natural and artificial reverberation. The results show that the proposed technique efficiently suppresses reverberation without introducing significant processing artifacts and that the method is appropriate for the post-processing of music recordings.
Convention Paper 8430

10:00

P18-3 Joint Noise and Reverberation Suppression for Speech Applications— Elias K. Kokkinis, Alexandros Tsilfidis, Eleftheria Georganti, John Mourjopoulos, University of Patras, Patras, Greece

An algorithm for joint suppression of noise and reverberation from speech signals is presented. The method requires a handclap recording that precedes speech activity. A running kurtosis technique is applied in order to extract an estimate of the late reflections of the room impulse response from the clap, while a moving average filter is employed for the noise estimation. Moreover, the excitation signal derived from the Linear Prediction (LP) analysis of the noisy speech, along with the estimated power spectrum of the late reflections, is used to suppress late reverberation through spectral subtraction, while a Wiener filter compensates for the ambient noise. A gain magnitude regularization step is also implemented to reduce overestimation errors. Objective and subjective results show that the proposed method achieves significant speech enhancement in all tested cases.
Convention Paper 8431
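The spectral-subtraction stage with gain magnitude regularization can be sketched as follows. This is a generic textbook form of the technique the abstract names, not the authors’ implementation; the over-subtraction factor `alpha` and the gain floor are illustrative assumptions.

```python
import numpy as np

def suppress_late_reverb(X_pow, R_pow, alpha=1.0, floor=0.05):
    """Spectral-subtraction gain for late-reverberation suppression.

    X_pow: power spectrum of the reverberant/noisy speech frame
    R_pow: estimated power spectrum of the late reflections
    alpha: over-/under-subtraction factor
    floor: gain magnitude regularization, limits overestimation errors
    """
    # Subtract the late-reverb power estimate and convert to a spectral gain.
    gain = (X_pow - alpha * R_pow) / np.maximum(X_pow, 1e-12)
    # Clipping to [floor, 1] is the gain magnitude regularization step.
    return np.clip(gain, floor, 1.0)
```

The same gain shape is typically applied per STFT frame; an analogous Wiener gain would handle the ambient-noise term.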

10:30

P18-4 System Identification for Acoustic Echo Cancellation Using Stepped Sine Method Related to FFT Size— TaeJin Park, Seung Kim, Koeng-mo Sung, Seoul National University, Seoul, Korea

A stepped sine method was applied for system identification to cancel acoustic echoes in a speakerphone system of the kind widely used in recent mobile devices. We applied the stepped sine method by regarding the Discrete Fourier Transform (DFT) as a uniform-DFT filter bank. By using this stepped sine method, we were able to obtain more accurate and detailed characteristics of nonlinearity, dependent on the amplitude and frequency of speech. We stored the nonlinearity information in linear transform matrices and estimated the responses of the mobile device speaker. The proposed method exhibits higher echo return loss enhancement (ERLE) and increased correlation when compared to the conventional method.
Convention Paper 8432

11:00

P18-5 Using Spaced Microphones with Directional Audio Coding— Mikko-Ville Laitinen,1 Fabian Kuech,2 Ville Pulkki1
1Aalto University, Espoo, Finland
2Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

Directional audio coding (DirAC) is a perceptually motivated method to reproduce spatial sound, which typically uses input from first-order coincident microphone arrays. This paper presents a method to additionally use spaced microphone setups with DirAC. It is shown that since diffuse sound is incoherent between spatially separated microphones at certain frequencies, no decorrelation in DirAC processing is needed, which improves the perceived quality. Furthermore, the directions of sound sources are perceived to be more accurate and stable.
Convention Paper 8433

11:30

P18-6 Parameter Estimation in Directional Audio Coding Using Linear Microphone Arrays— Oliver Thiergart,1 Michael Kratschmer,2 Markus



Kallinger,2 Giovanni Del Galdo2
1International Audio Laboratories, Erlangen, Germany
2Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

Directional Audio Coding (DirAC) provides an efficient description of spatial sound in terms of a few audio downmix signals and parametric side information, namely the Direction Of Arrival (DOA) and the diffuseness of the sound. Traditionally, the parameters are derived based on the active sound intensity vector, which is often determined via 2-D or 3-D microphone grids. Adapting this estimation strategy to linear arrays, which are preferred in various applications due to form factor constraints, yields comparatively poor results. This paper proposes to replace the intensity-based DOA estimation in DirAC by a specific estimation of signal parameters via rotational invariance techniques, namely ESPRIT. Moreover, a diffuseness estimator exploiting the correlation between the array sensors is presented. Experimental results show that the DirAC concept can be applied in practice also in conjunction with linear arrays.
Convention Paper 8434
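A minimal LS-ESPRIT DOA estimator for a uniform linear array, the technique family the abstract refers to, might look like the following. This is a generic textbook sketch, not the authors’ implementation; the mic spacing `d` is given in wavelengths.

```python
import numpy as np

def esprit_doa(X, n_src, d=0.5):
    """LS-ESPRIT DOA estimation for a uniform linear array.

    X     : array snapshots, shape (n_mics, n_snapshots)
    n_src : number of sources
    d     : microphone spacing in wavelengths
    Returns DOAs in degrees (broadside = 0).
    """
    R = X @ X.conj().T / X.shape[1]            # spatial covariance estimate
    _, vecs = np.linalg.eigh(R)                # eigenvalues ascending
    Es = vecs[:, -n_src:]                      # signal subspace
    # Rotational invariance between the two overlapping subarrays:
    # Es[1:] ~= Es[:-1] @ Phi, solved in the least-squares sense.
    Phi = np.linalg.lstsq(Es[:-1], Es[1:], rcond=None)[0]
    phases = np.angle(np.linalg.eigvals(Phi))  # inter-element phase shifts
    return np.degrees(np.arcsin(phases / (2 * np.pi * d)))
```

For a noise-free single source the estimate is exact, which makes the sketch easy to sanity-check with a simulated steering vector.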

12:00

P18-7 Extraction of Voice from the Center of the Stereo Image— Aki Härmä, Munhum Park, Philips Research Laboratories Eindhoven, Eindhoven, The Netherlands

Detection and extraction of the center vocal source is important for many audio format conversion and manipulation applications. First, we study some generic properties of stereo signals containing sources panned exactly to the center of the stereo image and propose an algorithm for the separation of a stereo audio signal into center and side channels. At the 128th AES Convention a paper was presented (Convention Paper 8071) on listening tests comparing the perception of the widths of the stereo images of synthetic signal combinations. In this paper the same experiment is repeated with real stereo audio content using the proposed center separation algorithm. The main observation is that there are clear differences in the results. The reasons for the differences are discussed in light of the literature and analysis of the test signals and their binaural characteristics in the listening test setup.
Convention Paper 8435

12:30

P18-8 Directional Segmentation of Stereo Audio via Best Basis Search of Complex Wavelet Packets— Jeremy Wells, University of York, York, North Yorkshire, UK

A system for dividing time-coincident stereo audio signals into directional segments is presented. The purpose is to give greater flexibility in the presentation of spatial information when two-channel audio is reproduced. For example, different inter-channel time shifts could be introduced for segments depending on their direction. A novel aspect of this work is the use of complex wavelet packet analysis, along with “best basis” selection, in an attempt to identify time-frequency atoms that belong to only one segment. The system is described, with reference to the relevant underlying theory, and the quality of its output for the best bases from complex wavelet packets is compared with methods based on more established analysis and processing methods.
Convention Paper 8436

Session P19 Monday, May 16 09:00 – 10:30
Foyer

POSTERS: ROOM ACOUSTICS

09:00

P19-1 System Identification of Equalized Room Impulse Responses by an Acoustic Echo Canceller Using Proportionate LMS Algorithms— Stefan Goetze,1 Feifei Xiong,1 Jan Ole Jungmann,2 Markus Kallinger,3 Karl-Dirk Kammeyer,4 Alfred Mertins2
1Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany
2University of Lübeck, Lübeck, Germany
3University of Oldenburg, Oldenburg, Germany (now with Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany)
4University of Bremen, Bremen, Germany

Hands-free telecommunication systems usually employ subsystems for acoustic echo cancellation (AEC), listening-room compensation (LRC), and noise reduction in combination. This contribution discusses a combined system of a two-stage AEC filter and an LRC filter to remove reverberation introduced by the listening room. An inner AEC is used to achieve initial echo reduction and to perform the system identification needed for the LRC filter. An additional outer AEC is used to further reduce the acoustic echoes. The performance of proportionate filter update schemes such as the so-called proportionate normalized least mean squares algorithm (PNLMS) or the improved PNLMS (IPNLMS) for system identification of the equalized impulse response (IR) is shown, and the mutual influences of the subsystems are analyzed. If the LRC filter succeeds in shaping a sparse overall IR for the concatenated system of LRC filter and room impulse response (RIR), the PNLMS performs best, since it is optimized for the identification of sparse IRs. However, the equalization may be imperfect due to channel estimation errors in periods of convergence and due to the so-called tail effect of AEC, i.e., the fact that only the first part of a RIR is identified in practical systems. The IPNLMS is more appropriate in this case to identify the equalized IR.
Convention Paper 8437
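The IPNLMS coefficient update the abstract compares can be sketched as below. This is the generic textbook form rather than the authors’ code; the step size `mu` and sparseness parameter `alpha` are illustrative assumptions. Setting `alpha = -1` recovers plain NLMS, while values near 1 weight the update toward large (sparse) coefficients, PNLMS-style.

```python
import numpy as np

def ipnlms_update(w, x, e, mu=0.5, alpha=0.0, eps=1e-8):
    """One IPNLMS adaptation step.

    w : current filter estimate (length L)
    x : most recent input vector, same length (newest sample first)
    e : a-priori error for this sample
    """
    L = len(w)
    l1 = np.sum(np.abs(w)) + eps
    # Per-coefficient step-size gains: a mix of a uniform term and a
    # magnitude-proportionate term favoring large coefficients.
    k = (1 - alpha) / (2 * L) + (1 + alpha) * np.abs(w) / (2 * l1)
    norm = x @ (k * x) + eps                 # weighted input power
    return w + mu * e * k * x / norm
```

The proportionate gains `k` are what make convergence fast on sparse impulse responses, matching the trade-off the abstract describes.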

09:00

P19-2 Virtual Room Acoustics: A Comparison of Techniques for Computing 3-D-FDTD Schemes Using CUDA— Craig Webb, Stefan Bilbao, University of Edinburgh, Edinburgh, Scotland, UK

High fidelity virtual room acoustics can be




approached through direct numerical simulation of wave propagation in a defined space. Three-dimensional Finite Difference Time Domain schemes can be employed and adapt well to a parallel programming model. This paper examines the various approaches for calculating these schemes using the Nvidia CUDA architecture. We test the different possibilities for structuring computation, based on the available memory objects and thread-blocking model. A standard test simulation is computed at double precision under different arrangements. We find that a 2-D extended tile blocking system, combined with shared memory usage, produces the fastest computation for our scheme. However, shared memory usage is only marginally faster than direct global memory access using the latest FERMI GPUs.
Convention Paper 8438
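A minimal interior update of the standard second-order 3-D FDTD scheme, the kind of kernel such CUDA implementations parallelize, can be sketched in NumPy. This is a generic sketch under rigid-boundary assumptions; boundary handling is omitted and the function name is illustrative.

```python
import numpy as np

def fdtd_step(p, p_prev, courant2):
    """One interior update of the 7-point 3-D FDTD scheme.

    p, p_prev : pressure grids at time steps n and n-1, shape (Nx, Ny, Nz)
    courant2  : squared Courant number (c*dt/dx)**2; must be <= 1/3 for stability
    """
    # Discrete 7-point Laplacian over the interior points.
    lap = (-6.0 * p[1:-1, 1:-1, 1:-1]
           + p[2:, 1:-1, 1:-1] + p[:-2, 1:-1, 1:-1]
           + p[1:-1, 2:, 1:-1] + p[1:-1, :-2, 1:-1]
           + p[1:-1, 1:-1, 2:] + p[1:-1, 1:-1, :-2])
    p_next = p.copy()
    # Leapfrog time update: p(n+1) = 2 p(n) - p(n-1) + lambda^2 * Laplacian.
    p_next[1:-1, 1:-1, 1:-1] = (2.0 * p[1:-1, 1:-1, 1:-1]
                                - p_prev[1:-1, 1:-1, 1:-1]
                                + courant2 * lap)
    return p_next
```

On a GPU, each output point of this stencil maps naturally onto one thread, which is why the thread-blocking layout dominates performance.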

09:00

P19-3 Acoustic Parameters of Chosen Orthodox Churches: Overview and Preliminary Psychoacoustic Tests Using Choral Music— Pawel Maleck, Jerzy Wiciak, AGH University of Science and Technology, Kraków, Poland

Much acoustic research has been done for Roman Catholic churches, but because of differences in tradition and culture compared to Orthodoxy, its acoustic estimators are not well suited to Orthodox churches. The paper shows results of measurements in several Orthodox churches in Poland and proposes psychoacoustic tests using the convolution technique, which would allow formulating new acoustic guidelines for Orthodox churches. The research has been done with particular regard to choral music, which is an inseparable part of Eastern Christian culture; as a test, Orthodox choir sound samples recorded in an anechoic chamber were convolved with the impulse responses of the measured churches.
Convention Paper 8439

09:00

P19-4 A Comparative Study of Various “Optimum” Room Dimension Ratios— John Sarris, Aretaieio University Hospital, Athens, Greece

Various “optimum” room dimension ratios that have been proposed in the literature are studied and compared. Since each proposal is based on a different criterion, independent objective measures of the acoustic quality of the various room ratios are applied in this paper. The most straightforward metric is the flatness of a room’s corner-to-corner frequency response, but since this is not representative of the variations of the sound pressure within the closed space, different metrics are employed to quantify these variations. Simulation results are presented to evaluate the effectiveness of the various ratios for the case of a small and a larger room.
Convention Paper 8440

09:00

P19-5 Perception of Spatial Distribution of Room Response Reproduced Using Different Loudspeaker Setups— Javier Gomez,1 Rafael C. D. Paiva,1,2 Kai Saksela,1 Thomas Svedström,1 Ville Pulkki1
1Aalto University, Espoo, Finland
2Nokia Technology Institute INdT, Brazil

A listening test was conducted to assess the effect that different direct-to-reverberant ratios, loudspeaker setups, and reverberation times have on the directional perception of a synthetic room response. The results show that the perceived spatial distribution of reverberation is dependent on the speaker setup, on the reverberation time, and on the listener. Additionally, they show that a small amount of direct sound, even when barely masked by reverberation, shifts the overall perception to the front.
Convention Paper 8441

09:00

P19-6 Modal Analysis and Sound Field Simulation of Vibration Panels in a Free Sound Field and in a Rectangular Enclosure— Kazuhiko Kawahara,1 Akihiro Sonogi,1 Shin-ya Sato1,2
1Kyushu University, Fukuoka, Japan
2Nittobo Acoustic Engineering Co., Ltd., Tokyo, Japan

Several implementations of panel loudspeakers have been proposed, for example the distributed mode loudspeaker (DML). The diaphragm of a DML is thought to behave like a bending plate, whereas the diaphragm of an electrostatic loudspeaker is thought to move as a rigid plate. What is the difference in the generated acoustic sound field for each implementation? In this paper we used models of rectangular panels of the same dimensions: one a rigid diaphragm, the other a bending panel with an actuator at its center. We simulated directivity and sound pressure distribution in a rectangular room and could show a less coherent SPL distribution in the case of the bending panel.
Convention Paper 8442

09:00

P19-7 Simple Room Acoustic Analysis Using a 2.5 Dimensional Approach— Patrick Macey, PACSYS Limited, Nottingham, UK

Cavity modes of a finite bounded region with rigid boundaries can be used to compute the steady state harmonic response for excitation of an acoustic cavity by a point source. In cuboid domains, with analytical modes, this is straightforward. For more general regions, determining a set of orthonormal modes is more difficult. The finite element method could be used for arbitrary 3-D regions. The current work investigates a hybrid numerical/analytical method, applicable to rooms of constant height but arbitrary cross section. A 2-D FE analysis is used to compute the cross-section modes/frequencies. Three-dimensional modes are constructed as an outer product of the cross-section modes and 1-D modes for the height. Comparison is made with BEM computations.
Convention Paper 8443
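The outer-product construction of 3-D modes from cross-section and height modes can be sketched for the simplest case, where the cross-section eigenfrequencies are already known. This generic sketch assumes rigid boundaries and separable wavenumbers (k² = k_xy² + k_z²); it is not the paper’s FE/BEM implementation, and the function name is illustrative.

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def modes_25d(f_cross, height, n_height):
    """Combine 2-D cross-section mode frequencies with 1-D height modes.

    f_cross : iterable of cross-section eigenfrequencies in Hz (e.g., from a
              2-D FE analysis; for a rectangular Lx x Ly cross section these
              are (C/2)*sqrt((nx/Lx)**2 + (ny/Ly)**2))
    height  : room height in meters, rigid floor and ceiling
    n_height: number of 1-D height modes to include
    Returns sorted 3-D mode frequencies f = sqrt(f_xy**2 + f_z**2).
    """
    f_z = C * np.arange(n_height) / (2.0 * height)  # 1-D height modes
    # Outer sum of squared frequencies implements the separable wavenumbers.
    grid = np.sqrt(np.add.outer(np.asarray(f_cross) ** 2, f_z ** 2))
    return np.sort(grid.ravel())
```

For a cuboid room this reproduces the familiar analytical mode frequencies, which makes the hybrid method easy to validate.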



09:00

P19-8 Early Reflections Design for Natural Stereo Sound Listening— Chul-Jae (Jay) Yoo, RADSONE, Inc., Gyeonggi-do, Korea

For natural stereo sound in rooms, several considerations must be followed. For example, to maximize time density, a Feedback Delay Network (FDN) is used in which every element of the feedback matrix has absolute value 1. To model faster high-frequency decay with an increasing number of reflections, absorption filters after the delay lines in the FDN are generally used. A mixing matrix for uncorrelated L/R output is also used. In this paper an elaborate design scheme for the first portion of the early reflections is proposed to obtain higher IACC values than those of conventional systems, resulting in a more precise sound source image with ambience.
Convention Paper 8444
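A minimal FDN of the kind described, with a ±1-entry feedback matrix scaled for stability, might be sketched as follows. This is a generic textbook sketch, not the paper’s design; the feedback gain and delay lengths are illustrative assumptions, and the absorption filters and L/R mixing matrix from the abstract are omitted for brevity.

```python
import numpy as np

def fdn_render(x, delays, fb_matrix, g=0.7):
    """Run a mono signal through a minimal feedback delay network.

    x         : input samples
    delays    : delay-line lengths in samples (mutually prime lengths help density)
    fb_matrix : feedback matrix; sign (+/-1) matrices scaled by 1/sqrt(N),
                e.g. a Hadamard matrix, maximize echo-time density
    g         : broadband feedback gain (< 1 for stability)
    """
    N = len(delays)
    lines = [np.zeros(d) for d in delays]   # circular delay lines
    ptr = [0] * N
    y = np.zeros(len(x))
    for n, xn in enumerate(x):
        outs = np.array([lines[i][ptr[i]] for i in range(N)])  # delay-line taps
        y[n] = outs.sum()
        feedback = g * (fb_matrix @ outs)   # cross-coupled feedback
        for i in range(N):
            lines[i][ptr[i]] = xn + feedback[i]
            ptr[i] = (ptr[i] + 1) % delays[i]
    return y
```

An impulse fed into the network first reappears after each delay length, then recirculates through the feedback matrix, building up echo density over time.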

Workshop 16 Monday, May 16
09:00 – 11:00 Room 3

UP- AND DOWNMIXING—APPROACHES, SOLUTIONS

Chair: Florian Camerer, ORF – Austrian TV

Panelists: Tom Allom, Penteo
Antoine Hurtado, Isostem
Clemens Par, Swissaudec
Pieter Schillebeeckx, Soundfield
Christian Struck, Lawo

Upmix solutions are coming to the market with increasing speed. Broadcasters need those solutions, especially for HD services, to fulfill the audience’s expectations of consistent surround sound presentation. Various approaches can be found that try to provide a convincing illusion of a discrete surround sound signal. In this workshop, 6 company representatives will present their products in a short introduction. Then their upmix results of 2 stereo tracks previously given to them will be played and discussed. Downmix aspects will also be touched on, as this is also a vital ingredient of a suitable surround sound signal.

Tutorial 10 Monday, May 16
09:00 – 10:30 Room 4

ANALOG STANDARDS—WHY ARE THERE SO MANY?

Presenter: Sean Davies, S.W. Davies Ltd., Aylesbury, UK

“We like standards, that’s why we have so many.” (variously attributed)

Although most current audio work is in the digital domain, there remains a vast legacy of recordings made in analog, access to which will be required for some considerable time to come. Knowledge is therefore required of the standards applicable to each recording, as well as of the relevant technical literature.
This tutorial will seek to explain how and why the seemingly perverse plurality of standards arose. Fields covered will be: (1) Impedances: why 30 ohms (“R”), 200R, 300R, 600R; (2) Levels: the TU, the Neper, the Decibel; (3) Metering: the VU versus the PPM; (4) Disc recordings: speeds, equalization curves; (5) Tape recordings: speeds, equalization curves; (6) A bag of miscellaneous horrors such as color codes, connectors, and so on.

Tutorial 11 Monday, May 16
09:00 – 10:15 Room 5

ROADIE—HOW TO GAIN AND KEEP A CAREER IN THE LIVE MUSIC INDUSTRY

Presenter: Andy Reynolds, Owner, LiveMusicBusiness.com

Live music is a huge industry—65 million people around the world attended a concert in 2010, with gross ticket sales of over $3 billion. Every single one of those concerts needs a sound reinforcement system and audio engineers to rig the system and help enhance the sound coming from the stage.
But how do you gain, and more importantly keep, an audio job in the live music business? How do you differentiate between the different audio roles on a show or event? What specialized skills are relevant to concert audio engineering?
This tutorial will examine the role of the audio engineer in modern concert production, look at routes into the industry, and offer advice on finding and dealing with clients as well as the technical skills and knowledge that every live audio engineer should be aware of.

Special Event
AES/APRS/DV247 EVENT: TALKBACK-PRO 3
THE MYTHS AND MYSTERIES OF DSP
Monday, May 16, 09:00 – 10:45
Room 2

Moderator: Wes Maebe

Where would we be without DSP? Given the colossal and ubiquitous influence DSP processing has on all our functionality, the answer probably would be “way back in the analog world.” But is there good DSP and bad DSP? What should our priorities be when it comes to spending on A/D – D/A, soft auxiliaries, tracking and mixing programs, and mastering/finalizing tools?
Issues with interfaces, latency, compression, and plain taste will always plague the subjective judgments of engineers and producers. Is there a “normal” digital base upon which creativity can grow unhindered by the pitfalls of mismatched technologies, uneven standards, and deliberate hurdles to interoperability?
Clearing the cloudy mists that envelop proprietary toys will be a panel of technologists, enthusiasts, and practitioners.

Monday, May 16 09:00 Room Saint Julien
Standards Committee Meeting AESSC Plenary

Tutorial 12 Monday, May 16
10:30 – 12:00 Room 5

YOU, A ROOM, AND A PAIR OF HEADPHONES: A LESSON IN BINAURAL AUDIO

Presenter: Ben Supper, Focusrite

Stereo recordings that are intended for loudspeaker reproduction do not work properly over headphones. Methods for converting stereo for headphone monitoring have become increasingly elaborate as the availability of digital signal processing has increased. The challenge is




to arrive at a stereophonic effect that is as the composer, engineer, and producer intended: stable, with correct perspective, out-of-head localization, and no excessive polarization of the stereo image. To do these things successfully is not as trivial as may first be assumed. No two listeners are identical; the choice of headphones is a matter of taste. Moreover, the illusion needs reverberation, but this must not get in the way of the music.
This tutorial picks a path through the peculiarities of the human hearing system, discussing aspects of our perception of loudspeakers and rooms, and how close we can come to understanding and modeling them completely. Insights will be given into how the human auditory system can be made to believe a virtually rendered sound field, how to do this cheaply, and why we don’t need to cram the entire gamut of reality onto a tiny piece of silicon.

Monday, May 16 10:30 Room Mouton Cadet
Technical Committee Meeting on Fiber Optics for Audio

Workshop 17 Monday, May 16
11:00 – 13:00 Room 2

(WHY) DOES LIVE SOUND HAVE A BAD NAME?

Chair: Mark Bailey, UK

Panelists: Nichola Bailey, Nichola Bailey School of Voice
Bob Lee, QSC Audio Products
David Millward, Freelance FOH Engineer
Ian Nelson, FOH Engineer, Adlib Audio
Bruce Olson, Acoustic Consultant, ADA Acoustics
Tuomo Tolonen, Shure Distribution UK

In this session we will explore the aspects that contribute to good sound and those that conspire against it. Our panel of practitioners, users, and manufacturers will discuss and debate matters determined to be the scourge (and saviour) of live audio.
In addition to discussion, we will attempt to practically demonstrate some of the points. We look forward to seeing you there!

Tutorial 13 Monday, May 16
11:00 – 13:30 Room 4

HOT AND NONLINEAR—LOUDSPEAKERS AT HIGH AMPLITUDES

Presenter: Wolfgang Klippel, Klippel GmbH, Dresden, Germany

Nonlinearities inherent in electro-dynamical transducers and the heating of the voice coil and magnetic system limit the acoustical output and generate distortion and other symptoms at high amplitudes. The large-signal performance is the result of a deterministic process and is predictable by lumped parameter models comprising nonlinear and thermal elements. The tutorial gives an introduction to the fundamentals, shows alternative measurement techniques, and discusses the relationship between the physical causes and the symptoms, depending on the properties of the particular stimulus (test signal, music). Selection of meaningful measurements, interpretation of the results, and practical loudspeaker diagnostics are the main objectives of the tutorial, which is important for designing small and light transducers producing the desired output at high efficiency and reasonable cost.

Workshop 18 Monday, May 16 11:30 – 13:00 Room 3

THE EUROVISION SONG CONTEST 2010 IN OSLO—A CASE STUDY

Co-chairs: Gaute Nistov, NRK Oslo
Ina von Lucas, NRK Oslo

The Eurovision Song Contest is the biggest live transmission in Europe. The demands on all aspects of production are daunting, and the bar has been raised in past years to challenging heights. Sound Supervisor Gaute Nistov tells how he stepped up to the plate, and Ina von Lucas will elaborate on the complex communication and routing system installed.

Monday, May 16 12:30 Room Mouton Cadet
Technical Council Meeting

Session P20 Monday, May 16 14:00 – 16:00 Room 1

SUBJECTIVE EVALUATION

Chair: Bruno Fazenda, University of Salford, Salford, UK

14:00

P20-1 Selection of Audio Stimuli for Listening Tests—Jonas Ekeroot,1 Jan Berg,1 Arne Nykänen2
1Luleå University of Technology, Piteå, Sweden
2Luleå University of Technology, Luleå, Sweden

Two listening test methods in common use for the subjective assessment of audio quality are the ITU-R recommendations BS.1116-1 for small impairments and BS.1534-1 (MUSHRA) for intermediate quality. They stipulate the usage of only critical audio stimuli (BS.1116-1) to reveal differences among systems under test, or critical audio stimuli that represent typical audio material in a specific application context (MUSHRA). A poor selection of stimuli can cause experimental insensitivity and introduce bias, leading to inconclusive results. At the same time this selection process is time-consuming and labor-intensive, and is difficult to conduct in a systematic way. This paper reviews and discusses the selection of audio stimuli in listening test-related studies.
Convention Paper 8445

14:30

P20-2 A Listening Test System for Automotive Audio: PART 5 – The Influence of Listening Environment on the Realism of Binaural Reproduction—Francois Postel,1,2 Patrick Hegarty,3 Søren Bech3
1Bang & Olufsen A/S, Struer, Denmark (now at Arkamys, Paris, France)
2Department of Mechanical Vibrations and Acoustics, UTC, Compiegne, France
3Bang & Olufsen A/S, Struer, Denmark

Binaural technology is used to capture elements of an in-car sound field and reproduce them over headphones at another place and time. An experiment to test the influence of the listening environment on the realism of such a binaural reproduction is described. A panel of 12 trained listeners rated a range of stimuli for 6 elicited attributes of sound quality. Ratings were made for the actual sound field in the test vehicle, for a binaural reproduction in the same vehicle, and for a binaural reproduction in a listening room. The results show that the tested binaural reproduction system is able to preserve either the rank order or the perceived magnitudes of the impressions of the sound field for the attributes Precision, Treble, Stereo impression, Bass, and Reverberation, independent of the listening environment.
Convention Paper 8446

15:00

P20-3 Differences in Preference for Noise Reduction Strength between Individual Listeners—Rolph Houben,1 Tjeerd M. H. Dijkstra,2,3 Wouter A. Dreschler1
1Academic Medical Center, Amsterdam, The Netherlands
2Radboud University, Nijmegen, The Netherlands
3Technical University Eindhoven, The Netherlands

There is little research on user preference for different settings of noise reduction, especially for individual users. We therefore measured individual preference for pairs of audio streams differing in noise reduction strength. Data were analyzed with a logistic probability model based on a quadratic preference utility function. This allowed for an estimate of the optimal setting for each individual subject. For five out of ten subjects the optimized setting differed significantly from the optimum obtained for the grouped data. However, the predicted preference for the individual optimum (60%) was only slightly higher than chance level (50%), which can be considered too weak to advocate individualization of noise reduction for these normally hearing subjects. In hearing-impaired subjects, however, this may be different.
Convention Paper 8447
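As a rough illustration of the kind of analysis the abstract describes, the sketch below fits a quadratic-utility logistic choice model to toy paired-comparison data and grid-searches the per-listener optimum. The function names, the grid-search fitting, and the data are illustrative assumptions, not the paper's actual model or results.

```python
import math

def choice_prob(s_a, s_b, s_opt, scale=1.0):
    """P(listener prefers setting s_a over s_b) under a quadratic utility
    U(s) = -scale * (s - s_opt)^2 with a logistic link."""
    u_a = -scale * (s_a - s_opt) ** 2
    u_b = -scale * (s_b - s_opt) ** 2
    return 1.0 / (1.0 + math.exp(-(u_a - u_b)))

def fit_optimum(pairs, grid):
    """Grid-search the noise-reduction setting maximizing the log-likelihood
    of observed paired preferences. pairs = [(s_a, s_b, a_preferred), ...]."""
    def loglik(s_opt):
        ll = 0.0
        for s_a, s_b, a_pref in pairs:
            p = choice_prob(s_a, s_b, s_opt)
            ll += math.log(p if a_pref else 1.0 - p)
        return ll
    return max(grid, key=loglik)

# Toy data from a listener whose preferred strength is somewhere around 4
pairs = [(3, 8, True), (4, 8, True), (1, 4, False), (7, 4, False), (5, 4, False)]
best = fit_optimum(pairs, grid=[g / 10 for g in range(0, 101)])
```

The per-subject optimum is simply the argmax of the fitted utility; comparing it against the group-level optimum is then a matter of refitting on pooled data.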

15:30

P20-4 Assessment of Stereo to Surround Upmixers for Broadcasting—David Marston, BBC R&D, London, UK

Broadcasters are now transmitting 5.1-channel surround sound as part of their HD TV services. However, since much of the audio content in original program material is 2-channel stereo, broadcasters are required to switch between the two formats. Switching between 2- and 5.1-channel formats can cause problems in decoding the content, including switching artifacts and loudness changes. In general it is preferable to transmit all program audio in 5.1 surround, and this can be achieved by automatically upmixing any 2-channel stereo content to 5.1 format prior to broadcasting. This paper reports on tests designed to assess the subjective performance of a selection of upmixers for use in the broadcast chain.
Convention Paper 8448
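For readers unfamiliar with upmixing, the sketch below shows one classic passive 2-to-5 matrix: the center channel is fed from the mid (sum) signal and the surrounds from the side (difference) signal. This is a generic textbook construction, not necessarily any of the upmixers assessed in the paper, and the gain values are arbitrary examples.

```python
def passive_upmix(left, right, center_gain=0.7071, surround_gain=0.5):
    """Passive 2-to-5 channel upmix sketch: center from the mid signal,
    surrounds from the side signal (antiphase rear pair)."""
    out = {"L": [], "R": [], "C": [], "Ls": [], "Rs": []}
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)    # common (center) content
        side = 0.5 * (l - r)   # ambience / decorrelated content
        out["L"].append(l)
        out["R"].append(r)
        out["C"].append(center_gain * mid)
        out["Ls"].append(surround_gain * side)
        out["Rs"].append(-surround_gain * side)
    return out

# A signal panned dead center produces no surround output
res = passive_upmix([1.0, 0.5], [1.0, 0.5])
```

Real broadcast upmixers add decorrelation and time-frequency adaptive steering on top of this basic mid/side split.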

Student/Career Development Event
STUDENT DELEGATE ASSEMBLY MEETING—PART 2
Monday, May 16, 12:15 – 13:45
Room 5

At this meeting the SDA will elect a new vice chair. One vote will be cast by the designated representative from each recognized AES student section in the Europe and International Regions. Judges' comments and awards will be presented for the Recording Competitions. Plans for future student activities at local, regional, and international levels will be summarized.

Session P21 Monday, May 16 14:00 – 15:30 Foyer

POSTERS: PROCESSING AND ANALYSIS

14:00

P21-1 Automatic Classification of Musical Audio Signals Employing Machine Learning Approach—Pawel Zwan, Bozena Kostek, Adam Kupryjanow, Gdansk University of Technology, Gdansk, Poland

This paper presents a thorough analysis of automatic classification applied to musical audio signals. The classification is based on a chosen set of machine learning algorithms. A database of 60 music composers/performers was prepared for the purpose of the described research. For each of the musicians, 15 to 20 music pieces were collected. All the pieces were partitioned into 20 segments and then parameterized. The feature vector consisted of 171 parameters, including MPEG-7 low-level descriptors and mel-frequency cepstral coefficients (MFCC) complemented with time-related dedicated parameters. The task of the classifier was to recognize the composer/performer and to properly categorize a selected piece of music. The paper also presents and discusses the results of classification.
Convention Paper 8449
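The classification stage of such a system can be illustrated with a minimal nearest-centroid classifier over feature vectors. The toy 2-D features and class names below are placeholders for the 171-parameter vectors and the actual machine learning algorithms used in the paper.

```python
import math

def centroid(vectors):
    """Mean feature vector of a class."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def classify(sample, class_centroids):
    """Assign a parameterized segment to the nearest composer/performer
    centroid by Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(class_centroids, key=lambda name: dist(sample, class_centroids[name]))

# Toy 2-D feature vectors standing in for 171-parameter vectors
train = {"composerA": [[0.1, 0.2], [0.2, 0.1]],
         "composerB": [[0.9, 0.8], [0.8, 0.9]]}
centroids = {name: centroid(vecs) for name, vecs in train.items()}
label = classify([0.15, 0.18], centroids)
```

Per-piece decisions would then typically be made by majority vote over the 20 classified segments of a piece.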

14:00

P21-2 Evaluation of Onset Detection Algorithms in Popular Polyphonic Music on a Large Scale Database—Stephan Hübler, Rüdiger Hoffmann, Technische Universität Dresden, Dresden, Germany

This paper introduces a large database of popular polyphonic music containing drums (10,238 onsets) for the evaluation of onset detection algorithms. The database has been manually annotated by expert listeners. The inter-rater variability leads to an understanding of inter-human variations. Four common detection functions are investigated: spectral difference, high frequency content, phase deviation, and the psychoacoustic one of Klapuri. We present an additional detection function based on the MPEG-7 feature audio spectrum envelope. An adaptive peak picker determines the onsets, which are compared with the manual labels. Results show that detection functions based on spectral difference obtain observably better results. The study provides a thorough investigation of onset detection algorithms in popular polyphonic music.
Convention Paper 8450
[Paper presented by Till Moritz Essinger]
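The spectral difference detection function named in the abstract, combined with an adaptive peak picker, can be sketched as follows. The window size, bias, and median-based threshold are illustrative choices, not the paper's exact settings.

```python
def spectral_difference(spectra):
    """Half-wave rectified spectral difference (flux) between successive
    magnitude spectra: energy increases signal possible onsets."""
    flux = [0.0]
    for prev, cur in zip(spectra, spectra[1:]):
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux

def pick_onsets(flux, window=3, bias=0.1):
    """Adaptive peak picker: a frame is an onset if it is a local maximum
    and exceeds the median of a surrounding window plus a fixed bias."""
    onsets = []
    for i in range(1, len(flux) - 1):
        lo, hi = max(0, i - window), min(len(flux), i + window + 1)
        local = sorted(flux[lo:hi])
        median = local[len(local) // 2]
        if flux[i] > flux[i - 1] and flux[i] >= flux[i + 1] and flux[i] > median + bias:
            onsets.append(i)
    return onsets

# Toy magnitude spectra: energy jumps into the upper bins at frame 3
spectra = [[1, 0, 0]] * 3 + [[1, 2, 2]] * 3
onsets = pick_onsets(spectral_difference(spectra))
```

Evaluation against manual labels then counts detected onsets falling within a tolerance window of an annotation.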

14:00

P21-3 Using the Viterbi Algorithm for Error Correction in an Autocorrelation-Based Pitch Detector—Bob Coover, Gracenote, Inc., Emeryville, CA, USA

An autocorrelation-based method for detecting the fundamental pitch of an audio signal is presented in which the Viterbi algorithm is used in place of the error correction portion of the detector. The Viterbi algorithm is used to locate the most likely pitch path through the audio file. This method is compared to a typical heuristic and median filtering-based error correction approach that has historically been used in this type of algorithm. The Viterbi algorithm results are significantly better than the typical error correction methods for choosing the best and most plausible path through the pitch estimates.
Convention Paper 8451
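A minimal version of the Viterbi pitch-path idea: each frame supplies (pitch, autocorrelation strength) candidates, and the algorithm trades emission strength against a transition penalty on pitch jumps. The jump cost, scoring, and toy data here are assumptions for illustration, not the paper's formulation.

```python
def viterbi_pitch(candidates, jump_cost=0.01):
    """Pick the most plausible pitch path through per-frame candidates.
    candidates[t] is a list of (pitch_hz, strength) pairs; strength acts as
    the emission score, and large pitch jumps between frames are penalized."""
    scores = [s for _, s in candidates[0]]
    backptr = []
    for frame in candidates[1:]:
        prev_frame = candidates[len(backptr)]
        new_scores, ptrs = [], []
        for pitch, strength in frame:
            # Best predecessor: prior score minus a cost for jumping in pitch
            best_j = max(range(len(scores)),
                         key=lambda j: scores[j] - jump_cost * abs(pitch - prev_frame[j][0]))
            ptrs.append(best_j)
            new_scores.append(scores[best_j]
                              - jump_cost * abs(pitch - prev_frame[best_j][0])
                              + strength)
        scores = new_scores
        backptr.append(ptrs)
    # Backtrack from the best-scoring final candidate
    idx = max(range(len(scores)), key=lambda i: scores[i])
    path = [idx]
    for ptrs in reversed(backptr):
        path.insert(0, ptrs[path[0]])
    return [candidates[t][i][0] for t, i in enumerate(path)]

# Frame 2's strongest candidate (210 Hz) is an octave-style error; the
# transition penalty keeps the path on the consistent low track.
cands = [[(100, 0.9), (200, 0.5)],
         [(105, 0.4), (210, 0.6)],
         [(110, 0.9), (220, 0.3)]]
path = viterbi_pitch(cands)
```

This is precisely the behavior that per-frame heuristics and median filtering approximate only locally: the Viterbi path is globally optimal under the chosen scores.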

14:00

P21-4 Spectral Equalization for GHA-Applied Restoration to Damaged Historical 78 rpm Records—Teruo Muraoka, Takahiro Miura, Tohru Ifukube, The University of Tokyo, Tokyo, Japan

The authors have been engaged in research on in-harmonic frequency analysis (GHA), which enables the separation of desired signal components from noise. Its primary purpose has been noise reduction. Recently, the authors succeeded in running GHA in practical time and carried out many sound restorations of historical 78 rpm records. Thanks to GHA's clean separation of the target signal components from noisy ones, the restored signal is nearly noise-free; however, its tone quality is unnatural when it is reproduced using current audio equipment. This is due to the fact that the recorded sounds were tuned to match the audio equipment of their age; therefore spectral equalization is necessary. In practice, extreme frequency emphasis is required, but this had been impossible because of scratch noise. GHA-applied restoration removed these difficulties, and an equalization curve was obtained by comparing the long-term spectrum of the restored music with that of the same music recorded by current musicians. The equalizations are generally very complicated and were done using a parametric equalizer.
Convention Paper 8452
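The equalization-curve step described at the end of the abstract amounts to a per-band ratio of two long-term spectra. A hedged sketch, in which the band values and the dB formulation are illustrative rather than taken from the paper:

```python
import math

def eq_curve_db(restored_spectrum, reference_spectrum, floor=1e-12):
    """Per-band equalizer gains (dB) mapping the long-term power spectrum of
    the restored record onto that of a modern reference recording."""
    return [10.0 * math.log10(max(ref, floor) / max(res, floor))
            for res, ref in zip(restored_spectrum, reference_spectrum)]

# Toy long-term power spectra: the old record lacks lows and highs
restored = [0.1, 1.0, 0.01]
reference = [1.0, 1.0, 0.1]
gains = eq_curve_db(restored, reference)
```

The resulting per-band gains would then be approximated with parametric equalizer sections, as the abstract notes.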

14:00

P21-5 Selection of Approximated Activation Functions in Neural Network-Based Sound Classifiers for Digital Hearing Aids—Lorena Álvarez, Cosme Llerena, Enrique Alexandre, Roberto Gil-Pita, Manuel Rosa-Zurera, University of Alcalá, Alcalá de Henares, Spain

The feasible implementation of signal processing techniques on hearing aids is constrained by the limited number of instructions per second available to implement the algorithms on the digital signal processor the hearing aid is based on. This adversely limits the design of a neural network-based classifier embedded in the hearing aid. Aiming to help the processor achieve accurate enough results while reducing the number of instructions per second, this paper explores the most adequate approximations for the activation function. The experimental work proves that the approximated neural network-based classifier achieves the same efficiency as that reached by the "exact" networks (without these approximations) but, and this is the crucial point, with the added advantage of greatly reducing the computational cost on the digital signal processor.
Convention Paper 8453
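As a toy example of the trade-off the paper explores, a single-segment piecewise-linear sigmoid replaces the exponential with a compare and a multiply-add. This particular crude approximation is an assumption for illustration; practical designs use more segments or lookup tables.

```python
import math

def sigmoid(x):
    """Exact logistic activation (needs an exponential)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_pwl(x):
    """Single-segment piecewise-linear approximation: on a fixed-point DSP
    this costs only a compare and a multiply-add."""
    if x <= -4.0:
        return 0.0
    if x >= 4.0:
        return 1.0
    return 0.5 + x * 0.125  # straight line through (0, 0.5)

# Worst-case approximation error over [-8, 8]
max_err = max(abs(sigmoid(x / 100.0) - sigmoid_pwl(x / 100.0))
              for x in range(-800, 801))
```

Whether such an error budget is acceptable is exactly what classifier-level experiments like the paper's must decide.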

14:00

P21-6 Development of Multiband Dynamic Range Compressor Regarding Noise Characteristics—Hoon Heo, Mingu Lee, Seokjin Lee, Koeng-Mo Sung, Seoul National University, Seoul, Korea

It is hard to hear sound from digital TVs or mobile phones in noisy environments because of the masking effect. This could be addressed by simple amplification; however, special processing of the masked bands offers a better solution in some restricted situations. We propose an algorithm named “perceptual irrelevant component elimination,” using a modified multiband dynamic range compressor, which does not increase the signal level and enhances the perceptual signal-to-noise ratio by about 1 dB for speech signals and about 3 dB for music signals.
Convention Paper 8454
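A deliberately simplified sketch of the masked-band idea: bands whose level falls below the noise level plus a masking margin are attenuated, freeing headroom without raising the overall signal level. The threshold margin and attenuation values are invented for illustration; the paper's compressor is considerably more refined.

```python
def eliminate_masked_bands(signal_db, noise_db, margin_db=3.0, atten_db=-20.0):
    """Per-band gains (dB): attenuate bands that the environmental noise
    masks anyway, leaving audible bands untouched."""
    gains = []
    for s, n in zip(signal_db, noise_db):
        masked = s < n + margin_db
        gains.append(atten_db if masked else 0.0)
    return gains

# Band levels in dB: the middle band is buried under the noise
gains = eliminate_masked_bands(signal_db=[60, 40, 70], noise_db=[45, 50, 50])
```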

14:00

P21-7 Designing Sets of N Doubly Complementary IIR Filters—Alexis Favrot, Christof Faller, Illusonic LLC, Lausanne, Switzerland

A filter design procedure is described for obtaining sets of N doubly complementary IIR filters for any N and any bandpass frequencies. The N doubly complementary IIR filters are built following a tree-like structure based on pairs of doubly complementary IIR filters and additional all-pass filters. The sum of all band signals is an all-pass filtered version of the original audio signal. The complementary IIR filters can be used instead of an analysis filterbank (full rate). The corresponding synthesis filterbank is simply the sum of all band signals. The proposed filters enable high-quality, delay-critical audio processing.
Convention Paper 8455
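The two-band base case of such a construction can be sketched with a first-order all-pass: LP = (x + A(x))/2 and HP = (x - A(x))/2 form a doubly complementary pair, and their sum reproduces the input, which is the property the tree structure generalizes to N bands. The coefficient value below is an arbitrary example.

```python
def allpass1(x, a):
    """First-order all-pass A(z) = (a + z^-1) / (1 + a*z^-1), direct form."""
    y, x1, y1 = [], 0.0, 0.0
    for xn in x:
        yn = a * xn + x1 - a * y1
        y.append(yn)
        x1, y1 = xn, yn
    return y

def complementary_pair(x, a=-0.5):
    """Lowpass/highpass pair from one all-pass: LP = (x + A(x))/2,
    HP = (x - A(x))/2. The pair is doubly complementary, so the synthesis
    stage is just the sum of the band signals."""
    ax = allpass1(x, a)
    lp = [(xn + an) / 2.0 for xn, an in zip(x, ax)]
    hp = [(xn - an) / 2.0 for xn, an in zip(x, ax)]
    return lp, hp

x = [1.0, 0.0, 0.0, 0.0]  # unit impulse
lp, hp = complementary_pair(x)
recon = [l + h for l, h in zip(lp, hp)]  # reconstructs the input exactly
```

Because the bands are built from all-pass sections, the decomposition has no amplitude distortion and very low latency compared with a conventional analysis/synthesis filterbank.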


Workshop 19 Monday, May 16 14:00 – 15:30 Room 4

AUTOMIXING AND ARTIFICIAL INTELLIGENCE

Chair: Michael “Bink” Knowles, Independent Live Sound Mixer

Panelists: Alice Clifford, Queen Mary University of London, London, UK
Ron Lorman, Busy Puppy Productions
Josh Reiss, Queen Mary University of London, London, UK

This workshop will cover two aspects of automixing: the current use of automixers for live sound mixing, and a projection of the future of artificial intelligence and automatic functions applied to live sound.

The first part of the workshop will focus on the problems facing the operator who is mixing panel discussions and improvisational theater, and the benefits of automixers under the control of human hands. The second part will consider the aspects of live sound mixing that may be totally automated in the future, including the setting of input gain, high-pass filtering, channel equalization, and the more distant goal of musical balance.

Tutorial 14 Monday, May 16 14:00 – 15:30 Room 3

SURROUND SOUND FORMATS

Presenter: Hugh Robjohns, Technical Editor and Trainer, Project Manager

After many years of Surround Sound becoming mainstream in the broadcast business, this tutorial looks back at the basics of formats and technical details of surround sound, covering diverse matrix techniques like Dolby Surround, compression systems like DTS or Dolby AC-3, as well as broadcast-specific coding schemes like Dolby E. Metadata, lineup tones, and mixing issues with respect to the coding systems will be touched upon, too.

Special Event
AES/APRS EVENT: I DIDN'T GET WHERE I AM TODAY . . .
Monday, May 16, 14:00 – 15:45
Room 2

Moderator: Peter Filleul, APRS

Panelists: David Fisher
Elliott Randall
Dennis Weinreich

Careers are magically unpredictable things. So many successful icons in the audio world will blame serendipity, a chance meeting, or the whims of luck for how they ended up. Others had a clear direction and a specific focus and have stuck to their original ambition, and some swear by training and qualifications.

Prominent audio characters describe what influenced their “career moves,” whether flexibility, a broad outlook, and the “maverick” spirit are essential elements to achieving longevity in the audio business, and muse on the expectations generated by audio education and training. This is a 90-minute seminar moderated by Peter Filleul of the APRS, exploring the differences between the theory and the practice of the career ambitions of a set of accomplished audio addicts.

Workshop 20 Monday, May 16 15:45 – 17:45 Room 4

LOUDNESS IN BROADCASTING—EBU RECOMMENDATION R 128 STEPS ON THE GAS

Chair: Florian Camerer, ORF – Austrian TV, and Chairman of EBU PLOUD

Panelists: Yannick Dumartineix, TSR Geneva
Matthieu Parmentier, France Television
Alessandro Travaglini, Fox Italia

The EBU group PLOUD has now published all of its intended documents, centered around the loudness recommendation R 128. Implementation of this concept is underway, and the loudness normalization paradigm is gaining ground. The end of the “loudness war” is nigh! At least in broadcasting, that seems to be the prospect. . . .

In this workshop, members of PLOUD who have already implemented the concepts of R 128 will talk about their experiences and illustrate them with examples.

Tutorial 15 Monday, May 16 16:00 – 17:30 Room 2

DELAY FX—LET'S NOT WASTE ANY MORE TIME

Presenter: Alex Case, University of Massachusetts at Lowell, Lowell, MA, USA

The humble delay is an essential effect for creating the loudspeaker illusion of music that is better than real, bigger than life. A broad range of effects—comb filtering, flanging, chorus, echo, reverb, pitch shift, and more—are born from delay. One device, one plug-in, and you've got a vast palette of production possibilities. Join us for this thorough discussion of the technical fundamentals and production strategies for one of the most powerful signal processes in the audio toolbox: delay.
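The building block behind all of these effects is a simple delay line with feedback. A minimal sketch, in which the delay length, feedback, and mix values are arbitrary examples:

```python
def echo(x, delay_samples, feedback=0.5, mix=0.5):
    """Basic feedback delay line: short fixed delays give comb filtering,
    short modulated delays give flanging/chorus, long delays give echo."""
    buf = [0.0] * delay_samples  # circular delay buffer
    pos = 0
    out = []
    for xn in x:
        delayed = buf[pos]
        buf[pos] = xn + feedback * delayed  # feed the wet signal back in
        pos = (pos + 1) % delay_samples
        out.append((1.0 - mix) * xn + mix * delayed)
    return out

# An impulse produces a geometrically decaying train of echoes
y = echo([1.0] + [0.0] * 9, delay_samples=3)
```

Swapping the fixed `delay_samples` for a slowly modulated fractional delay turns the same structure into a flanger or chorus.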
