

PAPERS

On the Naturalness of Two-Channel Stereo Sound*

GÜNTHER THEILE

Institut für Rundfunktechnik GmbH, D-8000 München 45, Germany

Psychoacoustic principles are considered in order to enhance the naturalness of the sound image achievable in a conventional two-loudspeaker arrangement. It is found that simulation of depth and space is lacking when the coincident microphone and panpot techniques are applied. To obtain optimum simulation of spatial perspective it is important for the two-loudspeaker signals to have interaural correlation that is as natural as possible. This requirement is met by the so-called sphere microphone, used as a main microphone, associated with the room-related balancing technique, which generates artificial reflections and reverberation from spot-microphone signals. Music recordings confirm that the sphere microphone combines favorable imaging characteristics with regard to spatial perspective, accuracy of localization, and sound color; and that the room-related balancing technique is able to preserve this stereophonic quality.

* Presented at the AES 9th International Conference, Detroit, MI, 1991 February 1-2.

J. Audio Eng. Soc., Vol. 39, No. 10, 1991 October, pp. 761-767

0 INTRODUCTION

A particularly large number of studies have been published during the last few years with the goal of improving the capabilities of current stereophony. This applies to microphone techniques as well as to mixing and reproduction techniques, and major overall progress can be expected. In this paper possible developments in stereophonic recording technique are described, which may improve the "naturalness" of the stereophonic sound image in the playback room.

First, how can the desired naturalness of the stereophonic sound image be defined? The simplest theorem would be: the reproduced sound image must correspond to the original sound image. This definition appears to be problematic because identity can definitely not be required, in principle, as a goal for optimizing the stereophonic transmission technique. Identity may conceivably be appropriate for head-referred stereophony, or perhaps for the reproduction of a speaker's voice through loudspeakers, but it is probably appropriate to a limited extent only for the reproduction of the sound of a large orchestra through loudspeakers. Aesthetic irregularities in the orchestra, poor conditions of room acoustics, as well as the necessity of creating a sound image "suitable for a living room"--in other words, the essential problems of loudspeaker stereophony--actually force a deviation from identity. The desired natural stereophonic sound image should therefore meet two requirements: it should satisfy aesthetically and it should match the tonal and spatial properties of the original sound at the same time.

Both requirements will undoubtedly be contradictory in many situations. However, the compromise, namely, optimization by the sound engineer, will be the better, the more flexible the stereophonic recording technique is and the more accurately the psychoacoustic principles are understood and taken into account from the technical and artistic points of view.

1 STEREOPHONIC IMAGING OF SPACE

Which stereophonic loudspeaker signals does the ear require so that a natural sound image is achieved? What kind of quality of stereophonic presentation of direction, distance, and spatial impression¹ is possible, in principle, in the case of conventional two-channel loudspeaker reproduction? The following fundamental statements have been derived in earlier papers by means of the association model [3] for spatial hearing.

¹ The term spatial impression comprises two attributes of the sound image [1], [2]: (1) reverberance (a temporal slurring of auditory events [2]), which is caused by late reflections and reverberation, and (2) auditory spaciousness (a spatial spreading of auditory events [2]), which is caused by early reflections in the range of 10-80-ms delay.

1) The distance of the phantom sound source [2] is equal to the (mean) distance from the two stereo loudspeakers. The spatial perspective can only be represented in the simulation plane between the loudspeakers in a manner similar to the perspective presentation in the visual area [4] (see Fig. 1). The real distance from the loudspeakers corresponds to the real distance from the picture.

2) The spatial perspective in the simulation plane is better achieved as the interaural signal differences during natural listening are imitated more accurately by the loudspeaker signal differences [4]. Due to an inverse filtering process postulated in [3]-[5], the auditory system recognizes the relations between the left and right loudspeaker signals independent of the binaural crosstalk and evaluates them according to the listening experience.

Thus, in principle, optimum presentation of direction, distance, and space in the simulation plane is made possible by the stereophonic signal differences generated by a dummy head [4]. A dummy-head signal which produces the head-referred three-dimensional perception of space during headphone reproduction generates, during loudspeaker reproduction,² an equivalent loudspeaker-referred presentation of the spatial perspective in the simulation plane, which is comparable to the spatial perspective of a picture.

² Loudspeaker reproduction does not include the technique of biphonal reproduction, that is, loudspeaker reproduction techniques that aim to simulate headphone reproduction by compensating the interaural crosstalk portions (a survey is found in [6]). The biphonal reproduction methods cannot be considered as a possibility to improve the capability of loudspeaker stereophony because the listening area is always minimal.

Fig. 1. The distance of this picture can be compared with the distance of stereo loudspeakers. The visual perspective, which is simulated by applying phenomena of spatial vision, can be compared to the stereophonic perspective, which can be simulated by applying corresponding phenomena of spatial hearing.

To verify this important statement, suitable experiments can be carried out. When the dummy-head signals are compared in a listening test with stereophonic signals which do not provide the head-specific interaural signal differences with sufficient accuracy, a relatively high degree of sensitivity of the ear to such interaural "irregularities" is noted during playback through headphones. The quality of the perceived spatial image suffers in some way and to some degree. The result of the same listening comparison during playback through loudspeakers is both surprising and impressive: the quality of the perceived spatial image (in the simulation plane) suffers in a similar way and to almost the same degree. Two examples are shown in Fig. 2.

1) A dummy-head recording of a sound source [such as a loudspeaker located on the right side of the dummy head; see Fig. 2(a)] will produce a correspondent image on the right side of the headphone listener and, in the case of loudspeaker reproduction, a sharp image located close to the right loudspeaker, due to maximum magnitudes of the interaural signal differences of the dummy head (case A). When the maximum natural interaural time difference of 0.74 ms is enlarged to the unnatural value of about 1 ms by means of a delay device (case B), the stereophonic quality drops distinctly. This is true even in the case of playback through loudspeakers. The sound event appears in a more blurred and vague manner in the case of the unnatural interaural time difference of 1 ms.

2) When comparing a dummy-head recording A and a coincident-microphone recording B of an orchestra by means of headphones, the superiority of the dummy head is obvious. The dummy-head signal produces a head-referred natural spatial impression, but the coincident-microphone signal produces a poor spatial impression. It is important that a corresponding stereophonic quality difference can be observed in the case of loudspeaker reproduction: the dummy-head signal generates a loudspeaker-referred presentation of the spatial perspective in the simulation plane (according to Fig. 1), but the coincident-microphone signal produces a flat distribution of sound sources between the two loudspeakers in front of the listener without simulating spatial perspective. The coincident-microphone signal, which does not provide any head-specific interaural signal difference, fails not only in generating a head-referred presentation of the authentic spatial impression and depth, but also in generating a loudspeaker-referred simulation of the spatial impression and depth.

Fig. 2. Two examples for demonstrating stereophonic quality differences in headphone and loudspeaker reproduction. (a) Sound source recorded with a dummy head. (b) Orchestra recorded with a dummy head and with a coincidence microphone.

Summarizing, loudspeaker stereophony according to the association model is based on introducing corresponding physical attributes of the ear signals (which correlate with phenomena of natural spatial hearing) into the stereophonic signals [4]. This is contradictory to summing localization theories, which attempt to introduce them into the resulting ear signals of the listener. Even today, attempts are made to assess stereophonic techniques on the basis of summing localization theories (see, for example, [7], [8]). As a recent example, Lipshitz has concluded that coincidence-microphone techniques are most advantageous for getting a natural spatial impression [7]:

"I believe that spaced-microphone recording techniques are fundamentally flawed, although highly regarded in some quarters, and that coincident-microphone recordings are the correct way to go."

His arguments are based on an analysis of the resulting interaural characteristics of the listener's ear signal, according to the principle of summing localization:

"The level and time (or phase) differences at the listener's ears are not the same as those at the loudspeakers .... It is important that, as far as possible, the two loudspeaker signals combine at the listener's ears to produce cues which are compatible with natural hearing."

However, natural interaural attributes of the listener's ear signals can only be obtained by using the dummy-head technique (head-referred imaging). In contrast, conventional two-loudspeaker stereophony is a loudspeaker-referred imaging technique, and it is important that, as far as possible, the two loudspeaker signals contain natural interaural attributes rather than the resultant listener's ear signals in the playback room. This requirement is not met by pure intensity or time stereophony, and the stereophonic quality is not advantageous with respect to depth and space imaging (intensity stereophony) or localization (time stereophony) in comparison to dummy-head, OSS, or ORTF techniques, as found in practical comparison tests on the performance of the main microphones in different concert halls [9].

On this basis, there are possibilities for optimizing the stereophonic presentation of direction, distance, and spatial impression through two loudspeakers. The so-called sphere microphone [10] and the room-related balancing techniques [11]-[13] represent appropriate optimization approaches for the recording end.

2 SPHERE MICROPHONE

In [10] a microphone system has been proposed where two boundary-layer microphones are placed on the sides of a sphere, as shown in Fig. 3. This so-called sphere microphone produces stereophonic signals which are composed of natural interaural differences, quite similar to dummy-head signals, as required in Sec. 1. However, in contrast to the dummy head, it has a linear frontal frequency response (see Fig. 4, upper curve). The sphere-microphone signal does not contain those dummy-head-specific spectral cues, which are used for front-back orientation during headphone listening [2], [5] (Fig. 4, lower curve), but which are not used during loudspeaker-referred presentation and would therefore cause coloration problems [10].

The frequency responses of the sphere microphone Schoeps KFM 6U are plotted in Fig. 5. They are linear for sound reaching the sphere from the front (0°), and the sum of left and right energy is frequency independent for sound sources moving toward the side. Also, the frequency response is linear for the integrated resultant of sounds reaching the sphere from any angle in the reverberant (diffuse) field [3]. The choice of pressure capsules sharply reduces sensitivity to air motion, while ensuring frequency response to the lower limit of human hearing (see Fig. 5). The sphere microphone fuses advantageous features of the systems already in general use to produce a stereo image of outstanding naturalness and spatial integrity, combined with excellent sound-color neutrality and low-frequency response.

At present the sphere microphone Schoeps KFM 6U (Fig. 6) is being tested in different situations and compared with other main microphones. First results confirm that the sphere microphone in fact combines favorable imaging characteristics with respect to spatial perspective, accuracy of localization, and sound color.
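The statement that a sphere yields interaural time differences close to those of natural hearing can be checked roughly with a geometric model. The following Python sketch uses the classical Woodworth ray approximation for a rigid sphere, which is not taken from this paper. The radius of 0.1 m and the speed of sound of 343 m/s are assumptions chosen only for illustration, and the function name is hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees C


def sphere_itd_seconds(azimuth_deg: float, radius_m: float) -> float:
    """Interaural time difference for a rigid sphere (Woodworth ray model).

    azimuth_deg: source angle measured from the front, 0..90 degrees.
    radius_m:    sphere radius; an illustrative parameter, not a value
                 stated in the paper.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must lie in [0, 90]")
    theta = math.radians(azimuth_deg)
    # Path difference = direct part (a * sin(theta)) plus the arc around
    # the sphere to the far side (a * theta), divided by the speed of sound.
    return (radius_m / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


if __name__ == "__main__":
    assumed_radius_m = 0.1  # assumption for illustration only
    for angle in (0.0, 30.0, 60.0, 90.0):
        itd_ms = 1000.0 * sphere_itd_seconds(angle, assumed_radius_m)
        print(f"{angle:5.1f} deg -> {itd_ms:.3f} ms")
```

Under these assumptions the value for lateral incidence (90°) comes out at roughly 0.75 ms, which is of the same order as the maximum natural interaural time difference of 0.74 ms quoted in Sec. 1. The model ignores frequency dependence and diffraction detail, so it indicates magnitude only.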

Fig. 3. Principal function of sphere microphone (panels: frontal incidence, lateral incidence).

Fig. 4. Frequency response of sphere microphone and dummy head (0° free field minus diffuse field).

Fig. 5. Frequency response of sphere microphone (free field 20°, 40°, 60°). Upper curves--right; lower curves--left; middle curve--sum of left and right energies.

3 ROOM-RELATED BALANCING

Theoretical considerations [11] and practical tests [13] show that spot-microphone signals can be added to the sphere-microphone signal without disturbing the perception of spatial perspective. A disturbing effect occurs in the case of conventional panpot balancing, as demonstrated in Fig. 7. The signal picked up by a spot microphone is reproduced earlier than the corresponding main-microphone signal. Thus the ear interprets the spot-microphone signal as the direct sound [11], [14], and due to the precedence effect, the favorable imaging characteristics of the sphere microphone (or any appropriate main microphone) are lost. Such recordings sound flat, without spatial depth.

It is common practice to moderate this space-disturbing effect by artificial reverberation or by compensating the delay of the main-microphone signal (see, for example, [14]). However, those techniques are not satisfying, because the stereophonic quality of the direct sound is not improved by this method. In practice a delay compensation leads to "notching" effects, which are particularly disturbing when the musicians move about near the spot microphone. Experiments have shown [13] that pure panpot balancing can even be preferred in comparison to delay-compensated panpot balancing, depending on the recording situation and the desired balancing gain.³

To preserve the perception of spatial perspective due to the main-microphone signal and, at the same time, achieve a high balancing gain, the spot-microphone signal should be delayed much more than necessary for the compensation, so as to fall within the region of the early reflections. It has been proposed to achieve the desired increase in volume by adding sound energy from artificially generated reflections [11]. This so-called room-related balancing technique has been tested and optimized recently [13] with the aid of an appropriate audio processing unit [15]. It was found that a loss in depth can be avoided satisfactorily by generating just two artificial reflections from the spot-microphone signal (according to Fig. 8), and that the greater the required balancing gain, the more favorable the room-related balancing technique is.
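The idea of adding the spot-microphone signal only as delayed, directionally placed reflections can be sketched in a few lines of Python. This is a minimal illustration and not the processing described in [13] or [15]: the delays, gains, pan positions, the linear pan law, and all names (`Reflection`, `add_artificial_reflections`) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Reflection:
    delay_ms: float  # delay relative to the main-microphone signal
    gain: float      # linear gain applied to the spot signal
    pan: float       # 0.0 = left loudspeaker, 1.0 = right loudspeaker


def add_artificial_reflections(
    main_left: Sequence[float],
    main_right: Sequence[float],
    spot: Sequence[float],
    reflections: Sequence[Reflection],
    sample_rate: int,
) -> Tuple[List[float], List[float]]:
    """Mix a spot signal into a stereo main signal as delayed reflections.

    The undelayed spot signal is never added, so the main-microphone
    signal keeps supplying the first wavefront.
    """
    if len(main_left) != len(main_right):
        raise ValueError("main channels must have equal length")
    out_left = list(main_left)
    out_right = list(main_right)
    n = len(out_left)
    for refl in reflections:
        offset = int(round(refl.delay_ms * sample_rate / 1000.0))
        for i, sample in enumerate(spot):
            j = i + offset
            if j >= n:
                break  # reflection tail falls outside the buffer
            # Simple linear pan law (an assumption, chosen for brevity).
            out_left[j] += sample * refl.gain * (1.0 - refl.pan)
            out_right[j] += sample * refl.gain * refl.pan
    return out_left, out_right


if __name__ == "__main__":
    # Hypothetical settings: two reflections inside the early-reflection range.
    settings = [Reflection(15.0, 0.5, 0.25), Reflection(25.0, 0.4, 0.75)]
    silence = [0.0] * 50
    left, right = add_artificial_reflections(silence, silence, [1.0], settings, 1000)
    print([i for i, v in enumerate(left) if v != 0.0])
```

The two delays are placed inside the 10-80-ms range given for early reflections in Sec. 1; suitable values in practice would depend on the recording situation.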

In principle, the room-related balancing algorithm could be implemented in digital mixing desks so that it could be used as an alternative to conventional panpot balancing. First, however, further optimization is useful to minimize the signal processing effort and to introduce improvements, such as distance equalization (taking into account changes in spectrum, such as by absorption at the room boundaries), additional artificial reverberation (generated from the spot-microphone signal in accordance with the artificial reflections), and so on.
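One simple way to picture distance equalization is to attenuate the high frequencies of each artificial reflection, as absorption at a room boundary would. The Python sketch below uses a first-order low-pass filter for this purpose. The filter type, the cutoff frequency, and the function name are assumptions made for illustration; the paper does not specify how such an equalization would be realized.

```python
import math
from typing import List, Sequence


def one_pole_lowpass(
    signal: Sequence[float], cutoff_hz: float, sample_rate: float
) -> List[float]:
    """First-order low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    if cutoff_hz <= 0.0 or sample_rate <= 0.0:
        raise ValueError("cutoff_hz and sample_rate must be positive")
    # Smoothing coefficient derived from the desired cutoff frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state = 0.0
    output: List[float] = []
    for sample in signal:
        state += alpha * (sample - state)
        output.append(state)
    return output


if __name__ == "__main__":
    # Hypothetical use: dull a reflection signal before mixing it in.
    reflection = [1.0] + [0.0] * 7  # unit impulse
    print(one_pole_lowpass(reflection, cutoff_hz=4000.0, sample_rate=48000.0))
```

A filter of this kind leaves the low-frequency level of the reflection unchanged and reduces only the upper spectrum; a lower cutoff would mimic a more absorbent boundary.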

³ Balancing gain is the level of the balancing signal with reference to the balancing signal's threshold level, which is measured at the threshold of perception of the balancing signal in the main-microphone signal.

Fig. 6. Sphere microphone KFM 6U (Schoeps), suspended.

Fig. 7. Panpot balancing: main-microphone signal plus spot-microphone signal without time delay. (Level in dB versus delay of reflections, 0-50 ms; the direct sound from the spot microphone precedes the direct sound, first reflections, and reverberation from the main microphone; directional imaging by phantom sound sources.)

Fig. 8. Room-related balancing: main-microphone signal plus artificial reflections and reverberation. (Level in dB versus delay of the reflections, 0-50 ms; direct sound, first reflections, and reverberation, with artificial reflection 1, artificial reflection 2, and artificial reverberation added.)

4 SUMMARY

It can be concluded that current two-channel stereophony recording techniques can be improved with regard to the naturalness of the stereophonic sound image as defined. A consistent consideration of new knowledge and understanding with regard to psychoacoustic principles, particularly with regard to spatial hearing, leads to the principal result that the two-channel stereo presentation of direction, distance, and space is only possible as a presentation of spatial perspective in the simulation plane between the loudspeakers. This has the effect of creating a natural two-dimensional image of a three-dimensional space.

We consider that an optimization of the techniques for simulating spatial perspective is achievable by using natural interaural signal differences instead of pure intensity or time differences. The room-related recording technique is consequently based on this knowledge. This technique can be applied to the well-tried main/spot-microphone methods, which means that the optimum main microphone is the sphere microphone or a corresponding microphone generating natural interaural signal differences, and that any spot-microphone signal added to the main-microphone signal represents additional artificial reflections from the recording room. Since the generation of natural interaural signal differences and the simulation of artificial reflections and reverberation from a spot-microphone signal is possible with the aid of modern computer technology, a room-related recording technique can also be applied in principle to polymicrophony. This would result in a stereophonic presentation of any artificially created space in the simulation plane.

5 REFERENCES

[1] W. Kuhl, "Räumlichkeit als Komponente des Höreindrucks" (Spaciousness as Component of the Auditory Impression), Acustica, vol. 40, pp. 167-181 (1978).

[2] J. Blauert, Spatial Hearing--The Psychophysics of Human Sound Localization (M.I.T. Press, Cambridge, MA, 1983).

[3] G. Theile, "Zur Theorie der optimalen Wiedergabe von stereofonen Signalen über Lautsprecher und Kopfhörer" (On the Theory of Optimum Reproduction of Stereophonic Signals through Loudspeakers and Earphones), Rundfunktechn. Mitt., vol. 25, pp. 155-170 (1981).

[4] G. Theile, "On the Stereophonic Imaging of Natural Spatial Perspective via Loudspeakers: Theory," in Perception of Reproduced Sound (1987), pp. 135-146.

[5] G. Theile, "On the Standardization of the Frequency Response of High Quality Studio Headphones," J. Audio Eng. Soc., vol. 34, pp. 956-969 (1986 Dec.).

[6] G. Steinke, "Stand und Entwicklungstendenzen der Stereofonie" (Current Situation and Development Trends in Stereophony), pts. 1 and 2, Tech. Mitt. RFZ, vol. 28, pp. 1-10, 25-32 (1984).

[7] S. P. Lipshitz, "Stereo Microphone Techniques . . . Are the Purists Wrong?" J. Audio Eng. Soc. (Features), vol. 34, pp. 716-744 (1986 Sept.).

[8] J. C. Bennett, K. Barker, and F. O. Edeko, "A New Approach to the Assessment of Stereophonic Sound System Performance," J. Audio Eng. Soc., vol. 33, pp. 314-321 (1985 May).

[9] M. Wöhr and B. Nellesen, "Untersuchungen zur Wahl des Hauptmikrofonverfahrens" (Studies on the Selection of the Main Microphone Method), in Proc. 14th Audio Engineers' Conf., pp. 106-120 (1986).

[10] G. Theile, "Das Kugelflächenmikrofon" (The Sphere Microphone), in Proc. 14th Audio Engineers' Conf., pp. 277-293 (1986).

[11] G. Theile, "Hauptmikrofone und Stützmikrofone--neue Gesichtspunkte für ein bewährtes Aufnahmeverfahren" (Main Microphone and Spot Microphones--New Aspects for a Proven Recording Technique), in Proc. 13th Audio Engineers' Conf., pp. 170-184 (1984).

[12] M. Wöhr, J. Goeres, C. Pösselt, and G. Theile, "Raumbezogene Stütztechnik--Möglichkeit zur Optimierung der Aufnahmequalität" (Room-Related Supporting Technique--Possibility for Optimizing the Recording Quality), in Proc. 15th Audio Engineers' Conf., pp. 302-315 (1988).

[13] M. Wöhr, G. Theile, H. J. Goeres, and A. Persterer, "Room-Related Balancing Technique--A Method for Optimizing Recording Quality," J. Audio Eng. Soc., vol. 39, pp. 623-631 (1991 Sept.).

[14] J. Jecklin, Musikaufnahmen: Grundlagen, Technik, Praxis (Music Recordings: Foundations, Technique, Practice) (Franzis-Verlag, Munich, 1980), p. 55.

[15] F. Richter and A. Persterer, "Design and Applications of a Creative Audio Processor," presented at the 86th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 37, p. 398 (1989 May), preprint 2782.

The biography for Günther Theile was published in the 1991 September issue of this Journal.