Perceived Key Movement in Four-Voice Harmony and Single Voices




Music Perception, Summer 1992, Vol. 9, No. 4, 427-438

© 1992 BY THE REGENTS OF THE UNIVERSITY OF CALIFORNIA

Perceived Key Movement in Four-Voice Harmony and Single Voices

WILLIAM F. THOMPSON
Atkinson College, York University

LOLA L. CUDDY
Queen's University at Kingston

Listeners with a moderate amount of musical training rated the distance between the first and final key of short chorale excerpts under one of four presentation conditions. The distance between keys, or modulation distance, was either zero, one, or two steps in either the clockwise or counterclockwise direction on the cycle of fifths. Presentation conditions were four-voice harmonic sequences excerpted from the complete set of Bach chorales, single voices of the latter sequences, four-voice harmonic sequences simplified to block chords, and single voices of the latter sequences. Consistent with earlier findings (Thompson & Cuddy, 1989), judgments for both four-voice harmonic presentations and single-voice presentations revealed a close correspondence between modulation distance and judged distance. Ratings for harmonic sequences within a given key distance, however, showed influences of direction of modulation and of harmonic progression that were not reflected in ratings for single voices. The findings suggest that harmony and melody follow somewhat different principles in the process of identifying key change.

In harmonized music, influences on the perception of key structure and key change are notably complex. Both melodic and harmonic structure contribute to a listener's sense of key. It is difficult to isolate their separate influences, however, because the tonal implications of melody and harmony are highly correlated.

In a previous report, we addressed this problem by investigating a strict hierarchical description of key (Thompson & Cuddy, 1989). In this description, a melodic line implicates key structure by first implicating an underlying harmonic progression, which, in turn, implicates key structure and key movement. Because the process of deriving an implied harmonic progression from a sequence of tones may be subject to error or ambiguity, the strict hierarchical description predicts that judgments of key structure should in general be more difficult for single voices than for harmonic sequences.

Requests for reprints may be sent to W. F. Thompson at the Department of Psychology, Atkinson College, York University, North York, Ontario, Canada, M3J 1P3 or to L. L. Cuddy at the Department of Psychology, Queen's University, Kingston, Ontario, Canada, K7L 3N6.

Before Thompson and Cuddy (1989), investigators had not explicitly tested a hierarchical conception of tones, chords, and keys by asking listeners directly to judge key movement in both harmonic sequences and in single voices. However, demonstrations of top-down influences involving a variety of other judgments have generally supported a hierarchical description of key structure. For example, it has been shown that key context influences judgments of chords or chord sequences (Krumhansl, Bharucha, & Castellano, 1982; Krumhansl, Bharucha, & Kessler, 1982; Krumhansl & Kessler, 1982) and melodies (e.g., Cuddy, Cohen, & Mewhort, 1981; Cuddy, Cohen, & Miller, 1979; Dowling, 1978; Krumhansl, 1979). Moreover, judgments of melodies are influenced by the implied harmonic progression (Bharucha, 1984; Cuddy, Cohen, & Mewhort, 1981; Cuddy & Lyons, 1981). These results are consistent with the notion that, within a hierarchy of musical structure, keys are represented at the highest level, chords at an intermediate level, and single tones at the lowest level (Bharucha, 1987; Krumhansl, Bharucha, & Kessler, 1982, p. 34; Lerdahl, 1988).

Thompson and Cuddy (1989) investigated the perception of key structure and key movement by asking listeners to judge the distance between the first and final key of excerpts adapted from Bach chorales. Five types of sequences were included in the presentations: nonmodulating, modulating to the dominant, modulating to the subdominant, modulating to the supertonic, and modulating to the flattened seventh. Judgments of key distance in four-voice sequences were compared with judgments of key distance in the individual voices of those sequences. There were two main findings: first, both harmonic and single-voice presentations reliably conveyed a similar degree of key movement to listeners; second, for harmonic presentations, but not for single-voice presentations, greater distance was associated with modulations moving in the counterclockwise, rather than in the clockwise, direction. This second finding suggests that the key movement conveyed by harmonic sequences, but not by single voices, was asymmetric with respect to direction of modulation.

Together, the findings suggest that judgments of key distance in harmony and single voices are not adequately modeled by a strict hierarchical system. Within a strict hierarchical processing system, some information loss is expected between the levels of harmony and single voices. Therefore, judgments of single voices should be less reliable than judgments of harmonic presentations. An exception to this latter outcome could occur only if listeners were able to abstract with complete accuracy the underlying harmonic progression from presentations of single voices: in that case, however, one would expect effects such as the directional asymmetry, mentioned earlier, to be evident for both judgments of harmonic sequences and judgments of single voices. Because our results were inconsistent with either possible outcome, we concluded that melody and harmony do not implicate key structure within a strict hierarchical system. Rather, the processes of abstracting key structure from melody may operate somewhat independently from the processes of abstracting key structure from harmony.

The present investigation was designed to replicate and elaborate on the earlier findings. For both harmonic and single-voice presentations, we compared judgments of key distance for three versions of the chorale sequences. The first version was the original form as notated by Leuchter (1968). The other versions were simplifications of the sequences [Simplifications (1) and (2)]. Simplified versions were sequences of eight four-voice chords, that is, without passing notes or ornamentation.

Simplification (1) was provided by Professor Fred Lerdahl. It retains the overall key movement theoretically deemed present in the original chorale sequences but alters the harmonic progressions in order to equate the point of modulation across all sequences. Lerdahl (personal communication, 1986) commented that some of the modulations occurred either "too soon or too late," particularly those in the counterclockwise direction. This imbalance may have contributed to the directional asymmetry noted in Thompson and Cuddy (1989) for ratings of harmonic presentations. Simplification (2) was the version used in our earlier research: it closely preserved the harmonic progression of the original chorales.

Judgments of key distance in original sequences and Simplification (1) sequences were collected and compared with judgments previously collected for Simplification (2) sequences (Thompson & Cuddy, 1989). Since the three versions of each chorale sequence always maintained the same key structure, similarities among the results for the three versions should reveal the overall contribution of key structure to perceived key movement. Differences among the versions should reveal influences by local melodic and harmonic details.

Method

The method described refers to experimentation with original and Simplification (1) sequences. Procedures for Simplification (2) sequences were similar (Thompson & Cuddy, 1989).


LISTENERS

Four groups of listeners, each consisting of 20 undergraduate students from the University of Queensland, Australia, were tested. Listeners were selected from a first-year subject pool and were given course credit for participating. The participants had little or no formal training in traditional music theory, but all listened to classical music on a regular basis, or currently played a musical instrument. All subjects reported normal hearing.

APPARATUS AND STIMULI

Tones were triangular waveforms produced by a Yamaha DX-11 synthesizer, controlled by a Macintosh SE-20 computer. The synthesizer was set to the system of equal temperament with A4 equal to 440 Hz. Sequences were prepared with the music software package "Professional Composer" and presented under the control of the software package "Professional Performer." The tempo of each presentation was set to 120 quarter notes per minute.

Listeners were tested in soundproof booths. Tones and chords were delivered binaurally through Sennheiser headphones (HDH 424), and responses were entered on the keyboard of a computer terminal. Before the experimental session began, each listener was allowed to adjust the average SPL to a comfortable listening level within the range 65 to 75 dB SPL.

CHORALE SEQUENCES

Ten phrases were excerpted from the complete set of Bach chorales (Leuchter, 1968). The original sources are listed in Thompson and Cuddy (1989, appendix). The 10 excerpts provided two examples of each of five modulation conditions. The five conditions were as follows: nonmodulating [Condition NM], modulating to the key of the dominant [Condition M(V)], modulating to the key of the subdominant [Condition M(IV)], modulating to the key of the supertonic [Condition M(II)], and modulating to the key of the flattened seventh [Condition M(VIIb)].

All excerpts ended with a perfect cadence to the tonic chord of the final key. Two versions of each sequence were used in the investigation: the original excerpts and a simplified version of the excerpts [Simplification (1), prepared by Professor F. Lerdahl]. The simplified (1) version of each sequence consisted of eight chords with no ornamental or passing notes. For each sequence, the key structure, the root of the first chord, and the roots of the final two chords were always the same in the original and simplified versions. However, harmonic progressions were sometimes altered in the simplified (1) sequences in order to equate the point of modulation across all sequences.

Figure 1 shows the original versions of the 10 chorale sequences in musical notation. Figure 2 shows the simplified (1) versions of the 10 sequences. The types of key movement displayed in Figures 1 and 2 are as follows: nonmodulating, Sequences 1 and 2; modulating one step on the cycle of fifths to the dominant or to the subdominant, Sequences 3, 4, 5, and 6; and modulating two steps on the cycle of fifths to the supertonic or to the flattened seventh, Sequences 7, 8, 9, and 10.¹
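The modulation distances described above can be sketched in code. The following is a minimal illustration, not part of the original study: it assumes major keys, and it assumes the usual convention that clockwise movement on the cycle of fifths runs toward the dominant. The function name `fifths_distance` and the pitch-class table are hypothetical helpers introduced only for this example.

```python
# Pitch classes in semitones above C (a small, illustrative subset of spellings).
PITCH_CLASSES = {"C": 0, "C#": 1, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
                 "F#": 6, "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11}


def fifths_distance(first_key: str, final_key: str) -> int:
    """Signed steps from first_key to final_key on the cycle of fifths.

    Positive values are clockwise (toward the dominant), negative values
    are counterclockwise (toward the subdominant). This sign convention is
    an assumption of the sketch.
    """
    semitones = (PITCH_CLASSES[final_key] - PITCH_CLASSES[first_key]) % 12
    # Multiplying by 7 (mod 12) converts a semitone interval into a position
    # on the cycle of fifths, because 7 * 7 = 49 = 1 (mod 12).
    steps = (semitones * 7) % 12
    return steps - 12 if steps > 6 else steps


# The five modulation conditions, expressed from an initial key of C major.
conditions = {"NM": "C", "M(V)": "G", "M(IV)": "F", "M(II)": "D", "M(VIIb)": "Bb"}
for name, final_key in conditions.items():
    print(name, fifths_distance("C", final_key))
```

Under these assumptions the dominant and supertonic conditions fall one and two steps clockwise, and the subdominant and flattened-seventh conditions fall one and two steps counterclockwise.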

PROCEDURE

Subjects were randomly assigned to one of four groups. Presentations for the four groups were as follows: Group 1, harmonic sequences excerpted from Leuchter (1968); Group 2, single voices of the latter sequences; Group 3, harmonic presentations of Simplification (1) sequences; Group 4, single voices of the latter sequences. Each trial consisted of an initial melodic pattern of five quarter notes, followed by a pause equal to two quarter notes, and then a harmonic or single-voice presentation. The initial melodic pattern outlined the tonic triad of the initial key of the sequence and was included to give the listener a strong sense for the initial key. Presented in ascending order, the five notes of the pattern were tonic, tonic one octave above the first tone, mediant, dominant, and tonic two octaves above the first tone. For harmonic presentations, each harmonic sequence was presented once. For single-voice presentations, each voice from each harmonic sequence was presented once. The order of presentation was randomly and independently determined for each listener.

1. Sequences 3 and 4 of Thompson and Cuddy (1989) were dropped for reasons explained in that paper, and the sequences of this paper have been renumbered accordingly.

Fig. 1. The original versions of the 10 sequences, excerpted from Bach Chorales (Leuchter, 1968). [Musical notation not reproduced.]

Fig. 2. The simplified (1) versions of the 10 sequences. [Musical notation not reproduced.]

Listeners were informed that they should rate the distance between the first and final key of each presentation on a scale of one to seven. For listeners not familiar with a formal definition of key, an explanation was provided. This explanation included reference to the scale, do-re-mi-fa-sol-la-ti-do, and to the sense of stability associated with the first note of the scale. Listeners were informed that there were no right or wrong answers and that they should try to use the entire range of the response scale.
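The timing and tuning of the initial melodic pattern used on each trial can be sketched as follows. This is an illustration under stated assumptions rather than a reconstruction of the authors' software: the semitone offsets assume a major key and read "ascending order" as placing the mediant and dominant between the two upper tonics, the MIDI numbering (69 for A4) is a convenience of the sketch, and the starting pitch passed to `priming_pattern` is hypothetical. The tempo (120 quarter notes per minute), the tuning reference (A4 = 440 Hz, equal temperament), and the two-quarter-note pause come from the text.

```python
QUARTER_SEC = 60.0 / 120  # 120 quarter notes per minute gives 0.5 s per note

# Assumed semitone offsets above the first tonic: tonic, tonic an octave up,
# mediant, dominant, tonic two octaves up (major-key reading).
PATTERN_OFFSETS = [0, 12, 16, 19, 24]


def midi_to_hz(midi_note: int, a4_hz: float = 440.0) -> float:
    """Equal-tempered frequency, with MIDI note 69 (A4) tuned to a4_hz."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)


def priming_pattern(tonic_midi: int):
    """Return (onset_sec, duration_sec, frequency_hz) for the five notes."""
    return [(i * QUARTER_SEC, QUARTER_SEC, midi_to_hz(tonic_midi + offset))
            for i, offset in enumerate(PATTERN_OFFSETS)]


def sequence_onset() -> float:
    """Onset of the chorale sequence: five notes plus a two-quarter pause."""
    return (len(PATTERN_OFFSETS) + 2) * QUARTER_SEC


# Hypothetical starting pitch: MIDI 48 (C3) as the first tonic.
for onset, duration, freq in priming_pattern(48):
    print(f"onset {onset:.1f} s, duration {duration:.1f} s, {freq:.1f} Hz")
print("sequence begins at", sequence_onset(), "s")
```

With these values the pattern and pause together occupy seven quarter notes, so the harmonic or single-voice presentation would begin 3.5 s into the trial.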

Results and Discussion

Table 1 displays mean ratings of perceived key distance for three types of key structure and three versions of the sequences. The three types of key structure in the table are nonmodulating (Nonmod), modulating one step on the cycle of fifths (Mod 1 step), and modulating two steps on the cycle of fifths (Mod 2 steps). The upper part of the table shows mean ratings for four-voice harmonic presentations; the lower part shows mean ratings for single voices averaged across voices (soprano, alto, tenor, bass).

Table 1 reveals similar patterns of results for each of the three versions. Single voices, on the average, conveyed as much information about key change as four-voice harmonic presentations: As the theoretical distance on the cycle of fifths increased, ratings of perceived distance increased by about the same amount for four-voice harmony and for single voices. Analysis of variance yielded strong effects of modulation distance and no interactions between modulation distance and version of the sequences. For four-voice harmonic presentations, nonmodulating sequences were rated lower than modulating sequences [F(1, 57) = 57.19, p < .01], and key changes of one step on the cycle of fifths were associated with lower ratings than key changes of two steps [F(1, 57) = 97.18, p < .01]. Similarly, for single-voice presentations, nonmodulating sequences were rated lower than modulating sequences [F(1, 61) = 190.93, p < .01], and key changes of one step were associated with lower ratings than key changes of two steps [F(1, 61) = 154.20, p < .01]. Thus, ratings of key distance were as reliable for single voices as they were for full harmonic sequences.

TABLE 1
Mean Ratings of Key Distance for Four-Voice Harmony and Single Voices

                                         Version
                             Original   Simplified (1)   Simplified (2)
Four-voice harmony
  Nonmod                       2.58          2.45             2.54
  Mod 1 step                   3.03          3.58             3.32
  Mod 2 steps                  4.46          4.50             4.45
Mean of four single voices
  Nonmod                       2.65          2.49             2.64
  Mod 1 step                   3.38          3.60             3.23
  Mod 2 steps                  4.38          4.50             4.16
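As a rough check on the pattern in Table 1, the cell means can be averaged across versions and the rise from zero to two steps compared between presentation conditions. The sketch below is only descriptive: the unweighted averaging across versions is an assumption made for illustration and is not the analysis of variance reported by the authors. The values are copied from Table 1; the function names are hypothetical.

```python
# Mean ratings from Table 1, ordered as
# [nonmodulating, one step, two steps] on the cycle of fifths.
TABLE_1 = {
    "four_voice": {
        "Original": [2.58, 3.03, 4.46],
        "Simplified (1)": [2.45, 3.58, 4.50],
        "Simplified (2)": [2.54, 3.32, 4.45],
    },
    "single_voice": {
        "Original": [2.65, 3.38, 4.38],
        "Simplified (1)": [2.49, 3.60, 4.50],
        "Simplified (2)": [2.64, 3.23, 4.16],
    },
}


def mean_by_distance(condition):
    """Average each modulation distance across the three versions."""
    versions = TABLE_1[condition].values()
    return [sum(column) / len(column) for column in zip(*versions)]


def total_rise(condition):
    """Increase in mean rating from zero steps to two steps."""
    means = mean_by_distance(condition)
    return means[2] - means[0]


for condition in TABLE_1:
    means = ", ".join(f"{m:.2f}" for m in mean_by_distance(condition))
    print(f"{condition}: means = {means}; rise = {total_rise(condition):.2f}")
```

Averaged this way, the rise is about 1.95 scale points for four-voice harmony and about 1.75 for single voices, which is in line with the statement that ratings increased by about the same amount in both conditions.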

ASYMMETRY OF MODULATION DIRECTION

The asymmetry of modulation direction reported for Simplification (2) by Thompson and Cuddy (1989) was replicated for the original sequences but was much less evident in the simplified (1) sequences provided by Lerdahl. For four-voice harmonic presentations of the original sequences, key changes involving clockwise movement on the cycle of fifths were associated with lower ratings (mean, 3.1) than key changes involving counterclockwise movement on the cycle of fifths (mean, 4.4). For the simplified (1) harmonic sequences, however, mean ratings were 3.9 and 4.2 for clockwise and counterclockwise movement, respectively. Thus, original and simplified (1) harmonic sequences differed significantly with respect to the presence of an asymmetry of modulation direction [F(1, 38) = 4.43, p < .05]. This finding suggests that directional asymmetry is related to the specific characteristics of the harmonic progressions, that is, the means by which modulation is effected. It is not necessarily a property of the psychological representation of key relationships.
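The size of the directional asymmetry in each version can be made explicit with a small worked example. This sketch simply subtracts the clockwise mean from the counterclockwise mean reported above; defining the asymmetry as that difference is an assumption of the illustration, and the names `DIRECTION_MEANS` and `asymmetry` are hypothetical.

```python
# Mean ratings for four-voice harmonic presentations, by modulation direction,
# as reported in the text.
DIRECTION_MEANS = {
    "Original": {"clockwise": 3.1, "counterclockwise": 4.4},
    "Simplified (1)": {"clockwise": 3.9, "counterclockwise": 4.2},
}


def asymmetry(version: str) -> float:
    """Counterclockwise mean minus clockwise mean, rounded to one decimal."""
    means = DIRECTION_MEANS[version]
    return round(means["counterclockwise"] - means["clockwise"], 1)


for version in DIRECTION_MEANS:
    print(version, asymmetry(version))
```

On this measure the asymmetry is 1.3 scale points for the original sequences and 0.3 for the simplified (1) sequences.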

There was no asymmetry of modulation direction for judgments of single voices, for either version tested in this investigation. This finding is consistent with the results reported in Thompson and Cuddy (1989): In both investigations, asymmetry of modulation direction, when it emerged for four-voice harmonic sequences, was not evident in judgments of single voices. This difference between judgments of harmonic presentations and judgments of single-voice presentations occurred in spite of the fact that ratings of key distance were equally reliable for single-voice and harmonic presentations. Evidently, processes underlying the asymmetry effect reported for judgments of harmonic presentations were not operating for judgments of single voices. More generally, harmony and single voices may be partially independent in the implication of key structure.

JUDGMENTS OF SINGLE VOICES

Table 2 displays mean ratings assigned to soprano, alto, tenor, and bass voices, for each of the three versions, averaged across modulating sequences only. For single-voice presentations, the three versions differed with respect to the amount of key change conveyed by the four voices contained in modulating sequences. For the original versions, ratings of key distance were lower for soprano voices than for other voices. For the two simplified versions, ratings of key distance for the soprano voice of most sequences were similar to ratings of key distance for other voices. In other words, the degree of key change conveyed by single voices was less balanced across voices in the original versions than in the simplified versions. These differences were supported by an overall interaction between Voice × Sequence-type × Version [F(24, 732) = 5.42, p < .01].

TABLE 2
Mean Ratings of Key Distance in Soprano, Alto, Tenor, and Bass Voices, Averaged across Modulating Sequences

                                Voice
                   Soprano   Alto   Tenor   Bass
Original             2.78    3.88   4.50    4.38
Simplified (1)       3.57    3.44   4.59    4.59
Simplified (2)       3.87    3.34   3.70    3.87
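One simple way to see the imbalance across voices in Table 2 is to compare, for each version, the spread between the highest-rated and lowest-rated voice. The sketch below uses the range as the measure of balance; that choice is an assumption made for illustration and is not the statistic the authors tested. The values are copied from Table 2, and `voice_range` is a hypothetical helper.

```python
# Mean ratings from Table 2 (modulating sequences only).
TABLE_2 = {
    "Original": {"Soprano": 2.78, "Alto": 3.88, "Tenor": 4.50, "Bass": 4.38},
    "Simplified (1)": {"Soprano": 3.57, "Alto": 3.44, "Tenor": 4.59, "Bass": 4.59},
    "Simplified (2)": {"Soprano": 3.87, "Alto": 3.34, "Tenor": 3.70, "Bass": 3.87},
}


def voice_range(version: str) -> float:
    """Highest minus lowest mean rating across the four voices."""
    ratings = TABLE_2[version].values()
    return round(max(ratings) - min(ratings), 2)


for version in TABLE_2:
    print(version, voice_range(version))
```

The range is largest for the original versions (1.72) and smaller for Simplification (1) (1.15) and Simplification (2) (0.53), consistent with the description of the original versions as less balanced across voices.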

Not surprisingly, voices that introduced the new note or notes involved in a key change were most informative that a key change had occurred. For example, ratings of the soprano voice of Sequence 9 were much lower for the original version (mean, 2.45) than for the simplified (1) version (mean, 5.15) [t(19) = 5.88, p < .001]. This difference appears to be related to the presence of accidentals in the voice: the original soprano voice involves no accidentals, but the simplified (1) soprano voice involves two accidentals (i.e., notes from the new key) in the last bar. For the alto of the same sequence, mean ratings for the original and simplified (1) version were 5.55 and 3.25, respectively [t(19) = 4.67, p < .001]. Again, the difference appears to be related to the presence of an accidental. In the original alto voice, an accidental is introduced by the new key at the penultimate note. In the simplified (1) alto voice, there are no accidentals.

COMPARISON OF HARMONY AND SINGLE VOICES

As noted in Table 1, mean ratings of key distance for four-voice harmony and for single voices increased by similar amounts as theoretical distance on the cycle of fifths increased. However, examination of ratings for the individual sequences within a given modulation distance revealed that judgments of harmonic presentations and judgments of single-voice presentations did not always correspond. The musical changes made to the original sequences to create simplified (1) sequences did not always have the same effect on judgments of harmonic presentations as they did on judgments of single-voice presentations. As noted earlier, for example, ratings of original harmonic sequences contained a directional asymmetry that was significantly reduced in ratings of simplified (1) harmonic sequences. However, no corresponding influence of the musical changes was found for the comparison of ratings of original single voices and simplified (1) single voices.

For each sequence, we examined the effect of the changes in musical detail introduced by Simplification (1) on both judgments of harmonic presentations and judgments of single voices. Differences in ratings between original and simplified (1) versions of each harmonic sequence were calculated and compared with differences in ratings between original and simplified (1) versions of the single voices contained in each harmonic sequence. The changes introduced by Simplification (1) sometimes increased and sometimes decreased ratings both for single-voice presentations and for harmonic presentations. However, differences for judgments of harmonic presentations were uncorrelated with differences for judgments of single voices [r(8) = -.324, n.s.].²
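To see why a correlation of this size over 10 sequences is reported as nonsignificant, the coefficient can be converted to a t statistic with the standard formula t = r * sqrt(df / (1 - r^2)). The sketch below applies that textbook conversion to the reported values; it is not the authors' own computation, and the two-tailed .05 critical value for 8 degrees of freedom (2.306) is taken from standard t tables.

```python
import math


def t_from_r(r: float, df: int) -> float:
    """Convert a Pearson correlation to a t statistic with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


R_REPORTED = -0.324   # correlation of difference scores reported in the text
DF_REPORTED = 8       # 10 sequences minus 2
T_CRITICAL_05 = 2.306 # two-tailed .05 critical value of t for df = 8

t_value = t_from_r(R_REPORTED, DF_REPORTED)
print(f"t({DF_REPORTED}) = {t_value:.3f}")
print("significant at .05:", abs(t_value) > T_CRITICAL_05)
```

The resulting t is close to -0.97, well short of the critical value, which matches the "n.s." reported for the correlation.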

In a few cases, the direction of the difference between original and simplified (1) versions for judgments of harmonic sequences was opposite to the direction of the difference between these two versions for judgments of single voices. This interaction between Version and Condition of Presentation was most evident for Sequence 10. Figure 3 shows mean ratings for original and simplified (1) versions of Sequence 10, for harmonic and single-voice presentations.

Sequence 10 involves a key change from C major to B♭ major. In comparison to the original version of Sequence 10, the simplified (1) version involves earlier introduction of chords from the final key. For judgments of harmonic presentations, the mean ratings displayed in Figure 3 suggest that the changes introduced by Simplification (1) reduced the perceived degree of key movement. For single voices, however, the opposite effect was found. Changes introduced by Simplification (1) appeared to enhance or emphasize the sense of key change.

The reliability of this interaction was established by comparing ratings of single voices of Sequence 10 (averaged across voices) with ratings of the full harmonic presentations of Sequence 10. Analysis of variance revealed a significant interaction of Version [original vs. simplified (1)] × Condition of Presentation (single voice vs. full harmony) [F(1, 76) = 8.15, p < .01]. The effect is consistent with the notion that there are different processes by which key structure is abstracted from single voices and full harmony: musical details that may reduce key movement in a harmonic context may actually enhance key movement in single voices.

2. Analysis of variance conducted for judgments of original and simplified (1) harmonic presentations indicated a significant contrast within the triple interaction of Modulation Condition × Example × Version [F(1, 38) = 4.35, p < .05]. In addition, judgments of original and simplified (1) harmonic sequences were significantly different with respect to the presence of an asymmetry of modulation direction. Thus, the lack of correlation between differences for harmonic presentations and single-voice presentations was not merely the result of correlating two random distributions.

Fig. 3. Mean ratings of key distance for original and simplified (1) versions of Sequence 10, for harmonic and single-voice presentations. [Figure not reproduced. Legend: Original, Simplification (1). Horizontal axis: Presentation Condition (Single Voices, Four-Voice Harmony).]

Conclusions

Judgments of key movement in original and simplified (1) sequences confirm earlier findings concerning the sensitivity of listeners to key change in short chorale excerpts. Results obtained with simplified (2) sequences (Thompson & Cuddy, 1989) may be generalized to their original sources. Moreover, further evidence is provided that melody and harmony may follow different principles in the process of identifying key change. Asymmetry of modulation direction, when it occurs, emerges for four-voice harmonic presentations only, and not for single voices. Moreover, effects of local musical changes on judgments of harmonic sequences were not similar, and in some cases were in an opposite direction, to the effects of those changes on judgments of single voices.

Comparison of judgments for four-voice harmony and single voices suggests that melody and harmony are partially independent in the implication of key. [In a somewhat different context, Schmuckler (1989) found that melody and harmony contributed independently and additively to expectancy generation.] This partial independence is not well modeled by a strict hierarchical system, in which a melodic line implicates a key by first implicating an underlying harmonic progression. The results are consistent with a partially hierarchical system, however, in which musical relations at each level can be evaluated with or without reference to other levels in the hierarchy. Although listeners can evaluate key relationships in a melodic line with reference to an implied harmonic progression, other aspects of melodic structure are available, and they may provide listeners with an important source of information about key and key movement.³,⁴

References

Bharucha, J. J. Anchoring effects in music: The resolution of dissonance. Cognitive Psychology, 1984, 16, 485-518.

Bharucha, J. J. Music cognition and perceptual facilitation: A connectionist framework. Music Perception, 1987, 5, 1-30.

Cuddy, L. L., Cohen, A. J., & Mewhort, D. J. K. Perception of structure in short melodic sequences. Journal of Experimental Psychology: Human Perception and Performance, 1981, 7, 869-883.

Cuddy, L. L., Cohen, A. J., & Miller, J. Melody recognition: The experimental application of musical rules. Canadian Journal of Psychology, 1979, 33, 148-157.

Cuddy, L. L., & Lyons, H. I. Musical pattern recognition: A comparison of listening to and studying tonal structures and tonal ambiguities. Psychomusicology, 1981, 1, 15-33.

Dowling, W. J. Scale and contour: Two components of a theory of memory for melodies. Psychological Review, 1978, 85, 341-354.

Krumhansl, C. L. The psychological representation of musical pitch in a tonal context. Cognitive Psychology, 1979, 11, 346-374.

Krumhansl, C. L., Bharucha, J. J., & Castellano, M. A. Key distance effects on perceived harmonic structure in music. Perception and Psychophysics, 1982, 32, 96-108.

Krumhansl, C. L., Bharucha, J. J., & Kessler, E. J. Perceived harmonic structure of chords in three related musical keys. Journal of Experimental Psychology: Human Perception and Performance, 1982, 8, 24-36.

Krumhansl, C. L., & Kessler, E. J. Tracing the dynamic changes in perceived tonal organization in a spatial representation of musical keys. Psychological Review, 1982, 89, 344-368.

Lerdahl, F. Tonal pitch space. Music Perception, 1988, 5, 315-349.

Leuchter, E. (Ed.). J. S. Bach (386 Chorales). Buenos Aires: Ricordi Americana, 1968.

Schmuckler, M. A. Expectation in music: Investigation of melodic and harmonic processes. Music Perception, 1989, 7, 109-150.

Thompson, W. F., & Cuddy, L. L. Sensitivity to key change in chorale sequences: A comparison of single voices and four-voice harmony. Music Perception, 1989, 7, 151-168.

3. This research was supported by a Special Projects Grant from the University of Queensland to the first author, and by research awards from the Natural Sciences and Engineering Research Council of Canada and the Advisory Research Committee of Queen's University to the second author.

4. We thank Fred Lerdahl, Beverly Cavanagh, and Keirh Hamel for music rheoreticalanalysis, discussion, and advice. Helpful commenrs and suggesrions were also providedby Caro! Krumhansl.


Music Perception, Summer 1992, Vol. 9, No. 4, 439-454

© 1992 BY THE REGENTS OF THE UNIVERSITY OF CALIFORNIA

A (De)Composable Theory of Rhythm Perception

PETER DESAIN
Nijmegen University, The Netherlands

A definition is given of expectancy of events projected into the future by a complex temporal sequence. The definition can be decomposed into basic expectancy components projected by each time interval implicit in the sequence. A preliminary formulation of these basic curves is proposed and the (de)composition method is stated in a formalized, mathematical way. The resulting expectancy of complex temporal patterns can be used to model such diverse topics as categorical rhythm perception, clock and meter inducement, rhythmicity, and the similarity of temporal sequences. Besides expectancy projected into the future, the proposed measure can be projected back into the past as well, generating reinforcement of past events by new data. The consistency of the predictions of the theory with some findings in categorical rhythm perception is shown.

Introduction

Many incompatible theories about temporal perception and memory exist, which explain a number of phenomena well, but fail to predict others. A common theoretical basis for such work would be desirable. Connectionism might be an attractive paradigm in the search for such a basis, but most of its models lack compositionality. This means that the model as a monolithic whole might perform well, but it is impossible to decompose its complex behavior into meaningful smaller parts. Chandrasekaran (1990) argues that the composability is a condition for successful cognitive modeling, even in the connectionist paradigm. In Desain (1990), the behavior of a subsymbolic (connectionist) model of temporal quantization was described such that it could be compared with an incompatible symbolic model from the traditional artificial intelligence paradigm. The paper concluded with an abstraction of the behavior of the quantizer in the form of an "expectancy of events" with a temporal pattern as prior context. Expectancy turned out to be (de)composable, which makes it possible to base a theory of perception of complex stimuli on a simple model for the perception of their constituting components. Be-

Requests for reprints may be sent to Peter Desain, NICI, Nijmegen University, P.O. Box 9104, 6500 NE Nijmegen, The Netherlands.


Music Perception, Spring 1991, Vol. 8, No. 3, 217-240

© 1991 BY THE REGENTS OF THE UNIVERSITY OF CALIFORNIA

Music Perception and Sensory Information Acquisition: Relationships and Low-Level Analogies



ERNST TERHARDT
Technische Universität München

Information processing is characterized by conditional decisions on hierarchically organized levels. In biological systems, this principle is manifest in the phenomena of contourization and categorization, which are more or less synonymous. Primary contourization, such as in the visual system, is regarded as the first step of abstraction. Its auditory equivalent is formation of spectral pitches. Hierarchical processing is characterized by the principles of immediate processing, open end, recursion, distributed knowledge, forward processing, autonomy, and viewback. In that concept, perceptual phenomena such as illusion, ambiguity, and similarity turn out to be essential and typical. With respect to perception of musical sound, those principles and phenomena readily explain pitch categorization, tone affinity, octave equivalence (chroma), root, and tonality. As a particular example, an explanation of the tritone paradox is suggested.

Introduction

I regard it a mistake if the theory of consonance is considered as the essential basis for the theory of music, and I had felt that I had expressed this in the book with sufficient clarity. The essential basis of music is melody.

H. von Helmholtz, foreword to 3rd ed., The Sensations of Tone (1870)

Among music experts there is wide agreement on the notion that a melody is more than just a sequence of tones. What makes a tone sequence a melody is harmonic and rhythmic organization. So, if a melody really is a melody, it includes both harmony and rhythm. Instead of "includes," one may as well say "suggests." The harmonic and rhythmic implications of a melody ordinarily form a background that is created by, and dependent on, the melody itself.

Requests for reprints may be sent to E. Terhardt, Electroacoustics and Audiocommunication, Technische Universität München, P.O. Box 20 24 20, D-8000 München 2, Germany.


Here emerges an analogy between melody and figure (Gestalt). When in a visual pattern a set of elements is seen as a figure, the remainder acts as background. It is impossible to decide what is the figure without deciding what is the background. It is in this sense that a figure may create, or "suggest," its background. Remarkably, it is the method of suggesting harmony by melody that provides for making harmonic implications and progressions particularly rich and flexible, as was so strikingly demonstrated by J. S. Bach. The analogy drawn by Hofstadter (1979) between Bach's polyphonic compositions and M. C. Escher's famous figure-background interactions intuitively fits well.

From these considerations one may draw two conclusions. The first is just formal: As melody actually includes both harmony and rhythm, it is misleading to say that the basic components of tonal music are melody, harmony, and rhythm. One should better say it is temporal pitch contour (also termed melodic contour, cf. Dowling, 1978), harmony, and rhythm.

The second conclusion is about analogies. Analogies are helpful for understanding. In fact, analogies are both a tool for, and the essence of, understanding. Drawing analogies means making special observations more general. While music is a medium that is both abstract and nonsemantic, analogies between musical structures and perceptual principles on the one hand, and visual Gestalt principles and the structure of language on the other hand, have often been discussed (e.g., Bregman & Campbell, 1971; Carterette, Kohl, & Pitt, 1986; Deutsch, 1969, 1982b; Hartmann, 1988; Helmholtz, 1954; Jackendoff & Lerdahl, 1982; Koehler, 1933; McAdams, 1985; Minsky, 1982; Stumpf, 1965; Wellek, 1963).

Most (if not all) of those earlier considerations refer to what these days are termed cognitive processes, as opposed to psychophysical ones. Indeed, on the cognitive levels (that is, higher levels of abstraction) analogies between perceptual processes in different sensory modes often suggest themselves readily. This may be attributed to the notion that with ascending level of abstraction the particularities of sensory modes lose significance.

However, it is a serious drawback of singular high-level analogies that they can only to a limited extent be experimentally verified and adequately modeled. They are akin to metaphors rather than systematic, verifiable analogies. The essential difference between metaphors and "real" analogies is that the former include a significantly higher degree of arbitrariness (having more alternatives) than the latter. For instance, the metaphoric equivalence of a musical piece's tonic (in Hindemith's terms, the "tonal center," cf. Hindemith, 1940) to the vanishing point of a perspective picture is intuitively appealing. Yet, from a scientific point of view it appears arbitrary to an unsatisfactorily large extent. The same applies to the aforementioned analogy between melody and figure-background distinction.

On low levels of the sensory hierarchy, where intermodal particularities are most pronounced, analogies do not so easily become apparent but require a high amount of abstraction from a great variety of empirical data and observations. If successful, that effort is rewarded by providing both an equivalent amount of information and a basis for better understanding high-level processes.

In an attempt to take advantage of those ideas, the present study puts auditory perception of musical tones and chords into the general conceptual frame of sensory information processing. This requires critical discussion of some pertinent concepts such as hierarchy, discretization, information, contour, illusion, ambiguity, and similarity. As a result, it will become apparent that a number of basic principles of tonal music such as pitch categorization, tone affinity, chroma, root, and tonality readily emerge from natural principles of sensory information acquisition. Moreover, an explanation of the so-called tritone paradox will be suggested.

Implications of Hierarchical Processing

Although it is apparent that sensory perception must be hierarchically organized, the main principles and implications of that notion as yet are vague. According to a common concept, a basic distinction is made between "psychophysical" and "cognitive" processes. The former are associated with basic sensory attributes such as pitch, loudness, and timbre, whereas the latter are seen as related to extraction of information. In the literature, one can observe a tendency to regard psychophysical processes as a kind of trivial low-level interface to the only significant part, namely, cognitive processes. Although some aspects of this attitude are plausible, it is dangerous, as it supports ignorance of essential interdependencies of physical stimulus characteristics and low-level sensory processes. Ignoring those interdependencies would imply no less than throwing away an invaluable key to understanding sensory information acquisition, including perception of music. This is elucidated by the following considerations.

By inspecting sensory systems and processes from the aspects of biological evolution, it becomes apparent that any sensory system has been "designed" to enable an organism to respond most efficiently to external events. An elucidating treatise on this view was given, for example, by Konrad Lorenz (1959). Considering many examples, he arrived at the conclusion that "intelligence" and "knowledge" are distributed on all levels of the hierarchy, such that a sensory system most efficiently and "automatically" extracts from a stimulus "what it means" rather than what its objective details are. He pointed out that in a highly developed sensory system, a huge amount of knowledge must be included even on low levels; that is, knowledge about structures, relationships, and constraints of physical stimulus parameters that carry information on external objects and events. Moreover, he pointed out that most of that knowledge has emerged by biological evolution, that is, through interaction with the physical conditions of the external world. And he clearly expressed the notion that this process essentially is equivalent to learning by trial and error by an individual organism. With these notions it becomes apparent that, for example, Gestalt principles such as proximity, closure, and common fate are not principles per se but have evolved through interaction with the conditions of the external world, that is, to respond optimally to any external challenge. This implies that much can be learned about perception by studying those physical conditions and constraints and their psychophysical effects. And it implies further that on low levels of the hierarchy a considerable amount of active, "intelligent" processing must go on "automatically," that is, unconsciously. Another implication is that the question of whether the knowledge implemented in a sensory system is innate or learned in many respects is of minor relevance, because in either case it has been acquired by trial-and-error interaction with the external world.

With the following list of principles, an attempt is made to express both typical characteristics and constraints of sensory hierarchical processing:

• Immediate processing: Sensory information processing begins right at the lowest possible level, that of the peripheral sense organ.

• Open end: The hierarchy does not have a definite number of levels but is open ended.

• Recursion: The basic principles of information processing are the same on all levels.

• Distributed knowledge: The "knowledge" necessary for optimal processing is distributed on all levels so that on each level the particular type of knowledge that is required for the job is available.

• Forward processing: Information processing is predominantly forward, that is, from peripheral to central levels.

• Autonomy: On a given level, the input that comes from the preceding level is processed according to the knowledge available on that particular level. On a short-term scale (i.e., leaving out long-term learning processes), processing is not affected from any higher level.

• Viewback: While (on a short-term time scale) decisions made on a particular level cannot be changed from a higher level, their results can be directly inspected from any higher level, not only from the next one.

The validity of these principles will be discussed in the subsequent sections.

Discretization, Conditioned Decision, and Information

The most important aspect of cognitive processes is that they depend on decisions and are concerned with discrete objects. It is this aspect that determines the main difference between cognitive processes and continuous sensory attributes. While for instance in speech, timbre is a continuous function of time, the continuous flow of the speech signal is "automatically" dissected into discrete units, for example, phonemes. Although auditory pitch varies through a continuous low-high dimension, in music there exist discrete pitch categories ("pitch classes") that are organized in tone scales.

The distinction between continuous sensory attributes and discrete cognitive objects matches that between signal and information, a common concept in communication theory. Physical magnitudes, such as sound pressure as a function of time, are virtually continuous; they are regarded as carriers of information and are termed signals. Generalizing this concept into the science of perception, such attributes as brightness and color distribution in vision and loudness, pitch, and timbre in hearing can be regarded as psychophysical signals.

Shannon's information theory does not claim to say what information is. It rather is confined to the quantitative aspects of information and evaluates them in terms of probabilities. In the qualitative sense, information may be characterized as "something that is dependent on decisions," namely, conditional decisions. It is by conditional decisions that categories are assigned to signal patterns. The categories in turn are physically and psychophysically represented by new signals that are subject to more decisions, and so on. It is thus typical of information processing that from one step to the next, the shape of information-carrying signals changes radically. Ordinarily, many details of the signal input to a particular decision-making layer no longer exist in the output. And many different input signals may be assigned to one and the same category, that is, output.
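The many-to-one character of such a conditional decision can be illustrated with a minimal sketch of pitch categorization. The sketch is not taken from the article: it assumes twelve-tone equal temperament, a reference of A4 = 440 Hz, and conventional note names, and the function name `pitch_category` is hypothetical.

```python
import math

# Assumption: twelve equal-tempered pitch classes with conventional names.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_category(freq_hz, ref_a4_hz=440.0):
    """Assign a continuous frequency (signal) to a discrete pitch class (category)."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Continuous value: distance from the reference in equal-tempered semitones.
    semitones = 12.0 * math.log2(freq_hz / ref_a4_hz)
    # The conditional decision: choose the nearest category.
    nearest = round(semitones)
    # The reference A has pitch-class index 9; octave information is discarded.
    return NOTE_NAMES[(nearest + 9) % 12]


# Different inputs collapse onto one category; the details are lost in the output.
for f in (435.0, 440.0, 446.0, 880.0):
    print(f, pitch_category(f))  # each line ends with "A"
```

Slight mistuning (435 Hz, 446 Hz) and octave position (880 Hz) are exactly the kind of input detail that no longer exists after the decision has been made.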

With that qualitative definition of information, and on the basis of evolution theory, one can "predict" that, for biological sensory systems, discretization and categorization must be a predominant and quite natural behavior. In fact it is a prominent (perhaps even the most typical) aspect of any living organism, from amoeba to human, that it is a "decision machine" that is busy from the first to the last instant of its lifetime. And sensory systems are essential parts of that machinery.

Even a superficial inspection, from this point of view, of the visual and auditory system reveals their "decision machine character" to an overwhelming extent. Most remarkably, both in vision and audition, decision making actually begins right at the periphery, that is, in the eye's retina and the ear's organ of Corti. Physiologically this is evident, for example, in the transformation of the stimulus into sequences of discrete neural action potentials (nerve impulses) that propagate on discrete nerve fibers. Psychophysically, it is the phenomenon of contourization that provides pertinent evidence. In this author's view, contourization provides a key to a unifying concept of sensory information processing on any level.

Primary Contour and Spectral Pitch

Perception of a visual Gestalt entirely depends on the existence of primary contours. Without contour there is no Gestalt. Formation of contour and synthesis of Gestalt are complementary, mutually dependent, active processes. For the understanding of sensory information processing it is a key notion that even primary contour (i.e., on the retina level) is not trivial in the sense that it is fully determined by the stimulus alone. What a stimulus essentially produces on the eye's retina is continuous brightness and color distribution. Assignment of contours to such a distribution requires active decisions on the part of the peripheral sensory system. That type of decision can be regarded as the first step of information processing, that is, cognitive abstraction. It is in this sense that cognition begins right at the periphery (cf. the principle of immediate processing). And it is this notion, in conjunction with the principles of forward processing and autonomy, that explains to a considerable extent the enormous efficacy and speed of sensory information processing (e.g., Minsky, 1975). Visual contours are so important because they represent the most typical and invariant characteristics of external objects. Formation of contours implies abstraction from many details of the incoming stimulus (in particular those that are dependent on intensity and color of illumination) and extracts the typical shape of external objects.

Most remarkably, it is auditory spectral pitch (i.e., the pitch of part tones) that, with respect to external "acoustical objects," plays exactly the same role. From the basic physical parameters of a sound-source signal (i.e., amplitudes, phases, and frequencies of part tones), it is only the frequencies that are transmitted with highest fidelity; amplitudes and phases ordinarily are to a considerable extent corrupted.

Consider, for example, listening to a symphony in a concert hall. One may without difficulty distinguish one or the other individual instrument, although its sound signal on traveling to the listener's ears is heavily affected by room acoustics. Moreover, the signal is corrupted by the sounds of other instruments. Yet there cannot be any doubt that considerable information on that particular instrument's audible characteristics is still present in the signal the ear receives, and that the auditory system is capable of extracting it.

Equally fundamental and striking is the fact that the pitches of musical tones are conveyed with high precision from musician to listener. As a daily-life experience this is so evident that one seldom, if ever, wonders why it is so and what its scientific implications are. Nevertheless, this phenomenon is highly significant and deserves careful analysis.

The key to understanding these and other achievements of auditory communication is provided by the principle of contourization. It is based on the notions:

• That there is at least one type of source-signal parameter that is not affected by transmission from source to listener, namely, spectral frequency.

• That the peripheral auditory system is an efficient Fourier-spectrum analyzer followed by a contourization mechanism that "reads" discrete part-tone pitches from the continuous spectral-intensity distribution.

As a psychophysical representation of spectral frequencies, the part-tone pitches include most of the information that comes from sound sources and is carried by sound signals such as musical tones. In auditory perception, and perception of music, spectral pitch plays the role of primary contour on which any auditory Gestalt entirely depends.

By simulating auditory extraction of spectral pitches on a computer, we have verified that the contour-time patterns actually include all aurally relevant information. In Figure 1 an example is shown of the so-called part-tone-time pattern. It includes the first three notes of the song "Summertime," sung by a trained woman. Both the musical and text information included in that sample are represented by the frequency-time contours (part-tone amplitudes are coded in line thickness). The musical information (i.e., pitch classes, vibrato, intonation) is included in the time course of harmonic frequencies. The text information is included in enhancement and suppression, respectively, of certain harmonics by vocal-tract resonances (formants) and in noisy and plosive clues.

Heinbach (1988) has demonstrated that from this type of part-tone-time pattern another audio signal can be synthesized that is aurally almost indistinguishable from the original. This is convincing evidence for the conclusion that the contourized representation includes practically all aurally relevant information. As the above example includes only one voice, it should be noted that these results apply to polyphonic music and multivoice speech, as well.

[Figure 1: frequency axis 0-5 kHz, time axis 0-4 s.]

Fig. 1. Part-tone pattern as a function of time, of a solo soprano singer (first three notes of "Summertime" by G. Gershwin). Part-tone amplitudes are coded in line thickness. The information displayed is sufficient to synthesize an audio signal that is aurally almost indistinguishable from the original. For technical reasons, only the frequency band 0-5 kHz was analyzed. The diagram illustrates the tonal information present on the first level of abstraction, that is, primary auditory contour.

In addition to the analogy between visual primary contour and spectral pitch, there exist a number of psychophysical phenomena that strongly support the analogy and further reduce its arbitrariness. First, there are some accompanying effects such as contrast enhancement (Mach bands; cf. Carterette, Friedman, & Lovell, 1969; Small & Daniloff, 1967; Summerfield, Haggard, & Foster, 1984; Viemeister, 1980); after-contours (Fastl, 1986; Wilson, 1970; Zwicker, 1964); and the type of "illusion" in which perceptions of shape, length, or direction of visual contours are systematically different from corresponding objective parameters; its auditory equivalent is subjective shift of spectral pitch (e.g., by superimposed noise) and octave enlargement of pure tones (Stumpf, 1965; Terhardt, 1971, 1989; Walliser, 1969; Ward, 1954).

To evaluate adequately those analogous phenomena, it must be taken into account that in the eye there is an extra spatial dimension as compared with the ear. While on the eye's retina the three-dimensional external world is represented by a two-dimensional continuous distribution of light energy, on the inner ear's cochlear partition it is a one-dimensional distribution of sound energy. A visual contour is an abstraction of a line in a (two-dimensional) plane; the "line" itself is one dimensional. It is thus just logical that its auditory equivalent (i.e., spectral pitch) is null-dimensional, that is, a "point" on the low-high dimension. Therefore, to a curvature or bending of a visual contour there corresponds a linear shift of its auditory equivalent on the low-high dimension.

Second, it is precisely the role of spectral pitches as important carriers of information that throws an explanatory light on the precision and durability of short-term memory for pitch (e.g., Rakowski, 1972). If in auditory communication, spectral pitches were of no particular significance, one could hardly understand why the auditory system spends any effort to extract them so efficiently and precisely, and why they are kept in short-term memory for a considerable time interval (which actually is on the order of a minute). This conceptual problem is immediately resolved by the notion that spectral pitch is of high functional importance for acquisition of information from acoustic signals whose parameters are time variant, in particular, speech. The relevance of these notions for the perception of music is apparent: Perception of tonal music can hardly be imagined without the aforementioned characteristics of short-term memory for pitch.

A musical tone ordinarily is composed of a number of harmonic part tones of which the lower 8 to 12 evoke spectral pitches that correspond to their frequencies (Thurlow, 1959; Plomp, 1964; Terhardt, 1972). Thus, on the lowest level of the cognitive hierarchy an isolated musical tone must be regarded as an auditory Gestalt, a "molecule" rather than an "atom" of music. While on higher levels of conscious perception the tone ordinarily may appear as a holistic unit to the listener, by drawing attention to the lowest level one can hear the part-tone pitches too. This perceptual dualism may be regarded as evidence for the principles of autonomy, forward processing, and viewback: On presentation of a musical tone, the forward processing hierarchy spontaneously and readily produces higher-level holistic representations of the perceived object (i.e., the tone), while through the viewback channel the individual spectral pitches present on the lowest level can be accessed as well.
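The contourization step discussed in this section, a Fourier-spectrum analyzer followed by a mechanism that reads discrete part-tone pitches from the continuous spectral-intensity distribution, can be illustrated with a minimal sketch. It is not the simulation reported in the article, nor Heinbach's method: the naive DFT, the simple local-maximum rule, the threshold value, and the synthetic test tone are all assumptions chosen for brevity, and the function names are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but dependency-free)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        mags.append(abs(acc) / n)
    return mags


def read_part_tones(mags, sample_rate, n, threshold=0.05):
    """Contourization as a decision: keep only local maxima above a threshold."""
    part_tones = []
    for k in range(1, len(mags) - 1):
        is_peak = mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]
        if is_peak and mags[k] > threshold:
            part_tones.append(k * sample_rate / n)  # bin index -> frequency in Hz
    return part_tones


# Hypothetical test tone: three harmonics of 200 Hz with decreasing amplitude.
SAMPLE_RATE = 4000
N = 400  # 0.1 s of signal, giving 10 Hz bin spacing
tone = [
    sum(a * math.sin(2 * math.pi * f * t / SAMPLE_RATE)
        for f, a in [(200, 1.0), (400, 0.5), (600, 0.25)])
    for t in range(N)
]
part_tones = read_part_tones(magnitude_spectrum(tone), SAMPLE_RATE, N)
print(part_tones)  # [200.0, 400.0, 600.0]
```

The output keeps the part-tone frequencies and discards the rest of the spectral-intensity distribution, which mirrors the claim that frequencies are the parameters read off as discrete contours. A realistic analysis would need windowing and a time-varying spectrum, which this sketch omits.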

Secondary Contour and Virtual Pitch

The term "secondary contour" refers to contourization processes on the second level of the hierarchy; it does not imply minor relevance. The existence of secondary contours is evident both in vision and audition. In vision, they ordinarily are termed "illusory contours" (for a review, see Parks, 1984). The choice of the term "illusory" is both misleading and elucidating. It is misleading because it suggests interpretation of the phenomenon as a kind of artifact or even malfunction. It is elucidating because it reveals the principles of autonomy and viewback.

As illustrated by the example in Figure 2, virtual contours spontaneously emerge from presentation of appropriate configurations of primary contours (autonomy). When viewed from a higher cognitive level, it is recognized that the virtual contours are not "real." However, that recognition does not change anything in what is seen. The viewback function indeed is confined to just noticing an interpretation of the stimulus that has been autonomously established. As an essential principle of hierarchical information acquisition, the viewback function's purpose and advantage obviously is that it enables drawing more conclusions on the higher level, that is, after the autonomous low-level decision mechanisms have quickly and efficiently finished their job. Apparently this is one of the tricks by which sensory systems reconcile efficacy with flexibility.

As illustrated in Figure 2, the autonomous decision process on the second level creates both a number of virtual contours and a virtual Gestalt, namely, a white square that partly covers a black frame. The prominence of that virtual percept naturally depends on the amount of primary information that is compatible with such an interpretation. In that sense, the second-level interpretation integrates the separate four black angles into a holistic object. So this is a visual example of the aforementioned dualism of "synthetic" autonomous interpretation and "analytic" viewback. It appears conclusive that this example reveals another fundamental and important principle of sensory information acquisition.

Auditory perception of a musical tone can be explained by analogous principles, at least where musical pitch is concerned. A pertinent theory does already exist, namely, the virtual-pitch theory (Terhardt, 1972, 1974). Although this theory originally was not explicitly based on the intermodal analogies and general principles discussed here, it readily fits into them. Taking into account the aforementioned analogies between visual primary contour and spectral pitch, and between visual virtual contour and virtual pitch, the virtual-pitch theory turns out to be a natural part of the comprehensive concept of sensory information acquisition.

The significance of the virtual-pitch theory for music perception has been found to extend far beyond the pitch of single musical tones. The theory includes explanations for such basic musical phenomena as tone affinity and pitch ambiguity (e.g., octave equivalence), octave stretch and stretch of the tone scale, the root phenomenon (Rameau's "basse fondamentale"), and equivalence of chord inversions. The theory, and its algorithmic implementations, can be recommended as a tool for music theory (cf. Terhardt, 1974, 1978, 1979, 1982; Terhardt, Stoll, & Seewann, 1982a,b; Parncutt, 1988, 1989).

Both visual virtual contour and auditory virtual pitch can be regarded as samples of the immediate, forward processing, autonomous tendency of low cognitive levels to extract straightforwardly "what a stimulus means." Of course, this requires "knowledge," that is, use of certain reasonable criteria. In visual perception, those criteria are dependent on what type and configuration of objects ordinarily would produce the given type of stimulus. And the same applies to auditory perception. As was outlined in the virtual-pitch theory, it is the human speech signal that probably provides an important reference for aural evaluation of tonal sounds. As, for physical reasons, voiced speech elements are composed of harmonic part tones, the low-level mechanisms of the auditory system operate on the presumption that this is so for any sound. According to the theory, this is the reason why auditory creation of virtual pitch consistently obeys the principle of "subharmonic coincidence detection" (Terhardt, 1972, 1974).
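The idea of subharmonic coincidence detection can be made concrete with a deliberately simplified sketch. It is not the published virtual-pitch algorithm, which weights spectral pitches and accounts for pitch shifts; here every part tone counts equally, and the number of subharmonics considered, the relative tolerance, and the rule of preferring the highest candidate among ties are all assumptions. The function name `virtual_pitch` is hypothetical.

```python
def virtual_pitch(part_tones_hz, max_subharmonic=12, rel_tol=0.01):
    """Return (frequency, score) of the best subharmonic coincidence.

    Each part tone f proposes the subharmonics f/1, f/2, ..., f/max_subharmonic.
    A candidate's score is the number of part tones that have a subharmonic
    within rel_tol of it. Ties are resolved in favor of the highest frequency.
    """
    candidates = [f / m for f in part_tones_hz
                  for m in range(1, max_subharmonic + 1)]
    best_freq, best_score = None, 0
    for cand in candidates:
        score = 0
        for f in part_tones_hz:
            m = round(f / cand)  # subharmonic number closest to the candidate
            if 1 <= m <= max_subharmonic and abs(f / m - cand) <= rel_tol * cand:
                score += 1
        if score > best_score or (score == best_score and cand > best_freq):
            best_freq, best_score = cand, score
    return best_freq, best_score


# Harmonics 3, 4, and 5 of a tone whose 200 Hz fundamental is absent.
print(virtual_pitch([600.0, 800.0, 1000.0]))  # (200.0, 3)

# A just-intonation major triad (4:5:6) and its first inversion (5:6:8),
# treated as pure tones, share the same coincidence frequency.
print(virtual_pitch([264.0, 330.0, 396.0])[0])  # 66.0
print(virtual_pitch([330.0, 396.0, 528.0])[0])  # 66.0
```

Under these assumptions the first call yields a pitch at 200 Hz although no part tone has that frequency, and the two chord calls agree on 66 Hz, two octaves below the 264 Hz chord tone. This is one way to picture how a single coincidence principle could relate virtual pitch, root, and the equivalence of chord inversions.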

That behavior is assumed to have been acquired and settled either at an early age by an individual or through biological evolution. As was discussed earlier, the relationships between subjective pitch shifts and interval stretch suggest that the former is the case, that is, learning in early life. The assumption was made that development of the aural mechanism that creates virtual pitch is an essential part of the system that processes speech, that is, abstracts linguistic information from the highly redundant speech signal. This conclusion is supported by experimental evidence showing that aural capabilities to normalize phonetic characteristics of speech exist in early infancy (Kuhl, 1979; Miller, Younger, & Morse, 1982). Moreover, it has been established that a human fetus can already hear several months before birth (in particular, the mother's voice). With regard to general principles of biological development, it is very likely

Fig. 2. Illustration of virtual contours and virtual figures as an analogy to virtual pitch and root.


that such acoustic stimulation has a pronounced conditioning effect on the fetus's auditory system. Plasticity of the pitch-evaluation mechanism, which of course is required for that type of low-level learning, was found still to exist in adults (Hall & Peters, 1982; Hall & Soderquist, 1982).

Whether or not individual conditioning and learning is the basis of auditory cognitive achievements, it is not surprising that many of those achievements exist already in early infancy. Accordingly, evidence for both virtual pitch perception and a sense for octave equivalence has been found in young infants (Clarkson & Clifton, 1985; Demany & Armand, 1984). With regard to the aforementioned intimate relationships between principles of pitch perception and basic musical phenomena, it is evident that the basic cognitive mechanisms to which tonal music may appeal are implemented in the auditory system of very young (probably even newborn) infants.

Strictly speaking, the question of whether those mechanisms are innate or learned has not been decided yet. However, for gaining many basic insights into auditory information acquisition and music perception, that question is of minor relevance anyhow, as mentioned in the second section.

Concluding this section on secondary contour, it should be mentioned that the tendency of any sensory system to extract "meaningful" second-level representations of primary contour configurations is so pronounced that it cannot be stopped if, and while, any stimulus is given. A striking example of this tendency was provided by Houtgast (1976). He demonstrated that under certain experimental conditions subjects assign subharmonic virtual pitches even to single pure tones. For example, with a 1000-Hz tone as stimulus, virtual pitches corresponding to 500, 333, 250, and 200 Hz were heard.
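The subharmonic series quoted above is simple arithmetic: each candidate pitch is the stimulus frequency divided by a small integer. A minimal sketch follows; the function name and the choice of orders 1 to 5 are illustrative assumptions, not part of the cited experiment.

```python
def subharmonics(f_hz, orders=range(1, 6)):
    """Return f_hz / m for each integer order m (order 1 is the tone itself)."""
    return [f_hz / m for m in orders]


# Candidate pitch-equivalent frequencies for a 1000-Hz pure tone.
candidates = subharmonics(1000.0)
print([round(f) for f in candidates])  # [1000, 500, 333, 250, 200]
```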

This finding illustrates both the principle of subharmonic coincidence, where virtual pitch is concerned, and the fundamental "decision machine" character of living organisms. Obviously, evolution has very deeply implanted in any living organism the principle that "any decision is (on the average) better than no decision." Validity of this principle can indeed be observed on any level of perception and behavior. In music it can, for instance, be found in the tendency, in the context of tonal music, to assign a certain pitch category to practically any pitch, no matter how much the actual pitch deviates from "ideal" intonation.
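The categorization tendency can be illustrated with a small sketch that forces any frequency into the nearest pitch category and reports how far it deviates. Twelve-tone equal temperament with A4 = 440 Hz is assumed here purely for illustration; the function and constant names are hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_category(f_hz, a4_hz=440.0):
    """Return (note name, deviation in cents) of the nearest equal-tempered pitch.

    Assumption: 12-tone equal temperament referenced to A4 = a4_hz.
    """
    semitones_from_a4 = 12.0 * math.log2(f_hz / a4_hz)
    nearest = round(semitones_from_a4)           # a decision is always made
    cents = 100.0 * (semitones_from_a4 - nearest)
    midi = 69 + nearest                          # MIDI note number of A4 is 69
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, cents


# A tone that is clearly sharp is still assigned to the category A4.
print(pitch_category(450.0))
```

Whatever the size of the deviation, the function returns a category; the deviation is kept as a separate number, which mirrors the "any decision is better than no decision" idea.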

An apparent implication of these notions is ambiguity. In the above example, the pitch of the 1000-Hz tone was ambiguous such that any of the equivalent frequencies 1000, 500, 333, 250, and 200 Hz was offered. The 1000-Hz frequency indicates, on the primary level, what "really" was present as a stimulus, while the others indicate, on the second level, "what it reasonably could mean." The role of ambiguity, both in general and with regard to music, deserves closer inspection, as follows.

Ambiguity and Similarity

Ambiguity has often been recognized as an important ingredient of music (e.g., Bernstein, 1976; Thomson, 1983). Where sensory information acquisition in general is concerned, ambiguity is both typical and essential. When one takes into account that in any case the effective stimulus of a sensory organ can include only incomplete information on external objects and events, it is apparent that the "meaning" of any given stimulus can never be unambiguous.

In the hierarchical system of conditioned decisions considered here, ambiguity implies that the number of "solutions" achieved on either level is greater than one. From the present point of view this indeed is essential, as the solutions achieved on one level provide the input to the next. If that input did not have any alternatives, there would be nothing left to decide. As with "illusions," ambiguity becomes noticed through viewback, that is, inspection of ready-made solutions on lower levels, and drawing additional conclusions on a high level.

One can make a distinction between two basic sources of ambiguity. The first is insufficiency of structural information included in the stimulus. A pertinent example was just discussed, that is, the case of reducing the auditory stimulus to just one pure tone. The second source of ambiguity is the presence of contradictory structural information in the stimulus. In vision, pertinent examples are provided by the class of "impossible figures," for example, the Necker cube and most of M. C. Escher's graphics. Most of the melodic, harmonic, and rhythmic ambiguity of tonal music is analogous to visual impossible figures.

Even in the simple visual example shown in Figure 2, there is considerable ambiguity. First, the black angles may be seen just as what they are: black bars on white background. That visual interpretation is achieved most easily when the black bars are narrow. Second, one may see a white square that floats above a (supposedly closed) black frame. Third, one may see two white squares (one rotated by 45 degrees and floating above the other) on a black background. Remarkably, the amount of ambiguity does not seem to be systematically dependent on the prominence of the virtual contours and figures, which in turn is governed by the width of the black frame. With increasing width of the frame, prominence of virtual squares increases, but ambiguity remains the same, or even increases as well.

The same type of ambiguity is involved in auditory perception of a musical tone. There is no such thing as "the" pitch of a complex tone. On the lowest level, there is a set of (ordinarily harmonic) spectral pitches. On the second level, virtual pitches are created. What is consciously perceived in the "spontaneous" or synthetic mode is a holistic tonal object


that is primarily characterized by virtual pitches. By viewback, the low-level spectral pitches become recognized also. The virtual-pitch theory accounts for that multiambiguity by assigning weights to the individual pitches, either spectral or virtual. This is illustrated by Figure 3, where the theoretical pitch distributions of two harmonic complex tones are shown. The assumed fundamental frequencies are 440 Hz (upper diagram) and 220 Hz (lower). The 440-Hz tone is "higher in pitch" than the 220-Hz tone, not only in the sense that its predominant pitch is higher, but in the sense that the whole pattern is higher. So when a melodic sequence of musical tones is considered, the pitch-time contour in the sense of Dowling's (1978) concept should be discussed in terms of the corresponding sequence of pitch patterns rather than of single pitches.

As can also be seen in Figure 3, in the particular case of a 2:1 fundamental frequency ratio there appears a type of similarity between the two tones that is determined by identity of some pitches. When these two tones are played one after the other, a portion of the pitches of the first will be "echoed" by the second. By visual analogy one can express this by saying that the second Gestalt shares a number of contours with the first one. It is apparent that this effect will promote a tendency to perceive the second tone just as a replication of the first.

This is no less than a simple and straightforward explanation of octave equivalence, that is, chroma. This explanation is very similar to that suggested already by Helmholtz (1954). However, an important new aspect is provided by the presence of virtual pitches in the patterns. Whereas Helmholtz had considered only the pattern of spectral pitches, which extends from the fundamental frequency to several harmonics, the present pitch patterns include a considerable number of virtual pitches that are below the tone's fundamental frequency. This enhances the chance of higher-level pitches of successive tones coinciding and thus provides considerable additional evidence for Helmholtz's conclusion.
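The coincidence argument can be made concrete with a deliberately crude sketch. It is not the weighted algorithm of Terhardt et al. (1982a): it assumes, for illustration only, that a tone's pitch pattern consists of its first eight harmonics (spectral pitches) and its first eight subharmonics (virtual pitches), all unweighted, and it simply counts the pitches two patterns share. A tone a fifth above 220 Hz is included for comparison with the octave.

```python
from fractions import Fraction


def pitch_pattern(f0_hz, n_harmonics=8, n_subharmonics=8):
    """Unweighted toy pitch pattern: harmonics n*f0 plus subharmonics f0/m.

    Assumption: eight of each, chosen arbitrarily; exact fractions avoid
    floating-point mismatches when testing for coincidence.
    """
    f0 = Fraction(f0_hz)
    spectral = {f0 * n for n in range(1, n_harmonics + 1)}
    virtual = {f0 / m for m in range(1, n_subharmonics + 1)}
    return spectral | virtual


def shared_pitches(f0_a, f0_b):
    """Pitches that occur in the patterns of both tones, in ascending order."""
    return sorted(pitch_pattern(f0_a) & pitch_pattern(f0_b))


octave = shared_pitches(220, 440)   # 2:1 ratio
fifth = shared_pitches(220, 330)    # 3:2 ratio
print(len(octave), len(fifth))      # 8 4
```

Under these assumptions half of the shared pitches in the octave case lie at or below 220 Hz, which is the contribution of the virtual pitches that a purely spectral account would miss.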

Although most pronounced for a 2:1 frequency ratio, the effect of similarity by coincidence of pitches applies to the ratio 2:3 as well, that is, the fifth. For that ratio, the number and weight of coinciding pitches is less than for the octave. This accounts both for the existence of "fifth equivalence" and for the fact that it is less pronounced than octave equivalence. Taking advantage of the described principles, one can design models for quantitative evaluation of tone affinities, another contribution to a scientifically based music theory. The recent work of Parncutt (1988, 1989) provides solutions of that type.

While the above considerations were made on the basis of data provided by the virtual-pitch theory, the ambiguous second-level pitch patterns of individual musical tones have been experimentally verified. Figure 4 shows a number of pitch histograms that were obtained by pitch matches to harmonic complex tones with the fundamental frequencies indicated (from

Fig. 3. Theoretical pitch patterns, on the second level of abstraction, of two harmonic complex tones with fundamental frequencies 220 and 440 Hz. Abscissa: pitch-equivalent frequency. Virtual pitches: v; spectral pitches: s. Note ambiguity of pitch and partial coincidence of pitches in the two patterns. Calculation of pitches and pitch weights as described by Terhardt et al. (1982a).

Terhardt, Stoll, Schermbach, & Parncutt, 1986). In those experiments, harmonic complex tones were binaurally presented through earphones, with 60 dB SPL and 0.2 sec duration. After each presentation, a pure tone of arbitrary frequency was presented, and the subject adjusted its frequency such that it matched any spontaneously heard pitch of the previous complex tone. Eight subjects took part, and each subject did six matches. The histograms shown in Figure 4 represent the number of matches accumulated within a continuously shifted window with a width of 0.2 critical bands. The abscissa is matching frequency, that is, pitch-equivalent frequency. As expected, the highest peaks are at the complex tone's fundamental frequency. The same type of ambiguity as theoretically predicted can be seen. For a very low fundamental frequency (60 Hz), a pronounced tendency was found for alternative matches one octave higher. At higher fundamental frequencies, alternative matches to subharmonic frequencies were found. By and large, the experimental data are well in line with theoretical predictions (Figure 3).
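The accumulation step described above can be sketched as a sliding-window count on a critical-band (Bark) scale. The text does not say which critical-band formula was used; the sketch assumes the Zwicker and Terhardt (1980) approximation, and the match frequencies in the usage lines are made-up illustrative values, not data from the experiment.

```python
import math


def hz_to_bark(f_hz):
    """Critical-band rate in Bark (assumed: Zwicker & Terhardt, 1980, approximation)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def sliding_histogram(matches_hz, centers_hz, width_bark=0.2):
    """Count the matches falling inside a window of width_bark centered on each center."""
    z_matches = [hz_to_bark(f) for f in matches_hz]
    counts = []
    for center in centers_hz:
        z_center = hz_to_bark(center)
        inside = sum(1 for z in z_matches if abs(z - z_center) <= width_bark / 2.0)
        counts.append(inside)
    return counts


# Hypothetical matches clustered near 200 Hz, with single outliers at 100 and 400 Hz.
matches = [198.0, 200.0, 202.0, 100.0, 400.0]
print(sliding_histogram(matches, centers_hz=[100.0, 200.0, 400.0]))  # [1, 3, 1]
```

Evaluating the count on a dense grid of center frequencies yields a continuous curve of the kind the histograms are described as showing.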

The Tritone Paradox: Another Exercise in Ambiguity of Pitch

Although the ambiguity of pitch of a "normal" musical tone is sufficient to explain octave equivalence (chroma) and fifth similarity, with a particular type of harmonic complex tone, ambiguity of pitch can be made particularly pronounced. That type of complex tone includes only harmonics the frequencies of which are defined by

$$f_n = 2^n f_0 \tag{1}$$

where n = 0, 1, 2, 3, ..., and f0 is a low base frequency, for example, in the region below 100 Hz. The part tones either cover the entire audible frequency range (so that the lowest and highest merely fall below the threshold of hearing), or they are limited by a bandpass filter to a certain frequency band (for details see Shepard, 1964; Deutsch, 1986).

In that type of complex-tone stimulus, pitch information is reduced in such a way that the auditory pitch-evaluation mechanism cannot reasonably assign to it one dominant virtual pitch. What is heard is rather a set of virtual pitches that are in an octave relationship to each other and do not differ much in prominence. So, while the "chroma" or "pitch class" of such a complex tone is well defined, its height is not. In a number of experiments, Shepard (1964), Burns (1981), and Ohgushi (1985) have demonstrated several aspects of the "circularity" of perceived pitch that is typical for that type of stimulus.

Fig. 4. Accumulated distributions of pure-tone matches to harmonic complex tones with the fundamental frequencies indicated (from Terhardt et al., 1986). Panels: fundamental frequencies of 60, 120, 180, 240, 300, 450, 600, and 900 Hz. Abscissa: frequency of matching tone, 0 to 2.0 kHz. Note ambiguity of pitch.

The pronounced ambiguity of "Shepard tones" has suggested a number of experiments to find out how the auditory system behaves when that type of tone is used in a simple quasimusical context (Deutsch, 1986, 1988; Deutsch, Kuyper, & Fisher, 1987; Deutsch, Moore, & Dolson, 1984). One of the effects found is called the tritone paradox. Manifestation of this paradox starts out with the notion that for two successive Shepard tones that differ in pitch class by a tritone interval, it is impossible to find an objective criterion for deciding whether the first is higher than the second, or vice versa.

The first remarkable finding by Deutsch et al. was that subjects to whom successive tritone intervals of Shepard tones were presented were able, with considerable consistency, to decide whether the interval was "ascending" or "descending" in pitch height, although not all subjects responded the same way. The second significant finding was that the "ascending/descending" decisions were consistently dependent on pitch class. For example, one and the same subject would consistently hear the interval C-F# as descending but the interval E-A# as ascending. That dependency of judgments on pitch class was termed the tritone paradox, as at first sight there appears to be no simple psychophysical basis for that type of response.

The tritone paradox as a phenomenon is indeed striking, and one can hardly doubt that some kind of absolute pitch recognition must be involved. Moreover, the phenomenon suggests that the subject's decisions must be dependent on cognitive processes. The following explanation of the tritone paradox will reveal that both these conclusions are true. However, both "absolute pitch recognition" and "cognitive processes" turn out to play their role on a surprisingly low level of the hierarchy.

The key to the explanation is the simple fact that pitch-height ambiguity of Shepard tones turns out to be distinctly limited. When pitch matches such as described above for "normal" complex tones are carried out with Shepard tones, it turns out that a certain absolute region of pitch-equivalent frequencies is systematically preferred, namely, the frequency region extending roughly from 200 to 1000 Hz, with a maximum of "preference" at about 300 Hz. This was verified both theoretically (Terhardt et al., 1982b) and experimentally (Terhardt et al., 1986). The source of this effect is the combined influence of spectral dominance (Ritsma, 1967; Plomp, 1967) and subharmonic evaluation in formation of virtual pitch (Terhardt, 1972).

Figure 5 illustrates the relationship between the latter type of "absolute pitch recognition" and the tritone paradox. With the algorithm described by Terhardt et al. (1982a), the virtual pitches were computed for Shepard tones. The pitch classes are denoted on the abscissa, while the theoretical pitches pertinent to each tone are lined up vertically. The theoretical prominence (pitch weight) of the pitches is indicated by the area of black squares. The ordinate is scaled in semitones, and the reference frequency both for the pitch classes on the abscissa and the semitone scale at the ordinate is 440 Hz for A4. The Shepard tones were assumed to be composed according to Eq. (1) with base frequencies f0 of 16.35-32.70 Hz (depending on pitch class). The SPLs of part tones were assumed to be 50 dB, and part tones were included from f0 up to maximally 6 kHz.
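The composition of such a tone can be sketched as follows, assuming Eq. (1) has the usual octave-spaced Shepard form, that is, part-tone frequencies of 2^n times f0. The function name is hypothetical, and the equal 50-dB levels of the part tones are not modeled; only the frequencies are listed.

```python
def shepard_partials(f0_hz, f_max_hz=6000.0):
    """Part-tone frequencies f0, 2*f0, 4*f0, ... up to at most f_max_hz.

    Assumption: octave spacing (2**n * f0); 6 kHz upper limit as stated in the text.
    """
    partials = []
    f = f0_hz
    while f <= f_max_hz:
        partials.append(f)
        f *= 2.0
    return partials


# Pitch class C with base frequency 16.35 Hz, the lowest base frequency mentioned.
c_partials = shepard_partials(16.35)
print(len(c_partials), c_partials[-1])
```

With a 16.35-Hz base the list ends just below 4.2 kHz, because the next octave would exceed the 6-kHz limit.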

One can see in Figure 5 that, for example, the Shepard tone with pitch class C (first item at the abscissa; base frequency f0 = 16.35 Hz) produces pitches at 24, 36, 48, etc. semitones, corresponding to the pitch heights C2, C3, C4, etc. The area of squares indicates that the pitches C4 and C5 (262 and 525 Hz equivalent frequency) are most prominent. The second pitch class, F#, was created with the base frequency f0 = 23.12 Hz, that is, higher than that of C by a tritone ratio. One can see, however, that this is virtually irrelevant where the region of most prominent pitches is concerned: the pitches corresponding to the base frequency and its second harmonic are more or less ignored both by the ear and the theory.
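The relation between the semitone ordinate and pitch-equivalent frequency is a one-line conversion. The sketch below assumes equal temperament with A4 = 440 Hz lying 57 semitones above C0, which follows from the reference frequency named in the text; the function name is hypothetical.

```python
def semitones_to_hz(semitones_above_c0, a4_hz=440.0):
    """Frequency of a pitch given in equal-tempered semitones above C0.

    Assumption: A4 (440 Hz) is 57 semitones above C0.
    """
    return a4_hz * 2.0 ** ((semitones_above_c0 - 57) / 12.0)


for s in (0, 24, 36, 48):
    print(s, round(semitones_to_hz(s), 2))
```

Zero semitones gives the 16.35-Hz base frequency of pitch class C, and 48 semitones gives about 262 Hz, the value quoted for C4.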

To explain the tritone paradox on that basis, and within the general approach put forward in the present study, one merely needs to take into account that the pitch patterns shown in Figure 5 visualize a second-level cognitive representation of the corresponding Shepard tones. Evaluation of whether the tritone intervals C-F#, C#-G, etc., are ascending or descending is thus a challenge to the third cognitive level. Although the criteria for that evaluation are not a priori evident, visual inspection of the diagram (Figure 5) suggests a reasonable solution, namely, to trace the direction in which the most prominent pitches move from the first to the second Shepard tone. When (arbitrarily) the two most prominent pitches are selected, one arrives at the solution indicated by arrows. The latter indeed reflect what was termed the tritone paradox: that the direction of perceived pitch height is dependent on pitch class.
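The tracing procedure can be sketched in a few lines. This is a toy stand-in, not the algorithm of Terhardt et al. (1982a): pitch weights are assumed to follow a Gaussian "preference" curve on a semitone scale, peaking at 300 Hz (the value given in the text) with a hypothetical width of 12 semitones, and the direction is read from whether the two most prominent pitches move up or down.

```python
import math

C0_HZ = 16.35  # zero point of the semitone scale, as in the text


def hz_to_semitones(f_hz):
    """Semitones above C0 for a given frequency."""
    return 12.0 * math.log2(f_hz / C0_HZ)


def prominent_pitches(pitch_class, peak_hz=300.0, sigma=12.0, how_many=2):
    """The most prominent octave-spaced pitches of a Shepard tone.

    pitch_class: semitones above C (0 = C, 6 = F#, ...).
    Assumption: Gaussian preference weight centered on peak_hz; sigma is hypothetical.
    """
    peak = hz_to_semitones(peak_hz)

    def weight(semitones):
        return math.exp(-((semitones - peak) ** 2) / (2.0 * sigma ** 2))

    candidates = [pitch_class + 12 * k for k in range(1, 8)]
    return sorted(sorted(candidates, key=weight, reverse=True)[:how_many])


def tritone_direction(first_pc, second_pc, peak_hz=300.0):
    """'ascending' if the two most prominent pitches move upward, else 'descending'."""
    first = prominent_pitches(first_pc, peak_hz)
    second = prominent_pitches(second_pc, peak_hz)
    return "ascending" if sum(second) > sum(first) else "descending"


print(tritone_direction(0, 6))    # C to F#
print(tritone_direction(4, 10))   # E to A#
```

Under these assumptions the pair C-F# comes out descending and E-A# ascending, and moving `peak_hz` to, say, 420 Hz reverses the judgment for C-F#, which is one way to picture how a shifted preference region would change the responses.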

While the phenomenon that is decisive for this explanation, namely, preference of a particular absolute pitch region, is included in the virtual-pitch theory, individual variations of course are not. It is not difficult to see that a small deviation from the average of the position and/or shape of the "preference characteristics" of a particular subject can considerably affect the "ascending/descending" judgments. As "preference characteristics" are just a parameter of auditory cognitive strategy, there may indeed exist systematic individual differences. For example, with respect to the aforementioned theoretical relationships between virtual-pitch evaluation and speech perception, one may speculate that acoustic parameters of both external voices and one's own voice, by exposure in early life, may have

Fig. 5. Theoretical pitch patterns of Shepard tones. Pitch class is indicated on the abscissa. Pitches pertinent to each tone are vertically lined up. Ordinate: Pitch-equivalent frequency, expressed in semitones above C0 (16.35 Hz). Area of squares is proportional to calculated pitch weight, that is, represents prominence. Note that for all Shepard tones, the most prominent pitches are in the height region of about 36 to 60 semitones, corresponding to 131-525 Hz. On the abscissa the Shepard tones are arranged in pairs of tritone intervals. The existence of the "preference region" causes systematic, pitch-class-dependent ascending or descending of the two most prominent pitches within each pair (arrows). This explains the tritone paradox. Computation as described by Terhardt et al. (1982a).

a conditioning effect. In any case, in the light of the present explanation of the tritone paradox, it is not surprising that the experimental results found by Deutsch et al. show systematic intersubject differences.


Concluding Discussion

After the foregoing considerations of the information processing hierarchy and some of its low-level implications for the perception of music, it would be only consistent to proceed with discussing third- and higher-level processes. It is essentially on the third level that temporally coded information comes into play, that is, the musical information included in melodic contour, root progression, and rhythm. As much experimental and theoretical research has been done already on those topics (e.g., Deutsch, 1982a), it is an interesting challenge to incorporate them into the present general approach. Of course, this would by far exceed the scope of the present study.

There exist a number of experimental investigations pertinent to the borderline between low-level, static, and higher-level dynamic cognitive processes, providing a kind of interface. Of particular relevance here are:

• The "continuity effect" and "pulsation threshold" (e.g., Thurlow, 1957; Warren, Obusek, & Ackroff, 1972; Houtgast, 1974);

• Relationships between perceived rhythm and spectral- and virtual-pitch patterns such as demonstrated by van Noorden (1975);

• Virtual pitch evoked by nonsimultaneous harmonics (Hall & Peters, 1981).

Finally, some biological aspects of the present approach deserve to be mentioned. Although the present approach does not give an immediate answer to the question of a possible "survival value of music" (Roederer, 1984), it provides a pertinent message. That message essentially is that there exist a number of fundamental principles of sensory information acquisition that are decisive for survival:

• Immediate conditioned reaction to an external challenge.

• As a provision of appropriate reaction: Immediate, autonomous processing of the information included in any sensory stimulus (i.e., abstraction).

• As the basic element of abstraction: Discretization (contourization) by conditioned decision.

• As a tool and provision for efficient abstraction: Evolution, acquisition, and utilization of distributed knowledge.

As outlined in the present study, it is those survival-relevant principles that govern auditory perception of tonal music as well.


These notions particularly emphasize the relevance of research on music perception by animals. The remarkable performance in perception of musical stimuli of, for example, pigeons (Porter & Neuringer, 1984) and starlings (Hulse & Page, 1988) can be seen as a natural consequence of the above principles. Those principles provide a conceptual link between research on animals and on humans.1, 2

References

Bernstein, L. The unanswered question. Cambridge, MA: Harvard University Press, 1976.

Bregman, A. S., & Campbell, J. Primary auditory stream segregation and the perception of order in rapid sequences of tones. Journal of Experimental Psychology, 1971, 89, 244-249.

Burns, E. Circularity in relative pitch judgments for inharmonic tones: The Shepard demonstration revisited, again. Perception & Psychophysics, 1981, 30, 467-472.

Carterette, E. C., Friedman, M. P., & Lovell, J. D. Mach bands in hearing. Journal of the Acoustical Society of America, 1969, 45, 986-998.

Carterette, E. C., Kohl, D. V., & Pitt, M. A. Similarities among transformed melodies: The abstraction of invariants. Music Perception, 1986, 3, 393-410.

Clarkson, M. G., & Clifton, R. K. Infant pitch perception: Evidence for responding to pitch categories and the missing fundamental. Journal of the Acoustical Society of America, 1985, 77, 1521-1528.

Demany, L., & Armand, F. The perceptual reality of tone chroma in early infancy. Journal of the Acoustical Society of America, 1984, 76, 57-66.

Deutsch, D. Music recognition. Psychological Review, 1969, 76, 300-307.

Deutsch, D. (Ed.) The psychology of music. New York: Academic Press, 1982a.

Deutsch, D. Grouping mechanisms in music. In D. Deutsch (Ed.), The psychology of music. New York: Academic Press, 1982b.

Deutsch, D. A musical paradox. Music Perception, 1986, 3, 275-280.

Deutsch, D. The semitone paradox. Music Perception, 1988, 6, 115-132.

Deutsch, D., Kuyper, W. L., & Fisher, Y. The tritone paradox: Its presence and form of distribution in a general population. Music Perception, 1987, 5, 79-92.

Deutsch, D., Moore, F. R., & Dolson, M. Pitch classes differ with respect to height. Music Perception, 1984, 2, 265-271.

Dowling, W. J. Scale and contour: Two components of a theory of memory for melodies. Psychological Review, 1978, 85, 341-354.

Fastl, H. Auditory after-images produced by complex tones with a spectral gap. In Proceedings of the 12th International Congress on Acoustics, B1-5. Toronto: Beauregard Press, 1986.

Hall, J. W., & Peters, R. W. Pitch for nonsimultaneous successive harmonics in quiet and noise. Journal of the Acoustical Society of America, 1981, 69, 509-513.

Hall, J. W., & Peters, R. W. Change in the pitch of a complex tone following its association with a second complex tone. Journal of the Acoustical Society of America, 1982, 71, 142-146.

Hall, J. W., & Soderquist, D. R. Transient complex and pure tone pitch changes by adaptation. Journal of the Acoustical Society of America, 1982, 71, 665-670.

1. This study was worked out in the Sonderforschungsbereich 204 "Gehör", München, supported by the Deutsche Forschungsgemeinschaft.

2. Portions of this paper were presented at the First International Conference on Music Perception and Cognition, Kyoto, Japan, October 1989.