the sound of rolling objects : perception of size and speed · the sound of rolling objects :...

138
The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002). The sound of rolling objects : perception of size and speed. Technische Universiteit Eindhoven. https://doi.org/10.6100/IR556897 DOI: 10.6100/IR556897 Document status and date: Published: 01/01/2002 Document Version: Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers) Please check the document version of this publication: • A submitted manuscript is the version of the article upon submission and before peer-review. There can be important differences between the submitted version and the official published version of record. People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website. • The final author version and the galley proof are versions of the publication after peer review. • The final published version features the final layout of the paper including the volume, issue and page numbers. Link to publication General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal. If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license above, please follow below link for the End User Agreement: www.tue.nl/taverne Take down policy If you believe that this document breaches copyright please contact us at: [email protected] providing details and we will investigate your claim. Download date: 07. Sep. 2020

Upload: others

Post on 18-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

The sound of rolling objects : perception of size and speed

Citation for published version (APA):Houben, M. M. J. (2002). The sound of rolling objects : perception of size and speed. Technische UniversiteitEindhoven. https://doi.org/10.6100/IR556897

DOI:10.6100/IR556897

Document status and date:Published: 01/01/2002

Document Version:Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)

Please check the document version of this publication:

• A submitted manuscript is the version of the article upon submission and before peer-review. There can beimportant differences between the submitted version and the official published version of record. Peopleinterested in the research are advised to contact the author for the final version of the publication, or visit theDOI to the publisher's website.• The final author version and the galley proof are versions of the publication after peer review.• The final published version features the final layout of the paper including the volume, issue and pagenumbers.Link to publication

General rightsCopyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright ownersand it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

• Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal.

If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license above, pleasefollow below link for the End User Agreement:www.tue.nl/taverne

Take down policyIf you believe that this document breaches copyright please contact us at:[email protected] details and we will investigate your claim.

Download date: 07. Sep. 2020

Page 2: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

The sound of rolling objectsPerception of size and speed

Page 3: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

The work described in this thesis has been carried out under the auspices ofthe J. F. Schouten School for User-System Interaction Research.

c 2002 Mark Houben - Eindhoven - The Netherlands.

CIP-DATA LIBRARY TECHNISCHE UNIVERSITEIT EINDHOVEN

Houben, Mark M.J.

The sound of rolling objects: perception of size and speed / by Mark M.J. Houben. -Eindhoven: Technische Universiteit Eindhoven, 2002. -Proefschrift. -ISBN 90-386-1797-6NUR 962Keywords: Auditory event perception / Nonspeech audio / Everyday sounds /Psychoacoustics / Rolling objects

Printing: Universiteitsdrukkerij Technische Universiteit Eindhoven.

Page 4: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

The sound of rolling objectsPerception of size and speed

PROEFSCHRIFT

ter verkrijging van de graad van doctor aan deTechnische Universiteit Eindhoven, op gezag van deRector Magnificus, prof.dr. R.A. van Santen, voor een

commissie aangewezen door het College voorPromoties in het openbaar te verdedigen

op maandag 1 juli 2002 om 16.00 uur

door

Mark Mathieu Jeanny Houben

geboren te Weert

Page 5: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Dit proefschrift is goedgekeurd door de promotoren:

prof.dr. A. Kohlrauschenprof.dr. R.A. Lutfi

Copromotor:dr. D.J. Hermes

Page 6: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Acknowledgment

The research described in this thesis would not have been possible without thehelp and presence of several people, and I certainly would not have enjoyedit as much as I did.

First and foremost, I would like to thank my promotor Armin Kohlrausch,who manages to combine expertise and passion for research with scientificand social involvement.

Furthermore, I would like to thank Dik Hermes for giving me a little of hisprecious time, Berry Eggen, for enthusiastically initializing this project, RobertLutfi, Stephen McAdams, and Tammo Houtgast for valuable comments, JanEngel, Ralph van Dinther, and Martin McKinney for their assistance, andLuuk Franssen and Christophe Stoelinga for their scientific and musical con-tributions.

I would also like to thank Jeroen Breebaart for being a roommate I only couldhave hoped for: calm, bright, pleasant, and musical, Leon Luwijs (the lifeand soul of IPO) for his ubiquitousness and social skills, Willem-Paul ‘howto use statistics’ Brinkman, Arnout ‘talking encyclopedia’ Fisher, Steven vande Par, Hilde Keuning, members of the former IPO-band, the ‘thee-club’ atIPO and M-wing, and all those keeping up a good atmosphere despite allmanagemental misery.

Finally, I would like to thank my parents and brothers, and especially Olga,for their support and love.

v

Page 7: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

vi

Page 8: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Contents

1 General introduction 11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.2 Nonspeech audio . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.3 Ecological approach to auditory perception . . . . . . . . . . . . 71.4 Previous studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81.5 Research methodology . . . . . . . . . . . . . . . . . . . . . . . . 221.6 Scope and overview of this thesis . . . . . . . . . . . . . . . . . . 25

2 Perception experiments with recorded sounds 292.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302.2 Recording method . . . . . . . . . . . . . . . . . . . . . . . . . . . 312.3 Experiment I: Perception of the size of rolling balls . . . . . . . . 31

2.3.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312.3.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322.3.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

2.4 Experiment II: Perception of the speed of rolling balls . . . . . . 342.4.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342.4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352.4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

2.5 Experiment III: Interaction between size and speed . . . . . . . . 372.5.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372.5.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382.5.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

2.6 Auditory cues derived from the stimuli . . . . . . . . . . . . . . 412.6.1 Analysis of temporal properties . . . . . . . . . . . . . . . 422.6.2 Analysis of spectral properties . . . . . . . . . . . . . . . . 442.6.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

vii

Page 9: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Contents

2.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3 Perception experiments with manipulated sounds 513.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 523.2 Sound manipulation algorithm . . . . . . . . . . . . . . . . . . . 523.3 Pilot experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

3.3.1 Stimulus description . . . . . . . . . . . . . . . . . . . . . 553.3.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 563.3.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573.3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

3.4 Main experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . 613.4.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613.4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623.4.3 Statistical analysis and discussion . . . . . . . . . . . . . . 65

4 The influence of angular speed 734.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 744.2 Amplitude modulation . . . . . . . . . . . . . . . . . . . . . . . . 754.3 Experiments on size and speed perception revisited . . . . . . . 764.4 Interaction experiment with amplitude modulation . . . . . . . 80

4.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 804.4.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 804.4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 824.4.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

4.5 Experiment with independent variation of angular speed . . . . 914.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 914.5.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 914.5.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 934.5.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

4.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 964.7 Previous perception experiments revisited . . . . . . . . . . . . . 98

4.7.1 Perception experiment III with recorded sounds revisited 984.7.2 Perception experiments with manipulated sounds revis-

ited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1034.7.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

5 General discussion 107

viii

Page 10: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Contents

Bibliography 115

Summary 123

Samenvatting 125

Curriculum vitae 127

ix

Page 11: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Contents

x

Page 12: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1General introduction

Sound is a ubiquitous and natural medium to obtain information about theworld we are living in. We are able to identify and locate many sound sources,from footsteps to car sounds and from thunder to breaking glass. However,sound is barely used in information and communication technology. In orderto create suitable auditory interfaces based on everyday sounds, we have tobetter understand their perception. The aim of this thesis is to obtain a clearerview on the auditory perception of the size and the speed of rolling objects.After a short discussion of nonspeech audio and an ecological approach toperception, an overview of relevant studies on auditory event perception isgiven. This is followed by a description of the research methodology we usein this thesis. According to this methodology we investigate the relationshipbetween the perceived and the physical properties of an acoustic source event,with the aim of discovering acoustic structures within the sound that listenerscan use in their judgments.

Page 13: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

1.1 IntroductionThe world around us is full of sounds. Some beautiful, some irritating, somefascinating, some familiar, some peculiar, but about all informative. The tele-phone rings. Someone is whistling. The computer rattles. A car is passing by.These are all examples of sounds we may encounter in daily life and which in-form us about the world we are living in. By listening to sounds, we are able toextract information about the sound source, the location, and the environmentin which the sound is produced. For example, when walking in an alley, ourown footsteps inform us about the condition of the ground we are walking on(e.g. gravel, asphalt, mud). The echo of our footsteps may reveal the size ofthe alley. Another sequence of footsteps may betray a person hastily walkingbehind us. A lark singing above us indicates that we are not walking in thecity center. A wide spectrum of ‘noises’ may provide additional information:the sound of a car, voices, music, a dustbin being emptied, etc. Our world isfar from silent.

Often, the collection of all sounds is divided into three classes: speech, mu-sic, and everyday sounds. The main emphasis in auditory research has beenon speech. However, most sounds arriving at our ears are nonspeech sounds.Of these sounds, music has received most the attention of the research scien-tist, and only little research has been done on everyday sounds. Furthermore,although nonspeech sounds are a familiar and natural medium to get infor-mation, they are barely used in information and communication technology.Nonspeech audio is becoming a standard feature of most new ‘multimedia’computers, enabling people to compose music or download sound samplesfrom the internet. But with the exception of some ordinary bleeps, buzzes andaudio samples as acoustic alerts, the use of sound as information source isvirtually absent.

1.2 Nonspeech audio

Nonspeech audio promises to be beneficial in user interfaces. For instance,nonspeech sound can be used to reinforce the visual modality. If visual in-formation is accompanied by auditory information that matches or supple-ments the visual content, performance can be improved and usability in-creased (Brown et al., 1989; Brewster et al., 1994). In addition, by using theauditory modality, overload of the visual modality can be prevented, and theoverall cognitive load in general reduced (Baecker and Buxton, 1987; Brew-

2

Page 14: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.2 Nonspeech audio

ster, 1997). Furthermore, auditory displays have the potential to convey infor-mation that is difficult to visualize, for instance information about changingevents or the internal mechanisms of complex objects. In this case it may bemore appropriate to use sound (which has a dynamic character) rather than toinvent visual metaphors (which are static by nature). Since mostly the humanperceptual system does not remain conscious of steady-state sounds (Buxton,1989), continuous sounds do not have to be actively monitored but will onlydraw attention when they change. Another striking aspect of audio messagesis that they are not restricted to directional attention of the user (Brewsteret al., 1995). Unlike visual objects, sound is received regardless of where oneis looking. Sound and vision are complementary modes of information whichcan be described by saying that sound exists in time and over space, and visionexists in space and over time (see table 1.1) (Gaver, 1989). Finally, auditory in-terfaces may help to make the increasingly spatial ‘windows-based’ interfacesof current systems accessible to visually impaired users (Edwards, 1989; Poll,1996; Mynatt, 1997). Besides these functional advantages, non-speech audioincreases feelings of direct engagement and enjoyment. For example in inter-active games, addition of non-speech audio not only improves performance(Buxton, 1989), but also makes the game more realistic and entertaining.

Table 1.1: The complementary character of sound and vision (Gaver, 1989).

Time Space

Sound

Sound exists in time

� Good for display of changingevents

� Available for a limited time

Sound exists over space

� One does not need to face thesource

� A limited number of messagescan be displayed at once

Vision

Visual objects exist over time

� Good for display of static ob-jects

� Can be sampled over time

Visual objects exist in space

� Must face the source

� Messages can be spatially dis-tributed

The area of auditory interfaces is growing. Three main areas of research canbe distinguished. One area is audification (the use of sound to display infor-mation) for the blind or visually impaired. Two examples are audification ofthe topological arrangement of objects (Rigas and Alty, 1997), and the creationof an auditory perception of algebra (Stevens et al., 1994). A second area ofresearch is sonification, sometimes called auralisation or audiolisation, whichdenotes data representation and interpretation through audio. For this pur-

3

Page 15: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

pose, for example, tones are associated with graphs (Blattner et al., 1994; Flow-ers and Hauer, 1995), algorithms (DiGiano and Baecker, 1992; Jameson, 1994;Vickers, 1999), or multi-dimensional data (Bly, 1994). The approach of usingdata values to construct a waveform that is directly translated to the audibledomain, is sometimes called audification. Examples are listening to the wave-form of an electroencephalogram or a seismogram (Hayward, 1994; Dombois,2001), listening to the course of stock values (Neuhoff et al., 2000), or listeningto radiation levels with a Geiger counter. A third area of research are auditorymessages. Auditory messages have been used throughout history, even be-fore the invention of electricity, for instance post-horns, church bells, militarybugle calls, and drums. Examples of auditory messages of today’s world arealarms, telephone rings, computer beeps, sounds providing ‘walk’ and ‘don’twalk’ information for visually impaired pedestrians, and warning bleeps of abacking vehicle. Two main types of auditory messages in user-computer inter-faces are auditory icons, and earcons. Basically auditory icons use recorded,processed or synthesized ‘real sounds’ (everyday listening) whereas earconsare based on abstract musical patterns (musical listening) 1.

EarconsThe concept of earcons were worked out by Blattner et al. (1989). They areabstract, musical tones that can be used in structured combinations to cre-ate auditory messages to which arbitrary meanings are assigned. The usermust learn the mapping between the earcon and the action or object it repre-sents. Blattner et al. define earcons as “nonverbal audio messages used in theuser-computer interface to provide information to the user about some com-puter object, operation or interaction” (p. 13). The elemental parts are shortsequences of tones called motives. By combination and modification of mo-tives, more complex units with different but related meanings are obtained.The most important features of motives are rhythm, pitch, timbre, register,and dynamics. Guidelines for the creation of earcons can be found in Brew-ster (1994a) and Brewster et al. (1995).

Auditory iconsAuditory icons, developed by Gaver (1986), are natural, everyday soundsof the world around us. They can either be used as a stand-alone auditory

1This nomenclature is, however, not universal; sometimes earcons are included when re-ferring to auditory icons, and vice versa. In this thesis, the distinction between auditory iconsand earcons is made explicitly.

4

Page 16: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.2 Nonspeech audio

human-computer interface or as an addition to a visual or haptic interface.Examples of real-world sounds are footsteps, the closing of doors, breakingof glass, bouncing of balls, rain, birdsong, and percussion sounds. A greatadvantage of real-world sounds is that they sound familiar, since humans arelife-long acquainted with them. Gaver stated that research on sound is con-strained by a traditional psychophysical understanding of sound and hearing.He suggests not to map sound and data by dimensions of sound itself, suchas pitch, timbre, loudness, or duration (musical listening), but by dimensionsof the sound’s source, such as type of interaction, material, surface condition,velocity, force, elasticity and asymmetry of the objects involved (everyday lis-tening). His basic idea is that sound conveys information about materials in-teracting at a location in an environment. In everyday life, people listen to theproperties of sources that create sound (everyday listening) rather than to themusical properties (musical listening). Gaver puts it this way (Gaver, 1994,p. 418):

“Auditory icons are an application of a new approach to sound andhearing that stresses the experience of hearing events in the world,rather than sounds per se.”

Based on ideas of Vanderveer (1979) and Warren and Verbrugge (1984), Gaversuggests the use of natural sounds in the way they are encountered in thenatural environment (Gaver, 1993a; Gaver, 1993b).

Often, similar auditory percepts can be obtained by different physical events.Blattner et al. (1994, p. 455) refer to these sounds as “naturally occurring audi-tory homonyms”. In film industry these similarities are used to make specialeffects, for instance, a sonic impression of thunder can be created by oscil-lating a plate of metal. This implies that a naturally occurring sound pre-sented in isolation may become ambiguous if information from other soundsor senses as well as environmental modification clues are missing (Deutsch,1983). In this way auditory icons may lose their advantage of intuitivity overmore abstract sounds like earcons. Complementing the auditory icon witha visual icon or event, helps to make the association easier to understand.Moreover, no exact copy of a real-world sound has to be used to induce an as-sociation. Blattner et al. (1994) use this idea for creating earcons. Gaver thinksof auditory icons as caricatures of naturally occurring, everyday sounds. Co-hen (1994) used sounds to notify users of background processes. The soundswere very realistic but not used in the way suggested by the natural envi-ronment. Gaver thinks of auditory icons as cartoon sounds: caricatures ofnatural sounds that may not really sound like the sonic events they portray,

5

Page 17: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

but that capture their essential features. They allow representation of dimen-sional data and conceptual objects. If necessary, metaphors should be used(e.g. a hiss representing a snake, decreasing pitch representing falling). Whenusing sound in some other areas, for instance in the world of interactive com-puter games, films, and virtual reality, more natural sounds, with hardly anydistinction between the computer generated sound and its equivalent in reallife, are desired. If, for example, one wants to synthesize the sound of walkingin interactive multi-media applications (Eggen, 1995), it may not be sufficientto simply generate a sequence of taps, although these are often associated withwalking.

One of the first interfaces that incorporated auditory icons was the Sonic-Finder (Gaver, 1989). It was an extension to the Finder, a filemanager on theMacintosh computer, which played sampled sounds modified according toattributes of the relevant events. Most of these sounds were parameterized,but the ability to modify sounds was limited. For this reason Gaver workedon a system of parameterizing auditory icons (Gaver, 1993a; Gaver, 1994). Amore elaborated description as well as a review of other implementations ofauditory icons can be found in Gaver (1994) and Brewster (1994b).

Earcons versus auditory iconsThough people are better able to associate real world signals (auditory icons)to functions, in some studies they appeared to prefer abstract musical sounds(earcons) in terms of pleasantness and appropriateness (Jones and Furner,1989; Sikora and Roberts, 1997). However, the real world signals for the func-tions were selected from sound libraries whereas the musical sounds were de-signed by a professional sound designer or musician. Furthermore, as Sikoraand Roberts noticed, in long term use, abstract sounds may become concreteenough to be perceived as annoying at some point.

Barras (1997) gives an overview of approaches to auditory display design andcalls the use of earcons the ‘syntactic approach’ (focus on the organization ofauditory elements into more complex messages), and refers to auditory iconsas the ‘semantic’ approach (focus on the metaphorical meaning of sounds).

Experiments investigating the effect of auditory icons and earcons in a visualcategorization task indicated that in a single task experiment, adding earconsincreased reaction times (Van Esch-Bussemakers, 2001), whereas adding au-ditory icons decreased reaction times (Bussemakers and de Haan, 2000). In adual task experiment in which subjects not only had to categorize visual stim-

6

Page 18: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.3 Ecological approach to auditory perception

uli but also had to remember the sum of a series of numbers, the addition ofboth auditory icons (Lemmens et al., 2001) and earcons (Lemmens et al., 2000)decreased reaction times compared to the visual-alone condition.

1.3 Ecological approach to auditory perception

In everyday listening, people listen to the properties of sources that createsound rather than to the musical properties of the sounds (Gaver, 1993b). Theobject of auditory perception is the sound-producing source, not the sounditself (Fowler, 1990; Fowler, 1991). “We can hear an approaching automobile,its size and its speed. We can hear where it is and how fast it is approach-ing. And we can hear the narrow echoing walls of the alley it is driving along.These are the phenomena of concern to an ecological approach to perception.”(Gaver, 1993b, p. 8). This ecological approach to auditory perception is basedon our perception and action in a natural environment and was initiated byJames Gibson. According to Gibson, the information available to us is suf-ficient to account for what we perceive (Gibson, 1986). It does not have tobe supplemented from our past experience or from our innate mental oper-ations. Gibson proposes that the environment consists of affordances, whichare potentials for behavior that can be directly perceived in the environment.Certain objects afford opportunities for action. For humans, a chair affords sit-ting. An apple affords eating. A cup affords drinking. A door with a handleaffords pulling. A book affords reading. The notion of affordances stresses thereciprocal relationship between an organism and its environment: The envi-ronment offers an organism an affordance; the organism must be capable ofperceiving and using it. According to Gibson, the perceptual apparatus of anorganism has evolved to be directly sensitive to affordances. Their propertiesare specified in stimulus information and are directly picked up by tuning tostructures in, in the case of vision, the ambient array such as shadows, texture,color, convergence, symmetry and layout that determine what is perceived.Gibson calls these structures invariants (Gibson, 1966). According to Gibson,perception is a direct consequence of the properties of the environment anddoes not involve any form of sensory processing.

Gibson worked mainly in the field of visual perception, but also consideredthe other senses including the auditory system from the perspective of directperception of invariants in the world, unmediated by memory, inference, andcomputation (Gibson, 1966).

7

Page 19: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

1.4 Previous studies

In order to create suitable auditory interfaces based on everyday sounds, wehave to better understand how people perceive everyday sounds. Severalstudies have investigated auditory event perception in a systematic way, withthe aim of discovering acoustic properties of sounds that listeners can use toperceive properties of their sources. In this section an overview of relevantstudies is given in chronological order. Subsequently, in Section 1.5, a schemeof research methodology is described and the studies that are discussed in thepresent section will be placed within this framework.

Warren and Verbrugge (1984) studied bouncing and breaking events. Theacoustic characteristics of bouncing may be described as “a single dampedquasi-periodic pulse train in which the pulses share a similar cross-sectionalspectrum”, whereas the acoustic characteristics of breaking may be describedas “an initial rupture burst dissolving into overlapping multiple dampedquasi-periodic pulse trains, each train having a different cross-sectional spec-trum and damping characteristic” (Warren and Verbrugge, 1984, p. 706). Ina first experiment, subjects were presented with recordings of bouncing andbreaking glass bottles, and had to judge whether the object had bounced, bro-ken, or neither of the two. Performance was very good (about 99% correct),demonstrating that natural sound provides sufficient acoustic information forlisteners to categorize the events of bouncing and breaking. In another exper-iment, examples of bouncing and breaking were constructed from recordings.Individual recordings of bouncing of four pieces of glass from a broken bottlewere combined such that their successive impacts were either synchronous,with the same rhythmic pattern as bouncing, or completely independent, tosimulate breaking. In this way, average spectral differences between the twosound classes were eliminated. The breaking patterns were preceded by aninitial noise burst, taken from the original rupture that produced the fourpieces of glass. Results showed very good performance (about 89% ‘correct’),indicating that temporal patterning provides sufficient information for listen-ers to categorize bouncing and breaking events. However, the temporal pat-terning and initial noise were confounded in the experiment and for this rea-son, the two experiments were repeated with the initial noise removed fromboth natural and constructed cases of breaking. The results were nearly identi-cal to the previous experiments, indicating that the initial burst was not neces-sary for categorization of the two events, and that the variation in the tempo-ral patterning of pulse onsets alone was sufficient. For a study of the influenceof spatiotemporal patterns on visual and auditory perception of elasticity in

8

Page 20: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.4 Previous studies

bouncing objects, see Warren et al. (1987).

Halpern et al. (1986) examined the unpleasantness of a chilling sound. Al-though this study investigated the unpleasantness of a sound instead of thehuman ability to perceive properties of the sound source, it is included be-cause of the very similar method and unexpected results. In a first experiment,subjects had to judge the unpleasantness for a number of different sounds,such as jingling keys, a blender motor, and scraping metal. The sounds werematched in duration (3 s) and amplitude (equal maximum value). The re-sults showed agreement between subjects regarding the unpleasantness ofthe sounds. The sound judged to be most unpleasant was that produced byslowly scraping a three-pronged garden tool over a slate surface, a sound verysimilar to the sound of fingernails scratching across a blackboard. The spectro-gram of this chilling sound revealed several prominent harmonics, the lowestat 2.8 kHz. The amplitude waveform showed an aperiodic temporal structurewith a rapidly fluctuating amplitude envelope. To investigate the contributionof spectral content to the sound’s unpleasant character, the authors removedenergy from different frequency regions by either highpass or lowpass filter-ing. The sounds were matched in amplitude by equalizing their RMS value.Subjects had to rate the unpleasantness of the filtered sounds and were toldhow the stimuli were created before listening. Results showed that decreas-ing the lowpass filter cutoff frequency from 8 to 3 kHz had no effect on theunpleasantness ratings. Increasing the highpass filter cutoff frequency from 2to 6 kHz, the sound lost some of its unpleasantness, with a large drop in un-pleasantness between 3 and 4 kHz. Apparently, removal of lower frequencies,not of the highest ones, lessened the sound’s unpleasantness. In this exper-iment the sounds were matched by their level, but still may be perceived asnot equally loud. A third experiment tested the possibility that unpleasant-ness had been confounded with loudness. Subjects listened to a selection ofstimuli from the previous experiment, presented at two sound pressure levels10 dB apart, and had to judge the loudness. An intensity decrease of 10 dB re-sulted in an estimated loudness drop between 41% and 50%, confirming thatsubjects were estimating the loudness. Sounds presented at the same soundpressure level showed no difference in the estimated loudness, indicating thatloudness differences could not have influenced the unpleasantness ratings.In a final experiment, the contribution of temporal fine structure was evalu-ated by presenting subjects with four different stimuli: the original sound, ademodulated version of the original (the original sound divided by its tem-poral envelope contour), an unmodulated synthesized sound (sum of threesinusoids corresponding to the first three prominent harmonics of the original

9

Page 21: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

sound), and a modulated synthesized sound (the sum of three sinusoids mul-tiplied by the temporal envelope contour of the original sound). The subjects’unpleasantness ratings of the original sounds were much higher than thoseof the synthesized sounds, indicating that the latter did not mimic the origi-nal chilling sound very well. No differences were found between the originaland demodulated original sounds and between the unmodulated synthesizedand modulated synthesized sounds, indicating that temporal envelope struc-ture did not contribute to the unpleasantness of the sounds. It is still unclearwhy this sound is so unpleasant for human listeners. The authors wonder“whether it mimics some naturally occurring, innately aversive event” (p. 80),and think of warning cries or vocalizations of some predator. But, “regardlessof this auditory event’s original functional significance, the human brain ob-viously still registers a strong vestigial response to this chilling sound” (p. 80).

Repp (1987) investigated hand-clapping as a sound-generating activity of twoarticulators. He recorded the individual clapping of 10 men and 10 women,and analyzed the spectral content of the average clap of each person. For datareduction, a principal components factor analysis was conducted. It resultedin four significant factors representing prototypical spectral shapes. By lin-ear combination of these four prototypical shapes, the spectrum of each clapcould be approximated. However, it is unclear what physical properties un-derly each factor. There was no significant sex effect for any of the four factors,which implies that men and women clapped similarly, and, because the handsof the men were significantly larger than those of the women, that hand sizehad no influence on the sound of claps. It appeared that the observed varia-tions in hand configuration accounted for about half of the spectral variabilityamong individuals. Other factors, not analyzed by Repp, such as hand cur-vature, tightness of the fingers, fleshiness of the palms, and striking force,may contribute to the unexplained variation in clap spectrum. All clappers,who knew one another, participated as listeners in a perception experiment inwhich they had to identify the clapper (including themselves) from a list ofparticipants. Overall clapper recognition was very poor. Self-recognition wasmuch higher. The subjects were very consistent in identification of a clapperas male or female, though they were often wrong. This may reflect generalsex stereotypes, rather than actual perception of sex differences in clapping.For example, fast and soft sounds with high resonance frequencies were as-sociated with claps produced by women. Apparently, acoustic characteristicswere associated with gender characteristics. In another experiment, listen-ers were rather good at identifying the hand configuration if all sounds weremade by one and the same person. However, identification of hand config-

10

Page 22: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.4 Previous studies

uration of all clappers mixed was very poor. Repp concluded: “sound ema-nating from a natural source, particularly one involving parts of the humanbody, conveys perceptible information about the configuration of that source”(p. 1108). However, as noted by McAdams (1993, p. 177), the results indi-cated that acoustic characteristics specific to a particular hand configurationare not invariant across clappers. Although this was an exploratory study, itis interesting with respect to its methodology in that properties of the acousticsource event (hand configuration and size) are related to auditory perceptions(gender of clapper) through acoustic structures (spectral shape).

Gaver (1988) studied people’s abilities to perceive the material and lengthsof struck wooden and metal bars. He found that sounds made by vibratingwood decay quickly in amplitude, with low-frequency partials lasting longerthan high ones, while the sounds made by vibrating metal decay slowly, withhigh-frequency partials lasting longer than low ones. Decreasing the length ofa bar increases the frequencies of the partials it produces, so short bars makehigh-pitched sounds and long bars make low ones. However, the effects oflength may interact with the effects of material. The frequencies of the partialschange monotonically with length, but the amplitude and decay time of a par-tial with specific frequency depends on material. Gaver found that people arebetter in judging the material than judging the length of a bar. He developed aphysical model to aid interpretation of the acoustic analysis and to synthesizenew impact sounds. However, Gaver did not test these synthesized sounds inperception experiments.

Freed (1990) studied the auditory correlates of perceptual ratings of mallethardness for percussive sound events. He recorded sounds made by strik-ing metal cooking pans of four different sizes with six percussion mallets ofvarious hardness: metal, wood, rubber, cloth-covered wood, felt, felt-coveredrubber. From a simulated peripheral auditory representation of the acousticsignal, he derived four ‘timbral predictors’: a measure of overall energy (themean spectral level), rate of decay (the slope of the spectral level over time),spectral distribution (the mean over time of the spectral centroid), and rateof spectral evolution (the ‘time-weighted average’ spectral centroid). Onlyacoustical parameters were used as timbral predictors, not mechanical param-eters of the source, such as striking velocity. The subjects’ task was to ratethe perceived mallet hardness on a unidimensional perceptual scale. Priorto the experiment, sound examples of the hardest and softest mallets werepresented. As Freed points out, subjects may have used these examples tolearn the rating task by association, rather than by aurally acquiring informa-

11

Page 23: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

tion about the sound-producing system. Results showed that subjects wereable to focus on the mallet-related aspects of the sound and ignore the pan-related aspects, such as pitch. In Freed’s words (p. 319): “This is a rather strik-ing demonstration of the listener’s ability to separate out interacting dimen-sions of a stimulus in order to make an environment-oriented judgment”. Thetimbral predictors correlated fairly well with perceived mallet hardness, thestrongest predictor being the mean over time of the spectral centroid. How-ever, although the perceived mallet hardness was independent of pan size, thetimbral predictors were not. So Freed did not search for an acoustic structurethat varied only with mallet hardness but did not vary with pan size.

Li et al. (1991) investigated the ability of subjects to perceive the gender ofa human walker. In a first experiment recordings of humans walking on ahard surface were used as stimuli to investigate whether subjects were ableto judge the gender of a walker. The results showed that subjects were ableto identify far above chance the gender of the walkers on the basis of acous-tic information present in the waveforms. From an anthropomorphic analysisit was concluded that, for the male and female walkers in the study, the sig-nificant anthropomorphic differences were in weight, height, and shoe size.These all correlated well with the ‘maleness’ judgments of the subjects. Sevenmeasures of spectral shape were applied to the spectrum of the heel strikephase of walking. As the presence of a toe strike phase seemed to be inde-pendent of the actual gender of the walker, Li et al. posed that the presenceof toe strikes may relate to walker identification rather than gender identifica-tion. To evaluate the relationship between acoustic source events and acous-tic structure, these seven measures were treated as predictors to discriminatemale and female categories. A principal component analysis conducted onthese measures resulted in two principal components, one describing the po-sition and shape of the spectral peak, and one describing the contribution ofhigh-frequency components. To examine the mapping between acoustic struc-ture and perception, a multiple regression analysis was conducted to regressthe maleness judgments on the two derived principal components. The re-sults indicated that “male walking spectra can be characterized by low centraltendency with high positive skewness and kurtosis, and rapid spectral risingand falling, while the female walking spectra can be characterized by highcentral tendency with low skewness and kurtosis, and slow spectral risingand falling” (p. 3045). Durational analysis did not show an effect of the tem-poral organization of walking sounds on the maleness judgments. However,as Eggen (1995) noted, this may be due to errors in the analysis, because theduration of the stance phase was systematically shorter than the duration of

12

Page 24: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.4 Previous studies

the swing phase, which is only possible if the walker was actually running.So this suggests that subjects may use two classes of information to identifya walker’s gender: the information about the position and the shape of thespectral peak and the contribution of high-frequency components. A secondexperiment positively verified this by crudely and systematically altering thespectral mode (the frequency that carries most of the spectral energy) andslopes for two walking stimuli previously judged to be relatively neutral interms of perceived gender. A frequency equalizer was used for the manipula-tions, allowing only a rough manipulation of spectral mode (equalization peroctave band) and a limited change of spectral slopes. A shift of spectral modeto lower or higher frequencies resulted in a considerable decrease or increasein maleness judgment, respectively. Unexpectedly, the maleness judgment forsounds with a shallow low-frequency slope was higher than for a steep slope.In a final experiment, the effect of shoe characteristics on the maleness judg-ment was demonstrated by using recordings of three female and one malewalker wearing male shoes of two different manufacturers. The perception ofthe gender of a walker was sometimes reversed by wearing shoes of the op-posite gender. This suggests that listeners are able to identify the gender of awalker on the basis of acoustic information present in the waveform inducedby multiple factors like anthropometric differences among walkers (weightand height) and differences in shoes (size). This multiplicity of factors showsthat it is difficult to investigate source characteristics separately. The authorsnote that in order to be able to vary specific parameters systematically, syn-thesized walking sounds are needed.

Lutfi and Oh (1997) studied the auditory discrimination of material changesin a struck-clamped bar. The principles of theoretical acoustics were appliedto synthesize the sounds of a struck iron and glass bar, rigidly clamped at oneend. The result is an inharmonic sum of damped sinusoids whose individ-ual acoustic parameters (frequency, intensity, and decay modulus) are, for afixed geometry and fixed driving force, uniquely determined by the materialcomposition of the bar (mass density and elasticity). As only the first threepartials were audible, only these were synthesized. Amplitude decreased inproportion to frequency at -6 dB per octave, decay modulus decreased in pro-portion to frequency cubed, and frequency ratios between the three partialswere constant. Target sounds of iron and glass were synthesized, using theirvalues of mass density and Young’s modulus of elasticity, as well as devi-ations from these ideal sounds by small (lawful and unlawful) changes inacoustic parameters. The sounds were presented pairwise, and subjects hadto indicate which interval contained the target sound, that is, either iron or

13

Page 25: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

glass in different trial blocks. Subjects, who were musicians, were extensivelytrained and received feedback about the correctness of their responses. Cor-relations between the acoustic parameters and the responses showed that thelisteners failed to make optimal use of the information in the sound. This islargely due to the listener’s tendency to put a disproportionate high value onchanges in signal frequency compared to decay and intensity. Lutfi and Ohfound no difference in performance between lawful and unlawful (indepen-dent) perturbed conditions. As the perturbations were small, this is in agree-ment with results from their previous experiments which showed that “ratherlarge deviations from lawful variation are required before listeners can dis-tinguish lawful from unlawful perturbations in these sounds (Lutfi and Oh,1994)” (p. 3652).

Lakatos et al. (1997) investigated the extent to which listeners could distin-guish the geometric features of struck bars on the basis of their auditory at-tributes. In two experiments, subjects listened to recordings of struck solidmetal (experiment 1) and wooden bars (experiment 2), that varied in bothwidth and height at constant length. The sounds were presented pairwise,accompanied by two pairs of visual depictions on a computer screen repre-senting the exact width to height ratios of the bars. Listeners had to select thecorrect visual ordering corresponding to each sound pair. Lakatos et al. (1997,p. 1183) noted that “this task measured not discriminability but rather a rel-ative identification consistency across bars”. To familiarize subjects with theevent type, they were, prior to the experiment, given the opportunity to strikesample bars of dimensions different from those used in the experiment. Thedata of 5 out of 60 (experiment 1) and 10 out of 60 subjects (experiment 2) wereexcluded from the analysis because their preference for one ordering abovethe other within a pair, averaged over all pairs, did not exceed 75%. Per-cent correct scores, Pc, below 50% were flipped (i.e. 100%�Pc), transformedinto measures of dissimilarity, and analyzed with multidimensional scaling.A two-dimensional solution was found for the metal bars. One dimensioncorrelated strongly with the width to height ratios of the bars. The seconddimension correlated strongly with the spectral centroid of the 7 to 8 largestpeaks in the spectrum calculated for the attack portion of the sounds. A plotof the multidimensional scaling solution revealed a clustering of bars into twogeneral categories of ‘plates’ and ‘blocks’ with a diagonal boundary indicatingthat it depended equally on differences in width to hight ratio and values ofpeak spectral centroid. The results for the wooden bars were less consistent.The multidimensional scaling analysis resulted in a one-dimensional solutionwhich correlated moderately well with the width to height ratios of the bars.

14

Page 26: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.4 Previous studies

The authors also searched for the characteristic vibration modes of bars byacoustic analysis of the spectrum of the original recorded sounds and addi-tional measurements of accelerometers directly on the bars. Fourier analysisof the sounds showed that the presence of three relevant vibration modes (twotransverse bending modes and a torsional mode about the longitudinal axis)was generally in good accordance with their theoretical frequencies. However,in the sounds made by wooden bars, these modes were sometimes weak oreven absent because of the high damping of wood. The torsional vibrationalmode as well as the ratio of the two transverse modes in the bars correlatedstrongly with the width to height ratio and thus with the subjects’ matchingperformance in the experiments. The authors suggest that these modes serveas potential cues for perceiving the spatial dimensions of the bars. However,Houix et al. (1999) have shown that the full invariant structure of a bar is notabstracted when striking position is varied. No account is given for the factthat subjects also discriminated the metal bars on the basis of their spectralcentroid.

Carello et al. (1998) investigated the ability to perceive by sound the pre-cise length of wooden rods dropped onto a hard surface. Wooden rods weredropped onto a linoleum floor, and listeners, who did not see the rods, hadto indicate the length of the rods by moving an adjustable surface away froma desk until the rod would just fit between the surface and the desk. Rodsof different lengths but same diameter and material were used. The resultsshowed that listeners were able to scale objects appropriately, without anystandard of comparison (no foreknowledge of the size range or the numberof objects). Though the length of the rods was consistently underestimated,the correlation between the actual and perceived lengths was very high. Ananalysis of acoustic structure was performed; signal duration, amplitude, andspectral centroid were measured to determine whether listeners might haveused these to judge the length of the rods. Correlations revealed no plausi-ble relation between the measures and perceived length. No other acousticstructure was measured, but it was suggested that the object’s inertia tensoris relevant for the perception of that object’s length. So it is not yet clear whatacoustic structure was involved in these judgments.

Hermes (1998) studied the perceived material of synthetic impact sounds. Thesounds consisted of a sum of seven exponentially decaying partials with fre-quencies equally spaced on a logarithmic scale covering one third of an octave.Both the center frequency and the decay time of the partials were systemati-cally varied. The decay time was equal for all partials within a sound. In

15

Page 27: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

a free-identification experiment, subjects could listen to each sound as oftenas wanted and were asked to write down the material of the object produc-ing the sound. All words used by the subjects were counted. The responsesfalling into the five categories most often mentioned, that is, ‘wood’, ‘metal’,‘glass’, ‘plastic’, and ‘rubber/skin’, were counted as a function of the centerfrequency and decay time. Contour plots showed that ‘glass’, ‘metal’, and‘wood’ were well-defined areas in the frequency – decay time space. ‘Glass’and ‘metal’ sounds have high frequencies, with longer decay times of the par-tials for ‘metal’ than for ‘glass’. ’Wooden’ sounds are lower in frequency andhave short decay times. The labels ‘rubber/skin’ and ‘plastic’ were used forlow and very low sounds, respectively. In a follow-up, forced-choice experi-ment, the same stimuli were presented, and listeners had to categorize eachsound as belonging to one of the five best-defined categories resulting fromthe previous experiment. They could not repeat the sounds and had to re-spond very quickly as to promote intuitive responses. Results showed that theclassification of ‘glass’, ‘wooden’, ‘metal’, and ‘rubber’ sounds resembled theresults of the first experiment quite well, except for a slightly smaller regionfor ’metal’ and a larger region for ’rubber’. The results for ‘plastic’ differedconsiderably from the previous free-identification experiment. Instead of lowcenter frequencies, the responses were given in the mid-frequency region. Pre-sumably listeners judged sounds that could not easily be categorized as beingmade by ‘plastic’. Related to material perception, Roussarie et al. (1998) syn-thesized the sound of vibrating bars with a physical model resulting in a sumof exponentially decaying sinusoids, similar to the sounds used by Hermes,but physically restrained. The material density and internal damping factorwas varied and listeners had to rate the dissimilarity between pairs of sounds.The only conclusion drawn was that listeners are sensitive to changes in ma-terial density and internal damping properties.

Kunkler-Peck and Turvey (2000) examined people’s ability to determine thelengths and widths of a plate based on the sound it made when struck bya pendular hammer. In one experiment, participants were asked to recordeither the height or width of an unseen, rectangular steel plate after it wasstruck. The height ranged from about 48 to 91 cm and the width from 25 to48 cm. While the participants did not precisely reproduce the object’s actualdimensions, they did accurately perceive its shape. For example, squares wereperceived as equal in length and width while long, thin rectangles were per-ceived as unequal in length and width. In addition, the perceived dimensionswere in the vicinity of the actual dimensions and were ordered properly. Evenif the material of the plates varied between steel, Plexiglass and wood, subjects

16

Page 28: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.4 Previous studies

perceived the shapes of the plates accurately regardless of the material. How-ever, it should be noted that although the relative dimensions were perceivedcorrectly, the true physical dimensions were not and depended on material.In another experiment, subjects heard an unseen, steel object being struck andwere then asked to identify it as circular, triangular or rectangular. The re-sults showed that the participants accurately identified the correct shape at alevel well above chance. If participants were asked to discern the shape andmaterial composition of three different plates made of three different materi-als, they accurately identified the shape at a level well above chance and werenearly perfect in identifying the material composition of the struck plates. Fur-thermore, the results revealed that there was an unexplainable tendency forparticipants to associate a particular material with a particular shape (woodwith circle, steel with triangle, and Plexiglass with rectangle). The dimensionsof the steel plate were judged as larger than the corresponding dimensions ofthe wooden plate, and the wooden plate larger than Plexiglas. As noted byGygi (2001), this is possibly due to the louder sound emitted by steel platessince the striking force of the pendular hammer was constant. The findingsof the study suggest that unique structures within the acoustic pattern allowlisteners to perceive one dimension separate from another. The authors cal-culated the resonance modes of the plates and found a strong correlation ofperceived dimensions with the first three even modes. Listeners may detectthe modal frequencies associated with length separately from those associatedwith width, and vice versa. According to the authors (p. 293), the modal fre-quency spectra “provide a very good first approximation to the sound struc-ture that specifies shape. The modal vibrations are a candidate structure forinformation in Gibson’s specificational sense. [...] Auditory shape perceptionmay map uniquely to the modal frequency spectra.” However, the spectralacoustic cues are probably correlated with spatial acoustic cues induced by alarge plate. The authors remarked that a possible cue may be the interauraltime or intensity difference arising from the edges of the vibrating plates. Theauthors did not search for an acoustic structure specifying the shape of theplate.

Klatzky et al. (2000) investigated the perception of material from synthesizedcontact sounds. Using algorithms of Van den Doel and Pai (1998), sounds ofclamped bars struck at an intermediate point were synthesized. The resultingset of modal frequencies varied in fundamental frequency. Damping, whichonly depends on material, was simulated by a frequency-dependent rate ofdecay, as proposed by Wildes and Richards (1988). In the first two experi-ments, subjects judged the similarity of synthesized sounds with respect to

17

Page 29: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

material. In one of the two experiments, sounds were simulated with constantstriking force (equal initial amplitude of the waveforms), with the result thatthe total energy of the sound and its duration depended on the rate of decay.In the other experiment, the initial sound amplitude was varied in such a waythat the total energy was constrained within a small range. Furthermore, thesound duration depended on fundamental frequency, decay, and initial am-plitude. The similarity responses were subjected to a two-dimensional mul-tidimensional scaling algorithm. In the solution found, one axis correlatedhighly with the decay parameter and the other axis with the frequency pa-rameter of the sounds (the contributions of the two parameters to the simi-larity judgments were almost orthogonal). Both differences in decay and fre-quency, but not in total energy or sound duration, affected similarity judg-ments. Decay differences contributed more to the similarity judgments thanfundamental frequency (for the range of stimulus differences in these exper-iments). Another experiment in which subjects judged the similarity of syn-thesized sounds with respect to length, instead of material, showed a decreasein the contribution of decay. This indicates that in the previous experimentssubjects did not arbitrarily compare the sounds, but probably judged similar-ity. In a final, classification experiment, subjects had to assign the sounds toone of four material categories (‘glass’, ‘wood’, ‘steel’, and ‘rubber’). This ex-periment is very similar to that of Hermes (1998) mentioned above, withoutthe least well established category ‘plastic’. The results showed an influenceof frequency and especially of decay. Decay parameters associated with eachcategory were consistent with data reported by Wildes and Richards (1988).The conclusion that decay plays a larger role than frequency in material per-ception, contradicts the conclusion of Lutfi and Oh (1997) that subjects exces-sively listened to frequency. However, Lutfi and Oh used a model with decayproportional to frequency cubed instead of linearly proportional to frequencyas in the study by Klatzky et al. (2000). In the study of Hermes (1998), whichshowed a clear percept of material, decay did not depend on frequency butwas equal for all partials. Gaver (1988) found that the sounds made by vi-brating wood decay quickly, with low-frequency partials lasting longer thanhigh ones, while the sounds made by vibrating metal decay slowly, with high-frequency partials lasting longer than low ones. However, just as Klatzkyet al. (2000), he proposed a model with a decay rate which is linearly pro-portional to frequency and which includes a factor corresponding to material.Although these studies with simplistic sum-of-decaying-sinusoids sounds un-derestimate the complexity of the relation between decay rate and frequencyas well as their dependence on material properties (see, for instance, Chaigne

18

Page 30: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.4 Previous studies

and Lambourg 2001), a linear function is a good approximation for many ma-terials.

Cabe and Pittenger (2000) studied the human sensitivity to acoustic informa-tion from vessel filling. In a first experiment, subjects listened to recordingsof filling, emptying, and constant-level events produced by pouring water ina vessel, and had to classify the sounds as belonging to one of these threesplashy event types. They were able to identify the events accurately. In anext experiment, subjects had to fill a vessel to the brim or to their preferreddrinking level somewhat below the brim. In a first session, subjects only heardthe sounds created by filling. In a second session, they held and saw the ves-sel resulting in haptic, visual, and auditory information. No feedback on theirperformance was given. Results showed that subjects distinguished the twotarget levels and were able to control vessel filling with reasonable consistency.Addition of haptic and visual information to auditory information improvedboth accuracy and consistency of control. According to the authors (p. 316),“participants used changing fundamental resonant frequency (FRF) as infor-mation to control vessel filling”. In a third experiment, blind and sighted sub-jects filled three different vessels (differing in width and height), each at twomaximum flow rates, to the brim. Only auditory information was available.No feedback on fill accuracy was given although some subjects noted that theycould hear when overflows occurred. Although some subjects showed largeconstant errors with large variations, most (blind and blindfolded sighted)subjects filled the vessels close to the brim. Many overflows occurred, in spiteof instructions to avoid them, especially for smaller vessels and faster flow.The authors state that “pouring real water into real containers produced a verymessy acoustic signal. Consequently, naive participants’ ability to control flowso accurately is surprising.” (p. 318). However, it may well be that subjectswere able to do the filling task precisely because of this ‘messy’ splashy sound.The authors claim that the acoustic information for vessel filling is the funda-mental resonant frequency, but the real recordings of filling sounds are richin acoustic properties and may contain a lot more information than only FRF.Moreover, FRF is probably not a reliable cue for deducing the water level asthis frequency and its rate of change depends on the radius of the vessel and isnot absolutely defined, even not if the vessel is full. It is doubtful whether theFRF formula presented by the authors is still valid for small column lengths.From this formula the authors derived an acoustic pattern of change specify-ing the time remaining until a vessel becomes full (analogous to optical tau)to support the availability of an invariant for time-to-full based on FRF. Thesensitivity of subjects to the acoustic time-to-full was tested in a fourth exper-

19

Page 31: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

iments. Blindfolded subjects listened to a single vessel being filled with waterto three levels below full, at three possible rates, and said stop at that mo-ment in time they thought the vessel would have been completely filled to thebrim if the water had continued to flow. Results showed that subjects werereasonably accurate in their estimated time to full, except for slow rises in thewater level, which led to underestimates. Rapid rises in the water level ledto the most accurate estimates. The authors advocate the search for tau-likevariables for additional classes of events.

Lutfi (2001) investigated the auditory discrimination between hollow andsolid bars. The method is similar to the one he used earlier in the study of au-ditory discrimination of material changes in a struck-clamped bar, describedabove (Lutfi and Oh, 1997). However, in the earlier study, acoustic variationwas introduced by perturbing the physical attribute to be discriminated (ma-terial), whereas in this study a physical attribute (bar length) unrelated to theone to be discriminated (hollow/solid) is perturbed. Therefore no ambiguityregarding the two classes to be discriminated exists; the bar is either hollow orsolid. Using the theoretical equations describing the motion of a bar, rigidlyclamped at one end and struck at the other, the sound of hollow and solid iron,wooden, and aluminium bars were synthesized. Since in real impact soundsonly the first three partials are audible, only these were synthesized. The val-ues of the hollow inner radius of the bar were chosen individually for eachlistener and condition, because, according to the author, the average perfor-mance levels must lie between 70% and 90% correct, in order to ensure reliableestimates of listener decision weights. The sounds were presented pairwise tolisteners. Within each block of trials, one sound was that of a hollow bar, theother of a solid one with the same physical attributes except for a slight dif-ference in length (and, of course, a difference in inner radius). Subjects hadto choose the hollow bar and received feedback about the correctness of theirresponse. Before each block of trials, subjects were informed about the mate-rial of the bars and were given a visual depiction of a real bar of the same sizeas those synthesized. Combinations of frequency, amplitude, and decay ofa partial uniquely differentiate hollow from solid bars, allowing the bar to beidentified as hollow or solid without error. (In the earlier study discriminationof material without error was not possible). Moreover, only two of the threeparameters are needed to determine the hollowness of the bar. By regres-sion of the listeners’ responses on the frequency, intensity, and decay of theindividual partials of the sounds, the listeners’ decision weights were deter-mined. Results showed that about half of the listeners determined hollownessby focussing on frequency and decay consistent with the analytic solution,

20

Page 32: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.4 Previous studies

and that the other half gave predominant weight to frequency in accordancewith the results of Lutfi and Oh (1997). Lutfi demonstrated that consider-ing human sensitivity, an optimal listening strategy, dictated by an analyticsolution for hollowness, results at best in only a small performance advan-tage over judgments based on frequency alone. Furthermore, frequency is themost salient cue, and thus the author expected that listeners would often makejudgments based on frequency alone instead of based on an optimal combina-tion of acoustic parameters. In his discussion, the author emphasizes a num-ber of methodological aspects in which his study differs from other studies onperception of sound source properties. First, all relevant information for thespecific task is contained in the equations for motion used to synthesize thesounds which allows correlating the listeners’ responses to all independentsources of information. Second, the conditions were chosen to yield optimalperformance levels (between 70% and 90% correct responses). Third, the ef-fect of limited human sensitivity on source identification was considered (anddemonstrated).

Related studiesSeveral related studies have investigated the free identification of a broadrange of everyday sounds. For instance, Vanderveer (1979) studied the freeidentification and classification of complex acoustic events such as jinglingkeys and knocking. The general conclusion is that these sounds can easily beidentified as belonging to classes of sound producing events. However, it isnot clear wether the results “demonstrate a direct auditory perception of thephysical cause of the event, or a more likely post-auditory semantic recon-struction based on recognizable acoustic characteristics of the source materi-als and knowledge of the ways they are excited” (McAdams, 1993, p. 179).Another example is a study of Ballas (1993), investigating factors that are in-volved in the identification of brief everyday sounds. Results indicated thatacoustic variables, ecological frequency, causal uncertainty, and sound typi-cality played a role in the identification of the sound.

Besides auditory event perception, other issues in auditory research are of in-terest for the perception of everyday sounds. For example, no mention hasbeen made of sound localization, the human ability to localize sound in space(Blauert, 1997). Also, the field of auditory scene analysis (Bregman, 1990), con-cerned with the perceptual process of systematic grouping of acoustic compo-nents into auditory streams, each of which may be perceived separately, hasnot been discussed. Similarly, the study of perceptual phenomena such as

21

Page 33: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

loudness perception, pitch perception, and masking (Moore, 1997), which arepart of the field of psychoacoustics, has been left out, although this knowledgeis indispensable for the generation and implementation of good nonspeechsounds.

1.5 Research methodologyThe methodology we will use in this thesis arises from an ecological approachto auditory event perception. A schematic overview is given in Figure 1.1(Li et al., 1991). Three interlinked stages are identified: The auditory percep-tion of an acoustic source event, mediated by an acoustic structure. This is illus-trated in Figure 1.1 by the three boxes and the arrows connecting them. Inorder to perceive properties of acoustic source events, those events must pro-duce sounds with an acoustic structure that must be recovered by the listener.Moreover, relevant acoustic structures must be mapped to auditory sourceattributes. These acoustic structures that the listener uses in identifying theacoustic source will be referred to as ‘cues’.

Also depicted in Figure 1.1 is a strategy for investigating the perception ofsource events, which consists of an evaluation of the pairwise relationshipsbetween the three stages, referred to as analysis 1, 2, and 3, and indicated bythe solid brackets in Figure 1.1. In the first stage, we investigate the relationbetween acoustic source events and auditory perception, that is, to what ex-tent a listener is able to identify and discriminate characteristics of the sourceevent. In the second stage we analyze the acoustic waveforms for perceptu-ally relevant acoustic structures produced by the acoustic source events. Inthe last stage we evaluate the link between identified acoustic structures andlisteners’ perception. For this, sounds may be manipulated and presented tolisteners to investigate the relationship between acoustic structures and au-ditory perception. Analysis 1 concerns what McAdams (2000) has called the‘psychomechanics’ of sound sources, which may be distinguished from anal-ysis 3 which is the domain of psychoacoustics.

In summary, we try to find the relationship between perceptual characteristicsof acoustic source events and the physical properties of their sources, withthe aim of discovering acoustic structures of sounds that humans can use inidentifying sound-producing sources.

It should be stressed that this strategy for investigating the perception ofsource events is not a sequence of three analysis stages, but a cycle. After

22

Page 34: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.5 Research methodology

'

&

$

%acousticsourceevent

'

&

$

%acousticstructure

'

&

$

%auditory

perception- -

analysis 1

analysis 2 analysis 3

Figure 1.1: Research methodology (Adapted from Li et al. 1991).

the evaluation of acoustic structures (analysis 3), new insight should lead tonew ideas about the link between the acoustic source event and acoustic struc-ture (analysis 2), or even about the relation between acoustic source event andauditory perception (analysis 1).

The acoustic source event may be a real event (possibly recorded) or a synthe-sized virtual event obtained by, for example, solving a set of partial differentialequations or using a computer model (e.g. a finite element model) or a physi-cally inspired model like that of Gaver. In the latter case the acoustic structureis generated directly and it may be argued whether an acoustic source eventis involved at all. The listener conceives an auditory image of the (real or vir-tual) acoustic source event, which may aptly be called an auditory event. Thesynthesis of the sound source is successful if the listener cannot distinguishbetween the real and virtual event, that is, if the auditory image provoked bythe synthesized sound equals that provoked by the real event.

Analysis 1: acoustic source event – auditory perceptionThe studies discussed in the previous subsection all demonstrate the humanability to identify properties of acoustic source events. All studies started toexamine the relationship between acoustic source events and perceptual cate-gorization of the stimuli.

In most of the studies discussed in the previous section, perception experi-ments with real (live or recorded) sounds were carried out. However, in thestudies of Lutfi and Oh (1997), Hermes (1998), Klatzky et al. (2000) and Lutfi(2001), synthesized everyday sounds were used as a starting point. If param-eters that are involved in the synthesis are directly linked to source parame-ters, the perception experiments may be considered as pertaining to analysisstage 3.

23

Page 35: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

Analysis 2: acoustic source event – acoustic structureSome of the studies mentioned in the previous section derived ‘timbral predic-tors’ from acoustical representations of the sounds (Repp, 1987; Freed, 1990;Li et al., 1991). These measures can be used to classify the sounds along thevarious dimensions of the sound-producing event. However, to obtain insightinto the relationship between the physical properties of the sound source andthe acoustic structure of the sound wave (analysis 2), a physical (mechanical)analysis of the sound production process itself is needed. Gaver (1994) ar-gued that such an analysis may reveal what source properties people are ableto perceive, as well as which acoustic information is related to these proper-ties. For instance, Warren and Verbrugge (1984) observed differences in tem-poral patterning between bouncing and breaking events. Lakatos et al. (1997)and Kunkler-Peck and Turvey (2000) analyzed characteristic vibration modesof struck plates and bars, respectively, and posed these as acoustic structuresspecifying the objects’ geometry. However, Houix et al. (1999) showed that lis-teners judge the sound of metal bars struck at varying positions on the basisof the most prominent pitch heard in the sounds.

Analysis 3: acoustic structure - auditory perceptionThe link between identified acoustic structures and their perceptual relevancecan be investigated in different ways. One possibility is to correlate the iden-tified acoustic structure with the results of the perception experiments per-formed in analysis stage 1 (acoustic source event – auditory perception). Manystudies discussed in the previous section perform a correlational analysis(Repp, 1987; Freed, 1990; Li et al., 1991; Lakatos et al., 1997; Carello et al., 1998).However, often the measures are not independent. Several of the studies per-formed a factor analysis to reduce the number of factors. But still perceptionexperiments with independent variation of factors should be conducted toverify that each factor specifies correctly its presumed source property andthis source property only.

Another possibility is to manipulate the recorded sounds with digital signalprocessing techniques. The presumed acoustic structure is a legitimate oneif after manipulation and presentation to listeners the percept of the corre-sponding source characteristic changes (preferable in a predicted manner),while other source characteristics remain the same. Examples are the stud-ies of Warren and Verbrugge (1984) and Li et al. (1991).

A third possibility is analysis-by-synthesis. Instead of recording and manip-ulation of sounds, sounds are synthesized directly. If the existing knowledge

24

Page 36: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.6 Scope and overview of this thesis

about a particular sound-producing event is sufficient to synthesize it, it has toprovoke a percept of that event. Several studies discussed in the previous sec-tion used this approach (Gaver, 1988; Lutfi and Oh, 1997; Hermes, 1998; Lutfi,2001; Klatzky et al., 2000). Based on the physics of the event, Gaver devel-oped algorithms for generating realistic sounds (Gaver, 1993a; Gaver, 1994).Several parameters could be adjusted and were directly linked to perceptu-ally relevant source properties like size and material of the objects and typeof interaction. Some of the resulting sounds were very realistic (e.g. singleimpacts, bouncing and breaking sounds), others (e.g. rolling sounds and ma-chine sounds), were less successfully implemented. Gaver did not test thesesynthesized sounds in perception experiments to verify the link between theparameters and the corresponding property of the source event.

1.6 Scope and overview of this thesisThe aim of this thesis is to better understand how rolling sounds are perceived.Are listeners able to identify or discriminate physical properties of the rollingevent? If so, which features in the sound do people listen to and how dothey relate to the perceived sound source properties? We concentrate on theauditory perception of the size and the speed of rolling balls. The advantageof these ordinal properties over categorical properties like the material type isthat they can be varied on a continuous scale. A practical advantage is thatballs different in size are easy to obtain and can easily be rolled at differentspeeds, whereas material type, for instance, is hard to vary without covaryingthe surface roughness. Wooden balls rolling over a wooden plate were usedbecause their surfaces are less smooth than those of, for instance, metal balls.This results in rolling sounds that are louder, richer in information and, inour opinion, more characteristic of rolling. Another advantage of wood oversmooth surfaces like metal and glass is that, because of the coarse surface ofwood, impurities (e.g. dust) have less effect on the sound. If, for instance,an almost perfectly spherical steal ball rolls over an extremely smooth glasssurface, an impurity may be heard as one or several ticks in an otherwise‘smooth’ and soft sound. Although we use spheres as rolling objects, it islikely that rolling cylinders produce similar rolling sounds and provoke thesame kinds of percepts.

We have chosen to study rolling sounds for two reasons. On the one hand,it will contribute to our fundamental understanding of the relation betweenthe physical attributes of an acoustic source event and the human perception

25

Page 37: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

of these attributes. The sound of a ball rolling on a certain surface is interest-ing from a scientific viewpoint because of its continuous quasi-regular charac-ter caused by the revolving surface, in combination with a stochastic elementcaused by irregularities of the interacting surfaces. In contrast to the class ofsingle impact sounds (e.g. a mallet hitting a wooden bar), which is often stud-ied and synthesized, the physics underlying the sound of rolling is not wellunderstood and highly complex (Stoelinga, 2001).

On the other hand, several possible applications come to mind. For instance,in user-interfaces, the rolling ball can be used as a metaphor for cursor move-ment resulting from moving a mouse or a trackball. By adding sound to atrackball with force feedback, the tactile, visual, and auditory modality arecombined. If we are able to synthesize parameterized rolling sounds, it canbe used to convey information by mapping physical attributes of the rollingprocess, such as speed, size, material and surface condition, to information tobe conveyed. In this way the ease-of-use and performance of users in generalmay be improved and visually disabled users in particular may benefit fromthe additional auditory information. It will also enable the study of multisen-sory interaction by presenting the same rolling object simultaneously in differ-ent modalities. Rolling sounds can also be used in advanced multi-media ap-plications, animated movies, and interactive environments, like video gamesand virtual reality.

In Chapter 2 of this thesis, perception experiments with real original record-ings of rolling balls are described (analysis 1). In experiment I (Section 2.3), theperception of the size of rolling balls is investigated, whereas in experiment II(Section 2.4), the perception of the speed of rolling balls is studied. ExperimentIII (Section 2.5) combines variations in size and speed to investigate interac-tion effects when judging the size or speed of rolling balls. In Section 2.6 someavailable auditory cues within the sounds are examined (analysis 2). Conclu-sions are summarized in Section 2.7.

Chapter 3 of this thesis reports perception experiments with manipulatedsounds. Sounds were analyzed, manipulated and resynthesized and used inperception experiments to investigate whether listeners use spectral or tempo-ral information to judge the size and speed (analysis 3). Section 3.2 describesthe sound manipulation algorithm. The influence of this algorithm on thesound was tested in pilot experiments which are presented in Section 3.3. Themain perception experiments are reported in Section 3.4.

In Chapter 4 amplitude modulation is applied to rolling sounds to investi-

26

Page 38: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1.6 Scope and overview of this thesis

gate the influence of angular speed on the perception of size and speed ofrolling balls (analysis 3). Section 4.2 describes amplitude modulation. Thesame experiments on the perception of size and speed as described in experi-ment I, Section 2.3, and experiment II, Section 2.4, were carried out using thesame sounds but the sounds were now provided with amplitude modulation.This is described in Section 4.3. Section 4.4 reports a perception experiment inwhich both size and linear speed were varied, and amplitude modulation witha rate matching the natural angular speed was added. Section 4.5 describesperception experiments in which the angular speed was varied independentlyby adding amplitude modulation at three frequencies (one of which was thenatural frequency). It is followed by a discussion in Section 4.6. In Section 4.7,the results of perception experiments that were reported in the previous chap-ters, are reanalyzed in terms of differences in angular speed.

The thesis concludes with a general discussion in Chapter 5.

27

Page 39: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

1 General introduction

28

Page 40: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2Perception experiments with

recorded sounds

This chapter reports three experiments investigating the link between acousticsource events and auditory perception. In experiment I, listeners were askedto discriminate differences in the size of wooden rolling balls on the basis ofrecorded sounds. Results showed that they are able to choose the larger ballfrom paired sounds. In experiment II, the auditory perception of the speedof rolling balls was examined. Although listeners are able to discriminate thesounds of rolling balls with different speed, some of them reverse the label-ing of the speed. In experiment III, the interaction between size and speedwas tested. Results indicated that if the size and the speed of a rolling ball arevaried, listeners generally are still able to discriminate size and speed, but thejudgment of speed is influenced by the variation in size. Investigation of audi-tory cues (acoustic structures) that listeners may use in their decisions showeda conflict in available cues (centroid of specific loudness) when varying bothsize and speed, which is in accordance with the interaction effect.

Page 41: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2 Perception experiments with recorded sounds

2.1 IntroductionThe first step into investigating the auditory perception of the size and thespeed of rolling balls is to determine whether listeners are able to discriminatethe size and the speed of rolling balls. In addition, we would like to knowhow much this ability depends on the physical difference of the attribute thatlisteners have to discriminate. This is analysis 1 (the relation between acousticsource event and auditory perception) in the research methodology depictedin Figure 1.1.

First, it was examined whether subjects are able to discriminate differences inthe size of rolling balls by listening to recorded sounds of wooden balls rollingover a wooden surface. Based on informal listening, we selected ‘smooth’sounds with no conspicuous amplitude modulation (which may depend onthe shape of the ball, that is, the deviation from perfect sphericity) or roughticks (which are irregularly distributed in time and possibly caused by bounc-ing of the ball). Sounds of rolling balls different in size but rolling at thesame speed were presented pairwise and listeners had to choose the soundproduced by the larger ball. This perception experiment is reported in Sec-tion 2.3. In a following experiment, reported in Section 2.4, it was investigatedwhether listeners are able to discriminate differences in the speed of rollingballs. Smooth sound recordings of wooden balls equal in size and rolling atdifferent speeds were used. In a final experiment, reported in Section 2.5, theinteraction between size and speed was tested. Subjects listened to pairwiserecordings of wooden balls rolling over a wooden surface, varying in both sizeand speed. In one session subjects had to decide which of the two sounds in apair was produced by the larger ball, whereas in another session the task wasto decide which of the two sounds was produced by the faster rolling ball.

If subjects are able to correctly associate the recorded sounds with the size orspeed of the rolling balls, the results can be used to search for perceptuallyrelevant acoustic structures (auditory cues) produced by the rolling ball. Theinvestigation of how size and speed affect auditory cues that may be used bylisteners, is started in Section 2.6. This is done by analyzing the temporal andspectral content of the sounds that were presented to listeners in the first twoexperiments, as a function of the size and speed of the rolling ball. This isanalysis 2 (relation between acoustic source event and acoustic structure) inour research methodology depicted in Figure 1.1.

In Section 2.7, the conclusions are summarized. Parts of this chapter are alsopresented in Franssen (1998), Houben et al. (1999a), and Houben et al. (1999b).

30

Page 42: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2.2 Recording method

2.2 Recording method

Sound recordings were made of wooden balls rolling over a wooden plate,which was positioned on a supporting table. To reduce the sound producedby the table, a layer of felt was placed between the plate and the table. Byreleasing the ball from various heights using a gutter, a range of initial speedswas obtained. The recordings were performed in the center of a sound ab-sorbing room. The microphone (B&K 4003) was positioned halfway along therolling path, that is 50 cm away from the gutter in the rolling direction. Thedistance from the center of the rolling path was 42 cm vertically and 40 cmhorizontally. The sounds were recorded on DAT-tape. Due to the suddenchange in slope from gutter to plate, a single tick is heard at the beginning ofthe sound signal. A metal strip was placed at the end of the plate to inducea second tick. The mean speed was calculated based on the time between thetwo ticks in the sound signal. A total of 21 balls (beech with a mean densityof 710 kg/m3) were used to record the sounds. For each of the 7 different di-ameters, ranging from 22 to 83 cm, we had 3 different balls. Each of these 21balls was recorded at 7 different speeds, and all recordings were repeated 3times. This resulted in a total of 441 (7x3x7x3) recorded sounds. Since, dueto the production process, wooden balls are not perfectly spherical, the axisof rotation was chosen such that the rolling sound contained as little ampli-tude modulation as possible. For the repeated recordings, this axis was keptconstant.

2.3 Experiment I: Perception of the size of rolling balls

2.3.1 Method

Sound recordings of wooden balls with mean speeds of 0.75 m/s and diame-ters of 22, 25, 35, 45, 55, 68, and 83 mm were used. The selected sounds weremoderate and smooth, that is, no conspicuous amplitude modulation or roughticks were audible within the sounds. The stimuli were presented pairwise.Pairs consisted of balls adjacent in size resulting in 12 pairs (6 combinations,each in two orders). The stimuli were cut out of the middle of the recordedsounds, thus leaving out the onsets and offsets. The duration of the stimuliwas 800 ms and they were presented with 700 ms silence between them. Thestimuli were faded in and out over 10 ms by means of a Hanning window. Tosuppress size cues caused by sound level differences, the levels of the stimuli

31

Page 43: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2 Perception experiments with recorded sounds

Figure 2.1: Experimental setup.

were equalized to RMS values corresponding to 80 dB SPL1. In this way, sub-jects cannot simply judge the size by listening to the loudness of the sounds,because only temporal and spectral cues are available to them.

Eight subjects participated in the experiment. They were seated in a sound-proof booth and listened to the stimuli with Beyer DT 990 headphones. Theywere not familiar with the sounds. After listening to each pair of sounds, thesubjects had to decide which of the two sounds in a pair was created by thelarger ball (2I2AFC procedure). They did not receive any feedback about thecorrectness of their responses. It was not mentioned in the instructions that thespeed was the same for all rolling balls. The 12 stimulus pairs were presentedfour times in random order and were preceded by 10 test pairs covering therange of stimuli. Figure 2.1 depicts the experimental setup.

2.3.2 Results

The two permutations of the same stimulus combination (e.g. 35-45 and 45-35), were considered together, resulting in 8 repetitions for every combination.

1In general, sounds produced by large balls are louder than sounds produced by smallballs, just as one would expect. However, in the case of a thin plate, the loudness level ofa large ball is lower than that of a small rolling ball. This counterintuitive phenomenon iscaused by the reduced radiation of low frequencies for thin plates (Stoelinga, 2001).

32

Page 44: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2.3 Experiment I: Perception of the size of rolling balls

Figure 2.2 represents the medians and quartiles of the individual percentagecorrect responses (Pc) values for the various stimulus pairs. The upper andlower boundaries of the 95% confidence interval for guessing (determinedfrom the binomial distribution) correspond to Pc values of 61% and 39%, re-spectively. This means that the listener’s responses deviated significantly fromguessing, if the Pc was higher than 61% or lower than 39%. The latter case, notencountered here, would arise when the listener is able to discriminate the sizeof a rolling ball, but consistently mistook the smaller ball for the larger one, or,in other words, systematically reversed the labeling.

22−25 25−35 35−45 45−55 55−68 68−83 0

50

100

Pc

(%)

diameter of first and second ball (mm)

Figure 2.2: Results of experiment I. Percentage correct responses, Pc, is represented as a func-tion of the stimulus pair. The horizontal dashed line is the upper boundary of the 95% confi-dence interval for guessing. Medians are shown by dots and quartiles by horizontal bars. Thenumber pairs on the abscissa denote the two diameters of the rolling ball in the stimulus pair.

2.3.3 Discussion

Figure 2.2 shows that subjects are generally capable of discriminating betweenthe sounds of rolling balls with different sizes. The percentages of correct re-sponses, Pc, reveal that the task was neither too easy (which would resultin saturated responses approximating 100% correct) nor too difficult (whichwould result in an expected percentage of correct responses of 50%). Only forthe stimulus pair comprising diameters 22 mm and 25 mm Pc was not signifi-cantly above chance. All subjects performed near chance for this stimulus pair,which implies that no reverse labeling occurred. The relative difference in sizebetween the two balls in this particular stimulus pair (14%) is the smallest ofall stimulus pairs and probably too small to be perceived by the subjects.

33

Page 45: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2 Perception experiments with recorded sounds

2.4 Experiment II: Perception of the speed of rollingballs

2.4.1 Method

Sound recordings of wooden balls with a diameter of 45 mm, recorded as de-scribed in Section 2.2, were used. Seven representative samples were chosenwith mean speeds of 0.36, 0.50, 0.63, 0.71, 0.79, 0.87, and 0.93 m/s. The inten-sity cue - balls rolling quickly produce a louder sound than balls rolling slowly- was eliminated by adjusting the stimuli to an equal overall RMS value cor-responding to 80 dB SPL.

First, a pilot experiment was conducted in which all eight subjects of the pre-vious experiment participated. The same experimental set-up and procedurewas used as in experiment I, except that pairs consisted of the stimuli bothnearest and second nearest in speed to the other stimulus in the pair, resultingin 22 pairs altogether. The stimulus pairs were presented four times in ran-dom order and were preceded by 7 test pairs covering the range of stimuli.No prior information was given about the fact that the size of the balls wasthe same for all stimuli. The results of this pilot experiment showed that thevariation in performance across subjects was very large. The mean Pc valuesof individual subjects covered a range of 50% to 80%. Closer inspection ofthe data indicated that several subjects obviously reversed the labeling - thesounds could be discriminated but instead of the faster ball, the slower onewas chosen, resulting in a Pc value clearly below chance. In addition, somesubjects indicated that they found it difficult to make a distinction betweenthe size and the speed of the ball, not realizing that the size was kept constant.

For this reason, the main experiment was conducted in a slightly differentway. Six naive subjects, who had not participated in the earlier experiments,were presented with all possible stimulus pairs composed of two differentstimuli, that is, the entire domain of stimulus pairs, minus the main diago-nal, was covered, resulting in 42 pairs. Within a session, these 42 stimuluspairs were presented four times in random order. Each subject participatedin four sessions. The two permutations of the same combination were treatedtogether, resulting in 32 repetitions for 21 combinations. The stimuli as wellas the experimental set-up remained the same as in the pilot experiment. Thedifference between this experiment and the pilot experiment was that, beforegiving a response, the subjects could repeat the stimulus presentation as of-ten as they wanted. No feedback was given about the correctness of the re-

34

Page 46: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2.4 Experiment II: Perception of the speed of rolling balls

sponses. It was now explicitly mentioned that the size of the balls was equalfor all sound samples.

2.4.2 Results

The results per subject are shown in Figure 2.3. The squares denote stimuluspairs (both permutations of the same combination considered together). Thepercentage of correct responses, Pc, of 32 presentations of the stimulus pairin question is printed within these blocks. The mean Pc value per subject isprinted in the top-left corner of the panels. The upper and lower boundariesof the 95% confidence interval for guessing, per pair, correspond to Pc val-ues of 65% and 35%, respectively. A Pc value below the lower boundary of35% indicates that the subject has not simply failed to perceive a difference(which would result in a Pc value close to 50%) but systematically reversedthe labeling.

mean Pc: 95%

84 97

88

100

97

100

100

100

97

81

100

97

97

97

78

100

94

100

100

97

91 subject 2 mean Pc: 11%

0 0

9

0

19

34

3

0

3

13

0

3

13

31

72

3

3

0

6

13

16 subject 3

.50 .63 .71 .79 .87 .93speed (m/s)

mean Pc: 51%

59 38

44

44

41

53

34

41

41

66

34

19

25

41

22

53

75

72

91

97

91 subject 5

.50 .63 .71 .79 .87 .93speed (m/s)

mean Pc: 85%

66 84

69

91

97

81

100

100

88

59

97

94

97

88

69

100

94

94

91

75

59 subject 6

.36

.50

.63

.71

.79

.87

spee

d (m

/s)

mean Pc: 93%

59 94

91

100

100

91

100

100

97

63

100

100

97

97

100

100

100

100

100

100

66 subject 1

.50 .63 .71 .79 .87 .93

.36

.50

.63

.71

.79

.87

spee

d (m

/s)

speed (m/s)

mean Pc: 76%

31 38

50

75

91

78

69

81

81

56

94

94

91

75

69

84

100

97

91

91

53 subject 4

Figure 2.3: Results of experiment II. Percentage of correct responses is depicted per subjectand per stimulus combination. In the top-left corner of the panels the mean percentage ofcorrect responses is given.

2.4.3 Discussion

All subjects except subject 5 (see Figure 2.3) could clearly discriminate be-tween rolling balls with different speeds. Performance was better than in the

35

Page 47: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2 Perception experiments with recorded sounds

pilot experiment. A clear trend can be observed in the data of all subjects: er-rors primarily occur for stimulus pairs close to the diagonal implying a smalldifference between the speeds in a pair. One subject (subject 3) reversed the la-beling of the sounds, resulting in percentage correct responses close to 0%. Hisinverted responses are similar to the responses of subjects 1, 2 and 6. After theexperiment this subject was asked (deliberately opposite to the instructions)whether he found it difficult to choose the slower rolling ball from the twosounds and he ‘corrected’ the question by saying that he had chosen the fasterball without great difficulties, thus affirming that he understood the instruc-tions given to him.

These results were confirmed by five other subjects who participated in thisexperiment for only one session (i.e. 8 repetitions of 21 stimulus combina-tions). The average Pc of two of the subjects was between 75% and 95% andthus significantly above chance. The average Pc of the other three subjectswas about 50%. Two of them produced very low Pc values for specific stim-ulus pairs, compensated by high Pc values for other pairs, indicating that forthose two subjects labeling was the problem rather than discrimination.

Inspection of the change of the results over the four sessions indicates that theperformance of subjects 1 to 4 and 6, who can clearly discriminate betweenballs rolling with different speeds, remained the same. In other words, theycould discriminate speed during the entire length of the experiment. How-ever, the performance of subject 5 declined enormously during the four ses-sions, from a mean Pc value of 80% to 30%, indicating an increase in reverselabeling of the sounds. The average Pc value of 51% should therefore not beinterpreted as reflecting chance performance. Obviously subjects are able todiscriminate between rolling balls with different speeds, but some subjects re-versed the labeling. Supplying feedback about the correctness of the responsemight resolve this reverse labeling, but then we would not know whether theparticipants would indeed base their judgments on the perception of speed.

In all the experiments, the loudness cue was largely eliminated by equalizingthe levels of the stimuli to RMS values corresponding to 80 dB SPL. If thestimuli were not equalized in intensity level, subjects would have been able touse differences in sound level to discriminate between differences in speeds,simply because balls rolling fast produce a louder sound than balls rollingslowly. This was tested by an experiment identical to the main experimentdescribed above, with the difference that the original sound pressure levelsof the stimuli were retained. Six subjects, who had not participated in any ofthe earlier experiments, were presented with 32 repetitions of the 21 stimulus

36

Page 48: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2.5 Experiment III: Interaction between size and speed

combinations. The results are shown in Figure 2.4. The average Pc of thesubjects was between 89% and 98% with a mean of 94%. This result confirmssubjects’ increased ability to correctly judge the speed of rolling balls if theloudness cue is available.

mean Pc: 95%

78 97

88

94

97

100

100

100

100

78

100

100

100

91

94

97

97

97

94

100

88 subject 2 mean Pc: 90%

78 91

84

100

91

88

100

97

88

50

97

94

94

88

81

97

100

100

100

97

88 subject 3

.50 .63 .71 .79 .87 .93speed (m/s)

mean Pc: 98%

94 97

97

100

100

97

100

97

100

88

100

100

100

97

100

100

100

100

100

100

100 subject 5

.50 .63 .71 .79 .87 .93speed (m/s)

mean Pc: 91%

94 100

38

100

81

88

100

100

94

81

100

100

94

100

50

100

100

100

94

94

100 subject 6

.36

.50

.63

.71

.79

.87

spee

d (m

/s)

mean Pc: 89%

94 97

81

81

97

97

88

91

84

84

88

94

88

88

75

91

94

81

97

91

84 subject 1

.50 .63 .71 .79 .87 .93

.36

.50

.63

.71

.79

.87

spee

d (m

/s)

speed (m/s)

mean Pc: 95%

100 97

84

97

100

100

100

97

97

78

97

97

91

94

84

97

97

97

97

100

97 subject 4

Figure 2.4: Results of experiment II with original sound levels retained. Percentage of correctresponses is depicted per subject and per stimulus combination. In the top-left corner of thepanels the mean percentage of correct responses is given.

2.5 Experiment III: Interaction between size and speed

2.5.1 Method

Sound recordings of wooden balls rolling over a wooden surface were used.The sounds were recorded as described in Section 2.2. Both size and speed ofthe balls were varied. Eight sounds were used: s1v1, s2v1, s3v1, s1v2, s3v2,s1v3, s2v3, and s3v3. Here, v1, v2 and v3 denote velocities of approximately0.40, 0.60 and 0.90 m/s, and s1, s2 and s3 denote sizes of 25, 45 and 83 mm indiameter.

This experiment was carried out by twelve subjects who had not participatedin the preceding experiments. The stimuli were presented pairwise. The stim-ulus pairs were presented 8 times in random order (both permutations of the

37

Page 49: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2 Perception experiments with recorded sounds

same combination treated together) and were preceded by 4 test pairs cover-ing the range of stimuli. The subjects were seated in a sound-absorbing room.A loudspeaker was used to reproduce the sounds recorded on DAT. The stim-uli could not be repeated and the subjects did not receive any feedback on thecorrectness of their responses. The experiment was divided into two sessionswhich only differed regarding the task subjects had to perform and the stimu-lus pairs presented. One task was to decide which of the two sounds in a pairwas produced by the larger ball. The other task was to decide which of thetwo sounds in a pair was produced by the faster rolling ball. The order of thetwo sessions was counter-balanced over the listeners.

To limit the duration of the experiment, not all stimuli were used. The stimuliused in the size judgment task were s1v1, s2v1, s3v1, s1v3, s2v3, and s3v3,that is, three levels of size and two levels of speed. Pairs were constructedby combining two stimuli with different sizes resulting in 12 combinations.This is visualized in the left part of Figure 2.5. The stimuli used in the speedjudgment task were s1v1, s3v1, s1v2, s3v2, s1v3, and s3v3, that is, two levelsof size and three levels of speed. Pairs consisted of stimuli differing in speed,resulting in 12 combinations. This is visualized in the right part of Figure 2.5.

Figure 2.5: Stimulus pairs used in the size judgment task (figure to the left) and speed judg-ment task (figure to the right). Solid lines between stimuli denote pairs that were presented.

2.5.2 Results

Figure 2.6 shows the results of judging the size of the rolling balls averagedacross subjects. The percentage of correct responses is depicted per stimuluspair, where the two stimuli in a pair are indicated by the stimulus in the titleof the panel and the corresponding stimulus positioned at the abscissa. Blackcircles denote stimulus pairs with no difference in speed (only the size differs).Grey crosses denote stimulus pairs for which the two stimuli differ in size aswell as in speed. Outwards (from right to left for the two left panels, from leftto right for the two right panels), the difference in size between the two stimuli

38

Page 50: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2.5 Experiment III: Interaction between size and speed

in a pair increases. The lower and upper boundary of the 95% confidenceinterval for guessing correspond to Pc values of 42% and 58% respectively,and are depicted by horizontal dashed lines. All pairs occur twice in differentpanels to facilitate comparison between pairs.

In Figure 2.7 the results of judging the speed are depicted. Black circles de-note stimulus pairs equal in size but with different speeds, and grey crossesdenote stimulus pairs for which the two stimuli differ in both size and speed.Outwards (from right to left for the two left panels, from left to right for thetwo right panels), the difference in speed between the two stimuli in a pairincreases. Again all pairs are shown twice for reasons of symmetry.

0

50

100

s3v1 s2v1s3v3 s2v3

s1v1

Pc

(%)

0

50

100

s3v1 s1v1s3v3 s1v3

s2v1

0

50

100

s2v1 s1v1s2v3 s1v3

s3v1

0

50

100

s3v3 s2v3s3v1 s2v1

s1v3

Pc

(%)

0

50

100

s3v3 s1v3s3v1 s1v1

s2v3

0

50

100

s2v3 s1v3s2v1 s1v1

s3v3

Figure 2.6: Results of experiment III, size judgment task. Percentage of correct responses av-eraged across subjects is depicted per stimulus pair, comprising the stimulus in the title of thepanel and the corresponding stimulus positioned at the abscissa. v1, v2, and v3 indicate thevelocity of the ball (from slow to fast), s1, s2, and s3 indicate its size (from small to large). Thehorizontal dashed lines indicate the boundaries of the 95% confidence interval for guessing.

2.5.3 Discussion

Figure 2.6 shows that all Pc values are above the upper boundary of the 95%confidence interval for guessing (Pc of 58%) for the size judgment task. AllPc values lie around 90% except for the pair comprising stimuli s2v1 and s3v3with a Pc value of 68% (see bottom right panel and top middle panel). Note

39

Page 51: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2 Perception experiments with recorded sounds

0

50

100

s1v3 s1v2s3v3 s3v2

s1v1P

c (%

)

0

50

100

s1v3 s1v1s3v3 s3v1

s1v2

0

50

100

s1v2 s1v1s3v2 s3v1

s1v3

0

50

100

s3v3 s3v2s1v3 s1v2

s3v1

Pc

(%)

0

50

100

s3v3 s3v1s1v3 s1v1

s3v2

0

50

100

s3v2 s3v1s1v2 s1v1

s3v3

Figure 2.7: Results of experiment III, speed judgment task. See Figure 2.6 for a description.

that the larger ball in this pair also has a greater speed. When the differencein size is larger (pair comprising s3v3 and s1v1 in the bottom right panel) orwhen the difference in speed is eliminated (pair comprising s3v3 and s2v3in the bottom right panel), the subjects’ ability to identify the larger ball issignificantly better (p < 0.05) compared to pair (s2v1,s3v3). The differencebetween pair (s2v1,s3v3) and (s2v1,s1v1) in the top middle panel, is the onlyother significant difference in Pc between depicted pairs (p < 0.05). So subjectsare generally able to discriminate the size of rolling balls, even when the speeddiffers, though this may decrease the performance.

As can be seen by comparing Figures 2.6 and 2.7, judgment of the speed ofrolling balls is not as easy as judgment of the size of rolling balls for the pre-sented sounds. Increasing the difference in speed (from right to left withinthe two left panels and from left to right within the two right panels) mostlyincreases Pc slightly, but this is only significant (p < 0.05) for the differencebetween the pairs denoted by black circles in the bottom left panel (pair com-prising s3v1 and s3v2, and pair comprising s3v1 and s3v3). Increasing the dif-ference in size, from no difference (black circles) to a difference (grey crosses),significantly diminishes performance levels in the top left panel, left part oftop middle panel, bottom right panel, and right part of bottom middle panel- for which the faster ball is also the larger ball - and improves performance

40

Page 52: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2.6 Auditory cues derived from the stimuli

levels slightly in the other half of the figure (only significant for the differ-ence between pairs (s3v1,s3v2) and (s3v1,s1v2) in the bottom left panel) - forwhich the faster ball is also the smaller ball. Evidently, an increase in speed ac-companied by a decrease in size improves the discriminability of speed, whilesubjects have more difficulties discriminating the speed when both speed andsize are increased. Pc values below the lower boundary of the 95% confidenceinterval for guessing (42%) could result from reverse labeling as observed inexperiment II (Section 2.4.3). However, as Pc values that are this low only oc-cur if an increase in speed is accompanied by an increase in size, it is likelythat subjects do not reverse the labeling of the sounds but attend to auditorycues that are affected by changes in both speed and size. In the next section,the stimuli will be analyzed for possible auditory cues that subjects could usein their decision.

2.6 Auditory cues derived from the stimuli

The perception experiments in this chapter showed that listeners are able tohear the difference between small and large rolling balls, and between slowand fast rolling balls, on the basis of recorded sounds. The interaction effectencountered in experiment III, when both the size and the speed of a rollingball are varied, may be caused by the fact that changes in these two physicalparameters similarly affect the auditory cues used by the listeners in doingtheir task. Auditory cues may be found in both the temporal and the spectraldomain. Except for experiment II repeated with the original sound levels re-tained (Section 2.4.3), the loudness cue was largely eliminated by equalizingthe sound pressure levels of the stimuli and was therefore not available to thelisteners.

Two possible temporal cues may come to mind. First, a ball that is not per-fectly spherical produces a quasi-periodic rolling sound which comprises am-plitude modulation. Since this amplitude modulation may inform the lis-tener about the shape of the ball, that is, the deviation from perfect sphericity,sounds in which no conspicuous amplitude modulation could be heard (basedon informal listening) were selected for the perception experiments reportedin this chapter. Therefore we do not expect to find a clear variation of ampli-tude modulation with the size or the speed of the rolling ball.

A second temporal cue may be a kind of rough ticks, irregularly distributedin time. Especially when the balls are small and roll rapidly, rough ticks can

41

Page 53: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2 Perception experiments with recorded sounds

be heard due to the fact that the ball looses contact with the plate and thenbounces on for a while. These rather unpredictable transitions from rolling tobouncing are well audible. Although the sounds that were used in the percep-tion experiments were relatively ‘smooth’ (on the basis of informal listening),that is, without such extreme bouncing ticks, a complete lack of ticks is in-feasible, because the presence of small impacts caused by irregularities of theinteracting surfaces is an intrinsic part of the stochastic rolling process. Theseimpacts may lead to a sensation of roughness. In Section 2.6.1, the auditoryroughness of the stimuli used in Experiments I and II is evaluated.

Spectral cues may also be involved in the perception of the size and the speedof rolling balls. Several measures describing the shape of the spectrum, suchas the relative proportion of high and low frequency components and theoverall slope of the spectrum, may be of importance. In Section 2.6.2 wepresent a rough measure of the spectral content of a sound, which is a kindof center of gravity of the spectrum, and apply it to the stimuli used in Experi-ments I and II. This measure is related to the ‘brightness’ of a sound (Grey andGordon, 1978) and its choice is motivated by the informal observation thata small ball sounds brighter than a large one, and a fast rolling ball soundsbrighter than a slow one.

2.6.1 Analysis of temporal properties

If a rolling ball bounces, clear peaks can be heard and seen in the waveformof the sound. Even if the ball does not clearly bounce, peaks in the waveformmay help to distinguish one sound from another. A sound with many highpeaks sounds ‘harsher’ than a smooth sound with hardly any peaks. Fig-ure 2.8 shows two example waveforms taken from experiment II, both nor-malized to have equal RMS value. The top panel shows the waveform of arolling ball with a speed of 0.63 m/s, and a diameter of 45 mm. The bottompanel shows a waveform of a ball with the same size but rolling at a speed of0.79 m/s. The peaks in the waveform of the faster ball in the bottom panel areof higher amplitude than those of the slower rolling ball in the top panel.

Several temporal measures were tried. For instance, a tentative measure ofthe ‘spikiness’, the number of peaks per second above a certain level, wascalculated. Only a weak positive correlation between the number of peaks andthe speed could be observed for the sounds used in the experiments, possiblybecause by reducing the temporal content to a single measure of ‘spikiness’a lot of information such as the regularity of the spikes may be lost. It was

42

Page 54: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2.6 Auditory cues derived from the stimuli

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8−10

−5

0

5

10

time (s)

ampl

itude

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8−10

−5

0

5

10

time (s)

ampl

itude

Figure 2.8: Two example waveforms of a rolling ball. The diameter was 45 mm for both soundsources, and the speed was 0.63 m/s and 0.79 m/s for the balls in the top and bottom panel,respectively.

also investigated whether a quasi-periodicity induced by the ball’s imperfectsphericity could be detected within the sounds. Listening to the sounds, afaint amplitude modulation could be heard but this was not reflected by theautocorrelation functions of the sounds (neither broadband nor narrowband)or amplitude envelopes.

A rolling sound has a stochastic character caused by irregularities of the in-teracting surfaces, which can be thought of as a sequence of small impactsoccurring at a rate well above 10 Hz. This may lead to a sensation of rough-ness (Terhardt, 1974). Therefore, the auditory roughness of the sounds wascalculated with an algorithm based on the model of Daniel and Weber (1997).The results for the stimuli used in experiments I and II are shown in Figure 2.9.The gray squares depict the values for the stimuli of experiment I (perceptionof size) and are referred to by the abscissa on top of the figure. Values forthe stimuli of experiment II (perception of speed) are depicted by black circles(abscissa at the bottom of the figure).

The roughness increases with decreasing size and with increasing speed. Thisbehavior corresponds with what we would expect, because the smaller theball, the easier it will move vertically resulting in an impact when it hits thesupporting surface again, and the faster the ball rolls, the more impacts persecond will be generated. In order to judge whether differences in rough-

43

Page 55: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2 Perception experiments with recorded sounds

20 40 60 80 100

0.45

0.5

0.55

0.6

0.65

diameter (mm)

0.2 0.4 0.6 0.8 10.45

0.5

0.55

0.6

0.65

speed (m/s)

roug

hnes

s (a

sper

)

Figure 2.9: Auditory roughness for the stimuli used in experiment I (gray squares) and II(black circles).

ness can be used by subjects in the discrimination task, we have to relate thevalues in Figure 2.9 to psychophysically measured just noticeable differences(jnds). According to Zwicker and Fastl (1999), the jnd for roughness amountsto 17%. The maximum difference in roughness between adjacent stimuli usedin experiment I (size judgment) is about 12%, and the maximum difference inroughness between two arbitrary stimuli used in experiment II (speed judg-ment) is about 18%. So, except for the difference between a very slow and fastball in experiment II, the variation in the auditory roughness due to variationin size or speed is below the threshold of detectability for the stimuli used inour experiments.

In summary, a salient temporal cue varying with size or speed was not found.As mentioned before, this may be due to the fact that the sounds used in theexperiments were deliberately chosen on the basis of the absence of distinctamplitude modulation and excessive ticks.

2.6.2 Analysis of spectral properties

Both variations in size and speed of the rolling ball induce changes in spec-tral shape of the sound, including changes in the proportion of low and highfrequency components, and changes in spectral tilt, which is the overall slopeof the spectrum. Figures 2.10 and 2.11 show the power spectral densities forfrequencies below 12 kHz of the stimuli used in experiments I and II, respec-

44

Page 56: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2.6 Auditory cues derived from the stimuli

tively. The power spectral density is calculated by dividing the signal intohalf overlapping intervals of 5.3 ms (256 points with a sample frequency of48 kHz), windowing by a Hanning window of the same length and averagingthe squared magnitudes of the discrete Fourier transforms of the intervals.

0 4 8 12−60

−40

−20

0

20

frequency (kHz)

pow

er s

pect

ral d

ensi

ty (

dB)

size

22 mm25 mm35 mm45 mm55 mm68 mm83 mm

Figure 2.10: Power spectral density of the stimuli used in experiment I (perception of size).Different line types and shades of gray denote different diameters of the rolling ball, as indi-cated by the legend.

It can be seen that the spectral content of the sound changes with the size of therolling ball as well as with its speed. Furthermore, size and speed influencethe shape of the spectrum in their own way.

To simplify comparisons of spectral changes induced by changes in size orspeed of the rolling ball, the spectral content of a sound is expressed in onenumber. Often a measure of the spectral mean of the sound is used: the cen-ter of gravity of the power spectrum, sometimes called the first moment ofthe spectrum or the spectral centroid. Basically, the spectral centroid is de-fined as the ‘average’ frequency with all the frequency components weighedby their amplitude. It is a physical measure closely correlated with the per-ceptual ‘brightness’ (Grey and Gordon, 1978), and is often used in the field ofmusic cognition and computer music. If two sounds have a radically differentcentroid, they are generally perceived as different in timbre (McAdams et al.,1999). There is some evidence suggesting that the spectral centroid may be

45

Page 57: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2 Perception experiments with recorded sounds

0 4 8 12−60

−40

−20

0

20

frequency (kHz)

pow

er s

pect

ral d

ensi

ty (

dB)

speed

0.36 m/s0.50 m/s0.63 m/s0.71 m/s0.79 m/s0.87 m/s0.93 m/s

Figure 2.11: Power spectral density of the stimuli used in experiment II (perception of speed).Different line types and shades of gray denote different rolling speeds, as indicated by thelegend.

one of the principal physical variables that maps onto the timbral dimensionfor steady-state, instrumental signals (Kendall and Carterette, 1996).

Since we suppose that the listeners’ judgment is based on auditory cues, thecentroid of specific loudness instead of the spectral centroid will be used. Spe-cific loudness, which is the loudness per critical band, takes the frequencyspecific absolute threshold of our ears into account, and is a kind of ‘loud-ness density’. The calculation is based on a model for computing loudness byMoore et al. (1997). Figure 2.12 depicts a block diagram for calculating thecentroid of the specific loudness. The first block describes the transfer fromthe eardrum through the middle ear. A fixed filter transforms the spectrumat the eardrum into the effective spectrum reaching the cochlea. The secondblock calculates the excitation pattern by passing the spectrum reaching thecochlea through a set of auditory filters. The frequency axis is first scalednearly logarithmically by transforming Hz to ERB rate (Equivalent Rectangu-lar Bandwidth). The ERB rate is a value on the ERB-rate scale which is relatedto the representation of sound in the auditory system. The auditory filters areuniformly spaced on this scale. The relation between the ERB rate, �, and the

46

Page 58: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2.6 Auditory cues derived from the stimuli

?

spectrumat eardrum

fixed filter

?

spectrumat cochlea

auditoryfilter

?

excitationpattern

specificloudness

calculation

?

specificloudness

centroidcalculation

?

centroid ofspec. loudness

Figure 2.12: Block diagram of sequence of stages used for calculating the centroid of the spe-cific loudness.

frequency, f , in kHz is given by (Glasberg and Moore, 1990)

� = 21:4 10log(4:37f + 1): (2.1)

The third block describes the transformation from excitation pattern to specificloudness N 0, which is the loudness per ERB. In the final stage, the centroid(center of gravity) of the specific loudness is calculated by

R� N 0(�) d�RN 0(�) d�

: (2.2)

The denominator, the integral over the specific loudness contour, is an esti-mate of the overall loudness of a given sound in sones.

This is a similar formula as proposed for sharpness by Zwicker and Fastl(1999). They defined sharpness as the weighted first moment of the critical-band rate distribution of specific loudness. Instead of transforming the fre-quency axis into the ERB-rate scale it is transformed into the closely-relatedcritical-band-rate scale (with the unit ‘Bark’). Furthermore, the formula em-ployed by Zwicker and Fastl includes an extra weighting factor which is

47

Page 59: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2 Perception experiments with recorded sounds

critical-band-rate dependent and takes into account that sharpness of narrow-band noises increases unexpectedly strongly at high center frequencies. Thisfactor equals unity below 16 Barks (3 kHz) and increases to a value of fourat the end of the critical-band rate near 24 Bark (16 kHz). So Equation 2.2can be seen as a combination of the spectral centroid, which is related to theperceived ‘brightness’, and the definition of ‘sharpness’ proposed by Zwickerand Fastl.

The results of this analysis for the stimuli used in experiments I and II areshown in Figure 2.13. The gray squares depict the values for the stimuli ofexperiment I (perception of size). The abscissa located on top of the figuregives the diameter of the ball for these stimuli. Values for the stimuli of ex-periment II (perception of speed) are depicted by black circles. The abscissalocated at the bottom of the figure gives the speed of the ball for these stim-uli. The centroid is clearly influenced by size as well as speed, though thelatter influence is smaller. Furthermore, the centroid generally increases withincreasing speed and decreases with increasing size.

20 40 60 80 100

10

12

14

16

diameter (mm)

0.2 0.4 0.6 0.8 110

12

14

16

speed (m/s)

cent

roid

of

spec

ific

loud

ness

(er

b)

Figure 2.13: Centroid of the specific loudness (in ERB) for the stimuli used in experiments I(gray squares) and II (black circles).

As psychoacoustical reference for the discriminability of these values, we takethe data by Schorer (1989), who measured the jnd for the cut-off frequency ofa lowpass noise with a rather shallow roll off of 24 dB per octave. For a cut-off frequency of 1 kHz, he found a jnd of 25 to 30 Hz, corresponding to about0.2 ERB. This value is smaller than the difference in the centroid between mostof the sounds shown in Figure 2.13. So, the variation in the spectrum due to

48

Page 60: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2.6 Auditory cues derived from the stimuli

variation in size or speed is well audible.

We used the centroid of the specific loudness because we suppose that lis-teners use auditory cues when judging the size or speed of rolling objects.However, the listener’s judgment does not necessarily have to be based ona loudness representation. Possibly it is based on an earlier sensory repre-sentation meaning that the centroid of the power spectrum could predict thesubjects responses better than the centroid of the specific loudness. There-fore we calculated the correlation between the difference in centroid within apresented stimulus pair and the mean percentage correct responses averagedover subjects. Differences in three different centroids were evaluated: the ra-tio of the centroid of the power spectrum in Hz, the difference in centroid ofthe power spectrum in ERB, and the difference in centroid of the specific loud-ness. This resulted in Pearson correlations of 0.49, 0.42, and 0.58, respectively,for experiment I2, and 0.15, 0.26, and 0.74, respectively, for experiment II. Ap-parently, the difference in centroid of specific loudness is better able to predictthe responses of the listeners than differences in centroids based on the powerspectrum.

2.6.3 Discussion

In this section, we searched for auditory cues that subjects may use when judg-ing the size and the speed of rolling balls. A plausible temporal cue was notfound. Although the auditory roughness of the sound varied properly withsize and speed, this variation was not large enough to be well audible. As apossible spectral cue varying with the size and the speed of a rolling ball, thecentroid of specific loudness was deduced. An increase in the centroid of spe-cific loudness is induced by a decrease in size or an increase in speed. Thus,if the centroid is indeed a cue, and it is the only cue listeners use for judgingthe size as well as the speed, it can explain the interaction effect when varyingboth size and speed. Furthermore, the influence of speed on the centroid ofspecific loudness is smaller than the influence of size which agrees with thefindings of experiments I, II, and III, namely that for the chosen stimuli dis-crimination of size is easier than discrimination of speed. However, only bymanipulation of the centroid of specific loudness combined with perceptionexperiments, can it be determined whether subjects use the centroid as a cuefor their judgments of size or speed. A high correlation between a proposed

2A repetition of this experiment on the perception of size will be described in Section 4.3,page 77. The percentage correct responses in this experiment result in much higher correla-tions of 0.76 (spectrum in Hz), 0.73 (spectrum in ERB), and 0.77 (specific loudness).

49

Page 61: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

2 Perception experiments with recorded sounds

cue within the stimuli and subjects’ preference for the stimuli, does not nec-essarily mean that subjects use that cue. Besides, one should bear in mindthat the centroid is only a rough measure of spectral shape and, for example,does not take the spectral tilt into account. It is very well possible that sub-jects attend to one or more specific spectral features that also covary with thecentroid of specific loudness.

2.7 ConclusionsExperiment I showed that subjects are able to discriminate the sounds ofrolling balls of different sizes, on the basis of recorded sounds. Experiment IIshowed the ability of subjects to discriminate between the sounds of rollingballs with different speeds. However, some subjects reversed the labeling ofthe speed. This was eliminated when the sounds were not equalized in soundpressure level, but were presented with the same differences in sound level asrecorded. Experiment III indicated that when both the size and the speed of arolling ball are varied, subjects generally are still able to discriminate size andspeed, but an interaction effect can be observed in the results. Investigation ofauditory cues showed a conflict in available cues when varying both size andspeed, which is in accordance with the interaction effect. However, whetherlisteners indeed use these cues in their judgments has yet to be affirmed bymanipulation and perception experiments. So far it is unclear whether sub-jects use spectral cues or temporal cues or both. On the basis of the acousticalanalysis, in which no plausible temporal cue was found, it is expected thatfor discriminating the size and speed of rolling balls, spectral cues are moreimportant than temporal cues. This will be further investigated in Chapter 3.

Due to the reverse labeling exhibited by some subjects and the interaction be-tween size and speed, one should be careful when using sounds of rolling ballswith different sizes and speeds as a stand-alone signal rather than integratingthem with other types of feedback. One might expect that in a multimodalapplication, where multiple cues are available, these ambiguities are easilysolved.

50

Page 62: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3Perception experiments with

manipulated sounds

Perception experiments in Chapter 2 showed that subjects are generally ableto discriminate differences in size and speed of wooden rolling balls on the ba-sis of recorded sounds. In this chapter, we investigate whether listeners basetheir judgments on spectral cues or temporal cues. Recorded sounds weremanipulated by merging the temporal characteristics of one sound with thespectral characteristics of another. Perception experiments showed that if lis-teners had to choose the larger ball from two sounds, they had a preferencefor the spectral content of a large ball. If listeners had to choose the faster outof two sounds, they preferred the spectral content of a small ball, and, to alesser degree, the spectral content of a fast rolling ball. The temporal cues inthe sounds were of minor importance for the range of stimuli used in theseexperiments, possibly because sounds with much amplitude modulation andexcessive ticks were excluded from the experiments.

Page 63: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3 Perception experiments with manipulated sounds

3.1 IntroductionIn the previous chapter it appeared that an interaction exists between the per-ception of the size and the speed of rolling balls. Generally speaking, smallballs are confused with fast rolling balls. This interaction also emerged by in-vestigating temporal and spectral cues subjects could use when judging thesize and speed of rolling balls. To obtain more information about this inter-action effect, we focussed on what kind of acoustic information within thesounds of rolling balls subjects use to judge the size and speed. For this pur-pose, recorded sounds were manipulated by combining different aspects ofdifferent sounds. Possibly subjects use spectral cues, such as the relative pro-portion of high and low frequency components, to judge size, and temporalcues, such as the number of ticks per second, to judge speed. Therefore thesounds were manipulated by combining the temporal characteristics of onesound with the spectral characteristics of another. In this way, we could, forinstance, construct a sound with the temporal characteristics of a fast rollingball and the spectral characteristics of a slowly rolling ball. These modifiedsounds served as a basis for perception experiments, which might help to un-ravel the perceptual cues for size and speed. This is analysis 3 in our researchmethodology depicted in Figure 1.1.

First, the sound manipulation algorithm with which the stimuli were synthe-sized is described (Section 3.2). Second, pilot experiments are presented (Sec-tion 3.3). These experiments were conducted to test the influence of the manip-ulation algorithm. Third, perception experiments are reported (Section 3.4) toinvestigate the interaction effects and to find out whether subjects use spectralcues or temporal cues when discriminating between different sizes or speedsof rolling balls. These latter experiments are also presented in Houben et al.(2001).

3.2 Sound manipulation algorithmThe general approach consists of combining the spectral properties of onesound, s1, with the temporal properties of another sound, s2, as illustratedin Figure 3.1.

First the two sounds were filtered with a Gammatone filterbank (Pattersonet al., 1995) with 32 channels regularly spaced on an ERB scale from 20 Hz to24 kHz (half the sample frequency) resulting in s1;c and s2;c with channel indexc = 1 � � � 32. The spectra of the impulse responses of adjacent channels crossed

52

Page 64: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3.2 Sound manipulation algorithm

Figure 3.1: Illustration of the sound manipulation algorithm. The signals are plotted in thetime domain. Signals s1 and s2 are two recording sounds. Signals s1;c and s2;c are theGammatone-filtered signals in the channel with index c for signal s1 and s2, respectively.Signal s12;c is the new signal in the cth channel, synthesized by substituting the temporal en-velope of s1;c by the temporal envelope of s2;c. Signal s12 is obtained by summing s12;c overall channels, and combines the spectral characteristics of s1 and the temporal characteristicsof s2.

at – 3 dB, so as to obtain a sound close to the original sound when summingthe channels.

Per channel the temporal envelopes, e1;c, of the signals s1;c were calculated byusing the Hilbert transform:

e1;c =qs21;c + fH(s1;c)g2; (3.1)

in which H(:) denotes the Hilbert transformation. The temporal envelopes,e2;c, of signals s2;c were calculated in the same way.

The new signals per channel, s12;c, were synthesized by substituting the tem-poral envelopes of signal one by the temporal envelopes of signal two (left

53

Page 65: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3 Perception experiments with manipulated sounds

0 0.1 0.2 0.3

0 0.1 0.2 0.3

ampl

itude

0 0.1 0.2 0.3time (s)

0 2 4 6 8

0 2 4 6 8

0 2 4 6 8frequency (kHz)

S1

S2

S12

Figure 3.2: Visualization of the results of the sound-manipulation algorithm. Signals s1, s2,and the resulting signal s12 are shown in both the temporal and the spectral domain.

fraction) but maintaining the mean levels of signal one (right fraction):

s12;c = s1;c �e2;c

e1;c�e1;c

e2;c; (3.2)

with e1;c and e2;c the mean values of envelopes e1;c and e2;c, respectively1.

The new sound was obtained by summing all signals in the channels and com-pensating for the group delay, �c, of the Gammatone filters:

s12(t) =32Xc=1

s12;c(t+ �c); (3.3)

in which t denotes time. The delays, �c, were determined by calculating the

1In Equation 3.2, the preservation of overall spectral information was based on the meanenvelope. Alternatively one could have attempted to preserve the spectral energy levels bycomputing the root of the mean squared envelope values. We analyzed the difference be-tween these two approaches in terms of a perceptual distance measure (Rao et al., 2001) andfound that the difference in excitation patterns for the two computational approaches is smallcompared to the differences between the original stimuli used in this experiment (and, for themajority of the stimuli, below the threshold of detectability).

54

Page 66: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3.3 Pilot experiments

center of gravity of the energy of the impulse response per channel hc(t):

�c =

R T0 hc

2(t) t dtR T0 hc

2(t) dt: (3.4)

It is expected that s12 combines the spectral characteristics of s1 and the tem-poral characteristics of s2. An example is shown in Figure 3.2. The left panelsshow broadband time functions of signals s1, s2, and s12. The time functionof s12 resembles the one of s2, but because of temporal smearing they are notidentical. The right panels show broadband spectra of signals s1, s2, and s12.The spectrum of s12 resembles the one of s1.

3.3 Pilot experimentsThe pilot experiments reported in this section are used to check whether sub-jects can identify the size and speed of rolling balls equally well before and af-ter using the Gammatone filter algorithm. If not, it is pointless to do the mainexperiments in which sounds will be Gammatone filtered and manipulatedbefore the channels are summed to construct sounds with changed temporaland spectral characteristics.

3.3.1 Stimulus description

Sound recordings of wooden balls rolling over a wooden surface were used.These are a subset of the sounds recorded for the previous experiments. Vari-ation of size and speed, both at two levels, resulted in four sounds:

s1v1 (small and slow), with a diameter of 35 mm and speed of 0.57 m/s,

s1v2 (small and fast), with a diameter of 35 mm and speed of 0.83 m/s,

s2v1 (large and slow), with a diameter of 55 mm and speed of 0.61 m/s,

s2v2 (large and fast), with a diameter of 55 mm and speed of 0.86 m/s,

in which s and v denote size and velocity, respectively, and 1 and 2 denotethe two different levels with 1 the slower or smaller ball and 2 the faster orlarger ball. In this experiment, we did not select stimuli on the basis of theirtemporal smoothness. As a result, the stimuli were not completely ’smooth’.In the sounds produced by the small ball with a diameter of 35 mm, irregularticks can be heard which are not present in the sounds created by the largerball with a diameter of 55 mm. Figure 3.3 displays a part of the waveforms ofs1v2 and s2v2 in the top and bottom panel, respectively. The sound in the top

55

Page 67: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3 Perception experiments with manipulated sounds

panel contains several peaks, which correspond to the audible ticks. Thesepeaks may be caused by bouncing of the ball, and from now on they will bedenoted by bouncing. In the sound produced by the large ball rolling slowly,some amplitude modulation can be heard which is not present in the othersounds.

0 0.05 0.1 0.15 0.2 0.25

−5

0

5

time (s)

ampl

itude

0 0.05 0.1 0.15 0.2 0.25

−5

0

5

time (s)

ampl

itude

Figure 3.3: 250 ms excerpts of s1v2 (top panel) and s2v2 (bottom panel).

Two perception experiments were conducted to test the influence of the fil-tering operation. In one experiment, the original recordings were used. In theother experiment, these same original recordings were passed through a Gam-matone filterbank and resynthesized. The sounds were resynthesized withoutfurther manipulation, but the Gammatone filterbank induced small changesin the sound. In other words, the sound manipulation algorithm of the previ-ous paragraph was applied to just one sound (s1 = s2) instead of two differentones.

3.3.2 Method

The stimuli were presented pairwise. The duration of the stimuli was 800 mswith 700 ms silence between them. The stimuli were faded in and out over10 ms by means of a Hanning window. The sound levels of the stimuli werenot equalized but did not vary much, the largest difference in RMS value being3.0 dB. A matrix of pairs can be generated which, without the diagonal, has12 cells. The 12 stimulus pairs were presented five times in random order. So

56

Page 68: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3.3 Pilot experiments

the entire stimulus set consisted of 60 pairs. Disregarding the order within apair, 10 repetitions of 6 pairs are obtained (e.g. 5 times the permutation ABand 5 times the permutation BA results in 10 times the combination AB). Tensubjects participated in both experiments. They were seated in a soundproofbooth and received the stimuli over Beyer DT 990 headphones. They werenot familiar with the sounds and had not participated in any of the previousexperiments.

Both experiments are divided into two sessions, with only a difference in task.In one session, which will be denoted as SIZE, subjects had to decide whichof the two sounds in a pair was produced by the larger rolling ball, and in theother, which will be denoted as SPEED, subjects had to decide which of thetwo sounds in a pair was produced by the faster rolling ball. In each sessionthe entire stimulus set of 60 pairs was presented. The subjects did not receiveany feedback about the correctness of their responses.

The order of the two experiments (original or Gammatone filtered sounds) aswell as the order of the two different tasks (choosing the larger or the fasterball) were counter balanced. The experiments were preceded by four test pairseach.

3.3.3 Results

Figure 3.4 shows the results for SIZE (experiment in which subjects had tojudge the size). The results for the experiment with original sounds are shownin the left panel, and the results for the experiment with filtered sounds inthe right panel. The percentages correct responses, Pc, averaged over all thesubjects are plotted in a matrix layout in which every cell denotes a stimuluspair.

Figure 3.5 shows the results for SPEED (experiment in which subjects had tojudge the speed). Percentage correct responses averaged over all subjects forthe experiment with original sounds and filtered sounds are plotted in the leftand right panel, respectively. Please note that v1s2 and v2s1 are exchanged,compared to Figure 3.4, to enhance the uniformity of the response matrices.

The cells with values between parentheses denote the pairs with no differencein size for SIZE, and the pairs with no difference in speed for SPEED. Thesevalues are the percentage of times subjects have chosen the fastest ball for SIZE,and the percentage of times subjects have chosen the largest ball for SPEED.

57

Page 69: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3 Perception experiments with manipulated sounds

s1v1 s1v2 s2v1 s2v2

s2v2

s2v1

s1v2

s1v1

89 94 (55)

84 94

(21)

original sounds

s1v1 s1v2 s2v1 s2v2

s2v2

s2v1

s1v2

s1v1

84 97 (51)

87 95

(14)

filtered sounds

Figure 3.4: Results of the experiment with original sounds (left panel) and filtered sounds(right panel), for SIZE. The panels depict the results averaged over all subjects. Values be-tween parentheses, denoting the pairs with no difference in size, represent the percentage oftimes subjects chose the fastest ball.

s1v1 s2v1 s1v2 s2v2

s2v2

s1v2

s2v1

s1v1

49 70 (11)

94 95

(35)

original sounds

s1v1 s2v1 s1v2 s2v2

s2v2

s1v2

s2v1

s1v1

57 59 (8)

95 98

(45)

filtered sounds

Figure 3.5: Results of the experiment with original sounds (left panel) and filtered sounds(right panel), for SPEED. The panels depict the percentage correct responses averaged overall subjects. Values between parentheses, denoting the pairs with no difference in speed,represent the percentage of times subjects chose the largest ball.

3.3.4 Discussion

Figure 3.4 shows that subjects are able to identify the larger ball. The per-centage correct responses, Pc, is above 70% which is the upper boundary ofthe 95% confidence interval for guessing. Looking at individual responses(not shown) reveals that only one subject had great difficulties identifying thelarger ball, resulting in an average Pc of 50%.2

Figure 3.5 shows that for the pairs (s1v1,s1v2) and (s2v1,s1v2) subjects arevery good at identifying the faster ball resulting in an average Pc near 100%.However, subjects have difficulties identifying the faster ball when the pairs(s1v1,s2v2) and (s2v1,s2v2) are presented resulting in an average Pc of about53% and 60%, respectively. This bad performance may have two causes. First,the small ball bounces. This is likely to introduce an extra cue for determin-

2This subject said after the experiments that he had difficulties identifying the size becauseof the differences in material of the balls. Unfortunately he did not notice that the instructionsexplicitly mentioned the use of recordings of wooden balls rolling over a wooden surface.

58

Page 70: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3.3 Pilot experiments

ing the speed, because the faster the ball the more bounces per second can beheard. Pair (s1v1,s1v2) may benefit from this. The large ball does not bounceand it may therefore be more difficult to judge the difference in speed forpair (s2v1,s2v2). Moreover, when presenting pair (s1v1,s2v2), the slower ballbounces and the faster one does not and thus subjects may be misled by theirintuition of the faster the rolling ball, the more it bounces. This may result inincorrect responses. Results for pair (s2v1,s1v2) may be improved by bounc-ing of the fastest ball whereas the slower ball does not bounce. Second, the sizeand speed cues may interact. Roughly speaking, the brightness of the soundof a rolling ball decreases with its size and increases with its speed, as alreadymentioned in Section 2.6.2 of Chapter 2. So if the speed increases while thesize decreases (pair (s2v1,s1v2)), the change in timbre caused by the decreasein size will reinforce the change in timbre caused by the increase in speed.This may result in an unchanged or even improved identifiability of speed.If, however, both speed and size increase (pair (s1v1,s2v2)), the changes intimbre may counteract and diminish the identifiability of speed.

The values between parentheses in Figure 3.4 denote the pairs with no differ-ence in size while that was the very thing the subjects had to judge. Because ofthe lack of difference in size, ‘Pc’ of pair (s2v1,s2v2) is about chance for mostof the subjects, just as expected. However, pair (s1v1,s1v2) has a ‘Pc’ valuesignificantly below chance for most of the subjects. As the ‘Pc’ for those pairsdenote the percentage of times subjects have chosen the fastest ball, it indi-cates that if subjects have to choose the larger ball from two sounds created byrolling balls with the same small size, most of them have a preference for theslower ball.

The values between parentheses in Figure 3.5 denote the pairs with no dif-ference in speed while that was just what the subjects had to judge. ‘Pc’ ofpair (s1v1,s2v1) averaged over subjects is slightly below chance. This is dueto very low values for several subjects and high values for the others. ‘Pc’of pair (s1v2,s2v2), denoting the percentage of times subjects have chosen thelargest ball when they had to select the fastest ball while there was no differ-ence in speed, is near 0%. This indicates that subjects have a preference for thesmaller ball if they have to choose the faster ball from two sounds created byballs rolling with the same high speed.

To summarize, if the task was to judge the size, and both balls were (equally)large, subjects had no preference. If both balls were (equally) small, subjectsdenoted the faster ball as the smaller one. If the task was to judge the speed,and both balls were (equally) slow, subjects had no preference. If both balls

59

Page 71: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3 Perception experiments with manipulated sounds

Table 3.1: �2-test to compare the observed (O) and expected (E) frequencies for SIZE.

stimulus pair O E(O�E)2

E

(s1v1,s1v2) 35 100 42.250

(s1v1,s2v1) 171 200 4.205

(s1v1,s2v2) 173 200 3.645

(s1v2,s2v1) 189 200 0.605

(s1v2,s2v2) 191 200 0.405

(s2v1,s2v2) 106 100 0.360

�2 = 51.470

were (equally) fast, subjects denoted the smaller ball as the faster one. Thisconfusion of a small ball with a fast ball agrees with the findings in Section 2.6,in which a conflict in available cues was found when simultaneously increas-ing size and decreasing speed or vice versa.

Comparison of the left panels with the right panels in Figure 3.4 and 3.5 showsthat differences in results between the experiments with original signals andthe experiments with Gammatone filtered signals are minimal. This was veri-fied by means of a statistical test.

A MANOVA with repeated measurements on two within factors (soundtype and stimulus pair) showed a highly significant difference between pairs(F(5,5) = 86.9, p < 0.001) for SIZE. The difference between the two types ofsounds (original and filtered) was non-significant (F(1,9) = 0.646, p = 0.44).The interaction between the stimulus pair and type of sound was also non-significant (F(5,5) = 0.516, p = 0.76). A repeated measurements analysis forSPEED showed that, again, only the differences between pairs are significant atthe 0.1 percent level (F(5,5) = 243, p < 0.001). Both the difference between thetwo types of sounds (F(1,9) = 0.235, p = 0.64) and the interaction between thestimulus pair and type of sound were non-significant (F(5,5) = 1.26, p = 0.40).This indicates that the perceptual distances between the four original soundsin both the size judgment task and the speed judgment task are not signifi-cantly changed by Gammatone filtering. Probably the subjects used the samecues in both the experiment with original recordings and the experiment withfiltered sounds.

�2-tests were conducted to compare the observed frequencies (averaged oversubjects and type of sound) and expected frequencies. The results are listed

60

Page 72: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3.4 Main experiments

Table 3.2: �2-test to compare the observed (O) and expected (E) frequencies for SPEED.

stimulus pair O E(O�E)2

E

(s1v1,s2v1) 80 100 4.000

(s1v1,s1v2) 189 200 0.605

(s1v1,s2v2) 106 200 44.180

(s2v1,s1v2) 193 200 0.245

(s2v1,s2v2) 129 200 25.205

(s1v2,s2v2) 19 100 65.610

�2 = 139.845

in Table 3.1 and 3.2 for the size and speed judgment task, respectively. Thevalues of �2 are 51.5 and 140, respectively. For df = 5 these values are highlysignificant, the value required for significance at the 0.1 percent level being20.5. So the responses averaged over subjects deviate significantly from ex-pected responses. Inspection of the individual squares of the stimulus pairsreveals that pair (s1v1,s1v2) contributes most to the �2 for SIZE, and that forSPEED pairs (s1v1,s2v2), (s2v1,s2v2), and (s1v2,s2v2) contribute most.

3.4 Main experimentsIn the experiments reported in this section, recorded sounds were manipu-lated by combining the temporal characteristics of one sound with the spectralcharacteristics of another. These modified sounds were presented to listenersto investigate whether they use spectral or temporal cues when discriminatingbetween different sizes or speeds of rolling balls.

3.4.1 Methods

The same four recordings as in the pilot experiment are used:

s1v1 (small and slow), with a diameter of 35 mm and speed of 0.57 m/s,

s1v2 (small and fast), with a diameter of 35 mm and speed of 0.83 m/s,

s2v1 (large and slow), with a diameter of 55 mm and speed of 0.61 m/s,

s2v2 (large and fast), with a diameter of 55 mm and speed of 0.86 m/s.

New sounds were synthesized by combining the spectral content of one

61

Page 73: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3 Perception experiments with manipulated sounds

sound with the temporal content of another sound (see Section 3.2). In thisway 16 stimuli were obtained from the four original recordings. The stimuliwere presented pairwise over headphones to 10 naive subjects, who hadnot participated before, seated in a soundproof booth. Only comparisonsbetween different stimuli were made, resulting in 240 pairs of stimuli (theentire stimulus set minus the diagonal). These pairs were presented inrandom order and preceded by 8 test pairs. The pairs were only played onceand no feedback about the correctness of the responses was given.

The experiment was performed twice, with a difference in task: In one set,which will be denoted as SIZE, subjects had to decide which of the two soundsin a pair was produced by the larger ball. In another set, which will be de-noted as SPEED, subjects had to decide which of the two sounds in a pair wasproduced by the faster rolling ball.

Four parameters with two levels describe each stimulus:

sizeSpec: the size (small or large) of the rolling ball providing the spectral con-tent (depicted by a gray circle),

veloSpec: the velocity (slow or fast) of the rolling ball providing the spectralcontent (depicted by a gray arrow),

sizeTemp: the size (small or large) of the rolling ball providing the temporalcontent (depicted by a black circle),

veloTemp: the velocity (slow or fast) of the rolling ball providing the temporalcontent (depicted by a black arrow).

3.4.2 Results

The homogeneity of the responses among subjects was checked by construct-ing an intercorrelation matrix using Pearson coefficients. This matrix revealedthat for SIZE all ten subjects correlated uniformly and significantly with oneanother with respect to their responses. The intercorrelation matrix for SPEED

revealed that the responses of one out of ten subjects did not correlate sig-nificantly at the 0.01 level with the results of any of the other subjects. Theresponses of this single subject were therefore not taken into account in thefurther analysis.

Figures 3.6 and 3.7 visualize the results for SIZE and SPEED, respectively. Thesymbols on the abscissa of the panels denote the parameters that are consid-ered. In the top-left panel, stimulus pairs are sorted according to the size of

62

Page 74: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3.4 Main experiments

0

0.2

0.4

0.6

0.8

1

.52 .58 .6

+ + +

sizeTemp

0

0.2

0.4

0.6

0.8

1

.56 .45 .53

+ + +

veloSpec

0

0.2

0.4

0.6

0.8

1

.57 .46 .56

+ + +

veloTemp

0

0.2

0.4

0.6

0.8

1

.49 .87 .65

+ + +

sizeSpec

pref

eren

ce fo

r se

cond

stim

ulus

Figure 3.6: Results for SIZE, averaged over all subjects. The symbols on the abscissa denote theparameters that are considered (sizeSpec, sizeTemp, veloSpec, and veloTemp, in the top-left,top-right, bottom-left, and bottom-right panel, respectively). The first and third bar in eachpanel depict the proportion of times (scaled to 1) that subjects chose the second stimulus ofthe stimulus pair while for both stimuli the value of the concerned parameter was low (smallor slow) or high (large or fast), respectively. The second bar depicts the proportion of timessubjects chose the stimulus with the higher value of the parameter in question. The height ofeach bar is printed in white within the bar.

0

0.2

0.4

0.6

0.8

1

.45 .58 .46

+ + +

sizeTemp

0

0.2

0.4

0.6

0.8

1

.39 .69 .51

+ + +

veloSpec

0

0.2

0.4

0.6

0.8

1

.45 .51 .45

+ + +

veloTemp

0

0.2

0.4

0.6

0.8

1

.49 .24 .37

+ + +

sizeSpec

pref

eren

ce fo

r se

cond

stim

ulus

Figure 3.7: Results for SPEED, presented in the same way as in Figure 3.6 for SIZE.

63

Page 75: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3 Perception experiments with manipulated sounds

the ball providing the spectral content (sizeSpec) and averaged over the otherthree variables. The same goes for the bottom left panel, the top right panel,and the bottom right panel which are sorted according to veloSpec, sizeTemp,and veloTemp, respectively. So every panel contains all data but sorted dif-ferently. The first bar and third bar in each panel depict the proportion oftimes that subjects chose the second stimulus of the stimulus pair while forboth stimuli the value of the concerned parameter was the same (small/slowor large/fast, respectively). The second bar depicts the proportion of timessubjects chose the stimulus with the higher value of the parameter in ques-tion. Note that percentage correct responses cannot be calculated because wedo not know which response is correct. The order of stimuli is not regarded -pair AB and BA are treated together - so this bar represents the same amountof data as the first and third bar together. The height of each bar is printed inwhite within the bar.

The second bar of all four panels in Figures 3.6 and 3.7 is shown per subjectin Figures 3.8 and 3.9, for SIZE and SPEED, respectively. With other words, perparameter only the pairs with a difference in that parameter are considered,and the proportion of choosing the stimulus with the higher value is shown.The abscissa labels denote the parameters that are considered. From left toright:

Gray circles: stimulus pairs with a difference in size of the rolling ball whichprovides the spectral content (difference in sizeSpec).

Gray arrows: stimulus pairs with a difference in velocity of the rolling ballwhich provides the spectral content (difference in veloSpec).

Black circles: stimulus pairs with a difference in size of the rolling ball whichprovides the temporal content (difference in sizeTemp).

Black arrows: stimulus pairs with a difference in velocity of the rolling ballwhich provides the temporal content (difference in veloTemp).

The ordinate denotes the number of times (in proportion to the total pairsexamined) that subjects chose the higher value (larger or faster ball) of theparameter considered. Dots with different shades of gray are used for dif-ferent subjects. For better distinction between subjects, solid lines are drawnbetween data points of the same subject. Medians of subjects are shown byblack crosses. Vertical bars denote the interquartile range. A proportion be-low 0.5 simply means that subjects preferred the lower value (small or slow) ofthe parameters considered. The larger the deviation from 0.5 (no preference),

64

Page 76: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3.4 Main experiments

that is, the closer to 0 or 1, the stronger the preference for the lower or highervalue, respectively.

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

pref

eren

ce fo

r hi

gher

val

ue

Figure 3.8: Results for SIZE. The abscissa labels denote the four parameters. From left to right:sizeSpec (gray circles), veloSpec (gray arrows), sizeTemp (black circles), and veloTemp (blackarrows). Per parameter only the pairs with a difference in that parameter are considered, andthe proportion of choosing the stimulus with the higher value is shown. This corresponds tothe second bar of all four panels in Figure 3.6. Proportions are shown per subject by lines withdifferent shades of gray. Medians are shown by crosses and quartiles by horizontal bars.

3.4.3 Statistical analysis and discussion

If subjects only attend to the spectral cues induced by size when judging thesize of rolling balls, and they do this perfectly, the results would be 1 for thesecond bar in the top left panel of Figure 3.6, and 0.5 for all the others. It can beseen that the second bar of the top left panel is indeed largest (about 0.9) andthat the values of the other bars lie around 0.5. Figure 3.8 reveals that whensubjects had to choose the larger ball, they chose the sound with the spec-tral content of a large ball. The influence of the spectral content induced byspeed (veloSpec) as well as the temporal content of the sound (sizeTemp andveloTemp) was much less. To search for statistically significant effects withinthe data, two procedures were followed. First, to get an overall impression ofimportant effects, t-tests were conducted. Second, to test interaction effects, abinary logistic regression was applied to the data.

65

Page 77: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3 Perception experiments with manipulated sounds

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

pref

eren

ce fo

r hi

gher

val

ue

Figure 3.9: Results for SPEED, presented in the same way as in Figure 3.8 for SIZE. The datapoint that deviates greatly from all the other data points for sizeSpec (gray circles) belongs tothe subject with responses that did not correlate with the results of any of the other subjectsand for this reason was left out of the analysis.

T-tests were conducted to determine if the data in Figure 3.8 differed fromchance (a proportion of 0.5). The probability of observing the given result sup-posing that subjects merely guess, is given per parameter in the second row ofTable 3.3. These levels were compared to a significance level of 0.05/4 = 0.0125(Bonferroni adjustment) to ensure that after doing all four tests, the chance ofdetecting at least one significant difference from guessing while there was nodifference is limited to 5 percent. In the table, values below 0.0125 (imply-ing a result significantly different from guessing) are italicized. For SIZE, onlysizeSpec is statistically significant at the Bonferroni adjusted 5 percent level,

Table 3.3: T-test probabilities to determine if the data in Figure 3.8 (SIZE) and Figure 3.9(SPEED) could differ from chance (a proportion of 0.5). Values that are significant at the Bon-ferroni adjusted 5 percent level (0.0125) are italicized.

sizeSpec veloSpec sizeTemp veloTemp

SIZE <0.001 0.017 0.046 0.11

SPEED <0.001 <0.001 0.041 0.55

66

Page 78: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3.4 Main experiments

indicating that subjects were able to choose the larger ball by attending mainlyto the spectral cues induced by size. The temporal content of the manipulatedsounds as well as the spectral content induced by speed did not significantlyinfluence discrimination of size.

Figures 3.7 and 3.9, with the results for SPEED, show, again, that sizeSpec ismost important. The proportion “faster” judgments of sizeSpec is close to 0,which indicates a preference for the spectral content of a small ball when sub-jects were asked to choose the faster ball. Additionally, subjects had a prefer-ence for the spectral content of a fast ball, as the proportion “faster” judgmentsfor sizeSpec is above chance. Results of t-tests, shown in Table 3.3, confirmedthat sizeSpec and veloSpec are significant at the Bonferroni adjusted 5 percentlevel.

To analyze the results of the experiments more closely, in particular with re-spect to interaction effects, a binary logistic regression was applied to the data.This type of regression was chosen because in our experiments the responsevariable was binary (preference for first or second stimulus of a pair, 1 or 0)whereas in linear regression the response variable is assumed to be normallydistributed and may lead to predictions of the response variable taking val-ues other than 0 or 1. The response variable is transformed from the observedprobability (0 � p � 1) into a variable with values between –1 and +1 byusing the logit transformation of p:

logit(p) = lnp

1� p; (3.5)

which is the natural logarithm of the odds ratio (Cox and Snell, 1989). Binarylogistic regression models the logit transformation of the observed probabilityas a linear function of the explanatory variables. In our experiment we havefour parameters describing the stimuli, namely sizeSpec, veloSpec, sizeTemp,and veloTemp, for simplicity denoted as par1 to par4. These parameters cantake the values – 1 (‘low’, i.e. small or slow) and 1 (‘high’, i.e. large or fast),denoting two categories, not physical values. A basic model for a stimuluscharacterized by parameters par1 to par4 is assumed to be of form

logit(p) = �1 � par1 + �2 � par2 + �3 � par3 + �4 � par4 + �12 � par12

+�13 � par13 + : : :+ �123 � par123 + : : :+ �1234 � par1234; (3.6)

with par12 = par1 �par2, par123 = par1 �par2�par3, etc. Because the stimuli werepresented pairwise, a set of explanatory variables for main effects was con-structed by taking the difference in parameter values between the first (par1;sto par4;s) and second stimulus (par1;t to par4;t) resulting in dpar1 to dpar4 with

67

Page 79: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3 Perception experiments with manipulated sounds

values –2, 0, or 2. In the same way, explanatory variables for interaction effectswere constructed by taking the difference in multiplied parameter values be-tween the first and second stimulus. The complete model can now be writtenas

logit(pst) = �1 � dpar1 + �2 � dpar2 + �3 � dpar3 + �4 � dpar4

+�12 � dpar12 + �13 � dpar13 + : : :

+�123 � dpar123 + : : :+ �1234 � dpar1234; (3.7)

with pst the proportion of subjects that chose the first stimulus when presentedwith the pair consisting of stimulus s and stimulus t, and dpar1 = par1;s – par1;t,and dpar12 = par12;s – par12;t = par1;s � par2;s – par1;t � par2;t, etc. A backwardstepwise selection method based on the likelihood ratio was used for reducingthis model. As initial model the complete model with all explanatory vari-ables was taken and variables were removed iteratively if the probability ofthe likelihood-ratio statistic, which is based on the maximum partial likeli-hood estimates, was greater than 0.01.

The regression results for SIZE revealed many significant terms, with threeterms clearly standing out: difference in sizeSpec (with an estimated value ofthe corresponding coefficient � of 0.988), difference in sizeTemp (� = 0.235),and the interaction between these two (� = –0.252). When the responses of thesubjects were not pooled but binary logistic regressions were applied to theindividual responses instead, exactly these three came out as significant for allsubjects. This model with three terms for SIZE predicted proportion “larger”judgments of 0.88, 0.50, 0.62, and 0.50 for sizeSpec, veloSpec, sizeTemp, andveloTemp, respectively, which match the observed mean values of 0.87, 0.45,0.58, and 0.46 (shown in Figure 3.8) fairly well. The overall percentage ofcorrectly predicted individual responses was 73%.

The most important parameter appears to be sizeSpec, indicating that sub-jects mainly chose the sound with the spectral content of a large ball when theyhad to choose the larger ball. Furthermore, sizeTemp and the interaction be-tween sizeSpec and sizeTemp contribute to the size judgments of the subjects.Substituting the estimated coefficients into the model for individual stimuli(Equation 3.6) reveals a measure of perceived size category (ranging from –1or infinitely small to +1 or infinitely large):

perceived size category = 0:988 sizeSpec + 0:235 sizeTemp

� 0:252 sizeSpec � sizeTemp: (3.8)

68

Page 80: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3.4 Main experiments

Figure 3.10 visualizes the perceived size category as a function of sizeSpecand sizeTemp for the size judgment task. It shows that if the ball providingthe spectral content is large (sizeSpec = large), the sound is judged as beingproduced by a large ball, independently of the value of sizeTemp. On the otherhand, if the ball providing the spectral content is small (sizeSpec = small),the sound is judged as being produced by a small ball and this percept iseven stronger if the sound also contains the temporal content of a small ball(sizeTemp = small).

−1.5

−1

−0.5

0

0.5

1

1.5

sizeSpec = small sizeSpec = large

sizeTemp = large

sizeTemp = small

perc

eive

d si

ze c

ateg

ory

Figure 3.10: Perceived size category as function of sizeSpec and sizeTemp for SIZE.

Apparently, the judgment of size depends primarily on parameters arisingfrom size (sizeSpec and sizeTemp) and is at most slightly influenced by thespeed of the rolling ball, which agrees with the results of the size judgmenttask of the interaction experiment described in Section 2.5. Hence, we mightconclude that the listener correctly distinguishes between the physical infor-mation in the signal contributed by the size of the ball, and succeeds in ignor-ing the change in information determined by speed. Furthermore, Figure 3.10shows that the perception of size is dominated by the spectral attributes ofsize (sizeSpec). If the centroid of specific loudness is the spectral cue on whichlisteners base their judgments of size (which has to be confirmed by manipula-tion and perception experiments), the larger variation in the centroid inducedby size compared to speed, as shown in Figure 2.13 in Section 2.6.2, may ex-plain the observation that variations in speed hardly influence discriminationof size.

The binary logistic regression applied to the pooled results for SPEED leads toa model with four terms: difference in sizeSpec (with an estimated value ofthe corresponding coefficient � of – 0.812), difference in veloSpec (� = 0.536),

69

Page 81: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3 Perception experiments with manipulated sounds

the interaction between these two (� = 0.292), and difference in sizeTemp (�= 0.189). This model for SPEED predicted proportion “faster” judgments of0.16, 0.75, 0.59, and 0.50 for sizeSpec, veloSpec, sizeTemp, and veloTemp, re-spectively, which match the observed mean values of 0.20, 0.69, 0.57, and 0.51(shown in Figure 3.9) fairly well. The overall percentage of correctly predictedindividual responses was 72%.

Again, the most important parameter appears to be sizeSpec, but now with anegative coefficient indicating that subjects mainly chose the sound with thespectral content of a small ball when they had to choose the faster ball. Further-more, veloSpec and the interaction between sizeSpec and veloSpec contributeto the speed judgments of the subjects. Substituting the estimated coefficientsinto the model for individual stimuli (Equation 3.6) reveals a measure of per-ceived speed category (ranging from –1 or infinitely slow to +1 or infinitelyfast):

perceived speed category = � 0:812 sizeSpec + 0:536 veloSpec

� 0:292 sizeSpec � veloSpec

+ 0:189 sizeTemp: (3.9)

Figure 3.11 visualizes the perceived speed category as a function of sizeSpecand veloSpec for the speed judgment task. It shows that if the ball providingthe spectral content is large (sizeSpec = large), the sound is judged as beingproduced by a ball rolling slowly. On the other hand, if the ball providingthe spectral content is small and fast (sizeSpec = small, veloSpec = fast), thesound is judged as being produced by a fast rolling ball, whereas, if the ballproviding the spectral content is small and slow (sizeSpec = small, veloSpec= slow), the sound is perceived as being neither slow or fast. The small dif-ference in perceived speed category between a small ball rolling slowly (size-Spec = small, veloSpec = slow) and a large ball rolling fast (sizeSpec = large,veloSpec = fast) indicates the difficulty shown by subjects to discriminate thespeed between these two rolling balls. This agrees with the results of the speedjudgment task of the interaction experiment described in Section 2.5.

We proposed the centroid of specific loudness as a possible spectral cue. How-ever, for the perception of speed, both the size and the speed of the ball pro-viding the spectral content (sizeSpec and veloSpec) are important whereas forthe perception of size, only the size, not the speed, of the ball providing thespectral content (sizeSpec) influences the judgment of the listeners. This isnot possible if the centroid of specific loudness is the only spectral cue onwhich listeners base their judgment of size as well as speed. The centroid

70

Page 82: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3.4 Main experiments

−1.5

−1

−0.5

0

0.5

1

1.5

sizeSpec = small sizeSpec = large

veloSpec = slow

veloSpec = fast

perc

eive

d sp

eed

cate

gory

Figure 3.11: Perceived speed category as function of sizeSpec and veloSpec for SPEED.

may be important, but apparently it is not the only spectral cue that listenersuse. We think that the spectral tilt is also of importance as the spectra in Fig-ures 2.10 and 2.11 show that the slope of the first part of the spectrum, up to2 kHz, decreases (flattens) with decreasing size and increases (steepens) withdecreasing speed. This assumption, however, should be verified by indepen-dent manipulation of the spectral shape, such as centroid and tilt, followed byperception experiments.

The final and smallest significant effect on judgment of speed, found by thebinary logistic regression, was that of sizeTemp (� = 0:189). The small pos-itive coefficient indicates that if the sound contains the temporal content ofa large ball, the ball is judged as being slightly faster than if the sound con-tains the temporal content of a small ball. The small effect of sizeTemp andthe non-significance of veloTemp reveal that, for the range of stimuli used inthis experiment, temporal aspects are of relatively minor importance to the lis-teners. However, the large interquartile range for sizeTemp points to a largedifference between subjects. Obviously some subjects do take temporal cuesinto account while others do not.

By informal testing it was verified that if audible ticks and amplitude modu-lation are present in the sound providing the temporal content, the manipu-lation algorithm successfully passes these temporal attributes on to the syn-thesized sound. But we note again that in this experiment, we deliberatelydid not select or exclude stimuli on the basis of their temporal content. As aresult, the stimuli may have lacked distinct temporal cues that subjects coulduse in their judgments of size and speed. In the next chapter, we provided the

71

Page 83: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

3 Perception experiments with manipulated sounds

sounds with amplitude modulation to analyze the influence of this temporalcue on the perception of size and speed and to address the question whethersubjects attend to the linear or angular speed of a rolling ball.

72

Page 84: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4The influence of angular speed

A ball that is not perfectly spherical produces a quasi-periodic rolling sound.The quasi-periodicity consists of an amplitude modulation, and contributesto the percept of rolling. Moreover, this temporal variation may help the lis-tener to identify the size or the speed of the rolling ball. Chapter 3 showedthat listeners mainly base their judgments of size and speed on spectral cues.However, due to stimulus selection, temporal cues were probably less clearlypresent within the sounds. Therefore, we conducted the experiments reportedin this chapter. In these experiments subjects had to judge the size and thespeed of wooden rolling balls by listening to recordings provided with artifi-cially added amplitude modulation. Results showed that when judging size,listeners take into account both the physical size and the angular speed in-duced by the additional amplitude modulation. As to speed, the judgmentscorrelated predominately with angular speed. Furthermore, by modulatingthe amplitude at a rate different from that specified by the size and the linearspeed of the rolling ball, the percept of speed and, to a lesser extent, size is af-fected. Also, the interaction effect between the two physical properties of theballs, as observed in Chapter 2, may be explained by listeners’ attention to theangular speed.

Page 85: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

4.1 IntroductionThe perception experiments presented in Chapter 2 showed that subjectscould clearly discriminate between the sounds of rolling balls of differentsizes and, to a lesser degree, between the sounds of rolling balls with dif-ferent speeds. Investigation of the available spectral cues indicated that a kindof spectral mean of the sounds (the centroid of the specific loudness), indi-cating the ‘brightness’ (McAdams et al., 1999) or ‘sharpness’ (Zwicker andFastl, 1999) of the sound, changed with size as well as with speed, the latterinfluence being smaller. Possible cues in the temporal domain are the ‘rough-ness’ (Terhardt, 1974) of the sound and the quasi-periodicity induced by theball’s imperfect sphericity (amplitude modulation). Acoustical analysis of thesounds that were used in the experiments of Chapter 2 did not yield a plau-sible temporal cue varying with size and speed. This may be due to the factthat in the experiments ‘smooth’ sound examples were used.

In Chapter 3, perception experiments with manipulated sounds, synthesizedby combining the temporal characteristics of one sound with the spectral char-acteristics of another, showed that subjects mainly attended to spectral cueswhen judging the size or speed of rolling balls. Surprisingly, subjects attendedonly to a small extent to temporal cues when judging the speed of rolling balls.Maybe these cues were less clearly present within the sounds, resulting in alack of a truly convincing temporal structure related to the speed of a rollingball.

To investigate whether a clearly audible periodicity in the sound provokes apercept of speed, this chapter presents perception experiments using soundsartificially provided with amplitude modulation. In Section 4.2 amplitudemodulation is described. To check whether a boost in amplitude modulationhas an effect on the perception of size at constant linear speed or the per-ception of speed at constant size, the experiments on the perception of size(Experiment I, Section 2.3) and the perception of speed (Experiment II, Sec-tion 2.4) were repeated with the same sounds but amplitude modulated ata rate matching the natural angular speed. These experiments are reportedin Section 4.3. Basically the relation between the acoustic source event (thesound of a rolling ball with emphasized amplitude modulation) and auditoryperception is reevaluated, which is analysis 1 in our research methodologydepicted in Figure 1.1. Section 4.4 reports a perception experiment in whichboth size and linear speed were varied, and amplitude modulation with a ratematching the natural angular speed was applied. Section 4.5 describes per-ception experiments in which the angular speed was varied independently by

74

Page 86: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.2 Amplitude modulation

modulating the amplitude at three frequencies (one of which was the naturalfrequency). This is analysis 3 of the research methodology depicted in Fig-ure 1.1, that is, the evaluation of the relationship between the acoustic struc-ture (amplitude modulation) and auditory perception. These latter experi-ments are also presented in Houben and Stoelinga (2002). After a discussion(Section 4.6), the results of perception experiments that were reported in theprevious chapters are reanalyzed in terms of differences in angular speed (Sec-tion 4.7).

4.2 Amplitude modulationA ball that is not perfectly spherical produces a rolling sound with a certainperiodicity. This periodicity consists of an amplitude modulation, and con-tributes to the percept of rolling. Listening to a rolling sound produced bya perfectly spherical ball, without prior knowledge of the source, may easilyresult in a percept different from that of rolling, for instance scraping. More-over, the presence of amplitude modulation in the rolling sound may help alistener to discriminate the size or speed of the rolling ball. In the percep-tion experiments reported in this chapter, amplitude modulation is applied torolling sounds. By applying amplitude modulation to recordings instead ofrecording rolling sounds of oval shaped balls, the experiments on the percep-tion of size and speed of Chapter 2 can be repeated using the same stimuli butwith additional amplitude modulation. Furthermore, it enables us to vary theamplitude modulation rate independently of the size and linear speed of therolling ball, thus providing a way to manipulate the rolling sounds.

Amplitude modulation uses the instantaneous amplitude of a signal to di-rectly vary the amplitude of another signal:

y(t) = [1 +m(t)] x(t); jm(t)j � 1: (4.1)

Signal m(t) modulates signal x(t) resulting in the amplitude modulated signaly(t). In radio engineering, x(t) is called the ‘carrier’ and m(t) the ‘modulator’.For the modulator, a pure tone with frequency fm, phase �m and amplitudemd is used in our experiments:

m(t) = md cos(2�fmt+ �m): (4.2)

The amplitude md is called the modulation depth and ranges from zero (nomodulation) to one (maximum modulation). Expressed as a percentage, md isknown as the ‘modulation percentage’. An example of amplitude modulation

75

Page 87: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

is shown in Figure 4.1. The top panel shows the sound of a rolling ball. Thebottom panel shows the same sound but amplitude modulated (fm = 5 Hz,md = 0.3, and �m = 0 rad).

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8−5

0

5un

mod

ulat

ed s

igna

l

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8−5

0

5

time (s)

mod

ulat

ed s

igna

l

Figure 4.1: Visualization of amplitude modulation. The top panel shows a rolling sound.The bottom panel shows the same sound but amplitude modulated by a modulator withfrequency, fm, of 5 Hz, modulation depth, md, of 0.3, and phase, �m of 0 rad. The modulatoris depicted by the dark colored sinusoids.

4.3 Experiments on size and speed perception revisitedThe experiments on the perception of size (Experiment I, Section 2.3) and theperception of speed (Experiment II, Section 2.4) were repeated with exactlythe same experimental procedure. The same stimuli were used but now pro-vided with amplitude modulation. The value of the amplitude modulationfrequency, fm, was chosen to be twice the rotation frequency of the ball. Thiscorresponds to the theoretical natural frequency of maxima generated by arolling ball that is not perfectly round but more oval shaped because one ro-tation contains two maxima. A modulation depth, md, of 0.3 was chosen. Toolow values for md would result in inaudible amplitude modulation, but toohigh modulation depths (md > 0.5) make the sound unnatural (‘helicopter-like’). The phase was chosen such that the audibility of the amplitude mod-ulation was maximal (i.e. in phase with a possibly present natural amplitudemodulation).

76

Page 88: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.3 Experiments on size and speed perception revisited

Perception of size at constant linear speedThe size perception experiment was conducted twice. In one session the orig-inal stimuli were used (i.e. the size perception experiment reported in Sec-tion 2.3 was repeated), and in the other, amplitude modulation was applied tothe stimuli as described above. Eight naive subjects, who had not participatedbefore, took part in both sessions in counterbalanced random order. In eachsession, the 12 stimulus pairs were presented four times in random order andwere preceded by 10 test pairs. Figure 4.2 depicts the results.

22−25 25−35 35−45 45−55 55−68 68−83 0

50

100

Pc

(%)

diameter of first and second ball (mm)

Original

22−25 25−35 35−45 45−55 55−68 68−83 0

50

100

Pc

(%)

diameter of first and second ball (mm)

Provided with AM

Figure 4.2: Perception of the size of rolling balls with no amplitude modulation provided(upper panel) and with amplitude modulation provided (lower panel). Percentage correctresponses, Pc, is represented as a function of the stimulus pair. The horizontal dot-dashedline is the upper boundary of the 95% confidence interval for guessing. Medians are shownby dots and quartiles by horizontal bars. The number pairs on the abscissa denote the twodiameters of the rolling ball in the stimulus pair.

The results show that, again, subjects are capable of discriminating betweenthe sounds of rolling balls with different sizes, except for the stimulus paircomprising diameter 22 mm and 25 mm for which the relative difference insize was the smallest of all stimulus pairs. Comparing the results of the sizeperception experiment in Section 2.3 (Figure 2.2) and the results of the re-peated experiment (upper panel of Figure 4.2) statistically, by means of two-

77

Page 89: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

sample t-tests, indicates that only for the stimulus pair comprising diameter25 mm and 35 mm the responses of the two experiments differ significantly atthe 5% level (p = 0.0076). Comparison of the measurement with and withoutamplitude modulation shows that the overall discrimination is hardly influ-enced by the additional amplitude modulation. The mean percentage correctresponses increased from 77% to 81%, but pairwise t-tests did not show a dif-ference at the 5% significance level for any of the stimulus pairs.

Perception of speed at constant sizeThe speed perception experiment reported in Section 2.4 was repeated withthe same procedure and with amplitude modulation applied to the stimuli asdescribed above. Eight subjects, who had not participated before, listened topairs of sounds and had to choose which of the two sounds was producedby the faster ball. All pairs were presented four times in random order, thatis, each subject participated in one session. Figure 4.3 depicts the results persubject as well as averaged over all subjects (shown in the bottom right panel).The subjects could clearly discriminate between rolling balls with differentspeeds. The Pc values for stimulus pairs are high, and reverse labeling (Pcbelow 50%) occurs only sporadically, resulting in mean Pc values (printed inthe top left corner of each panel) from 82% to 95%.

Figure 4.4 depicts the results of the speed perception experiment with originalsounds (Figure 2.3 of Section 2.4). The left panel shows the percentage correctresponses, Pc, averaged over all six subjects. As some subjects reversed thelabeling, the middle panel shows the average of the Pc values after folding,for each subject individually, values below 50% across the 50% level (e.g. a Pcvalue of 24% is flipped to 76%). In this way only the discriminability betweendifferent speeds is taken into account, and not the ability to label the soundscorrectly, but it facilitates comparison with Figure 4.3. The right panel showsthe Pc values averaged over the four best subjects (i.e. subjects 1, 2, 4, and 6),all with a mean Pc above 65% (which is the upper boundary of the 95% confi-dence interval for guessing). Comparison of the results depicted in Figure 4.3and the left panel of Figure 4.4 shows that discrimination between the soundsof rolling balls with different speeds improved by providing the sounds withjust audible amplitude modulation. Pc values are higher and none of the sub-jects reversed the labeling of the sounds. However, if reverse labeling of thesounds (Pc below 50%) is omitted from the results (middle and right panel ofFigure 4.4), the discrimination is only slightly improved by providing ampli-tude modulation.

78

Page 90: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.3 Experiments on size and speed perception revisited

mean Pc: 93%

75 100

100

100

100

100

100

100

88

88

100

88

100

100

63

100

100

100

100

63

100 subject 2 mean Pc: 92%

75 100

100

100

100

100

88

100

100

63

100

100

88

100

75

100

88

100

100

88

63 subject 3

mean Pc: 86%

88 88

100

100

88

88

75

100

88

25

88

100

100

88

50

100

100

100

75

88

75 subject 5 mean Pc: 85%

75 100

100

100

100

75

100

100

88

75

100

88

100

63

25

100

100

88

88

50

63 subject 6

.50 .63 .71 .79 .87 .93speed (m/s)

mean Pc: 83%

100 100

88

100

100

88

100

100

75

75

88

100

63

75

63

100

88

63

75

88

25 subject 8

.50 .63 .71 .79 .87 .93speed (m/s)

mean Pc: 89%

81 93

91

98

98

86

95

99

89

66

98

98

91

86

63

99

94

94

91

81

70 average

.36

.50

.63

.71

.79

.87sp

eed

(m/s

) mean Pc: 95%

100 100

100

88

100

100

100

100

100

63

100

100

100

88

75

100

100

100

100

100

88 subject 1

.36

.50

.63

.71

.79

.87

spee

d (m

/s) mean Pc: 90%

88 88

88

100

88

75

100

88

100

75

100

100

88

88

75

100

88

100

88

100

75 subject 4

.50 .63 .71 .79 .87 .93

.36

.50

.63

.71

.79

.87

spee

d (m

/s)

speed (m/s)

mean Pc: 82%

50 63

50

88

100

63

100

100

75

63

100

100

88

88

75

88

88

100

100

75

75 subject 7

Figure 4.3: Perception of the speed of rolling balls with artificially provided amplitude modu-lation. Percentage of correct responses is depicted per subject and per stimulus combination.The mean percentage of correct responses is given in the top-left corner of the panels. Thebottom right panel shows the results averaged over all subjects.

.50 .63 .71 .79 .87 .93

.36

.50

.63

.71

.79

.87

speed (m/s)

spee

d (m

/s) mean Pc: 69%

50 58

58

68

74

73

68

70

68

56

71

68

70

71

68

73

78

77

80

79

63 average

.50 .63 .71 .79 .87 .93speed (m/s)

mean Pc: 87%

60 78

74

92

96

88

92

95

91

65

98

96

95

89

79

96

97

98

95

91

67

.50 .63 .71 .79 .87 .93

mean Pc: 85%

73

94

9383

74

87

88

78

88

90

87

94

74

91

91

81

78

89

93

94

69

flipped average

speed (m/s)

best average

Figure 4.4: Results of previous experiment on perception of speed (depicted in Figure 2.3of Section 2.4), Pc values averaged over all subjects (left panel), Pc values flipped and thanaveraged over all subjects (middle panel), and Pc values averaged over the best four subjects(right panel).

79

Page 91: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

The results of the experiments on size and speed perception suggest that am-plification of the naturally present amplitude modulation improves the per-ception of rolling speed while the perception of size is not influenced.

4.4 Interaction experiment with amplitude modulation

4.4.1 Introduction

In the experiment presented in this section, the influence of angular speedon the perception of size and speed, when both size and speed are varied,is investigated. The angular speed depends on the size as well as the linearspeed of the rolling ball. Furthermore, when judging the speed of a rolling ballby its periodicity, people may judge the angular speed by ‘simply counting’ thenumber of rotations per second, or the linear speed by compensating for thesize of the rolling ball. If subjects focus on the angular speed, a small slowlyrolling ball will be confused with a large fast rolling ball, where slow and fastrefer to the linear speed.

4.4.2 Method

Sound recordings of wooden balls rolling over a wooden surface, recordedas described in Section 2.2, were used. Eleven sounds were chosen in such away that comparisons over three sizes, three linear speeds, and three angu-lar speeds are possible1. Figure 4.5 depicts the stimuli. Labels s1, s2, and s3,denote levels of size with diameters of approximately 35, 45, and 55, respec-tively. Labels v0, v1, v2, v3, and v4, denote levels of linear speed with averagevalues of 0.34, 0.49, 0.61, 0.78, and 0.93 m/s, respectively. Labels w0, w1, w2,w3, and w4, denote levels of angular speed with average values of 18.0, 21.3,27.9, 34.0, and 44.0 rad/s, respectively.

An amplitude modulation with modulation depth, md, of 0.3 and frequency,fm, of twice the rotation frequency was applied to the stimuli as described insection 4.2. The stimuli were presented pairwise. The duration of the stimuliwas 800 ms with 700 ms silence between them. The stimuli were faded in andout over 10 ms by means of a Hanning window. Subjects were seated in a

1Note that size, linear speed and angular speed are not three independent parameters.One parameter can be expressed as a function of the other two, for instance ! = v/r with !

the angular speed (rad/s), v the linear speed (m/s) and r the radius (m).

80

Page 92: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.4 Interaction experiment with amplitude modulation

Figure 4.5: Visualization of the stimulus domain. Each grey circle represents a stimulus. Thestrings represent the stimulus attributes with s1, s2, and s3 denoting increasing sizes, v0,v1, v2, v3, and v4 increasing linear speeds, and w0, w1, w2, w3, and w4 increasing angularspeeds.

soundproof booth and listened to the stimuli over headphones. They did notreceive any feedback about the correctness of their responses.

Twenty naive subjects who had not participated before were presented withall possible stimulus pairs composed of two different stimuli (the entire do-main minus the main diagonal) in random order, resulting in 110 pairs. Eachsubject participated in two experiments. In one experiment, which will be de-noted as SIZE, subjects had to choose the larger ball, and in the other, whichwill be denoted as SPEED, the task was to choose the faster rolling ball. It wasnot mentioned whether by faster the linear or angular speed was meant. Bothexperiments were divided into two sessions, only differing in the sound levelof the stimuli. In one session the sound levels of the stimuli were adjustedto an equal overall RMS value corresponding to 78 dB SPL. In the other ses-sion, the sound levels of the stimuli were not equalized but shifted by thesame amount to an average level of 78 dB SPL and ranging from 70.1 dB to82.9 dB SPL, so as to preserve the level differences between sounds. In thisway it is possible to test whether subjects use differences in intensity levels

81

Page 93: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

to discriminate between differences in sizes and speeds. The order of the twoexperiments (choosing the larger or the faster ball) as well as the order of thetwo sessions (RMS equalled or original sound levels) were counter balanced.Each session was preceded by five test pairs.

4.4.3 Results

The Pearson correlation between the responses of the two sessions (equalledand original sound levels) is 0.96 and 0.92 for SIZE and SPEED, respectively.Both are significant at the 1% level. The high positive correlations indicate thatsubjects’ responses for sounds equalled in RMS value hardly differed from theresponses for sounds with original sound level. Therefore the results of thetwo sessions are combined. The Pearson correlation between the responses ofSIZE and SPEED is –0.88 (significant at the 1% level), indicating that the soundjudged by subjects as being produced by the larger ball when asked for size,was generally also judged as being produced by the slower ball when askedfor speed.

Figure 4.6 visualizes the average responses of the subjects for SIZE. In eachpanel, one of the three parameters (size, linear speed, or angular speed) is thesame for the two stimuli within a pair (so results for pairs with a differencein size, linear speed, and angular speed are excluded). Its name and cate-gory levels are shown on top of the panel. In the upper panel the results arepresented for the three size categories, shown by the diagonal lines in Fig-ure 4.5. So, the first six data points show the proportion of ‘larger’ responsesto the six possible stimulus pairs from the four stimuli of size category 1. (InFigure 4.5 these stimuli are the four stimuli s1v0w1, s1v1w2, s1v2w3, ands1v3w4 running from the most left to the top of the stimulus configuration.)The two-digit numbers at the abscissa denote the remaining parameter valuesof the two stimuli in the pair that is considered, that is, one pair of indicesfor linear speed, and one pair of indices for angular speed, for each stimu-lus pair. Because size, linear speed and angular speed are not independent,only two of the three parameters are sufficient to describe the stimulus, andthus each stimulus can be expressed by the indices of the parameter kept thesame (size) and one of the two remaining parameters (linear speed or angularspeed). However, for the sake of completeness, the indices of both remainingparameters are shown on top of each other at the abscissa. The preference forthe stimulus denoted by the second index within the two-digit number at theabscissa is given by the ordinate. For example, the most left dot is the mean re-sponse for the stimulus pair with both stimuli from size category 1, and linear

82

Page 94: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.4 Interaction experiment with amplitude modulation

0

0.2

0.4

0.6

0.8

1pr

efer

ence

for

seco

nd in

dex 1 2 3 size

0112

1223

2334

0213

1324

0314

1212

2323

1313

1201

2312

3423

1302

2413

1403

linear speedangular speed

0

0.2

0.4

0.6

0.8

1

pref

eren

ce fo

r se

cond

inde

x 1 2 3 linear speed

1221

2310

1320

1232

2321

1331

1243

2332

1342

sizeangular speed

0

0.2

0.4

0.6

0.8

1

pref

eren

ce fo

r se

cond

inde

x 1 2 3 angular speed

1201

2312

1302

1212

2323

1313

1223

2334

1324

sizelinear speed

Figure 4.6: Results for SIZE, combined for pairs consisting of stimuli with the same size (toppanel), linear speed (middle panel), and angular speed (bottom panel). The two-digit num-bers at the abscissa denote the remaining parameter values of the two stimuli in the pair that isconsidered. The preference for the stimulus denoted by the second index within the two-digitnumber at the abscissa is given by the ordinate.

83

Page 95: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

speed of level 0 and 1 or, equivalently, angular speed of level 1 and 2 (s1v0w1and s1v1w2). A preference of 0.2 for this pair simply means that in 20% of allpresentations (40 in total) subjects chose s1v1w2 above s1v0w1. So subjectsclearly preferred the first, that is the ball with lower linear and angular speed(s1v0w1) when asked to choose the larger ball of two balls equal in size. Thisis the case for all sound pairs in the top panel of Figure 4.6, revealing that ifthe physical size of the ball is the same, and the subjects have to choose thelarger ball, they choose the slower ball (overall mean of 0.27). Subjects mayhave listened to information in the sound representing either linear or angularspeed.

The middle panel of Figure 4.6 presents the results for pairs consisting of stim-uli with the same linear speed, shown by the vertical encirclements in Fig-ure 4.5. The three categories of linear speed that are looked at are shown ontop of the panel. The abscissa shows the indices of size and angular speedof the two stimuli in the pair that is considered. The high preference (overallmean of 0.82) for the sound denoted by the second index in each pair, indi-cates that subjects have a preference for the larger ball or the ball with lowerangular speed. Note that the preference values in this panel can be consid-ered as proportion correct responses. The low values for the middle circleand middle square (proportion correct responses of about 0.64) indicate thatdiscrimination between large balls (size category 2 and 3) is difficult for lowlinear speeds (category 1 and 2).

In the bottom panel of Figure 4.6, the results are presented for the three an-gular speed categories shown by the horizontal encirclements in Figure 4.5.If the angular speed is the same within a pair, subjects choose the larger ballor the ball with higher linear speed (overall mean of 0.68). The overall meanof the preference is much lower than for pairs consisting of sounds equal inlinear speed, mainly due to the near chance responses (preference of 0.5) forthe lowest constant angular speed. The preference values for this panel areactually proportion correct responses. Apparently, subjects are not able to dis-criminate size for pairs consisting of sounds with the same low angular speed(category 1) given the near chance performance of the subjects.

Figure 4.7 visualizes the average responses of the subjects for SPEED. The dataare presented in the same way as in Figure 4.6 for SIZE. In the top panel,results are shown for pairs consisting of stimuli with the same size and forwhich both the linear speed and the angular speed differs. The high values(overall mean of 0.89) suggests that if subjects had to choose the faster balland the size is equal, they indeed choose the ball with higher speeds. The

84

Page 96: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.4 Interaction experiment with amplitude modulation

0

0.2

0.4

0.6

0.8

1pr

efer

ence

for

seco

nd in

dex 1 2 3 size

0112

1223

2334

0213

1324

0314

1212

2323

1313

1201

2312

3423

1302

2413

1403

linear speedangular speed

0

0.2

0.4

0.6

0.8

1

pref

eren

ce fo

r se

cond

inde

x 1 2 3 linear speed

1221

2310

1320

1232

2321

1331

1243

2332

1342

sizeangular speed

0

0.2

0.4

0.6

0.8

1

pref

eren

ce fo

r se

cond

inde

x 1 2 3 angular speed

1201

2312

1302

1212

2323

1313

1223

2334

1324

sizelinear speed

Figure 4.7: Results for SPEED, combined for pairs consisting of stimuli with the same size(top panel), linear speed (middle panel), and angular speed (bottom panel). The two-digitnumbers at the abscissa denote the remaining parameter values of the two stimuli in the pairthat is considered. The preference for the stimulus denoted by the second index within thetwo-digit number at the abscissa is given by the ordinate.

85

Page 97: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

preference values can be considered as proportion correct responses and thusindicate good performance (nearly 90% correct). It is still unclear whethersubjects focus on linear or angular speed (or both).

The middle panel depicts results for pairs consisting of stimuli with the samelinear speed. The overall mean preference is 0.19 revealing that subjectslargely chose the smaller ball or the ball with highest angular speed whenjudging sounds with equal linear speed. Now, two explanations are possible:subjects mistake small balls for fast rolling balls, or because of the differencein angular speed, subjects simply choose the ball that rotates faster, maybebecause subjects had to choose one of the two sounds in a pair, and no ‘don’tknow’ or ‘no difference’ option was available.

Results for pairs with equal angular speeds are shown in the bottom panelof Figure 4.7 (overall mean of 0.58). Subjects performed near change, exceptfor pairs with a low constant angular speed, for which the mean preference isabout 0.74. This indicates that, for these pairs, subjects judged larger balls orballs with higher linear speeds as being faster. But as the same subjects judgedthese pairs as composed of two rolling balls equal in size (see Figure 4.6, bot-tom panel, the three dots to the left), the subjects must have judged the speedon the basis of the actual linear speed, not of size. Note that if subjects judgethe linear speed when asked for the faster ball, the preference values for thetop and bottom panel are actually proportion correct responses.

The overall mean levels (folded across the 0.5 level if below 0.5) for judgmentof size for pairs consisting of sounds with the same size, linear speed, or an-gular speed (0.73, 0.82, and 0.68, respectively) indicate that the subjects’ pref-erence for one of the sounds in a pair is least pronounced in the absence of adifference in angular speed, followed by the absence of a difference in size. So,the responses used for compiling Figures 4.6 and 4.7 (which are only a part ofall the responses because pairs consisting of sounds that differed in size, linearspeed and angular speed were left out of the analysis) suggest that the influ-ence of angular speed on perceived size is slightly higher than physical size,the influence of which is higher than that of linear speed. In the same way, theoverall mean levels (folded across the 0.5 level if below 0.5) for judgment ofspeed for pairs consisting of sounds with the same size, linear speed, or angu-lar speed (0.89, 0.81, 0.58, respectively) suggest that the influence of angularspeed on perceived speed is higher than linear speed, the influence of whichis higher than that of size.

In order to be able to better separate the effects of size, linear speed, and angu-

86

Page 98: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.4 Interaction experiment with amplitude modulation

lar speed on the judgment of size and speed, the responses will be visualizedin an alternative way. Figures 4.8 (SIZE) and 4.9 (SPEED) show all eleven stim-uli as circles. The abscissa denotes the linear speed and the ordinate the angu-lar speed. All 110 possible stimulus pairs composed of two different stimuliwere presented twice (two sessions) to 20 subjects. For every stimulus, thepreference of this stimulus above all the other stimuli was calculated. Thiswas done by dividing the number of times subjects chose stimulus X by thenumber of stimulus pairs containing stimulus X (20 pairs x 2 sessions x 20subjects = 800). In this way, all responses to the pairwise presentation of thesounds are included twice. The preference per stimulus is displayed by thesize and the darkness of the circles as well as the value printed close by. Theaverage preference is 0.5 by definition.

0.3 0.4 0.5 0.6 0.7 0.8 0.9 115

20

25

30

35

40

45

0.22

0.60

0.31

0.13

55 mm

45 mm

0.54

0.42

0.67

35 mm

0.68

0.52

0.76

0.64

angu

lar

spee

d (r

ad/s

)

linear speed (m/s)

Figure 4.8: Overall preferences for the stimuli for SIZE. Circles represent stimuli. The abscissadenotes the linear speed and the ordinate the angular speed. Dotted lines connect soundsof rolling balls with the same size. For every stimulus the preference of this stimulus aboveall the other stimuli, averaged over subjects, is displayed by the size and the darkness of thecircles as well as the value printed close by.

Figure 4.8 shows that when judging the size of a rolling ball, the preference fora stimulus increases with increasing diameter of the ball and with decreasinglinear speed and angular speed. The influence of angular speed is higher thanthe influence of linear speed. These trends are statistically tested by a stepwise

87

Page 99: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

multiple regression of all three predictor parameters (size, linear speed, andangular speed) onto the experimental data (overall preference for a stimulus).The parameter that enters first is angular speed as its correlation with pref-erence is highest (see second column of Table 4.1). This single parameter ex-plains 69% of the variance (R2 = 0.687). The second and last parameter that en-ters is linear speed. Its partial correlation with preference while controlling forangular speed (i.e. the strength of the relationship between linear speed andpreference having accounted for their joint relationship with angular speed) of0.90 (p < 0.001) is slightly higher than the partial correlation between size andpreference controlled for angular speed (0.89, p < 0.001). Angular speed andlinear speed combined explain 94% of the variance (R2 = 0.939). However,if angular speed and size are entered as predictors in the regression model,the explained variance is 93% (R2 = 0.933), which is about equal to the ex-plained variance by the model with angular speed and linear speed. This isnot strange because each parameter (size, linear speed, and angular speed) isa function of the other two. Although size and linear speed are statisticallyabout equally valid as second parameter, it is conceivable that when judgingsize listeners use information in the sound that is related to the physical size.This also appears when looking at individual results (not shown). For 13 sub-jects, the preference for a stimulus correlates best with the angular speed (andall correlations are significant at the 1% level of 0.74), for the other 7 subjectsit correlates best with the size (and, again, all correlations are significant atthe 1% level). For each subject, the correlation between preference and linearspeed is smallest. Likewise, individual regression leads 9 times to a modelwith angular speed only, 5 times to parameter size only, 3 times to both sizeand angular speed, and 3 times to a model with angular speed as well as lin-ear speed as predictors. Thus, angular speed explains most of the variance ofthe pooled results but size (or linear speed) additionally explains some of thevariance that is left. This is in accordance with the conclusions drawn fromthe overall mean levels in Figure 4.6.

Table 4.1: Pearson correlations between the data in Figures 4.8 (SIZE) and 4.9 (SPEED) and thephysical parameters size, linear speed, and angular speed. Values with superscript * and **are significant at the 5% and 1% level, respectively.

SIZE SPEED

size 0.73� �0.22

linear speed �0.24 0.77��

angular speed �0.83�� 0.96��

88

Page 100: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.4 Interaction experiment with amplitude modulation

0.3 0.4 0.5 0.6 0.7 0.8 0.9 115

20

25

30

35

40

45

0.71

0.15

0.54

0.85

55 mm

45 mm

0.5

0.75

0.25

35 mm

0.33

0.74

0.13

0.56

angu

lar

spee

d (r

ad/s

)

linear speed (m/s)

Figure 4.9: Overall preferences for stimulus for SPEED. The preferences are presented in thesame way as in Figure 4.8 for SIZE.

Figure 4.9 shows that when judging the speed of a rolling ball, the preferencefor a stimulus increases with increasing angular speed. Decreasing the size orincreasing the linear speed only slightly increases the preference for the stim-ulus. These findings are confirmed by a stepwise multiple regression of allthree predictor parameters onto the experimental data. Only angular speedenters the regression. Not surprisingly, its correlation with the subjects’ pref-erence is higher than that of size or linear speed (see third column of Table 4.1).The variance explained by this single parameter is 93% (R2 = 0.925). Includ-ing size or linear speed does not significantly improve the regression model,that is, the small increase in explained variance by including a second param-eter does not countervail the increased complexity of the model. The partialcorrelation between size and preference controlled for angular speed (i.e. thecorrelation between size and preference if angular speed was held constant)of 0.32 (p = 0.36) and the partial correlation between linear speed and prefer-ence controlled for angular speed of 0.44 (p = 0.20) are both non-significant.Looking at individual results (not shown) leads to the same conclusion. Forall subjects but one, the correlation between preference and angular speed issignificant at the 1% level of 0.74, and higher than the other correlations. Like-

89

Page 101: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

wise, individual regression leads 19 times to a model with angular speed asonly predictor, and once to linear speed as only predictor.

4.4.4 Conclusion

The influence of the sound level factor (equalled vs original sound levels) onthe subjects’ responses was not significant. This indicates that the perfor-mance of the subjects was not affected by adjusting the sound levels to anequal overall RMS value. As described in Section 2.4.3, retaining the RMSvalue of the stimuli used in the speed perception experiment of Section 2.4improved the subjects’ performance considerably. This indicates that subjectswere able to correctly judge the speed of rolling balls with equal sizes by listen-ing to the sound levels of the sounds. However, in the experiment describedin the current section, both size and speed varied. Therefore, a difference inintensity level was not a reliable cue to discriminate between differences insizes or speeds, and accordingly subjects ignored that cue.

If subjects have to choose the larger ball out of two sound examples of rollingballs, they may have a preference for larger balls (as predicted by the revisitedexperiment on the perception of size in Section 4.3) or higher linear speeds, butthey mainly prefer lower angular speeds Apparently subjects associate fewerrotations per second (lower angular speed) with a larger ball. This does nothave to surprise us because, if two balls with different sizes roll at the samelinear speed, the larger ball will rotate fewer times in the same period of timecompared to the smaller ball. However, in the case of varying linear speed,the angular speed is not a reliable cue for size, as it depends on linear speedtoo. Nevertheless, subjects clearly judge the size on the basis of cues which arerelated to the angular speed. The two data points with low proportion correctresponses in the middle panel of Figure 4.6, which are from stimuli with lowlinear and angular speeds, and the near chance performance for stimuli withthe lowest angular speed in the bottom panel (corresponding to the lower leftpart of the stimulus domain depicted in Figure 4.5), indicate that discrimina-tion of size was more difficult at low speeds than at high speeds.

If subjects have to choose the faster ball out of two sound examples of rollingballs, they clearly prefer balls with higher angular speeds, not higher linearspeeds. Apparently subjects associate more rotations per second (higher an-gular speed) with a faster ball. In contrast to our intuitive assumption thatsubjects judge the linear speed of rolling when asked to discriminate the speedof the rolling ball, these experiments show that they give much more empha-

90

Page 102: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.5 Experiment with independent variation of angular speed

sis to the angular speed in such a judgment. In the stimulus examples usedin this experiment, the amplitude modulation frequency was chosen to matchthe natural angular speed. Therefore linear speed and angular speed show ahigh degree of covariation and this makes it difficult to decide whether linearspeed is relevant for the judgments. For this reason, we performed anotherexperiment in which the rate of amplitude modulation was varied indepen-dently of the linear speed and size of the rolling balls.

4.5 Experiment with independent variation of angularspeed

4.5.1 Introduction

In the recordings used for the experiments, audible amplitude modulationwas not prominent. In the previous experiment, therefore, it was emphasizedartificially, at a rate matching the natural angular speed. Consequently, theamplitude variations are naturally coupled with other temporal and possiblyspectral properties varying with rotations of the ball. In order to separate therole played by amplitude modulation in the judgments made by the subjects,the amplitude modulation will now be varied independently of these othervariables by amplitude modulating each sound at three different rates, one ofwhich is the natural rate.

4.5.2 Method

Sound recordings of wooden balls rolling over a wooden surface, recorded asdescribed in Section 2.2, were used. The experiment was divided into two ses-sions. In one session, denoted by SIZE, subjects had to choose the larger ballof two sounds. In the other session, denoted by SPEED, subjects had to choosethe sound produced by the faster rolling ball. For these two sessions differentstimulus sets were used. For SIZE three recordings of rolling balls with threedifferent sizes (diameters of 35, 45, and 55 mm), at practically the same lin-ear speed (0.62 m/s) were selected. The angular speeds were 34.8, 28.1, and22.3 rad/s, respectively. For SPEED three recordings of a rolling ball (diame-ter of 45 mm), rolling at three different speeds (0.50, 0.63, and 0.79 m/s) werechosen as stimuli. The angular speeds were 22.2, 28.1, and 35.1 rad/s, respec-tively. Because one rotation contains two maxima, amplitude modulation at arate corresponding to twice the rotation speeds, that is, angular speeds of 44,

91

Page 103: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

56, and 70 rad/s, were applied. This was done for each sound, resulting inthree versions of the sounds with only a difference in amplitude modulationrate, one of which was the natural rate. The value of the modulation depthwas set to 0.4. This value is somewhat higher than the one chosen in the pre-vious experiment. In that experiment the amplitude modulation rate matchedthe natural modulation rate. In the present experiment, amplitude modula-tion at three different frequencies (one of which was the natural frequency)was applied. To prevent the imposed amplitude modulation from being ob-scured by the natural amplitude modulation, a somewhat higher modulationdepth of 0.4 was used. Thus, the stimulus domain for the SIZE session con-sisted of 3 sizes by 1 linear speed by 3 angular speeds, whereas the stimulusdomain for the SPEED session consisted of 1 size by 3 linear speeds by 3 angu-lar speeds. This is visualized in Figure 4.10.

Figure 4.10: Visualization of the stimulus domain for the SIZE (left) and SPEED (right) session.Circles represent stimuli. In vertical direction the stimuli are derived from the same originalsound, but amplitude modulation with different rates is applied. Filled circles denote soundswith natural amplitude modulation rates.

As usual, the 800-ms sounds were presented pairwise with 700 ms silencebetween them. The stimuli were faded in and out over 10 ms by means of aHanning window. Subjects were seated in a soundproof booth and receivedthe stimuli over headphones. They did not receive any feedback about thecorrectness of their responses.

Twenty naive subjects who had not participated before were presented twicewith all possible stimulus pairs composed of two different stimuli (the entiredomain minus the main diagonal) in random order, resulting in 144 pairs.

92

Page 104: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.5 Experiment with independent variation of angular speed

Each subject participated in both sessions (choosing the larger or the fasterball) in random order. Each session was preceded by 9 test pairs covering therange of stimuli. The sound levels of the stimuli were adjusted to a constantoverall RMS value corresponding to 80 dB SPL.

4.5.3 Results

Figures 4.11 and 4.12 show the preferences for the stimuli, for SIZE and SPEED,respectively, in a similar way as Figures 4.8 and 4.9. All stimuli are shown ascircles, and the preference of each stimulus with respect to all other stimuliis displayed by the size and the darkness of the circle (the larger/darker thecircle the higher the preference) and accompanied by its printed value. Theordinate denotes the angular speed of the stimulus. In Figure 4.11 (SIZE) theabscissa denotes the size, and in Figure 4.12 (SPEED) it denotes the linear speedof the stimulus.

35 45 55

22

28

35

0.28

0.43

0.23 0.43

0.83

0.62 0.56

0.34

0.78

diameter (mm)

angu

lar

spee

d (r

ad/s

)

Figure 4.11: Overall preferences for SIZE. Circles represent the 9 stimuli. The abscissa denotesthe diameter and the ordinate the angular speed. For every stimulus the preference of thisstimulus above all the other stimuli, averaged over subjects, is displayed by the size and thedarkness of the circles as well as the value printed close by.

The overall preference for SIZE, shown in Figure 4.11, increases monotonicallywith increasing size and decreasing angular speed. The highest size prefer-ence is obtained for the largest ball (diameter of 55 mm) rolling at the lowest

93

Page 105: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

angular speed (22 rad/s). Correspondingly, the lowest size preference is seenfor the smallest ball (diameter of 35 mm) rolling at the highest angular speed(35 rad/s). These findings are confirmed by a stepwise multiple regression ofthe two predictor parameters (size and angular speed) onto the experimentaldata (overall preference for a stimulus). Both parameters enter the regression,with angular speed being the first one as its correlation with preference (–0.70,p = 0.036) is higher, in absolute value, than the correlation between size andpreference (0.64, p = 0.064). These two parameters combined explain 95% ofthe variance (R2 = 0.947). Apparently subjects do take temporal cues into ac-count, as they are inclined to judge rolling balls with lower angular speedsas being larger. By investigating the correlations per subject (not shown) itappears that for about half of the subjects the correlation between preferenceand angular speed is, in absolute values, above the 5% significance level corre-sponding to a correlation of 0.67, and the correlation between preference andsize is not. For the preference of the other half of the subjects it is just theother way around: a significant correlation with size, and a non-significantcorrelation with angular speed. As a result, individual regression leads tomodels with angular speed only (4 subjects), size only (7 subjects), and bothangular speed and size (9 subjects) as predictors, which is in accordance withthe regression on the pooled results leading to both angular speed and size aspredictors.

The overall preference for SPEED, shown in Figure 4.12, increases monotoni-cally with increasing angular speed. The influence of linear speed is generallymuch smaller and is unsystematic. This is confirmed by a stepwise multipleregression. Only angular speed enters the regression (R2 = 0.869). Its corre-lation with preference is 0.93 (p <0.001) which is much higher than the cor-relation between linear speed and preference (0.13, p = 0.73). As the partialcorrelation between linear speed and preference controlled for angular speed(i.e. the unique correlation between linear speed and preference which is notshared with angular speed) of 0.37 (p = 0.37) is very low, linear speed doesnot enter the regression as second parameter. The correlations analyzed persubject (not shown) agree with this. For each subject individually, the highestcorrelation with their stimulus preference is for angular speed (for all abovethe 5% significance level corresponding to a correlation of 0.67, and for half ofthem above the 1% significance level of 0.80). The correlation between prefer-ence and linear speed is below the 5% significance level for each subject. Thepartial correlation between preference and linear speed controlled for angularspeed, is above 0.5 for about a third of the subjects. However, only for threeof them it is above the 5% significance level of 0.71. Therefore, for most of the

94

Page 106: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.5 Experiment with independent variation of angular speed

0.5 0.63 0.79

22

28

35

0.12

0.40

0.80

0.23

0.54

0.86

0.37

0.51

0.67

linear speed (m/s)

angu

lar

spee

d (r

ad/s

)

Figure 4.12: Overall preferences for SPEED. Circles represent the 9 stimuli. The abscissa de-notes the linear speed and the ordinate the angular speed. For every stimulus the preferenceof this stimulus above all the other stimuli, averaged over subjects, is displayed by the sizeand the darkness of the circles as well as the value printed close by.

subjects (17 out of 20), individual regression results in a model with angularspeed as only predictor, and only for three subjects in a model with both an-gular speed and linear speed as predictors. So, although some of the subjectstake information related to the linear speed into account, most of them do not.

4.5.4 Conclusion

If angular speed is varied independently of linear speed and size, its influenceon the perception of size and speed is considerable. When judging the size,the subjects’ judgments not only correlated with the physical size of the rollingball, but also with the angular speed corresponding with the additional ampli-tude modulation. So by modulating the amplitude at a rate different from thatspecified by the size and linear speed of the rolling ball, the percept of size isinfluenced. When judging the speed of a rolling ball, the subjects’ judgmentscorrelated predominately with angular speed. By modulating the amplitudeat a rate different from the natural rate, the percept of speed is changed con-siderably.

95

Page 107: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

4.6 Discussion

In the experiments presented in this chapter, the influence of amplitude mod-ulation on the perception of size and speed, when both size and speed arevaried, was investigated. It appeared that if subjects have to judge the sizeof a rolling ball by listening to its sound (provided with amplitude modula-tion), they take both the physical size and angular speed of the rolling ball intoaccount. When judging the ‘speed’ of a rolling ball, the subjects’ judgmentscorrelated predominately with angular speed. Only a few of the subjects alsotake information related to the linear speed into account. The angular speedof a rolling ball is determined by its size and linear speed, and thus providesinformation about the size and linear speed of the ball. We had expected thatwhen judging the speed of a rolling ball, people would judge the linear speed,either directly, or indirectly by judging the angular speed and compensatingfor the size, because in everyday life the linear speed is important, as it is di-rectly related to the time to contact. On the other hand, our stimuli did notcontain spatial information and they were presented in isolation so that therewas no context which indicated (a change in) the distance between object andobserver. For the acoustic content chosen in our experiments, most of thesubjects apparently only attend to information in the sound representing theangular speed itself.

In the experiments, subjects had to judge a property of the sound source whichfor some of the pairs was kept constant. For instance, subjects were asked tochoose the sound that was produced by the larger of two balls, whereas some-times the physical size of the two balls was the same. In this situation subjectscan behave in two different ways. They can guess, resulting in an averagepreference of 0.5, and no correlation with any of the physical parameters ofthe sound. Or, they can base their judgments on another dimension that doesvary. The latter strategy may induce a response bias if subjects consistentlymatch one direction of the first dimension (e.g. speed) with one directionof the second dimension (e.g. size). In this condition, subjects would havea preference for congruent attributes at a semantic level. Melara and Marks(1990a, 1990b) have demonstrated a congruity effect between the primary di-mensions loudness, pitch and timbre: attributes from corresponding poles ofa dimension (e.g. high pitch and loud, ‘congruent’) are classified faster thanthose from noncorresponding poles (e.g. high pitch and soft, ‘incongruent’).However, as, to our knowledge, no previous experiments of this type havebeen performed with the kind of stimuli we use, it is unclear whether a con-gruity effect exists between the dimension speed and size, and what would be

96

Page 108: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.6 Discussion

a natural match of attributes that we would call congruent.

In the first experiment of this chapter, in which amplitude modulation wasapplied at a natural rate, a modulation depth, md, of 0.3 was chosen. As men-tioned in Section 4.5.2, in the experiment with independent variation of an-gular speed a modulation depth of 0.4 was chosen in order to prevent theimposed amplitude modulation from being obscured by the natural ampli-tude modulation. These values are somewhat arbitrary. However, low values(md < 0.2) as well as high values (md > 0.5) are not appropriate, as the am-plitude modulation should be audible2, yet not be too salient or intrusive.Furthermore, we assumed that the natural amplitude modulation rate equalstwice the rotation rate, as a ball that is not perfectly round, but more ovalshaped, will generate two maxima per rotation. Informal listening confirmedthe correctness of this assumption.

The influence of angular speed on the perception of size and speed may beexploited when synthesizing rolling sounds. As noted before, without ampli-tude modulation it may be difficult (just as for a real rolling sound) to identifya synthesized sound as being created by a rolling object at all. By providingthe sounds with amplitude modulation and changing the rate, it may be pos-sible to make the listener believe that the speed of the rolling ball has changed.

Although subjects mainly attend to angular speed, it is unclear whether subjectsintend to judge linear speed or angular speed when asked to judge speed. Thismay be uncovered by a between-subjects design with three different instruc-tions. One group of subjects should receive instructions like: ’A fast rollingball travels the same distance in a shorter amount of time than a slowly rollingball. We want you to choose the faster ball, that is, the one travelling faster.’Another group should be instructed like: ’As you probably know, a fast rollingball will rotate more often than a slowly rolling ball in the same amount oftime. We want you to choose the ball rotating faster.’ The third group of sub-jects should simply be told to ’choose the faster rolling ball’. By comparingthe results of the latter group with the first two groups, it may be possible todetermine whether subjects implicitly judge linear or angular speed when justasked for speed.

In previous experiments, an interaction effect between size and speed was

2The detection threshold of the modulation depth, md, for broadband white noise which isamplitude modulated at a frequency between 5 Hz and 15 Hz, is 0.05 (Bacon and Viemeister,1985). The modulation depths of 0.3 and 0.4 used in the perception experiments reported inthis chapter, are well above this value.

97

Page 109: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

found (Section 2.5) which was explained in terms of the spectral cues thatwere influenced both by changes in size and in linear speed. However, thefindings in the current chapter suggest a strong influence of angular speed.Therefore, these experiments should be analyzed again in terms of angularspeed. The same holds for the experiments with manipulated sounds of Chap-ter 3 in which the spectral content of a sound with specific size, linear speed,and angular speed was combined with the temporal content of another soundwith specific size, linear speed, and angular speed. Therefore the results of theseexperiments are reanalyzed in the next section.

4.7 Previous perception experiments revisited

In this section, the results of previous perception experiments will be reana-lyzed in terms of differences in angular speed. This corresponds to analysis 3in our research methodology depicted in Figure 1.1. Experiments I and II ofChapter 2 investigated the perception of size at constant speed (Section 2.3),and the perception of speed with constant size (Section 2.4), respectively. Asonly one physical attribute was varied (size in experiment I and linear speedin experiment II), the results of these experiments need not be reconsidered.A monotonic increase in size at constant speed corresponds to a monotonicdecrease in angular speed, and likewise, a monotonic increase in linear speedwith constant size corresponds to a monotonic increase in angular speed. Inexperiment III of Chapter 2, (Section 2.5) with recorded sounds of rolling balls,both the size and the speed of the rolling balls were varied. In Section 4.7.1,the results of this experiment will be reinterpreted in view of the findingspresented before in this chapter. The manipulated sounds of Chapter 3 wereconstructed from sounds of rolling balls with varying size and speed. For thisreason, the perception experiments reported in that chapter are reconsideredin Section 4.7.2.

4.7.1 Perception experiment III with recorded sounds revisited

The results of experiment III, shown in Figures 2.6 and 2.7 of Section 2.5, arepresented in a different way in Figures 4.13 and 4.14, for the two subexperi-ments labeled SIZE and SPEED, respectively. The gray circles represent stimu-lus pairs. The abscissa and ordinate give the angular speeds of the two soundsin a pair (on a logarithmic scale). The preference for the sound with angularspeed given by the ordinate is visualized by the size and shade of grey of the

98

Page 110: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.7 Previous perception experiments revisited

10 18 22 32 40 72

10

18

22

32

40

72

angular speed (rad/s)

angu

lar

spee

d (r

ad/s

)

SIZE s3v1 s2v1 s3v3 s1v1 s2v3 s1v3

s3v1

s2v1

s3v3

s1v1

s2v3

s1v3

Figure 4.13: Influence of angular speed on the responses of experiment III in Chapter 2 forSIZE. The gray circles represent stimulus pairs. The abscissa and ordinate give the angularspeeds of the two sounds in a pair on a logarithmic scale. The higher the preference for thestimulus with angular speed given by the ordinate over the stimulus with angular speedgiven by the abscissa, the larger and darker the circle. Each pair is depicted twice, once aboveand once below the diagonal. For every stimulus at the abscissa and ordinate, its label isshown on top and to the right of the figure.

circles (the larger and darker the circle the higher the preference for the soundgiven by the ordinate). The labels of the stimuli with angular speeds givenby the abscissa are shown on top of the figure. To the right of the figure, thelabels of the stimuli with angular speeds given by the ordinate are shown. Apair of sounds with the same angular speed would lie on the diagonal fromthe bottom left to the top right corner, shown by a dotted line. None of thecircles lies on the diagonal indicating that every pair consisted of two soundswith a difference in angular speed. Each pair is depicted twice, once aboveand once below the diagonal.

The performance of the subjects for SIZE was very good, as shown by the veryhigh percentage correct responses, Pc, in Figure 2.6, except for the pair com-prising s2v1 and s3v3. If subjects base their size judgments on the angular

99

Page 111: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

10 15 22 32 48 72

10

15

22

32

48

72

angular speed (rad/s)

angu

lar

spee

d (r

ad/s

)

SPEED s3v1 s3v2 s3v3 s1v1 s1v2 s1v3

s3v1

s3v2

s3v3

s1v1

s1v2

s1v3

Figure 4.14: Influence of angular speed on the responses of experiment III in Chapter 2 forSPEED. The results are presented in the same way as in Figure 4.13 for SIZE.

speed only, that is, if they always prefer the sound with lowest angular speed,the preference for stimuli depicted below the diagonal in Figure 4.13 wouldbe very high (and the preference for stimuli above the diagonal would be verylow). It can be seen that this is indeed the case for all but two stimulus pairs.If people mostly choose the larger ball in each pair, exactly these two pairsshould have a high preference above the diagonal and a low preference belowthe diagonal, opposite to all other stimulus pairs. So clearly, subjects prefer thelarger ball when asked to choose the sound generated by the larger ball. Thepreference for the pair comprising s2v1 and s3v3, which is the only pair witha moderate Pc value (68%), is less extreme than for the other pairs. Probably,to some extent, subjects are misled by the higher angular speed of the largerball.

Figure 4.14 shows the results for SPEED. The preference for the stimulus withangular speed given by the ordinate over the stimulus with angular speedgiven by the abscissa, decreases neatly from the top left to the bottom rightcorner. This means that if subjects are presented with two rolling sounds and

100

Page 112: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.7 Previous perception experiments revisited

0.4 0.6 0.8 10

20

40

60

80

angu

lar

spee

d (r

ad/s

)

linear speed (m/s)

SIZE

s1v1

0.12

0.083

s1v3

0.48 s2v3

0.56 s2v1

s3v1 0.90

s3v3 0.86

Figure 4.15: Overall preferences for the stimuli of experiment III in Chapter 2 for SIZE. Circlesrepresent stimuli. The abscissa denotes the linear speed and the ordinate the angular speed.Dotted lines connect sounds of rolling balls with the same size. For every stimulus its label isshown. The preference of this stimulus above all the other stimuli, averaged over subjects, isdisplayed by the darkness of the circles as well as the value printed close by.

they have to choose the faster ball, they prefer the one produced by the ballwith higher angular speed. The larger the difference in angular speed, themore decisively the subjects choose the higher angular speed. However, forthree stimulus pairs, the ball with higher angular speed is not the one withhigher linear speed, namely pairs (s1v1, s3v2), (s1v1, s3v3), and (s1v2, s3v3).As a result, for these pairs the percentage correct responses in terms of linearspeed, as depicted in Figure 2.7, are below 50%. For all other pairs, with a Pcabove 50%, the preferred ball with higher linear speed also has a higher an-gular speed. In summary, the diminished performance for some of the pairs,which, in Chapter 2, was attributed to an interaction effect between size andlinear speed, can be fully explained by a preference for the sound producedby the ball with higher angular speed when asked to choose the faster ball.

In Figures 4.15 and 4.16, the overall preference of a stimulus above all theother stimuli is shown in a similar way as in Figures 4.8 and 4.9, for SIZE andSPEED respectively. In Table 4.2, the Pearson correlations between the overallpreference and size, linear speed, and angular speed are listed. A stepwise

101

Page 113: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

0.4 0.6 0.8 10

20

40

60

80

linear speed (m/s)

angu

lar

spee

d (r

ad/s

)

SPEED

s1v1

s1v2 0.58

0.82 s1v3

0.45

0.35

0.56 s3v3

s3v1 0.25 s3v2

Figure 4.16: Overall preferences for the stimuli of experiment III in Chapter 2 for SPEED. Thepreferences are presented in the same way as in Figure 4.15 for SIZE.

multiple regression of all three predictor parameters (size, linear speed, andangular speed) onto the overall preference for a stimulus was performed. ForSIZE, only parameter size enters the regression, resulting in a model that ex-plains 94% of the variance (R2 = 0.943). This confirms that listeners prefer thelarger ball when asked to choose the sound generated by the larger ball, andmostly neglect the linear and angular speed. This contrasts the findings ofSections 4.4 and 4.5, namely that when judging the size, listeners pay mostattention to information in the sound representing the angular speed, and lessattention to the size and linear speed. However, in these latter experiments,

Table 4.2: Pearson correlations between the data in Figures 4.15 (SIZE) and 4.16 (SPEED) andthe physical parameters size, linear speed, and angular speed. Values with superscript ** aresignificant at the 1% level.

SIZE SPEED

size 0.97�� �0.63

linear speed �0.057 0.75

angular speed �0.76 0.92��

102

Page 114: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.7 Previous perception experiments revisited

amplitude modulation was added artificially and thus was more clearly audi-ble. Evidently, by making the amplitude modulation more apparent, listenerstake it more into consideration when judging the size.

For SPEED, both angular speed and linear speed enter the regression (angularspeed first), resulting in a model that explains 98% of the variance (R2 = 0.979).So, listeners attend to information in the sound representing angular speed,but they also attend to linear speed. The experiments in Sections 4.4 and 4.5,in which sounds artificially provided with amplitude modulation were used,showed that listeners mainly attend to angular speed. Evidently, by makingthe amplitude modulation more apparent, listeners take it more excessivelyinto consideration when judging speed.

4.7.2 Perception experiments with manipulated sounds revisited

In Figures 4.17 and 4.18, the overall preferences for the stimuli of the pilot ex-periment in Chapter 3 are shown for SIZE and SPEED, respectively. The overallpreferences for ‘unmanipulated’ sounds of the main experiment in Chapter 3,that is, sounds created by combining the spectral and temporal content of thesame sound (original recordings passed through a Gammatone filterbank andresynthesized without further manipulation), are very similar.

Table 4.3: Pearson correlations between the data in Figures 4.17 (SIZE, pilot experiment) and4.18 (SPEED, pilot experiment). Correlations for the main experiment in Chapter 3 are alsoshown. Values with superscript * are significant at the 5% level.

SIZE SPEED

pilot main pilot main

size 0.95� 0.99� �0.59 �0.55

linear speed �0.066 �0.024 0.60 0.63

angular speed �0.86 �0.82 0.95� 0.94

Table 4.3 lists the Pearson correlations between the data in Figures 4.17 (SIZE,pilot experiment) and 4.18 (SPEED, pilot experiment) and the physical parame-ters size, linear speed, and angular speed. Also shown are the correlations forthe main experiment in Chapter 3 (SIZE and SPEED, main experiment). Thesecorrelations resemble the correlations of experiment III described in the previ-ous section (Table 4.2). As we have even fewer data points entering the anal-ysis, the highest correlations only reach significance at the 5% level. For SIZE,the highest (significant) correlation is with parameter size, and for SPEED, the

103

Page 115: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

0.4 0.6 0.8 1

20

30

40

50

linear speed (m/s)

SIZE

angu

lar

spee

d (r

ad/s

)

s1v2

s1v1

0.092

0.34

s2v2 0.78

0.76 s2v1

Figure 4.17: Overall preferences for the stimuli of the pilot experiment in Chapter 3 for SIZE.Circles represent stimuli. The abscissa denotes the linear speed and the ordinate the angularspeed. Dotted lines connect sounds of rolling balls with the same size. For every stimulusits label is shown. The preference of this stimulus above all the other stimuli, averaged oversubjects, is displayed by the darkness of the circles as well as the value printed close by.

0.4 0.6 0.8 1

20

30

40

50

linear speed (m/s)

SPEED

angu

lar

spee

d (r

ad/s

)

s1v2

s1v1

0.94

0.38

s2v2 0.42

0.30 s2v1

Figure 4.18: Overall preferences for the stimuli of the pilot experiment in Chapter 3 for SPEED.The preferences are presented in the same way as in Figure 4.17 for SIZE.

104

Page 116: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4.7 Previous perception experiments revisited

angular speed. So, when judging the size, listeners pay most attention to infor-mation in the sound representing the physical size. When judging the speed,listeners pay most attention to the angular speed. This conclusion agrees withthe one in the previous section.

4.7.3 Summary

If listeners have to judge the size of a rolling ball by its sound, they mainlyattend to information in the sound representing the physical size. However,if the angular speed is made more apparent by artificially adding amplitudemodulation, listeners take both the angular speed and the physical size intoconsideration, giving angular speed most of the weight. If listeners have tojudge the speed of a rolling ball by its sound, they apparently pay most atten-tion to information linked to angular speed, and, to a lesser degree, to linearspeed. The interaction effect between size and linear speed as observed inChapter 2, can be fully explained by a preference for the sound produced bythe ball with higher angular speed when asked to choose the faster ball. Byartificially adding amplitude modulation, the listeners’ preference for higherangular speeds when asked to choose the faster ball is increased even more atthe cost of their preference for higher linear speeds.

105

Page 117: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

4 The influence of angular speed

106

Page 118: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

5General discussion

Page 119: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

5 General discussion

RecapitulationListeners can discriminate differences in the size and the linear speed ofwooden rolling balls on the basis of recorded sounds. The reverse labelingobserved for some subjects when judging speed, could be prevented by in-cluding the loudness cue. If both size and linear speed vary, the judgmentsare influenced by an interaction effect between the two physical properties ofthe balls.

If the rolling sound contains distinct amplitude modulation (naturally presentdue to deviation from perfect sphericity or artificially provided) listeners takeit into account when discriminating between different sizes as well as betweendifferent speeds of the rolling ball. If listeners have to choose the larger ball,they not only have a preference for larger balls, but they also have a strongpreference for lower angular speeds. When judging speed, listeners attendeven more to the angular speed. If listeners have to choose the faster ball, theypredominately prefer balls with higher angular speeds. This can explain theapparent interaction between size and speed, mentioned above. If amplitudemodulation is added independently of the natural angular speed, the perceptof speed can be changed considerably. Temporal cues other than amplitudemodulation did not seem to play a role in our perception experiments.

If the sounds are selected to contain only a minimum amount of amplitudemodulation, spectral cues in the sounds dominate the judgments of bothspeed and size. Which spectral cues listeners use for discriminating betweendifferences in the size and the speed is still unclear. We think the centroidof specific loudness and the spectral tilt play an important role in listeners’judgment of the size and the speed of a rolling ball.

Research methodologyAn expanding body of research is dedicated to the physiological, percep-tual, and cognitive aspects of audition (for overviews see for example Blauert1997; Bregman 1990; Deutsch 1999; Handel 1989; McAdams and Bigand 1993;Moore 1997; Kramer et al. 1999; Warren 1999). With the exception of musicand speech, most research on auditory perception has focussed on the percep-tion of simple stimuli. However, most of the sounds we hear in daily life arecomplex sounds, consisting of many frequencies, changing in time and situ-ated somewhere in space. Although listeners are surprisingly good at iden-tifying particular sound source properties from those ‘rich’ sounds, relativelylittle is known about the perception of such auditory events. Furthermore,with the exception of applications in the field of entertainment, such as com-

108

Page 120: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

5 General discussion

puter games, the use of sound in human-computer interaction is often lim-ited to simple playback of sounds. An example is the arbitrary associationof sound files with system events. This kind of application may be ‘funny’to many users, but after prolonged use (or even after a few playbacks), thenovelty is gone and annoyance and irritation may set in.

In this thesis we investigated the perception of the size and the speed ofrolling balls. This research was carried out in the framework of the researchmethodology described in Section 1.5 (Figure 1.1). First, we investigated thelink between an acoustic source event (rolling ball) and auditory perception(discrimination between different sizes and speeds) by presenting recordedrolling sounds to listeners. We determined whether listeners are able to dis-criminate the sounds of slowly rolling balls from those of more rapidly rollingballs and the rolling sounds of large balls from those of small balls (analysis 1).Then we searched for available auditory cues within the sounds that listenerscould have used in their judgments of size and speed, that is, from the acous-tic source event we tried to extract acoustic structures specifying the size andthe speed of a rolling ball (analysis 2). Finally, the sounds were manipulatedand in perception experiments we determined the influence of this manipu-lation on the discrimination between different sizes and speeds (analysis 3).So we went through all analysis stages of our methodology for researchingauditory event perception. By closing this research cycle we could definitelypoint to the acoustic properties of the rolling sounds listeners use in makingtheir judgments. For instance, we could show for nearly perfectly sphericalballs that the listeners mainly use spectral properties of the rolling soundsin judging the size and the speed of rolling balls. Hence, the results of thesemanipulation–perception experiments strongly indicate that listeners estimatethe size of a rolling ball on the basis of spectral properties.

Many other studies discussed in the introduction of this thesis, concentrateon only one or two of the analysis stages. For instance, in the article on theperception of filling (Cabe and Pittenger, 2000), the authors presumed thatmodal frequencies constitute the acoustic structure yielding the fill height ofthe vessel, but they did not verify it by manipulating the modal frequenciesindependently of other sound characteristics and examining the consequenceon the perceived fullness of the vessel. Although in this thesis only a smallpart of the perception of rolling sounds (size and speed) was tackled, andnot all stages of the methodology were analyzed in complete depth, we thinkthat this methodology has proven to be useful and valuable for research onauditory event perception.

109

Page 121: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

5 General discussion

Auditory cuesTo prevent our listeners from learning to link any arbitrary feature in thesound to the variable that is controlled, they did not receive any feedbackabout the correctness of their responses. Furthermore, the listeners were naivewith respect to the experiment, they were not familiar with the sounds, andparticipated in only one experiment. The number of trials was kept as low aspossible. In this way, listeners had to base their judgments of the sounds onintuition, and could not rely on skill or familiarity acquired during the currentor previous experiments. In their judgments, listeners could use three typesof auditory cues that are possibly available in the rolling sounds, namely, cuesin the spectral domain, cues in the temporal domain, and the loudness cue.

The spectral content of the sound is an important source of information whenjudging the size or speed of a rolling ball by listening to the sound it gener-ates. But what in the spectrum specifies the perceived size or speed of a rollingball? A first step to extract a possible spectral cue was computing the centroidof specific loudness, which is a single-number descriptor of the spectral con-tent. From all measures we tried, this was the one that best correlated with thejudgments of the listeners. However, it is unlikely that listeners’ use of spec-tral information to judge source attributes of rolling objects is limited to onesingle-number descriptor, as it seriously reduces all spectral differences thatcould be obtained from comparing two spectra. Many other spectral charac-teristics may be of importance for judging the size or speed, such as spectralslope and the presence of resonance frequencies. In order to establish theirinfluence on perception, the spectral characteristics must be manipulated andperception experiments with these manipulated sounds should be conducted.

We also searched for a measure of the temporal content of a sound that variedclearly and monotonically with varying size and speed. Although sometimesaudible, amplitude modulation in the sound could not be detected by signalanalysis. A count of the number of peaks above a certain level at least allowedto find some covariation with changes in linear speed but not with size. How-ever, calculation of auditory roughness revealed a clear dependence on sizeand speed. Although the differences in roughness between stimuli used inour experiments were mostly below the threshold of detectability, roughnessvaried properly with size and speed.

An additional cue may be the loudness of the rolling sounds. Generally, if thematerial of the ball and the supporting surface are kept the same, a large ballproduces a louder sound than a small ball rolling at the same speed, and a fastrolling ball produces a louder sound than a slow rolling ball with the same

110

Page 122: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

5 General discussion

size. In order to prevent listeners from attending to relative differences insound level, the sounds in all but two perception experiments were presentedat the same sound pressure level. Although the perceived loudness of thesounds may still differ slightly because loudness perception is frequency de-pendent, we expect no considerable changes in performance due to the smalldifferences in loudness. The experiment on the perception of speed in Sec-tion 2.4 showed a better performance in terms of percentage correct responses,if the loudness cue was available. Furthermore, none of the listeners reversedthe labeling of the sounds which was observed for some listeners in the sim-ilar experiment in which the level was kept constant. On the other hand, theinteraction experiment with sounds provided with amplitude modulation inSection 4.4 showed no difference in results when the sound level cue was re-tained. Possibly, listeners did not need the sound level cue due to the presenceof amplitude modulation or the variation of both the size and the speed (botheffectuating a variation in sound level).

In practice, it may be easier to manipulate the percept of speed by providingamplitude modulation than by changing spectral characteristics of the sound.However, a considerable effect can only be obtained if the original sound doesnot contain much amplitude modulation by itself, or if the amplitude modu-lation already present in the sound is first cancelled out.

The use of auditory cues in natural environmentsIn everyday life most events are accompanied by information flows in morethan one sensory modality. For instance, we not only see objects interactingwith each other, but also hear accompanying sounds. Usually, multiple (abun-dant) cues are available, which means that if we miss one, for instance due tovisual occlusion or auditory noise, several hints remain of what kind of ob-ject we encountered or what event happened. Most sounds that we hear inisolation, without the information from other sensory modalities, can be inter-preted in more than one way. The Foley artist has made his or her professionfrom this. For instance, by mimicking thunder by oscillating a plate of metal,or a horse galloping sound by rhythmically hitting two hollow coconuts to-gether, the listener can be ‘fooled’. A naturally occurring sound presented inisolation may become ambiguous if information from other sounds or sensesas well as environmental modification clues are missing (Deutsch, 1983). Nor-mally, the context in which the sound is heard, will resolve this ambiguity. Ifthe image we see, the sound we hear, and the object we feel, all match eachother, they provoke a strong mental image of the object or event.

111

Page 123: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

5 General discussion

If we only consider the auditory modality, as we did in the experiments de-scribed in this thesis, still multiple cues may be available. For instance, somecues express themselves in the spectral domain while other cues express them-selves in the temporal domain. Consequently, if one cue is missed, there areother cues left. An example is spatial hearing. Many cues can play a role intelling the observer where a sound comes from. In everyday life, we listenbinaurally. This gives rise to spatial cues like interaural time delays, interau-ral intensity differences, and reflective characteristics of the head, pinnas andtorso. These cues together will specify the direction from which a sound isheard. The distance between the sound source and the listener is specifiedby the proportion of the intensity of the direct sound and the intensity of thereflected sound. If the sound-generating event moves, intensity changes dueto movements of the object may give some information. Furthermore, headmovements by the listener induce variations in above-mentioned cues, whichmay give additional information on where the sound comes from.

Also for the stimuli used in our experiments a link exists between the per-cepts associated with different physical dimensions. The perceived speed of arolling ball that is not perfectly spherical is not independent of its size; smallballs do not need to roll as fast (in terms of linear speed, the travelled distanceper second) as large balls to be perceived as rolling fast. An ecological parallelcan be found: a small animal (like an ant) that we consider as running veryfast may have a lower linear speed than a large animal (like a hippopotamus)that we consider as strolling at a slow pace. In judging the speed of an animalwe are inclined to scale it with respect to its size. We presumably do not judge‘absolute speed’, but ‘normalized speed’. This is analogous to the conclusionof our experiments that in the case of a rolling sound with amplitude mod-ulation, we do not judge the linear speed, but the angular speed. Althoughrolling objects are rather rare in nature, several objects made by humans dorotate or have rotating parts. The resulting periodic sounds may provoke apercept of rotation at a specific speed. If the periodicity within the sound ishigher than about 30 to 50 Hz, it results in a pitch which likewise informs usabout the speed of rotation of the object or machine part (e.g. the number ofrevolutions inside the engine of a car). Even temporal patterns produced bynon-rotating objects may provoke a clear percept of speed. For instance, weare able to tell the pace of a walker by listening to the sequence of footsteps(though the actual speed also depends on the length of the step). Besides onvarious auditory cues, the listeners’ judgments of the size and speed of anobject will very often also be based on what they know about an object. Ifthey know the object they will generally have a clear idea about its size and

112

Page 124: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

5 General discussion

about the speed with which it normally moves. This may explain some of the‘normalization’ we observed in the high correlations with the angular speedas derived from the natural or added amplitude modulation rate in the rollingsound. Hence, such ‘top-down’ knowledge will also have influenced the lis-teners’ judgments.

Future researchWe asked listeners to judge the size and the speed of a rolling ball on thebasis of the sounds generated by the rolling process. However, the vibrationof the wooden plate actually produced the sounds, not the ball itself. Theball is, so to speak the source, and the plate the resonator (source-resonatormodel). Probably the size of the plate is also of importance for the auditoryperception of the size of a rolling ball. Whether this is indeed the case, shouldbe investigated by perception experiments.

A rolling ball slowing down may be simulated by amplitude modulating thesound at a decreasing rate, possibly accompanied by a decrease in overall am-plitude. This is a very convincing trick that even works for simple syntheticrolling sounds instead of a recorded rolling sound (Hermes, 2000).

When using rolling sounds in auditory interfaces, the perception of physicalproperties other than size and speed may be of interest. Examples are the au-dibility of surface textures and the auditory perception of material. Regardingthe latter, interesting research questions may be: What is the influence of ma-terial type on the perception of size and speed? Can we perceive the materialof a ball by listening to the sound it produces when rolling? Or do we perceivethe material of the surface on which the ball rolls? Or both?

In future applications, the usefulness of the sounds should be tested. Is thesound of a rolling ball convenient to convey information? Does performanceincrease by adding sounds? Do the sounds annoy the user or people nearby?To avoid annoyance, the sound should at least be non-intrusive (e.g. not tooloud), and should vary from one presentation to another.

The range of possibilities increases considerably if we are able to synthesizerolling sounds with adjustable physical parameters like size, speed, material,and surface texture. With such an algorithm many interesting applicationsand experiments are possible, especially if the synthesis algorithm enables usto independently vary cues used to identify and judge physical properties ofa rolling ball. For example, with such a synthesis method, the space of soundsthat are perceived as natural (‘natural-sound perception’) may be defined and

113

Page 125: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

5 General discussion

compared to the space of sounds that are generated by physically possibleprocesses (‘natural sound-synthesis’).

When using rolling sounds in computer interfaces, the sounds may not haveto be fully realistic. Some characteristics may be exaggerated to emphasize(changes in) physical attributes. In other words, the sounds can be caricaturesof natural sounds, just like Foley sounds (post-production recorded sounds inmovies, such as punches and footsteps) are often exaggerated for extra effect.For instance, fighting scenes are usually accompanied by loud added thudsand slaps. Similarly, extra (excessive) amplitude modulation can be appliedto rolling sounds to enhance the percept of rolling. However, in advancedmulti-media applications like virtual reality, natural rolling sounds are likelyto be preferred.

EpilogueWe conclude that, in judging the size and the speed of a moving object, lis-teners take a number of cues into account. The weight attributed to each cuedepends on the context and the information the listeners already have aboutthe sound-producing event. In natural situations, the cues will be closely re-lated and one will covary with many others. By decorrelating these cues andstudying the responses to these synthesized or manipulated new sounds, wecan unravel something of the complex hierarchy of auditory information thelisteners use in reconstructing what happens around them on the basis of audi-tory information. This thesis has uncovered some of the processes taking placein this reconstruction and has emphasized the necessity of a sound method-ology. On the other hand, it will be clear that a lot of research will have to becarried out before we can really say that we understand the process that takesplace when a listener judges the size and the speed of a rolling object, on thebasis of its sound.

114

Page 126: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Bibliography

Bacon, S. P. and N. F. Viemeister (1985). Temporal modulation transfer func-tions in normal-hearing and hearing-impaired subjects. Audiology 24,117–134.

Baecker, R. M. and W. A. S. Buxton (1987). The audio channel. In: R. M.Baecker and W. A. S. Buxton (Eds.), Readings in Human-Computer Inter-action: A Multidisciplinary Approach, Chapter 9, pp. 393–399. San Mateo:Morgan Kaufmann Publishers.

Ballas, J. A. (1993). Common factors in the identification of an assortment ofbrief everyday sounds. Journal of Experimental Psychology: Human Percep-tion and Performance 19, 250–267.

Barras, S. (1997). Auditory information design. Ph.D. dissertation, AustralianNational University, Canberra, Australia.

Blattner, M. M., A. L. I. Papp, and E. P. Glinert (1994). Sonic enhancements oftwo-dimensional graphic displays. In: G. Kramer (Ed.), Auditory display:sonification, audification and auditory interfaces. Proceedings of the Interna-tional Conference on Auditory Display 1992, pp. 447–470. Reading, MA:Addison-Wesley.

Blattner, M. M., D. A. Sumikawa, and R. M. Greenberg (1989). Earcons andicons: their structure and common design principle. Human-ComputerInteraction 4, 11–44.

Blauert, J. (1997). Spatial hearing: the psychoacoustics of human sound localization.Cambridge, MA: MIT Press.

Bly, S. (1994). Multivariate data mappings. In: G. Kramer (Ed.), Auditory dis-play: sonification, audification and auditory interfaces. Proceedings of the Inter-national Conference on Auditory Display 1992, pp. 405–416. Reading, MA:Addison-Wesley.

Bregman, A. S. (1990). Auditory scene analysis: the perceptual organization of

115

Page 127: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Bibliography

sound. Cambridge, MA: MIT Press.

Brewster, S. A. (1994a). A detailed investigation into the effectiveness ofearcons. In: G. Kramer (Ed.), Auditory display: sonification, audificationand auditory interfaces. Proceedings of the International Conference on Audi-tory Display 1992, pp. 471–498. Reading, MA: Addison-Wesley.

Brewster, S. A. (1994b). Providing a structured method for integrating non-speechaudio into human-computer interfaces. Ph.D. dissertation, University ofYork, York, UK.

Brewster, S. A. (1997). Using non-speech sound to overcome informationoverload. Displays 17, 179–189.

Brewster, S. A., P. C. Wright, and A. D. N. Edwards (1994). The design andevaluation of an auditory-enhanced scrollbar. In: B. Adelson, S. Dumais,and J. Olson (Eds.), Proceedings of CHI’94, pp. 173–179. Boston, MA: ACMPress/Addison-Wesley.

Brewster, S. A., P. C. Wright, and A. D. N. Edwards (1995). Parallel earcons:Reducing the length of audio messages. International Journal of Human-Computer Studies 43, 153–175.

Brown, M. L., S. L. Newsome, and E. P. Glinert (1989). An experiment into theuse of auditory cues to reduce visual workload. In: Proceedings of CHI’89,pp. 339–346. Reading, MA: ACM Press/Addison-Wesley.

Bussemakers, M. P. and A. de Haan (2000). When it sounds like a duck andit looks like a dog... auditory icons vs. earcons in multimedia environ-ments. In: Proceedings of the International Conference on Auditory Display2000.

Buxton, W. (1989). Introduction to this special issue on nonspeech audio.Human-Computer Interaction 4, 1–9.

Cabe, P. A. and J. B. Pittenger (2000). Human sensitivity to acoustic informa-tion from vessel filling. Journal of Experimental Psychology: Human Percep-tion and Performance 26, 313–324.

Carello, C., K. L. Anderson, and A. J. Kunkler-Peck (1998). Perception of ob-ject length by sound. Psychological Science 9, 211–214.

Chaigne, A. and C. Lambourg (2001). Time-domain simulation of dampedimpacted plates. I. theory and experiments. Journal of the Acoustical Soci-ety of America 109(4), 1422–1432.

Cohen, J. (1994). Monitoring background activities. In: G. Kramer (Ed.), Au-ditory display: sonification, audification and auditory interfaces. Proceedings of

116

Page 128: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Bibliography

the International Conference on Auditory Display 1992, pp. 499–532. Read-ing, MA: Addison-Wesley.

Cox, D. R. and E. J. Snell (1989). Analysis of binary data. London, UK: Chapmanand Hall.

Daniel, P. and R. Weber (1997). Psychoacoustical roughness: implementationof an optimized model. Acustica 83, 113–123.

Deutsch, D. (1983). Auditory illusions, handedness and the spatial environ-ment. Journal of the Audio Engineering Society 31, 607–620.

Deutsch, D. (Ed.) (1999). The psychology of music (2nd ed.). San Diego, Califor-nia: Academic Press.

DiGiano, C. J. and R. M. Baecker (1992). Program auralization: sound en-hancements to the programming environment. In: Proceedings of theGraphics Interface’92, pp. 44–52.

Dombois, F. (2001). Using audification in planetary seismology. In: Proceed-ings of the International Conference on Auditory Display 2001, pp. 227–230.

Edwards, A. D. N. (1989). Soundtrack: an auditory interface for blind users.Human-Computer Interaction 4, 45–66.

Eggen, J. H. (1995). The sound of walking. Internal IPO report 1084. Eind-hoven, The Netherlands.

Flowers, J. H. and T. A. Hauer (1995). Musical versus visual graphs: cross-modal equivalence in perception of time series data. Human Factors 37,553–569.

Fowler, C. A. (1990). Sound-producing sources as objects of perception: ratenormalization and nonspeech perception. Journal of the Acoustical Societyof America 88, 1236–1249.

Fowler, C. A. (1991). Auditory perception is not special: we see the world,we feel the world, we hear the world. Journal of the Acoustical Society ofAmerica 89, 2910–2915.

Franssen, L. (1998). Auditory perception of the size of rolling balls. InternalIPO report 1192. Eindhoven, The Netherlands.

Freed, D. J. (1990). Auditory correlates of perceived mallet hardness for a setof recorded percussive sound events. Journal of the Acoustical Society ofAmerica 87, 311–322.

Gaver, W. W. (1986). Auditory icons: using sound in computer interfaces.Human-Computer Interaction 2, 167–177.

117

Page 129: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Bibliography

Gaver, W. W. (1988). Everyday listening and auditory icons. Ph.D. dissertation,University of California, San Diego, California, USA.

Gaver, W. W. (1989). The SonicFinder: an interface that uses auditory icons.Human-Computer Interaction 4, 67–94.

Gaver, W. W. (1993a). How do we hear in the world? Explorations of ecologi-cal acoustics. Ecological Psychology 5, 285–313.

Gaver, W. W. (1993b). What in the world do we hear? An ecological approachto auditory event perception. Ecological Psychology 5, 1–29.

Gaver, W. W. (1994). Using and creating auditory icons. In: G. Kramer (Ed.),Auditory display: sonification, audification and auditory interfaces. Proceed-ings of the International Conference on Auditory Display 1992, pp. 417–446.Reading, MA: Addison-Wesley.

Gibson, J. J. (1966). The senses considered as perceptual systems. Boston, MA:Houghton Mifflin.

Gibson, J. J. (1986). The ecological approach to visual perception. Hillsdale, NJ:Lawrence Erlbaum Associates, Inc. (Original work published 1979).

Glasberg, B. R. and B. C. J. Moore (1990). Derivation of auditory filter shapesfrom notched-noise data. Hearing Research 47, 103–138.

Grey, J. M. and J. W. Gordon (1978). Perceptual effects of spectral modifica-tions on musical timbres. Journal of the Acoustical Society of America 63(5),1493–1500.

Gygi, B. (2001). Factors in the identification of environmental sounds. Ph.D. dis-sertation, Indiana University, Bloomington, Indiana, USA.

Halpern, D. L., R. Blake, and J. Hillenbrand (1986). Psychoacoustics of a chill-ing sound. Perception & Psychophysics 39, 77–80.

Handel, S. (1989). An introduction to the perception of auditory events. Cam-bridge, MA: MIT Press.

Hayward, C. (1994). Listening to the earth sing. In: G. Kramer (Ed.), Auditorydisplay: sonification, audification and auditory interfaces. Proceedings of theInternational Conference on Auditory Display 1992, pp. 369–404. Reading,MA: Addison-Wesley.

Hermes, D. J. (1998). Auditory material perception. IPO Annual Progress Re-port 33, 95–102. Eindhoven, The Netherlands.

Hermes, D. J. (2000). Synthesis of the sounds produced by rolling balls. Inter-nal IPO report 1226. Eindhoven, The Netherlands.

118

Page 130: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Bibliography

Houben, M., L. Franssen, A. Kohlrausch, D. Hermes, and B. Eggen (1999a).Auditory perception of the size and velocity of rolling balls. In: Collectedpapers from the joint ASA/EAA Meeting, Forum Acusticum 1999.

Houben, M. M. J., D. J. Hermes, and A. Kohlrausch (1999b). Auditory per-ception of the size and velocity of rolling balls. IPO Annual Progress Re-port 34, 86–93. Eindhoven, The Netherlands.

Houben, M. M. J., A. Kohlrausch, and D. J. Hermes (2001). Auditory cuesdetermining the perception of the size and speed of rolling balls. In: Pro-ceedings of the International Conference on Auditory Display 2001, pp. 105–110.

Houben, M. M. J. and C. N. J. Stoelinga (2002). Some temporal aspects ofrolling sounds. To appear in: Proceedings of the International Conference onSound Design, Paris, March 2002.

Houix, O., S. McAdams, and R. Causse (1999). Auditory categorization ofsound sources. In: M. A. Grealy and J. Thomson (Eds.), Studies in per-ception and action, Volume V, pp. 47–51. New Jersey: Lawrence ErlbaumAssociates.

Jameson, D. H. (1994). Audio-enhanced monitoring and debugging. In:G. Kramer (Ed.), Auditory display: sonification, audification and auditoryinterfaces. Proceedings of the International Conference on Auditory Display1992, pp. 253–265. Reading, MA: Addison-Wesley.

Jones, S. D. and S. M. Furner (1989). The construction of audio icons and in-formation cues for human-computer dialogues. In: T. Megaw (Ed.), Con-temporary Ergonomics: Proceedings of the Ergonomics Society’s 1989 annualconference. Reading, MA: Addison-Wesley.

Kendall, R. A. and E. C. Carterette (1996). Music perception and cognition. In:M. Friedman and E. Carterette (Eds.), Handbook of perception and cognition:Cognitive ecology (2nd ed.)., pp. 87–149. New York: Academic Press.

Klatzky, R. L., D. K. Pai, and E. P. Krotkov (2000). Perception of material fromcontact sounds. Presence 9, 399–410.

Kramer, G., B. Walker, T. Bonebright, P. Cook, J. Flowers, N. Miner, J. Neuhoff,R. Bargar, S. Barrass, J. Berger, G. Evreinov, W. Fitch, M. Grohn, S. Han-del, H. Kaper, H. Levkowitz, S. Lodha, B. Shinn-Cunningham, M. Si-moni, and S. Tipei (1999). Sonification report: status of the field and re-search agenda. Report prepared for the National Science Foundation bymembers of the International Community of Auditory Display.

Kunkler-Peck, A. J. and M. T. Turvey (2000). Hearing shapes. Journal of Exper-

119

Page 131: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Bibliography

imental Psychology: Human Perception and Performance 26, 279–294.

Lakatos, S., S. McAdams, and R. Causse (1997). The representation of au-ditory source characteristics: simple geometric form. Perception & Psy-chophysics 59, 1180–1190.

Lemmens, P. M. C., M. P. Bussemakers, and A. de Haan (2000). The effectof earcons on reaction times and error-rates in a dual task vs. a singletask experiment. In: Proceedings of the International Conference on AuditoryDisplay 2000, pp. 177–183.

Lemmens, P. M. C., M. P. Bussemakers, and A. de Haan (2001). Effects ofauditory icons and earcons on visual categorization: the bigger picture.In: Proceedings of the International Conference on Auditory Display 2001, pp.117–125.

Li, X., R. J. Logan, and R. E. Pastore (1991). Perception of acoustic source char-acteristics: walking sounds. Journal of the Acoustical Society of America 90,3036–3049.

Lutfi, R. A. (2001). Auditory detection of hollowness. Journal of the AcousticalSociety of America 110, 1010–1019.

Lutfi, R. A. and E. Oh (1994). Auditory discrimination based on the physicaldynamics of a tuning fork. Journal of the Acoustical Society of America 95,2967.

Lutfi, R. A. and E. L. Oh (1997). Auditory discrimination of material changesin a struck-clamped bar. Journal of the Acoustical Society of America 102,3647–3656.

McAdams, S. (1993). Recognition of sound sources and events. In:S. McAdams and E. Bigand (Eds.), Thinking in sound, Chapter 6, pp. 146–198. Oxford: Clarendon Press.

McAdams, S. (2000). The psychomechanics of real and simulated soundsources. Journal of the Acoustical Society of America 107, 2792(A).

McAdams, S., J. W. Beauchamp, and S. Meneguzzi (1999). Discrimination ofmusical instrument sounds resynthesized with simplified spectrotempo-ral parameters. Journal of the Acoustical Society of America 105, 882–897.

McAdams, S. and E. Bigand (Eds.) (1993). Thinking in sound. Oxford: Claren-don Press.

Melara, R. D. and L. E. Marks (1990a). Interaction among auditory dimen-sions: timbre, pitch, and loudness. Perception & Psychophysics 48, 169–178.

120

Page 132: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Bibliography

Melara, R. D. and L. E. Marks (1990b). Perceptual primacy of dimensions:support for a model of dimensional interaction. Journal of ExperimentalPsychology: Human Perception and Performance 16, 398–414.

Moore, B. C. J. (1997). An introduction to the psychology of hearing (4rd ed.). NewYork: Academic Press.

Moore, B. C. J., B. R. Glasberg, and T. Baer (1997). A model for the predic-tion of thresholds, loudness, and partial loudness. Journal of the AudioEngineering Society 45, 224–240.

Mynatt, E. D. (1997). Transforming graphical interfaces into auditory inter-faces for blind users. Human-computer interaction 12, 7–45.

Neuhoff, J. G., G. Kramer, and J. Wayand (2000). Sonification and the inter-action of perceptual dimensions: can the data get lost in the map? In:Proceedings of the International Conference on Auditory Display 2000, pp.93–98.

Patterson, R. D., M. H. Allerhand, and C. Giguere (1995). Time-domain mod-eling of peripheral auditory processing: a modular architecture and asoftware platform. Journal of the Acoustical Society of America 98, 1890–1894.

Poll, L. H. D. (1996). Visualising graphical user interfaces for blind users.Ph.D. dissertation, Eindhoven University of Technology, Eindhoven,The Netherlands.

Rao, P., R. van Dinther, R. Veldhuis, and A. Kohlrausch (2001). A measurefor predicting audibility discrimination thresholds for spectral envelopedistortions in vowel sounds. Journal of the Acoustical Society of Amer-ica 109, 2085–2097.

Repp, B. H. (1987). The sounds of two hands clapping: an exploratory study.Journal of the Acoustical Society of America 81, 1100–1109.

Rigas, D. I. and J. L. Alty (1997). The use of music in a graphical interface forthe visually impaired. In: S. Howard, J. Hammond, and G. Lindgaard(Eds.), Proceedings of IFIP Interact’97, pp. 228–235. IFIP: Chapman & Hall.

Roussarie, V., S. McAdams, and A. Chaigne (1998). Perceptual analysis ofvibrating bars synthesized with a physical model. In: Proceedings of the16th international congress on acoustics, pp. 2227–2228.

Schorer, E. (1989). Vergleich eben erkennbarer Unterschiede und Variationender Frequenz und Amplitude von schallen. Acustica 68, 183–199.

Sikora, C. A. and L. A. Roberts (1997). Defining a family of feedback signals

121

Page 133: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Bibliography

for multimedia communication devices. In: S. Howard, J. Hammond,and G. Lindgaard (Eds.), Proceedings of IFIP Interact’97, pp. 373–380. IFIP:Chapman & Hall.

Stevens, R. D., S. A. Brewster, P. C. Wright, and A. D. N. Edwards (1994). De-sign and evaluation of an auditory glance at algebra for blind readers. In:G. Kramer and S. Smith (Eds.), Proceedings of the International Conferenceon Auditory Display 1994, pp. 21–30. Santa Fe Institute: Addison-Wesley.

Stoelinga, C. (2001). An analysis of rolling sounds. Internal IPO report 1243.Eindhoven, The Netherlands.

Terhardt, E. (1974). On the perception of periodic sound fluctuations (rough-ness). Acustica 30, 201–213.

Van den Doel, K. and D. K. Pai (1998). The sounds of physical shapes. Pres-ence 7, 382–395.

Van Esch-Bussemakers, M. P. (2001). Do I hear what I see? Ph.D. dissertation,University of Nijmegen, Nijmegen, The Netherlands.

Vanderveer, N. J. (1979). Ecological acoustics: human perception of environ-mental sounds. Dissertation Abstracts International 40/09B, 4543. Uni-versity Microfilms No. 8004002.

Vickers, P. (1999). CAITLIN: Implementation of a musical program auralisationsystem to study the effects on debugging tasks as performed by novice Pascalprogrammers. Ph.D. dissertation, Loughborough University, Loughbor-ough, UK.

Warren, R. M. (1999). Auditory perception: a new analysis and synthesis. Cam-bridge, UK: Cambridge University Press.

Warren, Jr, W. H., E. Kim, and R. Husney (1987). The way the ball bounces:visual and auditory perception of elasticity and control of the bouncepass. Perception 16, 309–336.

Warren, Jr., W. H. and R. R. Verbrugge (1984). Auditory perception of break-ing and bouncing events: a case study in ecological acoustics. Journal ofExperimental Psychology: Human Perception and Performance 10, 704–712.

Wildes, R. P. and W. A. Richards (1988). Recovering material properties fromsound. In: W. Richards (Ed.), Natural computation, pp. 356–363. Cam-bridge, MA: MIT Press.

Zwicker, E. and H. Fastl (1999). Psychoacoustics – Facts and models (2nd ed.).Berlin: Springer Verlag.

122

Page 134: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Summary

In everyday life, sounds inform us about the world we are living in. By listen-ing to these sounds, we are able to extract information about the sound source,the location, and the environment in which the sound is produced. Althoughsound is a familiar and natural medium to convey information, it is barelyused in systems based on information technology. In order to create suitableauditory interfaces based on everyday sounds, we have to better understandhow people perceive everyday sounds. The aim of this thesis was to obtaina clearer view on the auditory perception of the size and the speed of rollingobjects.

In Chapter 1, an overview of relevant studies on auditory event perception isgiven. This was followed by a description of the research methodology weused in this thesis. According to this methodology we investigated the re-lationship between the perceived and the physical properties of an acousticsource event (the size and the speed of rolling balls), with the aim of dis-covering acoustic structures within the sound that listeners can use in theirjudgments.

In Chapter 2, three experiments are reported, investigating the link betweenacoustic source events and auditory perception. In experiment I, listenerswere asked to discriminate differences in the size of wooden rolling balls onthe basis of recorded sounds. Results showed that they are able to choose thelarger ball from paired sounds. In experiment II, the auditory perception of thespeed of rolling balls was examined. Although listeners are able to discrim-inate the sounds of rolling balls with different speeds, some of them reversethe labeling of the speed. In experiment III, the interaction between size andspeed was tested. Results indicated that if the size and the speed of a rollingball are varied, listeners generally are still able to discriminate size and speed,but the judgment of speed is influenced by the variation in size. Investigationof auditory cues (acoustic structures) that listeners may use in their decisions

123

Page 135: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Summary

showed a conflict in available cues (centroid of specific loudness) when vary-ing both size and speed, which is in accordance with the interaction effect.

In Chapter 3, we investigated whether listeners base their judgments on spec-tral cues or temporal cues. Recorded sounds were manipulated by mergingthe temporal characteristics of one sound with the spectral characteristics ofanother. Perception experiments showed that if listeners had to choose thelarger ball from two sounds, they had a preference for the spectral content ofa large ball. If listeners had to choose the faster ball from two sounds, theypreferred the spectral content of a small ball, and, to a lesser degree, the spec-tral content of a fast rolling ball. The temporal cues in the sounds were ofminor importance for the range of stimuli used in these experiments, possi-bly because sounds with much amplitude modulation or excessive ticks wereexcluded from the experiments.

Therefore, the experiments reported in Chapter 4 were conducted. In these ex-periments subjects had to judge the size and the speed of wooden rolling ballsby listening to recordings provided with artificially added amplitude modula-tion. Results showed that when judging size, listeners take into account boththe physical size and the angular speed induced by the additional amplitudemodulation. As to speed, the judgments correlated predominately with angu-lar speed. Furthermore, by modulating the amplitude at a rate different fromthat specified by the size and the linear speed of the rolling ball, the perceptof speed and, to a lesser extent, size are affected. Also, the interaction effectbetween the two physical properties of the balls, as observed in Chapter 2,may be explained by listeners’ attention to the angular speed.

124

Page 136: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Samenvatting

We zijn gewend om naar geluiden te luisteren en hieruit informatie te halenover de wereld waarin we leven. Deze geluiden geven ons informatie overde geluidsbron, de locatie en de omgeving waarin het geluid is geproduceerd.Ondanks het feit dat geluid een vertrouwd en natuurlijk medium is om in-formatie over te brengen, wordt het nauwelijks gebruikt in informatiesyste-men. Wanneer we alledaagse geluiden zinnig willen gebruiken in auditieveinterfaces, dan zullen we beter moeten begrijpen hoe mensen deze geluidenwaarnemen. Het doel van dit proefschrift is het verkrijgen van een duidelij-ker beeld van de auditieve waarneming van de grootte en de snelheid vanrollende voorwerpen.

In Hoofdstuk 1 wordt een overzicht gegeven van studies naar de waarne-ming van geluid-producerende gebeurtenissen. Vervolgens wordt de on-derzoeksmethodologie beschreven die we in dit proefschrift hebben ge-bruikt. Aan de hand van deze methodologie hebben we de relatie tussen dewaargenomen en de fysische eigenschappen van een geluidsbron (de grootteen de snelheid van rollende ballen) onderzocht, met als doel het ontdekkenvan akoestische structuren in het geluid die luisteraars kunnen gebruiken bijhun beoordeling.

Hoofdstuk 2 beschrijft drie experimenten waarin het verband tussen geluid-producerende gebeurtenissen en auditieve waarneming werd onderzocht. Inhet eerste experiment werden luisteraars gevraagd verschillen in de groottevan rollende houten ballen te onderscheiden, aan de hand van opgenomengeluiden. De resultaten lieten zien dat luisteraars in staat zijn om van paars-gewijs aangeboden geluiden dat geluid te kiezen dat werd geproduceerd doorde grootste bal. In het tweede experiment werd de auditieve waarnemingvan de snelheid van rollende ballen onderzocht. Alhoewel luisteraars in staatzijn om de geluiden van rollende ballen met verschillende snelheden te on-derscheiden, keren sommigen de labeling om (langzaam wordt met snel ver-

125

Page 137: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Samenvatting

ward). In het derde experiment werd de interactie tussen de grootte en desnelheid bekeken. Uit de resultaten bleek dat indien zowel de grootte alsde snelheid van een rollende bal wordt gevarieerd, luisteraars over het alge-meen nog steeds in staat zijn om de grootte en de snelheid te onderscheiden,maar dat de beoordeling van de snelheid wordt beınvloed door de variatiein grootte. Onderzoek naar auditieve cues (akoestische structuren) die luis-teraars zouden kunnen gebruiken bij hun beslissingen lieten een conflict zienin beschikbare cues (centroide van de specifieke luidheid) wanneer zowel degrootte als de snelheid wordt gevarieerd. Dit is in overeenstemming met hetwaargenomen interactie-effect.

In Hoofdstuk 3 wordt onderzocht of luisteraars hun oordelen baseren op spec-trale cues of temporele cues. Opgenomen geluiden werden gemanipuleerddoor de temporele eigenschappen van een geluid samen te voegen met despectrale eigenschappen van een ander geluid. Experimenten waarin dezegeluiden paarsgewijs werden aangeboden lieten zien dat als luisteraars hetgeluid geproduceerd door de grootste bal moesten kiezen, ze een voorkeurhadden voor de spectrale inhoud van een grote bal. Wanneer luisteraars desnelste bal moesten kiezen, hadden ze een voorkeur voor de spectrale inhoudvan een kleine bal en, in mindere mate, de spectrale inhoud van een snel rol-lende bal. De temporele cues in de geluiden waren van minder belang voor degeluiden die gebruikt zijn in dit experiment. Een mogelijke reden is dat stu-itergeluiden en geluiden die veel amplitude-modulatie bevatten niet werdengebruikt in de experimenten.

Om deze reden werden de experimenten zoals beschreven in Hoofdstuk 4uitgevoerd. In deze experimenten beoordeelden luisteraars de grootte ensnelheid van rollende houten ballen aan de hand van opgenomen geluidenwaaraan amplitude-modulatie is toegevoegd. De resultaten lieten zien dat alsluisteraars de grootte moeten beoordelen, ze zowel de fysieke grootte als dehoeksnelheid (verkregen door toevoegen van amplitude-modulatie) in ogen-schouw nemen. Wat betreft snelheid correleerde de beoordelingen hoofdzake-lijk met de hoeksnelheid. Wanneer de frequentie van de amplitude-modulatieonafhankelijk van de grootte en de lineaire snelheid wordt gevarieerd, wordtde waargenomen grootte, en in mindere mate de snelheid, hierdoor beınvloed.Tevens kan de aandacht van de luisteraar voor de hoeksnelheid het interactie-effect tussen de twee fysische eigenschappen van de bal (zoals beschreven inHoofdstuk 2) verklaren.

126

Page 138: The sound of rolling objects : perception of size and speed · The sound of rolling objects : perception of size and speed Citation for published version (APA): Houben, M. M. J. (2002)

Curriculum vitae

27 June 1974 Born in Weert, The Netherlands

1986 – 1992 VWO (‘pre-university education’), Philips van HorneScholengemeenschap, Weert

1992 – 1997 Werktuigkundige Medische Technologie (‘BiomechanicalEngineering’), Eindhoven University of Technology

1998 – 2002 Graduate student at IPO, Center for User-System Interaction(in 2001 merged with the Department of TechnologyManagement), Eindhoven University of Technology

127