critical ratios of beluga whales (delphinapterus leucas) and masked signal duration

8
Critical ratios of beluga whales (Delphinapterus leucas) and masked signal duration Christine Erbe a JASCO Research Ltd., 55 Fiddlewood Crescent, Bellbowrie, Qld 4070, Australia Received 28 February 2008; revised 7 July 2008; accepted 21 July 2008 This article examines the masking of a complex beluga vocalization by natural and anthropogenic noise. The call consisted of six 150 ms pulses exhibiting spectral peaks between 800 Hz and 8 kHz. Comparing the spectra and spectrograms of the call and noises at detection threshold showed that the animal did not hear the entire call at threshold. It only heard parts of the call in frequency and time. From the masked hearing thresholds in broadband continuous noises, critical ratios were computed. Fletcher critical bands were narrower than either 1 5 or 1 11 of an octave at the low frequencies of the call 2 kHz, depending on which frequency the animal cued on. From the masked hearing thresholds in intermittent noises, the audible signal duration at detection threshold was computed. The intermittent noises differed in gap length, gap number, and masking, but the total audible signal duration at threshold was the same: 660 ms. This observation supports a multiple-looks model. The two amplitude modulated noises exhibited weaker masking than the unmodulated noises hinting at a comodulation masking release. © 2008 Acoustical Society of America. DOI: 10.1121/1.2970094 PACS numbers: 43.66.Dc, 43.66.Gf, 43.80.Lb WWA Pages: 2216–2223 I. INTRODUCTION In general terms, we speak of masking when a noise interferes with or obscures a signal. Auditory masking is a matter of everyday experience, and likely affects all life forms with an acoustic sense. The American National Stan- dards Institute defines masking as both a process and a quan- tity Sec. 12.29: 1 a The process by which the threshold of hearing for one sound is raised by the presence of another masking sound. b The amount by which the threshold of hearing for one sound is raised by the presence of another masking sound, expressed in decibels. Numerically, if SL 0 is the signal level at detection threshold in the absence of noise, and SL n is the signal level at detection threshold in the presence of noise, then the amount of masking M is simply: M = SL n - SL 0 dB . 1 A. Frequency selectivity in masking In 1940, Fletcher 2 published the results of a number of auditory signal detection experiments and postulated that the human auditory system can be modeled as a series of over- lapping bandpass filters. In one of the experiments described, humans detected pure tones in white noise of constant spec- tral density level but of varying bandwidth. With the tone located at the center frequency of the masking band, Fletcher found that masking in the sense of definition b increased with increasing bandwidth up to—what he called—a critical bandwidth. For noise wider than the critical bandwidth, masking remained constant. From this band-widening ex- periment, Fletcher postulated that at detection threshold and in the case of white noise, the critical bandwidth was numeri- cally equal to the ratio of the intensity of the tone to the average intensity per cycle of the noise. The American National Standards Institute defines the critical band Sec. 6.26 3 as that frequency band of sound, being part of a continuous-spectrum noise covering a wide band, that contains sound power equal to that of a pure tone centered in the critical band and just audible in the presence of the wideband noise; and the critical ratio Fletcher critical bandSec. 6.27 3 as the difference dB between the sound pressure level of a pure tone just audible in the presence of a con- tinuous noise of constant spectral density and the sound pressure spectrum level for that noise. Numerically, if I t denotes the intensity of the tone and PSD n the power spectral density intensity per Hertz of wideband noise at the levels where the tone is just audible through the noise, then the critical ratio CR becomes CR = 10 log 10 I t PSD n . 2 Remembering Fletcher’s concept of auditory filters, only a certain part critical band of this wideband noise is effec- tive at masking the tone. Under the equal-power assumption that the intensity of the tone I t equals the total intensity of the noise in the corresponding auditory filter, of width f , when the tone is just audible, so-called Fletcher critical bands can be computed. The equal-power assumption is I t = PSD n f . 3 a Electronic mail: [email protected] 2216 J. Acoust. Soc. Am. 124 4, October 2008 © 2008 Acoustical Society of America 0001-4966/2008/1244/2216/8/$23.00 Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 129.22.67.107 On: Sat, 22 Nov 2014 04:22:17

Upload: christine

Post on 28-Mar-2017

217 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Critical ratios of beluga whales (Delphinapterus leucas) and masked signal duration

Redistri

Critical ratios of beluga whales (Delphinapterus leucas) andmasked signal duration

Christine Erbea�

JASCO Research Ltd., 55 Fiddlewood Crescent, Bellbowrie, Qld 4070, Australia

�Received 28 February 2008; revised 7 July 2008; accepted 21 July 2008�

This article examines the masking of a complex beluga vocalization by natural and anthropogenicnoise. The call consisted of six 150 ms pulses exhibiting spectral peaks between 800 Hz and 8 kHz.Comparing the spectra and spectrograms of the call and noises at detection threshold showed thatthe animal did not hear the entire call at threshold. It only heard parts of the call in frequency andtime. From the masked hearing thresholds in broadband continuous noises, critical ratios werecomputed. Fletcher critical bands were narrower than either 1

5 or 111 of an octave at the low

frequencies of the call ��2 kHz�, depending on which frequency the animal cued on. From themasked hearing thresholds in intermittent noises, the audible signal duration at detection thresholdwas computed. The intermittent noises differed in gap length, gap number, and masking, but thetotal audible signal duration at threshold was the same: 660 ms. This observation supports amultiple-looks model. The two amplitude modulated noises exhibited weaker masking than theunmodulated noises hinting at a comodulation masking release.© 2008 Acoustical Society of America. �DOI: 10.1121/1.2970094�

PACS number�s�: 43.66.Dc, 43.66.Gf, 43.80.Lb �WWA� Pages: 2216–2223

I. INTRODUCTION

In general terms, we speak of masking when a noiseinterferes with or obscures a signal. Auditory masking is amatter of everyday experience, and likely affects all lifeforms with an acoustic sense. The American National Stan-dards Institute defines masking as both a process and a quan-tity �Sec. 12.29�:1

�a� The process by which the threshold of hearingfor one sound is raised by the presence of another�masking� sound.�b� The amount by which the threshold of hearingfor one sound is raised by the presence of another�masking� sound, expressed in decibels.Numerically, if SL0 is the signal level at detection

threshold in the absence of noise, and SLn is the signal levelat detection threshold in the presence of noise, then theamount of masking M is simply:

M = SLn − SL0�dB� . �1�

A. Frequency selectivity in masking

In 1940, Fletcher2 published the results of a number ofauditory signal detection experiments and postulated that thehuman auditory system can be modeled as a series of over-lapping bandpass filters. In one of the experiments described,humans detected pure tones in white noise of constant spec-tral density level but of varying bandwidth. With the tonelocated at the center frequency of the masking band, Fletcherfound that masking �in the sense of definition b� increasedwith increasing bandwidth up to—what he called—a criticalbandwidth. For noise wider than the critical bandwidth,

a�

Electronic mail: [email protected]

2216 J. Acoust. Soc. Am. 124 �4�, October 2008 0001-4966/2008/1

bution subject to ASA license or copyright; see http://acousticalsociety.org/

masking remained constant. From this band-widening ex-periment, Fletcher postulated that at detection threshold andin the case of white noise, the critical bandwidth was numeri-cally equal to the ratio of the intensity of the tone to theaverage intensity per cycle of the noise.

The American National Standards Institute defines thecritical band �Sec. 6.26�3 as

that frequency band of sound, being part of acontinuous-spectrum noise covering a wide band,that contains sound power equal to that of a puretone centered in the critical band and just audible inthe presence of the wideband noise;and the critical ratio �Fletcher critical band� �Sec.

6.27�3 as thedifference �dB� between the sound pressure level ofa pure tone just audible in the presence of a con-tinuous noise of constant spectral density and thesound pressure spectrum level for that noise.Numerically, if It denotes the intensity of the tone and

PSDn the power spectral density �intensity per Hertz� ofwideband noise at the levels where the tone is just audiblethrough the noise, then the critical ratio CR becomes

CR = 10 log10It

PSDn. �2�

Remembering Fletcher’s concept of auditory filters, onlya certain part �critical band� of this wideband noise is effec-tive at masking the tone. Under the equal-power assumptionthat the intensity of the tone It equals the total intensity of thenoise in the corresponding auditory filter, of width �f , whenthe tone is just audible, so-called Fletcher critical bands canbe computed. The equal-power assumption is

It = PSDn �f . �3�

© 2008 Acoustical Society of America24�4�/2216/8/$23.00

content/terms. Download to IP: 129.22.67.107 On: Sat, 22 Nov 2014 04:22:17

Page 2: Critical ratios of beluga whales (Delphinapterus leucas) and masked signal duration

Redistri

Substituting into Eq. �2� yields the critical ratio in deci-bels:

CR = 10 log10 �f . �4�

Solving for �f , the Fletcher critical band is

�f = 10CR/10. �5�

Fletcher critical bands are estimates of auditory filterwidths from experiments that measured critical ratios. Criti-cal bands without Fletcher’s name in front are usually mea-sured with direct techniques, such as band-widening experi-ments. In humans,4 bottlenose dolphins,5 and three species ofpinnipeds listening in air,6 critical bands were wider thanFletcher critical bands by a factor 2.5, 5.6, and 3.2—14.2,respectively. The technique used might affect the results;critical bands measured by the two-tone-masking methodshowed no consistent relationship to critical ratios and wereoccasionally narrower than Fletcher critical bands in a harborseal.7

The difference between the critical bandwidth and theFletcher critical bandwidth can be explained in various ways.In the general power-spectrum model of masking,8 the equal-power assumption is discarded. The signal detection thresh-old is instead assumed to correspond to a critical signal-to-noise power ratio K within the auditory filter:

It/�PSDn�f� = K . �6�

Under the equal-power assumption, K=1, see Eq. �3�. In thepower-spectrum model of masking, K is about 0.4 forhumans,4 indicating that tones can be detected when theirpower is less than that of the noise in the correspondingcritical band.

Computing critical bandwidths with Eqs. �5� and �6� as-sumes that auditory filters are rectangular. Outside of theband �f , noise power does not add to the masking of a tonein the center; inside of �f , the noise �if distributed evenlyover all frequencies� masks independent of frequency. Toallow for nonrectangular auditory filter shapes, a weightingfunction W�f� is introduced in the power-spectrum model ofmasking:9

It = K� PSDn�f�W�f�df . �7�

Rather than being rectangular, the various weightingfunctions commonly used slope at the low-frequency and thehigh-frequency boundaries of the critical band. The actualwidth of the auditory filter is then defined as either the half-power width or the equivalent rectangular bandwidth�ERB�.10 The latter is computed by equating the area under-neath W�f� with the area underneath a rectangle of the sameheight as W�f�.

B. Signal duration

Signal duration can affect signal detection. Below somecritical signal duration, the detection threshold increases withdecreasing signal length. For longer signals, the thresholdremains constant. In humans and some terrestrial mammals,

the critical duration of signals at communication frequencies

J. Acoust. Soc. Am., Vol. 124, No. 4, October 2008

bution subject to ASA license or copyright; see http://acousticalsociety.org/

is about 150–500 ms.10–14 Similar values were measuredfrom bottlenose dolphins at communication and echolocationfrequencies.15 One beluga whale showed threshold increasesfor single tones at 60 kHz that were shorter than 100 ms;16

thresholds were lower for tones in rapid succession, in linewith temporal integration theory �see the following�. In aharbor seal, underwater thresholds increased for tones shorterthan about 100 ms.17 A California sea lion and a harbor sealin air had critical tone durations of 300 ms.18

These observations are usually explained by the tempo-ral integration �also called temporal summation� model19,20.The auditory system is assumed to behave like a power in-tegrator resulting in a 3 dB decrease in detection thresholdfor every doubling of signal duration below a few hundredmilliseconds.21 Experiments on temporal resolution, modula-tion and gap detection, however, produce much shorter tem-poral integration times of the order of 1–5 ms.10 This dis-crepancy has become known as the integration-resolutionparadox22. One solution to this paradox is provided by themultiple-looks model23. In this model, the auditory filterscompute outputs over short time windows �approximately3 ms�. These “looks” are stored in short-term memory �witha decay constant of a few hundred milliseconds� and are usedintelligently for signal detection and discrimination. For sig-nals much longer than 3 ms or for short signals in repetition,more looks are available to the listener, resulting in improvedperformance and lower thresholds.

The above-quoted temporal integration studies with ma-rine mammals measured signal duration at detection thresh-old in the absence of noise; or in the presence of continuousbroadband noise. In the presence of noncontinuous but inter-mittent noise, the masked signal detection threshold dependson the duration of the gaps in the noise and the spectralenergy distribution.

C. Masking in beluga whales

Few studies have addressed masking in beluga whales.Johnson et al.24 trained one beluga whale to detect pure tonesin broadband noise. Fletcher critical bands were computedby integrating the noise spectral density over frequency, untilthe noise power equaled the tone power at detection thresh-old. Fletcher critical bands �Eq. �4�� were about 18 dB fortones below 1 kHz, and increased in bandwidth for tones ofhigher frequency. Finneran et al.25 trained one beluga whaleto detect pure tones in notched noise of varying notch width.Assuming a rounded exponential function for W�f�, theycomputed ERBs as an estimate of auditory filter widths. At20 and 30 kHz, the ERBs were on average 9.1% and 15.3%of the center frequency, respectively. ERBs were larger athigher frequencies and at higher noise levels.

In a previous study, the author trained one beluga whaleto detect complex beluga vocalizations in different types ofnatural and anthropogenic noise26,27. The goal was to esti-mate the extent of the zone of masking around icebreakers inthe wild. Since its publication, the author has repeatedly beenasked whether the results of this study could be used to infer

more generic characteristics such as critical ratios or critical

Christine Erbe: Beluga critical ratios and temporal integration 2217

content/terms. Download to IP: 129.22.67.107 On: Sat, 22 Nov 2014 04:22:17

Page 3: Critical ratios of beluga whales (Delphinapterus leucas) and masked signal duration

Redistri

bands of the beluga auditory system. This article addressesthis question.

II. METHODS AND RESULTS

A. Behavioral experiments

From 1995 to 1997 one beluga whale �Aurora� wastrained and used for masking experiments at the VancouverAquarium.26,27 This animal detected calls of its species infour types of noise: bubbler system noise and propeller cavi-tation noise �both as emitted by an icebreaker�, naturally oc-curring, thermal ice-cracking noise, and Gaussian whitenoise �Fig. 1�. Icebreaker and ice-cracking noises had beenrecorded in the Arctic, Gaussian white noise was createddigitally. White noise, because of its flat frequency spectrum,would appear all black in Fig. 1 and is therefore not shown.The noise recordings were played continuously in the poolfor an entire session, by repeating the 2 s clips shown in Fig.1. The broadband noise level was about 160 dB re 1 �Pa at1 m for all four noises. The beluga vocalization was ran-domly inserted into the noise, with its level varying in 3 dBsteps according to a titration paradigm. The beluga was sta-

FIG. 1. �Color online� Spectrograms of �a� bubbler system noise and �b�propeller cavitation noise, both as emitted by an icebreaker, �c� naturallyoccurring, thermal ice-cracking noise, and �d� a beluga vocalization.

tioned against a bar 1 m in front of the sound projector and

2218 J. Acoust. Soc. Am., Vol. 124, No. 4, October 2008

bution subject to ASA license or copyright; see http://acousticalsociety.org/

indicated signal detection by a go/no-go response. The calldetection threshold in the absence of noise was also mea-sured in order to compute the threshold shift in the presenceof the above-noted four noises. Masking was 36.5, 33.0,26.6, and 32.0 dB for bubbler noise, propeller cavitationnoise, thermal ice-cracking noise, and white noise, respec-tively. Details of the behavioral experiment were publishedearlier.26,27

B. Critical ratios

Gaussian white noise has a constant power spectral den-sity �Fig. 2� and is thus an ideal noise for the computation ofcritical ratios. The call consisted of a fundamental tone at800 Hz with harmonics and a weaker nonharmonic sidetone.The power spectral densities of the call and the noise atdetection threshold were integrated over frequency in adja-cent bandpass filters. The powers of the call and the noise atthe outputs of the filters were compared. Under the equal-power assumption �Eq. �3��, when the power of the callequaled the power of the noise, the bandpass filter approxi-mated the width of the beluga auditory filter, or the Fletchercritical band. The widths of the bandpass filters were variedas 1 /n, from 1 octave wide, to 1

2 octave, 13 octave, 1

4 octave,etc. Bandpass filters were centered at the main call frequen-cies. For Gaussian white noise, the narrower the filters, thelower the power output at each filter �Fig. 2�. For pure tonesand the tones in the beluga call, the power at the output ofthe filter does not decrease with decreasing filter width, un-less the filter becomes narrower than the tone bandwidth.Therefore, at some filter width, the powers of the call and thenoise become equal. For the following analysis, call levels

FIG. 2. The bottom line shows the power spectral density of Gaussian whitenoise. This is equal to the sound intensity level �SIL� in a series of adjacentbandpass filters of constant 1 Hz width. The next line above is the SILintegrated into adjacent bandpass filters that were 1

12 of an octave wide. Ona linear frequency plot, these bands get wider with higher frequency, and theoutput of the filter increases accordingly. One-third octave bands are widerthan 1

12 octave bands, therefore powers at the output are higher. Each octaveband is as wide as three 1

3 octave bands and twelve 112 octave bands, there-

fore the power at the octave filters is highest.

were computed only over the times of the call “phonemes.”

Christine Erbe: Beluga critical ratios and temporal integration

content/terms. Download to IP: 129.22.67.107 On: Sat, 22 Nov 2014 04:22:17

Page 4: Critical ratios of beluga whales (Delphinapterus leucas) and masked signal duration

Redistri

From Fig. 1, the call consisted of six bursts �phonemes�. Thesix bursts were extracted and quiet times in between werediscarded for the computation of power spectral densities aswell as band levels.

Figure 3 shows 14 octave levels of white noise and the

beluga call at detection threshold. At the output of the 14

octave band centered at the 800 Hz tone of the call, the callpower equalled the noise power. For wider bands, the noisepower exceeded the call power; for narrower bands, the callpower exceeded the noise power. If the beluga auditory filterat 800 Hz is 1

4 of an octave wide or narrower, then the animalheard the 800 Hz tone of the call at the behaviorally mea-sured call detection threshold. If the beluga auditory filter atthe call frequencies is 1

11 of an octave wide or narrower, thenthe animal heard both the 800 Hz tone and the 1600 Hz har-monic at the detection threshold. No matter what the band-width, the higher tones of the call did not match the noisepower at detection threshold. Under the equal-power as-sumption of masking, the higher tones of the call were thusnot audible at detection threshold. The animal cued on thelow-frequency part of the call.

Bubbler system noise is continuous and near white at

FIG. 3. Top: Band levels of white noise and the beluga call at detectionthreshold. Bottom: Band levels of bubbler noise and the beluga call at de-tection threshold. Bands are centered at the call tones. At the wider band-widths, only the 800 Hz tone of the call was audible; at the narrower band-widths, the 1600 Hz tone became audible, too �based on the equal-powerassumption of masking�. Higher call components were never audible, nomatter how wide the auditory filter.

low frequencies. It was therefore also used for the estimation

J. Acoust. Soc. Am., Vol. 124, No. 4, October 2008

bution subject to ASA license or copyright; see http://acousticalsociety.org/

of Fletcher critical bands. Figure 3 shows 15 octave band

levels of bubbler noise and the call at detection threshold. Atthis bandwidth, the power of the 800 Hz tone equaled that ofthe noise. For 1

9 octave bands, the 800 Hz tone surpassed thenoise in power, and the 1600 Hz harmonic just equaled thenoise power. None of the higher tones of the call were able tomatch the noise power, no matter what the bandwidth.

In summary for white noise and bubbler noise, if theanimal cued on the lowest call component at 800 Hz, then itsFletcher critical bands were 1

5 of an octave wide or narrower.If the animal cued on the lowest two components �800 and1600 Hz�, then its Fletcher critical bands were 1

11 of an oc-tave wide or narrower. Figure 4 shows the maximumFletcher critical bands at 800 Hz �CR=21 and 20 dB� and1600 Hz �CR=17 and 18 dB� for both noises. The only otherpublished Fletcher critical band measurements in belugas atthese low frequencies are shown as stars.24 The only pub-lished measurements of beluga ERBs25 are shown as squares.Critical ratios increase with frequency. The data in Fig. 4 ispooled from three single beluga whales. Critical ratios ofAurora computed under the assumption that she heard boththe 800 and the 1600 Hz component of the call �circles� fitthe beluga data of Johnson et al. better than the critical ratioscomputed under the assumption that Aurora cued on only the800 Hz component �triangles�. It seems therefore more likelythat Aurora cued on the lowest two frequencies of the belugacall rather than on only the very lowest frequency. The Qvalue �quality factor� of the auditory filter centered at f0

=1600 Hz was Q= f0 /�f =13 based on masking by bubblernoise, and 16 based on white noise.

C. Additional masking by ambient noise

When measuring detection thresholds and computingFletcher critical bands, it is important to examine whetherambient noise adds to the masking by the test noises. Ambi-ent noise at the Vancouver aquarium where the behavioral

FIG. 4. �Color online� Maximum Fletcher critical bands at 800 Hz ��� and1600 Hz ��� for Gaussian white and bubbler noise from the beluga of thisstudy. Fletcher critical bands of Johnson et al. �Ref. 24� from one beluga ���.ERBs of Finneran et al. �Ref. 25� from one beluga ���.

experiments were carried out was very variable. The test

Christine Erbe: Beluga critical ratios and temporal integration 2219

content/terms. Download to IP: 129.22.67.107 On: Sat, 22 Nov 2014 04:22:17

Page 5: Critical ratios of beluga whales (Delphinapterus leucas) and masked signal duration

Redistri

pool was close enough to the machinery room for low-frequency noise to be easily detectable inside the water.Overflow and filtration equipment switched on and off auto-matically. The pool bordered public parkland, where path-way construction took place during 2 months. Last but notleast, a quarter of the data were collected during the rainyseason. Ambient noise measurements were done regularlyprior to the masking experiments. Figure 5 shows thesmoothed envelopes of the lowest and highest ambient noisePSDs.

The beluga audiogram �dotted line� is the mean of fourpublished audiograms.26,28–30 Pure tone detection thresholds�shown as stars� were also measured for the beluga Auroraused for the masking study. During these audiogram ses-sions, the equipment around the test pool was switched off;ambient noise levels corresponded to the lower envelope.Power spectral densities of ambient noise have to be at leasta critical ratio below the audiogram for pure tone detectionthresholds not to be masked. The above-determined maxi-mum critical ratios were 21 dB if the animal cued on 800 Hzand 18 dB if the animal cued on 1600 Hz. At 1 and 10 kHz,ambient noise was not a critical ratio below the audiogram.At 500 Hz and 5 kHz, ambient noise was just about a criticalratio below the audiogram. The pure tone detection thresh-olds of the beluga whale Aurora, even though they fit in wellwith the other published audiograms, might therefore havebeen masked, yielding a masked audiogram. Fletcher criticalband levels of the higher PSD envelope are plotted in Fig. 5as well, showing that ambient noise in the pool at its loudestlevel was audible to the beluga over the full bandwidth.However, spectrum density levels of ambient noise were wellbelow those of the test noises. Therefore, while ambientnoise could have masked the audiogram measurements, it did

FIG. 5. Ambient noise in the pool: Power spectral densities �dB re1 �Pa2 /Hz� fell into the gray shaded area; also shown are 1

5 and 111 octave

band levels �dB re 1 �Pa� of the upper envelope. Further plotted are themean of all published beluga audiograms and pure tone detection thresholdsof the beluga Aurora at four frequencies.

not add to the masking of the test noises.

2220 J. Acoust. Soc. Am., Vol. 124, No. 4, October 2008

bution subject to ASA license or copyright; see http://acousticalsociety.org/

D. Masked signal duration

Revisiting Fig. 1, while white noise and bubbler systemnoise were temporally continuous, thermal ice-crackingnoise and propeller cavitation noise were intermittent. It wasargued earlier that the masking potential of these noises wasless than that of continuous noise, because an animal mightdetect pieces of the call through the quieter gaps in thenoise.26,27 How much of the call did the animal hear at de-tection threshold?

Figure 6�a� shows percentiles of the thermal ice-cracking noise �in 1

11 octave band levels� as it was played inthe pool. From top to bottom, the dashed lines correspond tothe 95th, 75th, 50th, 25th, and 8th percentiles. The noiseconsisted of 2 s samples which were played continuously inrepetition. Ninety five percent of the time, the noise bandlevels were below the 95th percentile curve, etc. The 2 s callwas inserted at random times; its level variation followed atitration paradigm. One-eleventh octave band levels of thecall at detection threshold are plotted as a solid line in Fig. 6.Assuming 1

11 octave wide critical bands at 1600 Hz, 75% ofthe time the noise was quiet enough for the 1600 Hz peak ofthe call to be audible �under the equal-power assumption�.Given that the call phonemes only occupied about 44%

FIG. 6. �Color online� Percentiles of thermal ice-cracking noise and propel-ler cavitation noise �in 1

11 octave band levels� as played in the pool. Fromtop to bottom, the dotted lines correspond to the 95th, 75th, 50th, 25th, and8th percentiles. One-eleventh octave band levels of the call at detectionthreshold are shown as a solid line. The gray shaded area is the area under-neath the mean of the published beluga audiograms.

�880 ms� of the 2 s call, the animal would have heard the

Christine Erbe: Beluga critical ratios and temporal integration

content/terms. Download to IP: 129.22.67.107 On: Sat, 22 Nov 2014 04:22:17

Page 6: Critical ratios of beluga whales (Delphinapterus leucas) and masked signal duration

Redistri

1600 Hz component for on average 33% �=75% �44% � ofthe time, or for 660 ms. Assuming 1

11 octave wide criticalbands over the full bandwidth of the call, 25% of the time thenoise was quiet enough for all of the spectral peaks �with theexception of the minor 7.5 kHz component� to be audible.The animal would have heard the full bandwidth of the pho-nemes for 220 ms.

Band levels of propeller cavitation noise were below the1600 Hz peak of the call 75% of the time �Fig. 6�b��. As inthe above-mentioned case of thermal ice-cracking noise, the1600 Hz call component would have been audible for660 ms. Assuming 1

11 octave wide bands over the full band-width of the call, the 800 Hz peak would have been audiblefor slightly less of the time, with the high-frequency compo-nents hardly ever audible.

III. DISCUSSION

While the data used in this article were obtained from acaptive whale, the audiogram measurement showed that thisanimal had “normal” hearing �compared to other captive ani-mals�; and there was no reason to believe that its signal de-tection capabilities would differ from animals in the wild.While an enclosed pool has its own frequency response, po-tentially amplifying and damping certain frequencies at cer-tain locations, the effect would be the same for signal andnoise, therefore, the signal-to-noise ratio in the beluga’s criti-cal bands does not change. A full analysis of reverberation inthe pool was done at the time of the behavioralexperiments.26

A. Critical ratios

Call detection thresholds in four types of noises weremeasured behaviorally with a trained beluga whale �Aurora�.Two types of noises were continuous and broadband: artifi-cially created, Gaussian white noise and bubbler systemnoise recorded from an icebreaker in the field. Band levels ofvarying width were plotted for the call and noises at detec-tion threshold. Only for the two lowest frequencies of thecall was there a bandwidth for which the noise level equaledthe call level. In other words, under the equal-power assump-tion of masking, only the two lowest frequencies could havebeen audible to the whale at detection threshold. If the ani-mal cued on the 800 Hz component of the call, then itsFletcher critical bands were 1

5 of an octave wide or narrower.If the animal cued on both the 800 and the 1600 Hz compo-nents, then its Fletcher critical bands were 1

11 of an octavewide or narrower. Critical ratios at these frequencies havebeen published for one other beluga whale 24. One-eleventhoctave bands fit the published data better than 1

5 octavebands; therefore, Aurora likely cued on 1600 Hz as well ason 800 Hz.

B. Masked signal duration

Two types of noises were intermittent and broadband:propeller cavitation noise recorded from an icebreaker andnaturally occurring, thermal ice-cracking noise. Using 1

11 oc-

tave band filters, intensity level percentiles of the noises

J. Acoust. Soc. Am., Vol. 124, No. 4, October 2008

bution subject to ASA license or copyright; see http://acousticalsociety.org/

were plotted and compared to the call level at threshold. Forboth thermal ice-cracking noise and propeller cavitationnoise, the most intense frequency of the call �1600 Hz�reached the 75th percentile of the noise. In other words, for75% of the time, the noise was quiet enough for the animalto detect the 1600 Hz component of the call—under theequal-power assumption. Considering the durations of thephonemes in the call, the animal would have heard the1600 Hz component for a total length of 660 ms at detectionthreshold—in both propeller cavitation and ice-crackingnoise. The animal might have cued on different frequencieswhen presented with different masking noises. However, the1600 Hz component was audible for the largest percentageof time in both situations. This analysis is based on the out-comes of the critical ratio analysis, i.e., assuming Fletchercritical bands were 1

11 of an octave wide. For wider criticalbands, the call levels in Fig. 6 would remain roughly thesame, while the noise levels would increase. The call fre-quencies would hence reach lower percentiles, decreasing theaudible signal duration at detection threshold.

Taking a closer look at the gap structure of the twointermittent noises �Fig. 1�, natural ice-cracking noise con-sisted of sharp pulses with up to 700 ms of quiet gaps inbetween. Propeller cavitation noise consisted of 12 pulses /s,leaving quiet gaps for less than 60 ms. Each phoneme in thebeluga call had a duration of about 150 ms. The animalwould have heard a number of phonemes uninterruptedlythrough the relatively long gaps in ice-cracking noise. But itwould not have heard uninterrupted phonemes through theshort gaps in propeller noise. In propeller noise, Aurorawould have detected a series of short chunks of phonemes.As the signal was inserted randomly into the continuousnoise, the time when the signal happened in the noise was astochastic variable. On average, at detection threshold, theanimal would have heard 660 ms of the 1600 Hz component.This is the combined length of the chunks of signal surpass-ing the noise, summed over the 2 s duration of the call. Theother components of the call would have been audible forshorter durations.

While these durations are of the order of the criticalsignal lengths measured with terrestrial and marinemammals,10–18 no mammal tested to date has indicated tem-poral integration times of up to 2 s. For the detection ofsignals through gaps in noise, it does not make sense for thesubject to integrate energy over long time periods, because ifthe integration time extends beyond the gap, then the noiseenergy would quickly outgrow the integrated signal energy,hence increasing the masking. A temporal integration modelseems implausible.

While the behaviorally measured, absolute masking �Eq.�1�� differed for ice-cracking �26.6 dB� and propeller cavita-tion noise �33.0 dB�, the call detection threshold in thesenoises was such that the total duration of audible call com-ponents was the same �660 ms�, despite the very differentgap pattern of these two noises. A theory that can explain thisobservation is the theory of multiple looks. If each look lastsonly a few millisecond23 then it does not matter how thesignal is chopped up, as long as the gaps in the noise �mini-

mum 60 ms here� are not shorter than each look. Propeller

Christine Erbe: Beluga critical ratios and temporal integration 2221

content/terms. Download to IP: 129.22.67.107 On: Sat, 22 Nov 2014 04:22:17

Page 7: Critical ratios of beluga whales (Delphinapterus leucas) and masked signal duration

Redistri

noise chopped the signal into more and shorter chunks thanice-cracking noise, but Aurora had the same number of looksat the signal in both noises: all looks combining to 660 ms.

C. Adaptations for masking

It is interesting to note that the slope of the call peaksroughly follows the slope of the audiogram as well as theslope of the naturally occurring, thermal ice-cracking noise�Fig. 6�a��. While the two anthropogenic ship noises and theartificial white noise masked the high-frequency componentsof the call more than the low-frequency components, “forc-ing” the animal to cue on the low-frequency componentsduring the detection task, Arctic ambient noise enabled theanimal to hear the full bandwidth of the call at detectionthreshold. It is debatable whether this is coincidental or someadaptation of hearing and/or communication signals to natu-ral ambient noise.

A few studies have shown an adaptation of marine mam-mal calls to prevalent background noise. Right whales inareas with substantial low-frequency shipping noise shiftedtheir call frequencies up over the lifespan of individual ani-mals, probably to compensate for masking.31 A comparisonof whistles from three separate populations of Indo-Pacificbottlenose dolphins showed that animals in habitats with lessnoise produced whistles over a broader range of frequenciesand with greater modulations than animals in noisyhabitats.32 Bottlenose dolphins in Florida increased theirwhistle repetition rate upon watercraft approach, which wassuggested as a compensation for masking.33 Beluga whalesin the St. Lawrence increased the production of falling tonalcalls and pulsed calls, increased the repetition of calls, andshifted call frequencies up in the presence of ship noise.34

They also increased the call level �Lombard response�.35

Killer whales in Puget Sound increased their call durations inthe presence of boats.36

In the case of pinnipeds, calls are subject to masking byconspecifics, particularly during the breeding season. Harpseals have been shown to use frequency and temporal sepa-ration in conjunction with a wide vocal repertoire to reducemasking.37 Polar seals further use predominantly downsweptcalls,38 and downsweeps have lower detection thresholdsthan upsweeps.39 Directional hearing capabilities help reducemasking when the signal and the noise arrive from differentdirections.40–42 Another antimasking strategy is the use ofwithin-call repetition.43 The beluga call used for the currentmasking study consisted of six identical �in the sense of hav-ing the same time and frequency characteristics� phonemes.In line with multiple-looks theory, within-call repetition pro-vides a larger number of looks, resulting in a lower detectionthreshold. The theory has been shown to hold in the case ofsignals with characteristics of communication sounds. In hu-mans, the discrimination of syllables �each lasting 500 ms�improved when the stimuli were repeated.44,45

D. Comodulation masking release

The power-spectrum model of masking assumes that thelistener focuses on the critical band around the signal fre-

quency and that the detection threshold is related to a certain

2222 J. Acoust. Soc. Am., Vol. 124, No. 4, October 2008

bution subject to ASA license or copyright; see http://acousticalsociety.org/

signal-to-noise ratio at the output of this filter. Across-filtercomparisons, however, can aid signal detection under certainconditions in humans.10 Given a narrow-band or pure-tonesignal and a broadband noise, when the noise is amplitudemodulated, and if this modulation is coherent or correlatedacross different frequency bands, then the masking is re-duced compared to the case of unmodulated or incoherentlymodulated noise. This is termed the comodulation release ofmasking46 �CMR�.

The current study used four broadband noises. Gaussianwhite noise and bubbler noise are not amplitude modulated;propeller noise and ice-cracking noise, however, exhibit anamplitude modulation—coherently across frequency. Themodulation rate for propeller noise was about 12 Hz and wasconstant in time. The modulation rate for ice-cracking noisewas lower but random. The absolute masking as defined inEq. �1� was lowest for ice-cracking noise �26.6 dB�, fol-lowed by Gaussian noise �32.0 dB�, then propeller noise�33.0 dB�, then bubbler noise �36.5 dB�. The weakestmasker was a comodulated noise, the strongest masker wasan unmodulated noise. The masking of propeller and Gauss-ian noise was similar. However, the four maskers were nor-malized by their total power over 2 s and 22 kHz. Gaussiannoise had more energy at high frequencies outside of the callfrequencies than the other three noises, thus reducing its on-band masking effect. While the data support a CMR in ma-rine mammals, the amount of this release cannot be derivedfrom the data. The noises used in this study differed in amultitude of variables �frequency, time, power spectral den-sity, modulation, etc.�; and the signal had a complex fre-quency and time structure. An animal in a complex acousticenvironment might use a combination of differing strategiesto reduce masking. To test for particular strategies or models,signals and maskers need to be chosen such as to reduce thenumber of variables.

1American National Standard: Acoustical Terminology ANSI S1.1 �Ameri-can National Standards Institute, Inc., and Acoustical Society of America,New York, 1994�.

2H. Fletcher, “Auditory patterns,” Rev. Mod. Phys. 12, 47–65 �1940�.3American National Standard: Bioacoustical Terminology ANSI S3.20�American National Standards Institute, Inc., and Acoustical Society ofAmerica, New York, 1995�.

4B. Scharf, “Critical bands,” in Foundations of Modern Auditory Theory,edited by J. V. Tobias �Academic, New York, 1970�, Vol. 1, pp. 159–202.

5W. W. L. Au and P. W. B. Moore, “Critical ratio and critical bandwidth forthe Atlantic bottlenose dolphin,” J. Acoust. Soc. Am. 88, 1635–1638�1990�.

6B. L. Southall, R. J. Schusterman, and D. Kastak, “Auditory masking inthree pinnipeds: Aerial critical ratios and direct critical bandwidth mea-surements,” J. Acoust. Soc. Am. 114, 1660–1666 �2003�.

7S. D. Turnbull and J. M. Terhune, “White noise and pure tone masking ofpure tone thresholds of a harbour seal listening in air and underwater,”Can. J. Zool. 68, 2090–2097 �1990�.

8R. D. Patterson and B. C. J. Moore, “Auditory filters and excitation pat-terns as representations of frequency resolution,” in Frequency Selectivityin Hearing, edited by B. C. J. Moore �Academic, New York, 1986�, pp.123–127.

9B. R. Glasberg and B. C. J. Moore, “Derivation of auditory filter shapesfrom notched-noise data,” Hear. Res. 47, 103–138 �1990�.

10B. C. J. Moore, An Introduction to the Psychology of Hearing �Academic,San Diego, 1997�.

11J. W. Hughes, “The threshold of audition for short periods of stimulation,”Proc. R. Soc. London, Ser. B 133, 486–490 �1946�.

12

W. R. Garner and G. A. Miller, “The masked threshold of pure tones as a

Christine Erbe: Beluga critical ratios and temporal integration

content/terms. Download to IP: 129.22.67.107 On: Sat, 22 Nov 2014 04:22:17

Page 8: Critical ratios of beluga whales (Delphinapterus leucas) and masked signal duration

Redistri

function of duration,”J. Exp. Psychol. 37, 293–303 �1947�.13T. D. Clack, “Effect of signal duration on the auditory sensitivity of hu-

mans and monkeys �Macaca mulatta�,” J. Acoust. Soc. Am. 40, 1140–1146 �1966�.

14R. R. Fay, Hearing in Vertebrates: A Psychophysics Databook �Hill-FayAssociates, Winnetka, IL, 1988�.

15C. S. Johnson, “Relation between absolute threshold and duration-of-tonepulses in the bottlenosed porpoise,” J. Acoust. Soc. Am. 43, 757–763�1968�.

16C. S. Johnson, “Hearing thresholds for periodic 60-kHz tone pulses in thebeluga whale,” J. Acoust. Soc. Am. 89, 2996–3001 �1991�.

17J. M. Terhune, “Detection thresholds of a harbour seal to repeated under-water high-frequency, short-duration sinusoidal pulses,” Can. J. Zool. 66,1578–1582 �1988�.

18M. M. Holt, B. L. Southall, D. Kastak, R. J. Schusterman, and C.Reichmuth-Kastak, “Temporal integration in a California sea lion and aharbor seal: Estimates of aerial auditory sensitivity as a function of signalduration,” J. Acoust. Soc. Am. 116, 2531 �2004� �Abstract�.

19R. Plomp and M. A. Bouman, “Relation between hearing and duration fortone pulses,” J. Acoust. Soc. Am. 31, 749–758 �1959�.

20R. Plomp, “Hearing thresholds for periodic tone pulses,” J. Acoust. Soc.Am. 33, 1561–1567 �1961�.

21G. M. Gerken, V. K. H. Bhat, and M. H. Hutchison-Clutter, “Auditorytemporal integration and the power-function model,” J. Acoust. Soc. Am.88, 767–778 �1990�.

22E. de Boer, “Auditory time constants: A paradox?,” in Time Resolution inAuditory Systems, edited by A. Michelson �Springer, Berlin, 1975�, pp.141–158.

23N. F. Viemeister and G. H. Wakefield, “Temporal integration and multiplelooks,” J. Acoust. Soc. Am. 90, 858–865 �1991�.

24C. S. Johnson, M. W. McManus, and D. Skaar, “Masked tonal hearingthresholds in the beluga whale,” J. Acoust. Soc. Am. 85, 2651–2654�1989�.

25J. J. Finneran, C. E. Schlundt, D. A. Carder, and S. H. Ridgway, “Auditoryfilter shapes for the bottlenose dolphin �Tursiops truncates� and the whitewhale �Delphinapterus leucas� derived with notched noise,” J. Acoust.Soc. Am. 112, 322–328 �2002�.

26C. Erbe and D. M. Farmer, “Masked hearing thresholds of a beluga whale�Delphinapterus leucas� in icebreaker noise,” Deep-Sea Res., Part II 45,1373–1388 �1998�.

27C. Erbe, “Detection of whale calls in noise: Performance comparison be-tween a beluga whale, human listeners, and a neural network,” J. Acoust.Soc. Am. 108, 297–303 �2000�.

28F. T. Awbrey, J. A. Thomas, and R. A. Kastelein, “Low-frequency under-water hearing sensitivity in belugas Delphinapterus leucas,” J. Acoust.Soc. Am. 84, 2273–2275 �1988�.

29C. S. Johnson, M. W. McManus, and D. Skaar, “Masked tonal hearingthresholds in the beluga whale,” J. Acoust. Soc. Am. 85, 2651–2654�1989�.

J. Acoust. Soc. Am., Vol. 124, No. 4, October 2008

bution subject to ASA license or copyright; see http://acousticalsociety.org/

30M. J. WhiteJr., J. Norris, D. Ljungblad, K. Baron, and G. diSciara, “Au-ditory thresholds of two beluga whales Delphinapterus leucas,” ReportNo. 78-109, Hubbs/SeaWorld Research Institute for Naval Ocean SystemCenter, San Diego, 1978, pp. 35.

31S. E. Parks, C. W. Clark, and P. L. Tyack, “Short- and long-term changesin right whale calling behavior: The potential effects of noise on acousticcommunication,” J. Acoust. Soc. Am. 122, 3725–3731 �2007�.

32T. Morisaka, M. Shinohara, F. Nakahara, and T. Akamatsu, “Effects ofambient noise on the whistles of Indo-Pacific bottlenose dolphin popula-tions,” J. Mammal. 86, 541–546 �2005�.

33K. C. Buckstaff, “Effects of watercraft noise on the acoustic behavior ofbottlenose dolphins, Tursiops truncatus, in Sarasota Bay, Florida,” MarineMammal Sci. 20, 709–725 �2004�.

34V. Lesage, C. Barrette, M. C. S. Kingsley, and B. Sjare, “The effect ofvessel noise on the vocal behavior of belugas in the St. Lawrence RiverEstuary, Canada,” Marine Mammal Sci. 15, 65–84 �1999�.

35P. M. Scheifele, S. Andrew, R. A. Cooper, and M. Darre, “Indication of aLombard vocal response in the St. Lawrence River beluga,” J. Acoust.Soc. Am. 117, 1486–1492 �2005�.

36A. D. Foote, R. W. Osborne, and A. R. Hoelzel, “Whale-call response tomasking boat noise,” Nature �London� 428, 910 �2004�.

37A. Serrano and J. M. Terhune, “Antimasking aspects of harp seal �Pago-philus groenlandicus� underwater vocalizations,” J. Acoust. Soc. Am. 112,3083–3090 �2002�.

38J. Terhune, “Antimasking strategies of underwater vocalizations and hear-ing abilities of polar seals,” J. Acoust. Soc. Am. 116, 2555 �2004� �Ab-stract�.

39S. Turnbull and J. M. Terhune, “Descending frequency swept tones havelower thresholds than ascending frequency swept tones for a harbor seal�Phoca vitulina� and human listeners,” J. Acoust. Soc. Am. 96, 2631–2636�1994�.

40M. M. Holt and R. J. Schusterman, “Spatial release from masking of aerialtones in pinnipeds,” J. Acoust. Soc. Am. 121, 1219–1225 �2007�.

41S. D. Turnbull, “Changes in masked thresholds of a harbor seal Phocavitulina associated with angular separation of signal and noise sources,”Can. J. Zool. 72, 1863–1866 �1994�.

42D. E. Bain and M. E. Dahlheim, “Effects of masking noise on detectionthresholds of killer whales,” in Marine Mammals and the Exxon Valdez,edited by T. R. Loughlin �Academic, San Diego, 1994�, pp. 243–256.

43A. Serrano and J. M. Terhune, “Within-call repetition may be an anti-masking strategy in underwater calls of harp seals �Pagophilus groen-landicus�,” Can. J. Zool. 79, 1410–1413 �2001�.

44R. Frush Holt and A. Earley Carney, “Multiple looks in speech sounddiscrimination in adults,” J. Speech Lang. Hear. Res. 48, 922–943 �2005�.

45R. Frush Holt and A. Earley Carney, “Developmental effects of multiplelooks in speech sound discrimination,” J. Speech, Language, J. SpeechLang. Hear. Res. 50, 1404–1424 �2007�.

46J. W. Hall, M. P. Haggard, and M. A. Fernandes, “Detection in noise byspectro-temporal pattern analysis,” J. Acoust. Soc. Am. 76, 50–56 �1984�.

Christine Erbe: Beluga critical ratios and temporal integration 2223

content/terms. Download to IP: 129.22.67.107 On: Sat, 22 Nov 2014 04:22:17