

Robert Baumgartner, Piotr Majdak, and Bernhard Laback, Acoustics Research Institute, Austrian Academy of Sciences, Austria

1. INTRODUCTION

➢ Monaural spectral cues

• Essential for sound localization in sagittal planes – Fig. 1

• Described by head-related transfer functions (HRTFs)

➢ Binaural weighting of monaural spectral cues – Fig. 2

• Larger relative weight of ipsilateral side increasing with lateral eccentricity

• Morimoto (2001): experiments with uni/bilaterally occluded pinna cavities

• Macpherson & Sabin (2007): virtual auditory space stimuli with bilaterally competing HRTFs

• Continuous binaural weighting function derived by least-squares fitting of a sigmoid function to anchor points from both studies (sigmoid parameter Φ = 13°)

➢ Potential reasons for lateral dependence:

• Weighting according to interaural differences in time or level (ITD/ILD)

No: lateral dependence also occurs for stimuli with constant ITDs or ILDs (Macpherson & Sabin, 2007)

• Weighting according to reliability of spectral cues in terms of

(1) spatial uniqueness (to be tested in anechoic space)

(2) general robustness to diffuse background noise

➢ Approach: model predictions with various binaural weighting configurations in listening conditions with and without diffuse background noise

2. LOCALIZATION MODEL

➢ Structure of the localization model – Fig. 3

• Spectral auditory processing of target and template

(1) Directional transfer functions (DTFs)

(2) Spectral analysis: Gammatone filter bank, temporal average, logarithmic amplitude → Spectral profile

(3) PSGE: Positive spectral gradient extraction (inspired by cat DCN functionality) → Internal cue representation

• Spatial mapping

(4) Comparison process with each template entry: Absolute differences of internal cue representations averaged across frequency bands

(5) Spectral sensitivity: Listener-specific ability to discriminate spectral cues → Monaural similarity indices

(6) Binaural weighting: Combination of monaural similarity estimations relatively weighted according to lateral angle – Fig. 2

(7) Sensorimotor mapping: Gaussian response scatter constant in elevation accounts for lateral compression of polar dimension

(8) Normalization to probability mass vector (PMV): Assumption of discrete distribution of similarity indices being proportional to distribution of polar-angle responses

(9) Computation of expectancy values for psychoacoustic performance metrics
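Stages (3) to (6) and (8) can be illustrated with a minimal Python sketch. It is not the AMT implementation: the function names, the example spectral profiles, the `sensitivity` value, and the falling-sigmoid form of the similarity mapping are all assumptions made for illustration, and the sensorimotor scatter of stage (7) is omitted.

```python
import math

def positive_gradient(profile_db):
    """Stage 3 (PSGE): differences between adjacent bands, negative parts set to zero."""
    return [max(hi - lo, 0.0) for lo, hi in zip(profile_db, profile_db[1:])]

def mean_abs_difference(cue_a, cue_b):
    """Stage 4: absolute differences averaged across frequency bands."""
    return sum(abs(a - b) for a, b in zip(cue_a, cue_b)) / len(cue_a)

def similarity_index(distance, sensitivity, slope=6.0):
    """Stage 5. ASSUMPTION: falling sigmoid of the distance; form and slope are illustrative."""
    return 1.0 - 1.0 / (1.0 + math.exp(-slope * (distance - sensitivity)))

def polar_pmv(target, templates, w_left, sensitivity):
    """Stages 3-6 and 8: probability mass vector over the template polar angles."""
    combined = {}
    for angle, template in templates.items():
        si = {}
        for ear in ("left", "right"):
            d = mean_abs_difference(positive_gradient(target[ear]),
                                    positive_gradient(template[ear]))
            si[ear] = similarity_index(d, sensitivity)
        # Stage 6: binaural weighting of the two monaural similarity indices.
        combined[angle] = w_left * si["left"] + (1.0 - w_left) * si["right"]
    total = sum(combined.values())
    # Stage 8: normalize so that the similarity indices sum to one.
    return {angle: value / total for angle, value in combined.items()}

# Hypothetical spectral profiles in dB (5 bands per ear), keyed by polar angle in degrees.
templates = {
    -30: {"left": [0, 3, 1, 6, 2], "right": [0, 2, 1, 5, 2]},
    0: {"left": [0, 6, 2, 3, 8], "right": [0, 5, 2, 3, 7]},
    30: {"left": [5, 1, 4, 0, 3], "right": [4, 1, 3, 0, 3]},
}
target = templates[0]  # target sound identical to the template at 0 degrees
pmv = polar_pmv(target, templates, w_left=0.7, sensitivity=0.5)
```

Because the target equals the 0° template here, most of the probability mass should fall on that polar angle; with real DTFs the distribution is typically broader.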

➢ Evaluated for:

• Lateral dependence of localization performance – Fig. 4

• Various effects of modifications of DTFs or target sounds on localization performance (Baumgartner et al., 2014)

➢ Implementation provided in the Auditory Modeling Toolbox (AMT; http://sf.net/projects/amtoolbox/) as baumgartner2014

3. METHODS

➢ Subjects: 23 normal-hearing listeners (14 female, 9 male, 19-46 years old)

➢ Free-field HRTFs measured individually at distance of 1.2 m for elevations from −30° to 80°, with 10°-spacing between 70° and 80°, and 5°-spacing elsewhere, and azimuths all around the listener with at least 2.5°-spacing within ±45° and 5°-spacing elsewhere

➢ Stimuli: virtual auditory space, 500 ms of white noise, 50±5 dB re hearing threshold for target sound from frontal direction

➢ Apparatus: virtual visual environment, manual pointer

➢ Training: Visual training (cf. first-person shooter game) and auditory training (300 trials with feedback)

➢ Psychoacoustic performance metrics:

• Quadrant error rate (QE): Relative occurrence of target-to-response deviations > 90°, i.e., localization confusions.

• RMS local polar errors (PE): Combined measure of accuracy and precision of local responses (i.e., QE removed).
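As a rough illustration of the two metrics, the following Python sketch computes QE and PE from lists of target and response polar angles. The function names and the example angles are hypothetical, and the wrapping of polar deviations into [-180°, 180°) is an assumption about how deviations are measured.

```python
import math

def polar_deviation(target_deg, response_deg):
    """Signed target-to-response deviation, wrapped into [-180, 180) degrees."""
    return (response_deg - target_deg + 180.0) % 360.0 - 180.0

def quadrant_error_rate(targets, responses):
    """QE in percent: share of responses deviating by more than 90 degrees."""
    devs = [polar_deviation(t, r) for t, r in zip(targets, responses)]
    return 100.0 * sum(abs(d) > 90.0 for d in devs) / len(devs)

def local_polar_rms_error(targets, responses):
    """PE in degrees: RMS deviation of local responses (quadrant errors removed)."""
    devs = [polar_deviation(t, r) for t, r in zip(targets, responses)]
    local = [d for d in devs if abs(d) <= 90.0]
    if not local:
        return float("nan")  # no local responses left to evaluate
    return math.sqrt(sum(d * d for d in local) / len(local))

# Hypothetical example: four responses to a frontal target at 0 degrees.
targets = [0.0, 0.0, 0.0, 0.0]
responses = [10.0, -10.0, 170.0, 20.0]
qe = quadrant_error_rate(targets, responses)    # one of four responses is a confusion
pe = local_polar_rms_error(targets, responses)  # RMS over the three local responses
```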

➢ Measures of predictive power of the model:

• eRMS: RMS of residuals between actual and predicted performances

• r: Pearson's correlation coefficient between actual and predicted performances
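Both measures are standard statistics; a minimal Python sketch follows. The function names and the per-listener numbers are made up for illustration and are not data from the poster.

```python
import math

def rms_residual(actual, predicted):
    """RMS of the differences between actual and predicted performance."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def pearson_r(actual, predicted):
    """Pearson's correlation coefficient between actual and predicted performance."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Hypothetical polar errors (degrees) of three listeners: measured vs. predicted.
actual = [30.0, 35.0, 40.0]
predicted = [32.0, 34.0, 42.0]
e_rms = rms_residual(actual, predicted)
r = pearson_r(actual, predicted)
```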

➢ Configurations of binaural weighting stage:

• Binaural: weighting derived from psychoacoustic experiments (Φ = 13°)

• Ipsilateral: only ipsilateral information considered (Φ → +0°)

• Contralateral: only contralateral information considered (Φ → -0°)
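The three configurations can be viewed as one sigmoid weighting with different values of Φ. The Python sketch below assumes that positive lateral angles point to the left, so that the left ear is ipsilateral for φ > 0, and approximates the limits Φ → ±0° with very small values; the names `w_left` and `CONFIGS` are hypothetical.

```python
import math

def w_left(phi_deg, big_phi_deg=13.0):
    """Left-ear weight (1 + exp(-phi/Phi))**-1, evaluated without overflow."""
    x = phi_deg / big_phi_deg
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # x < 0, so exp(x) cannot overflow
    return e / (1.0 + e)

def w_right(phi_deg, big_phi_deg=13.0):
    """Right-ear weight, complementary to the left-ear weight."""
    return 1.0 - w_left(phi_deg, big_phi_deg)

# ASSUMPTION: tiny values of Phi stand in for the limits Phi -> +0 and Phi -> -0.
CONFIGS = {"binaural": 13.0, "ipsilateral": 1e-9, "contralateral": -1e-9}

# Left-ear weight for a source 30 degrees to the left under each configuration.
weights_at_30 = {name: w_left(30.0, phi) for name, phi in CONFIGS.items()}
```

For a source on the left, the ipsilateral configuration gives the left ear a weight of 1 and the contralateral configuration gives it 0, while the binaural configuration lies in between.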

➢ Diffuse background noise:

• Gaussian white noise at various SPL added to DTF-filtered stimuli

• Signal-to-noise ratios (SNRs) tested within -20 to 40 dB in steps of 2 dB defined with respect to frontal direction
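One way to realize such a noise condition is sketched below in Python. It assumes time-domain signals, a reference signal standing in for the frontal-direction stimulus, and noise scaled by RMS level; these choices and all names are illustrative rather than taken from the AMT code.

```python
import math
import random

def rms(x):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def snr_grid():
    """Tested SNRs: -20 to 40 dB in steps of 2 dB."""
    return list(range(-20, 41, 2))

def add_noise(signal, reference, snr_db, rng):
    """Add Gaussian white noise whose level is set relative to the reference signal."""
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # Scale the noise so that 20*log10(rms(reference) / rms(noise)) equals snr_db.
    gain = rms(reference) / (rms(noise) * 10.0 ** (snr_db / 20.0))
    scaled = [gain * n for n in noise]
    return [s + n for s, n in zip(signal, scaled)], scaled

rng = random.Random(0)  # fixed seed for reproducibility
frontal = [math.sin(0.1 * n) for n in range(1000)]  # hypothetical frontal stimulus
noisy, noise = add_noise(frontal, frontal, snr_db=0.0, rng=rng)
achieved_snr = 20.0 * math.log10(rms(frontal) / rms(noise))
```

Defining the noise level against the frontal reference means that targets from other directions, whose DTFs attenuate the signal, effectively see a lower SNR at the contralateral ear.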

4. RESULTS

➢ Effect of binaural weighting in anechoic space – Tab. 1

• Minor effect on predictive power

• Very similar average performance predicted for the ipsilateral ear only and for the contralateral ear only

• AMT: exp_baumgartner2014('tab3')

➢ Effect of background noise – Fig. 5

• Contralateral (re ipsilateral) degradation beginning at SNRs < 20 dB and most prominent at SNRs around 0 dB

• Increasing degradation with increasing eccentricity

• AMT: exp_baumgartner2014('fig5_baumgartner2015aro')

5. DISCUSSION

➢ Minor effect of binaural weighting in noiseless anechoic environment:

• Indication for similarity of spatial uniqueness between ipsi- and contralateral cues

• Potential limitation of our approach: absolute hearing threshold not modeled, potential degradation of contralateral cues in case of quiet sounds

➢ Decreasing reliability of contralateral cues with increasing lateral eccentricity in noisy environment:

• Consistent with Shinn-Cunningham et al. (2005) who found increasing spectral magnitude derivatives (i.e., decreasing smoothness) in contralateral BRIRs with increasing lateral eccentricity in reverberant space

• Most pronounced at SNRs around 0 dB due to ceiling and floor effects at low and high SNRs, respectively

6. CONCLUSIONS

➢ In noiseless anechoic space, contralateral spectral cues provide similar spatial uniqueness as ipsilateral cues.

➢ In noisy environments, the contralateral degradation in reliability increases with lateral eccentricity and is most pronounced at SNRs around 0 dB.

➢ Lateral dependence of binaural weighting seems to be a consequence of degraded robustness in noisy environments rather than degraded spatial uniqueness of contralateral spectral cues.

7. REFERENCES

Baumgartner, R., Majdak, P., and Laback, B. (2014). "Modeling sound-source localization in sagittal planes for human listeners." J Acoust Soc Am 136, 791-802.

Macpherson, E. A., and Sabin, A. T. (2007). "Binaural weighting of monaural spectral cues for sound localization." J Acoust Soc Am 121, 3677-3688.

Morimoto, M. (2001). "The contribution of two ears to the perception of vertical angle in sagittal planes." J Acoust Soc Am 109, 1596-1603.

Shinn-Cunningham, B. G., Kopco, N., and Martin, T. J. (2005). "Localizing nearby sound sources in a classroom: Binaural room impulse responses." J Acoust Soc Am 117, 3100-3115.

The Reliability of Contralateral Spectral Cues for Sound Localization in Sagittal Planes

Robert Baumgartner, Piotr Majdak, and Bernhard Laback

Acoustics Research Institute, Austrian Academy of Sciences, Austria

PS-133, 38th Annual Mid-Winter Meeting of the Association for Research in Otolaryngology, February 21-25, 2015, Baltimore, MD


Corresponding author: Robert Baumgartner, Acoustics Research Institute, Austrian Academy of Sciences, Wohllebengasse 12-14, A-1040 Wien, Austria

E-Mail: [email protected] http://www.kfs.oeaw.ac.at

This work was supported by the Austrian Science Fund (FWF P 24124).

Fig. 3: Structure of the localization model.

Tab. 1: Effect of binaural weighting on residuals (eRMS) and correlations (r) of predictions, and predicted across-listener average of performance metrics (Avg.), for RMS local polar errors (PE) and quadrant error rate (QE). Note the remarkably small difference between the ipsilateral and contralateral condition.

| Weighting | PE: eRMS | PE: r | PE: Avg. | QE: eRMS | QE: r | QE: Avg. |
|---|---|---|---|---|---|---|
| Binaural | 3.4° | 0.72 | 32.6° | 3.4% | 0.81 | 9.4% |
| Ipsilateral | 3.4° | 0.72 | 32.5° | 3.4% | 0.80 | 9.2% |
| Contralateral | 3.3° | 0.71 | 32.6° | 4.7% | 0.77 | 10.6% |

Fig. 2: Binaural weighting functions. A: Functions derived from results from [1] Morimoto (2001), and [2] Macpherson & Sabin (2007). B: Ipsilateral only. C: Contralateral only.

Fig. 1: Interaural-polar coordinate system (figure labels: polar angle, lateral angle).

Fig. 4: Lateral dependence of localization performance: Experimental results vs. model predictions.

Fig. 5: Effect of background noise on reliability of contralateral cues for various lateral eccentricities. Top row: Across-listener averages of performance measures for contralateral ear. Bottom row: Contralateral re ipsilateral averages of performance measures.

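$$w_\mathrm{left}(\phi) = \left(1 + e^{-\phi/\Phi}\right)^{-1} \quad\text{and}\quad w_\mathrm{right}(\phi) = 1 - w_\mathrm{left}(\phi), \quad\text{with } \Phi = 13^\circ$$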