
    2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics October 16-19, 2011, New Paltz, NY

    BINAURAL ANALYSIS/SYNTHESIS OF INTERIOR AIRCRAFT SOUNDS

    Charles Verron, Philippe-Aubert Gauthier, Jennifer Langlois, Catherine Guastavino

    Multimodal Interaction Lab - McGill University - Montreal, Canada

    Centre for Interdisciplinary Research on Music Media and Technology - Montreal, Canada

    Groupe Acoustique Université de Sherbrooke - Sherbrooke, Canada

    ABSTRACT

    A binaural sinusoids+noise synthesis model is proposed for reproducing interior aircraft sounds. First, a method for spectral and spatial characterization of binaural interior aircraft sounds is presented. This characterization relies on a stationarity hypothesis and involves four estimators: left and right power spectra, interaural coherence and interaural phase difference. Then we present two extensions of the classical sinusoids+noise model for the analysis and synthesis of stationary binaural sounds. First, we propose a binaural estimator using relevant information in both left and right channels for peak detection. Second, the residual modeling is extended to integrate two interaural spatial cues, namely coherence and phase difference. The resulting binaural sinusoids+noise model is evaluated on a recorded aircraft sound.

    Index Terms: Binaural modeling, aircraft sound, spectral envelope, interaural coherence, interaural phase difference.

    1. INTRODUCTION

    Simulation of aircraft noise has received increased attention in the last decade, primarily as a means to generate stimuli for studies on annoyance caused by aircraft flyover noise. Synthesis methods have been proposed to reproduce such sounds with broadband, narrowband, and sinusoidal components, including the time-varying aircraft position relative to the observer, directivity patterns, Doppler shift, and atmospheric and ground effects [1, 2]. Realistic synthesis of interior aircraft sounds has, however, received less attention, despite its potential for flexible sound generation in flight simulators. Sources of noise in aircraft cabins were reviewed in [3]. The primary sources are the aircraft engines and the turbulent boundary layer noise. A promising approach for simulating aircraft sounds as a combination of sinusoidal and noisy components has been proposed in [4] for monophonic signals. Compared to raw aircraft recordings, such a model has the advantage of enabling parametric transformations: individual modifications of the sinusoidal and noisy components make it possible to investigate precisely their impact on passengers' comfort.

    In this paper, we investigate the analysis/synthesis of binaural interior aircraft sounds. Our contribution here is to extend previous models to reproduce both spectral and spatial (binaural) characteristics of the aircraft sounds. To do so, we propose an extension to the sinusoids+noise analysis/synthesis model: we present a binaural peak detection estimator and propose two additional binaural cues for the residual modeling: the interaural coherence and the interaural phase difference.

    Correspondence: [email protected]

    Figure 1: Binaural recording positions in a CRJ900 Bombardier aircraft. Three rows have been recorded (04, 12, and 22) with 7 positions per row (indicated by green dots).

    The paper is organized as follows: first we present spectral and spatial descriptors used for characterization of binaural aircraft sounds, then we present the binaural analysis/synthesis algorithm and its evaluation on a recorded aircraft sound.

    2. CHARACTERIZATION OF BINAURAL AIRCRAFT SOUNDS

    The original sound recordings were provided by Bombardier. Twenty-one binaural recordings were made at different positions inside a CRJ900 Bombardier aircraft (see figure 1). The data were recorded at 16 bits and 48 kHz with a SQuadriga recorder from Head Acoustics with binaural microphones BHS I Binaural Headset mounted on a human head. All recordings were 16 seconds long and the flight conditions were constant across the measurements: altitude 35,000 feet, speed Mach 0.77. To characterize these binaural recordings, we first compute the short-time Fourier transform (STFT), defined for a monophonic signal $x[n]$, $m \in [1; M]$, $k \in [0; N_a - 1]$ by:

    $$X(m, k) = \sum_{n=0}^{N_a - 1} w_a(n)\, x(m M_a + n)\, e^{-j 2\pi \frac{k}{N_a} n}$$

    where $w_a$ is an analysis window of size $N_a$, $M_a$ is the analysis hop size and $M$ is the total number of blocks. The STFTs $X_L$ and $X_R$ are calculated for the left and right signals respectively. Figure 2 illustrates the magnitude of the STFT for the binaural signal recorded in the aircraft at position SAG (row 22). It shows that interior aircraft sounds have stationary spectral properties. They contain


    [Figure 2 image: two spectrogram panels (left and right channels); horizontal axis: time (s), 0 to 15; vertical axis: frequency (Hz), 10 to 10k on a logarithmic scale; colour scale from -80 to 0.]

    Figure 2: Binaural signal recorded at position SAG (row 22). The time-frequency representations ($N_a = 8192$) of the left and right signals exhibit the salient components of interior aircraft sounds: spectral lines (sinusoids) and stationary broadband noise. The presence of sinusoidal components with very close frequencies (e.g., around 100 Hz) appears as a slow amplitude modulation.

    sinusoidal components (spectral lines) and broadband background noise. The presence of sinusoids at very close frequencies can be seen as slow amplitude modulation in the STFT. We will see in section 3.2 that detecting these close frequencies precisely requires a very high frequency precision. A compact representation can be obtained by characterizing binaural aircraft sounds in terms of stationary spectral and spatial cues. For each sound we compute left and right power spectra (also called spectral envelopes throughout the text) and two frequency-dependent binaural cues: the interaural coherence (IC) and interaural phase difference (IPD). IPD and differences between left and right power spectra are related to the perceived position of sound sources while IC is related to perceived auditory source width [5, 6]. Due to the stochastic nature of aircraft sounds, a single short-time spectrum is not sufficient to estimate these quantities properly. However, the estimation can be done by averaging short-time spectra across time, as presented below.

    The power spectrum is estimated with Welch's method [7]. The STFT (calculated with an analysis hop size $M_a = N_a / 2$) is time-averaged to get the estimate of the power spectrum:

    $$S(k) = \frac{1}{M} \sum_{m=1}^{M} |X(m, k)|^2$$
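    The STFT and its time average can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation: the function names `stft` and `welch_power_spectrum`, the Hann window (used here in place of the DPSW window of the paper), the 64-point frame size and the 0-based frame index are all assumptions chosen to keep the example short and dependency-free.

```python
import cmath
import math


def stft(x, window, hop):
    """Return STFT frames X[m][k] of the real signal x (frame index m starts at 0)."""
    n_a = len(window)
    n_frames = (len(x) - n_a) // hop + 1
    # Precompute the DFT factors exp(-j 2 pi p / N_a).
    twiddle = [cmath.exp(-2j * math.pi * p / n_a) for p in range(n_a)]
    frames = []
    for m in range(n_frames):
        seg = [window[n] * x[m * hop + n] for n in range(n_a)]
        frames.append([
            sum(seg[n] * twiddle[(k * n) % n_a] for n in range(n_a))
            for k in range(n_a)
        ])
    return frames


def welch_power_spectrum(frames):
    """Average |X(m, k)|^2 over all frames, one value per frequency bin."""
    n_blocks = len(frames)
    n_bins = len(frames[0])
    return [sum(abs(f[k]) ** 2 for f in frames) / n_blocks for k in range(n_bins)]


# Toy example: a unit sinusoid centred on bin 8 of a 64-point analysis.
N_A = 64
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / N_A) for n in range(N_A)]
signal = [math.sin(2 * math.pi * 8 * n / N_A) for n in range(8 * N_A)]
S = welch_power_spectrum(stft(signal, hann, hop=N_A // 2))  # hop = N_a / 2
peak_bin = max(range(N_A // 2), key=lambda k: S[k])
```

    With the half-window hop used in the text, consecutive frames overlap by 50%, so the average is taken over 15 frames of this 512-sample toy signal.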

    Spectral envelopes $S_L$ and $S_R$ are calculated for the left and right signals respectively. Note that the frequency-dependent interaural level difference (ILD) can be deduced from the binaural spectral envelopes by: $ILD(k) = \frac{S_R(k)}{S_L(k)}$.

    The coherence function [8] gives a measure of correlation between two signals, per frequency bin. It is a real function of frequency, with values ranging between 0 and 1. Here it is estimated between the right and left signals by:

    $$IC(k) = \frac{\left| \sum_{m=1}^{M} X_R(m, k)\, X_L^*(m, k) \right|^2}{\sum_{m=1}^{M} |X_R(m, k)|^2 \; \sum_{m=1}^{M} |X_L(m, k)|^2}$$
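    A minimal sketch of the ILD and IC estimators, operating directly on lists of STFT frames. It assumes that the cross-spectrum uses the complex conjugate of the left-channel STFT and that the numerator is squared (the usual magnitude-squared coherence); the function names and the tiny hand-made frames are hypothetical and only serve to show the two limiting cases.

```python
def level_ratio(s_l, s_r):
    """ILD(k) = S_R(k) / S_L(k), as a linear power ratio per bin."""
    return [r / l for l, r in zip(s_l, s_r)]


def interaural_coherence(frames_l, frames_r):
    """IC(k) = |sum_m X_R X_L^*|^2 / (sum_m |X_R|^2 * sum_m |X_L|^2)."""
    n_bins = len(frames_l[0])
    ic = []
    for k in range(n_bins):
        cross = sum(fr[k] * fl[k].conjugate() for fl, fr in zip(frames_l, frames_r))
        p_l = sum(abs(fl[k]) ** 2 for fl in frames_l)
        p_r = sum(abs(fr[k]) ** 2 for fr in frames_r)
        denom = p_l * p_r
        ic.append(abs(cross) ** 2 / denom if denom > 0.0 else 0.0)
    return ic


# Two frames, two bins (hypothetical STFT values).
# Bin 0: right = 2j * left in every frame, so the channels are fully coherent.
# Bin 1: the right channel flips sign between frames, so the cross terms cancel.
frames_l = [[1 + 1j, 1 + 0j], [2 - 1j, 1 + 0j]]
frames_r = [[2j * (1 + 1j), 1 + 0j], [2j * (2 - 1j), -1 + 0j]]
ic = interaural_coherence(frames_l, frames_r)

n_blocks = len(frames_l)
s_l = [sum(abs(f[k]) ** 2 for f in frames_l) / n_blocks for k in range(2)]
s_r = [sum(abs(f[k]) ** 2 for f in frames_r) / n_blocks for k in range(2)]
ild = level_ratio(s_l, s_r)
```

    A fixed gain and phase shift between the channels leaves the coherence at 1, whereas a phase relation that changes from frame to frame drives it towards 0. This is why the average over frames is taken before the magnitude.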

    The phase difference between the right and left signals is estimated by:

    $$IPD(k) = \angle \left( \sum_{m=1}^{M} X_R(m, k)\, X_L^*(m, k) \right)$$
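    The phase estimator can be sketched in the same way, assuming it is the argument of the time-summed cross-spectrum between the right channel and the conjugated left channel. The function name and the single-bin test frames below are hypothetical.

```python
import cmath
import math


def interaural_phase_difference(frames_l, frames_r):
    """IPD(k) = angle of sum_m X_R(m, k) * conj(X_L(m, k)), in radians."""
    n_bins = len(frames_l[0])
    return [
        cmath.phase(sum(fr[k] * fl[k].conjugate() for fl, fr in zip(frames_l, frames_r)))
        for k in range(n_bins)
    ]


# Three frames, one bin: the right channel leads the left by pi/4 in every frame,
# while the left-channel amplitude and phase change from frame to frame.
lead = cmath.exp(1j * math.pi / 4)
frames_l = [[1 + 0j], [0 + 2j], [-1 + 1j]]
frames_r = [[x * lead for x in frame] for frame in frames_l]
ipd = interaural_phase_difference(frames_l, frames_r)
```

    Summing the cross-spectrum before taking the angle weights each frame by its energy and avoids averaging wrapped phase values directly.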

    When analyzing these signal properties, the choice of the analysis window is critical since it has a direct impact on the spectral resolution. For our study we considered the digital prolate spheroidal window (DPSW) family, which is a particular case of the discrete prolate spheroidal sequences developed by Slepian [9]. We choose an optimal frequency concentration under $f_c = 3.5 / N_a$, resulting in a DPSW window having nearly all its energy contained in its first lobe, which is 7 points wide.
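    As a rough worked example of what this concentration means in Hz, assume the window length $N_a = 8192$ quoted in the Figure 2 caption and the 48 kHz sampling rate of the recordings (the text does not state that the same $N_a$ is used for every estimator), and read $f_c$ as a normalized half-bandwidth in cycles per sample. The main lobe then spans:

    $$2 f_c = \frac{7}{N_a} \;\Rightarrow\; 7 \text{ bins}, \qquad \Delta f_{\text{bin}} = \frac{f_s}{N_a} = \frac{48000}{8192} \approx 5.86\ \text{Hz}, \qquad 7\, \Delta f_{\text{bin}} \approx 41\ \text{Hz}$$

    Under these assumptions, two spectral lines separated by much less than about 41 Hz have overlapping main lobes, which is consistent with the remark that resolving closely spaced sinusoids requires a very high frequency precision.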

    Based on these spectral and spatial estimators, we propose a sinusoids+noise model for interior aircraft sounds: a binaural sinusoidal extraction method is proposed, then the residual is modeled in terms of spectral envelopes, IC and IPD cues.

    3. SYNTHESIS MODEL

    3.1. Sinusoids+noise model

    In [10] an analysis/synthesis system based on a deterministic plus stochastic decomposition of a monophonic sound was presented. The deterministic part $d(t)$ is a sum of $I(t)$ sinusoids whose instantaneous amplitude $a_i(t)$ and frequency $f_i(t)$ vary slowly in time, while the stochastic part is modeled as a time-varying filtered noise $s(t)$. This deterministic plus stochastic modeling (also called sinusoids+noise model) has been used extensively for analysis, parametric transformation and synthesis of speech, musical and environmental sounds (see for example [11]). Extracting relevant information from both left and right channels can improve the reliability of the sinusoids+noise analysis. This was used in [12] to improve partial tracking. Here we propose a binaural peak detection method using combined left/right information. We also extend the residual modeling to the binaural case, by considering IC and IPD cues.
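    To make the deterministic plus stochastic decomposition concrete, here is a minimal monophonic sketch of the synthesis side. It is an illustration under simplifying assumptions, not the system of [10] or the binaural model of this paper: amplitudes and frequencies are held constant (the stationary case), the stochastic part is white Gaussian noise passed through a one-pole low-pass filter chosen purely as a stand-in, and the function name and parameters are hypothetical.

```python
import math
import random


def synthesize(partials, noise_gain, n_samples, fs, smoothing=0.9, seed=0):
    """Sum of stationary sinusoids d(t) plus low-pass filtered noise s(t).

    partials: list of (amplitude, frequency_hz, phase_rad) tuples.
    """
    rng = random.Random(seed)
    out = []
    state = 0.0
    for n in range(n_samples):
        t = n / fs
        # Deterministic part: sum of sinusoids with fixed amplitude and frequency.
        deterministic = sum(
            a * math.cos(2 * math.pi * f * t + phi) for a, f, phi in partials
        )
        # Stochastic part: one-pole low-pass filtered white noise (assumed filter).
        state = smoothing * state + (1.0 - smoothing) * rng.gauss(0.0, 1.0)
        out.append(deterministic + noise_gain * state)
    return out


# A pure 100 Hz tone sampled at 400 Hz, with the noise part switched off.
tone = synthesize([(1.0, 100.0, 0.0)], noise_gain=0.0, n_samples=4, fs=400.0)
# A tone with some noise; the fixed seed makes the result reproducible.
noisy = synthesize([(0.5, 100.0, 0.0)], noise_gain=0.1, n_samples=480, fs=48000.0)
```

    Because the two parts are generated separately, either one can be scaled or replaced on its own, which is the kind of parametric transformation the model is meant to allow.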

    3.2. Binaural extraction of stationary sinusoidal components

    Among the various estimation methods available in the literature for power spectrum estimation and sinusoidal detection (see for example [13, 14, 15]) we choose the modified periodogram first proposed by Welch [7] (see section 2). The proposed binaural peak detection is performed on an average (binaural) Welch spectral estimator:

    $$S_{LR}(k) = \frac{1}{M} \sum_{m=1}^{M} \frac{|X_L(m, k)|^2 + |X_R(m, k)|^2}{2} \qquad (1)$$

    A sinusoid is detected in $S_{LR}(k)$ at frequency $k_i$ if three conditions are satisfied:

    • $S_{LR}(k_i)$ is a local maximum

    • $\exists k \in [k_i - 2K, k_i] \mid 20 \log(S_{LR}(k_i)) > 20 \log(S_{LR}(k)) + \Delta$

    • $\exists k \in [k_i, k_i + 2K] \mid 20 \log(S_{LR}(k_i)) > 20 \log(S_{LR}(k)) + \Delta$

    where $\Delta$ is the peak detection threshold, set to 9 (dB) in this study.
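    The three conditions can be sketched as follows. Two assumptions are made because the extracted text is incomplete: the two interval conditions are read as "there exists a bin within 2K bins on that side lying more than the threshold below the peak", and the value of K, which is not given in the surviving text, is treated as a free parameter. The function and variable names are hypothetical.

```python
import math


def binaural_average(s_l, s_r):
    """S_LR(k): mean of the left and right Welch power spectra, as in (1)."""
    return [(l + r) / 2.0 for l, r in zip(s_l, s_r)]


def detect_peaks(s_lr, big_k, threshold_db=9.0):
    """Return the bins k_i of s_lr (strictly positive values) passing all three tests."""
    level = [20.0 * math.log10(v) for v in s_lr]  # 20 log scale, as in the text
    last = len(s_lr) - 1
    peaks = []
    for ki in range(1, last):
        # Condition 1: local maximum.
        if not (level[ki] > level[ki - 1] and level[ki] >= level[ki + 1]):
            continue
        lo = max(0, ki - 2 * big_k)
        hi = min(last, ki + 2 * big_k)
        # Conditions 2 and 3: some bin on each side is more than the threshold lower.
        left_ok = any(level[ki] > level[k] + threshold_db for k in range(lo, ki + 1))
        right_ok = any(level[ki] > level[k] + threshold_db for k in range(ki, hi + 1))
        if left_ok and right_ok:
            peaks.append(ki)
    return peaks


# Hypothetical spectra: a strong line at bin 4 and a weak bump at bin 8.
s_left = [1, 1, 1, 1, 12, 1, 1, 1, 2, 1, 1]
s_right = [1, 1, 1, 1, 8, 1, 1, 1, 1, 1, 1]
s_lr = binaural_average(s_left, s_right)
peaks = detect_peaks(s_lr, big_k=1)
```

    In this toy case the line at bin 4 stands 20 dB above its neighbours and is kept, while the bump at bin 8 rises only about 3.5 dB and is rejected by the 9 dB threshold.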


    further investigated in a formal listening test reported in [17]. The results confirmed that the original and resynthesized sounds were indistinguishable for a window size greater than or equal to 1024 samples.

    [Figure 4 image: four stacked panels plotted against frequency (Hz) on a logarithmic axis from 10^1 to 10^4: power spectrum L (dB), -60 to 0; power spectrum R (dB), -60 to 0; IC, 0 to 1; IPD (rad), -2 to 2. Legend: ref, synth, error.]

    Figure 4: Result of the binaural analysis/synthesis algorithm for a binaural signal recorded at position SAG (row 22) in the aircraft.

    4. CONCLUSION

    We presented a binaural model for interior aircraft sounds. The analysis stage uses a binaural peak detection to extract sinusoidal components and the residual is modeled by spectral envelopes, interaural coherence and phase difference. The model was validated on in-flight binaural recordings. Our method correctly reproduces the spectral and spatial cues of the original sounds. A formal listening test is presented in a companion paper [17] to further validate the model perceptually.

    5. ACKNOWLEDGEMENT

    This research was jointly supported by an NSERC grant (CRDPJ 357135-07) and research funds from CRIAQ, Bombardier and CAE to A. Berry and C. Guastavino.

    6. REFERENCES

    [1] D. A. McCurdy and R. E. Grandle, "Aircraft noise synthesis system," NASA Technical Memorandum 89040, 1987.

    [2] D. Berckmans, K. Janssens, H. V. der Auweraer, P. Sas, and W. Desmet, "Model-based synthesis of aircraft noise to quantify human perception of sound quality and annoyance," Journal of Sound and Vibration, vol. 311, no. 3–5, pp. 1175–1195, 2008.

    [3] J. F. Wilby, "Aircraft interior noise," Journal of Sound and Vibration, vol. 190, no. 3, pp. 545–564, 1996.

    [4] K. Janssens, A. Vecchio, and H. V. der Auweraer, "Synthesis and sound quality evaluation of exterior and interior aircraft noise," Aerospace Science and Technology, vol. 12, no. 1, pp. 114–124, 2008.

    [5] J. Blauert, Spatial Hearing. The MIT Press edition, 1997.

    [6] C. Faller, "Parametric multichannel audio coding: synthesis of coherence cues," IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 1, pp. 299–310, 2006.

    [7] P. D. Welch, "The Use of Fast Fourier Transform for the Estimation of Power Spectra: A Method Based on Time Averaging Over Short, Modified Periodograms," IEEE Transactions on Audio and Electroacoustics, vol. 15, pp. 70–73, 1967.

    [8] J. O. Smith, Mathematics of the Discrete Fourier Transform (DFT). W3K Publishing, 2007. [Online]. Available: http://www.w3k.org/books/

    [9] D. Slepian, "Prolate spheroidal wave functions, Fourier analysis, and uncertainty. V - The discrete case," Bell System Technical Journal, vol. 57, pp. 1371–1430, 1978.

    [10] X. Serra and J. O. Smith, "Spectral modeling synthesis: A sound analysis/synthesis system based on a deterministic plus stochastic decomposition," Computer Music Journal, vol. 14, no. 4, pp. 12–24, 1990.

    [11] J. W. Beauchamp, Ed., Analysis, Synthesis, and Perception of Musical Sounds: Sound of Music. Springer, 2007.

    [12] M. Raspaud and G. Evangelista, "Binaural partial tracking," in Proc. of the 11th Int. Conference on Digital Audio Effects (DAFx08), 2008.

    [13] J. G. Proakis and D. G. Manolakis, Digital signal processing: Principles, algorithms, and applications, 3rd ed. Englewood Cliffs, NJ: Prentice Hall International, 1996.

    [14] H. C. So, Y. T. Chan, Q. Ma, and P. C. Ching, "Comparison of various periodograms for sinusoid detection and frequency estimation," IEEE Transactions on Aerospace and Electronic Systems, vol. 35, pp. 945–950, 1999.

    [15] S. Marchand, "Advances in spectral modeling of musical sound," Habilitation à Diriger des Recherches, Université Bordeaux 1, 2008.

    [16] [Online]. Available: http://mil.mcgill.ca/waspaa2011/model/

    [17] J. Langlois, C. Verron, P.-A. Gauthier, and C. Guastavino, "Perceptual evaluation of interior aircraft sound models," in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2011), 2011.