Noise and Echo Control for Immersive Voice Communication in Spacesuits
9/2/2010
Yiteng (Arden) Huang
WeVoice, Inc., Bridgewater, New Jersey, USA
Presented as a keynote speech at the International Workshop on Acoustic Echo and Noise Control (IWAENC) in Tel Aviv, Israel, on September 2, 2010
2 | Huang: Noise and Echo Control for Immersive Voice Communication in Spacesuits All Rights Reserved © WeVoice, Inc. 2010
About the Project
Financially sponsored by the NASA SBIR (Small Business Innovation Research) program
Phase I feasibility research: Jan. 2008 – July 2008
Phase II prototype development: Jan. 2009 – Jan. 2011
Other team members:
• Jingdong Chen, WeVoice, Inc., Bridgewater, New Jersey, USA
• Scott Sands, NASA Glenn Research Center (GRC), Cleveland, Ohio, USA
• Jacob Benesty, University of Quebec, Montreal, Quebec, Canada
Outline
1. Problem Identification and Research Motivation
2. Problem Analysis and Technical Challenges
3. Noise Control with Microphone Arrays
4. Hardware Development
5. Software Development
6. A Portable, Real-Time Demonstration System
7. Towards Immersive Voice Communication in Spacesuits
Section 1: Problem Identification and Research Motivation
Requirements of In-Suit Audio
Speech Quality and Intelligibility:
90% word identification rate
Hearing Protection:
Limits total noise dose, hazard noise, and on-orbit continuous and impulse noise for waking and sleeping periods
Noise loads are very high during launch and orbital maneuvers.
Audio Control and Interfaces:
Provides manual silencing features and volume controls
Operation at Non-Standard Barometric Pressure Levels (BPLs):
Operates effectively between 30 kPa and 105 kPa
Current In-Suit Audio System
Current Solution: Communication Carrier Assembly (CCA) Audio System
[Figure: current in-suit audio system. Labels: Chin Cup, Microphone Module, Microphone Boom, Skullcap, Perspiration Absorption Area, Helmet, Helmet Ring, Earpiece]
Extravehicular Mobility Unit (EMU) CCA
• For shuttle and International Space Station (ISS) operations
• A large gain applied to the outbound speech for sufficient sound volume at low static pressure levels (30 kPa) leads to clipping and strong distortion during operations near sea-level BPL.
[Figure: EMU CCA. Labels: Interconnect wiring, Nylon/spandex top, Teflon sidepiece and pocket, Electret Microphone (two), Interface cable and connector, Ear seal, Ear cup. Source: O. Sands, NASA GRC]
Advanced Crew Escape Suit (ACES) CCA
• For shuttle launch and entry operations
• Hearing protection provided by the ACES CCA may not be sufficient.
[Figure: ACES CCA. Label: Dynamic Microphones. Source: O. Sands, NASA GRC]
Developmental CCA
• The active earpieces will be used in conjunction with the CCA ear cups during launch and other high noise events and can be removed for other suited operations.
• The active earpieces alone nearly provide the required level of hearing protection.
[Figure: developmental CCA. Labels: Noise Canceling Microphones, Active In-Canal Earpieces, Ear Cups. Source: O. Sands, NASA GRC]
CCA Systems: Pros
• High outbound speech intelligibility and quality, SNR near optimum
Use close-talking microphones
A high degree of acoustic isolation between the in-suit noise and the suit subject’s vocalizations
A high degree of acoustic isolation between the inbound and outbound signals
The human body does NOT transmit vibration-borne noise
• Provide very good hearing protection.
CCA Systems: Cons
• The microphones need to be close to the mouth of a suited subject.
• A number of recognized logistical issues and inconveniences:
Cannot adjust the cap and the microphone booms during EVA operations, which can last from 4 to 8 hours
The close-talking microphones interfere with the suited subject’s eating and drinking, and are susceptible to contamination.
The communication cap needs to fit well. Caps in a variety of different sizes need to be built and maintained, e.g., 5 sizes for EMU caps.
Wire fatigue for the microphone booms
• These problems cannot be resolved with incremental improvements to the basic design of the CCA systems.
Stakeholder Interviews
• The CCA ear cups produce pressure points that cause discomfort.
• Stakeholders suggested using microphone arrays and helmet speakers.
• Suit subject comfort should be maximized as much as possible, given that other constraints can be met (relaxed and traded off):
Clear two-way voice communications
Hearing protection from the fan noise in the life support system ventilation loop
Properly containing and managing hair and sweat inside the helmet
Adequate SNR for the potential use of automatic speech recognition for the suit’s information system
Two Alternative Architectural Options for In-Suit Audio
1. Integrated Audio (IA): Instead of being housed in a separate subassembly, both the microphones and the speakers are integrated into the suit/helmet.
2. Hybrid Approach: Employs the inbound portion of a CCA system with the outbound portion of an IA system.
[Figure label: Helmet Speaker]
Section 2: Problem Analysis and Technical Challenges
Noise from Outside the Spacesuit
• During launch, entry descent, and landing:
Impulse noise < 140 dBSPL, Hazard noise < 105 dBA
• On orbit:
Impulse noise: < 140 dBSPL waking hours and < 83 dBSPL sleeping
Limits on continuous on-orbit noise levels by frequency:

Band Center Frequency (Hz) | 63 | 125 | 250 | 500 | 1k | 2k | 4k | 8k | 16k
Sound Pressure Level (dB)  | 72 | 65  | 60  | 56  | 53 | 51 | 50 | 48 | 48

• Remark: During EVA operations, ambient noise is at most a minor problem.

SPL (dB) | Perception | Typical Environments
85 – 95  | Very High Noise: speech almost impossible to hear | Construction Site, Loud Machine Shop, Noisy Manufacturing
75 – 85  | High Noise: speech is difficult to hear | Assembly Line, Crowded Bus/Transit Waiting Area, Very Noisy Restaurant/Bar
65 – 75  | Medium Noise: must raise voice to be heard | Department Store, Band/Public Area, Supermarket
55 – 65  | Low Noise: speech is easy to hear | Doctor’s Office, Hospital, Hotel Lobby
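To relate the per-band limits to the SPL ranges in the perception table, the band levels can be summed on an energy basis. The sketch below is only an illustration: it assumes an unweighted sum over the nine listed bands, which the slide itself does not state as a requirement, and the function name is hypothetical.

```python
import math

# Continuous on-orbit noise limits by octave band, copied from the slide.
BAND_CENTERS_HZ = [63, 125, 250, 500, 1000, 2000, 4000, 8000, 16000]
BAND_LIMITS_DB = [72, 65, 60, 56, 53, 51, 50, 48, 48]


def overall_level_db(band_levels_db):
    """Energetically sum band levels (dB) into one unweighted level (dB)."""
    total_power = sum(10.0 ** (level / 10.0) for level in band_levels_db)
    return 10.0 * math.log10(total_power)


overall = overall_level_db(BAND_LIMITS_DB)
print(f"Unweighted sum of the band limits: {overall:.1f} dB")
```

Under these assumptions the sum is dominated by the 63 Hz band and lands a little above 73 dB, which falls in the "Medium Noise" row of the perception table.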
Structure-Borne Noise Inside the Spacesuit
• Four noise sources (Begault & Hieronymus 2007):
1. Airflow and air inlet hissing noise, as well as fan/pump noise due to required air supply and circulation
2. Arm, leg, and hip bearing noise
3. Suit-impact noise, e.g., footfall
4. Swishing-like noise due to air movement caused by walking (since the suits are closed pressure environments)
• For CCA systems, since the suit subject’s body does not transmit bearing and impact noise, only airflow-related noise needs to be controlled.
• For Integrated Audio (IA) systems, microphones are mounted directly on the suit structure and vibration noise is loud.
Acoustic Challenges
• Complicated noise field:
Temporal domain: Has both stationary and non-stationary noise
Spectral domain: Inherently wideband
Spatial domain: Near field; Possibly either directional or dispersive
• Highly reverberant enclosure:
The helmet is made of highly reflective materials.
Strong reverberation dramatically reduces the intelligibility of speech uttered by the suit subject and degrades the performance of an automatic speech recognizer.
Strong reverberation leads to a more dispersive noise field, which makes beamforming less effective.
Section 3: Noise Control with Microphone Arrays
Proposed Noise Control Scheme for IA/Hybrid Systems
[Figure: block diagram with five numbered stages. Blocks: Microphone Array, Noise Reference, Adaptive Noise Cancellation, Beamforming / Multichannel Noise Reduction, Single Channel Noise Reduction, Acoustic Source Localization, Head Position Calibration, Head Motion Tracker. Signal labels: Outbound Speech; Mouth range and incident angle with respect to the microphone array]
Current Research Focus
[Figure: block diagram of the stages in current focus. Blocks: Microphone Array, Beamforming / Multichannel Noise Reduction, Single Channel Noise Reduction. Output: Outbound Speech]
Beamforming: Far-Field vs. Near-Field
[Figure: two panels, each showing a linear array of N microphones with spacing d, filters h1, h2, ..., hN, microphone signals X1(f), X2(f), ..., XN(f), and a summing node. Near-field panel: near-field sound source S(f, rs) at position rs, far-field noise V(f, ψ) arriving as plane waves, output Y(f, ψ, rs). Far-field panel: far-field sound source of interest S(f, θ), far-field noise V(f, ψ) arriving as plane waves, path difference (N-1)·d·cos(ψ) across the array, output Y(f, ψ, θ)]
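The difference between the two propagation models can be illustrated numerically. The sketch below compares the path difference across a linear array for a plane wave with that for a point source close to the array. The geometry (microphone 1 at the origin, angles measured from the array axis), the spacing, the range, and the function names are all assumptions made for illustration.

```python
import math


def far_field_path_diff(n, d, psi):
    """Path to microphone 1 minus path to microphone n for a plane wave from angle psi (rad)."""
    return (n - 1) * d * math.cos(psi)


def near_field_path_diff(n, d, r_s, theta):
    """Same quantity for a point source at range r_s (m) from microphone 1, at angle theta (rad)."""
    x = (n - 1) * d  # position of microphone n along the array axis
    r_n = math.sqrt(r_s ** 2 + x ** 2 - 2.0 * r_s * x * math.cos(theta))  # law of cosines
    return r_s - r_n


N, D = 4, 0.02                 # assumed: 4 microphones, 20 mm spacing
ANGLE = math.radians(60.0)     # assumed incident angle

far = far_field_path_diff(N, D, ANGLE)
near_close = near_field_path_diff(N, D, 0.10, ANGLE)    # source 10 cm away
near_distant = near_field_path_diff(N, D, 100.0, ANGLE)  # source 100 m away

print(f"plane wave: {far * 1000:.2f} mm")
print(f"point source at 0.1 m: {near_close * 1000:.2f} mm")
print(f"point source at 100 m: {near_distant * 1000:.2f} mm")
```

For a distant source the two models agree, while at a range comparable to the array length the plane-wave formula is clearly off, which is why a mouth inside a helmet calls for the near-field model.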
Fixed Beamformer vs. Adaptive Beamformer
Microphone Array Beamformers
Fixed Beamformers: Delay-and-Sum, Filter-and-Sum
Adaptive Beamformers: MVDR (Capon), LCMV (Frost)/GSC

Property       | Fixed Beamformers | Adaptive Beamformers
Noise Field?   | Stationary, known before the design; isotropic noise generally assumed | Time varying, unknown
Reverberation? | Not concerned | Significant

Delay-and-Sum
• Simple
• Non-uniform directional responses over a wide spectrum of frequencies
Filter-and-Sum
• Complicated
• Uniform directional responses over a wide spectrum of frequencies: good for wideband signals, like speech
MVDR (Capon)
• Only the TDOAs of the speech source of interest need to be known – simple requirements.
• Reverberation causes the signal cancellation problem.
• Time-domain or frequency-domain
LCMV (Frost)/GSC
• The impulse responses (IRs) from the source to the microphones have to be known or estimated.
• Errors in the IRs lead to the signal cancellation problem.
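The non-uniform directional response of a delay-and-sum beamformer can be seen with a small calculation. The sketch below evaluates the far-field magnitude response of a uniform linear array at two frequencies. The array size, spacing, speed of sound, and function name are illustrative assumptions, not values from the slides.

```python
import cmath
import math


def das_response(freq_hz, psi, steer, n_mics=4, d=0.02, c=343.0):
    """Normalized magnitude response of a delay-and-sum beamformer.

    The array is steered to angle `steer` and probed by a far-field
    plane wave from angle `psi` (both in radians from the array axis).
    """
    total = 0j
    for n in range(n_mics):
        # Residual delay at microphone n after the steering delays are applied.
        tau = n * d * (math.cos(psi) - math.cos(steer)) / c
        total += cmath.exp(-2j * math.pi * freq_hz * tau)
    return abs(total) / n_mics


steer = 0.0                 # look direction along the array axis
off_axis = math.pi / 2.0    # interference from broadside

low = das_response(500.0, off_axis, steer)
high = das_response(4000.0, off_axis, steer)
print(f"500 Hz response at 90 degrees:  {low:.3f}")
print(f"4000 Hz response at 90 degrees: {high:.3f}")
```

With these assumed dimensions the array barely attenuates the off-axis wave at 500 Hz but suppresses it strongly at 4 kHz, so the spatial selectivity varies widely across the speech band.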
Comments on Traditional Microphone Array Beamforming
• For incoherent noise sources, the gain in SNR is low if the number of microphones is small.
• For coherent noise sources whose directions are different from that of the speech source, a theoretically optimal gain in SNR can be high but is difficult to obtain due to a number of practical limitations:
Unavailability of precise a priori knowledge of the acoustic impulse responses from the speech sources to the microphones.
Inconsistent responses of the microphones across the array.
• For coherent noise sources that are in the same direction as the speech source, beamforming (as a spatial filter) is ineffective.
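The first comment can be made concrete with the textbook result for a delay-and-sum beamformer: with perfectly aligned speech and equal-power, mutually uncorrelated noise at each microphone, the SNR gain is 10·log10(N) dB. The sketch below tabulates it; the idealized noise model and the function name are assumptions.

```python
import math


def incoherent_noise_gain_db(n_mics):
    """SNR gain (dB) of delay-and-sum for spatially uncorrelated, equal-power noise."""
    return 10.0 * math.log10(n_mics)


gains = {n: incoherent_noise_gain_db(n) for n in (2, 4, 8)}
for n, gain in gains.items():
    print(f"{n} microphones: {gain:.2f} dB")
```

Each doubling of the number of microphones buys only about 3 dB, which is why a small array gains little against incoherent noise.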
Multichannel Noise Reduction (MCNR)
• Signal Model:
[Figure: a speech source of interest s(k) reaches microphones 1, 2, ..., N through impulse responses g1, g2, ..., gN; noise v(k) is added; microphone outputs x1(k), x2(k), ..., xN(k)]
• A conceptual comparison of beamforming and MCNR:
Beamforming: output s(k); dereverberation and denoising; knowledge related to the source position or gn
MCNR: output x1,s(k); only denoising
• Beamformer: Spatial Filtering
• Array Setup: Calibration is necessary – possibly time/effort consuming
• MCNR: Statistical Filtering
• Array Setup: No need to strictly demand a specific array geometry/pattern
Frequency-Domain MVDR Filter for MCNR
• The problem formulation:
• The MVDR filter:
• A more practical implementation:
where
• Similar to traditional single-channel noise reduction methods, the noise PSD matrix is estimated during silent periods and the signal PSD matrix is estimated during speech periods.
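As an illustration of the distortionless constraint behind an MVDR filter, the sketch below uses the standard textbook form h = Φv⁻¹d / (dᴴΦv⁻¹d) at a single frequency for two microphones. This form, the example noise PSD matrix, the steering vector, and all function names are assumptions, and the formulation on the slide may differ in detail.

```python
import cmath


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def mvdr_weights(phi_v, d):
    """MVDR filter: minimize the output noise power subject to h^H d = 1."""
    w = matvec(inv2(phi_v), d)
    scale = inner(d, w)  # d^H Phi_v^{-1} d
    return [x / scale for x in w]


# Assumed example: partially coherent noise and a pure-delay steering vector.
phi_v = [[1.0 + 0j, 0.5 + 0j], [0.5 + 0j, 1.0 + 0j]]
d = [1.0 + 0j, cmath.exp(-1j)]

h = mvdr_weights(phi_v, d)
constraint = inner(h, d)                       # should equal 1
noise_out = inner(h, matvec(phi_v, h)).real    # residual noise power
print(f"h^H d = {constraint:.3f}, output noise power = {noise_out:.3f}")
```

In this example the constraint holds exactly while the output noise power drops below the unit noise power at a single microphone.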
Comparison of the MVDR Filters for Beamforming and MCNR
• MVDR for Beamforming (BF):
• MVDR for MCNR:
• The acoustic impulse responses can at best be estimated up to a scale:
where denotes the true response vector.
Leads to speech distortion.
• Note: In the implementation of the MVDR-MCNR, the channel responses do not need to be known.
Distortionless Multichannel Wiener Filter for MCNR
• Use what we call spatial prediction:
• Formulate the following optimization problem:
where
• The distortionless multichannel Wiener (DW) filter for MCNR:
• The optimal Wiener solution for the non-causal spatial prediction filters:
where So,
• It was found that
Single-Channel Noise Reduction (SCNR) for Post-Filtering
• Beamforming: The Wiener filter (the optimal solution in the MMSE sense) can be factorized as
(factor labels: MVDR Beamformer; Wiener Filter for SCNR)
Note: For a complete and detailed development of this factorization, please refer to Eq. (3.19) of the following book. M. Brandstein and D. Ward, eds., Microphone Arrays: Signal Processing Techniques and Applications, Berlin, Germany: Springer, 2001.
• MCNR: Again, the Wiener filter can be factorized as
(factor labels: MVDR for MCNR; Wiener Filter for SCNR)
Note: For a complete and detailed development of this factorization, please refer to Eq. (6.117) of the following book. J. Benesty, J. Chen, and Y. Huang, Microphone Array Signal Processing, Berlin, Germany: Springer, 2008.
Single-Channel Noise Reduction (SCNR)
• The signal model:
• SCNR filter:
• Error signal:
• MSE cost function:
• The Wiener filter:
where
and
• Other SCNR methods: Parametric Wiener filter, Tradeoff filter.
• A well-known feature: Noise reduction is achieved at the cost of adding speech distortion.
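The trade-off between noise reduction and speech distortion can be seen in the common frequency-domain form of the Wiener gain, the speech PSD divided by the noisy-signal PSD. The sketch below assumes that form, which may differ from the exact expressions on the slide, and uses made-up PSD values and a hypothetical function name.

```python
def wiener_gain(speech_psd, noise_psd):
    """Per-bin Wiener gain: speech PSD over noisy-signal PSD."""
    return speech_psd / (speech_psd + noise_psd)


# Assumed example: three frequency bins at 10 dB, 0 dB, and -10 dB SNR.
speech_psd = [10.0, 1.0, 0.1]
noise_psd = [1.0, 1.0, 1.0]

gains = [wiener_gain(s, v) for s, v in zip(speech_psd, noise_psd)]
for s, g in zip(speech_psd, gains):
    print(f"SNR {s:5.1f}: gain {g:.3f}")
```

The same gain scales the speech and the noise in each bin. Since it is below one and varies with the per-bin SNR, the speech spectrum is reshaped as well, which is the distortion the slide refers to.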
New Idea for SCNR
• A second-order complex circular random variable (CCRV) has:
which implies that and its conjugate are uncorrelated.
• In general, speech is not a second-order CCRV:
• But noise is a second-order CCRV if stationary, and not otherwise.
• Examine
• This is similar to the signal model of a two-element microphone array. So there is a chance to reduce noise without adding any speech distortion.
Annotations on the equation: correlated but not completely coherent; uncorrelated or coherent
Widely Linear Wiener Filter
• New filter for SCNR:
• Error signal:
• Widely linear MSE:
• Then the widely linear Wiener filter or MVDR type of filters can be developed.
Section 4: Hardware Development
Computational Platform/Technology Selection
Three platforms under consideration:
• ASIC
• DSP
• FPGA
Trade-off among performance, power consumption, size, and costs
Four competing factors:
• The count of transistors employed
• The number of clock cycles required
• The time taken to develop an application
• Nonrecurring engineering (NRE) costs
ASIC
• Low numbers of transistors and clock cycles
• Long development time and high NRE costs
• Effective in performance, power, and size, but not in cost
DSP
• Low development and NRE costs
• Low power consumption
• More effort to convert the design to ASICs
FPGA
• Not suited to processing sequential conditional data flow, but efficient in concurrent applications
• Supports faster I/O than DSPs
• One step closer to ASIC than DSP
• High development cost due to performance optimization
System Block Diagram
[Figure: system block diagram. Eight microphone capsules connect through XLR male/female connectors (pin 1 GND, pin 2 HOT, pin 3 COLD) to microphone powering circuits 1 to 8, which feed the analog input of the FPGA board through DB25 male/female connectors. FPGA board blocks: eight microphone preamps (G) with jumpers for gain control, 8-ch 24-bit 48 kHz ADC, Altera FPGA, SDRAM, Flash, JTAG (male), power management IC, power jack, digital output interface (USB 2.0)]
FPGA Board Block Diagram
[Figure: FPGA board block diagram. Blocks: OPA1632 (1) to OPA1632 (8), ADS1278, Altera Cyclone III EP3C55F484C8 FPGA, EPCS16, 16 MB SDRAM (×32) devices, 16 MB Flash (×16), 50 MHz XTAL, 24.576 MHz XTAL, USB 2.0 (High Speed), User LED/IOs, 3.3 V]
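A quick budget check on the figures in the block diagrams: the sketch below computes the raw ADC data rate for eight 24-bit channels at 48 kHz and compares it with the USB 2.0 high-speed signaling rate of 480 Mbit/s. That all eight raw channels are streamed over USB, and that the 24.576 MHz crystal serves as the audio master clock, are assumptions made for illustration.

```python
CHANNELS = 8              # from the slide: 8-ch ADC
BITS_PER_SAMPLE = 24      # from the slide: 24-bit
SAMPLE_RATE_HZ = 48_000   # from the slide: 48 kHz

USB2_HIGH_SPEED_BPS = 480_000_000   # USB 2.0 high-speed signaling rate
AUDIO_CRYSTAL_HZ = 24_576_000       # from the slide: 24.576 MHz XTAL

raw_rate_bps = CHANNELS * BITS_PER_SAMPLE * SAMPLE_RATE_HZ
usb_fraction = raw_rate_bps / USB2_HIGH_SPEED_BPS
clock_ratio = AUDIO_CRYSTAL_HZ // SAMPLE_RATE_HZ

print(f"Raw ADC data rate: {raw_rate_bps / 1e6:.3f} Mbit/s")
print(f"Fraction of USB 2.0 high-speed signaling rate: {usb_fraction:.1%}")
print(f"Crystal frequency / sample rate: {clock_ratio}")
```

The raw stream is about 9.2 Mbit/s, a small fraction of the USB signaling rate, and the 24.576 MHz crystal is an exact multiple (512×) of the 48 kHz sample rate.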
Prototype FPGA Board: the Top View
[Figure: top view of the prototype FPGA board, 174.8 mm × 101 mm. Labels: Phantom Power Feeding, Mic. Preamp Gain Jumpers, OPA1632, REF1004, ADS1278, User LEDs, EPCS16, User I/Os, JTAG, FT2232H, USB 2.0 Jack, 12 MHz Crystal, GND, TPS65053, Flash, DC Power Jack, Power LED, SDRAMs, Cyclone III FPGA, Analog Power DC 9V, Analog Power DC 5V, DB25]
Prototype FPGA Board: the Bottom View
OPA1632
50 MHz Clock Oscillator (OSC2)
24.576 MHz Clock Oscillator (OSC1)
FPGA System Development Flow Adopted in the Project
System on Programmable Chip (SoPC) + C/C++ Programming:
1) Use SoPC Builder to construct a soft-core NIOS II processor embedded on the Altera FPGA
2) Develop software/DSP systems in C/C++ on the NIOS II processor
• Advantages:
Short development cycle/time
Low cost
High reliability
Reusability of intellectual property
• Drawbacks:
Poor efficiency and low performance.
Efficiency can be improved by identifying the time-consuming functions (e.g., FFT and IFFT) and accelerating them with the C2H (C-to-Hardware) tool.
[Figure: SoPC system blocks: CPU (Nios II), ROM, RAM, I/O, UART, DSP]
MEMS Microphone Array (XG-MPC-MEMS)
[Figure: array layout]
• Microphones: Analog Devices ADMP402 MEMS microphones, 2.5 mm × 3.35 mm
• 7 subarrays, numbered 1 to 7, with microphone positions labeled a, b, c, d in each
• Dimension labels in the figure: 5 mm and 20 mm
• Connector: Samsung 18-pin connector, Pin 1 to Pin 18
MEMS Microphone Array Box
[Figure: WeVoice MEMS microphone array in its box, with the Samsung 18-pin connector (Pin 1 to Pin 18) and numbered subarray positions]
Dimension labels: 135 mm, 12.5 mm, 155 mm
Section 5: Software Development
FPGA Program Flowchart
[Figure: timeline of the per-frame processing inside the FPGA, from ADC input to USB output]
• Processing blocks: data in & preprocessing, 4-ch FFT, MCNR+SCNR, 1-ch IFFT, overlap add, USB trans.
• Processing units: Nios II Soft Core, FFT/IFFT Processor
• Time axis (ms): t, t+4, t+8, with 1 time frame marked
• Processing delay < 8 ms
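The frame-based flow in this flowchart (windowed input frames, a multichannel FFT, MCNR+SCNR, a single-channel IFFT, and overlap add) can be illustrated with a minimal Python sketch. It is not the project's Nios II C/C++ code. The 16 kHz sample rate, the 50% frame overlap, the square-root Hann window, the hand-written `fft` helper, and the channel-averaging `placeholder_noise_reduction` function are all assumptions made for illustration; only the block names and the 4 ms spacing on the time axis come from the slide.

```python
import cmath
import math

FS = 16000            # ASSUMPTION: sample rate in Hz (not stated on the slide)
HOP = FS * 4 // 1000  # 4 ms frame advance, read from the t, t+4, t+8 time axis
FRAME = 2 * HOP       # ASSUMPTION: 50% overlap, so each frame spans 8 ms
# Square-root periodic Hann window. Applied once before the FFT and once after
# the IFFT, the overlapping windows add up to one.
WINDOW = [math.sqrt(0.5 - 0.5 * math.cos(2 * math.pi * k / FRAME))
          for k in range(FRAME)]


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def placeholder_noise_reduction(spectra):
    """Hypothetical stand-in for MCNR+SCNR: average the channels in each bin."""
    n_ch = len(spectra)
    return [sum(ch[k] for ch in spectra) / n_ch for k in range(FRAME)]


def process(channels):
    """channels: list of equal-length sample lists. Returns one output list."""
    n = len(channels[0])
    out = [0.0] * n
    for start in range(0, n - FRAME + 1, HOP):
        # data in & preprocessing: cut and window one frame per channel
        frames = [[ch[start + k] * WINDOW[k] for k in range(FRAME)]
                  for ch in channels]
        spectra = [fft(f) for f in frames]                # multichannel FFT
        y_spec = placeholder_noise_reduction(spectra)     # MCNR+SCNR stand-in
        y = [v.real / FRAME for v in fft(y_spec, inverse=True)]  # 1-ch IFFT
        for k in range(FRAME):                            # overlap add
            out[start + k] += y[k] * WINDOW[k]
    return out
```

With identical input channels and the unity-gain placeholder, `process` returns the input unchanged except for the first and last `HOP` samples, which is a quick check that the window and the overlap-add step are consistent.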
IA System Windows Host Software
• Programmed in Microsoft Visual C++
• DirectSound is used to play back audio (speech).
[Figure: Splash window of the program]
IA System Windows Host GUI: Multitrack View
IA System Windows Host GUI: Single-Track View
IA System Windows Host GUI: Playing Back
Section 6: A Portable, Real-Time Demonstration System
The Portable, Real-Time Demo System
System components (figure labels):
• Suited Subject
• MEMS Microphone Array
• DB25 Connectors
• Audio Cable
• FPGA Board
• Power Supply: Linear DC 12-20 V/1 A
• USB 2.0 Cable
• PC
Section 7: Towards Immersive Voice Communication in Spacesuits
What Is Immersive Communication and Why Do We Want It?
Telecommunication helps people collaborate and share information by cutting across the following 3 separations/constraints:
Long distance
Real time
Physical boundaries
Modern telecommunication technologies have so far succeeded in transcending the first two constraints, i.e., the long-distance and real-time constraints.
Immersive communication offers a feeling of being together and sharing a common environment during collaboration.
Immersive communication aims to break the physical boundaries, which is the “last mile” problem in communication.
What needs to be solved for immersive communication systems?
• Single-Channel Acoustic Echo Cancellation
• Multichannel Acoustic Echo Cancellation
• Synthesized Stereo Audio Mixing System
• Beamforming
• Blind Source Separation
• Acoustic Source Localization and Tracking
• Stereophony System for Spatial Sound Reproduction
• Wave Field Synthesis
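As a point of reference for single-channel acoustic echo cancellation, the first of the problems named on these slides: it is commonly approached with an adaptive filter that estimates the loudspeaker-to-microphone echo path and subtracts the predicted echo from the microphone signal. The Python sketch below uses the textbook normalized LMS (NLMS) update. It is an illustrative assumption, not the method used in this project, and the function name, tap count, and step size are hypothetical.

```python
def nlms_echo_canceller(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Return (residual signal, final filter weights) from an NLMS filter.

    far_end: samples sent to the loudspeaker (reference signal)
    mic:     microphone samples containing the echo of far_end
    """
    w = [0.0] * taps      # adaptive estimate of the echo path
    x_buf = [0.0] * taps  # most recent far-end samples, newest first
    err = []
    for x, d in zip(far_end, mic):
        x_buf = [x] + x_buf[:-1]
        y_hat = sum(wi * xi for wi, xi in zip(w, x_buf))  # predicted echo
        e = d - y_hat                                     # echo-reduced output
        norm = sum(xi * xi for xi in x_buf) + eps         # input power
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x_buf)]
        err.append(e)
    return err, w
```

In this noise-free, single-talk setting the weights converge toward the echo path. A practical canceller also has to handle double talk and near-end noise, and the stereo/multichannel case raises further difficulties that this sketch does not address.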
Why Immersive Voice Communication in Spacesuits?
Immersive voice communication exploits human binaural hearing.
Provides enhanced situational awareness for a suited crewmember:
Can improve the productivity of collaboration among the crewmembers
Can produce potential safety benefits
Crew comfort can be optimized.
What Problems Need to be Solved?
• Stereo/Multichannel Acoustic Echo Cancellation (MCAEC)
• Integration of MCAEC and MCNR
• Three-Dimensional (3D) Audio
Conclusions
• While it has been more than 40 years since Neil Armstrong landed on the Moon, astronauts are still using the communication carrier assembly (CCA) based audio system for voice communication in spacesuits.
• The new spacesuit design will take advantage of the most recent advances in multichannel acoustic and speech signal processing for echo and noise control, while significantly improving crew comfort and ease of use:
Noise reduction with microphone arrays
Multichannel echo cancellation
Integrated echo and noise control
3D audio
• We explained the difference between the traditional beamforming method and what we call the multichannel noise reduction approach.
• We presented an intuitive interpretation of the widely linear Wiener filter for single-channel noise reduction.
• We described a new application of immersive communication in space exploration, ancillary to its mainstream use in commercial telecommunication systems.