Noise and Echo Control for Immersive Voice Communication in Spacesuits

9/2/2010

Yiteng (Arden) Huang

WeVoice, Inc., Bridgewater, New Jersey, USA

[email protected]

Presented as a keynote speech at the International Workshop on Acoustic Echo and Noise Control (IWAENC) in Tel Aviv, Israel, on September 2, 2010

2 | Huang: Noise and Echo Control for Immersive Voice Communication in Spacesuits All Rights Reserved © WeVoice, Inc. 2010

About the Project

Financially sponsored by the NASA SBIR (Small Business Innovation Research) program

Phase I feasibility research: Jan. 2008 – July 2008

Phase II prototype development: Jan. 2009 – Jan. 2011

Other team members:

• Jingdong Chen, WeVoice, Inc., Bridgewater, New Jersey, USA

• Scott Sands, NASA Glenn Research Center (GRC), Cleveland, Ohio, USA

• Jacob Benesty, University of Quebec, Montreal, Quebec, Canada


Outline

1. Problem Identification and Research Motivation

2. Problem Analysis and Technical Challenges

3. Noise Control with Microphone Arrays

4. Hardware Development

5. Software Development

6. A Portable, Real-Time Demonstration System

7. Towards Immersive Voice Communication in Spacesuits









Section 1
















Requirements of In-Suit Audio

Speech Quality and Intelligibility:

90% word identification rate

Hearing Protection:

Limits total noise dose, hazard noise, and on-orbit continuous and impulse noise for waking and sleeping periods

Noise loads are very high during launch and orbital maneuvers.

Audio Control and Interfaces:

Provides manual silencing features and volume controls

Operation at Non-Standard Barometric Pressure Levels (BPLs):

Operates effectively between 30 kPa and 105 kPa


Current In-Suit Audio System

Current Solution: Communication Carrier Assembly (CCA) Audio System

[Figure: CCA audio system. Labeled parts: chin cup, microphone module, microphone boom, skullcap, perspiration absorption area, helmet, helmet ring, earpiece.]


Extravehicular Mobility Unit (EMU) CCA

• For shuttle and International Space Station (ISS) operations

Source: O. Sands, NASA GRC

[Figure labels: interconnect wiring, nylon/spandex top, Teflon sidepiece and pocket, electret microphones, interface cable and connector, ear seal, ear cup.]

• A large gain applied to the outbound speech for sufficient sound volume at low static pressure levels (30 kPa) leads to clipping and strong distortion during operations near sea-level BPL.


Advanced Crew Escape Suit (ACES) CCA

Source: O. Sands, NASA GRC

• For shuttle launch and entry operations

• Hearing protection provided by the ACES CCA may not be sufficient.

[Figure label: dynamic microphones.]


Developmental CCA

• The active earpieces will be used in conjunction with the CCA ear cups during launch and other high noise events and can be removed for other suited operations.

• The active earpieces alone nearly provide the required level of hearing protection.

[Figure labels: noise canceling microphones, active in-canal earpieces, ear cups. Source: O. Sands, NASA GRC]


CCA Systems: Pros

• High outbound speech intelligibility and quality, SNR near optimum

Use close-talking microphones

A high degree of acoustic isolation between the in-suit noise and the suit subject’s vocalizations

A high degree of acoustic isolation between the inbound and outbound signals

The human body does NOT transmit vibration-borne noise

• Provide very good hearing protection.



CCA Systems: Cons

• The microphones need to be close to the mouth of a suited subject.

• A number of recognized logistical issues and inconveniences:

Cannot adjust the cap and the microphone booms during EVA operations, which can last from 4 to 8 hours

The close-talking microphones interfere with the suited subject’s eating and drinking, and are susceptible to contamination.

The communication cap needs to fit well. Caps in a variety of different sizes need to be built and maintained, e.g., 5 sizes for EMU caps.

Wire fatigue for the microphone booms

• These problems cannot be resolved with incremental improvements to the basic design of the CCA systems.


Stakeholder Interviews

• The CCA ear cups produce pressure points that cause discomfort.

• Stakeholders suggested using microphone arrays and helmet speakers.

• Suit subject comfort should be maximized as much as possible, given that other constraints can be met (relaxed and traded off):

Clear two-way voice communications

Hearing protection from the fan noise in the life support system ventilation loop

Properly containing and managing hair and sweat inside the helmet

Adequate SNR for the potential use of automatic speech recognition for the suit’s information system



Two Alternative Architectural Options for In-Suit Audio

1. Integrated Audio (IA): Instead of being housed in a separate subassembly, both the microphones and the speakers are integrated into the suit/helmet.

2. Hybrid Approach: Employs the inbound portion of a CCA system with the outbound portion of an IA system.

[Figure label: helmet speaker.]


Section 2
















Noise from Outside the Spacesuit

• During launch, entry descent, and landing:

Impulse noise < 140 dBSPL, Hazard noise < 105 dBA

• On orbit:

Impulse noise: < 140 dBSPL waking hours and < 83 dBSPL sleeping

Limits on continuous on-orbit noise levels by frequency:

Band Center Frequency (Hz):   63   125   250   500   1k   2k   4k   8k   16k
Sound Pressure Level (dB):    72   65    60    56    53   51   50   48   48

• Remark: During EVA operations, ambient noise is at most a minor problem.
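As a rough worked example of what the per-band limits add up to, the sketch below combines the nine band levels by summing their powers. The function name and the choice of an unweighted sum are assumptions made for illustration; the slides do not state an overall figure or a weighting.

```python
import math

# Octave-band limits for continuous on-orbit noise, copied from the table.
BAND_CENTERS_HZ = [63, 125, 250, 500, 1000, 2000, 4000, 8000, 16000]
BAND_LIMITS_DB = [72, 65, 60, 56, 53, 51, 50, 48, 48]


def overall_spl_db(band_levels_db):
    """Combine band levels (dB) into one level by summing their powers."""
    total_power = sum(10.0 ** (level / 10.0) for level in band_levels_db)
    return 10.0 * math.log10(total_power)


overall = overall_spl_db(BAND_LIMITS_DB)
print(f"Unweighted overall level at the band limits: {overall:.1f} dB")
```

The 63 Hz band dominates the sum, so the overall level sits only slightly above that band's 72 dB limit.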

SPL 85 – 95 dB: Very High Noise (speech almost impossible to hear). Typical environments: Construction Site, Loud Machine Shop, Noisy Manufacturing.

SPL 75 – 85 dB: High Noise (speech is difficult to hear). Typical environments: Assembly Line, Crowded Bus/Transit Waiting Area, Very Noisy Restaurant/Bar.

SPL 65 – 75 dB: Medium Noise (must raise voice to be heard). Typical environments: Department Store, Band/Public Area, Supermarket.

SPL 55 – 65 dB: Low Noise (speech is easy to hear). Typical environments: Doctor’s Office, Hospital, Hotel Lobby.


Structure-Borne Noise Inside the Spacesuit

• Four noise sources (Begault & Hieronymus 2007):

1. Airflow and air inlet hissing noise, as well as fan/pump noise due to required air supply and circulation

2. Arm, leg, and hip bearing noise

3. Suit-impact noise, e.g., footfall

4. Swishing-like noise due to air movement caused by walking (since the suits are closed pressure environments)

• For CCA systems, since the suit subject’s body does not transmit bearing and impact noise, only airflow-related noise needs to be controlled.

• For Integrated Audio (IA) systems, microphones are mounted directly on the suit structure and vibration noise is loud.



Acoustic Challenges

• Complicated noise field:

Temporal domain: Has both stationary and non-stationary noise

Spectral domain: Inherently wideband

Spatial domain: Near field; Possibly either directional or dispersive

• Highly reverberant enclosure:

The helmet is made of highly reflective materials.

Strong reverberation dramatically reduces the intelligibility of speech uttered by the suit subject and degrades the performance of an automatic speech recognizer.

Strong reverberation leads to a more dispersive noise field, which makes beamforming less effective.


Section 3
















Proposed Noise Control Scheme for IA/Hybrid Systems

[Block diagram with five numbered stages. Blocks: microphone array, noise reference, adaptive noise cancellation, beamforming, multichannel noise reduction, single-channel noise reduction, outbound speech. Supporting blocks: acoustic source localization, head position calibration, and head motion tracker, which provide the mouth range and incident angle with respect to the microphone array.]


Current Research Focus

[Block diagram: the part of the proposed scheme currently under study. Blocks: microphone array, beamforming, multichannel noise reduction, single-channel noise reduction, outbound speech.]


Beamforming: Far-Field vs. Near-Field

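[Figure, two panels, each showing an N-element linear array with element spacing d whose microphone signals X1(f), X2(f), ..., XN(f) pass through filters h1, h2, ..., hN and are summed (Σ).

Far-field panel: the sound source of interest S(f, θ) and the far-field noise V(f, ψ) both arrive as plane waves; the path difference across the array is (N-1)·d·cos(ψ); the output is Y(f, ψ, θ).

Near-field panel: a near-field sound source S(f, rs) at position rs with far-field plane-wave noise V(f, ψ); the output is Y(f, ψ, rs).]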


Fixed Beamformer vs. Adaptive Beamformer

Microphone Array Beamformers:

Fixed Beamformers: Delay-and-Sum, Filter-and-Sum

Adaptive Beamformers: MVDR (Capon), LCMV (Frost)/GSC

Noise Field? Fixed: Stationary, known before the design; isotropic noise generally assumed. Adaptive: Time varying, unknown.

Reverberation? Fixed: Not concerned. Adaptive: Significant.

Delay-and-Sum

• Simple

• Non-uniform directional responses over a wide spectrum of frequencies

Filter-and-Sum

• Complicated

• Uniform directional responses over a wide spectrum of frequencies: good for wideband signals, like speech

MVDR (Capon)

• Only the TDOAs of the speech source of interest need to be known – simple requirements.

• Reverberation causes the signal cancellation problem.

• Time-domain or frequency-domain

LCMV (Frost)/GSC

• The impulse responses (IRs) from the source to the microphones have to be known or estimated.

• Errors in the IRs lead to the signal cancellation problem.
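To illustrate the remark that a delay-and-sum beamformer has non-uniform directional responses across frequency, here is a minimal sketch for a far-field uniform linear array. The array size (4 microphones), the 20 mm spacing, the speed of sound, and all function names are assumptions chosen for the example, not parameters taken from the project.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, an assumed value for air at about 20 °C


def steering_vector(freq_hz, angle_rad, num_mics, spacing_m):
    """Far-field steering vector of a uniform linear array."""
    n = np.arange(num_mics)
    delays = n * spacing_m * np.cos(angle_rad) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freq_hz * delays)


def delay_and_sum_response(freq_hz, look_rad, arrival_rad,
                           num_mics=4, spacing_m=0.02):
    """Magnitude response of a delay-and-sum beamformer steered to look_rad
    for a plane wave arriving from arrival_rad."""
    h = steering_vector(freq_hz, look_rad, num_mics, spacing_m) / num_mics
    a = steering_vector(freq_hz, arrival_rad, num_mics, spacing_m)
    return abs(np.vdot(h, a))  # np.vdot conjugates h, giving |h^H a|


look = np.pi / 2          # broadside look direction
off_axis = np.pi / 6      # interferer 60 degrees away from the look direction
for f in (500.0, 4000.0):
    print(f, delay_and_sum_response(f, look, off_axis))
```

With these assumed values the off-axis response is close to unity at 500 Hz and much smaller at 4 kHz, which is the frequency dependence that filter-and-sum designs try to even out for wideband signals such as speech.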



Comments on Traditional Microphone Array Beamforming

• For incoherent noise sources, the gain in SNR is low if the number of microphones is small.

• For coherent noise sources whose directions are different from that of the speech source, a theoretically optimal gain in SNR can be high but is difficult to obtain due to a number of practical limitations:

Unavailability of precise a priori knowledge of the acoustic impulse responses from the speech sources to the microphones.

Inconsistent responses of the microphones across the array.

• For coherent noise sources that are in the same direction as the speech source, beamforming (as a spatial filter) is ineffective.
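The first comment can be made concrete with a small simulation. For time-aligned speech and noise that is independent across microphones, averaging N channels leaves the speech unchanged and cuts the noise power by a factor of N, so the SNR gain is about 10·log10(N) dB. The sketch below is an illustration under those idealized assumptions (white Gaussian noise, perfect alignment); the function name is hypothetical.

```python
import numpy as np


def averaging_snr_gain_db(num_mics, num_samples=200_000, seed=0):
    """SNR gain from averaging num_mics channels with independent noise.

    The aligned speech component passes through the average unchanged,
    so the SNR gain equals the reduction in noise power.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((num_mics, num_samples))  # independent per mic
    noise_in_power = np.mean(noise[0] ** 2)
    noise_out_power = np.mean(np.mean(noise, axis=0) ** 2)
    return 10.0 * np.log10(noise_in_power / noise_out_power)


for n in (2, 4, 8):
    print(n, round(averaging_snr_gain_db(n), 2), round(10 * np.log10(n), 2))
```

Doubling the number of microphones buys only about 3 dB, which is why a small array gains little against incoherent noise.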



Multichannel Noise Reduction (MCNR)

• Signal Model: N microphones observe the speech source of interest s(k) through the impulse responses g1, g2, ..., gN, together with noise v(k), giving the microphone signals x1(k), x2(k), ..., xN(k).

• A conceptual comparison of beamforming and MCNR:

Beamforming: Spatial filtering. The output is s(k): dereverberation and denoising. Requires knowledge related to the source position or gn. Array setup: calibration is necessary – possibly time/effort consuming.

MCNR: Statistical filtering. The output is x1,s(k), the speech component at microphone 1: only denoising. Array setup: no need to strictly demand a specific array geometry/pattern.
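The equation for the signal model did not survive extraction. A form consistent with the symbols on this slide (g_n, s(k), v(k), x_n(k), and x_{1,s}(k)) is sketched below; the per-microphone noise subscript v_n(k) and the convolution notation are assumptions, not a transcription of the original slide.

```latex
x_n(k) = g_n * s(k) + v_n(k) = x_{n,s}(k) + v_n(k), \qquad n = 1, 2, \ldots, N
```

Here x_{n,s}(k) = g_n * s(k) is the reverberant speech component at microphone n. Under this reading, beamforming aims at s(k) itself, while MCNR aims at x_{1,s}(k).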


Frequency-Domain MVDR Filter for MCNR

• The problem formulation:

• The MVDR filter:

• A more practical implementation:

where

• Similar to traditional single-channel noise reduction methods, the noise PSD matrix is estimated during silent periods and the signal PSD matrix is estimated during speech periods.
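The equations on this slide were lost in extraction. As a hedged illustration, the sketch below uses a commonly cited form of the MVDR filter for MCNR that needs only the two PSD matrices, h = (Φvv⁻¹Φxx − I)u / (tr[Φvv⁻¹Φxx] − N) with u = [1, 0, ..., 0]ᵀ selecting microphone 1 as the reference. Treat this form, the symbol names, and the function name as assumptions rather than the slide's exact expressions.

```python
import numpy as np


def mvdr_mcnr_filter(phi_xx, phi_vv):
    """MVDR filter for MCNR at one frequency bin, from PSD matrices only.

    phi_xx: N x N PSD matrix of the microphone signals (speech periods).
    phi_vv: N x N PSD matrix of the noise (silent periods).
    Returns h such that the output is z = h^H x.
    """
    num_mics = phi_xx.shape[0]
    ratio = np.linalg.solve(phi_vv, phi_xx)   # phi_vv^{-1} phi_xx
    u = np.zeros(num_mics)
    u[0] = 1.0                                # microphone 1 is the reference
    numerator = (ratio - np.eye(num_mics)) @ u
    denominator = np.trace(ratio) - num_mics
    return numerator / denominator


# Synthetic check with a known (but unused by the filter) channel vector g.
rng = np.random.default_rng(1)
N = 4
g = rng.standard_normal(N) + 1j * rng.standard_normal(N)
a = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
phi_vv = a @ a.conj().T + np.eye(N)           # Hermitian positive definite
phi_xx = 2.0 * np.outer(g, g.conj()) + phi_vv  # speech PSD of 2.0 plus noise
h = mvdr_mcnr_filter(phi_xx, phi_vv)

speech_gain = np.vdot(h, g)                    # h^H g, should equal g[0]
noise_out = np.real(np.vdot(h, phi_vv @ h))    # h^H phi_vv h
print(abs(speech_gain - g[0]), noise_out, np.real(phi_vv[0, 0]))
```

In this synthetic setup the filter passes the speech component at microphone 1 undistorted and never increases the noise power relative to that microphone, without using g in its computation.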



Comparison of the MVDR Filters for Beamforming and MCNR

• MVDR for Beamforming (BF):

• The acoustic impulse responses can at best be estimated up to a scale factor relative to the true response vector, which leads to speech distortion.

• MVDR for MCNR:

• Note: In the implementation of the MVDR-MCNR, the channel responses do not need to be known.


Distortionless Multichannel Wiener Filter for MCNR

• Use what we called the spatial prediction:

• Formulate the following optimization problem:

where

• The distortionless multichannel Wiener (DW) filter for MCNR:

• The optimal Wiener solution for the non-causal spatial prediction filters:

where So,

• It was found that



Single-Channel Noise Reduction (SCNR) for Post-Filtering

• Beamforming: The Wiener filter (the optimal solution in the MMSE sense) can be factorized into an MVDR beamformer followed by a Wiener filter for SCNR.

Note: For a complete and detailed development of this factorization, please refer to Eq. (3.19) of the following book. M. Brandstein and D. Ward, eds., Microphone Arrays: Signal Processing Techniques and Applications, Berlin, Germany: Springer, 2001.

• MCNR: Again, the Wiener filter can be factorized into an MVDR filter for MCNR followed by a Wiener filter for SCNR.

Note: For a complete and detailed development of this factorization, please refer to Eq. (6.117) of the following book. J. Benesty, J. Chen, and Y. Huang, Microphone Array Signal Processing, Berlin, Germany: Springer, 2008.


Single-Channel Noise Reduction (SCNR)

• The signal model:

• SCNR filter:

• Error signal:

• MSE cost function:

• The Wiener filter:

where

and

• Other SCNR methods: Parametric Wiener filter, Tradeoff filter.

• A well-known feature: Noise reduction is achieved at the cost of adding speech distortion.
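The slide's equations were lost in extraction. As an illustration of the classic single-channel Wiener gain, the sketch below assumes the additive frequency-domain model Y = X + V with uncorrelated speech X and noise V, for which the gain per bin is 1 − φV/φY. The function name, the gain floor, and the example PSD values are assumptions made for the example.

```python
import numpy as np


def wiener_gain(noisy_psd, noise_psd, gain_floor=0.0):
    """Per-bin Wiener gain H = 1 - phi_V / phi_Y, clipped to [gain_floor, 1]."""
    noisy_psd = np.asarray(noisy_psd, dtype=float)
    noise_psd = np.asarray(noise_psd, dtype=float)
    gain = 1.0 - noise_psd / np.maximum(noisy_psd, 1e-12)
    return np.clip(gain, gain_floor, 1.0)


# Three bins with speech PSDs 4, 1, 0.25 and a noise PSD of 1 in each bin.
speech_psd = np.array([4.0, 1.0, 0.25])
noise_psd = np.ones(3)
gains = wiener_gain(speech_psd + noise_psd, noise_psd)
print(gains)
```

Because the gain is below one wherever noise is present, the speech in those bins is attenuated along with the noise, which is the speech distortion trade-off noted above.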


New Idea for SCNR

• A second-order complex circular random variable (CCRV) has:

which implies that the variable and its conjugate are uncorrelated.

• In general, speech is not a second-order CCRV:

• But noise is a second-order CCRV if stationary, and not otherwise.

• Examine

(Annotations on the equation: the speech terms are correlated but not completely coherent; the noise terms are uncorrelated or coherent.)

• This is similar to the signal model of a two-element microphone array. So there is a chance to reduce noise without adding any speech distortion.


Widely Linear Wiener Filter

• New filter for SCNR:

• Error signal:

• Widely linear MSE:

• Then the widely linear Wiener filter or MVDR type of filters can be developed.
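A small numerical sketch can show why using the conjugate observation helps. It fits a linear filter (Z = H·Y) and a widely linear filter (Z = H·Y + H'·Y*) by least squares on synthetic data in which the speech coefficient is strongly noncircular and the noise is circular. The data model, the use of the clean signal to estimate statistics, and all names are assumptions made for illustration; they are not the estimator used in the project.

```python
import numpy as np


def linear_wiener(y, x):
    """Least-squares linear filter: z = conj(h) * y."""
    h = np.mean(y * np.conj(x)) / np.mean(np.abs(y) ** 2)
    return np.conj(h) * y


def widely_linear_wiener(y, x):
    """Least-squares widely linear filter on the augmented vector [y, y*]."""
    y_aug = np.stack([y, np.conj(y)])              # shape (2, T)
    num_samples = y.shape[0]
    r = y_aug @ y_aug.conj().T / num_samples       # 2 x 2 covariance
    p = y_aug @ np.conj(x) / num_samples           # cross-correlation with x
    h = np.linalg.solve(r, p)
    return h.conj() @ y_aug                        # z = h^H [y, y*]


rng = np.random.default_rng(0)
T = 20_000
# Noncircular "speech": real Laplacian amplitude with a fixed phase, E[x^2] != 0.
x = rng.laplace(size=T) * np.exp(1j * 0.3)
# Circular complex Gaussian noise with unit variance, E[v^2] = 0.
v = (rng.standard_normal(T) + 1j * rng.standard_normal(T)) / np.sqrt(2)
y = x + v

mse_linear = np.mean(np.abs(x - linear_wiener(y, x)) ** 2)
mse_widely_linear = np.mean(np.abs(x - widely_linear_wiener(y, x)) ** 2)
print(mse_linear, mse_widely_linear)
```

The widely linear fit treats Y and Y* like two microphone channels whose speech components are coherent while the noise components are not, so it reaches a lower error than the linear fit on this data.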



Section 4
















Computational Platform/Technology Selection

Three platforms under consideration:

• ASIC

• DSP

• FPGA

Trade-off among performance, power consumption, size, and costs

Four competing factors:

• The count of transistors employed

• The number of clock cycles required

• The time taken to develop an application

• Nonrecurring engineering (NRE) costs

ASIC

• Low numbers of transistors and clock cycles

• Long development time and high NRE costs

• Effective in performance, power, and size, but not in cost

DSP

• Low development and NRE costs

• Low power consumption

• More efforts to convert the design to ASICs

FPGA

• Not suited to processing sequential conditional data flow, but efficient in concurrent applications

• Support faster I/O than DSPs

• One step closer to ASIC than DSP

• High development cost due to performance optimization


System Block Diagram

[Block diagram: eight microphone capsules (channels 1 to 8), each with a microphone powering circuit and XLR male/female connectors (pin 1 GND, pin 2 HOT, pin 3 COLD), connect through DB25 male/female connectors to the analog input of the FPGA board. FPGA board blocks: microphone preamps (gain G, set by jumpers for gain control), 8-ch 24-bit 48 kHz ADC, Altera FPGA, SDRAM, Flash, JTAG (male), digital output interface (USB 2.0), power management IC, power jack.]


FPGA Board Block Diagram

[Block diagram blocks: OPA1632 (1), OPA1632 (2), ..., OPA1632 (8); ADS1278; EPCS16; Altera Cyclone III EP3C55F484C8 FPGA; two 16 MB SDRAM (×32); 16 MB Flash (×16); 50 MHz XTAL; 24.576 MHz XTAL; USB 2.0 (High Speed); User LED/IOs; 3.3 V.]
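A quick arithmetic check on the figures in the two block diagrams: the raw data rate of an 8-channel, 24-bit, 48 kHz converter, and the ratio between the 24.576 MHz crystal and the 48 kHz sample rate. That the 24.576 MHz crystal clocks the converter is an assumption here, and the 480 Mbit/s figure is the nominal USB 2.0 High Speed signaling rate, not a measured throughput.

```python
NUM_CHANNELS = 8
BITS_PER_SAMPLE = 24
SAMPLE_RATE_HZ = 48_000
AUDIO_CRYSTAL_HZ = 24_576_000          # assumed to be the converter clock
USB2_HIGH_SPEED_BPS = 480_000_000      # nominal signaling rate

# Raw payload produced by the converter, before any framing overhead.
raw_bits_per_second = NUM_CHANNELS * BITS_PER_SAMPLE * SAMPLE_RATE_HZ

# Integer ratio between the crystal frequency and the sample rate.
clock_ratio = AUDIO_CRYSTAL_HZ // SAMPLE_RATE_HZ

print(f"Raw ADC data rate: {raw_bits_per_second / 1e6:.3f} Mbit/s")
print(f"Crystal / sample-rate ratio: {clock_ratio}")
print(f"Share of nominal USB 2.0 rate: {raw_bits_per_second / USB2_HIGH_SPEED_BPS:.1%}")
```

Even if all eight raw channels were streamed to the host, the payload would use only a small share of the nominal USB 2.0 High Speed rate.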


Prototype FPGA Board: the Top View

[Photo labels: phantom power feeding, mic. preamp gain jumpers, OPA1632, REF1004, ADS1278, user LEDs, EPCS16, user I/Os, JTAG, FT2232H, USB 2.0 jack, 12 MHz crystal, GND, TPS65053, Flash, DC power jack, power LED, SDRAMs, Cyclone III FPGA, analog power DC 9V, analog power DC 5V, DB25.]

Board size: 174.8 mm × 101 mm


Prototype FPGA Board: the Bottom View

OPA1632

50 MHz Clock Oscillator (OSC2)

24.576 MHz Clock Oscillator (OSC1)


FPGA System Development Flow Adopted in the Project

System on Programmable Chip (SoPC) + C/C++ Programming:

1) Use SoPC Builder to construct a soft-core NIOS II processor embedded on the Altera FPGA

2) Develop software/DSP systems in C/C++ on the NIOS II processor

• Advantages:

Short development cycle/time

Low cost

High reliability

Reusability of intellectual property

• Drawbacks:

Poor efficiency and low performance:

Efficiency can be improved by identifying time-consuming functions (e.g., FFT and IFFT) and accelerating them with the C2H (C-to-Hardware) tool.

[Block diagram blocks: CPU (NIOS II), ROM, RAM, I/O, UART, DSP.]


MEMS Microphone Array

[Figure: XG-MPC-MEMS MEMS microphone array with 7 subarrays (numbered 1 to 7), each holding four microphones labeled a, b, c, d. Microphones: Analog Devices ADMP402 MEMS microphones, 2.5 mm × 3.35 mm. Dimension labels: 5 mm and 20 mm. Connector: Samsung 18-pin connector (pin 1 to pin 18).]


MEMS Microphone Array Box

Pin 1

Pin 18

Samsung 18-pin Connector

Wevoice MEMS Microphone Array

76

54

32

135 mm

12.5 mm

155 mm


Section 5
















FPGA Program Flowchart

[Flowchart: for each time frame, the FPGA takes data from the ADC and runs data in & preprocessing, 4-ch FFT, MCNR+SCNR, 1-ch IFFT, overlap add, and USB transmission, then passes the result to USB. The work is divided between the Nios II soft core and an FFT/IFFT processor. The time axis is marked t, t+4, and t+8 ms.]

Processing delay < 8 ms
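The frame loop in the flowchart (FFT, per-bin processing, IFFT, overlap add) can be sketched as below for a single channel. The 8 ms frame (384 samples at 48 kHz), the 4 ms hop, the periodic Hann window, and the function names are assumptions inferred from the time axis, and the per-bin gain is a stand-in for the MCNR+SCNR stage rather than the project's algorithm.

```python
import numpy as np


def overlap_add_process(x, frame_len=384, hop=192, gain_fn=None):
    """Frame-wise FFT -> per-bin gain -> IFFT -> overlap add.

    A periodic Hann analysis window with 50% overlap sums to one, so with
    no gain applied the interior of the signal is reconstructed exactly.
    """
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame_len) / frame_len)
    out = np.zeros(len(x))
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len] * window
        spectrum = np.fft.rfft(frame)
        if gain_fn is not None:
            spectrum = gain_fn(spectrum)       # placeholder for noise reduction
        out[start:start + frame_len] += np.fft.irfft(spectrum, n=frame_len)
    return out


rng = np.random.default_rng(0)
signal = rng.standard_normal(48_000)           # one second at 48 kHz
passthrough = overlap_add_process(signal)
halved = overlap_add_process(signal, gain_fn=lambda s: 0.5 * s)
print(np.max(np.abs(passthrough[384:-384] - signal[384:-384])))
```

With these assumed sizes a sample waits at most one frame before it can be emitted, which is in line with the stated processing delay of under 8 ms.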


IA System Windows Host Software

• Programmed with Microsoft Visual C++

• DirectSound is used to play back audio (speech).

[Screenshot: splash window of the program.]


IA System Windows Host GUI: Multitrack View


IA System Windows Host GUI: Single-Track View


IA System Windows Host GUI: Playing Back


Section 6
















The Portable, Real-Time Demo System

[Photo labels: suited subject, MEMS microphone array, audio cable, DB25 connectors, FPGA board, power supply (linear DC 12-20 V/1 A), USB 2.0 cable, PC.]


Section 7
















What Is Immersive Communication, and Why Do We Want It?

Telecommunication helps people collaborate and share information by cutting across the following 3 separations/constraints:

Long distance

Real time

Physical boundaries

Modern telecommunication technologies are successful so far in transcending the first two constraints: i.e., the long-distance and real-time constraints.

Immersive communication offers a feeling of being together and sharing a common environment during collaboration.

Immersive communication aims to break the physical boundaries, which is the “last mile” problem in communication.



What Needs to Be Solved for Immersive Communication Systems?

[Figure panels: Single-Channel Acoustic Echo Cancellation; Multichannel Acoustic Echo Cancellation; Synthesized Stereo Audio Mixing System; Beamforming; Blind Source Separation; Acoustic Source Localization and Tracking; Stereophony System for Spatial Sound Reproduction; Wave Field Synthesis.]


Why Immersive Voice Communication in Spacesuits?

Immersive voice communication exploits human binaural hearing.

Provides enhanced situational awareness for a suited crewmember:

Can improve the productivity of collaboration among the crewmembers

Can produce potential safety benefits

Crew comfort can be optimized.



What Problems Need to be Solved?

• Stereo/Multichannel Acoustic Echo Cancellation (MCAEC)

• Integration of MCAEC and MCNR

• Three Dimensional (3D) Audio



Conclusions

• While it has been more than 40 years since Neil Armstrong landed on the Moon, the astronauts are still using the communication carrier assembly (CCA) based audio system for voice communication in spacesuits.

• The new spacesuit design is going to take advantage of the most recent advances in multichannel acoustic and speech signal processing for echo and noise control, while significantly improving crew comfort and ease of use.

Noise reduction with microphone arrays

Multichannel echo cancellation

Integrated echo and noise control

3D audio

• We explained the difference between the traditional beamforming method and what we called the multichannel noise reduction approach.

• We presented an intuitive interpretation of the widely linear Wiener filter for single-channel noise reduction.

• We described a new application of immersive communication in space exploration, ancillary to its mainstream use in commercial telecommunication systems.
