Acoustic Echo Cancellation for Low Cost Applications

Alango approach

Interactive white paper by Alexander A. Goldin

2

Presentation roadmap

Acoustic echo cancellation in mobile voice communication

Place of acoustic echo cancellation in voice communication

Textbook acoustic echo cancellation

The real acoustic echo cancellation problem

Requirements for a practical Acoustic Echo Canceller

Convergence of Alango adaptive filter in double talk

Advantages of subband adaptive filtering: DSP clock

Advantages of subband adaptive filtering: Convergence

Logic of Alango residual echo suppressor

Advantages of subband residual echo suppression

Alango Acoustic Echo Canceller (all parts together)

Alango Acoustic Echo Canceller

3

Acoustic echo cancellation in mobile voice communication

Acoustic echo arises due to coupling between the speaker and the microphone of a communication device. Part of the far side signal reproduced by the device speaker is picked up by the microphone and transmitted back to the far side over a communication link, which may be wired or wireless, analog or digital. The far talker perceives this as echo, which in some cases is simply annoying while in others it completely prevents efficient voice communication.

Acoustic Echo Cancellation (AEC) should remove all noticeable echo of the far speech from the microphone signal while preserving the near speech quality. If it does a great job, the communication is called “true full-duplex”: both the near and the far side may talk and hear simultaneously. In some circumstances this can really be achieved, in others we can get close to it, while in some extreme cases efficient echo cancellation is not possible and we effectively have to resort to half-duplex communication. The ability to provide duplex communication is defined by the AEC technology at hand, the type of application, the acoustic components used and the acoustic design. Examples are provided by the two pictures below.

For a mobile phone working in the speakerphone mode, the speaker volume and the corresponding acoustic echo are much larger while the near talker voice is relatively weak. The speaker is often overdriven and has large distortions, and possible mechanical resonances inside the phone cavity complicate the case. The task of acoustic echo cancellation becomes extremely challenging and true full-duplex communication is very difficult to achieve.

For a mobile phone working in the handset mode, the speaker volume is relatively low so that the initial echo is rather small. With proper acoustic components and device acoustic design, the speaker distortions and the mechanical coupling between the speaker and the microphone are also small. As a result, the task of acoustic echo cancellation is relatively simple and true full-duplex communication is rather easy to achieve.

[Figure: handset mode. Large speech/echo ratio, small speaker distortions.]

[Figure: speakerphone mode. Small speech/echo ratio, large speaker distortions.]

4

Place of acoustic echo cancellation in voice communication

[Figure: place of the Acoustic Echo Canceller in the signal chain. Labels: To/from far side, Communication link, Reference signal, DAC, PA, Acoustic Echo, Near side talk, MA, ADC, Primary signal, Acoustic Echo Canceller.]

The Acoustic Echo Canceller is responsible for removing acoustic echo from voice communication.

To be able to differentiate between echo and near side talk, the Acoustic Echo Canceller is provided with the reference (speaker) signal. Assuming that the far and near talk are not correlated (as signals), the Acoustic Echo Canceller compares the speaker (reference) and the microphone (primary) signals, trying to remove all parts of the microphone signal that are “correlated” with the reference signal. The term “correlation” is used here in a wide, human sense rather than according to its strict mathematical definition.

To improve “correlation” in both the human and the mathematical sense, the Acoustic Echo Canceller must be the first signal processing block that receives the microphone signal. No non-linear processing such as automatic gain control, and no signal distortion such as clipping, is allowed on the microphone signal before it. Accordingly, the reference signal must be the speaker signal just before digital-to-analog conversion. No other processing should be performed on the analog signal, and the power amplifier must not clip. The speaker volume control must be implemented digitally, before the speaker signal is taken as the reference for the Acoustic Echo Canceller.

5

Textbook acoustic echo cancellation

Algorithm goal: reduce the error (echo) signal e(n) as much and as fast as possible

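[Figure: textbook echo canceller. Labels: Voice from far side, r(n) reference signal, DAC, PA, Linear Acoustic Echo, MA, ADC, Primary signal p(n), Adaptive filter (output subtracted from the primary signal), Error e(n), Signal to far side.]

Linear acoustic echo: $h(t) = \int_0^{\infty} r(\tau)\,F(t-\tau)\,d\tau$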

A variety of adaptive filtering algorithms is available:
• Least Mean Squares (LMS)
• Normalized Least Mean Squares (NLMS)
• Variable step NLMS
• Recursive Least Squares (RLS)
• Frequency domain LMS
• Affine Projection Algorithms (APA)
• Others …

Factors to consider: convergence speed, misadjustment (error), complexity, memory

Do we need anything else?

Textbook algorithms address a simplified problem where:
- The primary (microphone) signal contains acoustic echo only.
- Echo is considered to be a convolution of the reference (speaker) signal r(t) with the echo path F(t).

The standard textbook approach simulates the echo path by an FIR filter with variable, adaptive coefficients. The filtered reference signal, which is an estimate of the echo signal, is subtracted from the primary signal.
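As an illustration of the textbook approach, the following is a minimal NLMS sketch in plain Python. It is not Alango's algorithm. The echo path, the filter length, the step size and the white-noise reference are assumptions chosen for the example, and the primary signal contains echo only, exactly as in the simplified textbook problem.

```python
import random


def nlms_echo_canceller(reference, primary, num_taps=8, step=0.5, eps=1e-8):
    """Textbook NLMS: subtract an adaptive FIR estimate of the echo.

    Returns (error_signal, final_weights). All parameter values are
    illustrative assumptions, not values taken from the white paper.
    """
    weights = [0.0] * num_taps   # adaptive FIR coefficients (echo path estimate)
    history = [0.0] * num_taps   # most recent reference samples, newest first
    errors = []
    for r, p in zip(reference, primary):
        history = [r] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        e = p - echo_estimate                    # error e(n), sent to the far side
        power = sum(x * x for x in history) + eps
        gain = step * e / power                  # normalized step
        weights = [w + gain * x for w, x in zip(weights, history)]
        errors.append(e)
    return errors, weights


def mean_power(samples):
    return sum(v * v for v in samples) / len(samples)


# Simulated simplified problem: primary = reference convolved with an echo path.
random.seed(0)
echo_path = [0.5, -0.3, 0.2, 0.1]                # hypothetical echo path F
reference = [random.gauss(0.0, 1.0) for _ in range(4000)]
primary = [
    sum(f * reference[n - k] for k, f in enumerate(echo_path) if n - k >= 0)
    for n in range(len(reference))
]

errors, weights = nlms_echo_canceller(reference, primary)
echo_power = mean_power(primary[-500:])
residual_power = mean_power(errors[-500:])
```

Under these idealized conditions the filter coefficients approach the simulated echo path and the residual power becomes negligible compared with the echo power. The following slides explain why real signals do not behave this way.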

6

The real acoustic echo cancellation problem

The textbook adaptive filtering approach addresses an unrealistic problem:
- Due to system non-linearities, only part of the echo signal is described by the convolution. The other part is non-linear. It is still perceived as echo (correlated with the reference signal in the wide, human sense).
- Besides the echo, the primary signal contains additional components such as near speech and noise.

5-10% loudspeaker distortion is “normal” for mobile devices. Simulating non-linear echo by linear filtering is not possible.

A strong near talk signal in a double talk situation (when both sides are active) may lead to adaptive filter divergence.

Near side talk, noise and non-linear echo represent a “wrong” error signal for the adaptive filter, causing wrong adaptation.

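[Figure: the real echo cancellation problem. Labels: Voice from far side, r(n) reference signal, DAC, PA, Non-linear distortions (speaker & mechanics), Linear Acoustic Echo, Non-Linear Acoustic Echo, Near side talk, Noise, MA, ADC, Primary signal p(n), Adaptive filter, Error e(n), Signal to far side.]

Linear acoustic echo: $h(t) = \int_0^{\infty} r(\tau)\,F(t-\tau,\,t)\,d\tau$

Non-linear acoustic echo: $d(t) = D[r(t)]$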

Strong near side noise (stationary and non-stationary) complicates the use of a double-talk detector to disable adaptation in double talk: the detector is always ON whenever the far side is active.

7

Requirements for a practical Acoustic Echo Canceller

[Figure: practical echo canceller. Labels: Far talk from far side, r(n) reference signal, DAC, PA, Non-linear distortions (speaker & mechanics), Linear Acoustic Echo, Non-Linear Acoustic Echo, Near side talk, Noise, MA, ADC, Primary signal p(n), Adaptive filter, Error e(n), Residual echo and noise suppressor, Near talk to far side.]

Linear acoustic echo: $h(t) = \int_0^{\infty} r(\tau)\,F(t-\tau,\,t)\,d\tau$

Non-linear acoustic echo: $d(t) = D[r(t)]$

To filter all echo and noise components while preserving the near talk, a practical Acoustic Echo Canceller requires:

- An adaptive filtering algorithm that:
  - Does not diverge from (or, better, converges to) the true echo simulation in double talk.
  - Converges to the true echo in high near side noise.
  - Does not diverge from (or, better, converges to) the true echo in silence.
- A Residual Echo Suppressor to block non-linear and residual linear echoes.
- A Noise Suppressor for attenuating ambient near side noise.

Besides performance requirements, a practical Acoustic Echo Canceller should:
- Use as few computational resources as possible (DSP clock < 20 MIPS with a dual MAC DSP, RAM < 8 KW).
- Not introduce a large processing delay (< 50 ms).

8

Alango Acoustic Echo Canceller

[Figure: block diagram of the Alango Acoustic Echo Canceller. Stages: Split to bands, Adaptive filtering, Residual echo and noise suppression, Combine from bands. Labels: Divider, BPF 1 … BPF N, M, Adaptive Filter 1 … Adaptive Filter N (Filter, Control logic), Residual Echo & Noise Suppressor 1 … N, Comfort noise, Combiner.]

Alango Acoustic Echo Canceling technology answers all the requirements for a practical echo canceller. It implements a proprietary adaptive filtering algorithm that converges in double talk. It includes a residual echo suppressor that automatically tracks the performance of the echo canceller and suppresses the echo if it is not masked by the near side voice. It also seamlessly integrates a high performance noise suppressor.

For the highest performance it utilizes subband processing, where the input (primary and reference) signals are first divided into multiple frequency subbands. Each band is processed independently and the outputs are combined into a full band output signal. Subband processing provides multiple advantages over one (full) band processing. These advantages are explained and demonstrated in the following slides.

Alango acoustic echo canceling technology is fully scalable: the number of subbands may be chosen to provide the best tradeoff between performance, MIPS, memory and processing delay requirements. Standard options include 8, 16 and 32 subbands.

9

Convergence of Alango adaptive filter in double talk

[Audio buttons: Textbook (NLMS) adaptive filter; Far talk. Diagram labels: Near side, Far side.]

Acoustic Echo Cancellation in continuous double talk poses a significant challenge. Adaptive filtering algorithms try to remove all components of the primary signal that are correlated with the reference signal. Far and near talk signals are not (mathematically) correlated in the long run. However, real life speech signals will always be correlated to some extent when the correlation is computed over a short time interval. In general, the shorter the time interval, the larger the random correlations between the two speech signals are. For fast initial convergence and good tracking of possible changes in the echo path, the filter must have a fast adaptation rate. However, during double talk a fast adaptation rate leads to wrong adaptation, as the filter tries to adapt to random, short-time correlations between the near and far talk signals. If the problem is not addressed specifically, a good tradeoff between convergence speed and double talk performance is not possible to achieve.
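The effect of the time interval on random correlation can be checked numerically. The sketch below uses two independent white Gaussian sequences as stand-ins for far and near talk (an assumption; real speech is neither white nor stationary) and compares the average magnitude of the normalized correlation over short and long windows. The window lengths and sequence length are arbitrary illustrative choices.

```python
import math
import random


def normalized_correlation(x, y):
    """Normalized cross-correlation at lag zero of two equal-length blocks."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0.0 else 0.0


def mean_abs_correlation(x, y, window):
    """Average |correlation| over consecutive non-overlapping windows."""
    values = [
        abs(normalized_correlation(x[i:i + window], y[i:i + window]))
        for i in range(0, len(x) - window + 1, window)
    ]
    return sum(values) / len(values)


random.seed(1)
far_talk = [random.gauss(0.0, 1.0) for _ in range(16384)]   # stand-in for far talk
near_talk = [random.gauss(0.0, 1.0) for _ in range(16384)]  # independent near talk

short_corr = mean_abs_correlation(far_talk, near_talk, window=32)
long_corr = mean_abs_correlation(far_talk, near_talk, window=4096)
```

Although the two sequences are independent, the short-window correlation magnitude comes out clearly larger than the long-window one. A fast-adapting filter effectively works with such short windows and therefore reacts to these random correlations.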

If a double talk detector is used in an Echo Canceller, it is supposed to resolve the problem of wrong adaptation by disabling or slowing the filter adaptation when double talk is detected. Creating a reliable double talk detector is a challenge by itself. However, even a perfect detector does not provide the ultimate solution, as changes in the echo path cannot be followed in double talk. Besides, with a high level of near side noise (especially non-stationary), there will be a continuous “double talk” situation so that no filter adaptation will occur.

The Alango proprietary adaptive filtering algorithm implements a control logic that provides robustness and convergence in double talk without explicitly slowing down the filter adaptation rate. The audio example below demonstrates the algorithm performance in a continuous double talk situation. Press the corresponding button to hear the reference (far side) signal, the primary (microphone) signal before processing, and the processing result of the Alango full-band adaptive filter as well as of a textbook NLMS algorithm for comparison.

Advantages of subband adaptive filtering and suppression of the residual echo left after adaptive filtering are explained on the next slides.

[Figure: full band adaptive filter (Filter, Control logic) whose output is subtracted from the microphone signal. Microphone signal components: Acoustic echo, Near speech, Noise. Audio buttons: No processing (mic. signal); Alango adaptive filter.]

10

Advantages of subband adaptive filtering: DSP clock

The subband adaptive filtering scheme provides a significant saving of DSP clock compared to the full band implementation where the adaptive filter covers the same time span T. As an example, consider a real life voice communication scenario where the adaptive filter length corresponds to T = 100 ms. For the standard sampling frequency $F_S$ = 8 kHz, the corresponding full band filter length $L_F = F_S \times T$ will be 800 taps.

The complexity of LMS type algorithms is proportional to the filter length. Using $R_{LMS}$ = 5 as a realistic LMS factor (instructions per filter coefficient per input sample), we come to the following estimation:

Full band complexity: $F_S \times L_F \times R_{LMS} = 8000 \times 800 \times 5$ = 32 MIPS

[Figure: full band adaptive filter (sample rate $F_S$ = 8000, filter time span T = 0.1 s, $L_F = F_S \times T$) compared with subband adaptive filters placed between BPF 1 … BPF N and resampling by M (subband sample rate 8000/M, $L_S = F_S \times T / M$).]

Let us consider the case when the input signals are divided into N complex frequency subbands and adaptive filtering is performed independently in each subband (Alango technology uses complex subband filters). For illustration, the subband decomposition stage is represented as Band Pass Filtering (BPF) followed by downsampling by factor M. To cover the same time span on the downsampled signals, the subband filters have to be M times shorter. There are N such filters, but they operate on complex signals with the sampling rate reduced by M.

In Alango technology M = N. For M = 32 the sampling rate in each subband is reduced to 8000/32 = 250 Hz and the filters are only $L_S$ = 800/32 = 25 taps long.

Complex operations are more costly (4 real multiplications to implement one complex multiplication). Thus we use a complex LMS factor $C_{LMS} = R_{LMS} \times 4 = 20$.

Putting it all together, we have:

Subband complexity: $N \times (F_S/M \times L_S \times C_{LMS}) = 32 \times (250 \times 25 \times 20)$ = 4 MIPS, an 8 times reduction.
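The complexity estimates of this slide can be reproduced with a few lines of Python. The function and parameter names below are hypothetical; the numbers (8 kHz sample rate, 100 ms time span, LMS factor 5, factor 4 for complex arithmetic, N = M = 32) are the ones used in the text. The sketch counts adaptive filter operations only and ignores the cost of the band splitting and combining stages, as the slide does.

```python
def full_band_mips(sample_rate_hz, time_span_s, real_lms_factor):
    """Instructions per second (in millions) for one full band LMS-type filter."""
    taps = round(sample_rate_hz * time_span_s)           # L_F = F_S x T
    return sample_rate_hz * taps * real_lms_factor / 1e6


def subband_mips(sample_rate_hz, time_span_s, real_lms_factor,
                 num_bands, downsampling, real_mults_per_complex=4):
    """Instructions per second (in millions) for N complex subband filters."""
    band_rate_hz = sample_rate_hz / downsampling          # F_S / M
    taps = round(band_rate_hz * time_span_s)              # L_S = F_S x T / M
    complex_factor = real_lms_factor * real_mults_per_complex  # C_LMS
    return num_bands * band_rate_hz * taps * complex_factor / 1e6


full = full_band_mips(8000, 0.1, 5)
sub = subband_mips(8000, 0.1, 5, num_bands=32, downsampling=32)
reduction = full / sub
```

With N = M the subband cost scales as 4/M of the full band cost, which gives the factor of 8 for M = 32. Other subband counts can be evaluated the same way.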

11

Advantages of subband adaptive filtering: Convergence

To compare the performance of Alango full band and subband adaptive filtering technologies on the same double talk signals, use the action buttons.

[Fig.1: Typical speech spectrum (full band spectral range).]

[Fig.2: Typical speech spectrum divided into subbands (subband spectral range).]

[Audio buttons: Microphone; Full band adaptive filter (Alango full band adaptive filter); Subband adaptive filter (Alango subband adaptive filters).]

Algorithms of LMS type are the most widely used due to their low complexity, low memory requirements and efficiency of DSP implementation. However, the convergence speed of LMS type adaptive filtering algorithms is inversely proportional to the spectral diversity of their input signals (the ratio of the strongest to the weakest spectral components). A typical speech signal spectrum is shown in Fig.1. It has a relatively large spectral diversity, with most of the energy concentrated in the low frequency region (100-1000 Hz). As such, full band LMS algorithms perform worse on speech signals than on white noise. Subband processing divides the whole spectrum into narrow frequency subbands so that the spectral diversity in each band is much smaller (see Fig.2). As a result, subband adaptive filters converge faster and better than an equivalent full band filter.
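The reduction of spectral diversity inside a narrow band can be illustrated with a simple model spectrum. The sketch below uses the analytic power spectrum of a first-order autoregressive process as a rough stand-in for a low-frequency-dominated spectrum (an assumption; it is not a speech model) and compares the ratio of strongest to weakest spectral components over the full 0-4000 Hz band with the same ratio inside each of 32 equal-width bands. The pole value and the band layout are illustrative choices and do not describe the Alango filter bank.

```python
import math


def ar1_psd(freq_hz, sample_rate_hz, pole):
    """Power spectrum of x(n) = pole * x(n-1) + white noise (unit variance)."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    return 1.0 / (1.0 - 2.0 * pole * math.cos(omega) + pole * pole)


def spectral_range_db(f_lo, f_hi, sample_rate_hz=8000.0, pole=0.9, points=200):
    """Ratio (dB) of the strongest to the weakest component in [f_lo, f_hi]."""
    values = [
        ar1_psd(f_lo + (f_hi - f_lo) * i / (points - 1), sample_rate_hz, pole)
        for i in range(points)
    ]
    return 10.0 * math.log10(max(values) / min(values))


full_band_db = spectral_range_db(0.0, 4000.0)

num_bands = 32
band_width_hz = 4000.0 / num_bands
subband_db = [
    spectral_range_db(k * band_width_hz, (k + 1) * band_width_hz)
    for k in range(num_bands)
]
worst_subband_db = max(subband_db)
```

For this model the spread inside even the worst band is only a few dB, while the full band spread exceeds 20 dB. This is the property that lets each subband filter see an input much closer to white noise.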

As can be heard from the examples, in real conditions not all acoustic echo can be removed by adaptive filtering alone. The residual echo must be suppressed by other means. The logic and implementation of the Alango residual echo suppressor are discussed on the next slides.

12

Logic of Alango residual echo suppressor

[Figure: structure of one subband block of the Alango Residual Echo Suppressor. Signals: r(n), speaker signal; p(n), microphone signal before the adaptive filter; x(n), signal after the adaptive filter; z(n), output. Control block quantities: R(n), speaker signal amplitude; P(n), mic signal amplitude; V(n), Echo Return Loss (ERL); Q(n), Echo Return Loss Enhancement (ERLE); Comfort noise. Further plot labels: Speaker signal, Estimated echo, Residual echo, Cleaned echo, Near speech, H(n), C(n), E(n), S(n), V(n)P(n).]

In real life situations acoustic echo cannot be sufficiently eliminated by an adaptive filter alone. The adaptive filter must be followed by a Residual Echo Suppressor that attenuates the residual echo to a level where it is unnoticeable to the person on the far side. The Residual Echo Suppressor is inherently a nonlinear processor, as most of the remaining echo arises due to system (mainly speaker) nonlinearities and is not linearly related to the reference signal. As such, the Residual Echo Suppressor logic cannot be described mathematically, making it the most challenging block of a practical Acoustic Echo Canceller.

For the place of the Residual Echo Suppressor in the Acoustic Echo Canceller, see Slides 8 and 14. The structure of the Alango Residual Echo Suppressor subband block is shown in the figure and its logic is explained below. Remember that all signals are subband signals.

Alango residual echo suppression works in the same frequency subbands as the adaptive filters and is based on the following general principle: “attenuate a frequency band if the residual echo in it is not masked by near talk”.

Estimate:
• Speaker (reference) signal amplitude: R(n)
• Microphone (primary) signal amplitude: P(n)
• Echo Return Loss (ERL): V(n)
• Echo Return Loss Enhancement (ERLE): Q(n)
• Initial echo amplitude: E_I(n) = R(n) × V(n)
• Residual echo amplitude: E_R(n) = E(n) × Q(n)
• Near talk amplitude: T(n) = P(n) - E_I(n)
• Near talk signal to residual echo ratio: T(n)/E_R(n)
• Noise amplitude: N(n)

Make decision:
If the “signal to echo” ratio T(n)/E_R(n) is larger than a threshold
Then pass the signal
Else substitute comfort noise of amplitude N(n)
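The per-band estimation and decision logic of this slide can be sketched as a small Python function. This is only an illustration of the listed steps, not Alango's implementation. Several points are assumptions: E(n) in the residual echo formula is taken to be the initial echo amplitude E_I(n), V(n) and Q(n) are treated as linear amplitude factors, the near talk estimate is clamped at zero, and the threshold value is arbitrary. How R, P, V, Q and N are estimated is not covered here.

```python
def suppress_band(R, P, V, Q, N, threshold=2.0, eps=1e-12):
    """Decide for one subband and one frame whether to pass the signal.

    R, P : speaker and microphone signal amplitudes
    V, Q : ERL and ERLE, treated here as linear amplitude factors (assumption)
    N    : estimated ambient noise amplitude
    Returns (decision, comfort_noise_amplitude_or_None, ratio).
    """
    initial_echo = R * V                        # E_I(n) = R(n) x V(n)
    residual_echo = initial_echo * Q            # E_R(n), assuming E(n) = E_I(n)
    near_talk = max(P - initial_echo, 0.0)      # T(n) = P(n) - E_I(n), clamped
    ratio = near_talk / (residual_echo + eps)   # T(n) / E_R(n)
    if ratio > threshold:
        return "pass", None, ratio              # residual echo masked by near talk
    return "comfort_noise", N, ratio            # band closed, comfort noise instead


# Far side active only: the microphone amplitude is fully explained by the echo.
far_only = suppress_band(R=1.0, P=0.5, V=0.5, Q=0.1, N=0.01)

# Double talk: near talk exceeds the residual echo by a wide margin.
double_talk = suppress_band(R=1.0, P=1.0, V=0.5, Q=0.1, N=0.01)
```

In the first case the band is closed and comfort noise of amplitude N is substituted; in the second the band is passed. Running the function independently per subband gives the band-by-band behavior described on the next slide.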

13

Advantages of subband residual echo suppression

[Figure: three spectra, each plotted as microphone signal amplitude. Upper: spectrum after adaptive filter (original microphone signal, residual echo, near speech). Middle: spectrum after full band residual echo suppressor (comfort noise only). Bottom: spectrum after subband residual echo suppressor (original microphone signal, residual echo, near speech, comfort noise). Audio buttons: After adaptive filter; Full band echo suppressor with comfort noise; Subband echo suppressor without comfort noise; Subband echo suppressor with comfort noise.]

In the full band implementation the microphone may be either closed or open depending on In the full band implementation the microphone may be either closed or open depending on either the full residual echo is masked by a near side speech or not. If the spectrum of the either the full residual echo is masked by a near side speech or not. If the spectrum of the residual echo and the near side speech do not overlap, no masking occurs and the residual echo and the near side speech do not overlap, no masking occurs and the microphone channel will be closed each time there is an activity on the speaker channel. microphone channel will be closed each time there is an activity on the speaker channel. This essentially leads to half-duplex performance. The situation after the echo canceller and This essentially leads to half-duplex performance. The situation after the echo canceller and the corresponding full band decision is depicted on the upper and middle figures on the left.the corresponding full band decision is depicted on the upper and middle figures on the left.In the subband implementation (see the bottom figure), the decision are taken independently In the subband implementation (see the bottom figure), the decision are taken independently for each band. This allows passing part of the near side speech even when there is a strong for each band. This allows passing part of the near side speech even when there is a strong residual echo in some frequency region. In general, this is the region where the energy of residual echo in some frequency region. 
In general, this is the region where the energy of the near speech is small so that associated distortions are not actually noticeable.the near speech is small so that associated distortions are not actually noticeable.Whenever the whole microphone channel or some of its frequency bands are closed, Whenever the whole microphone channel or some of its frequency bands are closed, comfort noise should be substituted instead. Without it even the best residual echo comfort noise should be substituted instead. Without it even the best residual echo suppressor will sound “choppy”. The be unnoticeable, spectral properties of comfort noise suppressor will sound “choppy”. The be unnoticeable, spectral properties of comfort noise mast match those of the real ambient one. Subband implementation makes it easier.mast match those of the real ambient one. Subband implementation makes it easier.

Press corresponding buttons to compare full band and subband implantations of residual Press corresponding buttons to compare full band and subband implantations of residual echo suppressor working on the same adaptive filter output echo suppressor working on the same adaptive filter output

Performing Residual Echo Suppression in narrow frequency bands provides significant advantages over the full band implementation.

[Figure: signal from the adaptive filter processed by a full band residual echo suppressor and by a subband residual echo suppressor (Band 1 to Band N), each with control logic and comfort noise.]


Alango Acoustic Echo Canceller (all parts together)

[Figure: block diagram of the Alango Acoustic Echo Canceller between the far side and the near side speaker and microphone. Stages: split to bands (dividers with band-pass filters BPF 1 to BPF N and blocks labelled M), filter echo (Adaptive Filter 1 to N), suppress residual echo (Residual Echo & Noise Suppressor 1 to N with comfort noise, driven by control logic), combine from bands (combiner). Listening buttons: no processing (mic. signal), subband adaptive filter, subband echo suppressor, comfort noise, noise suppression. Near side signals: acoustic echo, near voice, noise.]

On this slide we can see and hear all components of the Alango Acoustic Echo Canceller working together in a continuous double talk situation with some noise at the near side. Press the corresponding buttons to listen to the processing results after each stage.


Alango Acoustic Echo Canceller (all parts together)

To some extent, it is possible to evaluate Acoustic Echo Cancellation technology off-line, using signals prerecorded in specific conditions. However, the final test should always be done in real time, with two persons speaking from the near and far sides. Human sound perception depends to a large extent on whether the person is speaking or only listening. As such, only in a real time conversation is one able to judge the real quality in double talk.

Real time evaluation requires establishing a voice communication link. However, when a voice communication network is used, the processing done by the Acoustic Echo Canceller is combined with network processing (e.g. line echo cancelling), delays, packet losses and other problems. As such, the results depend on the current network conditions. To remove these additional processing components, Alango developed a special evaluation kit in which the communication line is simulated by a long cable. The structure of the evaluation kit and its photograph are shown in the pictures below.

The kit is built around a DSP box in which the technology is implemented. It may be any of the DSP evaluation boards supported by Alango technology. The DSP box is connected to an interface box with four amplifiers (two mic amplifiers and two power amplifiers). On one side, the interface box is connected to the acoustic components of the voice terminal where the technology is supposed to cancel acoustic echoes. On the other side, it is connected to a standard telephone handset via a 10 m cable. The cable is long enough to reach another room, which minimizes acoustic coupling between the far and near sides. The kit is really “plug and talk”, so no dialing is necessary. It is also easy to try the technology with different acoustic components packed into different enclosures.

[Figure: structure and photograph of the evaluation kit. Near side room: hands-free car speaker and microphone or mobile terminal (near mic, near speaker), connected to the DSP box with Acoustic Echo/Noise Cancellation technology and the interface box with speaker volume control. Far side room: far mic and far speaker, connected via a 10 meter cable.]


www.alango.com

Contact: [email protected]
