
A
Seminar Report
On
Contraction Mapping: An Important Property in Adaptive Filters

By

Srinivas Doddi (132070001)

MTECH (SEM-2) CONTROL SYSTEM,

DEPARTMENT OF ELECTRICAL ENGINEERING,

V.J.T.I., MUMBAI-400019.


CONTENTS:

1) Abstract
2) Introduction
3) Adaptive Filters
   A. Transversal Filter
   B. Configurations
4) Contraction Mapping
5) Theorem
   A. Proof
   B. Example
   C. LMS Algorithm
   D. Convergence in LMS
6) Conclusion
7) References


1. ABSTRACT

In this report we show that many adaptive filters used for system identification are contraction mappings. Applying deterministic methods, we give conditions under which algorithms such as Least Mean Square, Normalized Least Mean Square, Modified Least Mean Square with Delayed Update, Modified Filtered-X Least Mean Square, Affine Projection, and Recursive Least Squares are contraction mappings.

2. INTRODUCTION

Digital signal processing (DSP) plays a major role in current technical advancements such as noise filtering, system identification, and voice prediction. Standard DSP techniques, however, are not enough to solve these problems quickly and obtain acceptable results. Adaptive filtering techniques must be implemented to promote accurate solutions and a timely convergence to those solutions.

An adaptive filter is a system with a linear filter that has a transfer function controlled by variable parameters and a means to adjust those parameters according to an optimization algorithm. The closed-loop adaptive filter uses feedback in the form of an error signal to refine its transfer function.


3. Adaptive Filters

Discrete-time (or digital) filters are ubiquitous in today's signal processing applications. Filters are used to achieve desired spectral characteristics of a signal, to reject unwanted signals such as noise or interferers, to reduce the bit rate in signal transmission, and so on. The notion of making filters adaptive, i.e., altering the parameters (coefficients) of a filter according to some algorithm, tackles the problem that we may not know in advance, e.g., the characteristics of the signal, of the unwanted signal, or of a system's influence on the signal that we would like to compensate for. Adaptive filters can adjust to an unknown environment, and can even track time-varying signal or system characteristics.

A. Adaptive Transversal Filters

In a transversal filter of length N, as depicted in fig. 1, at each time n the output sample y[n] is computed as a weighted sum of the current and delayed input samples x[n], x[n − 1], . . . , x[n − N + 1]:

y[n] = Σ_{k=0}^{N−1} c_k*[n] x[n − k].

Here, the c_k[n] are time-dependent filter coefficients (we use the complex-conjugated coefficients c_k*[n] so that the derivation of the adaptation algorithm is valid for complex signals, too). This equation can be rewritten in vector form, using


x[n] = [x[n], x[n − 1], . . . , x[n − N + 1]]^T, the tap-input vector at time n, and c[n] = [c0[n], c1[n], . . . , cN−1[n]]^T, the coefficient vector at time n, as

y[n] = c^H[n] x[n].

Both x[n] and c[n] are column vectors of length N, and c^H[n] = (c*[n])^T is the Hermitian transpose of the vector c[n] (each element is conjugated, and the column vector is transposed into a row vector).

In the special case of the coefficients c[n] not depending on time n, i.e., c[n] = c, the transversal filter structure is an FIR filter of length N. Here, however, we focus on the case where the filter coefficients are variable and are adjusted by an adaptation algorithm.
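As a concrete illustration, the following minimal Python/NumPy sketch computes the output of such a transversal filter sample by sample; the filter length, coefficients, and input signal are illustrative assumptions, not values from this report.

    import numpy as np

    def transversal_output(c, taps):
        # y[n] = sum_{k=0}^{N-1} conj(c_k[n]) * x[n-k]; np.vdot conjugates its
        # first argument, so this computes exactly c^H[n] x[n].
        return np.vdot(c, taps)

    # Illustrative use: filter a short real signal with fixed coefficients.
    N = 4
    c = np.array([0.5, -0.2, 0.1, 0.05])              # coefficient vector c
    x = np.random.default_rng(0).standard_normal(20)  # input signal x[n]
    y = np.zeros(20)
    for n in range(20):
        # tap-input vector [x[n], x[n-1], ..., x[n-N+1]], zero-padded at the start
        taps = np.array([x[n - k] if n - k >= 0 else 0.0 for k in range(N)])
        y[n] = transversal_output(c, taps)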

B. Adaptive Filtering System Configurations

There are four major types of adaptive filtering configurations: adaptive system identification, adaptive noise cancellation, adaptive linear prediction, and adaptive inverse system. All of these systems are similar in the implementation of the algorithm but differ in system configuration. All four systems have the same general parts: an input x(n), a desired result d(n), an output y(n), an adaptive transfer function w(n), and an error signal e(n), which is the difference between the desired output d(n) and the actual output y(n). In addition to these parts, the system identification and inverse system configurations have an unknown linear system u(n) that can receive an input and produce a linear output for the given input.

Adaptive System Identification Configuration

The adaptive system identification configuration is primarily responsible for determining a discrete estimate of the transfer function of an unknown digital or analog system. The same input x(n) is applied to both the adaptive filter and the unknown system, and their outputs are compared. The output of the unknown system serves as the desired signal d(n); subtracting the adaptive filter output y(n) from it yields the error signal e(n), which is used to manipulate the filter coefficients of the adaptive system, driving the error signal toward zero.
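A hedged Python/NumPy sketch of this wiring follows; the unknown system's impulse response and the signal length are assumptions made for the sketch, and the coefficient update itself is deferred to the LMS section later in this report.

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.standard_normal(500)            # common input x(n)
    u = np.array([1.0, 0.5, -0.3])          # assumed unknown system (FIR here)
    d = np.convolve(x, u)[:len(x)]          # desired signal d(n): unknown-system output
    w = np.zeros(3)                         # adaptive filter coefficients w(n)

    for n in range(2, len(x)):
        taps = x[n - 2:n + 1][::-1]         # tap vector [x(n), x(n-1), x(n-2)]
        y = w @ taps                        # adaptive filter output y(n)
        e = d[n] - y                        # error e(n) drives the coefficient update
        # ...update of w goes here (e.g. the LMS rule derived later)...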


Adaptive Noise Cancellation Configuration

The second configuration is the adaptive noise cancellation configuration, as shown in figure 2. In this configuration the input x(n), a noise source N1(n), is compared with a desired signal d(n), which consists of a signal s(n) corrupted by another noise N0(n). The adaptive filter coefficients adapt so that the error signal becomes a noiseless version of the signal s(n).
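A short sketch of this signal wiring follows; the sinusoid for s(n) and the coloring filter relating the two noise sources are illustrative assumptions of the sketch.

    import numpy as np

    rng = np.random.default_rng(2)
    t = np.arange(1000)
    s = np.sin(2 * np.pi * 0.01 * t)              # signal of interest s(n)
    N1 = rng.standard_normal(1000)                # reference noise N1(n) = filter input x(n)
    N0 = np.convolve(N1, [0.8, -0.4])[:1000]      # correlated noise N0(n) corrupting s(n)
    d = s + N0                                    # desired input d(n) = s(n) + N0(n)
    # After adaptation, y(n) tracks N0(n), so e(n) = d(n) - y(n) approximates s(n).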

Adaptive Linear Prediction Configuration

Adaptive linear prediction is the third type of adaptive configuration (see figure 3). This configuration essentially performs two operations. The first operation, if the output is taken from the error signal e(n), is linear prediction: the adaptive filter coefficients are trained to predict, from the statistics of the input signal x(n), what the next input sample will be. The second operation, if the output is taken from y(n), is noise filtering, similar to the adaptive noise cancellation outlined in the previous section. A code sketch of the prediction wiring follows.
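In code, this wiring amounts to feeding the filter past samples of x(n) and comparing its output against the current sample. The sketch below reuses the arrays x and w from the system-identification sketch and assumes a one-step predictor:

    # One-step linear prediction: N past samples are used to predict x(n).
    N = len(w)
    for n in range(N, len(x)):
        taps = x[n - N:n][::-1]       # past samples x(n-1), ..., x(n-N)
        y = w @ taps                  # prediction of x(n)
        e = x[n] - y                  # prediction error: the linear-prediction output
        # ...coefficient update (e.g. LMS) goes here...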


As in the previous section, neither the linear prediction output nor the noise cancellation output will converge to an error of exactly zero. This is true for the linear prediction output because, if the error signal did converge to zero, the input signal x(n) would be entirely deterministic, in which case we would not need to transmit any information at all.

Adaptive Inverse System Configuration

The goal of the adaptive filter here is to model the inverse of the unknown system u(n). This configuration works as follows. The input x(n) is sent through the unknown system u(n) and then through the adaptive filter, resulting in an output y(n). The input is also sent through a delay to obtain d(n). As the error signal converges to zero, the adaptive filter coefficients w(n) converge to the inverse of the unknown system u(n).
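As a hedged sketch of this wiring (again reusing x, u, and w from the system-identification sketch; the five-sample delay is an arbitrary illustrative choice):

    delta = 5                               # delay applied to x(n) to form d(n)
    v = np.convolve(x, u)[:len(x)]          # x(n) passed through the unknown system u(n)
    N = len(w)
    for n in range(max(N - 1, delta), len(x)):
        taps = v[n - N + 1:n + 1][::-1]     # adaptive filter input: unknown-system output
        y = w @ taps                        # output of the cascade
        e = x[n - delta] - y                # d(n) = x(n - delta); e(n) -> 0 as w models the inverse
        # ...coefficient update goes here...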


Contraction Mapping Theorem

Here we prove a very important fixed-point theorem.

Definition 1. Let f : X → X be a map of a metric space to itself. A point a ∈ X is called a fixed point of f if f(a) = a. Recall that a metric space (X, d) is said to be complete if every Cauchy sequence in X converges to a point of X.

Definition 2. Let (X, dX) and (Y, dY) be metric spaces. A map φ : X → Y is called a contraction if there exists a positive number c < 1 such that dY(φ(x), φ(y)) ≤ c dX(x, y) for all x, y ∈ X.

Theorem 1 (Contraction mapping theorem). Let (X, d) be a complete metric space. If φ : X → X is a contraction, then φ has a unique fixed point.

Proof. By the definition of a contraction, there exists a number c ∈ (0, 1) such that

d(φ(x), φ(y)) ≤ c d(x, y). (1)

Let a0 ∈ X be an arbitrary point, and define a sequence {an} inductively by setting an+1 = φ(an). We claim that {an} is Cauchy. To see this, first note that for any n ≥ 1 we have, by (1),

d(an+1, an) = d(φ(an), φ(an−1)) ≤ c d(an, an−1),

and so one checks easily by induction that

d(an+1, an) ≤ c^n d(a1, a0) (2)

for all n ≥ 1. This and the triangle inequality then give, for m > n ≥ 1,

d(am, an) ≤ d(am, am−1) + d(am−1, am−2) + . . . + d(an+1, an) ≤ (c^(m−1) + c^(m−2) + . . . + c^n) d(a1, a0) ≤ (c^n / (1 − c)) d(a1, a0), (3)

where the last step sums the geometric series. This shows that d(am, an) → 0 as n, m → ∞, i.e. that {an} is Cauchy, as claimed. Since (X, d) is complete, there exists a ∈ X such that an → a. Being a contraction, φ is continuous, and hence

φ(a) = lim_{n→∞} φ(an) = lim_{n→∞} an+1 = a.

Thus a is a fixed point of φ. If b ∈ X is also a fixed point of φ, then

d(a, b) = d(φ(a), φ(b)) ≤ c d(a, b),

which implies, since c < 1, that d(a, b) = 0 and hence that a = b. Thus the fixed point is unique. ∎
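The contents above list an example under this theorem; here is a simple one, with the map chosen purely for illustration. On X = ℝ with the usual metric, φ(x) = x/2 + 1 satisfies |φ(x) − φ(y)| = |x − y|/2, so it is a contraction with c = 1/2, and its unique fixed point is a = 2. The iteration from the proof converges to it from any starting point:

    # Fixed-point iteration a_{n+1} = phi(a_n) for the contraction phi(x) = x/2 + 1.
    def phi(x):
        return 0.5 * x + 1.0

    a = 100.0                  # arbitrary starting point a_0
    for _ in range(50):
        a = phi(a)
    print(a)                   # approximately 2.0, the unique fixed point (phi(2) = 2)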

Least Mean Squares Gradient Approximation Method


Given an adaptive filter with an input x(n), an impulse response w(n), and an output y(n), the transfer relation of the system is

y(n) = w^T(n) x(n),

where

x(n) = [x(n), x(n−1), x(n−2), . . . , x(n−(N−1))]^T

is the tap-input vector and

w(n) = [w0(n), w1(n), w2(n), . . . , wN−1(n)]^T

holds the time-domain coefficients of an Nth-order FIR filter. Note that here, and throughout, a boldface letter represents a vector and the superscript T represents the transpose of a real-valued vector or matrix. Using the ideal cost function E[e^2](n), the coefficient update can be written as

w(n+1) = w(n) − μ ∇E[e^2](n).

In the above equation, w(n+1) represents the new coefficient values for the next time interval, μ is a scaling factor (the step size), and ∇E[e^2](n) is the gradient of the ideal cost function with respect to the vector w(n). Replacing this gradient by its instantaneous estimate yields

w(n+1) = w(n) + μ e(n) x(n), where e(n) = d(n) − y(n) and y(n) = x^T(n) w(n).

In the above equation μ is sometimes multiplied by 2, but here we will assume that the factor of 2 is absorbed by μ. In summary, in the Least Mean Squares Gradient Approximation Method, often referred to as the Method of Steepest Descent, a guess based on the current filter coefficients is made, and the gradient vector (the derivative of the MSE with respect to the filter coefficients) is calculated from that guess. A second guess at the tap-weight vector is then made by changing the present guess in the direction opposite to the gradient vector. This process is repeated until the derivative of the MSE is zero.
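A minimal NumPy sketch of this update in the system-identification configuration follows; the unknown system, step size μ, and signal length are assumptions of the sketch, not values from the report.

    import numpy as np

    def lms_identify(x, d, N, mu):
        # LMS: w(n+1) = w(n) + mu * e(n) * x(n), with e(n) = d(n) - y(n).
        w = np.zeros(N)
        e = np.zeros(len(x))
        for n in range(N - 1, len(x)):
            taps = x[n - N + 1:n + 1][::-1]   # tap-input vector x(n)
            y = w @ taps                      # y(n) = w^T(n) x(n)
            e[n] = d[n] - y                   # error signal e(n)
            w = w + mu * e[n] * taps          # gradient-approximation update
        return w, e

    rng = np.random.default_rng(0)
    x = rng.standard_normal(5000)
    u = np.array([1.0, 0.5, -0.3, 0.1])       # assumed unknown system
    d = np.convolve(x, u)[:len(x)]
    w, e = lms_identify(x, d, N=4, mu=0.01)
    print(np.round(w, 3))                     # approaches u as the error decays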

Convergence of the LMS Adaptive Filter

The convergence characteristics of the LMS adaptive filter are related to the autocorrelation matrix of the input process, defined by

Rx = E[x(n) x^T(n)].

Two conditions must be satisfied for the system to converge:

• The autocorrelation matrix Rx must be positive definite.
• 0 < μ < 1/λmax, where λmax is the largest eigenvalue of Rx.

In addition, the rate of convergence is related to the eigenvalue spread of Rx, measured by its condition number κ = λmax/λmin, where λmin is the minimum eigenvalue of Rx. The fastest convergence occurs when κ = 1, which corresponds to white noise; thus the fastest way to train an LMS adaptive system is to use white noise as the training input. As the input becomes more and more colored, the speed of training decreases.
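These conditions can be checked numerically. In the sketch below, Rx is estimated from data, and the AR(1)-style coloring filter is an assumption used to illustrate how the eigenvalue spread grows as the input becomes colored:

    import numpy as np

    def autocorr_matrix(x, N):
        # Estimate Rx = E[x(n) x^T(n)] by averaging outer products of tap vectors.
        X = np.array([x[n - N + 1:n + 1][::-1] for n in range(N - 1, len(x))])
        return X.T @ X / len(X)

    rng = np.random.default_rng(0)
    white = rng.standard_normal(20000)
    colored = np.zeros_like(white)            # x(n) = 0.9 x(n-1) + v(n), illustrative
    for n in range(1, len(white)):
        colored[n] = 0.9 * colored[n - 1] + white[n]

    for name, sig in [("white", white), ("colored", colored)]:
        lam = np.linalg.eigvalsh(autocorr_matrix(sig, N=4))
        print(name, "step-size bound 1/lambda_max:", 1.0 / lam.max(),
              "condition number kappa:", lam.max() / lam.min())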


Conclusion

We have studied the concept of contraction mapping and its role in various adaptive filters, and examined through case studies how the coefficient sequence converges in these diverse adaptive filtering algorithms.


References

[1] Simon Haykin, "Adaptive Filter Theory", Third Edition, Prentice-Hall, Inc., Upper Saddle River, NJ, 1996.

[2] Bernard Widrow and Samuel D. Stearns, "Adaptive Signal Processing", Prentice-Hall, Inc., Upper Saddle River, NJ, 1985.

[3] Edward A. Lee and David G. Messerschmitt, "Digital Communication", Kluwer Academic Publishers, Boston, 1988.

[4] Steven M. Kay, "Fundamentals of Statistical Signal Processing: Detection Theory", Volume 2, Prentice-Hall, Inc., 1998.