IEEE SIGNAL PROCESSING LETTERS, VOL. 16, NO. 9, SEPTEMBER 2009

$\ell_0$ Norm Constraint LMS Algorithm for Sparse System Identification

    Yuantao Gu, Jian Jin, and Shunliang Mei

Abstract: In order to improve the performance of Least Mean Square (LMS) based system identification of sparse systems, a new adaptive algorithm is proposed which utilizes the sparsity property of such systems. A general approximating approach on the $\ell_0$ norm, a typical metric of system sparsity, is proposed and integrated into the cost function of the LMS algorithm. This integration is equivalent to adding a zero attractor to the iterations, by which the convergence rate of the small coefficients that dominate a sparse system can be effectively improved. Moreover, using a partial updating method, the computational complexity is reduced. Simulations demonstrate that the proposed algorithm can effectively improve the performance of LMS-based identification algorithms on sparse systems.

Index Terms: $\ell_0$ norm, adaptive filter, least mean square (LMS), sparsity.

    I. INTRODUCTION

A sparse system is defined as one whose impulse response contains many near-zero coefficients and few large ones. Sparse systems, which exist in many applications such as digital TV transmission channels [1] and echo paths [2], can be further divided into general sparse systems [Fig. 1(a)] and clustering sparse systems [Fig. 1(b), ITU-T G.168]. A clustering sparse system consists of one or more clusters, where a cluster is defined as a gathering of large coefficients. For example, the acoustic echo path is a typical single-clustering sparse system, while the echo path of satellite links is a multiclustering system which includes several clusters.

There are many adaptive algorithms for system identification, such as Least Mean Squares (LMS) and Recursive Least Squares (RLS) [3]. However, these algorithms have no particular advantage in sparse system identification because they make no use of the sparse characteristic. In recent decades, some algorithms have exploited the sparse nature of a system to improve the identification performance [2], [4]–[10]. As far as we know, the first of them is the Adaptive Delay Filter (ADF) [4], which locates and adapts each selected tap-weight according to its importance. Then, the concept of proportionate updating was originally introduced for the echo cancellation application by Duttweiler [2]. The underlying principle of Proportionate Normalized LMS (PNLMS) is to adapt each coefficient with an adaptation gain proportional to its own magnitude. Based on PNLMS, there exist many improved PNLMS algorithms, such as IPNLMS [5] and IIPNLMS [6]. Besides the above-mentioned algorithms,

Manuscript received March 24, 2009; revised May 11, 2009. First published June 05, 2009; current version published July 09, 2009. This work was supported in part by the National Natural Science Foundation of China (NSFC 60872087 and NSFC U0835003). The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Jen-Tzung Chien.

The authors are with the Department of Electronic Engineering, Tsinghua University, Beijing 100084, China (e-mail: [email protected]).

    Digital Object Identifier 10.1109/LSP.2009.2024736

    Fig. 1. Typical sparse system.

there are various improved LMS algorithms for clustering sparse systems [7], [8], [10]. These algorithms locate and track the nonzero coefficients by dynamically adjusting the length of the filter. The convergence behavior of these algorithms depends on the span of the clusters (the length from the first nonzero coefficient to the last one in an impulse response). When the span is long and close to the maximum length of the filter, or when the system has multiple clusters, these algorithms have no advantage over the traditional algorithms.

Motivated by the Least Absolute Shrinkage and Selection Operator (LASSO) [11] and the recent research on Compressive Sensing (CS) [12], a new LMS algorithm with an $\ell_0$ norm constraint is proposed in order to accelerate sparse system identification. Specifically, by exerting the constraint on the standard LMS cost function, the solution will be sparse and the gradient descent recursion will accelerate the convergence of the near-zero coefficients in the sparse system. Furthermore, using a partial updating method, the additional computational complexity caused by the $\ell_0$ norm constraint is greatly reduced. Simulations show that the new algorithm performs well for sparse system identification.

II. NEW LMS ALGORITHM

The estimation error of the adaptive filter output with respect to the desired signal is

e(n) = d(n) - \mathbf{w}^{\mathrm{T}}(n)\,\mathbf{x}(n), \qquad (1)

where $\mathbf{w}(n) = [w_0(n), w_1(n), \ldots, w_{L-1}(n)]^{\mathrm{T}}$ and $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-L+1)]^{\mathrm{T}}$ denote the coefficient vector and input vector, respectively, $n$ is the time instant, and $L$ is the filter length. In traditional LMS the cost function is defined as the squared error, $\xi(n) = |e(n)|^2$.

By minimizing the cost function, the filter coefficients are updated iteratively,


\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\, e(n)\,\mathbf{x}(n), \qquad (2)

where $\mu$ is the step-size of adaptation.
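For reference, (1) and (2) translate directly into code. The following minimal sketch is not from the letter; the names lms_step, w, x, d, and mu are ours and simply mirror the notation above.

    import numpy as np

    def lms_step(w, x, d, mu):
        # w: coefficient vector w(n); x: input vector [x(n), ..., x(n-L+1)]
        # d: desired output d(n);     mu: step-size of adaptation
        e = d - np.dot(w, x)      # estimation error, Eq. (1)
        w = w + mu * e * x        # LMS recursion, Eq. (2)
        return w, e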

The research on CS shows that sparsity is best represented by the $\ell_0$ norm, under which constraint the sparsest solution is acquired. This suggests that an $\ell_0$ norm penalty on the filter coefficients can be incorporated into the cost function when the unknown parameters are sparse. The new cost function is defined as

\xi_{\mathrm{new}}(n) = |e(n)|^2 + \gamma\,\|\mathbf{w}(n)\|_0, \qquad (3)

where $\|\mathbf{w}(n)\|_0$ denotes the $\ell_0$ norm, which counts the number of nonzero entries in $\mathbf{w}(n)$, and $\gamma$ is a factor that balances the new penalty and the estimation error. Considering that $\ell_0$ norm minimization is an NP-hard problem, the $\ell_0$ norm is generally approximated by a continuous function. A popular approximation [13] is

\|\mathbf{w}(n)\|_0 \approx \sum_{i=0}^{L-1}\left(1 - e^{-\beta |w_i(n)|}\right), \qquad (4)

where the two sides become strictly equal as the parameter $\beta$ approaches infinity. According to (4), the proposed cost function can be rewritten as

\xi_{\mathrm{new}}(n) = |e(n)|^2 + \gamma \sum_{i=0}^{L-1}\left(1 - e^{-\beta |w_i(n)|}\right). \qquad (5)

By minimizing (5), the new gradient descent recursion of the filter coefficients is

w_i(n+1) = w_i(n) + \mu\, e(n)\, x(n-i) - \kappa\beta\, \mathrm{sgn}\!\left(w_i(n)\right) e^{-\beta |w_i(n)|}, \quad \forall\, 0 \le i < L, \qquad (6)

where $\kappa = \mu\gamma$ and $\mathrm{sgn}(\cdot)$ is a component-wise sign function defined as

\mathrm{sgn}(x) = \begin{cases} x/|x|, & x \neq 0 \\ 0, & \text{elsewhere.} \end{cases} \qquad (7)

To reduce the computational complexity of (6), especially that caused by the last term, the first-order Taylor series expansion of the exponential function is taken into consideration,

e^{-\beta|x|} \approx \begin{cases} 1 - \beta|x|, & |x| \le 1/\beta \\ 0, & \text{elsewhere.} \end{cases} \qquad (8)

Note that the approximation in (8) is truncated at zero rather than allowed to become negative, because the exponential function itself is always positive. Thus, (6) can be approximated as

w_i(n+1) = w_i(n) + \mu\, e(n)\, x(n-i) + \kappa\, g_\beta\!\left(w_i(n)\right), \quad \forall\, 0 \le i < L, \qquad (9)

where

g_\beta(x) = \begin{cases} \beta^2 x - \beta\, \mathrm{sgn}(x), & |x| \le 1/\beta \\ 0, & \text{elsewhere.} \end{cases} \qquad (10)
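As an aside, the zero-attraction function (10) is straightforward to implement. The short sketch below is our own illustration (the name g_beta is ours), applied element-wise to a NumPy array, and is reused in the examples that follow.

    import numpy as np

    def g_beta(w, beta):
        # Eq. (10): beta^2*w - beta*sign(w) inside |w| <= 1/beta, zero elsewhere
        w = np.asarray(w, dtype=float)
        out = beta**2 * w - beta * np.sign(w)
        out[np.abs(w) > 1.0 / beta] = 0.0
        return out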

TABLE I. PSEUDO-CODES OF $\ell_0$-LMS

The algorithm described by (9) is denoted as $\ell_0$-LMS. Its implementation costs more than traditional LMS due to the last term on the right side of (9). It is therefore necessary to reduce the computational complexity further. Because the value of the last term does not change significantly between iterations, the idea of partial updating [14], [15] can be used. Here the simplest, sequential, method is adopted: at each iteration, only one out of a fixed number of coefficients (the fraction being an integer chosen in advance) is updated with the latest value of the zero-attraction term, while the values calculated in previous iterations are reused for the other coefficients. Thus, the excess computational complexity of the last term is reduced to the corresponding fraction of the original method. A more detailed discussion of partial updating can be found in [14]. The final algorithm is described using MATLAB-like pseudo-codes in Table I.
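Since Table I itself is not reproduced here, the sketch below gives one possible reading of (9) combined with the sequential partial update just described: each tap refreshes its cached zero-attraction value only once every M iterations. The function name, the cadence variable M, and its default value are our own choices, not the letter's.

    import numpy as np

    def l0_lms(x, d, L, mu, kappa, beta, M=4):
        # x, d : input and desired sequences of equal length
        # L    : filter length; mu: step-size; kappa = mu*gamma; beta: parameter of (4)
        # M    : only every M-th tap refreshes its zero-attraction term per iteration
        w = np.zeros(L)
        za = np.zeros(L)                        # cached zero-attraction terms
        for n in range(L - 1, len(x)):
            xn = x[n - L + 1:n + 1][::-1]       # [x(n), x(n-1), ..., x(n-L+1)]
            e = d[n] - np.dot(w, xn)            # estimation error, Eq. (1)
            idx = np.arange(n % M, L, M)        # taps refreshed at this iteration
            wi = w[idx]
            g = beta**2 * wi - beta * np.sign(wi)     # Eq. (10)
            g[np.abs(wi) > 1.0 / beta] = 0.0
            za[idx] = g
            w = w + mu * e * xn + kappa * za    # Eq. (9) with partial update
        return w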

In addition, the proposed $\ell_0$ norm constraint can be readily adopted to improve most LMS variants, e.g., NLMS [3], which may be more attractive than LMS because of its robustness. The new recursion of $\ell_0$-NLMS is

\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\, \frac{e(n)\,\mathbf{x}(n)}{\mathbf{x}^{\mathrm{T}}(n)\,\mathbf{x}(n) + \delta} + \kappa\, g_\beta\!\left(\mathbf{w}(n)\right), \qquad (11)

where $\delta$ is the regularization parameter.
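A single $\ell_0$-NLMS step in the same spirit is sketched below, again under our own naming; the default value of delta is a placeholder, not a value from the paper.

    import numpy as np

    def l0_nlms_step(w, x, d, mu, kappa, beta, delta=1e-4):
        # One iteration of Eq. (11): normalized LMS update plus zero attraction
        e = d - np.dot(w, x)
        g = beta**2 * w - beta * np.sign(w)     # zero-attraction term, Eq. (10)
        g[np.abs(w) > 1.0 / beta] = 0.0
        w = w + mu * e * x / (np.dot(x, x) + delta) + kappa * g
        return w, e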

III. BRIEF DISCUSSION

The recursion of the filter coefficients in the traditional LMS can be expressed as

w_i(n+1) = w_i(n) + \underbrace{\mu\, e(n)\, x(n-i)}_{\text{gradient correction}}, \qquad (12)

where the filter coefficients are updated along the negative gradient direction. Equation (9) can be presented in a similar way,

w_i(n+1) = w_i(n) + \underbrace{\mu\, e(n)\, x(n-i)}_{\text{gradient correction}} + \underbrace{\kappa\, g_\beta\!\left(w_i(n)\right)}_{\text{zero attraction}}, \qquad (13)

where the zero attraction is the last term of (9), $\kappa\, g_\beta(w_i(n))$, which imposes an attraction to zero on small coefficients. In particular, referring to Fig. 2, after each iteration a filter weight will decrease a little when it is positive and increase a little when it is negative. Therefore, in the space of tap coefficients an attractor, which attracts the nonzero vectors, effectively exists at the coordinate origin. The range of attraction depends on the parameter $\beta$.

Fig. 2. Curves of the function $g_\beta(x)$.

The function of the zero attractor leads to the performance improvement of $\ell_0$-LMS in sparse system identification. To be specific, in the process of adaptation, a tap coefficient closer to zero indicates a higher possibility of being zero itself in the impulse response. As shown in Fig. 2, when a coefficient is within the neighborhood of zero, $|x| \le 1/\beta$, the closer it is to zero, the greater the attraction intensity. When a coefficient is outside this range, no additional attraction is exerted. Thus, the convergence rate of those near-zero coefficients is raised. In conclusion, accelerating the convergence of the near-zero coefficients improves the performance of sparse system identification, since those coefficients are in the majority.
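A quick numerical check of this behavior, using the g_beta sketch from Section II (the value beta = 5 here is purely illustrative):

    import numpy as np

    beta = 5.0

    def g_beta(w):
        out = beta**2 * w - beta * np.sign(w)
        return np.where(np.abs(w) <= 1.0 / beta, out, 0.0)

    # The attraction grows as a tap approaches zero and vanishes for |w| > 1/beta:
    for w in (0.01, 0.05, 0.1, 0.19, 0.5):
        print(w, float(g_beta(np.array(w))))
    # attraction magnitudes: roughly 4.75, 3.75, 2.5, 0.25, and 0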

According to the above analysis, it can be readily accepted that $\beta$ and $\kappa$ determine the performance of the proposed algorithm. Here a brief discussion about the choice of these two parameters is given.

The choice of $\beta$: As mentioned above, a strong attraction intensity or a wide attraction range, which means the tap coefficients are attracted more, will accelerate the convergence. According to Fig. 2, a large $\beta$ means strong intensity but a narrow attraction range. Therefore, it is difficult to evaluate the impact of $\beta$ on the convergence rate. For practical purposes, Bradley and Mangasarian [13] suggest setting $\beta$ to some finite value, such as 5, or increasing it slowly throughout the iteration process for a better approximation. Such a choice is also proper here. Further details are omitted for brevity; interested readers may refer to [13].

The choice of $\kappa$: According to (3) or (9), the parameter $\kappa$ denotes the importance of the $\ell_0$ norm penalty, i.e., the intensity of the attraction. A large $\kappa$ therefore results in faster convergence, since the intensity of attraction increases as $\kappa$ increases. On the other hand, the steady-state misalignment also increases with $\kappa$. After the adaptation reaches steady state, most filter weights are close to zero due to the sparsity, so the zero-attraction term acts on most of them. Those near-zero coefficients will move randomly in a small neighborhood of zero, driven by the attraction term as well as the gradient noise term. Therefore, a large $\kappa$ results in a large steady-state misalignment. In conclusion, the parameters are determined by the trade-off between adaptation speed and adaptation quality in particular applications.

    IV. SIMULATIONS

The proposed $\ell_0$-NLMS is compared with the conventional algorithms NLMS, Stochastic Taps NLMS (ST-NLMS) [7], IPNLMS, and IIPNLMS in the application of sparse system identification. The effect of the parameters of $\ell_0$-LMS is also tested in various scenarios. The proposed partial updating method is used for $\ell_0$-LMS and $\ell_0$-NLMS in all the simulations.

Fig. 3. Comparison of convergence rate for five different algorithms, driven by colored signal.

The first experiment tests the convergence and tracking performance of the proposed algorithm driven by a colored signal. The unknown system is a network echo path, initialized with echo path model 5 of the ITU-T G.168 recommendation, delayed by 100 taps and padded with trailing zeros [a clustering sparse system, Fig. 1(b)]. After a fixed number of iterations, the delay is enlarged to 300 taps and the amplitude is decreased by 6 dB. The input signal is generated by white Gaussian noise driving a first-order Auto-Regressive (AR) filter and is then normalized. The observed noise is white and Gaussian. The five algorithms are each simulated one hundred times with their respective parameter settings. For ST-NLMS, the initial positions of the first and last active taps of the primary filter are 0 and 499, respectively, while those of the auxiliary filter are randomly chosen. Note that the parameters of all algorithms (including those of IPNLMS [5], IIPNLMS [6], and $\ell_0$-NLMS) are chosen so that their steady-state errors are the same. The Mean Square Deviations (MSDs) between the coefficients of the adaptive filter and those of the unknown system are shown in Fig. 3. According to Fig. 3, the proposed $\ell_0$-NLMS reaches steady state first among all the algorithms. In addition, when the unknown system changes abruptly, the proposed algorithm again reaches steady state first.
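To make the setup concrete, the following sketch reproduces the flavor of this experiment with a synthetic clustering-sparse echo path, a first-order AR (colored) input, and an MSD comparison between NLMS and $\ell_0$-NLMS. Every numerical value in it (filter length, AR coefficient, noise level, mu, kappa, beta, delta, iteration count) is a placeholder of our own choosing, not a value reported in the letter.

    import numpy as np

    rng = np.random.default_rng(0)
    L, N = 256, 5000                          # filter length and iterations (placeholders)

    # Synthetic clustering-sparse echo path: one short cluster of large taps.
    h = np.zeros(L)
    h[100:132] = rng.standard_normal(32) * np.exp(-0.2 * np.arange(32))

    # Colored input: white Gaussian noise through a first-order AR filter.
    u = rng.standard_normal(N)
    x = np.zeros(N)
    for n in range(1, N):
        x[n] = 0.8 * x[n - 1] + u[n]          # AR coefficient 0.8 is a placeholder
    x /= np.std(x)
    d = np.convolve(x, h)[:N] + 1e-3 * rng.standard_normal(N)   # noisy observation

    def run(mu, kappa, beta, delta=1e-4):
        # NLMS when kappa == 0, l0-NLMS otherwise; returns the MSD trace.
        w = np.zeros(L)
        msd = np.empty(N)
        for n in range(N):
            xn = x[max(0, n - L + 1):n + 1][::-1]
            xn = np.pad(xn, (0, L - len(xn)))
            e = d[n] - np.dot(w, xn)
            g = beta**2 * w - beta * np.sign(w)
            g[np.abs(w) > 1.0 / beta] = 0.0
            w = w + mu * e * xn / (np.dot(xn, xn) + delta) + kappa * g
            msd[n] = np.sum((w - h) ** 2)
        return msd

    msd_nlms = run(mu=0.5, kappa=0.0, beta=5.0)
    msd_l0   = run(mu=0.5, kappa=2e-6, beta=5.0)
    print("final MSD  NLMS:", msd_nlms[-1], "  l0-NLMS:", msd_l0[-1])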

The second experiment tests the convergence performance of $\ell_0$-LMS with different values of $\kappa$. Suppose that the unknown system has 128 coefficients, of which eight are nonzero (their locations and values are randomly selected). The driving signal and the observed noise are both white and Gaussian, the former with unit variance. The filter length matches the 128 coefficients of the unknown system. The step-size of $\ell_0$-LMS is fixed, while $\kappa$ takes different values. Averaged over one hundred runs, the MSDs are shown in Fig. 4, in which the MSD of LMS is also plotted for reference. It is evident that the $\ell_0$ norm constraint algorithm converges faster than its ancestor. In addition, the figure makes it obvious that a larger $\kappa$ results in a higher convergence rate but a larger steady-state misalignment. This again illustrates that, in practice, a delicate compromise must be made between convergence rate and steady-state misalignment when choosing $\kappa$. These results are consistent with the discussion in the previous section.

Fig. 4. Learning curves of $\ell_0$-LMS with different $\kappa$, driven by a white signal.

The third experiment tests the performance of the $\ell_0$-LMS algorithm for various degrees of sparsity. The unknown system is a general sparse system with a total of 128 coefficients. The number of large coefficients varies from 8 to 128, while the remaining coefficients are Gaussian noise with a small variance. The input signal and the observed noise are the same as those in the first experiment. The filter length is again 128. In order to compare the convergence rate in all scenarios, the step-sizes are fixed, and the parameter $\kappa$ is carefully chosen to make the steady-state errors the same (Table II). All algorithms are simulated 100 times and their MSDs are shown in Fig. 5. As predicted, the number of large coefficients has no influence on the performance of LMS. However, for $\ell_0$-LMS the convergence rate decreases as the number of large coefficients increases. Therefore, the new algorithm is sensitive to the sparsity of the system; that is, a sparser system yields better performance. As the number of large coefficients increases, the performance of $\ell_0$-LMS gradually degrades to that of standard LMS. Meanwhile, it should be emphasized that in all cases the $\ell_0$-LMS algorithm is never worse than LMS.
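As an illustration of this kind of test system, the sketch below generates a general sparse impulse response with a chosen number of large coefficients and low-variance Gaussian noise on the remaining taps. The variance value and the specific counts in the sweep are placeholders of ours; the letter's exact values are not reproduced here.

    import numpy as np

    def general_sparse_system(n_taps=128, n_large=8, small_var=1e-4, rng=None):
        # n_large randomly placed large taps; the rest are low-variance Gaussian noise.
        rng = rng or np.random.default_rng()
        h = rng.normal(scale=np.sqrt(small_var), size=n_taps)    # near-zero taps
        idx = rng.choice(n_taps, size=n_large, replace=False)    # locations of large taps
        h[idx] = rng.standard_normal(n_large)                    # large coefficients
        return h

    # Example sweep over sparsity levels, in the spirit of the third experiment.
    systems = [general_sparse_system(n_large=k) for k in (8, 16, 32, 64, 128)]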

    V. CONCLUSION

In order to improve the performance of sparse system identification, a new LMS algorithm is proposed in this letter by introducing the $\ell_0$ norm, which is intimately tied to sparsity, into the cost function as an additional constraint. This constraint evidently accelerates the convergence of the near-zero coefficients in the impulse response of a sparse system. To reduce the computational complexity, a method of partially updating the coefficients is adopted. Finally, simulations demonstrate that $\ell_0$-LMS accelerates the identification of sparse systems. The effects of the algorithm parameters and of the sparsity of the unknown system are also verified in the experiments.

    ACKNOWLEDGMENT

    The authors are very grateful to the anonymous reviewers fortheir part in improving the quality of this paper.

Fig. 5. Learning curves of $\ell_0$-LMS and LMS with different sparsities, driven by a white signal, where LC denotes Large Coefficients.

TABLE II. PARAMETERS OF $\ell_0$-LMS IN THE THIRD EXPERIMENT

    REFERENCES

[1] W. F. Schreiber, "Advanced television systems for terrestrial broadcasting: Some problems and some proposed solutions," Proc. IEEE, vol. 83, pp. 958–981, 1995.
[2] D. L. Duttweiler, "Proportionate normalized least-mean-squares adaptation in echo cancelers," IEEE Trans. Speech Audio Process., vol. 8, no. 5, pp. 508–518, 2000.
[3] B. Widrow and S. D. Stearns, Adaptive Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1985.
[4] D. M. Etter, "Identification of sparse impulse response systems using an adaptive delay filter," in Proc. IEEE ICASSP, 1985, pp. 1167–1172.
[5] J. Benesty and S. L. Gay, "An improved PNLMS algorithm," in Proc. IEEE ICASSP, 2002, pp. II-1881–II-1884.
[6] P. A. Naylor, J. Cui, and M. Brookes, "Adaptive algorithms for sparse echo cancellation," Signal Process., vol. 86, no. 6, pp. 1182–1192, 2006.
[7] Y. Li, Y. Gu, and K. Tang, "Parallel NLMS filters with stochastic active taps and step-sizes for sparse system identification," in Proc. IEEE ICASSP, 2006, vol. 3, pp. 109–112.
[8] V. H. Nascimento, "Improving the initial convergence of adaptive filters: Variable-length LMS algorithms," in Proc. DSP, 2002, vol. 2, pp. 667–670.
[9] R. K. Martin et al., "Exploiting sparsity in adaptive filters," IEEE Trans. Signal Process., vol. 50, pp. 1883–1894, Aug. 2002.
[10] O. A. Noskoski and J. Bermudez, "Wavelet-packet-based adaptive algorithm for sparse impulse response identification," in Proc. IEEE ICASSP, 2007, vol. 3, pp. 1321–1324.
[11] R. Tibshirani, "Regression shrinkage and selection via the LASSO," J. Roy. Statist. Soc., vol. 58, no. 1, pp. 267–288, 1996.
[12] D. Donoho, "Compressed sensing," IEEE Trans. Inform. Theory, vol. 52, pp. 1289–1306, Apr. 2006.
[13] P. S. Bradley and O. L. Mangasarian, "Feature selection via concave minimization and support vector machines," in Proc. 13th ICML, 1998, pp. 82–90.
[14] S. C. Douglas, "Adaptive filters employing partial updates," IEEE Trans. Circuits Syst. II, vol. 44, no. 3, pp. 209–216, 1997.
[15] M. Godavarti and A. O. Hero, "Partial update LMS algorithms," IEEE Trans. Signal Process., vol. 53, no. 7, pp. 2382–2399, 2005.