asymptotically efficient blind deconvolution

Signal Processing 20 (1990) 193-209, Elsevier

ASYMPTOTICALLY EFFICIENT BLIND DECONVOLUTION†

Sandro BELLINI and Fabio ROCCA Dipartimento di Elettronica, Politecnico di Milano, Piazza L. da Vinci 32, 20133 Milano, Italy

Received 23 February 1989 Revised 1 December 1989

Abstract. A solution is proposed for the discrete blind deconvolution problem, i.e., the estimation of the impulse response of a sampled system given its output and some statistical information on its input. The input sequence is assumed to be independent and identically distributed. The estimate uses the crosscorrelation between the output samples and a nonlinear function of the output samples. The technique is efficient when the residual unknown channel distortion is small, i.e., asymptotically. Therefore, it is recommended as a final step to clean up the noise left by any preferred blind deconvolution method. The variance of the estimation error and the optimal nonlinear function, which depends on the input probability density, are given. The variance is checked against the Cramer-Rao bound. The rms error of the suboptimal solution that adopts the nonlinear function 'sign' depends on simple moments of the input data sequence and on the probability density at the origin. Moderate losses ensue in the case of generalized Gaussians. When the distortion is not small, simulations show that this technique is still useful, but iterations are needed to remove the estimation bias. Polyspectral techniques, which could give unbiased solutions for any channel distortion, cannot be any better asymptotically. The paper discusses why they could be much noisier. The extension of the technique described in this paper to complex data and impulse response and to parameter-dependent distortions is straightforward and is briefly sketched.

Zusammenfassung. Es wird eine Lösung für das diskrete Entfaltungsproblem vorgeschlagen, d.h., die Schätzung der Impulsantwort eines abgetasteten Systems, falls Ausgang sowie statistische Information über den Eingang vorliegen. Die Eingangsfolge wird als unabhängig und gleichverteilt angenommen. Die Schätzung verwendet die Kreuzkorrelierte zwischen den Ausgangswerten und einer nicht-linearen Funktion von ihnen. Das Verfahren ist effizient, wenn die restliche unbekannte Kanalstörung klein ist, zumindest asymptotisch. Deshalb wird als letzter Schritt empfohlen, das übrige Rauschen mit Hilfe einer Luftfaltung zu reduzieren. Die Varianz des Restfehlers und die optimale nichtlineare Funktion, die von der Wahrscheinlichkeitsdichte des Eingangs abhängig ist, werden angegeben. Die Varianz wird mit der Cramer-Rao Grenze verglichen. Der RMS Fehler der suboptimalen Lösung für den Fall der 'Signum'-Funktion hängt von einfachen Momenten der Eingangsfolge und der Wahrscheinlichkeitsdichte im Ursprung ab. Bescheidene Fehler ergeben sich für verallgemeinerte Gauß-Dichten. Wenn die Störung nicht klein ist, zeigen Simulationen, daß das Verfahren immer noch brauchbar ist, jedoch sind Iterationen erforderlich um den Bias des Schätzwertes zu beseitigen. Polyspektrale Ansätze, die auf Lösungen ohne Bias bei beliebiger Kanalstörung führen, können asymptotisch nicht besser sein. Dies wird in der Arbeit diskutiert. Die Erweiterung des vorgeschlagenen Verfahrens auf komplexe Daten und Impulsantwort sowie auf parameterabhängige Störungen ist offensichtlich und wird kurz skizziert.

Résumé. Nous proposons une solution au problème de la déconvolution aveugle discrète, c.a.d., de l'estimation de la réponse impulsionnelle d'un système échantillonné connaissant sa sortie et la statistique de l'entrée. La séquence d'entrée est supposée indépendante et identiquement distribuée. L'estimation utilise la cross-corrélation entre les échantillons de sortie et un fonction nonlinéaire de ceux-ci. Cette technique est efficace quand la distortion résiduelle du canal inconnu est faible, c.a.d. asymptotiquement. De ce fait elle est recommandée en dernière étape pour éliminer le bruit laissé par toute méthode de déconvolution aveugle préférentielle. La variance de l'erreur d'estimation et la fonction nonlinéaire optimale, qui dépend de la densité de probabilité de l'entrée, sont données. La variance est comparée à la borne de Cramer-Rao. L'erreur rms de la solution sous-optimale utilisant la fonction nonlinéaire 'signe' dépend de moments simples de la séquence d'entrée et de

† This work has been supported by MPI 40% Program and by the National Research Council (CNR). It has also been partially supported by The European Communities CONTR. n. ENC3C-0013-I(s).

0165-1684/90/$03.50 © 1990- Elsevier Science Publishers B.V.

la valeur de la densité de probabilité à l'origine. Ceci entraîne un baisse de performance faible dans le cas de signaux Gaussiens généralisés. Quand la distortion est significative, des résultats de simulations montrent que cette technique reste utile, mais un certain nombre d'itérations est nécessaire pour supprimer le biais de l'estimation. Les techniques polyspectrales, qui pourraient produire des solutions non biaisées pour n'importe quelle distortion, ne sont pas meilleures asymptotiquement. Cet article discute pourquoi elles pourraient être beaucoup plus bruitées. L'extension de la technique décrite dans cet article à des données et à une réponse impulsionnelle complexes, ainsi qu'à des distortions dépendant de paramètres, est immédiate et brièvement présentée.

Keywords. Deconvolution, minimum entropy, non-Gaussian variates.

1. Introduction

Blind deconvolution is the identification of phase and amplitude of the impulse response (PR) of a linear system, based on the analysis of output data and on some statistical hypotheses on the input. The problem has several applications, such as seismic deconvolution, image deblurring, channel equalization, and echographic data focusing, either in the acoustic case or in the electromagnetic one (synthetic aperture radar). The linear system can be monodimensional as well as multidimensional. The transfer function can depend on several parameters or just on one.

If the phase characteristic of the linear system is irrelevant or known a priori, then the problem of blind deconvolution reduces to that of spectral analysis. The techniques introduced in this paper are still useful, since additional statistical information on input data helps reduce the variance of the estimates of the parameters of the channel transfer function.

The problem has been studied by several authors in the past, under several names and for different applications. Homomorphic filtering was introduced to remove effects of non-uniform illumination of images [16]. The name 'blind deconvolution' was first used by Stockham et al. for the restoration of old records, based on a model of the signal [18]. Wiggins [20, 21] introduced 'minimum entropy deconvolution' in seismic data analysis, seeking the phase and amplitude of the transfer function of the inverse channel that maximizes the kurtosis of the deconvolved data.

The same concept has been applied to synthetic aperture radar focusing by Li et al. [13]; in this case, the unique parameter to be estimated is the rate of change of the Doppler shift of the radar echo.

The same principle is also used in speckle interferometry [23] to remove the effects of the blur induced on astronomical plates by short term variations of the refraction index of the atmosphere.

Similar concepts may be used in multilevel data transmission for blind (or 'self-recovering') equalization of telephone or radio channels [2, 4, 5, 8, 9, 17]. It has been shown in [5] that the minimization of properly chosen functionals of output data pushes the output probability density function (pdf) towards the correct one (which is approximately uniform) and achieves channel equalization.

A simple way to implement blind deconvolution is to calculate the stochastic gradient of a proper figure of merit (say, the estimated kurtosis of the data) with respect to the transfer function of the channel. Then, using this information, the transfer function is approximately deconvolved, the stochastic gradient is computed again and so on, iterating towards convergence [11, 19].

Another approach [2, 10] makes use of coarse estimates of the input data sequence obtained from the channel output by means of an appropriate nonlinear function, which depends on the pdf of input data. This estimate of the input sequence is then crosscorrelated with the output, and an approximate inverse filter is thus obtained. The output sequence is deconvolved by this inverse filter, and the previous operations are repeated until convergence is found.

Benveniste et al. [5] proved that in particular cases convergence can indeed be achieved even for highly distorted channels. Donoho [7] also determined the dispersion of the estimates of the PR within the minimum entropy approach derived by Wiggins.

In this paper we consider a different technique [12] for blind deconvolution, applicable when the pdf of input data is known. We introduce a function, which is an average of products of output samples times a nonlinear function of the same data samples. Then, from this generalized crosscovariance we derive approximately unbiased estimates of the PR of the channel under the hypotheses that the data sequence is composed of independent identically distributed (iid) random variables, and that the channel distortion is small. In most cases of practical interest the observed sequence is much longer than the impulse response of the channel. We neglect edge effects, in order to give simple formulas for the variance of the estimates and for the Cramer-Rao bound. We show that the Cramer-Rao bound is attained if the nonlinear function is properly optimized.

In [3], we considered the problem of estimating only the odd part of the channel response, after expansion in a Taylor series and neglecting higher order terms. In that case, the generalized crosscovariance assumes the form of the average of a nonlinear function of pairs of data samples and not, as in this paper, the average of products of one sample times a nonlinear function of the other one.

When the input pdf is Gaussian, the estimation technique of the even part of the channel PR proposed in this paper reduces to the usual estimation of the amplitude of a transfer function given the autocovariance of the data. When the input data are not Gaussian, though, we are able to determine the amplitude spectra more efficiently; besides, we are also able to measure the phase of the transfer function.

In particular, we shall see that in the case of small distortions the estimates of even and odd parts of the PR of the channel are uncorrelated and have different variances. We also consider cases in which it is known a priori that the PR of the channel is causal or that it depends on a few parameters.

We also consider a simple, suboptimum nonlinear function, namely the function sign, and show that simplicity does not reduce performance too much. Incidentally, the merits of the sign have long been known (see, e.g., [4, 5, 17] for the communication field).

The structure of the paper is as follows. We first introduce the model of the channel and define the generalized crosscovariance. After that, we derive first order approximations of the unbiased estimates of the PR, their dispersion and the optimal nonlinear function. We give closed form expressions in the case of generalized Gaussian random variables.

Then, we show that the generalized crosscovariance is proportional to the gradient of the log-likelihood function of the PR of the channel. We compare the results obtained so far with the Cramer-Rao bound for unbiased estimates and verify that the asymptotic technique proposed in this paper is efficient (but for edge effects, which are negligible in most practical cases). We also sketch a comparison with polyspectral techniques [22].

Finally, we present some simulated examples of iterative deconvolution, to give an idea of the performance of the method even when the channel distortion is not small.

The extension of this technique to complex data sequences and PRs is straightforward, and has been used, e.g., by Prati [6]; in that case the application was autofocusing of synthetic aperture radar images.

Our method is not guaranteed to converge in general, even if it often does. Therefore, if the pdf of input data is known, we recommend our method as a final step of deconvolution, to clean up in an efficient way the noise left by whichever deconvolution technique is preferred.

Throughout the paper, we keep mathematics to a minimum. We neglect edge effects, remainder terms in power expansions, small biases and the like. Exact formulas would hardly be useful, in most cases.

2. The model of the channel and of input data

Let us suppose that the data $x_k$ that enter a channel with unknown pulse response are iid random variables with known probability density $f_x(x)$. For the sake of simplicity, in this paper we assume that the expected value of $x_k$ is zero. The output sequence $y_k$ is obtained as follows:

$$y_k = x_k + \sum_i a_i x_{k-i}, \qquad 1 \le k \le N, \tag{1}$$

where $a_i$ represents interference from adjacent data ($i \ne 0$), or deviation from unit gain ($i = 0$). In the sequel we suppose that $|a_i| \ll 1$, so that we can call the PR of the channel 'approximately delta-like'. In an iterative deconvolution technique, this hypothesis will certainly be valid in the last iterations, when convergence is about to be attained.

The calculations are greatly simplified by this hypothesis. Convergence is usually (but not always) achieved even when this condition is far from being fulfilled [12]. In a subsequent section we shall also discuss the case of some noise superposed on the output data, i.e.,

$$y_k = x_k + \sum_i a_i x_{k-i} + n_k. \tag{2}$$

In this situation we shall see that the solution does not change much, and the same technique may be applied.

In both cases the goal of the deconvolution is the retrieval of the coefficients $a_i$ using only the output sequence $y_k$ and the pdf of the input data.

The number of coefficients $a_i$ of the impulse response that need to be considered is up to the user of the deconvolution technique.
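As an illustration of the channel model of this section, the following minimal sketch generates an output record from a given input record. The function names `simulate_channel` and `laplacian_iid`, the dictionary layout of the taps, the Laplacian choice of input pdf and the tap values are all assumptions made for the example; samples outside the record are simply taken as zero, in line with the decision to neglect edge effects.

```python
import random


def simulate_channel(x, taps, noise_std=0.0, seed=0):
    """Apply y_k = x_k + sum_i a_i * x_{k-i}, plus optional Gaussian noise n_k.

    taps maps a lag i (negative, zero or positive) to the small coefficient a_i.
    Samples outside the record are treated as zero (edge effects neglected).
    """
    rng = random.Random(seed)
    n = len(x)
    y = []
    for k in range(n):
        acc = x[k]
        for i, a in taps.items():
            if 0 <= k - i < n:
                acc += a * x[k - i]
        if noise_std > 0.0:
            acc += rng.gauss(0.0, noise_std)
        y.append(acc)
    return y


def laplacian_iid(n, seed=0):
    """Zero-mean iid Laplacian samples: one illustrative choice of input pdf."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]


# Hypothetical small distortion: gain error a_0 and two adjacent-sample terms.
x = laplacian_iid(8, seed=1)
y = simulate_channel(x, {0: 0.02, 1: 0.05, -1: -0.03})
```

A dictionary keyed by lag keeps anticausal terms (negative lags) as easy to express as causal ones, which matters here because the channel is not assumed to be causal.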

3. A first order analysis of channel estimators

Given $N$ samples $y_k$ ($k = 1, \ldots, N$) of the channel output, let

$$\gamma_i = \frac{1}{N} \sum_k g(y_k)\, y_{k-i}. \tag{3}$$

Only $N - |i|$ terms in (3) will be available. In practice large values of $N$ will be needed, in order to have small variances of the estimates. Then, $N$ is much larger than the maximum value of $i$ of interest. Throughout the paper we will approximate $N - |i|$ with $N$.
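The generalized crosscovariance is a plain sample average, and a short sketch may help fix the indexing. The names `gen_crosscov` and `sign` are assumptions for this example, and the sign function is used only as one possible choice of the nonlinearity. The sum runs over the index pairs available in the record and is divided by the full record length, matching the approximation of the number of available terms by $N$.

```python
def sign(v):
    """Sign nonlinearity: -1, 0 or +1."""
    return (v > 0) - (v < 0)


def gen_crosscov(y, g, lag):
    """gamma_lag = (1/N) * sum_k g(y_k) * y_{k-lag}, over the available pairs."""
    n = len(y)
    total = 0.0
    # k must satisfy 0 <= k < n and 0 <= k - lag < n.
    for k in range(max(0, lag), min(n, n + lag)):
        total += g(y[k]) * y[k - lag]
    return total / n


y = [1.0, -2.0, 3.0]
gamma_plus = gen_crosscov(y, sign, 1)    # pairs (y_1, y_0) and (y_2, y_1)
gamma_minus = gen_crosscov(y, sign, -1)  # pairs (y_0, y_1) and (y_1, y_2)
```

Note that the values at lag $i$ and lag $-i$ differ in general, because the nonlinearity is applied to only one of the two samples in each product.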

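Let us first consider values of $i$ different from zero, and let the channel be undistorted ($a_i = 0$). The expected values of the random variables $\gamma_i$ are

$$\langle \gamma_i \rangle = \langle g(y_k)\, y_{k-i} \rangle = \langle g(x_k)\, x_{k-i} \rangle = \langle g \rangle \langle x \rangle = 0, \tag{4}$$

where angular brackets denote expectations and $\langle g \rangle$ is a shorthand notation for $\langle g(x_k) \rangle$. Since there is no distortion, the terms to be added in (3) are uncorrelated. Then, the variance of $\gamma_i$ is given by

$$\mathrm{Var}(\gamma_i) = \frac{1}{N} \langle g^2 \rangle \langle x^2 \rangle. \tag{5}$$

The covariances are zero, but for

$$\mathrm{Cov}(\gamma_i, \gamma_{-i}) = \frac{1}{N} \langle gx \rangle^2. \tag{6}$$

Now, let us suppose that the channel is distorted, but not too much. Thus, if we expand $\langle \gamma_i \rangle$ in a Maclaurin series and retain only first order terms, we still have a good approximation. Since, from (1) and (3),

$$\langle \gamma_i \rangle = \Big\langle g\Big(x_k + \sum_j a_j x_{k-j}\Big) \Big(x_{k-i} + \sum_j a_j x_{k-i-j}\Big) \Big\rangle, \tag{7}$$

we calculate $\partial \langle \gamma_i \rangle / \partial a_j$ and evaluate the derivatives in the distortionless case ($a_i = 0$, all $i$), obtaining

$$\frac{\partial \langle \gamma_i \rangle}{\partial a_j} = \langle g'(x_k)\, x_{k-j}\, x_{k-i} \rangle + \langle g(x_k)\, x_{k-i-j} \rangle = \begin{cases} \langle g' \rangle \langle x^2 \rangle, & i = j, \\ \langle gx \rangle, & i = -j, \\ 0, & \text{otherwise.} \end{cases} \tag{8}$$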

Therefore, neglecting higher order terms of the series expansion of $\langle \gamma_i \rangle$ in ascending powers of the $a_i$'s, we see that $\langle \gamma_i \rangle$ and $\langle \gamma_{-i} \rangle$ become linear combinations of $a_i$ and $a_{-i}$ only, namely

$$\langle \gamma_i \rangle = D_{i,i}\, a_i + D_{i,-i}\, a_{-i}, \qquad \langle \gamma_{-i} \rangle = D_{-i,i}\, a_i + D_{-i,-i}\, a_{-i}, \tag{9}$$

where $D_{ij} = \partial \langle \gamma_i \rangle / \partial a_j$. Estimates of $a_i$ and $a_{-i}$ can thus be obtained from $\gamma_i$ and $\gamma_{-i}$. We have

$$\hat{a}_i = \frac{\gamma_i \langle g' \rangle \langle x^2 \rangle - \gamma_{-i} \langle gx \rangle}{(\langle g' \rangle \langle x^2 \rangle)^2 - \langle gx \rangle^2}. \tag{10}$$

These estimates are approximately unbiased, for small distortion. In fact, bias terms are quadratic, cubic, etc. It is interesting to write the solution in terms of even and odd parts of the impulse response, namely

$$\hat{a}_i^{e,o} = \frac{\hat{a}_i \pm \hat{a}_{-i}}{2} = \frac{\gamma_i \pm \gamma_{-i}}{2(\langle g' \rangle \langle x^2 \rangle \pm \langle gx \rangle)}, \tag{11}$$

where plus and minus should be used for even and odd parts, respectively. Equation (11) shows that the even (odd) part of $\gamma_i$ determines the even (odd) part of the estimated impulse response. Note that, for small distortions, $a^e$ and $a^o$ correspond to amplitude and phase distortion, respectively. It is an important fact, however, that the gains for even and odd parts are different. We anticipate that the odd part cannot be estimated when the data have a Gaussian distribution, since an infinite gain would then be required.

Variances and covariances of $\gamma_i$ depend on the total channel distortion. However, a power series expansion for $\mathrm{Var}(\gamma_i)$ and $\mathrm{Cov}(\gamma_i, \gamma_{-i})$ will contain only even powers of $a_i$. Therefore, according to our approximations, which consider only first order terms, (5) and (6) still hold. The variances of even and odd parts of the estimated impulse response can be easily evaluated, and are given by

$$\sigma_{e,o}^2 = \frac{\langle g^2 \rangle \langle x^2 \rangle \pm \langle gx \rangle^2}{2N (\langle g' \rangle \langle x^2 \rangle \pm \langle gx \rangle)^2}. \tag{12}$$

Finally, (5) and (6) show that $\hat{a}_i^e$ and $\hat{a}_i^o$ are uncorrelated (see (11) and (5)), while $\hat{a}_i$ and $\hat{a}_{-i}$ are not. Then we have

$$\mathrm{Var}(\hat{a}_i) = \sigma_e^2 + \sigma_o^2. \tag{13}$$

By a similar analysis we obtain, for $i = 0$,

$$\begin{aligned} \langle \gamma_0 \rangle &= \langle gx \rangle + D_{00}\, a_0; \\ \mathrm{Var}(\gamma_0) &= \frac{\langle g^2 x^2 \rangle - \langle gx \rangle^2}{N}; \\ D_{0j} &= \frac{\partial \langle \gamma_0 \rangle}{\partial a_j} = \begin{cases} \langle g' x^2 \rangle + \langle gx \rangle, & j = 0, \\ 0, & \text{otherwise;} \end{cases} \\ \hat{a}_0 &= \frac{\gamma_0 - \langle gx \rangle}{\langle g' x^2 \rangle + \langle gx \rangle}; \\ \mathrm{Var}(\hat{a}_0) &= \frac{\langle g^2 x^2 \rangle - \langle gx \rangle^2}{N (\langle g' x^2 \rangle + \langle gx \rangle)^2}. \end{aligned} \tag{14}$$
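The first order estimator for a pair of coefficients at lags $i$ and $-i$ can be sketched as follows. This is an illustrative simulation under stated assumptions, not the authors' code: the input is taken as unit-scale Laplacian (so that $\langle x^2 \rangle = 2$, $\langle |x| \rangle = 1$ and $2 f_x(0) = 1$), the nonlinearity is the sign function, and the tap values, record length, seed and the names `estimate_pair` and `demo` are hypothetical.

```python
import random


def sign(v):
    return (v > 0) - (v < 0)


def gen_crosscov(y, g, lag):
    """(1/N) * sum_k g(y_k) * y_{k-lag}, over the available index pairs."""
    n = len(y)
    return sum(g(y[k]) * y[k - lag]
               for k in range(max(0, lag), min(n, n + lag))) / n


def estimate_pair(y, g, lag, d, c):
    """First order estimates of (a_lag, a_-lag).

    d = <g'><x^2> and c = <gx> are input moments assumed known in advance.
    """
    gp = gen_crosscov(y, g, lag)
    gm = gen_crosscov(y, g, -lag)
    det = d * d - c * c
    return (d * gp - c * gm) / det, (d * gm - c * gp) / det


def demo(n=40000, seed=1):
    rng = random.Random(seed)
    # Unit-scale Laplacian input: <x^2> = 2, <|x|> = 1, 2 f_x(0) = 1.
    x = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]
    a_plus, a_minus = 0.05, -0.03  # hypothetical small distortion
    y = [x[k]
         + (a_plus * x[k - 1] if k >= 1 else 0.0)
         + (a_minus * x[k + 1] if k + 1 < n else 0.0)
         for k in range(n)]
    # For g = sgn: <g'> = 2 f_x(0) = 1, so d = 2; and <gx> = <|x|> = 1.
    return estimate_pair(y, sign, 1, d=2.0, c=1.0)
```

With these moments the predicted variance of each estimate is $2/(3N)$, so for the record length used here the estimates should scatter around the true taps with a standard deviation of roughly 0.004, plus a small second order bias.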

Finally, we want to show how our technique can be extended to complex data and PRs. Let us suppose for a while that the input sequence is complex, with iid real and imaginary parts, but the PR is real. In this case we have two independent estimates of the PR, based on real and imaginary parts of $x$ and $y$, respectively. We can average these estimates, thus halving the variance. In fact, we are using $2N$ samples, instead of only $N$. Similar considerations hold if we know that the PR is imaginary. We can average estimates based on the real part of $x$ and the imaginary part of $y$, and vice versa.

Now, let the PR be complex. We shall show in Section 9 that no loss of efficiency is incurred if we estimate the real and the imaginary part of the PR separately. It is left to the reader to verify that the equations read much the same as for real data. Basically, we only have to substitute $g(y_k)$ with $g(\mathrm{Re}\{y_k\}) + j\, g(\mathrm{Im}\{y_k\})$ and $y_{k-i}$ with its complex conjugate $y^*_{k-i}$. The variances of the estimates are halved.

4. Optimization of the nonlinear estimator

Up to now, nothing has been said about the choice of the function $g(y)$. In this section we want to derive the nonlinear function that corresponds to the minimum variances of the estimates $\hat{a}_i$, as given by (12), (13) and (14).

In principle different functions could be used for even and odd parts, and for $a_0$. Equations (12) and (14) are valid if the appropriate functions are used. Therefore, we shall consider three cases separately. The analysis, which is reported in Appendix A, gives the following results (provided $f_x'(x)$ exists; see Section 6 for non-regular pdfs). In all cases the best function $g(y)$ is given by (or is proportional to)

$$g(y) = -\frac{f_x'(y)}{f_x(y)}, \tag{15}$$

where the minus sign has been included for convenience. The function $g(y)$ is known in the statistical literature as the Fisher score function [22].

For the optimum $g(y)$, the moments of interest are given by

$$\begin{aligned} \langle gx \rangle &= 1; \\ t &\equiv \langle g' \rangle \langle x^2 \rangle = \langle g^2 \rangle \langle x^2 \rangle; \\ w &\equiv \langle g^2 x^2 \rangle; \\ \langle g' x^2 \rangle &= \langle g^2 x^2 \rangle - \langle f'' x^2 / f \rangle = w - 2. \end{aligned} \tag{16}$$

The parameters $t$ and $w$, defined above, depend on the pdf $f_x(x)$. The parameter $t$, which is known as Fisher information for location [22], is greater than one for all distributions but the Gaussian one (in that case $t = 1$).

The optimum estimates are

$$\hat{a}_i^{e,o} = \frac{\gamma_i \pm \gamma_{-i}}{2(t \pm 1)}, \tag{17}$$

i.e.,

$$\hat{a}_i = \frac{t \gamma_i - \gamma_{-i}}{t^2 - 1}. \tag{18}$$

The center estimate $\hat{a}_0$ is given by

$$\hat{a}_0 = \frac{\gamma_0 - 1}{w - 1}. \tag{19}$$

The variances are given by

$$\sigma_{e,o}^2 = \frac{1}{2N(t \pm 1)}; \qquad \sigma_e^2 + \sigma_o^2 = \frac{t}{N(t^2 - 1)} \tag{20}$$

and

$$\sigma_0^2 = \frac{1}{N(w - 1)}. \tag{21}$$

Note that the Gaussian distribution is the worst, as far as $\sigma_e^2$ and $\sigma_o^2$ are concerned. In particular, the odd part of the impulse response cannot be estimated.

5. Example. The generalized Gaussian distribution

As an example, let us consider the case of generalized Gaussian distribution of input data, i.e.,

$$f_x(x) = \frac{\alpha}{2 \beta\, \Gamma(1/\alpha)} \exp\left(-|x/\beta|^{\alpha}\right). \tag{22}$$

The analysis is based on formulas presented in Appendix B, and results are shown in Figs. 1 and 2. In particular, Fig. 1 shows the value of $t$ versus the shape parameter $\alpha$ of the distribution. Figure 2 shows 'even' and 'odd' gains $1/(2(t \pm 1))$, or the variances of the estimates, which are given by the same formulas (but for the size $N$ of the sample).

Fig. 1. Values of the parameter $t$ defined by (16) versus the shape parameter $\alpha$ of the generalized Gaussian distribution.
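The dependence of $t$ and $w$ on the shape parameter can be explored with a few lines of code. The closed forms below were obtained for this sketch by applying the optimum score function to the generalized Gaussian density and using $\langle |x|^p \rangle = \beta^p\, \Gamma((p+1)/\alpha) / \Gamma(1/\alpha)$; they are an assumption of the example and may differ in form from the expressions of Appendix B. The function names are hypothetical, and the formula for $t$ is finite only for $\alpha > 1/2$.

```python
from math import gamma, inf


def fisher_t(alpha):
    """t = <g^2><x^2> for the generalized Gaussian (finite for alpha > 1/2)."""
    return (alpha ** 2 * gamma(2.0 - 1.0 / alpha) * gamma(3.0 / alpha)
            / gamma(1.0 / alpha) ** 2)


def moment_w(alpha):
    """w = <g^2 x^2> for the generalized Gaussian."""
    return alpha ** 2 * gamma(2.0 + 1.0 / alpha) / gamma(1.0 / alpha)


def optimum_variances(alpha, n):
    """Return (even, odd, center) variances for the optimum nonlinearity."""
    t, w = fisher_t(alpha), moment_w(alpha)
    var_even = 1.0 / (2.0 * n * (t + 1.0))
    # The odd part cannot be estimated when t = 1 (Gaussian input).
    var_odd = 1.0 / (2.0 * n * (t - 1.0)) if t - 1.0 > 1e-9 else inf
    var_center = 1.0 / (n * (w - 1.0))
    return var_even, var_odd, var_center
```

As a sanity check, the Gaussian case ($\alpha = 2$) gives $t = 1$ and $w = 3$, and the Laplacian case ($\alpha = 1$) gives $t = 2$ and $w = 2$, independently of the scale $\beta$.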

Fig. 2. 'Even' (a) and 'odd' (b) gains $1/(2(t \pm 1))$, or 'even' and 'odd' variances (to be scaled by the size $N$ of the sample), versus the shape parameter $\alpha$, for the optimum nonlinear function.

We have also analyzed the performance of the estimator based on a very simple, fixed nonlinear function, namely $g(y) = \mathrm{sgn}(y)$, which is optimum for $\alpha = 1$. Gains and variances have been evaluated according to the formulas of Section 3 and Appendix B, and are presented in Figs. 3 and 4. We would like to comment on the following points.

Fig. 3. 'Even' (a) and 'odd' (b) gains (see (11)) versus the shape parameter $\alpha$, for the nonlinear function sign.

Fig. 4. 'Even' (a) and 'odd' (b) variances (to be scaled by the size $N$ of the sample), versus the shape parameter $\alpha$, for the nonlinear function sign.

For $\alpha > 2$ the gain for the odd part of the impulse response is negative, since $\langle gx \rangle$ is greater than $\langle g' \rangle \langle x^2 \rangle$. This means that for $\alpha > 2$ the estimated value of $a_i$ should be based more on $\gamma_{-i}$ than on $\gamma_i$ (see (10)).

The simplified estimator is nearly as good as the optimal one for $\alpha < 2$, and not that efficient for $\alpha \gg 2$ (but very large values of $\alpha$ are not likely to be encountered, as discussed in the following section).

We only need the following information about input data: $\langle x^2 \rangle$, $\langle |x| \rangle$ and $\langle 2\delta(x) \rangle = 2 \int \delta(x) f_x(x)\, dx = 2 f_x(0)$ (i.e., the frequency of values around zero). If these values are known only approximately, the estimated PR will turn out biased even for small distortion. However, one can still try to deconvolve the $y$ sequence by iterating PR estimation and deconvolution. At least the sign of the odd gain (i.e., whether $\alpha < 2$ or $\alpha > 2$) must be known. Otherwise, the user is compelled to try both signs.
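The sign change of the odd gain at $\alpha = 2$ can be checked numerically from the three quantities listed above. The sketch below evaluates them in closed form for the generalized Gaussian density, using the standard moments of that density; the function names and the default scale $\beta = 1$ are assumptions of the example.

```python
from math import gamma, inf


def sign_moments(alpha, beta=1.0):
    """Return (<x^2>, <|x|>, 2 f_x(0)) for the generalized Gaussian density."""
    g1 = gamma(1.0 / alpha)
    mean_sq = beta ** 2 * gamma(3.0 / alpha) / g1
    mean_abs = beta * gamma(2.0 / alpha) / g1
    twice_f0 = alpha / (beta * g1)
    return mean_sq, mean_abs, twice_f0


def sign_gains(alpha, beta=1.0):
    """'Even' and 'odd' gains 1 / (2 (<g'><x^2> +/- <gx>)) for g = sgn."""
    mean_sq, mean_abs, twice_f0 = sign_moments(alpha, beta)
    d = twice_f0 * mean_sq  # <g'><x^2>, with <g'> = 2 f_x(0)
    c = mean_abs            # <gx> = <|x|>
    even = 1.0 / (2.0 * (d + c))
    # At alpha = 2 the two moments coincide and the odd gain is infinite.
    odd = 1.0 / (2.0 * (d - c)) if abs(d - c) > 1e-12 else inf
    return even, odd
```

For instance, `sign_gains(1.5)` returns a positive odd gain and `sign_gains(4.0)` a negative one, in agreement with the remark that the sign of the odd gain depends on whether $\alpha$ is below or above 2.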

6. Discontinuous and impulsive distributions

Figure 2 shows that $t$ tends to infinity when $\alpha$ tends to zero or to infinity. Then (18) shows that $a_i$ and $a_{-i}$ are to be estimated separately from $\gamma_i$ and $\gamma_{-i}$, respectively. According to (20), also the variances of the estimates tend to zero, but in practice this looks optimistic. We need not worry about the case $\alpha = 0$, which does not correspond to any meaningful distribution. On the contrary, $\alpha = \infty$ corresponds to the uniform probability density (which is approximately the probability density of data in multilevel digital transmission, for instance). This case is worth some comments and further analysis.

Our first order model is not adequate for that situation. For instance, the optimum function $g(y)$ would look like an ideal barrier (i.e., $g(y) = 0$ for $|y| < Y$; $g(y) = \infty$ for $|y| > Y$). This result relies on the extremely sharp probability density and is bound to vanish as soon as some additive noise, or some intersymbol interference, is included in the model. In other words, the variance is zero only if each $a_i$ is exactly equal to zero, and there is no additive noise. This situation has no practical interest.

A more meaningful model is obtained if we still assume that each $a_i$ is small, while the total interference from adjacent symbols is not. Since we still want to decouple the estimates of different pairs $(a_i, a_{-i})$ from each other, we use the following model to estimate $a_i$ and $a_{-i}$:

$$y_k = x_k + a_i x_{k-i} + a_{-i} x_{k+i} + n_k, \tag{23}$$

where $n_k$ is the sum of all the interfering terms not explicitly included in (23) and of noise (if any). Loosely speaking, we want to expand in a power series around $y_k$, instead of $x_k$.

An optimization similar to the one described in Section 4 turns out to be very difficult. However, suppose we are willing to estimate $a_i$ only from $\gamma_i$ (as suggested by the model we have used up to now in the case $t = \infty$). We obtain by an analysis similar to (7)-(12), neglecting terms containing $a_{-i}$,

$$\hat{a}_i = \frac{\gamma_i}{\langle g'(y) \rangle \langle x^2 \rangle}, \qquad \mathrm{Var}(\hat{a}_i) = \frac{1}{N} \frac{\langle g^2(y) \rangle \langle y^2 \rangle}{(\langle g'(y) \rangle \langle x^2 \rangle)^2}. \tag{24}$$

A simple optimization shows that the minimum value of the variance is obtained for

$$g(y) = -\frac{f_y'(y)}{f_y(y)}. \tag{25}$$

Basically, the probability density $f_y(y)$ of the output of the channel should be used to determine the optimum nonlinear function, instead of the probability density of input data.

Let us consider, for instance, the following very simple case. The data are uniformly distributed ($\alpha = \infty$). There is no channel distortion and some Gaussian noise is added to the data. The probability density $f_y(y)$ can easily be calculated and the value of $t$ can be evaluated numerically. For small noise levels, $t$ is about 0.52 times the square root of the signal-to-noise ratio. In practice, even when $\alpha = \infty$ we shall seldom be interested in values of $t$ greater than some tens.
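The figure of about 0.52 quoted for uniform data in Gaussian noise can be reproduced with a short numerical integration. In the sketch below, $t$ is read as the Fisher information of the output density $f_y$ multiplied by the input variance, which is an assumed interpretation of the text; the function name, the half-width of the uniform density, the noise level and the integration step are likewise illustrative choices.

```python
from math import erfc, exp, pi, sqrt


def t_over_sqrt_snr(half_width=1.0, sigma=0.01, steps_per_sigma=20):
    """Ratio t / sqrt(SNR) for uniform data on [-A, A] plus Gaussian noise."""
    a, s = half_width, sigma

    def cdf(u):   # standard normal distribution function
        return 0.5 * erfc(-u / sqrt(2.0))

    def pdf(u):   # standard normal density
        return exp(-0.5 * u * u) / sqrt(2.0 * pi)

    dy = s / steps_per_sigma
    lo = -a - 6.0 * s
    n_steps = int(round(2.0 * (a + 6.0 * s) / dy))
    fisher = 0.0
    for j in range(n_steps + 1):
        y = lo + j * dy
        f = (cdf((y + a) / s) - cdf((y - a) / s)) / (2.0 * a)        # f_y(y)
        df = (pdf((y + a) / s) - pdf((y - a) / s)) / (2.0 * a * s)   # f_y'(y)
        if f > 0.0:
            fisher += df * df / f * dy   # integrand of <g^2>, g = -f_y'/f_y
    var_x = a * a / 3.0
    snr = var_x / (s * s)
    return fisher * var_x / sqrt(snr)
```

The whole contribution to the integral comes from the two smoothed edges of the density, which is why the result scales with the square root of the signal-to-noise ratio rather than with the ratio itself.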

S. Bellini, F. Rocca / Asymptotically efficient blind deconvolution 201

Another typical situation may be the following. The data have a discrete pdf

$$f_x(x) = \sum_n P_n\,\delta(x - X_n). \tag{26}$$

Therefore, the theory of Section 4 is not applicable. Now, suppose there is some additive noise with small variance $\sigma^2$. Then

$$g(y) = -\frac{f_y'(y)}{f_y(y)} = \frac{1}{\sigma^2}\,\frac{\sum_n P_n\,(y - X_n)\exp(-(y - X_n)^2/2\sigma^2)}{\sum_n P_n\exp(-(y - X_n)^2/2\sigma^2)} \approx \frac{1}{\sigma^2}\,(y - X(y)), \tag{27}$$

where $X(y)$ is the value $X_n$ nearest to $y$. We obtain

$$\langle gy\rangle \approx 0; \qquad \langle g'\rangle \approx 1/\sigma^2; \tag{28}$$

or

$$\hat a_i \approx \frac{\sigma^2\gamma_i}{\langle x^2\rangle} = \frac{\langle (y_k - X(y_k))\,y_{k-i}\rangle}{\langle x^2\rangle},$$

and we see that we do not even need to know $\sigma^2$! If the noise level is not small, $g(y)$ will depend on $\sigma^2$. Similar considerations may prove useful for other non-regular data distributions, e.g., the Bernoulli-Gaussian one.

7. A priori information about the impulse response

Suppose we are given the a priori information that the impulse response is even (or odd). This information can be used in a natural way by estimating only $a_i^e$ (or $a_i^o$). Unfortunately, that situation is not common in applications.

Let us now consider the cases in which we know a priori that $a_i = 0$ for $i < 0$, i.e., that the main pulse is followed and not preceded by small echoes. Equation (9) now reads

$$\langle\gamma_i\rangle = D_{ii}\,a_i; \qquad \langle\gamma_{-i}\rangle = D_{-i,i}\,a_i. \tag{29}$$

We can estimate $a_i$ (for $i > 0$) either from $\gamma_i$ or $\gamma_{-i}$; besides, we can use any linear combination of these estimates. Let us analyze only the case in which the optimum function $g(y)$ is used. Since $D_{ii} = t$ and $D_{-i,i} = 1$, we consider the estimate

$$\hat a_i = u\,\frac{\gamma_i}{t} + (1 - u)\,\gamma_{-i}, \tag{30}$$

where $u$ is arbitrary. The corresponding variance is

$$\mathrm{Var}(\hat a_i) = \frac{1}{N}\left(\frac{u^2}{t} + (1 - u)^2\,t + \frac{2u(1 - u)}{t}\right). \tag{31}$$

Standard calculus shows that, for $t \neq 1$, the variance is minimum for $u = 1$ and is given by

$$\mathrm{Var}(\hat a_i) = \frac{1}{Nt}. \tag{32}$$

Therefore, $a_i$ is to be estimated only from $\gamma_i$, while $\gamma_{-i}$ is useless (however, this would not hold any more if a non-optimum function $g(y)$ was used).

When the distribution is Gaussian ($t = 1$), $\gamma_i$ and $\gamma_{-i}$ coincide, $u$ is arbitrary and it can be verified that (32) still holds. Note that the a priori knowledge that the impulse response is causal reduces the variance of the estimate, so that the impulse response can be estimated in phase and amplitude also in the Gaussian case. In fact, this Gaussian case is trivial, since our hypotheses ($a_0 \approx 1$, $a_i \approx 0$, $a_{-i} = 0$) imply that the channel is minimum phase, i.e., identifiable even using the usual linear theory.
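The trade-off between the two estimates in the causal case can be checked numerically. The sketch below assumes the variance of the combined estimate has the form $\frac{1}{N}\left(u^2/t + (1-u)^2 t + 2u(1-u)/t\right)$, as read from (31); the function names and the search grid are illustrative choices only.

```python
def combined_variance(u, t, n_samples):
    """Variance of u*gamma_i/t + (1-u)*gamma_{-i}, assuming the optimum g(y)."""
    return (u * u / t + (1.0 - u) ** 2 * t + 2.0 * u * (1.0 - u) / t) / n_samples

def best_weight(t, n_samples, grid=2001):
    """Grid search over u in [0, 2] for the weight minimizing the variance."""
    candidates = [2.0 * k / (grid - 1) for k in range(grid)]
    return min(candidates, key=lambda u: combined_variance(u, t, n_samples))

u_star = best_weight(t=2.0, n_samples=1000)            # t = 2: Laplacian input
v_star = combined_variance(u_star, 2.0, 1000)          # should equal 1/(N t)
v_gauss = combined_variance(0.3, 1.0, 1000)            # t = 1: independent of u
```

For $t = 2$ the search should return $u = 1$, in agreement with (32), while for $t = 1$ the variance is the same for any $u$.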

8. Comparison with maximum likelihood estimation

Maximum likelihood is another approach to estimate the PR of the channel [22].

For small distortion, we could expand the logarithm of the likelihood function, i.e., of the pdf of the observed vector, given the channel PR, in a Taylor series, and truncate it to the second term

$$\log f_y(\mathbf{y}/\mathbf{a}) = c_0 + \mathbf{c}_1^T\mathbf{a} + \tfrac{1}{2}\,\mathbf{a}^T\mathbf{C}_2\,\mathbf{a}, \tag{33}$$

where bold variables are used to represent vectors or matrices, if capital.


With this approximation, the vector $\hat{\mathbf{a}}$ that maximizes (33) is the ML estimate. We have

$$\hat{\mathbf{a}} = -\mathbf{C}_2^{-1}\mathbf{c}_1. \tag{34}$$

The convolutional model and the hypotheses about the impulse response lead to the following approximate inversion of the convolution operation, which (apart from edge effects, since the sequences are truncated to $N$ terms) is correct to the first order:

$$x_k = y_k - \sum_i a_i x_{k-i} \approx y_k - \sum_i a_i y_{k-i} \equiv \hat x_k. \tag{35}$$

Besides, the following approximation is correct to the first order too:

$$f_y(\mathbf{y}/\mathbf{a}) = f_x(\hat{\mathbf{x}}). \tag{36}$$

In (36) $f_x$ is given by the product of $N$ terms, since the input data are independent. Therefore we easily obtain the $i$th component of the vector $\mathbf{c}_1$:

$$c_{1i} = \frac{\partial\log f_y(\mathbf{y}/\mathbf{a})}{\partial a_i} = \sum_k -\frac{f_x'(y_k)}{f_x(y_k)}\,y_{k-i} = N\gamma_i, \tag{37}$$

where the derivatives have been evaluated for $\mathbf{a} = 0$. Thus, also the ML estimator makes use of the set $\gamma_i$. Unfortunately, the computation of the matrix $\mathbf{C}_2$, and of its inverse, is not easy since second order approximations are needed.

Instead of computing the inverse of the matrix C2, our method simply imposes unbiasedness. Since maximum likelihood estimates are unbiased for large samples, we argue that our estimate is approximately ML.
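As an illustration of how the statistics $\gamma_i$ of (37) are used, the following sketch simulates a causal channel with a single small echo. Everything specific here is an assumption made for the example: unit-scale Laplacian input (for which $g = \mathrm{sgn}$ and $t = 2$), the echo value, the sample size and the helper names.

```python
import random

def laplacian(rng):
    """Unit-scale Laplacian sample (generalized Gaussian, alpha = 1, beta = 1)."""
    return rng.expovariate(1.0) * rng.choice((-1.0, 1.0))

def sign(v):
    return (v > 0) - (v < 0)

def gamma_stat(y, lag):
    """gamma_lag = (1/N) sum_k g(y_k) y_{k-lag} with g = sign; edge terms are skipped."""
    n = len(y)
    total = sum(sign(y[k]) * y[k - lag] for k in range(n) if 0 <= k - lag < n)
    return total / n

rng = random.Random(0)
N, a1, t = 50000, 0.05, 2.0            # hypothetical echo a_1; t = 2 for the Laplacian
x = [laplacian(rng) for _ in range(N)]
y = [x[k] + (a1 * x[k - 1] if k > 0 else 0.0) for k in range(N)]

a1_from_causal_lag = gamma_stat(y, 1) / t      # divides by t
a1_from_anticausal_lag = gamma_stat(y, -1)     # divides by 1
```

Both quantities should be close to the true echo, with the first one showing the smaller scatter, as expected from the variance $1/(Nt)$ obtained for causal channels.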

If the data are, say, a Markov process of given order, we can still follow the same approach, but a different nonlinear function will be required, since the gradient is a different function of the data [12].

9. Comparison with the Cramer-Rao bound

Let the parameter that we want to estimate be a vector with $M$ components $a_i$. Suppose we have observed one vector with $N$ components $y_k$ ($N \gg M$).

The Cramer-Rao bound for unbiased estimators of the set of parameters $a_i$ can be obtained from the first derivatives of the log-likelihood function with respect to each $a_i$. Since we are considering only the first order approximation, we evaluate the bound for channels without distortion ($a_i = 0$). The derivatives are given by (37), substituting $y_k$ with $x_k$. The covariances of the derivatives are

$$\mathrm{Cov}\left[\frac{\partial\log f}{\partial a_i}, \frac{\partial\log f}{\partial a_j}\right] = \begin{cases} Nt, & i = j \neq 0,\\ N, & i = -j \neq 0,\\ N(w - 1), & i = j = 0,\\ 0, & \text{otherwise.}\end{cases} \tag{38}$$

As is well-known, this means that optimum joint estimates of $a_i$ and $a_j$ ($i \neq j$, $i \neq -j$) are not better than independent estimates. On the contrary, $a_i$ and $a_{-i}$ should be jointly estimated.

If, instead of $a_i$ and $a_{-i}$, we consider their even and odd parts $a_i^e$ and $a_i^o$, we easily find by a change of variables

$$\frac{\partial\log f}{\partial a_i^{e,o}} = N(\gamma_i \pm \gamma_{-i}), \tag{39}$$

and the two derivatives are now uncorrelated. Therefore, even and odd parts can be estimated separately. Finally, the Cramer-Rao bound for unbiased estimates is given by

$$\mathrm{Var}(\hat a_i^{e,o}) \geq \frac{1}{\langle N^2(\gamma_i \pm \gamma_{-i})^2\rangle} = \frac{1}{2N(t \pm 1)} \quad (i \neq 0), \qquad \mathrm{Var}(\hat a_0) \geq \frac{1}{N(w - 1)}. \tag{40}$$

When the channel impulse response is causal, the coefficients $a_i$ ($i > 0$) can be estimated separately and the Cramer-Rao bound is

$$\mathrm{Var}(\hat a_i) \geq \frac{1}{\langle N^2\gamma_i^2\rangle} = \frac{1}{Nt}. \tag{41}$$


We conclude that, at least in a first order approximation, the channel estimator proposed in this paper is efficient, since it achieves (apart from edge effects) the Cramer-Rao bound.
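For quick reference, the bounds of this section can be collected in a small helper. This is a sketch only: the function name and the dictionary keys are illustrative, and the values $t = 2$, $w = 2$ used in the example are those of a Laplacian input.

```python
def cramer_rao_bounds(t, w, n_samples):
    """First-order bounds: even/odd parts (i != 0), center sample, causal coefficients."""
    return {
        "even": 1.0 / (2.0 * n_samples * (t + 1.0)),
        # For t = 1 (Gaussian input) the odd part has no finite bound.
        "odd": 1.0 / (2.0 * n_samples * (t - 1.0)) if t > 1.0 else float("inf"),
        "center": 1.0 / (n_samples * (w - 1.0)),
        "causal": 1.0 / (n_samples * t),
    }

bounds = cramer_rao_bounds(t=2.0, w=2.0, n_samples=2000)
```

The infinite bound on the odd part for $t = 1$ reflects the fact that, without a priori information, the phase cannot be recovered from Gaussian data.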

To conclude this section, we consider the case of complex data and PRs. Equation (35) is unchanged, and (36) still holds if $\mathbf{x}$ and $\mathbf{y}$ are considered $2N$-dimensional vectors, whose components are the real and imaginary parts of input and output sequences.

In (37) we have terms like $g(\mathrm{Re}\{y_k\})$, $g(\mathrm{Im}\{y_k\})$, $\mathrm{Re}\{y_{k-i}\}$ and $\mathrm{Im}\{y_{k-i}\}$. It can be verified that the derivatives (37) with respect to the real and imaginary parts of $a_i$ are given by the real and imaginary parts of

$$\sum_k \left(g(\mathrm{Re}\{y_k\}) + j\,g(\mathrm{Im}\{y_k\})\right) y_{k-i}^*, \tag{42}$$

respectively. Provided $N$ is substituted by $2N$, it can be verified that (38) still gives the covariances of the derivatives of the log-likelihood function with respect to the real (imaginary) parts of $a_i$ and $a_j$. Mixed covariances (i.e., one derivative with respect to the real part and the other one with respect to the imaginary part) are zero. Again, this means that real and imaginary parts of $a_i$ can be estimated separately, without loss of efficiency. Thus, the estimates given in Section 3 are asymptotically efficient if the optimal nonlinear function $g(y)$ is used.

The variance of the estimates is halved, as expected since we are using $2N$ (real and imaginary) samples instead of $N$.

10. Estimation of parameter dependent distortions

An easy generalization of the theory presented before covers the case of parameter dependent channel distortions.

Let us suppose that the PR of the channel can be described as

$$a_i = \sum_{l=1}^{L} W_{il}\,\zeta_l, \quad \text{i.e.,} \quad \mathbf{a} = \mathbf{W}\boldsymbol{\zeta}, \tag{43}$$

where the rectangular matrix $\mathbf{W}$ is known a priori, and only the vector of parameters $\boldsymbol{\zeta}$ has to be estimated. Making use of the fact that a linear transformation of an efficient estimate is also efficient [22], it follows that an estimate $\hat{\boldsymbol{\zeta}}$ is

$$\hat{\boldsymbol{\zeta}} = (\mathbf{W}^T\mathbf{W})^{-1}\mathbf{W}^T\hat{\mathbf{a}}. \tag{44}$$

As an example of a very simple case with only one parameter, let us consider the channel distortion due to an unknown and small phase rotation $\zeta$. The corresponding impulse response is

$$a_i = -\zeta h_i, \tag{45}$$

where $h_i$ is the discrete time domain representation of the Hilbert transform

$$h_i = \begin{cases} 2/(\pi i), & i \text{ odd},\\ 0, & i \text{ even.}\end{cases} \tag{46}$$

Hence

$$\hat\zeta = -\frac{\sum_{i=1}^{\infty} h_i\,\hat a_i^o}{\sum_{i=1}^{\infty} h_i^2} = -\frac{4}{\pi}\sum_{i=1}^{\infty}\frac{\hat a_{2i-1}^o}{2i - 1}, \tag{47}$$

and the variance of the estimate could be easily derived.
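The one-parameter example can be sketched in a few lines. The code below is an illustration under assumptions: the odd parts are available only for a finite number of lags (so the sums are truncated, in both numerator and denominator), the input values are noise-free, and the names `hilbert_tap` and `phase_rotation_estimate` are hypothetical.

```python
import math

def hilbert_tap(i):
    """Sample h_i of the discrete Hilbert transformer: 2/(pi*i) for odd i, 0 for even i."""
    return 2.0 / (math.pi * i) if i % 2 else 0.0

def phase_rotation_estimate(a_odd):
    """Least-squares estimate of the rotation; a_odd[i-1] is the odd part at lag i >= 1."""
    num = sum(hilbert_tap(i) * a for i, a in enumerate(a_odd, start=1))
    den = sum(hilbert_tap(i) ** 2 for i in range(1, len(a_odd) + 1))
    return -num / den

zeta = 0.02                                              # hypothetical small rotation
a_odd = [-zeta * hilbert_tap(i) for i in range(1, 52)]   # noise-free odd parts, lags 1..51
zeta_hat = phase_rotation_estimate(a_odd)
```

With noise-free odd parts the estimate returns the rotation exactly; with estimated odd parts, the weights $h_i$ simply average the information available at the odd lags.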

11. Simulations

When the channel distortion is not small, the estimated impulse response becomes biased and its variance increases. We simulated simple channels, with only one sample $a_i \neq 0$. Figure 5 shows the estimated values $\hat a_i$ and $\hat a_{-i}$ versus $a_i$, for $i \neq 0$, and for various values of $\alpha$. For large distortions, some coupling between $\hat a_{-i}$ (which should be equal to zero) and $\hat a_i$ exists. These results were independent of the index $i$. Also $\hat a_0$ suffers from some bias, shown in Fig. 6. No bias has been observed in the other samples of the impulse response.

Figure 7 shows the value of $\mathrm{Var}(\hat a_i)$ ($i \neq 0$), normalized to the small distortion value given by (20), for various values of $\alpha$. Again no dependence on $i$ was observed.
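The growth of the bias with the distortion can be reproduced qualitatively with a short Monte Carlo run. This is not the authors' experiment: it is a sketch assuming a unit-scale Laplacian input ($\alpha = 1$, $g = \mathrm{sgn}$, $t = 2$), a single echo at lag 1, and hypothetical echo values, sample size and seed.

```python
import random

def estimate_a1(a1, n_samples, seed):
    """Estimate a_1 as gamma_1 / t with g = sign, for y_k = x_k + a1 * x_{k-1}."""
    rng = random.Random(seed)
    x_prev, y_prev, acc = 0.0, 0.0, 0.0
    for _ in range(n_samples):
        x = rng.expovariate(1.0) * rng.choice((-1.0, 1.0))   # Laplacian sample
        y = x + a1 * x_prev
        acc += ((y > 0) - (y < 0)) * y_prev                  # g(y_k) * y_{k-1}
        x_prev, y_prev = x, y
    return acc / n_samples / 2.0                             # divide by t = 2

# Ratio of estimated to true echo for a small and a large distortion.
ratios = {a1: estimate_a1(a1, 50000, seed=1) / a1 for a1 in (0.1, 0.6)}
```

The ratio for the larger echo should be markedly lower than for the smaller one, which is the kind of bias that motivates iterating the estimate.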

Fig. 5. Mean value of $\hat a_i$ (a) and $\hat a_{-i}$ (b) versus $a_i$, for various values of $\alpha$.

Fig. 6. Mean value of $\hat a_0$ versus $a_i$, for various values of $\alpha$.

Fig. 7. $\mathrm{Var}(\hat a_i)$ ($i \neq 0$), normalized to the small distortion value, versus $a_i$, for various values of $\alpha$.

Fig. 8. Iterative deconvolution of a causal channel (columns: iteration number, (residual) impulse response, estimated impulse response, inverse filter).

Fig. 9. Iterative deconvolution of a non-causal channel (same columns as Fig. 8).

Finally, Figs. 8 and 9 show two examples of deconvolution for a causal channel and for a non-causal one, respectively. Since the distortion is not small, the estimated PR is only an approximation of the actual one. Therefore, after deconvolution by the inverse of the estimated PR, we iterated our technique. In the figures, the first column shows the channel PR (at the first iteration) or the residual PR after deconvolution (from the second iteration on). The second column shows the estimated PR, for values of the index $i$ ranging from $-5$ to 5. Finally, the third column shows the inverse filter. If it is known a priori that the inverse filter is causal, time domain techniques are used to update the inverse filter. Otherwise, frequency domain techniques are used. 2000 samples of the channel output are used. The impulse response of the causal channel shown in Fig. 8 is 0.7, 0.7 and 0.4. The value of the shape parameter $\alpha$ of the generalized Gaussian driving distribution is 1. The optimum nonlinear function, namely $g(y) = \mathrm{sgn}(y)$, is used.

The channel of Fig. 9 is obtained with two causal poles, located at $0.5\exp(\pm j1)$, and one acausal pole, located at $2\exp(j0)$. The value of the shape parameter $\alpha$ is 3. The optimum nonlinear function, namely $g(y) = \mathrm{sgn}(y)\,y^2$ (see (72)), is used. The sequence to be deconvolved is normalized before each iteration, so that its power is equal to one.

In both cases, convergence is obtained in a few iterations. As expected, the residual impulse response is much noisier for $\alpha = 3$ than for $\alpha = 1$.

12. Comparison with polyspectral techniques

We have described a technique that allows the estimation of the phase and amplitude response of the channel without any knowledge of input data (except for their pdf), and we have shown that it is efficient for small distortion. Polyspectral techniques also lead to unbiased estimates of the channel phase, without even assuming that the impulse response (or the transfer function) has a finite number of parameters. Besides, they do not need any information about the pdf of input data. Therefore, we expect polyspectral methods to be more robust with respect to the input pdf, but noisier than the method proposed in this paper. A comparison between the two techniques is useful to clarify the reciprocal advantages and disadvantages.

Let us consider the bispectrum of the zero mean time series $y_k$ [14],

$$B(\lambda, \mu) = \sum_i\sum_n c(i, n)\exp(-j(\lambda i + \mu n)), \tag{48}$$

where $c(i, n) = \langle y_0 y_i y_n\rangle$ is estimated from the observed sequence. The bispectrum is therefore the two-dimensional Fourier transform of the third order moments of the observed data. Higher order spectra are defined as multidimensional transforms of higher order cumulants and can be analyzed in a similar way.

The phase information can be recovered from the argument of the bispectrum, or, should the bispectrum be identically zero, from higher order spectra. In fact the following relation holds for the argument of the bispectrum [14]:

$$\arg B(\lambda, \mu) = \arg A(\lambda) + \arg A(\mu) - \arg A(\lambda + \mu), \tag{49}$$

where

$$A(\lambda) = \sum_i a_i\exp(-j\lambda i), \tag{50}$$

and similar results hold for polyspectra as well. These equations can be used in many ways to obtain unbiased estimates of the unknown phase of the transfer function of the channel, provided of course that the polyspectra are not all identically zero, i.e., the pdf of input data is not Gaussian.

However, polyspectral estimates tend to be very noisy and reliable phase estimation can be achieved only in simple cases with very long data sequences. In order to understand how this unfavorable situation comes out, let us sketch the derivation of (49). We have

$$\langle y_0 y_i y_n\rangle = \sum_m\sum_l\sum_j \langle x_m x_l x_j\rangle\,a_{-m}a_{i-l}a_{n-j} = \langle x^3\rangle\sum_m a_m a_{m+i}a_{m+n}, \tag{51}$$

and

$$B(\lambda, \mu) = \langle x^3\rangle\sum_i\sum_n\sum_m a_m a_{m+i}a_{m+n}\exp(-j(\lambda i + \mu n)) = \langle x^3\rangle A(\lambda)A(\mu)A(-\lambda - \mu). \tag{52}$$

Then (49) follows easily, provided $\langle x^3\rangle \neq 0$, of course. The correlations $\langle y_0 y_i y_n\rangle$ are estimated from the data, as averages of terms like $y_k y_{k+i} y_{k+n}$. When $i$, $n$ and $i - n$ are large, the factors are approximately independent and contribute very little to phase estimation; of course they contribute to the noise in the estimated bispectrum. The situation can be worse for higher order spectra, so that it may be advisable to use the lowest order spectrum that can do the job. In any case, to obtain a clean estimate of the spectrum, some form of windowing is necessary. Windows, on the other hand, produce biased estimates (see, for instance, [1] for a thorough discussion of bispectrum estimation).
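The factorization of the bispectrum into three transfer function values can be verified numerically for a short impulse response. The sketch below uses exact third order moments rather than estimates from data; the impulse response, the frequencies and the value of $\langle x^3\rangle$ are hypothetical, as are the function names.

```python
import cmath

def transfer(a, lam):
    """A(lambda) = sum_i a_i exp(-j*lambda*i); `a` maps index i to a_i."""
    return sum(v * cmath.exp(-1j * lam * i) for i, v in a.items())

def bispectrum_from_moments(a, lam, mu, third_moment):
    """Two-dimensional transform of c(i, n) = <x^3> sum_m a_m a_{m+i} a_{m+n}."""
    idx = sorted(a)
    span = idx[-1] - idx[0]
    total = 0j
    for i in range(-span, span + 1):
        for n in range(-span, span + 1):
            c = third_moment * sum(a[m] * a.get(m + i, 0.0) * a.get(m + n, 0.0) for m in idx)
            total += c * cmath.exp(-1j * (lam * i + mu * n))
    return total

a = {-1: 0.2, 0: 1.0, 1: 0.5, 2: -0.3}      # hypothetical non-causal impulse response
lam, mu, m3 = 0.7, -1.1, 2.0                # hypothetical frequencies and <x^3>
lhs = bispectrum_from_moments(a, lam, mu, m3)
rhs = m3 * transfer(a, lam) * transfer(a, mu) * transfer(a, -lam - mu)
```

The two sides agree to rounding error. In practice $c(i, n)$ must be estimated from data, and it is this estimation step that introduces the noise discussed above.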

Once an estimate of the bispectrum has been obtained, (49) has to be used to get the phase of the channel. Different ways for its use have been proposed [15], since the equations are overdetermined. In some sense, one could say that a careful use of (49) is needed to reduce the noise introduced by the large number of terms that are not very useful in phase estimation, but that make the estimate unbiased for all possible distortions.

We can notice now that the method that we are proposing makes use of terms of the form $g(y_k)y_{k-i}$, that are the richest in information (at least for small distortions). Thus, the minimal quantity of noise is introduced and the estimate is efficient. On the other hand, to do so we need to know the pdf of input data. Besides, we lose unbiasedness for high values of channel distortion.

The interest in higher-order spectral analysis is still great [24]. In practice, even when the pdf of input data is known, polyspectral analysis may be a useful tool for the initial stage of deconvolution, to be then followed by the efficient estimate proposed in this paper.

13. Conclusions

We have evaluated the performance of a technique for blind deconvolution, which is efficient when the input sequence is iid, its pdf is known and the channel distortion is small. In principle, this approach can be extended to non-iid sequences, say Markov sequences of various orders [12]. Moreover, we have shown how to estimate parameter dependent waveforms; for channels with greater distortions, more work has to be carried out to ensure efficiency.

We have also shown that a simple nonlinear function such as the function sign can be good enough in many cases.

The polyspectral approach, notwithstanding its inefficiency, is able to overcome the problem of the biased estimates, which could be noticeable when the channel distortion is great. Then, if the pdf of input data is known, the method proposed in this paper is able to clean up the noise left by polyspectral techniques (or whichever preferred blind deconvolution method).

Appendix A

For $i \neq 0$, in order to find the minimum of the variance of even and odd estimates, we want to minimize the numerator of (12)

$$\langle g^2\rangle\langle x^2\rangle \pm \langle gx\rangle^2 = \langle x^2\rangle\int g^2(x)f_x(x)\,dx \pm \left(\int xg(x)f_x(x)\,dx\right)^2, \tag{53}$$

with the constraint that the denominator is constant:

$$\langle g'\rangle\langle x^2\rangle \pm \langle gx\rangle = \langle x^2\rangle\int g'(x)f_x(x)\,dx \pm \int xg(x)f_x(x)\,dx = \text{const}. \tag{54}$$

Taking into account the constraint by a Lagrange multiplier $\lambda$, we get the solution of this variational problem by the Euler condition

$$2\langle x^2\rangle g(x)f_x(x) \pm 2\mu xf_x(x) - \lambda\langle x^2\rangle f_x'(x) \pm \lambda xf_x(x) = 0, \tag{55}$$

where

$$\mu = \int xg(x)f_x(x)\,dx \tag{56}$$

is a constant to be determined. From (55) we obtain

$$g(x) = Ax + B\,\frac{f_x'(x)}{f_x(x)}, \tag{57}$$

where $A$ and $B$ are unknown constants.

Now, let us consider the estimation of the odd part. It is apparent that the term $Ax$ is irrelevant, since it disappears in the difference $\gamma_i - \gamma_{-i}$ (see (3)). Therefore

$$g(x) = -\frac{f_x'(x)}{f_x(x)} \tag{58}$$

can be chosen. Since

$$\langle gx\rangle = -\int xf_x'(x)\,dx = \int f_x(x)\,dx = 1; \qquad \langle x^2\rangle\langle g'\rangle = \langle x^2\rangle\left\langle -\frac{f_x''}{f_x} + \left(\frac{f_x'}{f_x}\right)^2\right\rangle = -\langle x^2\rangle\int f_x''(x)\,dx + t = t, \tag{59}$$

we get from (12)

$$\sigma_o^2 = \frac{1}{2N(t - 1)}. \tag{60}$$

For the even part, let us substitute (57) back in (12). With some algebra we obtain

$$\sigma_e^2 = \frac{1}{2N}\,\frac{2u^2 - 4u + t + 1}{(2u - t - 1)^2}, \tag{61}$$

where $u = A\langle x^2\rangle/B$. Standard calculus shows that, for $t \neq 1$, the minimum of $\sigma_e^2$ is obtained for $u = 0$, i.e. for $A = 0$, and we get

$$\sigma_e^2 = \frac{1}{2N(t + 1)}. \tag{62}$$

By (59) and the Schwarz inequality we obtain

$$1 = \left\langle -xf_x'/f_x\right\rangle^2 \leq \langle x^2\rangle\left\langle (f_x'/f_x)^2\right\rangle = t, \tag{63}$$

with equality iff $-f_x'(x)/f_x(x)$ is proportional to $x$, i.e., for a Gaussian distribution. In that case the two terms in (57) are proportional to each other, and from (12) we get $\sigma_e^2 = 1/4N$, which agrees with (62).

Finally, let us consider the optimization of the variance of the 'center' estimate. We want to find the minimum of

$$\langle g^2x^2\rangle - \langle gx\rangle^2 = \int x^2g^2(x)f_x(x)\,dx - \left(\int xg(x)f_x(x)\,dx\right)^2, \tag{64}$$

with the constraint

$$\langle g'x^2\rangle + \langle gx\rangle = \int x^2g'(x)f_x(x)\,dx + \int xg(x)f_x(x)\,dx = \text{const}. \tag{65}$$

We find the Euler condition

$$2g(x)x^2f_x(x) - 2\mu xf_x(x) - \lambda\left(2xf_x(x) + x^2f_x'(x)\right) + \lambda xf_x(x) = 0, \tag{66}$$

i.e.,

$$g(x) = \frac{A}{x} + B\,\frac{f_x'(x)}{f_x(x)}. \tag{67}$$

Now, the term $A/x$ would only add a constant to $\gamma_0$, to be subtracted (see (14)). Therefore we can choose

$$g(x) = -\frac{f_x'(x)}{f_x(x)}. \tag{68}$$

Since

$$\langle g'x^2\rangle = -\int x^2f_x''(x)\,dx + \left\langle (xf_x'/f_x)^2\right\rangle = -2 + w, \tag{69}$$

we obtain, with some algebra,

$$\sigma_0^2 = \frac{1}{N(w - 1)}. \tag{70}$$

Appendix B

Let us give some useful formulas for generalized Gaussian distributions. Let

$$f_x(x) = \frac{\alpha}{2\beta\,\Gamma(1/\alpha)}\exp\left(-|x/\beta|^\alpha\right) \tag{71}$$

be the probability density of a generalized Gaussian random variable with shape parameter $\alpha$ and scale parameter $\beta$. Then we have

$$-\frac{f_x'(x)}{f_x(x)} = \frac{\alpha\,\mathrm{sgn}(x)\,|x|^{\alpha - 1}}{\beta^\alpha}, \tag{72}$$

and the moments of the random variable are given by

$$\langle |x|^\lambda\rangle = \frac{\Gamma((\lambda + 1)/\alpha)}{\Gamma(1/\alpha)}\,\beta^\lambda. \tag{73}$$

From these formulas we obtain, for instance,

$$\langle x^2\rangle = \frac{\Gamma(3/\alpha)\,\beta^2}{\Gamma(1/\alpha)}, \tag{74}$$

$$\langle |x|\rangle = \frac{\Gamma(2/\alpha)\,\beta}{\Gamma(1/\alpha)}, \tag{75}$$

$$t = \langle x^2\rangle\left\langle (f_x'/f_x)^2\right\rangle = \frac{\alpha^2\,\Gamma(2 - 1/\alpha)\,\Gamma(3/\alpha)}{\Gamma^2(1/\alpha)}, \tag{76}$$

$$w = \left\langle (xf_x'/f_x)^2\right\rangle = \frac{\alpha^2\,\Gamma(2 + 1/\alpha)}{\Gamma(1/\alpha)} = \alpha + 1, \tag{77}$$

and so on.
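The Gamma-function expressions for $t$ and $w$ are easy to tabulate. The sketch below assumes the forms $t = \alpha^2\Gamma(2 - 1/\alpha)\Gamma(3/\alpha)/\Gamma^2(1/\alpha)$ and $w = \alpha^2\Gamma(2 + 1/\alpha)/\Gamma(1/\alpha)$; the function names are illustrative.

```python
import math

def gg_t(alpha):
    """t for a generalized Gaussian with shape parameter alpha (requires alpha > 1/2)."""
    g1 = math.gamma(1.0 / alpha)
    return alpha ** 2 * math.gamma(2.0 - 1.0 / alpha) * math.gamma(3.0 / alpha) / g1 ** 2

def gg_w(alpha):
    """w from its Gamma-function form; it should reduce to alpha + 1."""
    return alpha ** 2 * math.gamma(2.0 + 1.0 / alpha) / math.gamma(1.0 / alpha)

# (t, w) for the Laplacian, Gaussian and alpha = 3 cases.
table = {alpha: (gg_t(alpha), gg_w(alpha)) for alpha in (1.0, 2.0, 3.0)}
```

The Gaussian case gives $t = 1$ and the Laplacian case gives $t = 2$; neither value depends on the scale parameter $\beta$.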

References

[1] V.G. Alekseev, "Some aspects of estimation of the bispectral density of a stationary stochastic process", Problems Inform. Transmission, Vol. 19, No. 3, July-September 1983, pp. 204-214.

[2] S. Bellini, "Bussgang techniques for blind equalization", Proc. IEEE Global Telecommunications Conference, Houston, TX, 1-4 December 1986, pp. 1634-1640.

[3] S. Bellini and F. Rocca, "Blind deconvolution: Polyspectra or Bussgang techniques?", in: E. Biglieri and G. Prati, eds., Digital Communications, Elsevier Science Publishers, Amsterdam, 1986, pp. 251-263.

[4] A. Benveniste and M. Goursat, "Blind equalizers", IEEE Trans. Commun., Vol. COM-32, No. 8, August 1984, pp. 871-883.

[5] A. Benveniste, M. Goursat and G. Ruget, "Robust identification of a nonminimum phase system: Blind adjustment of a linear equalizer in data communications", IEEE Trans. Automat. Control, Vol. AC-25, No. 3, June 1980, pp. 385-399.

[6] C. Cafforio, C. Prati and F. Rocca, "Full resolution focusing of SEASAT images in the f-k domain", Internat. J. Remote Sensing, to appear.

[7] D. Donoho, "On minimum entropy deconvolution", in: D. Findley, ed., Applied Time Series Analysis II, Academic Press, New York, 1981, pp. 556-608.

[8] G.J. Foschini, "Equalizing without altering or detecting data", AT&T Tech. J., Vol. 64, No. 8, October 1985, pp. 1885-1911.

[9] D.N. Godard, "Self recovering equalization and carrier tracking in two-dimensional data communication systems", IEEE Trans. Commun., Vol. COM-28, No. 11, November 1980, pp. 1867-1875.

[10] R. Godfrey and F. Rocca, "Zero memory non-linear deconvolution", Geophys. Prospecting, Vol. 29, No. 4, April 1981, pp. 189-228.

[11] W. Gray, "Variable norm deconvolution", Ph.D. Dissertation, Stanford University, Stanford, CA, 1979.

[12] C. Kostov and F. Rocca, "Estimation of residual wavelets", in: M. Bernabini et al., eds., Deconvolution and Inversion, Blackwell, London, 1987, pp. 126-145.

[13] F.K. Li, D. Held, J. Curlander and C. Wu, "Doppler parameter estimation for spaceborne synthetic aperture radars", IEEE Trans. Geosci. Remote Sensing, Vol. GE-23, No. 1, January 1985, pp. 47-51.

[14] K.S. Lii and M. Rosenblatt, "Deconvolution and estimation of transfer function phase and coefficients for non-Gaussian linear processes", Ann. Statist., Vol. 10, 1982, pp. 1195-1208.

[15] T. Matsuoka and T.J. Ulrych, "Phase estimation using the bispectrum", Proc. IEEE, Vol. 72, No. 10, October 1984, pp. 1403-1411.

[16] A.V. Oppenheim and R.W. Schafer, Digital Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1975, Chapter 10, pp. 524-527.

[17] Y. Sato, "A method of self-recovering equalization for multilevel amplitude-modulation systems", IEEE Trans. Commun., Vol. COM-23, No. 6, June 1975, pp. 679-682.

[18] T. Stockham, T. Cannon and R. Ingebretsen, "Blind deconvolution through digital signal processing", Proc. IEEE, Vol. 63, No. 4, April 1975, pp. 678-692.

[19] A.T. Walden, "Non Gaussian reflectivity, entropy, and deconvolution", Geophysics, Vol. 50, No. 12, December 1985, pp. 2862-2888.

[20] R.A. Wiggins, "Minimum Entropy Deconvolution", Geoexploration, Vol. 16, No. 1, February 1978, pp. 21-35.

[21] R.A. Wiggins, "Minimum entropy deconvolution", Presented at the 39th Meeting of the European Association of the Exploration Geophysicists, 1977.

[22] S. Wilks, Mathematical Statistics, Wiley, New York, 1962.

[23] J.L. Yen, "Image reconstruction in synthesis radiotelescope arrays", in: S. Haykin, Ed., Array Signal Processing, Prentice-Hall, Englewood Cliffs, NJ, 1985, pp. 293-350.

[24] Proc. Workshop on higher-order spectral analysis, Vail, CO, 28-30 June 1989.