Publisher: Taylor & Francis

Communications in Statistics - Theory and Methods
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lsta20

Tikhonov’s Regularization to the Deconvolution Problem
Dang Duc Trong (a), Cao Xuan Phuong (b), Truong Trung Tuyen (c) & Dinh Ngoc Thanh (a)

(a) Faculty of Mathematics and Computer Science, Ho Chi Minh City National University, Ho Chi Minh City, Viet Nam
(b) Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Viet Nam
(c) Department of Mathematics, Indiana University, Bloomington, Indiana, USA

Accepted author version posted online: 11 Jun 2014. Published online: 30 Sep 2014.

To cite this article: Dang Duc Trong, Cao Xuan Phuong, Truong Trung Tuyen & Dinh Ngoc Thanh (2014) Tikhonov’s Regularization to the Deconvolution Problem, Communications in Statistics - Theory and Methods, 43:20, 4384-4400, DOI: 10.1080/03610926.2012.721916

To link to this article: http://dx.doi.org/10.1080/03610926.2012.721916


Communications in Statistics - Theory and Methods, 43: 4384–4400, 2014
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-0926 print / 1532-415X online
DOI: 10.1080/03610926.2012.721916

Tikhonov’s Regularization to the Deconvolution Problem

DANG DUC TRONG,1 CAO XUAN PHUONG,2 TRUONG TRUNG TUYEN,3 AND DINH NGOC THANH1

1 Faculty of Mathematics and Computer Science, Ho Chi Minh City National University, Ho Chi Minh City, Viet Nam
2 Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Viet Nam
3 Department of Mathematics, Indiana University, Bloomington, Indiana, USA

We are interested in estimating the density function $f$ of i.i.d. random variables $X_1, \ldots, X_n$ from the model $Y_j = X_j + Z_j$, where the $Z_j$ are unobserved error random variables, distributed with density function $g$ and independent of the $X_j$. This problem is known as the deconvolution problem in nonparametric statistics. The most popular method for solving it is the kernel method, in which one assumes $g^{ft}(t) \neq 0$ for all $t \in \mathbb{R}$, where $g^{ft}$ is the Fourier transform of $g$. The more general case, in which $g^{ft}$ may have real zeros, has received little attention. In this article, we consider this case. By estimating the Lebesgue measure of the low level sets of $g^{ft}$ and combining this estimate with the Tikhonov regularization method, we give an approximation $f_n$ to the density function $f$ and evaluate the rate of convergence of $\sup_{g \in \mathcal{G}_{s_0,\gamma,M,T}} \sup_{f \in \mathcal{F}_{q,K}} E\|f_n - f\|^2_{L^2(\mathbb{R})}$. A lower bound for this quantity is also provided.

Keywords Deconvolution; Tikhonov’s regularization; Fourier transform.

Mathematics Subject Classification 62G05; 62G07.

1. Introduction

In this article, we are interested in estimating the unknown density function $f$ of the i.i.d. random variables $X_1, X_2, \ldots, X_n$ based on the observed random variables $Y_1, Y_2, \ldots, Y_n$ from the model

$$Y_j = X_j + Z_j, \quad j = 1, 2, \ldots, n. \tag{1}$$

Received June 11, 2012; Accepted August 17, 2012.
Address correspondence to Cao Xuan Phuong, Faculty of Mathematics and Statistics, Ton Duc Thang University, Nguyen Huu Tho, District 7, Ho Chi Minh City, Viet Nam; E-mail: [email protected]


Here, the $Z_j$ are unobserved error random variables, distributed with density function $g$ and independent of the $X_j$. If $h$ is the probability density function of $Y_j$, then we have the relation

$$h = f * g, \tag{2}$$

where the symbol $*$ denotes the convolution of the two functions $f$ and $g$,

$$(f * g)(x) = \int_{-\infty}^{+\infty} f(x - t)\, g(t)\, dt, \quad x \in \mathbb{R}.$$

We denote the Fourier transform of the function $f$ by

$$f^{ft}(t) = \int_{-\infty}^{+\infty} f(x)\, e^{itx}\, dx, \quad t \in \mathbb{R}. \tag{3}$$

Put

$$NZ_g = \{ t \in \mathbb{R} : g^{ft}(t) \neq 0 \}.$$

Informally, if $h$ is known, we can apply the Fourier transform to both sides of (2) to get

$$f^{ft}(t) = \frac{h^{ft}(t)}{g^{ft}(t)} \quad \text{for all } t \in NZ_g. \tag{4}$$

Then, using the inverse Fourier transform, we can find $f$. This is a classical problem in analysis.

In practical situations, we do not have the density function $h$; we only have the observations $Y_j$, $j = 1, \ldots, n$. The problem of recovering $f$ from the observations $Y_j$ is called the deconvolution problem in statistics, or the deconvolution problem for short. Equation (2) is an integral equation, and solving it is typically an ill-posed problem.

A specific question for the deconvolution problem is consistency. To prove that a deconvolution problem is consistent, we have to show that there exists a sequence of estimators $\{f_n\}$ such that

$$\lim_{n \to +\infty} E\|f_n(\cdot\,; Y_1, \ldots, Y_n) - f\|_X = 0,$$

where $X$ is an appropriate Banach space.

The simplest case is $NZ_g = \mathbb{R}$. In this case, there are many methods to construct the estimator $f_n(x; Y_1, \ldots, Y_n)$. Kernel estimation is one of the most popular approaches to the deconvolution problem. In this method, one estimates the density function $f$ by the estimator

$$f_n(x; Y_1, \ldots, Y_n) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-itx}\, \frac{K^{ft}(tb)}{g^{ft}(t)}\, \frac{1}{n} \sum_{j=1}^{n} e^{itY_j}\, dt, \tag{5}$$

where $K$ is a kernel function whose Fourier transform $K^{ft}$ is compactly supported and $b > 0$ is a bandwidth. This method was first introduced in the articles of Carroll and Hall (1988), Stefanski and Carroll (1990), and Fan (1991a, b). The estimator (5) is known as the standard deconvolution kernel density estimator. We note that the estimator (5) is well defined only if $g^{ft}(t) \neq 0$ for all $t \in \mathbb{R}$, and so the condition


$NZ_g = \mathbb{R}$ has become common in the deconvolution literature. In fact, the density function $g$ often satisfies

$$|g^{ft}(t)| \ge C (1 + t^2)^{-\alpha} \exp\{-C_0 |t|^{\gamma}\},$$

where $C, C_0 > 0$, $\alpha \ge 0$, $\gamma \ge 0$ and $\alpha + \gamma > 0$. Similarly, in the case of bivariate random variables, Goldenshluger (2002) assumed that

$$\min_{|t| \le v} |g^{ft}(t)| \ge C \exp\{-C_0 v^2\}, \quad \forall v > 0.$$

However, there are many important density functions that do not satisfy $NZ_g = \mathbb{R}$, e.g., the uniform densities, the self-convolved uniform densities, or the convolution of an arbitrary density function with a uniform density.
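To make the role of the zeros concrete, the following minimal sketch compares the Fourier transform (3) of the uniform density on $[-a, a]$, which equals $\sin(at)/(at)$, with that of the standard Gaussian density. The function names and the choice $a = 1$ are illustrative assumptions, not part of the article.

```python
import math

def uniform_cf(t, a=1.0):
    """Fourier transform (3) of the uniform density on [-a, a]: sin(a t) / (a t)."""
    if t == 0.0:
        return 1.0
    return math.sin(a * t) / (a * t)

def gaussian_cf(t):
    """Fourier transform (3) of the standard Gaussian density: exp(-t^2 / 2)."""
    return math.exp(-0.5 * t * t)

# The uniform transform vanishes (up to rounding) at t = k * pi, k = 1, 2, ...,
# so NZ_g is not the whole real line; the Gaussian transform never vanishes.
for k in range(1, 4):
    t = k * math.pi
    print(t, uniform_cf(t), gaussian_cf(t))
```

At these points the naive quotient (4) would divide by (numerically) zero, which is the situation the rest of the article addresses.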

The deconvolution problem in the case $NZ_g \neq \mathbb{R}$ is very difficult. To our knowledge, only a few articles address this case. The first article to consider this problem is Devroye (1989), where consistency was established with respect to the $L^1(\mathbb{R})$-norm. Using a truncation technique, he constructed a consistent estimator $f_n$ for the target density $f$ when the Fourier transform $g^{ft}$ vanishes on a set of Lebesgue measure zero,

$$f_n(x; Y_1, \ldots, Y_n) = \begin{cases} \dfrac{1}{2\pi} \operatorname{Re} \left\{ \displaystyle\int_{\mathbb{R} \setminus A_r} e^{-itx}\, \dfrac{K(th)}{g^{ft}(t)}\, \dfrac{1}{n} \sum_{j=1}^{n} e^{itY_j}\, dt \right\}, & |x| < T, \\[2ex] 0, & |x| \ge T, \end{cases}$$

where $A_r = \{ t \in \mathbb{R} : |g^{ft}(t)| < r \}$, $r > 0$, $h > 0$ and $K^{ft}$ is compactly supported. However, no convergence rate is provided in Devroye (1989).

In Meister (2007), density deconvolution is also considered in the case where the target density $f$ belongs to the class of densities that satisfy $\int_{-S}^{S} f(x)\, dx = 1$ and

$$\int_{-\infty}^{+\infty} |f^{ft}(t)|^2 (1 + t^2)^{\beta}\, dt \le C$$

with $S, C, \beta > 0$, whereas the error density belongs to the class of densities satisfying $|g^{ft}(t)| \ge \mu$ for $t \in [-\nu, \nu]$ and $\|g\|_{\infty} \le C$. The rate of uniform convergence of $\mathrm{MISE}(f_n, f)$ is $O\big((\ln n)^{-2\beta(1-\delta)} (\ln \ln n)^{2\beta}\big)$ with $\delta \in [0, 1)$. This rate is only derived when the sample size $n$ is chosen such that $S \in \big[O((\ln n)^{\delta})\,O(1);\; O((\ln n)^{\delta})\big]$. In practice, this condition is difficult to verify because $S$ is not known exactly, and so we cannot choose $n$ exactly in general.

In Groeneboom and Jongbloed (2003), the authors considered the deconvolution problem in a uniform density model. By choosing a suitable bandwidth, they proved that it is possible to construct a kernel estimator of the target density $f$ if $f$ has a finite left endpoint. In Hall and Meister (2007), the authors gave an approach to solving the deconvolution problem in the case $NZ_g \neq \mathbb{R}$. To avoid division by zero, the authors replaced $g^{ft}(t)$ by the maximum of $|g^{ft}(t)|$ and $h_n(t) = n^{-\xi} |t|^{\rho}$ with $\xi > 0$, $\rho > 0$. The function $h_n(t)$ is called the “ridge function.” An estimator for the density $f$ is defined by

$$f_n(x) = \operatorname{Re} \left\{ \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-itx}\, \frac{g^{ft}(-t)\, |g^{ft}(t)|^{r}}{\big( \max\{ |g^{ft}(t)|;\, h_n(t) \} \big)^{r+2}}\, \frac{1}{n} \sum_{j=1}^{n} e^{itY_j}\, dt \right\}, \tag{6}$$



with $r \ge 0$. The optimal rates of estimation are provided. Recently, using a modified kernel method, Delaigle and Meister (2011) gave a similar result. However, the condition imposed on $g^{ft}$ in these papers is very strict: the density function $g$ is assumed to satisfy

$$|g^{ft}(t)| \ge c\, |\sin(kt)|^{\nu} (1 + |t|)^{-\alpha} \exp\{-d |t|^{\beta}\}, \quad t \in \mathbb{R}, \tag{7}$$

with $k > 0$, $c > 0$, $d > 0$, $\alpha \ge 0$, $\beta \ge 0$, $\alpha + \beta > 0$, $\nu > 0$. In this case, $\mathbb{R} \setminus NZ_g \subset \{ n\pi/k : n \in \mathbb{Z} \}$. In other words, the positions of the zeros of $g^{ft}$ are fixed. If $g$ is a uniform density then (7) holds, but for an arbitrary density function $g$ the condition (7) often fails.

Motivated by this problem, in the present article we consider the deconvolution problem in the case where the Fourier transform of the error distribution has zeros on the real line, with no specific constraints on $NZ_g$. Applying Tikhonov’s regularization, we introduce an estimation procedure for the target density function. Using properties of entire functions and some results from harmonic analysis, we consider the low level sets of the function $g^{ft}$ and give the convergence rate of our procedure.

The rest of this article consists of three sections. In Sec. 2, we present Tikhonov’s regularization and use it to give a consistency result and an estimator for probability density functions. In Sec. 3, we state and prove approximation results and provide a lower bound for the error of the estimator. In Sec. 4, by estimating the Lebesgue measure of the low level sets of the Fourier transform of $g$, we prove Lemma 3.1, which is stated in Sec. 3.

2. Tikhonov’s Regularization

As discussed in Sec. 1, Problem (2) is typically ill-posed and a regularization is required. In the theory of ill-posed problems, a regularization method often used for the deconvolution problem is the Tikhonov regularization. In this method, we approximate $f^{ft}$ by a function having the form $\varphi g^{ft}$, where $\varphi$ is often called the “filter function.” More precisely, we consider the linear operator $A : L^2(\mathbb{R}) \to L^2(\mathbb{R})$, $A(\varphi) = \varphi g^{ft}$ for all $\varphi \in L^2(\mathbb{R})$. For each $\delta > 0$, we consider the Tikhonov functional

$$J_{\delta}(\varphi) = \|A\varphi - h^{ft}\|^2_{L^2(\mathbb{R})} + \delta \|\varphi\|^2_{L^2(\mathbb{R})}, \quad \varphi \in L^2(\mathbb{R}). \tag{8}$$

We seek the function $\varphi$ minimizing $J_{\delta}$. It is known that $J_{\delta}$ attains its minimum at a unique minimizer $\varphi_{\delta} \in L^2(\mathbb{R})$. This minimizer $\varphi_{\delta}$ is the unique solution of the equation

$$\delta \varphi_{\delta} + (A^{*}A)(\varphi_{\delta}) = A^{*}(h^{ft}), \tag{9}$$

where $A^{*} : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ is the adjoint operator of $A$ (see Theorem 2.11, Sec. 2.2 in Kirsch, 1996). Since $A^{*}(\psi) = \overline{g^{ft}}\, \psi$, from (9) we get

$$\delta \varphi_{\delta} + |g^{ft}|^2 \varphi_{\delta} = \overline{g^{ft}}\, h^{ft},$$

and so we have the approximation

$$\varphi_{\delta} = \frac{\overline{g^{ft}}\, h^{ft}}{\delta + |g^{ft}|^2}$$


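to the Fourier transform $f^{ft}$ of the density function $f$. Applying the inverse Fourier transform then gives

$$f_{\delta}(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-itx}\, \frac{\overline{g^{ft}(t)}\, h^{ft}(t)}{\delta + |g^{ft}(t)|^2}\, dt, \quad x \in \mathbb{R}. \tag{10}$$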
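Because the functional (8) is an integral over $t$, it can be minimized pointwise. The sketch below checks numerically, at a single frequency, that the Tikhonov filter value $\overline{g}\,h/(\delta + |g|^2)$ is not improved by small perturbations. The sample values of `g`, `h` and `delta` are arbitrary illustrative assumptions.

```python
def tikhonov_filter(g, h, delta):
    """Pointwise minimiser of |g*phi - h|^2 + delta*|phi|^2 over complex phi."""
    g = complex(g)
    return g.conjugate() * h / (delta + abs(g) ** 2)

def objective(phi, g, h, delta):
    """Integrand of the Tikhonov functional (8) at one frequency."""
    return abs(g * phi - h) ** 2 + delta * abs(phi) ** 2

# Arbitrary values standing in for g^ft(t), h^ft(t) at a fixed t.
g, h, delta = 0.3 - 0.4j, 0.2 + 0.1j, 0.05
phi_star = tikhonov_filter(g, h, delta)
best = objective(phi_star, g, h, delta)

# Perturbing phi_star in several directions never lowers the objective.
perturbations = [0.01, -0.01, 0.01j, -0.01j, 0.007 + 0.007j]
print(all(objective(phi_star + d, g, h, delta) >= best for d in perturbations))

# At a zero of g the filter stays finite, unlike the naive quotient h / g in (4).
print(tikhonov_filter(0.0, h, delta))
```

The last line illustrates why the regularized quotient remains usable when $g^{ft}$ has real zeros.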

This can be seen as an approximation of the density function $f$.

As mentioned, in practical situations we do not have the density function $h$; we have only the observations $Y_1, \ldots, Y_n$. Thus, we cannot directly use formula (10) to approximate $f$. However, when we have the i.i.d. observations $Y_1, \ldots, Y_n$, since

$$E\left( \frac{1}{n} \sum_{j=1}^{n} e^{itY_j} \right) = \frac{1}{n} \sum_{j=1}^{n} E\big(e^{itY_j}\big) = h^{ft}(t),$$

we can replace $h^{ft}(t)$ in (10) by the quantity

$$\Phi(t; Y_1, \ldots, Y_n) = \frac{1}{n} \sum_{j=1}^{n} e^{itY_j}. \tag{11}$$

This suggests the following approximation for the density function $f$ based on $Y_1, \ldots, Y_n$:

$$L_{\delta,g}(x; Y_1, \ldots, Y_n) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-itx}\, \frac{\overline{g^{ft}(t)}}{\delta + |g^{ft}(t)|^2}\, \frac{1}{n} \sum_{j=1}^{n} e^{itY_j}\, dt. \tag{12}$$
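As an illustration only, the estimator (12) can be approximated numerically as follows. This sketch makes several assumptions that are not part of the article: the integral is truncated to $[-t_{\max}, t_{\max}]$ and replaced by a trapezoidal sum, the error density is uniform on $[-1, 1]$, and the data are simulated with $X \sim N(0, 1)$; the names `tikhonov_density`, `t_max` and `n_grid` are hypothetical.

```python
import cmath
import math
import random

def uniform_cf(t):
    """Fourier transform (3) of the uniform density on [-1, 1]."""
    return 1.0 if t == 0.0 else math.sin(t) / t

def tikhonov_density(xs, ys, g_cf, delta, t_max=8.0, n_grid=321):
    """Evaluate L_{delta,g}(x; Y_1..Y_n) from (12) at each x in xs.

    The integral over R is truncated to [-t_max, t_max] and replaced by a
    trapezoidal sum; both choices are assumptions of this sketch.
    """
    n = len(ys)
    step = 2.0 * t_max / (n_grid - 1)
    grid = [-t_max + k * step for k in range(n_grid)]
    weighted = []
    for k, t in enumerate(grid):
        g = complex(g_cf(t))
        ecf = sum(cmath.exp(1j * t * y) for y in ys) / n   # quantity (11)
        filt = g.conjugate() / (delta + abs(g) ** 2)       # Tikhonov filter
        w = 0.5 if k in (0, n_grid - 1) else 1.0           # trapezoid weights
        weighted.append(w * filt * ecf)
    out = []
    for x in xs:
        total = sum((cmath.exp(-1j * t * x) * c).real for t, c in zip(grid, weighted))
        out.append(total * step / (2.0 * math.pi))
    return out

rng = random.Random(0)
# Hypothetical data: X ~ N(0, 1) contaminated by Z ~ Uniform[-1, 1].
ys = [rng.gauss(0.0, 1.0) + rng.uniform(-1.0, 1.0) for _ in range(1000)]
est_at_0, est_at_3 = tikhonov_density([0.0, 3.0], ys, uniform_cf, delta=0.05)
print(est_at_0, est_at_3)
```

The filter stays bounded at the zeros $t = k\pi$ of the uniform transform, so no special handling of those frequencies is needed; the price is a bias controlled by $\delta$, which Lemma 2.1 quantifies.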

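We have the following general expression for the error $E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})}$.

Lemma 2.1. Let $\delta > 0$, let $g \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ be the density function of the error random variables and let $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ be the solution of Problem (2). Then

$$E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})} = \frac{1}{2\pi n} \int_{-\infty}^{+\infty} \frac{|g^{ft}(t)|^2}{\big(\delta + |g^{ft}(t)|^2\big)^2} \big(1 - |f^{ft}(t) g^{ft}(t)|^2\big)\, dt + \frac{1}{2\pi} \int_{-\infty}^{+\infty} |f^{ft}(t)|^2 \left( \frac{\delta}{\delta + |g^{ft}(t)|^2} \right)^2 dt. \tag{13}$$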
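The two terms of (13) behave in opposite ways as $\delta$ varies. The sketch below evaluates them numerically for one hypothetical pair, $f$ standard Gaussian and $g$ uniform on $[-1, 1]$, using a truncated trapezoidal rule; the truncation level `t_max`, the step count and the helper names are assumptions of this illustration.

```python
import math

def uniform_cf(t):
    return 1.0 if t == 0.0 else math.sin(t) / t

def gaussian_cf(t):
    return math.exp(-0.5 * t * t)

def trapezoid(func, a, b, n_steps):
    """Composite trapezoidal rule for func on [a, b]."""
    step = (b - a) / n_steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, n_steps):
        total += func(a + k * step)
    return total * step

def risk_terms(f_cf, g_cf, delta, n, t_max=200.0, n_steps=40000):
    """Return (variance term, squared-bias term) of (13), truncated to |t| <= t_max."""
    def variance_integrand(t):
        g2 = abs(g_cf(t)) ** 2
        return g2 / (delta + g2) ** 2 * (1.0 - abs(f_cf(t)) ** 2 * g2)

    def bias_integrand(t):
        g2 = abs(g_cf(t)) ** 2
        return abs(f_cf(t)) ** 2 * (delta / (delta + g2)) ** 2

    variance = trapezoid(variance_integrand, -t_max, t_max, n_steps) / (2.0 * math.pi * n)
    bias_sq = trapezoid(bias_integrand, -t_max, t_max, n_steps) / (2.0 * math.pi)
    return variance, bias_sq

# Smaller delta: larger variance term, smaller bias term.
for delta in (0.2, 0.05, 0.0125):
    variance, bias_sq = risk_terms(gaussian_cf, uniform_cf, delta, n=1000)
    print(delta, variance, bias_sq, variance + bias_sq)
```

This trade-off is what drives the joint conditions on $\delta_n$ in the consistency result that follows.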

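Proof. From (12), we get

$$L^{ft}_{\delta,g}(t) = \frac{\overline{g^{ft}(t)}}{\delta + |g^{ft}(t)|^2}\, \frac{1}{n} \sum_{j=1}^{n} e^{itY_j}, \quad t \in \mathbb{R}.$$

Applying the Parseval identity, the Fubini theorem and the equality

$$E\big|L^{ft}_{\delta,g}(t) - f^{ft}(t)\big|^2 = \operatorname{Var} L^{ft}_{\delta,g}(t) + \big|E L^{ft}_{\delta,g}(t) - f^{ft}(t)\big|^2,$$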



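we derive

$$\begin{aligned}
E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})} &= \frac{1}{2\pi} \int_{-\infty}^{+\infty} E\big|L^{ft}_{\delta,g}(t) - f^{ft}(t)\big|^2\, dt \\
&= \frac{1}{2\pi} \int_{-\infty}^{+\infty} \operatorname{Var} L^{ft}_{\delta,g}(t)\, dt + \frac{1}{2\pi} \int_{-\infty}^{+\infty} \big|E L^{ft}_{\delta,g}(t) - f^{ft}(t)\big|^2\, dt.
\end{aligned}$$

As $Y_1, Y_2, \ldots, Y_n$ are i.i.d. random variables, we get

$$\begin{aligned}
\int_{-\infty}^{+\infty} \operatorname{Var} L^{ft}_{\delta,g}(t)\, dt &= \frac{1}{n^2} \int_{-\infty}^{+\infty} \left| \frac{\overline{g^{ft}(t)}}{\delta + |g^{ft}(t)|^2} \right|^2 \sum_{j=1}^{n} \operatorname{Var}\big(e^{itY_j}\big)\, dt \\
&= \frac{1}{n} \int_{-\infty}^{+\infty} \left| \frac{\overline{g^{ft}(t)}}{\delta + |g^{ft}(t)|^2} \right|^2 \operatorname{Var}\big(e^{itY_1}\big)\, dt \\
&= \frac{1}{n} \int_{-\infty}^{+\infty} \frac{|g^{ft}(t)|^2}{\big(\delta + |g^{ft}(t)|^2\big)^2} \Big( E\big|e^{itY_1}\big|^2 - \big|E\big(e^{itY_1}\big)\big|^2 \Big)\, dt \\
&= \frac{1}{n} \int_{-\infty}^{+\infty} \frac{|g^{ft}(t)|^2}{\big(\delta + |g^{ft}(t)|^2\big)^2} \big( 1 - |h^{ft}(t)|^2 \big)\, dt \\
&= \frac{1}{n} \int_{-\infty}^{+\infty} \frac{|g^{ft}(t)|^2}{\big(\delta + |g^{ft}(t)|^2\big)^2} \big( 1 - |f^{ft}(t) g^{ft}(t)|^2 \big)\, dt
\end{aligned}$$

and

$$\begin{aligned}
\int_{-\infty}^{+\infty} \big|E L^{ft}_{\delta,g}(t) - f^{ft}(t)\big|^2\, dt &= \int_{-\infty}^{+\infty} \left| \frac{\overline{g^{ft}(t)}}{\delta + |g^{ft}(t)|^2}\, E\big(e^{itY_1}\big) - f^{ft}(t) \right|^2 dt \\
&= \int_{-\infty}^{+\infty} \left| \frac{\overline{g^{ft}(t)}}{\delta + |g^{ft}(t)|^2}\, h^{ft}(t) - f^{ft}(t) \right|^2 dt \\
&= \int_{-\infty}^{+\infty} \left| \frac{|g^{ft}(t)|^2}{\delta + |g^{ft}(t)|^2}\, f^{ft}(t) - f^{ft}(t) \right|^2 dt \\
&= \int_{-\infty}^{+\infty} |f^{ft}(t)|^2 \left( \frac{\delta}{\delta + |g^{ft}(t)|^2} \right)^2 dt.
\end{aligned}$$

Combining the above equalities, we get the conclusion of the lemma. □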

The consistency of the problem with respect to the $L^1(\mathbb{R})$-norm was studied in Devroye (1989). Meister (2005) also gave a very general consistency result in a weighted $L^2(\mathbb{R})$-norm ($NZ_g$ is assumed to be dense in $\mathbb{R}$). We now give a consistency result in the $L^2(\mathbb{R})$-norm with a simple estimator and an easy proof.


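Theorem 2.1. Let $g \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ be the density function of the error random variables and let $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ be the solution of Problem (2). Assume that $m(\mathbb{R} \setminus NZ_g) = 0$, where $m(\cdot)$ is the Lebesgue measure on $\mathbb{R}$. Let $(\delta_n)$ be a positive sequence such that $\delta_n \to 0$ and $n\delta_n^2 \to +\infty$ as $n \to +\infty$. Then

$$\lim_{n \to +\infty} E\|L_{\delta_n,g} - f\|^2_{L^2(\mathbb{R})} = 0.$$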

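Proof. Applying the result (13) of Lemma 2.1, we have

$$\begin{aligned}
E\|L_{\delta_n,g} - f\|^2_{L^2(\mathbb{R})} &= \frac{1}{2\pi n} \int_{-\infty}^{+\infty} \frac{|g^{ft}(t)|^2}{\big(\delta_n + |g^{ft}(t)|^2\big)^2} \big(1 - |f^{ft}(t) g^{ft}(t)|^2\big)\, dt + \frac{1}{2\pi} \int_{-\infty}^{+\infty} |f^{ft}(t)|^2 \left( \frac{\delta_n}{\delta_n + |g^{ft}(t)|^2} \right)^2 dt \\
&\le \frac{1}{2\pi n \delta_n^2} \int_{-\infty}^{+\infty} |g^{ft}(t)|^2\, dt + \frac{1}{2\pi} \int_{-\infty}^{+\infty} |f^{ft}(t)|^2 \left( \frac{\delta_n}{\delta_n + |g^{ft}(t)|^2} \right)^2 dt \\
&= \frac{1}{n \delta_n^2} \|g\|^2_{L^2(\mathbb{R})} + \frac{1}{2\pi} \int_{-\infty}^{+\infty} |f^{ft}(t)|^2 \left( \frac{\delta_n}{\delta_n + |g^{ft}(t)|^2} \right)^2 dt.
\end{aligned}$$

The first term tends to 0 because $n\delta_n^2 \to +\infty$. In the second term, the integrand is dominated by the integrable function $|f^{ft}|^2$ and tends to 0 at every $t \in NZ_g$, hence almost everywhere since $m(\mathbb{R} \setminus NZ_g) = 0$. Using the Lebesgue dominated convergence theorem, we therefore get $E\|L_{\delta_n,g} - f\|^2_{L^2(\mathbb{R})} \to 0$ as $n \to +\infty$. The proof of the theorem is completed. □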
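The condition $n\delta_n^2 \to +\infty$ in Theorem 2.1 can be illustrated with the power sequences $\delta_n = n^{-a}$. The sketch below evaluates the first term $\|g\|^2_{L^2(\mathbb{R})} / (n\delta_n^2)$ of the bound in the proof; the exponent values and the choice of $g$ uniform on $[-1, 1]$ (for which $\|g\|^2_{L^2(\mathbb{R})} = 1/2$) are illustrative assumptions.

```python
def first_term_bound(n, a, g_l2_norm_sq):
    """Upper bound ||g||^2 / (n * delta_n^2) with delta_n = n**(-a)."""
    delta_n = n ** (-a)
    return g_l2_norm_sq / (n * delta_n ** 2)

# Uniform density on [-1, 1]: ||g||^2 = 1/2.
for n in (10 ** 2, 10 ** 4, 10 ** 6):
    # a = 0.25 gives n * delta_n^2 = n^(1/2) -> infinity, so the bound decays;
    # a = 0.5 gives n * delta_n^2 = 1, so the bound does not decay.
    print(n, first_term_bound(n, 0.25, 0.5), first_term_bound(n, 0.5, 0.5))
```

Any exponent $0 < a < 1/2$ satisfies both requirements of the theorem, since then $\delta_n \to 0$ and $n\delta_n^2 = n^{1-2a} \to +\infty$.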

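The following theorem will be used in Sec. 3 to get the main result of our article.

Theorem 2.2. Let $\rho > 0$, $\delta > 0$, $R > 0$, let $g \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ be the density function of the error random variables and let $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ be the solution of Problem (2). Then

$$E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})} \le C_1 \left( m(B_{\rho,R}) + \int_{|t| > R} |f^{ft}(t)|^2\, dt + \frac{\delta^2}{\rho^4} + \frac{1}{n\delta^2} \right), \tag{14}$$

where

$$C_1 = \frac{1}{2\pi} \max\left\{ 1;\ \|f^{ft}\|^2_{L^2(\mathbb{R})};\ \|g^{ft}\|^2_{L^2(\mathbb{R})} \right\},$$

$$B_{\rho,R} = \left\{ t \in \mathbb{R} : |g^{ft}(t)| < \rho,\ |t| < R \right\}.$$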

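Proof. From (13), we get the estimate

$$E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})} \le \frac{1}{2\pi n \delta^2} \int_{-\infty}^{+\infty} |g^{ft}(t)|^2\, dt + \frac{1}{2\pi} \int_{-\infty}^{+\infty} |f^{ft}(t)|^2 \left( \frac{\delta}{\delta + |g^{ft}(t)|^2} \right)^2 dt.$$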



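Now we write

$$\begin{aligned}
\int_{-\infty}^{+\infty} |f^{ft}(t)|^2 \left( \frac{\delta}{\delta + |g^{ft}(t)|^2} \right)^2 dt &= \int_{|t| < R,\ |g^{ft}(t)| < \rho} |f^{ft}(t)|^2 \left( \frac{\delta}{\delta + |g^{ft}(t)|^2} \right)^2 dt \\
&\quad + \int_{|t| > R,\ |g^{ft}(t)| < \rho} |f^{ft}(t)|^2 \left( \frac{\delta}{\delta + |g^{ft}(t)|^2} \right)^2 dt \\
&\quad + \int_{|g^{ft}(t)| > \rho} |f^{ft}(t)|^2 \left( \frac{\delta}{\delta + |g^{ft}(t)|^2} \right)^2 dt \\
&\le \int_{|t| < R,\ |g^{ft}(t)| < \rho} dt + \int_{|t| > R,\ |g^{ft}(t)| < \rho} |f^{ft}(t)|^2\, dt + \left( \frac{\delta}{\rho^2} \right)^2 \int_{|g^{ft}(t)| > \rho} |f^{ft}(t)|^2\, dt \\
&\le m(B_{\rho,R}) + \int_{|t| > R} |f^{ft}(t)|^2\, dt + \frac{\delta^2}{\rho^4} \|f^{ft}\|^2_{L^2(\mathbb{R})}.
\end{aligned}$$

Therefore,

$$E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})} \le C_1 \left( m(B_{\rho,R}) + \int_{|t| > R} |f^{ft}(t)|^2\, dt + \frac{\delta^2}{\rho^4} + \frac{1}{n\delta^2} \right),$$

where $C_1 = \frac{1}{2\pi} \max\{ 1;\ \|f^{ft}\|^2_{L^2(\mathbb{R})};\ \|g^{ft}\|^2_{L^2(\mathbb{R})} \}$. The proof of the theorem is completed. □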
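The quantity $m(B_{\rho,R})$ in (14) can be estimated numerically for a concrete error density. The sketch below does this on a midpoint grid for $g$ uniform on $[-1, 1]$, whose transform $\sin(t)/t$ has zeros at $\pm\pi, \pm 2\pi, \ldots$; the grid resolution and the values of $\rho$ and $R$ are illustrative assumptions.

```python
import math

def uniform_cf(t):
    return 1.0 if t == 0.0 else math.sin(t) / t

def low_level_measure(g_cf, rho, radius, n_cells=200000):
    """Midpoint-grid estimate of m(B_{rho,R}) = m{|t| < R : |g_cf(t)| < rho}."""
    step = 2.0 * radius / n_cells
    hits = 0
    for k in range(n_cells):
        t = -radius + (k + 0.5) * step
        if abs(g_cf(t)) < rho:
            hits += 1
    return hits * step

# The low level set shrinks to the zeros of g^ft as rho decreases.
for rho in (0.1, 0.03, 0.01):
    print(rho, low_level_measure(uniform_cf, rho, radius=10.0))
```

For this example the set is a union of short intervals around the six zeros inside $(-10, 10)$, so its measure is roughly proportional to $\rho$.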

3. Approximation Results

The difficulty of directly applying formula (4) arises in two respects: first, in reality we do not have the density function $h$; and second, even if we had $h$, we could not efficiently compute the inverse Fourier transform of $f^{ft}$ if the function $g^{ft}$ has zeros on the real axis. Hence, a regularization is in order.

For each entire function $\psi$, the low level sets of $\psi$ are defined by $\{z \in \mathbb{C} : |\psi(z)| < \varepsilon\}$, $\varepsilon > 0$. In the case where $g$ is compactly supported, the set of zeros of $g^{ft}$ heavily affects the recovery of the function $f$ from its Fourier transform. To actually compute the solution $f$ and to regularize Eq. (2), we must know more about the Lebesgue measure of the low level sets of $g^{ft}$. This question goes back to the well-known theorem of Cartan about the size of the low level sets $A_{\varepsilon} = \{z \in \mathbb{C} : |P(z)| < \varepsilon\}$, $\varepsilon > 0$, where $P(z)$ is a polynomial. He proved that $A_{\varepsilon}$ is contained in a finite set of disks whose sum of radii is less than $C\varepsilon^{1/n}$, where $n$ is the degree of $P(z)$ and $C$ is a constant that depends only on the leading coefficient of $P(z)$ and $n$ (see Theorem 3 of Sec. 11.2 in Levin, 1996). In particular, we have

$$\lim_{\varepsilon \to 0} m(\{z \in \mathbb{C} : |P(z)| < \varepsilon\}) = 0,$$

where $m(\cdot)$ is the Lebesgue measure on $\mathbb{C}$.

We will use Cartan’s theorem to give an asymptotic estimate on the low level sets $\{t \in \mathbb{R} : |g^{ft}(t)| < \varepsilon,\ |t| < R_{\varepsilon}\}$ of the function $g^{ft}$, and shall apply this estimate to the Tikhonov regularization of Problem (2).


To get an explicit estimate for $E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})}$, some prior information about $f$ and $g$ must be assumed. From now on, we assume that $f$ belongs to the set

$$\mathcal{F}_{q,K} = \left\{ f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R}) : f \ge 0,\ \int_{-\infty}^{+\infty} f(x)\, dx = 1,\ |f^{ft}(t)|^2 \le \frac{K}{(1 + t^2)^q} \right\}, \tag{15}$$

where $K \ge 1$ and $q > \frac{1}{2}$. The condition $|f^{ft}(t)|^2 \le \frac{K}{(1 + t^2)^q}$ imposed on the density $f \in \mathcal{F}_{q,K}$ in (15) is quite natural. It is equivalent to the condition that the density function $f$ is in the Sobolev space $W^{q,1}(\mathbb{R})$.

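Moreover, let $s_0 > 0$, $\gamma \in (1, 2)$, $M \ge 1$ and $T > \frac{1}{2}$. We assume that the density function $g$ of the error random variables belongs to the class of functions

$$\mathcal{G}_{s_0,\gamma,M,T} = \left\{ g \in L^2(\mathbb{R}) : g \ge 0,\ \int_{-\infty}^{+\infty} g(x)\, dx = 1,\ \int_{-\infty}^{+\infty} g(t)\, e^{s_0 |t|^{\gamma}}\, dt \le M,\ \|g\|^2_{L^2(\mathbb{R})} \le T \right\}. \tag{16}$$

We note that the density function $g(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ of the Gauss distribution and compactly supported density functions are in $\mathcal{G}_{s_0,\gamma,M,T}$. Not all such functions satisfy $NZ_g = \mathbb{R}$.

For each $\varepsilon > 0$, we put

$$s_{\varepsilon} = \inf\left\{ s > 0 : \int_{|t| \ge s} |g(t)|\, dt \le \varepsilon \right\}. \tag{17}$$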
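For a concrete density, $s_{\varepsilon}$ in (17) can be found by bisection on the tail mass. The sketch below treats the standard Gaussian density, whose two-sided tail is $\operatorname{erfc}(s/\sqrt{2})$, and the uniform density on $[-1, 1]$, whose tail is $\max(0, 1 - s)$. The bisection bracket `hi` and the tolerance are assumptions of this illustration.

```python
import math

def gaussian_tail(s):
    """Integral of the standard Gaussian density over |t| >= s."""
    return math.erfc(s / math.sqrt(2.0))

def uniform_tail(s):
    """Integral of the uniform density on [-1, 1] over |t| >= s."""
    return max(0.0, 1.0 - s)

def s_epsilon(tail, eps, hi=50.0, tol=1e-10):
    """Smallest s > 0 with tail(s) <= eps, found by bisection.

    Assumes tail is continuous and non-increasing and that tail(hi) <= eps.
    """
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail(mid) <= eps:
            hi = mid
        else:
            lo = mid
    return hi

for eps in (0.05, 1e-3, 1e-6):
    print(eps, s_epsilon(gaussian_tail, eps), s_epsilon(uniform_tail, eps))
```

For the compactly supported density the value never exceeds the support bound, whereas for the Gaussian it grows slowly as $\varepsilon$ decreases.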

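Lemma 3.1. Let $s_0 > 0$, $\gamma \in (1, 2)$, $M \ge 1$, $T > \frac{1}{2}$, $\beta \in (0, 1)$, $q > \frac{1}{2}$ and let $g \in \mathcal{G}_{s_0,\gamma,M,T}$ be the density function of the error random variables. For $\varepsilon > 0$ small enough, choose $R_{\varepsilon}$ to satisfy

$$2 e s_{\varepsilon} R_{\varepsilon} \left[ \left( q + \frac{1}{2} \right) \ln R_{\varepsilon} + \ln\big(15 e^3\big) \right] = -\ln\big(\varepsilon^{\beta} + \varepsilon\big). \tag{18}$$

If $\varepsilon > 0$ is small enough then

$$m\big(B_{\varepsilon^{\beta}, R_{\varepsilon}}\big) \le 2 R_{\varepsilon}^{-q + \frac{1}{2}},$$

where $B_{\rho,R}$ is defined in Theorem 2.2.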
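Equation (18) defines $R_{\varepsilon}$ only implicitly. A minimal sketch of a numerical solution by bisection is given below; it treats $s_{\varepsilon}$ as a given input, and the function name, the starting bracket and the iteration count are assumptions of this illustration.

```python
import math

def solve_r_eps(eps, beta, q, s_eps):
    """Solve equation (18) for R_eps by bisection, with s_eps supplied by the caller."""
    target = -math.log(eps ** beta + eps)          # right-hand side of (18)

    def lhs(r):
        # Left-hand side of (18) as a function of R.
        return 2.0 * math.e * s_eps * r * (
            (q + 0.5) * math.log(r) + math.log(15.0 * math.e ** 3)
        )

    lo, hi = 1e-9, 1.0
    while lhs(hi) < target:                        # enlarge the bracket if needed
        hi *= 2.0
    for _ in range(200):                           # invariant: lhs(lo) < target <= lhs(hi)
        mid = 0.5 * (lo + hi)
        if lhs(mid) < target:
            lo = mid
        else:
            hi = mid
    return hi

# Illustrative values: beta = 0.5, q = 1, and s_eps fixed at 1 (e.g. support in [-1, 1]).
for eps in (1e-6, 1e-12, 1e-24):
    print(eps, solve_r_eps(eps, beta=0.5, q=1.0, s_eps=1.0))
```

With $s_{\varepsilon}$ held fixed, the right-hand side of (18) grows only like $\ln(1/\varepsilon)$, so the computed $R_{\varepsilon}$ increases slowly as $\varepsilon$ decreases.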

We will prove Lemma 3.1 in Sec. 4 (see also Trong and Tuyen, 2006). Our main result is the following.

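Theorem 3.1. Let $s_0 > 0$, $\gamma \in (1, 2)$, $\alpha \in (0, 1)$, $\beta \in (0, 1)$, $\alpha\beta < \frac{1}{4}$, $\nu = \frac{1}{4} + \alpha\beta$ and let $K \ge 1$, $M \ge 1$, $T > \frac{1}{2}$, $q > \frac{1}{2}$. Choosing $\varepsilon = n^{-\alpha}$, $\delta = n^{-\nu}$ and denoting

$$f_n(x) = L_{\delta,g}(x; Y_1, \ldots, Y_n),$$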



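we have the estimate

$$\sup_{g \in \mathcal{G}_{s_0,\gamma,M,T}} \left( \sup_{f \in \mathcal{F}_{q,K}} E\|f_n - f\|^2_{L^2(\mathbb{R})} \right) \le C_3 \left[ \left( \frac{\beta\, s_0^{1/\gamma}}{30 (2q + 1) e^4} \right)^{-\frac{q}{2} + \frac{1}{2}} \big(\ln(n^{\alpha})\big)^{\left(\frac{1}{2} - \frac{1}{2\gamma}\right)\left(-q + \frac{1}{2}\right)} + 2 \big(\ln(n^{\alpha})\big)^{\frac{2\nu - 1}{\alpha}} \right]$$

for all $n \in \mathbb{N}$ large enough, where $C_3 > 0$ depends on $q$, $K$, $T$.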
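The choice $\nu = \frac{1}{4} + \alpha\beta$ balances the two $\delta$-dependent terms of (14) with $\rho = \varepsilon^{\beta}$: both $\delta^2/\varepsilon^{4\beta}$ and $1/(n\delta^2)$ equal $n^{2\nu - 1}$. The following sketch checks this identity numerically; the sample values of $n$, $\alpha$ and $\beta$ are arbitrary, subject to $\alpha\beta < \frac{1}{4}$.

```python
import math

def rate_terms(n, alpha, beta):
    """Return (nu, delta^2 / eps^(4 beta), 1 / (n delta^2)) for eps = n^-alpha, delta = n^-nu."""
    nu = 0.25 + alpha * beta
    eps = n ** (-alpha)
    delta = n ** (-nu)
    regularisation = delta ** 2 / eps ** (4.0 * beta)
    sampling = 1.0 / (n * delta ** 2)
    return nu, regularisation, sampling

nu, regularisation, sampling = rate_terms(n=10 ** 6, alpha=0.5, beta=0.25)
# Both terms coincide with n^(2 nu - 1), which tends to 0 because nu < 1/2.
print(nu, regularisation, sampling, (10 ** 6) ** (2.0 * nu - 1.0))
print(math.isclose(regularisation, sampling, rel_tol=1e-9))
```

Since $\alpha\beta < \frac{1}{4}$ forces $\nu < \frac{1}{2}$, the common value $n^{2\nu - 1}$ decays polynomially and is dominated by the logarithmic term coming from $R_{\varepsilon}$.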

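Proof. Let $f \in \mathcal{F}_{q,K}$ and $g \in \mathcal{G}_{s_0,\gamma,M,T}$. We have

$$\int_{|t| > R_{\varepsilon}} |f^{ft}(t)|^2\, dt \le \int_{|t| > R_{\varepsilon}} \frac{K\, dt}{(1 + t^2)^q} \le \int_{|t| > R_{\varepsilon}} \frac{K\, dt}{|t|^{2q}} \le \frac{2K}{2q - 1} R_{\varepsilon}^{-q + \frac{1}{2}}$$

for all $\varepsilon > 0$ small enough. Combining this with Theorem 2.2 and Lemma 3.1, we get

$$\begin{aligned}
E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})} &\le C_1 \left[ \left( 2 + \frac{2K}{2q - 1} \right) R_{\varepsilon}^{-q + \frac{1}{2}} + \frac{\delta^2}{\varepsilon^{4\beta}} + \frac{1}{n\delta^2} \right] \\
&\le C_2 \left( R_{\varepsilon}^{-q + \frac{1}{2}} + \frac{\delta^2}{\varepsilon^{4\beta}} + \frac{1}{n\delta^2} \right)
\end{aligned}$$

for all $\varepsilon > 0$ small enough, where $C_2 = C_1 \left( 2 + \frac{2K}{2q - 1} \right)$. Moreover, for all $\varepsilon > 0$ small enough, from (25) we have the estimate

$$R_{\varepsilon}^{-q + \frac{1}{2}} \le \left( \frac{\beta\, s_0^{1/\gamma}}{30 (2q + 1) e^4} \right)^{-\frac{q}{2} + \frac{1}{2}} \left( \ln\left( \frac{1}{\varepsilon} \right) \right)^{\left(\frac{1}{2} - \frac{1}{2\gamma}\right)\left(-q + \frac{1}{2}\right)}.$$

Therefore

$$E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})} \le C_2 \left[ \left( \frac{\beta\, s_0^{1/\gamma}}{30 (2q + 1) e^4} \right)^{-\frac{q}{2} + \frac{1}{2}} \left( \ln\left( \frac{1}{\varepsilon} \right) \right)^{\left(\frac{1}{2} - \frac{1}{2\gamma}\right)\left(-q + \frac{1}{2}\right)} + \frac{\delta^2}{\varepsilon^{4\beta}} + \frac{1}{n\delta^2} \right].$$

Substituting $\varepsilon = n^{-\alpha}$, $\delta = n^{-\nu}$ into the right-hand side of the latter inequality, we obtain

$$E\|f_n - f\|^2_{L^2(\mathbb{R})} \le C_2 \left[ \left( \frac{\beta\, s_0^{1/\gamma}}{30 (2q + 1) e^4} \right)^{-\frac{q}{2} + \frac{1}{2}} \big(\ln(n^{\alpha})\big)^{\left(\frac{1}{2} - \frac{1}{2\gamma}\right)\left(-q + \frac{1}{2}\right)} + 2 n^{2\nu - 1} \right]$$

for all $n \in \mathbb{N}$ large enough. Furthermore, we have $n^{2\nu - 1} = (n^{\alpha})^{\frac{2\nu - 1}{\alpha}} \le \big(\ln(n^{\alpha})\big)^{\frac{2\nu - 1}{\alpha}}$. Hence,

$$E\|f_n - f\|^2_{L^2(\mathbb{R})} \le C_2 \left[ \left( \frac{\beta\, s_0^{1/\gamma}}{30 (2q + 1) e^4} \right)^{-\frac{q}{2} + \frac{1}{2}} \big(\ln(n^{\alpha})\big)^{\left(\frac{1}{2} - \frac{1}{2\gamma}\right)\left(-q + \frac{1}{2}\right)} + 2 \big(\ln(n^{\alpha})\big)^{\frac{2\nu - 1}{\alpha}} \right]$$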


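for all $n \in \mathbb{N}$ large enough. Because $f \in \mathcal{F}_{q,K}$ and $g \in \mathcal{G}_{s_0,\gamma,M,T}$, we have

$$C_1 = \frac{1}{2\pi} \max\left\{ 1;\ \|f^{ft}\|^2_{L^2(\mathbb{R})};\ \|g^{ft}\|^2_{L^2(\mathbb{R})} \right\} \le \frac{1}{2\pi} \max\left\{ 1;\ 2\pi \int_{-\infty}^{+\infty} \frac{K\, dt}{(1 + t^2)^q};\ 2\pi T \right\}.$$

So, for all $n \in \mathbb{N}$ large enough, we have

$$E\|f_n - f\|^2_{L^2(\mathbb{R})} \le C_3 \left[ \left( \frac{\beta\, s_0^{1/\gamma}}{30 (2q + 1) e^4} \right)^{-\frac{q}{2} + \frac{1}{2}} \big(\ln(n^{\alpha})\big)^{\left(\frac{1}{2} - \frac{1}{2\gamma}\right)\left(-q + \frac{1}{2}\right)} + 2 \big(\ln(n^{\alpha})\big)^{\frac{2\nu - 1}{\alpha}} \right], \tag{19}$$

where

$$C_3 = \frac{1}{2\pi} \left( 2 + \frac{2K}{2q - 1} \right) \max\left\{ 1;\ 2\pi \int_{-\infty}^{+\infty} \frac{K\, dt}{(1 + t^2)^q};\ 2\pi T \right\}.$$

We note that the right-hand side of (19) is independent of $f$ and $g$. Therefore,

$$\sup_{g \in \mathcal{G}_{s_0,\gamma,M,T}} \left( \sup_{f \in \mathcal{F}_{q,K}} E\|f_n - f\|^2_{L^2(\mathbb{R})} \right) \le C_3 \left[ \left( \frac{\beta\, s_0^{1/\gamma}}{30 (2q + 1) e^4} \right)^{-\frac{q}{2} + \frac{1}{2}} \big(\ln(n^{\alpha})\big)^{\left(\frac{1}{2} - \frac{1}{2\gamma}\right)\left(-q + \frac{1}{2}\right)} + 2 \big(\ln(n^{\alpha})\big)^{\frac{2\nu - 1}{\alpha}} \right].$$

The proof of the theorem is completed. □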

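In the case where the error density function $g$ is compactly supported, we get the following result.

Theorem 3.2. Let the assumptions be as in Theorem 3.1. Moreover, assume that the density function $g$ satisfies $\operatorname{supp} g \subset [-L, L]$, where $\operatorname{supp} g$ is the support of $g$. Then

$$\sup_{g \in \mathcal{G}_{s_0,\gamma,M,T}} \left( \sup_{f \in \mathcal{F}_{q,K}} E\|f_n - f\|^2_{L^2(\mathbb{R})} \right) \le C_3 \left[ \left( \frac{1}{30 L (2q + 1) e^4} \right)^{-\frac{q}{2} + \frac{1}{4}} \big(\ln(n^{\alpha})\big)^{-\frac{q}{2} + \frac{1}{4}} + 2 \big(\ln(n^{\alpha})\big)^{\frac{2\nu - 1}{\alpha}} \right],$$

where the constant $C_3 > 0$ depends on $q$, $K$, $T$.

Proof. For each $\varepsilon > 0$, from the definition of $s_{\varepsilon}$ in (17), we get $\int_{|t| \ge s_{\varepsilon}} |g(t)|\, dt \le \varepsilon$. Moreover, from the property of the infimum, we have $\int_{|t| \ge s_{\varepsilon} - \eta} |g(t)|\, dt > \varepsilon$ for all $\eta > 0$, which implies $\int_{|t| \ge s_{\varepsilon}} |g(t)|\, dt \ge \varepsilon$. So,

$$\int_{|t| \ge s_{\varepsilon}} |g(t)|\, dt = \varepsilon.$$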



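If $s_{\varepsilon} > L$ then $\int_{|t| \ge s_{\varepsilon}} |g(t)|\, dt = 0$, which is a contradiction. So $s_{\varepsilon} \le L$ for all $\varepsilon > 0$ small enough.

From (21), for all $\varepsilon > 0$ small enough, we have

$$R_{\varepsilon}^2 \ge \frac{1}{15 L (2q + 1) e^4} \left[ \frac{1}{\beta} \ln\left( \frac{1}{\varepsilon} \right) + \ln\left( \frac{1}{1 + \varepsilon^{1 - \beta}} \right) \right] \ge \frac{1}{30 L (2q + 1) e^4} \ln\left( \frac{1}{\varepsilon} \right).$$

It follows that

$$R_{\varepsilon}^{-q + \frac{1}{2}} \le \left( \frac{1}{30 L (2q + 1) e^4} \right)^{-\frac{q}{2} + \frac{1}{4}} \left( \ln\left( \frac{1}{\varepsilon} \right) \right)^{-\frac{q}{2} + \frac{1}{4}}.$$

Therefore, combining with the proof of Theorem 3.1, we derive

$$E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})} \le C_3 \left[ \left( \frac{1}{30 L (2q + 1) e^4} \right)^{-\frac{q}{2} + \frac{1}{4}} \left( \ln\left( \frac{1}{\varepsilon} \right) \right)^{-\frac{q}{2} + \frac{1}{4}} + \frac{\delta^2}{\varepsilon^{4\beta}} + \frac{1}{n\delta^2} \right].$$

Substituting $\varepsilon = n^{-\alpha}$, $\delta = n^{-\nu}$, we get

$$E\|f_n - f\|^2_{L^2(\mathbb{R})} \le C_3 \left[ \left( \frac{1}{30 L (2q + 1) e^4} \right)^{-\frac{q}{2} + \frac{1}{4}} \big(\ln(n^{\alpha})\big)^{-\frac{q}{2} + \frac{1}{4}} + 2 \big(\ln(n^{\alpha})\big)^{\frac{2\nu - 1}{\alpha}} \right]$$

for all $n \in \mathbb{N}$ large enough. The proof of the theorem is completed. □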

Thus, from Theorem 3.1 we see that the MISE of our estimator attains a logarithmic rate. The following theorem shows that logarithmic rates are unavoidable.

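Theorem 3.3. Let $s_0 > 0$, $\gamma \in (1, 2)$, $K \ge 1$, $M \ge 1$, $T > \frac{1}{2}$, $q > \frac{1}{2}$ and let $m$ be an integer greater than $q$. Then, for all $\delta > 0$ small enough, we have

$$\sup_{g \in \mathcal{G}_{s_0,\gamma,M,T}} \left( \sup_{f \in \mathcal{F}_{q,K}} E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})} \right) \ge \frac{4^{-m}}{8\pi m} \left( \ln\left( \frac{1}{\delta} \right) \right)^{-2m}.$$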

Proof. We consider the density function $g_0(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. The function $g_0$ belongs to $\mathcal{G}_{s_0,\gamma,M,T}$ and has $g_0^{ft}(t) = e^{-t^2/2}$. With the density function $\psi(x) = \frac{1}{2} e^{-|x|}$ of the Laplace distribution (also known as the double-exponential density; see p. 35, Sec. 2.4 in Meister, 2009), the function

$$f_0 = \underbrace{\psi * \psi * \cdots * \psi}_{m \text{ times}}$$

is also a density function. This function has $f_0^{ft}(t) = \big(\psi^{ft}(t)\big)^m = \frac{1}{(1 + t^2)^m}$. Since

$$|f_0^{ft}(t)|^2 = \frac{1}{(1 + t^2)^{2m}} \le \frac{1}{(1 + t^2)^m} \le \frac{K}{(1 + t^2)^q},$$


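we have $f_0 \in \mathcal{F}_{q,K}$. We denote

$$H_{\delta} = \sup_{g \in \mathcal{G}_{s_0,\gamma,M,T}} \left( \sup_{f \in \mathcal{F}_{q,K}} E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})} \right).$$

From the equality (13) of Lemma 2.1, for all $f \in \mathcal{F}_{q,K}$ and $g \in \mathcal{G}_{s_0,\gamma,M,T}$, we have

$$E\|L_{\delta,g} - f\|^2_{L^2(\mathbb{R})} \ge \frac{1}{2\pi} \int_{-\infty}^{+\infty} |f^{ft}(t)|^2\, \frac{\delta^2}{\big(\delta + |g^{ft}(t)|^2\big)^2}\, dt.$$

Therefore,

$$\begin{aligned}
H_{\delta} &\ge \sup_{f \in \mathcal{F}_{q,K}} E\|L_{\delta,g_0} - f\|^2_{L^2(\mathbb{R})} \ge E\|L_{\delta,g_0} - f_0\|^2_{L^2(\mathbb{R})} \\
&\ge \frac{1}{2\pi} \int_{-\infty}^{+\infty} |f_0^{ft}(t)|^2\, \frac{\delta^2}{\big(\delta + |g_0^{ft}(t)|^2\big)^2}\, dt \\
&= \frac{1}{2\pi} \int_{-\infty}^{+\infty} \frac{\delta^2}{(1 + t^2)^{2m} \big(\delta + e^{-t^2}\big)^2}\, dt \\
&\ge \frac{1}{2\pi} \int_{e^{-t^2} \le \delta} \frac{\delta^2}{(1 + t^2)^{2m} \big(\delta + e^{-t^2}\big)^2}\, dt \\
&= \frac{1}{\pi} \int_{\sqrt{\ln(1/\delta)}}^{+\infty} \frac{\delta^2}{(1 + t^2)^{2m} \big(\delta + e^{-t^2}\big)^2}\, dt \\
&\ge \frac{1}{4\pi} \int_{\sqrt{\ln(1/\delta)}}^{+\infty} \frac{1}{(1 + t^2)^{2m}}\, dt \\
&\ge \frac{1}{4\pi} \int_{\sqrt{\ln(1/\delta)}}^{+\infty} \frac{2t}{(1 + t^2)^{2m + 1}}\, dt.
\end{aligned}$$

By direct computation, we derive

$$H_{\delta} \ge \frac{1}{8\pi m} \left( 1 + \ln\left( \frac{1}{\delta} \right) \right)^{-2m} \ge \frac{4^{-m}}{8\pi m} \left( \ln\left( \frac{1}{\delta} \right) \right)^{-2m}$$

for all $\delta > 0$ small enough. The proof of the theorem is completed. □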
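The chain of inequalities above can be checked numerically for specific values. The sketch below compares a trapezoidal approximation of the bias integral for the pair $(f_0, g_0)$ with the closed-form lower bound of Theorem 3.3; the truncation level, step count and the choice $m = 1$ are assumptions of this illustration.

```python
import math

def bias_integral(delta, m, t_max=50.0, n_steps=100000):
    """(1/2pi) * integral of delta^2 / ((1+t^2)^(2m) (delta + e^(-t^2))^2) over |t| <= t_max."""
    step = 2.0 * t_max / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        t = -t_max + k * step
        weight = 0.5 if k in (0, n_steps) else 1.0     # trapezoid weights
        total += weight * delta ** 2 / (
            (1.0 + t * t) ** (2 * m) * (delta + math.exp(-t * t)) ** 2
        )
    return total * step / (2.0 * math.pi)

def lower_bound(delta, m):
    """Right-hand side of Theorem 3.3: 4^(-m) / (8 pi m) * (ln(1/delta))^(-2m)."""
    return 4.0 ** (-m) / (8.0 * math.pi * m) * math.log(1.0 / delta) ** (-2 * m)

for delta in (1e-2, 1e-3, 1e-6):
    print(delta, bias_integral(delta, m=1), lower_bound(delta, m=1))
```

Both columns decrease only logarithmically in $1/\delta$, which is the behaviour the theorem formalizes.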

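Choosing $\delta = n^{-\nu}$ with $\nu = \frac{1}{4} + \alpha\beta$, $\alpha \in (0, 1)$, $\beta \in (0, 1)$, $\alpha\beta < \frac{1}{4}$ as in Theorem 3.1 and denoting $f_n(x) = L_{\delta,g}(x; Y_1, \ldots, Y_n)$, we get the following corollary.

Corollary 3.1. Let the assumptions be as in Theorems 3.1 and 3.3. Then

$$\sup_{g \in \mathcal{G}_{s_0,\gamma,M,T}} \left( \sup_{f \in \mathcal{F}_{q,K}} E\|f_n - f\|^2_{L^2(\mathbb{R})} \right) \ge \frac{4^{-m}}{8\pi m} \big(\nu \ln(n)\big)^{-2m}$$

for all $n \in \mathbb{N}$ large enough.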



4. Proof of Lemma 3.1

To estimate the Lebesgue measure of low level sets, we shall use the following result (see Theorem 4, Sec. 11.3 in Levin, 1996).

Lemma 4.1. Let $f(z)$ be an analytic function in the disk $\{z : |z| \le 2eR\}$, $|f(0)| = 1$, and let $\eta$ be an arbitrary small positive number. Then the estimate

$$\ln |f(z)| > -\ln\left( \frac{15 e^3}{\eta} \right) \cdot \ln M_f(2eR)$$

is valid everywhere in the disk $\{z : |z| \le R\}$ except on a set of disks $(D_j)$ with sum of radii $\sum r_j \le \eta R$, where $M_f(r) = \max_{|z| = r} |f(z)|$.

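Using the latter lemma, we now state and prove an estimate for the Lebesgue measure of the low level sets.

Theorem 4.1. Let the density function $g$ be in $\mathcal{G}_{s_0,\gamma,M,T}$, where $s_0 > 0$, $\gamma \in (1, 2)$, $M \ge 1$, $T > \frac{1}{2}$, and let $\beta \in (0, 1)$, $q > \frac{1}{2}$. For $\varepsilon > 0$ small enough, we choose $s_{\varepsilon}$ as in (17) and choose $R_{\varepsilon}$ to satisfy

$$2 e s_{\varepsilon} R_{\varepsilon} \left[ \left( q + \frac{1}{2} \right) \ln R_{\varepsilon} + \ln\big(15 e^3\big) \right] = -\ln\big(\varepsilon^{\beta} + \varepsilon\big). \tag{20}$$

Then,

$$\lim_{\varepsilon \to 0} R_{\varepsilon} = +\infty.$$

Moreover, if $\varepsilon$ is small enough, we have

$$m\big(D_{\varepsilon^{\beta} + \varepsilon}\big) \le 2 R_{\varepsilon}^{-q + \frac{1}{2}},$$

where

$$D_{\varepsilon^{\beta} + \varepsilon} = \left\{ z \in \mathbb{R} : |\Phi_{\varepsilon}(z)| < \varepsilon^{\beta} + \varepsilon,\ |z| < R_{\varepsilon} \right\}$$

with

$$\Phi_{\varepsilon}(z) = \int_{-s_{\varepsilon}}^{s_{\varepsilon}} g(t)\, e^{zt}\, dt, \quad z \in \mathbb{C}.$$

Proof. The proof of the theorem is divided into two steps. In Step 1, we establish the existence of $R_{\varepsilon}$ and prove that $\lim_{\varepsilon \to 0} R_{\varepsilon} = +\infty$. In Step 2, we estimate $m\big(D_{\varepsilon^{\beta} + \varepsilon}\big)$.

Step 1. We consider the function

$$\psi(R) = 2 e s_{\varepsilon} R \left[ \left( q + \frac{1}{2} \right) \ln R + \ln\big(15 e^3\big) \right] + \ln\big(\varepsilon^{\beta} + \varepsilon\big), \quad R \ge 0.$$

We have $\psi(R) \to +\infty$ as $R \to +\infty$ and $\psi(R) \to \ln\big(\varepsilon^{\beta} + \varepsilon\big) < 0$ as $R \to 0$ for $\varepsilon > 0$ small enough. So there exists an $R_{\varepsilon} > 0$ such that $\psi(R_{\varepsilon}) = 0$, i.e., $R_{\varepsilon}$ satisfies (20). Also,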


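from (20), we get

$$(2q + 1)\, e R_{\varepsilon} \ln\big(15 e^3 R_{\varepsilon}\big) \ge \frac{1}{s_{\varepsilon}} \ln\left( \frac{1}{\varepsilon^{\beta} + \varepsilon} \right).$$

In view of the inequality $\ln x \le x$ for all $x > 0$, we have

$$R_{\varepsilon}^2 \ge \frac{1}{15 e^4 (2q + 1)} \cdot \frac{\beta \ln\left( \frac{1}{\varepsilon} \right) + \ln\left( \frac{1}{1 + \varepsilon^{1 - \beta}} \right)}{s_{\varepsilon}}. \tag{21}$$

From the definition of $s_{\varepsilon}$ in (17), we get $\int_{|t| \ge s_{\varepsilon}} |g(t)|\, dt = \varepsilon$. Thus,

$$M e^{-s_0 (s_{\varepsilon})^{\gamma}} \ge e^{-s_0 (s_{\varepsilon})^{\gamma}} \int_{-\infty}^{+\infty} e^{s_0 |t|^{\gamma}} |g(t)|\, dt \ge e^{-s_0 (s_{\varepsilon})^{\gamma}} \int_{|t| \ge s_{\varepsilon}} e^{s_0 |t|^{\gamma}} |g(t)|\, dt \ge \varepsilon.$$

The latter inequality implies

$$s_{\varepsilon} \le \left( \frac{1}{s_0} \right)^{\frac{1}{\gamma}} \left( \ln\left( \frac{M}{\varepsilon} \right) \right)^{\frac{1}{\gamma}}. \tag{22}$$

From (21) and (22), we get

$$R_{\varepsilon}^2 \ge \frac{1}{15 e^4 (2q + 1)} \left[ \beta\, s_0^{1/\gamma}\, \frac{\ln\left( \frac{1}{\varepsilon} \right)}{\left( \ln\left( \frac{M}{\varepsilon} \right) \right)^{\frac{1}{\gamma}}} + s_0^{1/\gamma}\, \frac{\ln\left( \frac{1}{1 + \varepsilon^{1 - \beta}} \right)}{\left( \ln\left( \frac{M}{\varepsilon} \right) \right)^{\frac{1}{\gamma}}} \right]. \tag{23}$$

As

$$\lim_{\varepsilon \to 0} \left[ \frac{\ln\left( \frac{1}{\varepsilon} \right)}{\left( \ln\left( \frac{M}{\varepsilon} \right) \right)^{\frac{1}{\gamma}}} \Bigg/ \left( \ln\left( \frac{M}{\varepsilon} \right) \right)^{\frac{1}{2} - \frac{1}{2\gamma}} \right] = +\infty,$$

we obtain

$$\frac{\ln\left( \frac{1}{\varepsilon} \right)}{\left( \ln\left( \frac{M}{\varepsilon} \right) \right)^{\frac{1}{\gamma}}} \ge \left( \ln\left( \frac{M}{\varepsilon} \right) \right)^{\frac{1}{2} - \frac{1}{2\gamma}}, \tag{24}$$

for all $\varepsilon > 0$ small enough. Furthermore, since

$$\lim_{\varepsilon \to 0} \frac{\ln\left( \frac{1}{\varepsilon} \right)}{\left( \ln\left( \frac{M}{\varepsilon} \right) \right)^{\frac{1}{\gamma}}} = +\infty, \qquad \lim_{\varepsilon \to 0} \frac{\ln\left( \frac{1}{1 + \varepsilon^{1 - \beta}} \right)}{\left( \ln\left( \frac{M}{\varepsilon} \right) \right)^{\frac{1}{\gamma}}} = 0,$$

we have

$$R_{\varepsilon}^2 \ge \frac{\beta\, s_0^{1/\gamma}}{30 (2q + 1) e^4} \left( \ln\left( \frac{M}{\varepsilon} \right) \right)^{\frac{1}{2} - \frac{1}{2\gamma}} \ge \frac{\beta\, s_0^{1/\gamma}}{30 (2q + 1) e^4} \left( \ln\left( \frac{1}{\varepsilon} \right) \right)^{\frac{1}{2} - \frac{1}{2\gamma}} \tag{25}$$

for all $\varepsilon > 0$ small enough. From the inequality (25), we obtain $R_{\varepsilon} \to +\infty$ as $\varepsilon \to 0$.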



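Step 2. We see that $\Phi_{\varepsilon}$ is an entire function, i.e., a complex function analytic on $\mathbb{C}$, and moreover

$$|\Phi_{\varepsilon}(z)| = \left| \int_{-s_{\varepsilon}}^{s_{\varepsilon}} g(t)\, e^{zt}\, dt \right| \le \int_{-s_{\varepsilon}}^{s_{\varepsilon}} |g(t)|\, e^{|z||t|}\, dt \le e^{|z| s_{\varepsilon}}, \quad z \in \mathbb{C}. \tag{26}$$

Since $g$ is a density function, $\Phi_{\varepsilon}$ is a nontrivial entire function, and so there exists an $x_0 \in \mathbb{R}$ such that $|\Phi_{\varepsilon}(x_0)| = C_4 > 0$. Changing variables if necessary, we may assume that $|\Phi_{\varepsilon}(0)| = 1$ if $\varepsilon > 0$ is small enough.

For all $|z| = 2eR_{\varepsilon}$, from (26), we get $|\Phi_{\varepsilon}(z)| \le e^{2 e s_{\varepsilon} R_{\varepsilon}}$ and hence

$$\ln M_{\Phi_{\varepsilon}}(2eR_{\varepsilon}) \equiv \ln\left( \max_{|z| = 2eR_{\varepsilon}} |\Phi_{\varepsilon}(z)| \right) \le 2 e s_{\varepsilon} R_{\varepsilon}.$$

We choose $\eta_{\varepsilon} = R_{\varepsilon}^{-q - \frac{1}{2}}$. Then, for all $|z| \le R_{\varepsilon}$, applying Lemma 4.1, we have the estimate

$$\begin{aligned}
|\Phi_{\varepsilon}(z)| &\ge \exp\left\{ -\ln\left( \frac{15 e^3}{R_{\varepsilon}^{-q - \frac{1}{2}}} \right) \ln M_{\Phi_{\varepsilon}}(2eR_{\varepsilon}) \right\} \\
&\ge \exp\left\{ -2 e s_{\varepsilon} R_{\varepsilon} \left[ \left( q + \frac{1}{2} \right) \ln R_{\varepsilon} + \ln\big(15 e^3\big) \right] \right\} = \varepsilon^{\beta} + \varepsilon
\end{aligned}$$

except on a set of disks $\{D(z_j, r_j)\}_{j \in J}$ whose sum of radii is less than $\eta_{\varepsilon} R_{\varepsilon} = R_{\varepsilon}^{-q + \frac{1}{2}}$. This implies

$$D_{\varepsilon^{\beta} + \varepsilon} \equiv \left\{ z \in \mathbb{R} : |\Phi_{\varepsilon}(z)| < \varepsilon^{\beta} + \varepsilon,\ |z| < R_{\varepsilon} \right\} \subset \bigcup_{j \in J} \big( D(z_j, r_j) \cap \mathbb{R} \big), \tag{27}$$

where we recall

$$D(z_j, r_j) = \left\{ z \in \mathbb{C} : |z - z_j| < r_j \right\}, \quad j \in J.$$

From (27) we derive

$$m\big(D_{\varepsilon^{\beta} + \varepsilon}\big) \le m\left( \bigcup_{j \in J} \big( D(z_j, r_j) \cap \mathbb{R} \big) \right) \le \sum_{j \in J} m\big( D(z_j, r_j) \cap \mathbb{R} \big) \le \sum_{j \in J} 2 r_j \le 2 R_{\varepsilon}^{-q + \frac{1}{2}}$$

for all $\varepsilon > 0$ small enough. This completes the proof of Step 2 and the proof of our theorem. □

Finally, we turn to the following proof.

Proof of Lemma 3.1. For each $\varepsilon > 0$, we put

$$g_{\varepsilon}(t) = \begin{cases} g(t), & |t| \le s_{\varepsilon}, \\ 0, & |t| > s_{\varepsilon}. \end{cases}$$

We see that $g_{\varepsilon}^{ft}(x) = \Phi_{\varepsilon}(ix)$ if $x \in \mathbb{R}$. For all $x \in \mathbb{R}$, we have

$$\big| g_{\varepsilon}^{ft}(x) - g^{ft}(x) \big| = \left| \int_{-s_{\varepsilon}}^{s_{\varepsilon}} g(t)\, e^{itx}\, dt - \int_{-\infty}^{+\infty} g(t)\, e^{itx}\, dt \right| \le \int_{|t| \ge s_{\varepsilon}} |g(t)|\, dt = \varepsilon.$$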


Thus, if $|g^{ft}(x)| < \varepsilon^{\beta}$ and $|x| < R_{\varepsilon}$, then $|g_{\varepsilon}^{ft}(x)| < \varepsilon^{\beta} + \varepsilon$. This implies $|\Phi_{\varepsilon}(ix)| < \varepsilon^{\beta} + \varepsilon$. Applying Theorem 4.1, we get

$$m\big(B_{\varepsilon^{\beta}, R_{\varepsilon}}\big) \le m\big(D_{\varepsilon^{\beta} + \varepsilon}\big) \le 2 R_{\varepsilon}^{-q + \frac{1}{2}}$$

for all $\varepsilon > 0$ small enough. The proof of the lemma is completed. □

Acknowledgments

We thank the referees for their kind and careful reading of the article and for helpful comments and suggestions which led to this improved version.

Funding

This article is supported by National Foundation of Scientific and Technology Development (NAFOSTED) Project 101.01-2012.07.

References

Carroll, R., Hall, P. (1988). Optimal rates of convergence for deconvolving a density. J. Amer. Statist. Assoc. 83(404):1184–1186.

Delaigle, A., Meister, A. (2011). Nonparametric function estimation under Fourier-oscillating noise. Statistica Sinica 21:1065–1092.

Devroye, L. (1989). Consistent deconvolution in density estimation. Canad. J. Statist. 2:235–239.

Fan, J. (1991a). Asymptotic normality for deconvolution kernel density estimators. Sankhya 53:97–110.

Fan, J. (1991b). On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist. 19(3):1257–1272.

Goldenshluger, A. (2002). Density deconvolution in the circular structural model. J. Multivariate Anal. 81:360–375.

Groeneboom, P., Jongbloed, G. (2003). Density estimation in the uniform deconvolution model. Stat. Neerlandica 57:136–157.

Hall, P., Meister, A. (2007). A ridge-parameter approach to deconvolution. Ann. Statist. 35(4):1535–1558.

Kirsch, A. (1996). An Introduction to the Mathematical Theory of Inverse Problems. New York: Springer-Verlag.

Levin, B. Y. (1996). Lectures on Entire Functions. Trans. Math. Monographs, vol. 150. Providence, Rhode Island: AMS.

Meister, A. (2005). Non-estimability in spite of identifiability in density deconvolution. Mathemat. Meth. Statist. 14(4):479–487.

Meister, A. (2007). Deconvolving compactly supported densities. Mathemat. Meth. Statist. 16(1):63–76.

Meister, A. (2009). Deconvolution Problems in Nonparametric Statistics. Berlin Heidelberg: Springer-Verlag.

Stefanski, L., Carroll, R. (1990). Deconvoluting kernel density estimators. Statistics 169–184.

Trong, D. D., Tuyen, T. T. (2006). Error of Tikhonov’s regularization for integral convolution equation. http://arxiv.org/abs/math/0610046, Version 1, October 1.
