ieee transactions on image processing, may 2005 1 …jzhou2/publications/journal_nsct.pdfthe...

IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 1

The Nonsubsampled Contourlet Transform:

Theory, Design, and ApplicationsArthur L. Cunha,Student Member, IEEE, Jianping Zhou,Student Member, IEEE,

and Minh N. Do,Member, IEEE,

Abstract

The contourlet transform was proposed to address the lack ofgeometrical structure in the separable

two-dimensional wavelet transform. Because of its filter bank structure the contourlet transform is not

shift-invariant. In this paper we propose the nonsubsampled contourlet transform (NSCT) and study its

applications. The construction proposed in this paper is based on a nonsubsampled pyramid structure

and nonsubsampled directional filter banks. The result is a flexible multi-scale, multi-direction, and

shift-invariant image decomposition that can be efficiently implemented via thea trous algorithm. At

the core of the proposed scheme is the nonseparable two-channel nonsubsampled filter bank. We study

the filter design problem and propose a design framework based on the mapping approach. We exploit

the less stringent design condition of the nonsubsampled filter bank to design filters that lead to a

NSCT with better frequency selectivity and regularity whencompared to the contourlet transform. The

proposed design allows for a fast implementation based on a lifting or ladder structure that only uses

one-dimensional filtering in some cases. In addition, our design ensures that the corresponding frame

elements are regular, symmetric, and the frame is close to a tight one. We assess the performance of

the NSCT in image denoising and enhancement applications. In both applications the NSCT compares

favorably to other existing methods in the literature.

Index Terms

Multidimensional Filter Banks, Nonsubsampled Filter Banks, Frames, Contourlet transform, Image

Denoising, Image Enhancement.

A. L. Cunha and J. Zhou are with the Department of Electrical and Computer Engineering and the Coordinated Science

Laboratory, University of Illinois at Urbana-Champaign, Urbana IL 61801 (email:{cunhada,jzhou2}@uiuc.edu)

M. N. Do is with the Department of Electrical and Computer Engineering, theCoordinated Science Laboratory, and the

Beckman Institute, University of Illinois at Urbana-Champaign, UrbanaIL 61801 (email: [email protected]).

This work was supported by a CAPES fellowship, Brazil, and the U.S. National Science Foundation under grant CCR-0237633

(CAREER).

May 19, 2005 DRAFT


I. I NTRODUCTION

A number of image processing tasks are efficiently carried outin a domain other than the pixel domain,

often by means of an invertible linear transformation. For example, image compression and denoising

are efficiently done in the wavelet transform domain [1], [2].A good transform or representation would

capture the essence of a given signal or class of signals withfew basis elements. The set of basis functions

completely characterizes the transform and this set can be redundant or not, depending on whether the

basis functions are linear dependent. By allowing redundancy, it is possible to enrich the set of basis

functions so that the representation is more efficient in capturing some signal behavior. Applications such

as edge detection, contour detection, denoising, and imagerestoration can greatly benefit from redundant

representations.

In the context of multi-scale expansions implemented with filter banks, dropping the linear inde-

pendence requirement offers the possibility of a shift-invariant expansion, a crucial property in some

applications [3]. For instance, in image denoising by thresholding in the wavelet domain, the lack of

shift-invariance causes pseudo-Gibbs phenomena around singularities [4]. Not surprisingly, most state-

of-the-art wavelet denoising routines (see for example [5], [6], [7]) use an expansion with less shift

sensitivity than the standard maximally decimated waveletdecomposition — the most common being the

nonsubsampled wavelet transform (NSWT) computed with thea trousalgorithm [8] 1.

In addition to shift-invariance, it has been recognized that an efficient image representation has to

account for the geometrical structure pervasive in naturalscenes. In this direction, several representation

schemes have recently been proposed. These include adaptiveschemes such as wedgelets [9], [10] and

bandelets [11], and non-adaptive ones such as curvelets [12] and contourlets [13]. The contourlet transform

is a multi-directional and multi-scale transform that is constructed by combining the Laplacian pyramid

[14], [15] and the directional filter bank (DFB) [16]. Due to downsamplers and upsamplers present in

both the Laplacian pyramid and the DFB, the contourlet transform is not shift-invariant.

In this paper we propose an overcomplete transform that we call the nonsubsampled contourlet

transform (NSCT). The NSCT is a fully shift-invariant, multi-scale, and multi-direction expansion that

has better directional frequency localization and a fast implementation. In essence, the proposed structure

achieves a similar subband decomposition as that of contourlets, but without downsamplers and upsam-

plers in it. Because of its redundancy, the filter design problem of the NSCT is much less constrained than

1Denoising by thresholding in the NSWT domain can also be realized by denoising multiple circular shifts of the signal with

a critically sampled wavelet transform and then averaging the results. Thishas been termedcycle spinningafter [4].

May 19, 2005 DRAFT


that of contourlets. This enables us to design filters with better frequency selectivity thereby achieving

a better subband decomposition. Using the mapping approachwe provide a framework for filter design

that ensures good frequency selectivity in addition to having a fast implementation through ladders steps.

The NSCT has proven to be very efficient in image noise removal andimage enhancement as we show

in this paper.

The paper is structured as follows. In Section II we describe the NSCT and its building blocks. We

introduce a pyramid structure that ensures the multi-scalefeature of the NSCT and the directional filtering

structure based on the DFB. The basic unit in our construction is the nonsubsampled filter bank (NSFB)

which is discussed in Section II. In Section III we study the issues associated with the NSFB design and

implementation problems. We propose a design methodology based on one-dimensional (1-D) prototype

filters that allows fast implementation through ladder steps. Several examples are presented. Applications

of the NSCT in image denoising and enhancement are discussed in Section IV. Concluding remarks are

drawn in Section V.

Notation:Throughout the paper, a two-dimensional (2-D) filter is denoted byH(z) wherez = [z1, z2]T .

If m = [m1, m2]T is a 2-D vector, thenzm = zm1

1 zm2

2 whereas ifM is a 2 × 2 matrix, thenzM =

[zm1 , zm2 ] with m1,m2 the columns ofM. In this paper we often deal with zero-phase 2-D filters.

Such filters can be written as polynomials incos ω = (cos ω1, cos ω2). We thus writeF (x1, x2) for a

zero-phase filter in whichx1 andx2 denotecos ω1 andcos ω2 respectively.

II. N ONSUBSAMPLEDCONTOURLETS ANDFILTER BANKS

A. The Nonsubsampled Contourlet Transform

Figure 1 (a) displays a high level view of the proposed NSCT. The structure consists in a bank of filters

that splits the 2-D frequency plane in the subbands illustrated in Figure 1(b). Our proposed transform

can thus be divided into two parts that are both shift-invariant: (1) A nonsubsampled pyramid structure

that ensures the multi-scale property and (2) A nonsubsampled DFB structure that gives directionality.

1) The Nonsubsampled Pyramid (NSP):What gives the multi-scale property of the NSCT is a shift-

invariant filtering structure that achieves a subband decomposition similar to that of the Laplacian pyramid.

Our solution is obtained by using two-channel nonsubsampled 2-D filter banks. Figure 2 illustrates

the proposednonsubsampled pyramid(NSP) decomposition withJ = 3 stages. Such expansion is

conceptually similar to the 1-D nonsubsampled wavelet transform computed with thea trous algorithm

[8] and hasJ +1 redundancy, whereJ denotes the number of decomposition stages. The ideal passband

support of the lowpass filter at thej-th stage is the region[− π2j ,

π2j ]2. Accordingly, the ideal support of

May 19, 2005 DRAFT


directional

subbands

Bandpass

directional

subbands

Image

Bandpass

(a)

(π, π)

(−π,−π)

ω1

ω2

(b)

Fig. 1. The nonsubsampled contourlet transform. (a) Nonsubsampled filter bank structure that implements the NSCT. (b) The

idealized frequency partitioning obtained with the proposed structure.

the equivalent highpass filter is the complement of the lowpass, i.e., the region[− π2j−1 ,

π2j−1 ]2\[− π

2j ,π2j ]2.

The filters for subsequent stages are obtained by upsampling the filters of the first stage. This gives the

multi-scale property without the need for additional filter design. The proposed structure is thus different

from the separable nonsubsampled wavelet transform (NSWT). In particular, one bandpass image is

produced at each stage resulting inJ + 1 redundancy. By contrast, the separable NSWT produces three

directional images at each stage resulting in3J + 1 redundancy.

H0(z)

H1(z)

H0(z2I)

H1(z2I)

H0(z4I)

H1(z4I)

x

y0

y1

y2

y3

(a)

0 1 2 3

(π, π)

(−π,−π)

ω1

ω2

(b)

Fig. 2. The proposed nonsubsampled pyramid is a 2-D multi-resolution expansion similar to the 1-D nonsubsampled wavelet

transform. (a) A three-stage pyramid decomposition. The lighter gray regions denote the aliasing caused by upsampling. (b) The

subbands on the 2-D frequency plane.

The proposed pyramid is not the only 2-D pyramid withJ +1 redundancy. The 2-D pyramid proposed

May 19, 2005 DRAFT


in [17] is obtained with a similar structure. Specifically, theNSFB of [17] is built from a given lowpass

filter H0(z). One then setsH1(z) = 1 − H0(z), andG0(z) = G1(z) = 1. This perfect reconstruction

system can be seen as a particular case of our more general structure. The advantage of our construction

is that it is less restrictive and as a result, better filters can be obtained.

2) The Nonsubsampled Directional Filter Bank (NSDFB):The directional filter bank of Bamberger and

Smith [16] is constructed by combining critically-sampled two-channel fan filter banks and resampling

operations. The result is a tree-structured filter bank that splits the frequency plane in the directional

wedges shown in Figure 3 (a).

37

6

5

4 7

6

5

4

0123

0 1 2

(π, π)

(−π,−π)

ω1

ω2

(a)

+

Ueq0 (z)

Ueq1 (z)

Ueq

L−1(z)

Veq0 (z)

Veq1 (z)

Veq

L−1(z)

S0S0

S1S1

SL−1SL−1

xx

y0

y1

yL−1

(b)

Fig. 3. The directional filter bank. (a) Ideal partitioning of the 2-D frequency plane into23 = 8 wedges. (b) Equivalent

multi-channel filter bank structure. The number of channels isL = 2l, wherel is the number of stages in the tree structure.

Using multirate identities [18], the tree-structured DFB can be put into the equivalent form shown in

Figure 3 (b), where the downsampling/upsampling matricesSk for 0 ≤ k ≤ 2l − 1 are given by

Sk =

diag(2l−1, 2

), for 0 ≤ k ≤ 2l−1 − 1;

diag(2, 2l−1

), for 2l−1 ≤ k ≤ 2l − 1,

(1)

with l denoting the number of stages in the tree structure. It is clear from the above that the DFB is not

shift-invariant. A shift-invariant directional expansion is obtained with a nonsubsampled DFB (NSDFB).

The NSDFB is constructed by eliminating the downsamplers and upsamplers in Figure 3 (b). This is

equivalent to switching off the downsamplers in each two-channel filter bank in the DFB tree structure and

upsampling the filters accordingly. This results in a tree composed of two-channel nonsubsampled filter

banks. The equivalent filter in thek-th channel of the analysis NSDFB tree is the filterU eqk (z) in Figure 3

(b). Thus, each filterU eqk (z) is a product ofl simpler filters. Just like the DFB, all NSFB’s in the NSDFB

tree structure are obtained from a single NSFB with fan filters (see Figure 6 (b)). Figure 4 illustrates a

May 19, 2005 DRAFT


four channel decomposition. Note that in the second level, the upsampled fan filtersUi(zQ), i = 0, 1

have checker-board frequency support, and when combined with the filters in the first level give the four

directional frequency decomposition shown in Figure 4. The synthesis filter bank is obtained similarly.

Just like the NSP case, each filter bank in the NSDFB tree has the same computational complexity as

that of the prototype NSFB.

U0(z)

U1(z)

U0(zQ)

U0(zQ)

U1(zQ)

U1(zQ)

x

y0

y1

y2

y3

(a)

0 13

2

(π, π)

(−π,−π)

ω1

ω2

(b)

Fig. 4. A four-channel nonsubsampled directional filter bank constructed with two-channel fan filter banks. (a) Filtering structure.

The equivalent filter in each channel is given byUeq

k (z) = Ui(z)Uj(zQ). (b) Corresponding frequency decomposition.

3) Combining the NSP and NSDFB in the NSCT:The NSCT is constructed by combining the NSP

and the NSDFB as shown in Figure 1 (a). In constructing the nonsubsampled contourlet transform, care

must be taken when applying the directional filters to the coarser scales of the pyramid. Due to the

tree-structure nature of the NSDFB, the directional responseat the lower and upper frequencies suffers

from aliasing which can be a problem in the upper stages of thepyramid (see Figure 9). This is illustrated

in Figure 5 (a), where the passband region of the directional filter is labelled as “Good” or “Bad”. Thus

we see that for coarser scales, the highpass channel in effect is filtered with the bad portion of the

directional filter passband. This results in severe aliasing and in some observed cases a considerable loss

of directional resolution.

We remedy this by judiciously upsampling the NSDFB filters. Denote thek-th directional filter by

Uk(z). Then for higher scales, we substituteUk(z2mI) for Uk(z) wherem is chosen to ensure that the good

part of the response overlaps with the pyramid passband. Figure 5 (b) illustrates a typical example. Note

that this modification preserves perfect reconstruction. Ina typical 5-scale decomposition, we upsample

May 19, 2005 DRAFT


��

��

��

��

��

replacements (π, π)

(−π,−π)

ω1

ω2

(a)

��

��

“Good”“Bad”

��

��

��

��

��

��

��

��

��

��

(π, π)

(−π,−π)

ω1

ω2

(b)

Fig. 5. The need for upsampling in the NSCT. (a) With no upsampling, the highpass at higher scales will be filtered by the

portion of the directional filter that has “bad” response. (b) Upsamplingensures that filtering is done in the “good” region.

by 2I the NSDFB filters of the last two stages.

Filtering with the upsampled filters does not increase computational complexity. Specifically, for a

given sampling matrixS and a 2-D filterH(z), to obtain the outputy[n] resulting from filteringx[n]

with H(zS), we use the convolution formula

y[n] =∑

k∈supp(h)

h[k]x[n − Sk]. (2)

Therefore, each filter in the NSDFB tree has the same complexity asthat of the building-block fan

NSFB. Likewise, each filtering stage of the NSP has the same complexity as that incurred by the first

stage. Thus, the complexity of the NSCT is dictated by the complexity of the building-block NSFB’s. If

each NSFB in both NSP and NSDFB requiresL operations per output sample, then for an image ofN

pixels the NSCT requires aboutBNL operations whereB denotes the number of subbands. For instance,

if L = 32, a typical decomposition with 4 pyramid levels, 16 directions in the two finer scales and 8

directions in the two coarser scales would require a total of1536 operations per image pixel.

If the building block 2-channel NSFB’s in the NSP and NSDFB are invertible, then clearly the NSCT is

invertible and thus it implements a frame expansion (see Section II-C). The frame elements are localized

in space and oriented along a discrete set of directions. The NSCT is flexible in that it allows any

number of2l directions in each scale. In particular, it can satisfy the anisotropic scaling law — a key

property in establishing the expansion nonlinear approximation behavior [12], [13]. This property is

ensured by doubling the number of directions in the NSDFB expansion at every other scale. The NSCT

has redundancy given by1 +∑J

j=1 2lj , wherelj denotes the number of levels in the NSDFB at thej-th

May 19, 2005 DRAFT


scale.

B. Nonsubsampled Filter Banks

At the core of the proposed NSCT structure is the 2-D two-channel nonsubsampled filter bank. Shown

in Figure 6 are the NSFB’s needed to construct the NSCT. In this paper we focus exclusively on the FIR

case simply because it is easier to implement in multiple dimensions. For a general FIR two-channel

NSFB, perfect reconstruction is achieved provided the filters satisfy theBezout identity:

H0(z)G0(z) + H1(z)G1(z) = 1. (3)

The Bezout identity puts no constraint on the frequency response of the filters involved. Therefore, to

obtain good solutions one has to impose additional conditions on the filters.

+

H0(z)

H1(z)

G0(z)

G1(z)

x x

y0

y1

(a)

+x x

y0

y1

U0(z)

U1(z)

V0(z)

V1(z)

(b)

Fig. 6. The two-channel nonsubsampled filter banks used in the NSCT. The system is two times redundant and the reconstruction

is error free when the filters satisfy Bezout’s identity. (a) Pyramid NSFB.(b) Fan NSFB.

C. Frame Analysis of the NSCT

The nonsubsampled filter bank can be interpreted in terms of analysis/synthesis operators offrame

systems.A family of vectors{φn}n∈Γ constitute a frame for a Hilbert spaceH if there exist two positive

constantsA, B such that for eachx ∈ H we have

A‖x‖2 ≤∑

n∈Γ

|〈x, φn〉|2 ≤ B‖x‖2. (4)

In the event thatA = B the frame is said to be tight. Theframe boundsare the tightest positive constants

satisfying (4).

Consider the NSFB of Figure 6 (a). The family{h0[·−n], h1[·−n]}n∈Z2 is a frame forℓ2(Z2) if and

only if there exist constants0 < A ≤ B < ∞ such that [19]

May 19, 2005 DRAFT


A ≤ |H0(ejω)|2 + |H1(e

jω)|2︸︷︷︸

t(ejω)

≤ B. (5)

Thus, the frame bounds of an NSFB can be computed by

A = ess.infω∈[−π,π]2

t(ejω), B = ess.supω∈[−π,π]2

t(ejω), (6)

where ess.inf, and ess.sup denote the essential infimum and essential supremum respectively. From (5),

we see that the frame is tight whenevert(ejω) is almost everywhere constant. As a result, FIR filters

corresponding to tight frames cannot be linear-phase, except for Haar-type filters (see [18] pp. 337-338).

Because the NSFB is redundant, an infinite number of inverses exist. Among them, thepseudo-inverse

is the best inverse in a number of senses [1]. Given a frame of analysis filters, the synthesis filters

corresponding to the frame pseudo-inverse are given byGi(z) = Hi(z)/t(z) for i = 0, 1 [19]. In this

case, the synthesis filters form thedual framewith lower and upper frame bounds given byB−1 and

A−1 respectively. When the analysis filters are FIR, then unless the frame is tight, the synthesis filters

of the pseudo-inverse will be IIR.

From the above discussion we gather two important points: (1)linear phase filters and tight frames are

mutually exclusive; (2) the pseudo-inverse is desirable, but is IIR if the frame is not tight. Consequently,

an FIR NSFB system with linear phase filters and with synthesis filters corresponding to the pseudo-

inverse is not possible. However, we can approximate the pseudo-inverse with FIR filters. For a given

number of filter taps, the closer the frame is to being tight, the better an FIR approximation of the

pseudo-inverse can be [20]. Thus, in the designs we seek linear phase filters that underly a frame that is

as close to a tight one as possible.

In a general FIR perfect reconstruction NSFB system both analysis and synthesis filters form a frame.

If we denote the analysis and synthesis frame bounds byAa, Ba andAs, Bs respectively, the frames will

be close to tight provided

ra := Aa/Ba ≈ 1, andrs := As/Bs ≈ 1.

We always assume that the filters are normalized so that we haveAa ≤ 1 ≤ Ba. In case the pseudo-

inverse is used, then we also haveAs ≤ 1 ≤ Bs. The following result shows the NSCT is a frame

operator forℓ2(Z2) whenever the constituent NSFB’s each forms a frame.

Proposition 1: In the nonsubsampled contourlet transform if the pyramid filter bank constitute a frame

with frame boundsAp andBp, and the fan filters constitute a frame with frame boundsAq andBq, then

May 19, 2005 DRAFT


J A actual A estimated B actual B estimated

1 1.0435 1.0435 0.9586 0.9596

2 1.0504 1.0889 0.9393 0.9189

3 1.0515 1.1362 0.9332 0.8808

4 1.0517 1.1857 0.9316 0.8444

TABLE I

FRAME BOUNDS EVOLVING WITH SCALE FOR THE PYRAMID FILTERS GIVEN IN EXAMPLE 1 IN SECTION III

the NSCT is a frame with boundsA andB satisfying

AJp Amin {lj}

q ≤ A ≤ B ≤ BJp Bmax {lj}

q

Proof: See Appendix A.

Remark 1:When both the pyramid and fan filter banks form tight frames with bound 1, thenAp =

Aq = Bp = Bq = 1, and from the above proposition, the nonsubsampled contourlet transform is also a

tight frame with bound 1.

The above estimates onA andB can be accurate in some cases, especially when the frame is close

to a tight one and the number of levels is small (e.g.,J = 4). In general however they are not accurate

estimates. Their purpose is more of giving an interval for theframe bounds rather than the actual values.

Table I shows estimates for different number of scales. The actual frame bound is computed from (5 -

6) whereas the estimates are the ones given according to Proposition 1.

III. F ILTER DESIGN AND IMPLEMENTATION

The filter design problem of the NSCT comprises the two basic NSFB’sdisplayed in Figure 6. The

goal is to design the filters imposing the Bezout identity (i.e. perfect reconstruction) and enforcing other

properties such as sharp frequency response, easy implementation, regularity of the frame elements, and

tightness of the corresponding frames. It is also desirablethat the filters are linear-phase.

Two-channel 1-D NSFB’s that underly tight frames are designedin [19]. However, the design method-

ology of [19] is not easy to extend to 2-D designs since it relies on spectral factorization which is hard

in 2-D. If we relax the tightness constraint then the design becomes more flexible. In addition, as we

alluded to earlier, non-tight filters can be linear phase.

An effective and simple way to design 2-D filters is themappingapproach first proposed by McClellan

[21] in the context of digital filters and then used by several authors [22], [23], [24], [25] in the context

May 19, 2005 DRAFT


of filter banks. In such approach, the 2-D filters are obtained from 1-D ones. In the context of NSFB’s,

a set of perfect reconstruction 2-D filters is obtained in the following way:

Step 1. Construct a set of 1-D polynomials{H(1D)i (x), G

(1D)i (x)}i=0,1 that satisfies the Bezout identity.

Step 2. Given a 2-D FIR filterf(z), then{H(1D)i (f(z)), G

(1D)i (f(z))}i=0,1 are 2-D filters satisfying

the Bezout identity.

Thus, one has to design the set of 1-D filters and the mapping function f(z) so that the ideal responses

are well approximated with a small number of filter coefficients.

One advantage of the mapping approach is that one can controlthe frequency and phase responses

through the mapping function. Moreover, if the mapping function is zero-phase, thenf(z) = f(z−1) and

it follows that the mapped filters are also zero-phase. Therefore, we restrict our attention to zero-phase

designs only. In this case, on the unit sphere, the mapping function is a 2-D polynomial in(cos ω1, cos ω2).

We thus denote it byF (x1, x2), where it is implicit thatf(ejω) = F (cos ω1, cos ω2).

+

+

+ +

+

x x

Q(1D)(f(z)) −Q(1D)(f(z)) −P (1D)(f(z))P (1D)(f(z))

Fig. 7. Lifting structure for the nonsubsampled filter bank designed with themapping approach. The 1-D prototype is factored

with the Euclidean algorithm. The 2-D filters are obtained by replacingx 7→ f(z).

A. Implementation Through Lifting

Filters designed with the mapping approach can be efficiently factored into a ladder [26] or lifting [27]

structure that simplifies computations. To see this, assume without loss of generality that the degree of

the highpass prototype polynomialH(1D)1 (x) is smaller than that ofH(1D)

0 (x). Suppose also that there are

synthesis filtersG(1D)0 (x) andG

(1D)1 (x) such that the Bezout identity is satisfied. In this case it follows

that gcd{H(1D)0 , H

(1D)1 } = 1. The Euclidean algorithm then enables us to factor the filters inthe following

way [28], [26], [27]:

H

(1D)0 (x)

H(1D)1 (x)

=N∏

i=0

1 0

P(1D)i (x) 1

1 Q

(1D)i (x)

0 1

1

0

. (7)

May 19, 2005 DRAFT


As a result, we can obtain a 2-D factorization by replacingx with f(z). This factorization characterizes

every 2-D NSFB derived from 1-D through the mapping method. Figure 7 illustrates the ladder structure

with one stage.

In general, the lifting implementation at least halves the number of multiplications and additions of

the direct form [27]. The complexity can be reduced further ifthe lifting steps in the 1-D prototype are

monomials and the mapping filterf(z) has the form

f(z) = f1(zp1

1 zp2

2 )f2(zq1

1 zq2

2 ), (8)

for suitablef1(z), f2(z), and integersp1, p2, q1, q2. Note that iff(z) is a 1-D filter, then the 2-D filter

f(zp1zq

2) for integersp andq has the same complexity as that off(z). Therefore, filters of the form in (8)

have the same complexity as that of separable filters (i.e., filters of the formf(z) = f(z1)f(z2) ) which

amounts to two 1-D filtering operations. Notice that iff(z) is as in (8), then for an arbitrary sampling

matrix S, f(zS) also has the form in (17). Consequently, all NSFB’s in the NSDFB tree structure can

be implemented with 1-D operations whenever the prototype fan NSFB can be implemented with 1-D

filtering operations. The same reasoning applies to the NSFB’s ofthe NSP.

B. Pyramid Filter Design

In the pyramid case, we imposeline zerosat ω = (π, ω2) and ω = (ω1, π). Notice that anN -th

order line zero atω = (π, ω2), for example, amounts to a(1 + ejω1)N factor in the low pass filter. Such

zeros are useful to obtain good approximation of the ideal frequency response of Figure 6, in addition

to imposing regularity of the scaling function. We point outthat for the approximation of polynomial

images, point zeros at(±π,±π) would suffice [29]. However, our experience shows that point zeros alone

do not guarantee a “reasonable” frequency response of the pyramid filters. The following proposition

characterizes the mapping function that generates zeros inthe resulting 2-D filters.

Proposition 2: Let G(1D)(z) be a polynomial with roots{zi}ni=1 where eachzi has multiplicity ni.

Suppose we want a mapping functionF (x1, x2) such that

G(1D) (F (x1, x2)) = (x1 − c)N1 (x2 − d)N2 L(x1, x2), (9)

where L(x1, x2) is a bivariate polynomial. ThenG(1D)(F (x1, x2)) has the form in (9)if and only if

F (x1, x2) takes the form

F (x1, x2) = zj + (x1 − c)N ′

1 (x2 − d)N ′

2 LF (x1, x2), (10)

May 19, 2005 DRAFT


for some rootzj ∈ {zi}ni=1, where LF (x1, x2) is a bivariate polynomial, andN ′

1, N′2 are such that

N ′1ni ≥ N1 andN ′

2nj ≥ N2.

Proof: See Appendix B.

The above result holds in general, even for point zeros (as opposed to line zeros as in Proposition 2).

We will explore this extensively in the designs that follow.

Suppose the prototype filtersH(1D)0 (x), G

(1D)0 (x) each has zeros atx = −1. Then, in order to produce

a suitable zero-phase mapping function for the pyramid NSFB weconsider the class of maximally-flat

filters given by the polynomials [30]

PN,L(x) :=

(1 + x

2

)N L−1−N∑

l=0

(N + l − 1

l

)(1 − x

2

)l

, (11)

whereN controls the degree of flatness atx = −1 andL controls the degree flatness atx = 1. Following

Proposition 2, we can construct a family of mapping functionsas

F (x1, x2) = −1 + 2PN0,L0(x1)PN1,L1

(x2) (12)

so that zero moments atx1 = −1 andx2 = −1 are guaranteed. Note thatF (x1, x2) can be implemented

with 1-D filtering operations only.

C. Fan Filter Design

To design the fan filters idealized in Figure 6(b) we use the samemethodology as in the pyramid

case. The distinction occurs in the mapping function. The fan-shaped response can be obtained from

a diamond-shaped response by simple modulation in one of thefrequency variables. This modulation

preserves the perfect reconstruction property. A useful family of mapping functions for the diamond-

shaped response is obtained by imposing flatness aroundω = (±π,±π) and ω = (0, 0) in addition to

point zeros at(±π,±π). If the mapping function is zero-phase, this amounts to imposing flatness in a

polynomial at(x1, x2) = (−1,−1) and (x1, x2) = (1, 1), and zeros at(x1, x2) = (−1,−1). Mapping

functions satisfying these desiderata with minimum numberof polynomial coefficients are given by

FN (x1, x2) = −1 + QN (x1, x2), (13)

where the polynomialsQN (x1, x2) give the class of maximally-flat half-band filters with diamondsupport.

A closed-form expression forQN (x1, x2) is given in [31]. Table II displays the first six mapping functions

FN (x1, x2) for the diamond filter bank.

We point out that diamond maximally-flat mapping polynomialscan be generated from 1-D ones by

a separable product and then an appropriate change of variables. In this case the mapping function is

May 19, 2005 DRAFT


N FN (x1, x2)

1 12(x1 + x2)

2 14(x1 + x2)(3 − x1x2)

3 116

(x1 + x2)(15 − x21 − 8x1x2 − x2

2 + 3x21x

22)

4 132

(x1 + x2)(35 − 5x21 − 25x1x2 + 3x3

1x2 − 5x22 + 15x2

1x22 + 3x1x

32 − 5x3

1x32)

5 1256

(x1 + x2)�315 − 70x2

1 + 3x41 − 280x1x2 + 72x3

1x2 − 70x22 + 228x2

1x22

−30x41x

22 + 72x1x

32 − 120x3

1x32 + 3x4

2 − 30x21x

42 + 35x4

1x42

�6 1

512(x1 + x2)

�693 − 210x2

1 + 21x41 − 735x1x2 + 294x3

1x2 − 15x51x2 − 210x2

2 + 756x21x

22 − 210x4

1x22

+294x1x32 − 540x3

1x32 + 70x5

1x32 + 21x4

2 − 210x21x

42 + 245x4

1x42 − 15x1x

52 + 70x3

1x52 − 63x5

1x52

�TABLE II

MAXIMALLY FLAT MAPPING POLYNOMIALS USED IN THE DESIGN OF THE NONSUBSAMPLED FAN FILTER BANK.

separable and has the formf(z) = f1(z1z2)f2(z1z−12 ). This has been done for instance in [23]. However,

the nonseparable solution is generally shorter (roughly bya factor of 2) while yielding the same number

of zeros at the aliasing frequencies and a similar frequencyresponse. For faster implementation, one may

choose the longer mapping filters which can be implemented with 1-D filtering operations.

D. Design Examples

The design through mapping is based on a set of 1-D polynomialsthat satisfies the Bezout identity,

that is, H(1D)0 (x)G

(1D)0 (x) + H

(1D)1 (x)G

(1D)1 (x) = 1. The design can be simplified if we impose the

restriction thatH(1D)1 (x) = G

(1D)0 (−x) and G

(1D)1 (x) = H

(1D)0 (−x). One advantage of this choice is

that the frequency response of the filters can be controlled bythe lowpass branch — the highpass will

automatically have the complementary response. Another advantage is that, under an additional condition,

the frame bounds of the analysis and synthesis frames are thesame and can be computed from the 1-D

prototypes. To see this, suppose thatf is the mapping function and thatRan(f) = Ran(−f) with

Ran(f) denoting the range of the mapping functionf . Then we have that

Aa = infω∈[π,π]2

|H(1D)0 (f(ejω))|2 + |G(1D)

0 (−f(ejω))|2

= infx∈Ran(f)

|H(1D)0 (x)|2 + |G(1D)

0 (−x)|2 = infx∈Ran(−f)

|H(1D)0 (−x)|2 + |G(1D)

0 (x)|2

= infx∈Ran(f)

|H(1D)0 (−x)|2 + |G(1D)

0 (x)|2 = As. (14)

May 19, 2005 DRAFT


A similar argument shows that

Ba = Bs = supx∈Ran(f)

|H(1D)0 (−x)|2 + |G(1D)

0 (x)|2. (15)

Example 1 ( Pyramid filters very close to tight ones ):In order to get filters that are almost tight we

design the prototypesH(1D)0 (x) and G

(1D)0 (x) to be very close to each other. If we letH

(1D)1 (x) =

G(1D)0 (−x) andG

(1D)1 (x) = H

(1D)0 (−x), then the following filters can be checked to satisfy the Bezout

identity:

H(1D)0 (x) = K

(1 + k2x + k2k3x

2),

G(1D)0 (x) = K−1

(1 − k1x − k3x + k1k2x

2 − k1k2k3x3).

To obtain the prototype we choose the constantsK, k1, k2, and k3 such that each filter has a zero at

z = −1 and in addition thatH(1D)0 (−1) = G

(1D)0 (−1) =

√2. We obtain

H(1D)0 (x) =

1

2(x + 1)

(√2 + (1 −

√2)x)

,

G(1D)0 (x) =

1

2(x + 1)

(√2 + (4 − 3

√2)x + (2

√2 − 3)x2

)

.

The lifting factorization of the prototype filters is given by

H

(1D)0 (x)

H(1D)1 (x)

=

1 αx

0 1

1 0

βx 1

1 γx

0 1

K1

K2

(16)

with

α = γ = 1 −√

2, β =1√2, K1 = K2 =

1√2.

Notice that this implementation has 5 multiplies/sample whereas the direct form has 10 multiplies/sample.

In this example we setF (x1, x2) = −1 + 2P2,4(x)P2,4(y) so that each of the filters has a 4-th order

zero atω1 = ±π and atω2 = ±π. Since the ladder steps are monomials, the NSFB can be implemented

with 1-D filtering operations. The frame bounds are computed using (14-15):

Aa = As = 0.96, Ba = Bs = 1.04.

Thus we havera = rs = 1.083 and the frame is almost tight. The support size ofH0(z) is 13 × 13

whereasG0(z) has support size19 × 19. Figure 8 shows the response of the resulting filters.

Example 2 (Maximally-flat fan filters):Using the prototype filters of Example 1, we modulate a maximally-

flat diamond mapping function to obtain the maximally-flat fan mapping function. Thus we choose

May 19, 2005 DRAFT


-2

0

2

-2

0

2

0

0.25

0.5

0.75

1

-2

0

2

(a) |H0(ejω)|

-2

0

2

-2

0

2

0

0.25

0.5

0.75

1

-2

0

2

(b) |G0(ejω)|

-2

0

2

-2

0

2

0

0.25

0.5

0.75

1

-2

0

2

(c) |H1(ejω)|

-2

0

2

-2

0

2

0

0.25

0.5

0.75

1

-2

0

2

(d) |G1(ejω)|

Fig. 8. Magnitude response of the filters designed in Example 1 with maximally-flat filters. The nonsubsampled pyramid filter

bank underlies almost tight analysis and synthesis frames.

FN (x1, x2) in Table II with N = 3. Figure 9 shows the magnitude response of the filters. The support

size ofU0(z) is 21 × 21 whereas the support size ofV0(z) is 31 × 31.

E. Regularity of the NSCT basis functions

The multi-scale property of the NSCT is obtained through the nonsubsampled pyramid implemented

with a 2-D a trous algorithm. Denoting byH0(ω) the scaling lowpass filter used in the pyramid (we

write H0(ω) instead ofH0(ejω) for convenience), we have the associated scaling function

Φ(ω) :=∞∏

j=1

H0(2−j

ω),

where convergence is in the weak sense. In our proposed design the filterH0(ω) can be factored as

H0(ω) =

(1 + ejω1

2

)N1(

1 + ejω2

2

)N2

RH0(ω). (17)

May 19, 2005 DRAFT


-2

0

2

-2

0

2

0

0.25

0.5

0.75

1

-2

0

2

(a) |U0(ejω)|

-2

0

2

-2

0

2

0

0.25

0.5

0.75

1

-2

0

2

(b) |V0(ejω)|

-2

0

2

-2

0

2

0

0.25

0.5

0.75

1

-2

0

2

(c) |U1(ejω)|

-2

0

2

-2

0

2

0

0.25

0.5

0.75

1

-2

0

2

(d) |V1(ejω)|

Fig. 9. Fan filters designed with prototype filters of Example 1 and diamond maximally-flat mapping filters.

Notice that the remainder filterRH0(ω) is not separable. Therefore, one cannot separate the regularity

estimation as two 1-D problems. Nonetheless, a similar argument can be developed and an estimate of

the 2-D regularity of the scaling filter is obtained.

Proposition 3: Let H0(ω) be a scaling filter as in (17) with the corresponding scaling functionΦ(ω).

Let

B = supω∈[π,π]2

|RH0(ω)|.

Then

|Φ(ω)| ≤ C

(1

1 + |ω1|

)N1−log2B ( 1

1 + |ω2|

)N2−log2B

.

Proof: See Appendix C.

As an example, consider the prototype filter in Example 1 and themapping F (x1, x2) = −1 +

2P1,2(x)P1,2(y). The resulting filter has second-order zeros atω1 = ±π and atω2 = ±π. It can be verified

May 19, 2005 DRAFT


that|RH0(ω)| / 1.83 and|RG0

(ω)| / 1.49 so that the regularity exponent2 is at least2−log2 1.83 ≈ 1.13

for H0(ω) and2 − log2 1.49 ≈ 1.43 for G0(ω). Thus the corresponding scaling functions and wavelets

are at least continuous. We point out that better estimates are possible applying similar 1-D techniques.

For instance, one could prove a result similar to Lemma 7.1.2 in [30], pp. 217, as a consequence of

Proposition 3.

Figure 10 shows the basis functions of the NSCT obtained with the filters designed via mapping. As

the picture shows, the functions have a good degree of regularity.

IV. A PPLICATIONS

A. Image Denoising

In order to illustrate the potential of the NSCT designed using the techniques previously discussed we

study additive white Gaussian noise (AWGN) removal from images by means of thresholding estimators.

We test the NSCT under two denoising schemes described next.

1) Hard Threshold:For the hard threshold estimator we choose a global threshold Ti,j = Kσnijfor

each directional subband. This has been termedK-sigma thresholding in [32]. We setK = 4 for the finer

scale andK = 3 for the remaining ones. We refer to this method as NSWT-HT whenapplied to NSWT

coefficients and NSCT-HT when applied to NSCT coefficients. We usefive scales of decomposition for

both NSCT and NSWT. For the NSCT we use 4,8,8,16,16 directions inthe scales from coarser to finer

respectively.

2) Local adaptive shrinkage:We perform soft thresholding (shrinkage) independently ineach subband.

Following the work in [5] we choose the threshold

Ti,j =σ2

Nij

σi,j,n

,

whereσi,j,n denotes the variance of then-th coefficient at thei-th directional subband of thej-th scale,

andσ2Nij

is the noise variance at scalej and directioni. It is shown in [5] that shrinkage estimation with

T = σ2

σX, and assumingX generalized Gaussian distributed yields a risk within 5% ofthe optimal Bayes

risk. As studied in [33], contourlet coefficients are well modeled by generalized Gaussian distributions.

The signal variances are estimated locally using the neighboring coefficients contained in a square window

within each subband and a maximum likelihood estimator. The noise variance in each subband is inferred

using a Monte-Carlo technique where the variances are computed for a few normalized noise images

2The regularity exponent of a scaling functionφ(t) is the largest numberα such thatΦ(ω) decays as fast as 1(1+|ω1|+|ω2|)

α .

May 19, 2005 DRAFT


Basis functions, Level 2

(a)



(b)

Fig. 10. Basis functions of the nonsubsampled contourlet transform. (a) Basis functions of the second stage of the pyramid,

(b) Basis functions of third (top 8) and fourth (bottom 8) stages of the pyramid.

and then averaged to stabilize the results. We refer to this method aslocal adaptive shrinkage(LAS).

Effectively, our LAS method is a simplified version of the denoising method proposed in [34] that works

in the NSCT or NSWT domain.

May 19, 2005 DRAFT


To benchmark the performance of the NSCT we have used some of the best denoising schemes reported

in the literature:

1) Hard thresholding with the NSWT.

2) Hard thresholding in the curvelet transform domain [32].

3) Bivariate shrinkage with local variance estimation [6].

4) Bayes least-squares with a Gaussian scale-mixture model(BLS-GSM) proposed in [7].

For the LAS estimator we use four scales for both the NSCT and NSWT. For the NSCT we use 3,3,4,4

directions in the scales from coarser to finer respectively.

We use three512 × 512 standard images in our test suit: “Lena”, “Barbara”, and “Peppers”. Table

III shows the PSNR results for the various methods used in our study3. The results show that for hard

thresholding, the NSCT is consistently superior to curvelets and NSWT in PSNR measure. For the

“Barbara” image the NSCT-HT yields improvements in excess of1.90 dB in PSNR over the NSWT-HT.

Figure 11 displays the reconstructed images using the the NSWT-HT, curvelets and NSCT-HT. As the

figure shows, both the NSCT and the curvelet transform offers a better recovery of edge information

relative to the NSWT. But improvements can be seen in the NSCT, particularly around the eye.

The NSCT coupled with the local adaptive shrinkage estimator (NSCT-LAS) produced very satisfactory

results as well. In particular, among the methods studied, the NSCT-LAS yields the best results for the

“Barbara” image, being surpassed by the BLS-GSM method for the other images. Despite its slight loss in

performance relative to BLS-GSM, we believe the NSCT has potential for better results. This is because

by comparison, the BLS-GSM is a considerably richer and more sophisticated estimation method than our

simple local thresholding estimator. However, studying more complex denoising methods in the NSCT

domain is beyond the scope of the present paper. Figure 12 displays the denoised images with both BLS-

GLM and NSCT-LAS methods. As the pictures show, the NSCT offers a slightly better reconstruction.

In particular, the table cloth texture is better recovered in the NSCT-LAS scheme.

We briefly mention that in denoising applications, one can reduce the redundancy of the NSCT by using

critically sampled directional filter banks over the nonsubsampled pyramid. This results in a transform

with J + 1 redundancy which is considerably faster. There is however a loss in performance as Table

IV shows. Nonetheless, in some applications, the small performance loss might be a good price to pay

given the reduced redundancy of this alternative construction.

3The PSNR values of the BivShrink method were obtained from the tables in [6]. In the results shown in [6] the authors do

not use the “Peppers” image as a test image, hence we do not have a BivShrink column for “Peppers”.

May 19, 2005 DRAFT


“Lena” image, PSNR (dB)

σ Noisy NSWT-HT Curvelets [32] NSCT-HT NSWT-LAS BivShrink [6] BLS-GSM [7] NSCT-LAS

10 28.13 34.26 34.17 34.69 35.19 35.34 35.59 35.46

20 22.13 31.40 31.52 32.03 32.12 32.40 32.62 32.50

30 18.63 29.66 30.01 30.35 30.30 30.54 30.84 30.70

40 16.13 28.37 28.84 29.10 29.01 - 29.58 29.38

50 14.20 27.41 27.78 28.10 28.00 - 28.61 28.34

“Barbara” image, PSNR (dB)

σ Noisy NSWT-HT Curvelets [32] NSCT-HT NSWT-LAS BivShrink [6] BLS-GSM [7] NSCT-LAS

10 28.17 31.58 32.28 33.01 33.40 33.35 34.03 34.09

20 22.15 27.23 28.89 29.41 29.45 29.80 30.28 30.60

30 18.63 25.10 26.93 27.24 27.22 27.65 28.11 28.56

40 16.14 24.02 25.51 25.79 25.76 - 26.58 27.12

50 14.20 23.37 24.31 24.79 24.72 - 25.43 26.02

“Peppers” image, PSNR (dB)

σ Noisy NSWT-HT Curvelets [32] NSCT-HT NSWT-LAS BLS-GSM [7] NSCT-LAS

10 28.17 33.71 33.59 33.81 34.35 34.63 34.41

20 22.15 31.19 31.13 31.60 31.65 32.06 31.82

30 18.63 29.43 29.45 30.07 29.95 30.41 30.19

40 16.13 28.09 28.01 28.85 28.65 29.20 28.95

50 14.20 27.04 26.70 27.82 27.62 28.24 27.93

TABLE III

DENOISING PERFORMANCE OF THENSCT. THE LEFT-MOST COLUMNS ARE HARD THRESHOLDING AND THE RIGHT-MOST

ONES SOFT ESTIMATORS. FOR HARD THRESHOLDING, THE NSCT CONSISTENTLY OUTPERFORMS CURVELETS AND THE

NSWT. THE NSCT-LAS PERFORMS ON A PAR WITH THE MORE SOPHISTICATED ESTIMATORBLS-GSM.

σ “Lena” “Barbara” “Peppers”

10 -0.23 -0.48 -0.32

20 -0.09 -0.44 -0.09

30 -0.05 -0.37 -0.01

40 -0.04 -0.34 -0.00

50 -0.05 -0.31 -0.01

TABLE IV

RELATIVE LOSS IN PSNRPERFORMANCE(DB) WHEN USING THENSPWITH A CRITICALLY SAMPLED DFB AND THE LAS

ESTIMATOR WITH RESPECT TO THENSCT-LAS METHOD.

May 19, 2005 DRAFT


(a) Original (b) NSWT-HT

(c) Curvelets-HT (d) NSCT-HT

Fig. 11. Image denoising with the NSCT and hard thresholding. The noisy intensity is20. (a) Original Lena image. (b) Denoised

with the NSWT-HT, PSNR = 31.40dB. (c) Denoised with the curvelet transform and hard thresholding, PSNR = 31.52dB. (d)

Denoised with the NSCT-HT, PSNR = 32.03dB.

B. Image Enhancement

Existing image enhancement methods amplify noise when they amplify weak edges since they cannot

distinguish noise from weak edges. In the frequency domain,both weak edges and noise produce low-

May 19, 2005 DRAFT


(a) Original (b) BLS-GSM (c) NSCT-LAS

Fig. 12. Comparison between the NSCT-LAS and BLS-GSM denoising methods. The noise intensity is 20. (a) Original Barbara

image. (b) Denoised with the BLS-GSM method of [7], PSNR = 30.28dB. (c) Denoised with NSCT-LAS, PSNR = 30.60dB.

magnitude coefficients. The NSCT provides not only multiresolution analysis, but also geometrical and

directional representation. Since weak edges are geometricstructures and noise is not, we can use this

geometric representation to distinguish them.

The NSCT is shift-invariant so that each pixel of the transformsubbands corresponds to that of the

original image in the same location. Therefore, we gather thegeometrical information pixel by pixel from

the NSCT coefficients. We observe that there are three classes of pixels: strong edges, weak edges, and

noise. First, the strong edges correspond to those pixels with large magnitude coefficients in all subbands.

Second, the weak edges correspond to those pixels with large magnitude coefficients in some directional

subbands but small magnitude coefficients in other directional subbands within the same scale. Finally,

the noise corresponds to those pixels with small magnitude coefficients in all subbands. Based on this

observation, we can classify pixels into three categories by analyzing the distribution of their coefficients

in different subbands. One simple way is to compute the mean (denoted bymean) and the maximum

(denoted bymax) magnitude of the coefficients for each pixel, and then classify it by

strong edge if mean ≥ cσ

weak edge if mean < cσ, max ≥ cσ

noise if mean < cσ, max < cσ

, (18)

wherec is a parameter ranging from1 to 5, andσ is the noise standard deviation of the subbands at a

specific level. We first estimate the noise variance of the inputimage with the robust median operator

[34] and then compute the noise variance of each subband [32].

May 19, 2005 DRAFT


The goal of image enhancement is to amplify weak edges and to suppress noise. To this end, we

modify the NSCT coefficients according to the category of each pixel by a nonlinear mapping function

(similar to [35])

y(x) =

x strong edge pixels

max(( cσ|x|)

p, 1)x, weak edge pixels

0 noise pixels

, (19)

where the inputx is the original coefficient, and0 < p < 1 is the amplifying ratio. This function keeps

the coefficients of strong edges, amplifies the coefficients of weak edges, and zeros the noise coefficients.

We summarize our enhancement method using the NSCT in the following algorithm:

1) Compute the NSCT of the input image forN levels.

2) Estimate the noise standard deviation of the input image.

3) For each level DFB,

a) Estimate the noise variance.

b) Compute the threshold and the amplifying ratio.

c) At each pixel, compute the mean and the maximum magnitude of all directional subbands at

this level, and classify it by (18) into strong edges, weak edges, or noise.

d) For each directional subband, use the nonlinear mapping function given in (19) to modify the

NSCT coefficients according to the classification.

4) Reconstruct the enhanced image from the modified NSCT coefficients.

We compare the enhancement results by the proposed algorithm with those by the NSWT. In the

experiments, we choosec = 4 and p = 0.3. Figure 13 shows the results obtained for the “Barbara”

image. We observe that our proposed algorithm does a better job in enhancing the weak edges in the

textures.

V. CONCLUSION

We have proposed a fully shift-invariant version of the contourlet transform, the nonsubsampled con-

tourlet transform. The proposed construction reduces the filter design problem to that of a nonsubsampled

pyramid filter bank and a nonsubsampled fan filter bank. We have addressed the 2-D design problem

using the mapping approach, thus overcoming the need of factorization. We also developed a lifting/ladder

structure for the 2-D NSFB. This structure, when coupled with the filters designed via mapping, provides

a very efficient implementation that under some additional conditions can be reduced to 1-D filtering

May 19, 2005 DRAFT


(a)

(b) (c)

Fig. 13. Image enhancement with the NSCT. (a) Original “Barbara” image. (b) Enhanced by the NSWT. (c) Enhanced by the

NSCT.

operations. Applications of our proposed transform in image denoising and enhancement were studied.

In denoising, we studied the performance of the NSCT when coupled with a hard thresholding estimator

and a local adaptive shrinkage. For hard thresholding, our results indicate that the NSCT provides better

performance than competing transform such as the NSWT and curvelets. Concurrently, our local adaptive

shrinkage results are competitive to other denoising methods. In particular, our results show that a

fairly simple estimator in the NSCT domain yields comparableperformance to state-of-the-art denoising

May 19, 2005 DRAFT


methods that are more sophisticated and complex. In image enhancement, the results obtained with the

NSCT are visually superior to those of the NSWT.

APPENDIX

A. Proof of Proposition 1

Consider the pyramid shown in Figure 2 (a). IfJ = 1, then we have that‖y0‖2 + ‖y1‖2 ≤ Bp‖x‖2.

Now, suppose we haveJ levels and assume∑J

j=0 ‖yi‖2 ≤ BJp ‖x‖2. Then if we further splityJ into y′J

andyJ+1, noting thatBp ≥ 1 we have that

J−1∑

j=0

‖yj‖2 + ‖y′J‖2 + ‖yJ+1‖2 ≤J−1∑

j=0

‖yj‖2 + Bp‖yJ‖2

≤ Bp

J−1∑

j=0

‖yj‖2 + ‖yJ‖2

≤ BJ+1p ‖x‖2.

Thus, by induction we conclude that∑J

j=0 ‖yi‖2 ≤ BJp ‖x‖2 for any J ≥ 1. A similar argument shows

that in the NSDFB withlj stages, one has that∑2lj−1

k=0 ‖yj,k‖2 ≤ Bljq ‖yj‖2 so that

‖yJ‖2 +J−1∑

j=0

2lj−1∑

k=0

‖yj,k‖2 ≤ ‖yJ‖2 +J−1∑

j=0

Bljq ‖yj‖2

≤ ‖yJ‖2 + Bmax {lj}q

J−1∑

j=0

‖yj‖2

≤ BJBmax {lj}q ‖x‖2.

The bound forA is proved similarly, by just reversing the inequalities.�

B. Proof of Proposition 2

We prove the claim for the case in which the zeros ofG(1D)(z) are distinct. The proof for the case of

repeated roots can be handled similarly.

Denote

G(1D)(z) = g0

n∏

i=1

(z − zi), g0 ∈ C. (20)

Then sufficiency follows by direct substitution of (10) in (20).

May 19, 2005 DRAFT


We prove necessity by induction. SupposeG(1D)(F (x1, x2)) = (x1 − c)N1L(x1, x2) for some poly-

nomial L(x1, x2). Note thatG(1D)(F (c, x2)) = 0 for all x2 if and only if F (c, x2) = zj for all x2 and

some zerozj of G(1D)(z). So, it follows that

(F (x1, x2) − zj)x1=c= 0 for all x2, (21)

which implies thatF (x1, x2) = zj + (x1 − c)L1(x1, x2) with L1(x1, x2) a polynomial. Suppose

F (x1, x2) = zj + (x1 − c)k−1Lk−1(x1, x2),

wherek ≤ N1. By successively applying the chain rule for differentiation we get that

0 =∂k−1G(1D)(F (x1, x2))

∂xk−11

∣∣∣∣∣x1=c

= G(1D)′(F (c, x2))∂k−1F (x1, x2)

∂xk−11

∣∣∣∣∣x1=c

.

BecauseF (c, x2) = zj and thezj ’s are distinct, we have thatG(1D)′(F (c, x2)) 6= 0 and then

∂k−1F (x1, x2)

∂xk−11

∣∣∣∣∣x1=c

= 0

⇒ ∂k−1F (x1, x2)

∂xk−11

= (x1 − c)Lk−1(x1, x2)

⇒ ∂F (x1, x2)

∂x1= (x1 − c)k−1Lk−1(x1, x2). (22)

Combining (22) with (21) we obtain thatF (x1, x2) = zj + (x1 − c)kLk(x1, x2). By induction we

conclude that

F (x1, x2) = zj + (x1 − c)N1LN1(x1, x2).

If G(1D)(F (x1, x2)) = (x2 − d)N2L(x1, x2) for some polynomialL(x1, x2) then a similar argument

shows that

F (x1, x2) = zi + (x2 − d)N2LN2(x1, x2),

wherezi is a zero ofG(1D)(z). Thus,

zi − zj = (x1 − c)N1LN1(x1, x2) − (x2 − d)N2LN2

(x1, x2),

and hence we havezj = zi. Therefore,

(x1 − c)N1LN1(x1, x2) = (x2 − d)N2LN2

(x1, x2),

and so

LN1(x1, x2) = (x2 − d)N2LF (x1, x2),

and (10) is established withN ′1 = N1 andN ′

2 = N2. �

May 19, 2005 DRAFT


C. Proof of Proposition 3

The proof follows the same lines as for the 1-D case (see e.g., [1] pp. 245). Using the identity

∞∏

k=1

(

1 + ej 2−kω

2

)N

=

(1 + ejω

ω

)N

we have that

Φ(ω) =

(1 + ejω1

ω1

)N1(

1 + ejω2

ω2

)N2 ∞∏

j=1

R0(2−j

ω).

BecauseH0 is continuously differentiable atω = 0, the same also holds forR0. Now write R0 in polar

coordinates(r, ϕ) wherer2 = |ω1|2 + |ω2|2. SinceR0 is continuously differentiable, for eachϕ, R0(r, ϕ)

is a continuously differentiable function ofr. SinceR0(0) = 1, from the mean value theorem we have

that for ǫ > 0, and0 ≤ r < ǫ,

|R(r, ϕ)| ≤ 1 + |∂rR(ρ, ϕ)| r ≤ 1 + Kr

whereK := sup{|R(ρ, ϕ)| : 0 ≤ ρ < ǫ, ϕ ∈ [0, 2π]} < ∞. This gives

0 ≤ r < ǫ =⇒∞∏

j=1

R0(2−j

ω) ≤∞∏

j=1

(1 + K2−jr) ≤ eKǫ,

where we have used the inequality1 + r ≤ er. Now chooseJ so that2J−1ǫ ≤ r ≤ 2Jǫ. We then obtain,

for r > ǫ,

∞∏

j=1

|R0(2−j

ω)| =J∏

j=1

|R0(2−j

ω)|∞∏

j=1

|R0(2−j−J

ω)| ≤ BJeKǫ ≤ C1rlog

2BeKǫ

so that for eachω ∈ R2,∞∏

j=1

R0(2−j

ω) ≤ eKǫ(

1 + C1rlog

2B)

= eKǫ(

1 + C1(|ω1|2 + |ω2|2)1

2log

2B)

.

Now, putting all together we obtain

|Φ(ω)| ≤ C21

|ω1|N1

1

|ω2|N2

(

1 + C1(|ω1|2 + |ω2|2)1

2log

2B)

≤ C3

(1

1 + |ω1|

)N1−log2B ( 1

1 + |ω2|

)N2−log2B

,

which completes the proof.�

May 19, 2005 DRAFT


ACKNOWLEDGMENT

We would like to thank Dr. Jean-Luc Starck for providing the curvelet denoising software and Jason

Laska for helping with the implementation of the NSCT.

REFERENCES

[1] S. Mallat, A Wavelet Tour of Sinal Processing, 2nd ed. Academic Press, 1999.

[2] M. Vetterli and J. Kovacevic, Wavelets and Subband Coding. Englewood Cliffs, NJ, USA: Prencitce Hall, 1995.

[3] E. P. Simoncelli, W. T. Freeman, E. H. Adelson, and D. J. Heeger,“Shiftable multiscale transforms,”IEEE Trans. Info.

Theory, vol. 38, no. 2, pp. 587–607, March 1992.

[4] R. R. Coifman and D. L. Donoho, “Translation invariant de-noising,” in Wavelets and Statistics, A. Antoniadis and

G. Oppenheim, Eds. New York: Springer-Verlag, 1995, pp. 125–150.

[5] S. G. Chang, B. Yu, and M. Vetterli, “Adaptive wavelet thresholdingfor image denoising and compression,”IEEE Trans.

Img. Proc., vol. 9, no. 9, pp. 1532–1546, September 2000.

[6] L. Sendur and I. W. Selesnick, “Bivariate shrinkage with local variance estimation,”IEEE Signal Processing Letters, vol. 9,

no. 12, pp. 438–441, December 2002.

[7] J. Portilla, V. Strela, M. J. Wainwright, and E. P. Simoncelli, “Image denoising using scale mixtures of Gaussians in the

wavelet domain,”IEEE Trans. Img. Proc., vol. 12, no. 11, pp. 1338–1351, 2003.

[8] M. J. Shensa, “The discrete wavelet transform: Wedding thea trous and Mallat algorithms.”IEEE Trans. Signal Proc.,

vol. 40, no. 10, pp. 2464–2482, October 1992.

[9] D. L. Donoho, “Wedgelets: nearly minimax estimation of edges,”Ann. Statist., vol. 27, no. 3, pp. 859–897, 1999.

[10] M. B. Wakin, J. K. Romberg, H. Choi, and R. G. Baraniuk, “Wavelet-domain approximation and compression of piecewise

smooth images,” inIEEE Trans. Img. Proc., to appear, 2005.

[11] E. L. Pennec and S. Mallat, “Sparse geometric image representation with bandelets,”IEEE Tras. Img. Proc., vol. 14, no. 4,

pp. 423–438, April 2005.

[12] E. J. Candes and D. L. Donoho, “New tight frames of curvelets and optimal representations of objects with piecewiseC2

singularities,”Comm. Pure and Appl. Math, vol. 57, no. 2, pp. 219–266, February 2004.

[13] M. N. Do and M. Vetterli, “The contourlet transform: An efficient directional multiresolution image representation,”IEEE

Trans. Img. Proc., to appear, 2005,http://www.ifp.uiuc.edu/˜minhdo/publications .

[14] P. J. Burt and E. H. Adelson, “The Laplacian pyramid as a compact image code,”IEEE Trans. Communications, vol. 31,

no. 4, pp. 532–540, April 1983.

[15] M. N. Do and M. Vetterli, “Framing pyramids,”IEEE Trans. Signal Processing, vol. 51, no. 9, pp. 2329 – 2342, Sept.

2003.

[16] R. H. Bamberger and M. J. T. Smith, “A filter bank for the directional decomposition of images: Theory and design,”

IEEE Trans. Signal Processing, vol. 40, no. 4, pp. 882–893, April 1992.

[17] J. L. Starck, F. Murtagh, and A. Bijaoui,Image Processing and Data Analysis. Cambridge University Press, 1998.

[18] P. P. Vaidyanathan,Multirate Systems and Filterbanks. Prentice Hall, 1993.

[19] Z. Cvetkovic and M. Vetterli, “Oversampled filter banks,”IEEE Transactions on Signal Processing, vol. 46, no. 5, pp.

1245–1255, May 1998.

May 19, 2005 DRAFT


[20] H. Bolcskei, F. Hlawatsch, and H. G. Feichtinger, “Frame-theoretic analysisof oversampled filter banks,”IEEE Trans.

Signal Proc., vol. 46, no. 12, pp. 3256–3268, December 1998.

[21] J. H. McClellan, “The design of two-dimensional digital filters by transformation,” Proc. 7th Annual Priceton Conf.

Information Sciences and Systems, 1973.

[22] J. Kovacevic and M. Vetterli, “Nonseparable multidimensional perfect reconstruction filter banks and wavelet bases for

Rn,” IEEE Trans. Information Theory, vol. 38, no. 2, pp. 533–555, March 1992.

[23] S.-M. Phoong, C. W. Kim, P. P. Vaidyanathan, and R. Ansari, “Anew class of two-channel biorthogonal filter banks and

wavelet bases,”IEEE Transactions on Signal Processing, vol. 43, no. 3, pp. 649–661, March 1995.

[24] D. B. H. Tay and N. G. Kingsbury, “Flexible design of multidimensional perfect reconstruction FIR 2-band filters using

transformation of variables,”IEEE Trans. Img. Proc., vol. 2, no. 4, pp. 466–480, October 1993.

[25] A. L. Cunha and M. N. Do, “On two-channel filter banks with directional vanishing moments,”IEEE Trans. Image

Processing, submitted,http://www.ifp.uiuc.edu/˜minhdo/publications .

[26] S. Mitra and R. Sherwood, “Digital ladder networks,”IEEE Transactions on Audio and Electroacoustics, vol. AU-21, no. 1,

pp. 30–36, February 1973.

[27] W. Sweldens, “The lifting scheme: A custom-design construction ofbiorthogonal wavelets,”Appl. Comput. Harmon. Anal.,

vol. 3, no. 2, pp. 186–200, 1996.

[28] R. E. Blahut,Fast Algorithms for Digital Signal Processing. Addison-Wesley, 1985.

[29] R. Jia, “Approximation properties of multivariate wavelets,”Mathematics of Computation, vol. 67, pp. 647–665, 1998.

[30] I. Daubechies,Ten Lectures on Wavelets. Philadelphia, PA: SIAM, 1992.

[31] T. Cooklev, T. Yoshida, and A. Nishihara, “Maximally flat half-band diamond-shaped FIR filters using the bernstein

polynomial,” IEEE Tras. CAS-II, vol. 40, no. 11, pp. 749–751, Nov. 1993.

[32] J.-L. Starck, E. J. Candes, and D. L. Donoho, “The curvelet transform for image denoising,” IEEE Trans. Image Proc.,

vol. 11, no. 6, pp. 670–684, June 2002.

[33] D. D.-Y. Po and M. N. Do, “Directional multiscale modeling of imagesusing the contourlet transform,”IEEE Trans. Img

Proc., to appear, 2005,http://www.ifp.uiuc.edu/˜minhdo/publications .

[34] S. G. Chang, B. Yu, and M. Vetterli, “Spatially adaptive wavelet thresholding with context modeling for image denoising,”

IEEE Trans. Img. Proc., vol. 9, no. 9, pp. 1522–1531, September 2000.

[35] K. V. Velde, “Multi-scale color image enhancement,”Proc. IEEE Int. Conf. on Image Proc. (ICIP), vol. 3, pp. 584–587,

1999.

May 19, 2005 DRAFT

ieee transactions on image processing, may 2005 1 …jzhou2/publications/journal_nsct.pdfthe...

Documents