ieee transactions on image processing, may 2005 1 …jzhou2/publications/journal_nsct.pdfthe...
TRANSCRIPT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 1
The Nonsubsampled Contourlet Transform:
Theory, Design, and ApplicationsArthur L. Cunha,Student Member, IEEE, Jianping Zhou,Student Member, IEEE,
and Minh N. Do,Member, IEEE,
Abstract
The contourlet transform was proposed to address the lack ofgeometrical structure in the separable
two-dimensional wavelet transform. Because of its filter bank structure the contourlet transform is not
shift-invariant. In this paper we propose the nonsubsampled contourlet transform (NSCT) and study its
applications. The construction proposed in this paper is based on a nonsubsampled pyramid structure
and nonsubsampled directional filter banks. The result is a flexible multi-scale, multi-direction, and
shift-invariant image decomposition that can be efficiently implemented via thea trous algorithm. At
the core of the proposed scheme is the nonseparable two-channel nonsubsampled filter bank. We study
the filter design problem and propose a design framework based on the mapping approach. We exploit
the less stringent design condition of the nonsubsampled filter bank to design filters that lead to a
NSCT with better frequency selectivity and regularity whencompared to the contourlet transform. The
proposed design allows for a fast implementation based on a lifting or ladder structure that only uses
one-dimensional filtering in some cases. In addition, our design ensures that the corresponding frame
elements are regular, symmetric, and the frame is close to a tight one. We assess the performance of
the NSCT in image denoising and enhancement applications. In both applications the NSCT compares
favorably to other existing methods in the literature.
Index Terms
Multidimensional Filter Banks, Nonsubsampled Filter Banks, Frames, Contourlet transform, Image
Denoising, Image Enhancement.
A. L. Cunha and J. Zhou are with the Department of Electrical and Computer Engineering and the Coordinated Science
Laboratory, University of Illinois at Urbana-Champaign, Urbana IL 61801 (email:{cunhada,jzhou2}@uiuc.edu)
M. N. Do is with the Department of Electrical and Computer Engineering, theCoordinated Science Laboratory, and the
Beckman Institute, University of Illinois at Urbana-Champaign, UrbanaIL 61801 (email: [email protected]).
This work was supported by a CAPES fellowship, Brazil, and the U.S. National Science Foundation under grant CCR-0237633
(CAREER).
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 2
I. I NTRODUCTION
A number of image processing tasks are efficiently carried outin a domain other than the pixel domain,
often by means of an invertible linear transformation. For example, image compression and denoising
are efficiently done in the wavelet transform domain [1], [2].A good transform or representation would
capture the essence of a given signal or class of signals withfew basis elements. The set of basis functions
completely characterizes the transform and this set can be redundant or not, depending on whether the
basis functions are linear dependent. By allowing redundancy, it is possible to enrich the set of basis
functions so that the representation is more efficient in capturing some signal behavior. Applications such
as edge detection, contour detection, denoising, and imagerestoration can greatly benefit from redundant
representations.
In the context of multi-scale expansions implemented with filter banks, dropping the linear inde-
pendence requirement offers the possibility of a shift-invariant expansion, a crucial property in some
applications [3]. For instance, in image denoising by thresholding in the wavelet domain, the lack of
shift-invariance causes pseudo-Gibbs phenomena around singularities [4]. Not surprisingly, most state-
of-the-art wavelet denoising routines (see for example [5], [6], [7]) use an expansion with less shift
sensitivity than the standard maximally decimated waveletdecomposition — the most common being the
nonsubsampled wavelet transform (NSWT) computed with thea trousalgorithm [8] 1.
In addition to shift-invariance, it has been recognized that an efficient image representation has to
account for the geometrical structure pervasive in naturalscenes. In this direction, several representation
schemes have recently been proposed. These include adaptiveschemes such as wedgelets [9], [10] and
bandelets [11], and non-adaptive ones such as curvelets [12] and contourlets [13]. The contourlet transform
is a multi-directional and multi-scale transform that is constructed by combining the Laplacian pyramid
[14], [15] and the directional filter bank (DFB) [16]. Due to downsamplers and upsamplers present in
both the Laplacian pyramid and the DFB, the contourlet transform is not shift-invariant.
In this paper we propose an overcomplete transform that we call the nonsubsampled contourlet
transform (NSCT). The NSCT is a fully shift-invariant, multi-scale, and multi-direction expansion that
has better directional frequency localization and a fast implementation. In essence, the proposed structure
achieves a similar subband decomposition as that of contourlets, but without downsamplers and upsam-
plers in it. Because of its redundancy, the filter design problem of the NSCT is much less constrained than
1Denoising by thresholding in the NSWT domain can also be realized by denoising multiple circular shifts of the signal with
a critically sampled wavelet transform and then averaging the results. Thishas been termedcycle spinningafter [4].
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 3
that of contourlets. This enables us to design filters with better frequency selectivity thereby achieving
a better subband decomposition. Using the mapping approachwe provide a framework for filter design
that ensures good frequency selectivity in addition to having a fast implementation through ladders steps.
The NSCT has proven to be very efficient in image noise removal andimage enhancement as we show
in this paper.
The paper is structured as follows. In Section II we describe the NSCT and its building blocks. We
introduce a pyramid structure that ensures the multi-scalefeature of the NSCT and the directional filtering
structure based on the DFB. The basic unit in our construction is the nonsubsampled filter bank (NSFB)
which is discussed in Section II. In Section III we study the issues associated with the NSFB design and
implementation problems. We propose a design methodology based on one-dimensional (1-D) prototype
filters that allows fast implementation through ladder steps. Several examples are presented. Applications
of the NSCT in image denoising and enhancement are discussed in Section IV. Concluding remarks are
drawn in Section V.
Notation:Throughout the paper, a two-dimensional (2-D) filter is denoted byH(z) wherez = [z1, z2]T .
If m = [m1, m2]T is a 2-D vector, thenzm = zm1
1 zm2
2 whereas ifM is a 2 × 2 matrix, thenzM =
[zm1 , zm2 ] with m1,m2 the columns ofM. In this paper we often deal with zero-phase 2-D filters.
Such filters can be written as polynomials incos ω = (cos ω1, cos ω2). We thus writeF (x1, x2) for a
zero-phase filter in whichx1 andx2 denotecos ω1 andcos ω2 respectively.
II. N ONSUBSAMPLEDCONTOURLETS ANDFILTER BANKS
A. The Nonsubsampled Contourlet Transform
Figure 1 (a) displays a high level view of the proposed NSCT. The structure consists in a bank of filters
that splits the 2-D frequency plane in the subbands illustrated in Figure 1(b). Our proposed transform
can thus be divided into two parts that are both shift-invariant: (1) A nonsubsampled pyramid structure
that ensures the multi-scale property and (2) A nonsubsampled DFB structure that gives directionality.
1) The Nonsubsampled Pyramid (NSP):What gives the multi-scale property of the NSCT is a shift-
invariant filtering structure that achieves a subband decomposition similar to that of the Laplacian pyramid.
Our solution is obtained by using two-channel nonsubsampled 2-D filter banks. Figure 2 illustrates
the proposednonsubsampled pyramid(NSP) decomposition withJ = 3 stages. Such expansion is
conceptually similar to the 1-D nonsubsampled wavelet transform computed with thea trous algorithm
[8] and hasJ +1 redundancy, whereJ denotes the number of decomposition stages. The ideal passband
support of the lowpass filter at thej-th stage is the region[− π2j ,
π2j ]2. Accordingly, the ideal support of
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 4
directional
subbands
Bandpass
directional
subbands
Image
Bandpass
(a)
(π, π)
(−π,−π)
ω1
ω2
(b)
Fig. 1. The nonsubsampled contourlet transform. (a) Nonsubsampled filter bank structure that implements the NSCT. (b) The
idealized frequency partitioning obtained with the proposed structure.
the equivalent highpass filter is the complement of the lowpass, i.e., the region[− π2j−1 ,
π2j−1 ]2\[− π
2j ,π2j ]2.
The filters for subsequent stages are obtained by upsampling the filters of the first stage. This gives the
multi-scale property without the need for additional filter design. The proposed structure is thus different
from the separable nonsubsampled wavelet transform (NSWT). In particular, one bandpass image is
produced at each stage resulting inJ + 1 redundancy. By contrast, the separable NSWT produces three
directional images at each stage resulting in3J + 1 redundancy.
H0(z)
H1(z)
H0(z2I)
H1(z2I)
H0(z4I)
H1(z4I)
x
y0
y1
y2
y3
(a)
0 1 2 3
(π, π)
(−π,−π)
ω1
ω2
(b)
Fig. 2. The proposed nonsubsampled pyramid is a 2-D multi-resolution expansion similar to the 1-D nonsubsampled wavelet
transform. (a) A three-stage pyramid decomposition. The lighter gray regions denote the aliasing caused by upsampling. (b) The
subbands on the 2-D frequency plane.
The proposed pyramid is not the only 2-D pyramid withJ +1 redundancy. The 2-D pyramid proposed
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 5
in [17] is obtained with a similar structure. Specifically, theNSFB of [17] is built from a given lowpass
filter H0(z). One then setsH1(z) = 1 − H0(z), andG0(z) = G1(z) = 1. This perfect reconstruction
system can be seen as a particular case of our more general structure. The advantage of our construction
is that it is less restrictive and as a result, better filters can be obtained.
2) The Nonsubsampled Directional Filter Bank (NSDFB):The directional filter bank of Bamberger and
Smith [16] is constructed by combining critically-sampled two-channel fan filter banks and resampling
operations. The result is a tree-structured filter bank that splits the frequency plane in the directional
wedges shown in Figure 3 (a).
37
6
5
4 7
6
5
4
0123
0 1 2
(π, π)
(−π,−π)
ω1
ω2
(a)
+
Ueq0 (z)
Ueq1 (z)
Ueq
L−1(z)
Veq0 (z)
Veq1 (z)
Veq
L−1(z)
S0S0
S1S1
SL−1SL−1
xx
y0
y1
yL−1
(b)
Fig. 3. The directional filter bank. (a) Ideal partitioning of the 2-D frequency plane into23 = 8 wedges. (b) Equivalent
multi-channel filter bank structure. The number of channels isL = 2l, wherel is the number of stages in the tree structure.
Using multirate identities [18], the tree-structured DFB can be put into the equivalent form shown in
Figure 3 (b), where the downsampling/upsampling matricesSk for 0 ≤ k ≤ 2l − 1 are given by
Sk =
diag(2l−1, 2
), for 0 ≤ k ≤ 2l−1 − 1;
diag(2, 2l−1
), for 2l−1 ≤ k ≤ 2l − 1,
(1)
with l denoting the number of stages in the tree structure. It is clear from the above that the DFB is not
shift-invariant. A shift-invariant directional expansion is obtained with a nonsubsampled DFB (NSDFB).
The NSDFB is constructed by eliminating the downsamplers and upsamplers in Figure 3 (b). This is
equivalent to switching off the downsamplers in each two-channel filter bank in the DFB tree structure and
upsampling the filters accordingly. This results in a tree composed of two-channel nonsubsampled filter
banks. The equivalent filter in thek-th channel of the analysis NSDFB tree is the filterU eqk (z) in Figure 3
(b). Thus, each filterU eqk (z) is a product ofl simpler filters. Just like the DFB, all NSFB’s in the NSDFB
tree structure are obtained from a single NSFB with fan filters (see Figure 6 (b)). Figure 4 illustrates a
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 6
four channel decomposition. Note that in the second level, the upsampled fan filtersUi(zQ), i = 0, 1
have checker-board frequency support, and when combined with the filters in the first level give the four
directional frequency decomposition shown in Figure 4. The synthesis filter bank is obtained similarly.
Just like the NSP case, each filter bank in the NSDFB tree has the same computational complexity as
that of the prototype NSFB.
U0(z)
U1(z)
U0(zQ)
U0(zQ)
U1(zQ)
U1(zQ)
x
y0
y1
y2
y3
(a)
0 13
2
(π, π)
(−π,−π)
ω1
ω2
(b)
Fig. 4. A four-channel nonsubsampled directional filter bank constructed with two-channel fan filter banks. (a) Filtering structure.
The equivalent filter in each channel is given byUeq
k (z) = Ui(z)Uj(zQ). (b) Corresponding frequency decomposition.
3) Combining the NSP and NSDFB in the NSCT:The NSCT is constructed by combining the NSP
and the NSDFB as shown in Figure 1 (a). In constructing the nonsubsampled contourlet transform, care
must be taken when applying the directional filters to the coarser scales of the pyramid. Due to the
tree-structure nature of the NSDFB, the directional responseat the lower and upper frequencies suffers
from aliasing which can be a problem in the upper stages of thepyramid (see Figure 9). This is illustrated
in Figure 5 (a), where the passband region of the directional filter is labelled as “Good” or “Bad”. Thus
we see that for coarser scales, the highpass channel in effect is filtered with the bad portion of the
directional filter passband. This results in severe aliasing and in some observed cases a considerable loss
of directional resolution.
We remedy this by judiciously upsampling the NSDFB filters. Denote thek-th directional filter by
Uk(z). Then for higher scales, we substituteUk(z2mI) for Uk(z) wherem is chosen to ensure that the good
part of the response overlaps with the pyramid passband. Figure 5 (b) illustrates a typical example. Note
that this modification preserves perfect reconstruction. Ina typical 5-scale decomposition, we upsample
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 7
���������������
���������������
��������������������
������������������������������
������������������������������
replacements (π, π)
(−π,−π)
ω1
ω2
(a)
���������
���������
“Good”“Bad”
������������
������������
������������
������������
������������
������������
������������
������������
������������
������������
(π, π)
(−π,−π)
ω1
ω2
(b)
Fig. 5. The need for upsampling in the NSCT. (a) With no upsampling, the highpass at higher scales will be filtered by the
portion of the directional filter that has “bad” response. (b) Upsamplingensures that filtering is done in the “good” region.
by 2I the NSDFB filters of the last two stages.
Filtering with the upsampled filters does not increase computational complexity. Specifically, for a
given sampling matrixS and a 2-D filterH(z), to obtain the outputy[n] resulting from filteringx[n]
with H(zS), we use the convolution formula
y[n] =∑
k∈supp(h)
h[k]x[n − Sk]. (2)
Therefore, each filter in the NSDFB tree has the same complexity asthat of the building-block fan
NSFB. Likewise, each filtering stage of the NSP has the same complexity as that incurred by the first
stage. Thus, the complexity of the NSCT is dictated by the complexity of the building-block NSFB’s. If
each NSFB in both NSP and NSDFB requiresL operations per output sample, then for an image ofN
pixels the NSCT requires aboutBNL operations whereB denotes the number of subbands. For instance,
if L = 32, a typical decomposition with 4 pyramid levels, 16 directions in the two finer scales and 8
directions in the two coarser scales would require a total of1536 operations per image pixel.
If the building block 2-channel NSFB’s in the NSP and NSDFB are invertible, then clearly the NSCT is
invertible and thus it implements a frame expansion (see Section II-C). The frame elements are localized
in space and oriented along a discrete set of directions. The NSCT is flexible in that it allows any
number of2l directions in each scale. In particular, it can satisfy the anisotropic scaling law — a key
property in establishing the expansion nonlinear approximation behavior [12], [13]. This property is
ensured by doubling the number of directions in the NSDFB expansion at every other scale. The NSCT
has redundancy given by1 +∑J
j=1 2lj , wherelj denotes the number of levels in the NSDFB at thej-th
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 8
scale.
B. Nonsubsampled Filter Banks
At the core of the proposed NSCT structure is the 2-D two-channel nonsubsampled filter bank. Shown
in Figure 6 are the NSFB’s needed to construct the NSCT. In this paper we focus exclusively on the FIR
case simply because it is easier to implement in multiple dimensions. For a general FIR two-channel
NSFB, perfect reconstruction is achieved provided the filters satisfy theBezout identity:
H0(z)G0(z) + H1(z)G1(z) = 1. (3)
The Bezout identity puts no constraint on the frequency response of the filters involved. Therefore, to
obtain good solutions one has to impose additional conditions on the filters.
+
H0(z)
H1(z)
G0(z)
G1(z)
x x
y0
y1
(a)
+x x
y0
y1
U0(z)
U1(z)
V0(z)
V1(z)
(b)
Fig. 6. The two-channel nonsubsampled filter banks used in the NSCT. The system is two times redundant and the reconstruction
is error free when the filters satisfy Bezout’s identity. (a) Pyramid NSFB.(b) Fan NSFB.
C. Frame Analysis of the NSCT
The nonsubsampled filter bank can be interpreted in terms of analysis/synthesis operators offrame
systems.A family of vectors{φn}n∈Γ constitute a frame for a Hilbert spaceH if there exist two positive
constantsA, B such that for eachx ∈ H we have
A‖x‖2 ≤∑
n∈Γ
|〈x, φn〉|2 ≤ B‖x‖2. (4)
In the event thatA = B the frame is said to be tight. Theframe boundsare the tightest positive constants
satisfying (4).
Consider the NSFB of Figure 6 (a). The family{h0[·−n], h1[·−n]}n∈Z2 is a frame forℓ2(Z2) if and
only if there exist constants0 < A ≤ B < ∞ such that [19]
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 9
A ≤ |H0(ejω)|2 + |H1(e
jω)|2︸ ︷︷ ︸
t(ejω)
≤ B. (5)
Thus, the frame bounds of an NSFB can be computed by
A = ess.infω∈[−π,π]2
t(ejω), B = ess.supω∈[−π,π]2
t(ejω), (6)
where ess.inf, and ess.sup denote the essential infimum and essential supremum respectively. From (5),
we see that the frame is tight whenevert(ejω) is almost everywhere constant. As a result, FIR filters
corresponding to tight frames cannot be linear-phase, except for Haar-type filters (see [18] pp. 337-338).
Because the NSFB is redundant, an infinite number of inverses exist. Among them, thepseudo-inverse
is the best inverse in a number of senses [1]. Given a frame of analysis filters, the synthesis filters
corresponding to the frame pseudo-inverse are given byGi(z) = Hi(z)/t(z) for i = 0, 1 [19]. In this
case, the synthesis filters form thedual framewith lower and upper frame bounds given byB−1 and
A−1 respectively. When the analysis filters are FIR, then unless the frame is tight, the synthesis filters
of the pseudo-inverse will be IIR.
From the above discussion we gather two important points: (1)linear phase filters and tight frames are
mutually exclusive; (2) the pseudo-inverse is desirable, but is IIR if the frame is not tight. Consequently,
an FIR NSFB system with linear phase filters and with synthesis filters corresponding to the pseudo-
inverse is not possible. However, we can approximate the pseudo-inverse with FIR filters. For a given
number of filter taps, the closer the frame is to being tight, the better an FIR approximation of the
pseudo-inverse can be [20]. Thus, in the designs we seek linear phase filters that underly a frame that is
as close to a tight one as possible.
In a general FIR perfect reconstruction NSFB system both analysis and synthesis filters form a frame.
If we denote the analysis and synthesis frame bounds byAa, Ba andAs, Bs respectively, the frames will
be close to tight provided
ra := Aa/Ba ≈ 1, andrs := As/Bs ≈ 1.
We always assume that the filters are normalized so that we haveAa ≤ 1 ≤ Ba. In case the pseudo-
inverse is used, then we also haveAs ≤ 1 ≤ Bs. The following result shows the NSCT is a frame
operator forℓ2(Z2) whenever the constituent NSFB’s each forms a frame.
Proposition 1: In the nonsubsampled contourlet transform if the pyramid filter bank constitute a frame
with frame boundsAp andBp, and the fan filters constitute a frame with frame boundsAq andBq, then
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 10
J A actual A estimated B actual B estimated
1 1.0435 1.0435 0.9586 0.9596
2 1.0504 1.0889 0.9393 0.9189
3 1.0515 1.1362 0.9332 0.8808
4 1.0517 1.1857 0.9316 0.8444
TABLE I
FRAME BOUNDS EVOLVING WITH SCALE FOR THE PYRAMID FILTERS GIVEN IN EXAMPLE 1 IN SECTION III
the NSCT is a frame with boundsA andB satisfying
AJp Amin {lj}
q ≤ A ≤ B ≤ BJp Bmax {lj}
q
Proof: See Appendix A.
Remark 1:When both the pyramid and fan filter banks form tight frames with bound 1, thenAp =
Aq = Bp = Bq = 1, and from the above proposition, the nonsubsampled contourlet transform is also a
tight frame with bound 1.
The above estimates onA andB can be accurate in some cases, especially when the frame is close
to a tight one and the number of levels is small (e.g.,J = 4). In general however they are not accurate
estimates. Their purpose is more of giving an interval for theframe bounds rather than the actual values.
Table I shows estimates for different number of scales. The actual frame bound is computed from (5 -
6) whereas the estimates are the ones given according to Proposition 1.
III. F ILTER DESIGN AND IMPLEMENTATION
The filter design problem of the NSCT comprises the two basic NSFB’sdisplayed in Figure 6. The
goal is to design the filters imposing the Bezout identity (i.e. perfect reconstruction) and enforcing other
properties such as sharp frequency response, easy implementation, regularity of the frame elements, and
tightness of the corresponding frames. It is also desirablethat the filters are linear-phase.
Two-channel 1-D NSFB’s that underly tight frames are designedin [19]. However, the design method-
ology of [19] is not easy to extend to 2-D designs since it relies on spectral factorization which is hard
in 2-D. If we relax the tightness constraint then the design becomes more flexible. In addition, as we
alluded to earlier, non-tight filters can be linear phase.
An effective and simple way to design 2-D filters is themappingapproach first proposed by McClellan
[21] in the context of digital filters and then used by several authors [22], [23], [24], [25] in the context
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 11
of filter banks. In such approach, the 2-D filters are obtained from 1-D ones. In the context of NSFB’s,
a set of perfect reconstruction 2-D filters is obtained in the following way:
Step 1. Construct a set of 1-D polynomials{H(1D)i (x), G
(1D)i (x)}i=0,1 that satisfies the Bezout identity.
Step 2. Given a 2-D FIR filterf(z), then{H(1D)i (f(z)), G
(1D)i (f(z))}i=0,1 are 2-D filters satisfying
the Bezout identity.
Thus, one has to design the set of 1-D filters and the mapping function f(z) so that the ideal responses
are well approximated with a small number of filter coefficients.
One advantage of the mapping approach is that one can controlthe frequency and phase responses
through the mapping function. Moreover, if the mapping function is zero-phase, thenf(z) = f(z−1) and
it follows that the mapped filters are also zero-phase. Therefore, we restrict our attention to zero-phase
designs only. In this case, on the unit sphere, the mapping function is a 2-D polynomial in(cos ω1, cos ω2).
We thus denote it byF (x1, x2), where it is implicit thatf(ejω) = F (cos ω1, cos ω2).
+
+
+ +
+
x x
Q(1D)(f(z)) −Q(1D)(f(z)) −P (1D)(f(z))P (1D)(f(z))
Fig. 7. Lifting structure for the nonsubsampled filter bank designed with themapping approach. The 1-D prototype is factored
with the Euclidean algorithm. The 2-D filters are obtained by replacingx 7→ f(z).
A. Implementation Through Lifting
Filters designed with the mapping approach can be efficiently factored into a ladder [26] or lifting [27]
structure that simplifies computations. To see this, assume without loss of generality that the degree of
the highpass prototype polynomialH(1D)1 (x) is smaller than that ofH(1D)
0 (x). Suppose also that there are
synthesis filtersG(1D)0 (x) andG
(1D)1 (x) such that the Bezout identity is satisfied. In this case it follows
that gcd{H(1D)0 , H
(1D)1 } = 1. The Euclidean algorithm then enables us to factor the filters inthe following
way [28], [26], [27]:
H
(1D)0 (x)
H(1D)1 (x)
=N∏
i=0
1 0
P(1D)i (x) 1
1 Q
(1D)i (x)
0 1
1
0
. (7)
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 12
As a result, we can obtain a 2-D factorization by replacingx with f(z). This factorization characterizes
every 2-D NSFB derived from 1-D through the mapping method. Figure 7 illustrates the ladder structure
with one stage.
In general, the lifting implementation at least halves the number of multiplications and additions of
the direct form [27]. The complexity can be reduced further ifthe lifting steps in the 1-D prototype are
monomials and the mapping filterf(z) has the form
f(z) = f1(zp1
1 zp2
2 )f2(zq1
1 zq2
2 ), (8)
for suitablef1(z), f2(z), and integersp1, p2, q1, q2. Note that iff(z) is a 1-D filter, then the 2-D filter
f(zp1zq
2) for integersp andq has the same complexity as that off(z). Therefore, filters of the form in (8)
have the same complexity as that of separable filters (i.e., filters of the formf(z) = f(z1)f(z2) ) which
amounts to two 1-D filtering operations. Notice that iff(z) is as in (8), then for an arbitrary sampling
matrix S, f(zS) also has the form in (17). Consequently, all NSFB’s in the NSDFB tree structure can
be implemented with 1-D operations whenever the prototype fan NSFB can be implemented with 1-D
filtering operations. The same reasoning applies to the NSFB’s ofthe NSP.
B. Pyramid Filter Design
In the pyramid case, we imposeline zerosat ω = (π, ω2) and ω = (ω1, π). Notice that anN -th
order line zero atω = (π, ω2), for example, amounts to a(1 + ejω1)N factor in the low pass filter. Such
zeros are useful to obtain good approximation of the ideal frequency response of Figure 6, in addition
to imposing regularity of the scaling function. We point outthat for the approximation of polynomial
images, point zeros at(±π,±π) would suffice [29]. However, our experience shows that point zeros alone
do not guarantee a “reasonable” frequency response of the pyramid filters. The following proposition
characterizes the mapping function that generates zeros inthe resulting 2-D filters.
Proposition 2: Let G(1D)(z) be a polynomial with roots{zi}ni=1 where eachzi has multiplicity ni.
Suppose we want a mapping functionF (x1, x2) such that
G(1D) (F (x1, x2)) = (x1 − c)N1 (x2 − d)N2 L(x1, x2), (9)
where L(x1, x2) is a bivariate polynomial. ThenG(1D)(F (x1, x2)) has the form in (9)if and only if
F (x1, x2) takes the form
F (x1, x2) = zj + (x1 − c)N ′
1 (x2 − d)N ′
2 LF (x1, x2), (10)
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 13
for some rootzj ∈ {zi}ni=1, where LF (x1, x2) is a bivariate polynomial, andN ′
1, N′2 are such that
N ′1ni ≥ N1 andN ′
2nj ≥ N2.
Proof: See Appendix B.
The above result holds in general, even for point zeros (as opposed to line zeros as in Proposition 2).
We will explore this extensively in the designs that follow.
Suppose the prototype filtersH(1D)0 (x), G
(1D)0 (x) each has zeros atx = −1. Then, in order to produce
a suitable zero-phase mapping function for the pyramid NSFB weconsider the class of maximally-flat
filters given by the polynomials [30]
PN,L(x) :=
(1 + x
2
)N L−1−N∑
l=0
(N + l − 1
l
)(1 − x
2
)l
, (11)
whereN controls the degree of flatness atx = −1 andL controls the degree flatness atx = 1. Following
Proposition 2, we can construct a family of mapping functionsas
F (x1, x2) = −1 + 2PN0,L0(x1)PN1,L1
(x2) (12)
so that zero moments atx1 = −1 andx2 = −1 are guaranteed. Note thatF (x1, x2) can be implemented
with 1-D filtering operations only.
C. Fan Filter Design
To design the fan filters idealized in Figure 6(b) we use the samemethodology as in the pyramid
case. The distinction occurs in the mapping function. The fan-shaped response can be obtained from
a diamond-shaped response by simple modulation in one of thefrequency variables. This modulation
preserves the perfect reconstruction property. A useful family of mapping functions for the diamond-
shaped response is obtained by imposing flatness aroundω = (±π,±π) and ω = (0, 0) in addition to
point zeros at(±π,±π). If the mapping function is zero-phase, this amounts to imposing flatness in a
polynomial at(x1, x2) = (−1,−1) and (x1, x2) = (1, 1), and zeros at(x1, x2) = (−1,−1). Mapping
functions satisfying these desiderata with minimum numberof polynomial coefficients are given by
FN (x1, x2) = −1 + QN (x1, x2), (13)
where the polynomialsQN (x1, x2) give the class of maximally-flat half-band filters with diamondsupport.
A closed-form expression forQN (x1, x2) is given in [31]. Table II displays the first six mapping functions
FN (x1, x2) for the diamond filter bank.
We point out that diamond maximally-flat mapping polynomialscan be generated from 1-D ones by
a separable product and then an appropriate change of variables. In this case the mapping function is
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 14
N FN (x1, x2)
1 12(x1 + x2)
2 14(x1 + x2)(3 − x1x2)
3 116
(x1 + x2)(15 − x21 − 8x1x2 − x2
2 + 3x21x
22)
4 132
(x1 + x2)(35 − 5x21 − 25x1x2 + 3x3
1x2 − 5x22 + 15x2
1x22 + 3x1x
32 − 5x3
1x32)
5 1256
(x1 + x2)�315 − 70x2
1 + 3x41 − 280x1x2 + 72x3
1x2 − 70x22 + 228x2
1x22
−30x41x
22 + 72x1x
32 − 120x3
1x32 + 3x4
2 − 30x21x
42 + 35x4
1x42
�6 1
512(x1 + x2)
�693 − 210x2
1 + 21x41 − 735x1x2 + 294x3
1x2 − 15x51x2 − 210x2
2 + 756x21x
22 − 210x4
1x22
+294x1x32 − 540x3
1x32 + 70x5
1x32 + 21x4
2 − 210x21x
42 + 245x4
1x42 − 15x1x
52 + 70x3
1x52 − 63x5
1x52
�TABLE II
MAXIMALLY FLAT MAPPING POLYNOMIALS USED IN THE DESIGN OF THE NONSUBSAMPLED FAN FILTER BANK.
separable and has the formf(z) = f1(z1z2)f2(z1z−12 ). This has been done for instance in [23]. However,
the nonseparable solution is generally shorter (roughly bya factor of 2) while yielding the same number
of zeros at the aliasing frequencies and a similar frequencyresponse. For faster implementation, one may
choose the longer mapping filters which can be implemented with 1-D filtering operations.
D. Design Examples
The design through mapping is based on a set of 1-D polynomialsthat satisfies the Bezout identity,
that is, H(1D)0 (x)G
(1D)0 (x) + H
(1D)1 (x)G
(1D)1 (x) = 1. The design can be simplified if we impose the
restriction thatH(1D)1 (x) = G
(1D)0 (−x) and G
(1D)1 (x) = H
(1D)0 (−x). One advantage of this choice is
that the frequency response of the filters can be controlled bythe lowpass branch — the highpass will
automatically have the complementary response. Another advantage is that, under an additional condition,
the frame bounds of the analysis and synthesis frames are thesame and can be computed from the 1-D
prototypes. To see this, suppose thatf is the mapping function and thatRan(f) = Ran(−f) with
Ran(f) denoting the range of the mapping functionf . Then we have that
Aa = infω∈[π,π]2
|H(1D)0 (f(ejω))|2 + |G(1D)
0 (−f(ejω))|2
= infx∈Ran(f)
|H(1D)0 (x)|2 + |G(1D)
0 (−x)|2 = infx∈Ran(−f)
|H(1D)0 (−x)|2 + |G(1D)
0 (x)|2
= infx∈Ran(f)
|H(1D)0 (−x)|2 + |G(1D)
0 (x)|2 = As. (14)
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 15
A similar argument shows that
Ba = Bs = supx∈Ran(f)
|H(1D)0 (−x)|2 + |G(1D)
0 (x)|2. (15)
Example 1 ( Pyramid filters very close to tight ones ):In order to get filters that are almost tight we
design the prototypesH(1D)0 (x) and G
(1D)0 (x) to be very close to each other. If we letH
(1D)1 (x) =
G(1D)0 (−x) andG
(1D)1 (x) = H
(1D)0 (−x), then the following filters can be checked to satisfy the Bezout
identity:
H(1D)0 (x) = K
(1 + k2x + k2k3x
2),
G(1D)0 (x) = K−1
(1 − k1x − k3x + k1k2x
2 − k1k2k3x3).
To obtain the prototype we choose the constantsK, k1, k2, and k3 such that each filter has a zero at
z = −1 and in addition thatH(1D)0 (−1) = G
(1D)0 (−1) =
√2. We obtain
H(1D)0 (x) =
1
2(x + 1)
(√2 + (1 −
√2)x)
,
G(1D)0 (x) =
1
2(x + 1)
(√2 + (4 − 3
√2)x + (2
√2 − 3)x2
)
.
The lifting factorization of the prototype filters is given by
H
(1D)0 (x)
H(1D)1 (x)
=
1 αx
0 1
1 0
βx 1
1 γx
0 1
K1
K2
(16)
with
α = γ = 1 −√
2, β =1√2, K1 = K2 =
1√2.
Notice that this implementation has 5 multiplies/sample whereas the direct form has 10 multiplies/sample.
In this example we setF (x1, x2) = −1 + 2P2,4(x)P2,4(y) so that each of the filters has a 4-th order
zero atω1 = ±π and atω2 = ±π. Since the ladder steps are monomials, the NSFB can be implemented
with 1-D filtering operations. The frame bounds are computed using (14-15):
Aa = As = 0.96, Ba = Bs = 1.04.
Thus we havera = rs = 1.083 and the frame is almost tight. The support size ofH0(z) is 13 × 13
whereasG0(z) has support size19 × 19. Figure 8 shows the response of the resulting filters.
Example 2 (Maximally-flat fan filters):Using the prototype filters of Example 1, we modulate a maximally-
flat diamond mapping function to obtain the maximally-flat fan mapping function. Thus we choose
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 16
-2
0
2
-2
0
2
0
0.25
0.5
0.75
1
-2
0
2
(a) |H0(ejω)|
-2
0
2
-2
0
2
0
0.25
0.5
0.75
1
-2
0
2
(b) |G0(ejω)|
-2
0
2
-2
0
2
0
0.25
0.5
0.75
1
-2
0
2
(c) |H1(ejω)|
-2
0
2
-2
0
2
0
0.25
0.5
0.75
1
-2
0
2
(d) |G1(ejω)|
Fig. 8. Magnitude response of the filters designed in Example 1 with maximally-flat filters. The nonsubsampled pyramid filter
bank underlies almost tight analysis and synthesis frames.
FN (x1, x2) in Table II with N = 3. Figure 9 shows the magnitude response of the filters. The support
size ofU0(z) is 21 × 21 whereas the support size ofV0(z) is 31 × 31.
E. Regularity of the NSCT basis functions
The multi-scale property of the NSCT is obtained through the nonsubsampled pyramid implemented
with a 2-D a trous algorithm. Denoting byH0(ω) the scaling lowpass filter used in the pyramid (we
write H0(ω) instead ofH0(ejω) for convenience), we have the associated scaling function
Φ(ω) :=∞∏
j=1
H0(2−j
ω),
where convergence is in the weak sense. In our proposed design the filterH0(ω) can be factored as
H0(ω) =
(1 + ejω1
2
)N1(
1 + ejω2
2
)N2
RH0(ω). (17)
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 17
-2
0
2
-2
0
2
0
0.25
0.5
0.75
1
-2
0
2
(a) |U0(ejω)|
-2
0
2
-2
0
2
0
0.25
0.5
0.75
1
-2
0
2
(b) |V0(ejω)|
-2
0
2
-2
0
2
0
0.25
0.5
0.75
1
-2
0
2
(c) |U1(ejω)|
-2
0
2
-2
0
2
0
0.25
0.5
0.75
1
-2
0
2
(d) |V1(ejω)|
Fig. 9. Fan filters designed with prototype filters of Example 1 and diamond maximally-flat mapping filters.
Notice that the remainder filterRH0(ω) is not separable. Therefore, one cannot separate the regularity
estimation as two 1-D problems. Nonetheless, a similar argument can be developed and an estimate of
the 2-D regularity of the scaling filter is obtained.
Proposition 3: Let H0(ω) be a scaling filter as in (17) with the corresponding scaling functionΦ(ω).
Let
B = supω∈[π,π]2
|RH0(ω)|.
Then
|Φ(ω)| ≤ C
(1
1 + |ω1|
)N1−log2B ( 1
1 + |ω2|
)N2−log2B
.
Proof: See Appendix C.
As an example, consider the prototype filter in Example 1 and themapping F (x1, x2) = −1 +
2P1,2(x)P1,2(y). The resulting filter has second-order zeros atω1 = ±π and atω2 = ±π. It can be verified
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 18
that|RH0(ω)| / 1.83 and|RG0
(ω)| / 1.49 so that the regularity exponent2 is at least2−log2 1.83 ≈ 1.13
for H0(ω) and2 − log2 1.49 ≈ 1.43 for G0(ω). Thus the corresponding scaling functions and wavelets
are at least continuous. We point out that better estimates are possible applying similar 1-D techniques.
For instance, one could prove a result similar to Lemma 7.1.2 in [30], pp. 217, as a consequence of
Proposition 3.
Figure 10 shows the basis functions of the NSCT obtained with the filters designed via mapping. As
the picture shows, the functions have a good degree of regularity.
IV. A PPLICATIONS
A. Image Denoising
In order to illustrate the potential of the NSCT designed using the techniques previously discussed we
study additive white Gaussian noise (AWGN) removal from images by means of thresholding estimators.
We test the NSCT under two denoising schemes described next.
1) Hard Threshold:For the hard threshold estimator we choose a global threshold Ti,j = Kσnijfor
each directional subband. This has been termedK-sigma thresholding in [32]. We setK = 4 for the finer
scale andK = 3 for the remaining ones. We refer to this method as NSWT-HT whenapplied to NSWT
coefficients and NSCT-HT when applied to NSCT coefficients. We usefive scales of decomposition for
both NSCT and NSWT. For the NSCT we use 4,8,8,16,16 directions inthe scales from coarser to finer
respectively.
2) Local adaptive shrinkage:We perform soft thresholding (shrinkage) independently ineach subband.
Following the work in [5] we choose the threshold
Ti,j =σ2
Nij
σi,j,n
,
whereσi,j,n denotes the variance of then-th coefficient at thei-th directional subband of thej-th scale,
andσ2Nij
is the noise variance at scalej and directioni. It is shown in [5] that shrinkage estimation with
T = σ2
σX, and assumingX generalized Gaussian distributed yields a risk within 5% ofthe optimal Bayes
risk. As studied in [33], contourlet coefficients are well modeled by generalized Gaussian distributions.
The signal variances are estimated locally using the neighboring coefficients contained in a square window
within each subband and a maximum likelihood estimator. The noise variance in each subband is inferred
using a Monte-Carlo technique where the variances are computed for a few normalized noise images
2The regularity exponent of a scaling functionφ(t) is the largest numberα such thatΦ(ω) decays as fast as 1(1+|ω1|+|ω2|)
α .
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 19
Basis functions, Level 2
(a)
Basis functions, Level 3
Basis functions, Level 4
(b)
Fig. 10. Basis functions of the nonsubsampled contourlet transform. (a) Basis functions of the second stage of the pyramid,
(b) Basis functions of third (top 8) and fourth (bottom 8) stages of the pyramid.
and then averaged to stabilize the results. We refer to this method aslocal adaptive shrinkage(LAS).
Effectively, our LAS method is a simplified version of the denoising method proposed in [34] that works
in the NSCT or NSWT domain.
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 20
To benchmark the performance of the NSCT we have used some of the best denoising schemes reported
in the literature:
1) Hard thresholding with the NSWT.
2) Hard thresholding in the curvelet transform domain [32].
3) Bivariate shrinkage with local variance estimation [6].
4) Bayes least-squares with a Gaussian scale-mixture model(BLS-GSM) proposed in [7].
For the LAS estimator we use four scales for both the NSCT and NSWT. For the NSCT we use 3,3,4,4
directions in the scales from coarser to finer respectively.
We use three512 × 512 standard images in our test suit: “Lena”, “Barbara”, and “Peppers”. Table
III shows the PSNR results for the various methods used in our study3. The results show that for hard
thresholding, the NSCT is consistently superior to curvelets and NSWT in PSNR measure. For the
“Barbara” image the NSCT-HT yields improvements in excess of1.90 dB in PSNR over the NSWT-HT.
Figure 11 displays the reconstructed images using the the NSWT-HT, curvelets and NSCT-HT. As the
figure shows, both the NSCT and the curvelet transform offers a better recovery of edge information
relative to the NSWT. But improvements can be seen in the NSCT, particularly around the eye.
The NSCT coupled with the local adaptive shrinkage estimator (NSCT-LAS) produced very satisfactory
results as well. In particular, among the methods studied, the NSCT-LAS yields the best results for the
“Barbara” image, being surpassed by the BLS-GSM method for the other images. Despite its slight loss in
performance relative to BLS-GSM, we believe the NSCT has potential for better results. This is because
by comparison, the BLS-GSM is a considerably richer and more sophisticated estimation method than our
simple local thresholding estimator. However, studying more complex denoising methods in the NSCT
domain is beyond the scope of the present paper. Figure 12 displays the denoised images with both BLS-
GLM and NSCT-LAS methods. As the pictures show, the NSCT offers a slightly better reconstruction.
In particular, the table cloth texture is better recovered in the NSCT-LAS scheme.
We briefly mention that in denoising applications, one can reduce the redundancy of the NSCT by using
critically sampled directional filter banks over the nonsubsampled pyramid. This results in a transform
with J + 1 redundancy which is considerably faster. There is however a loss in performance as Table
IV shows. Nonetheless, in some applications, the small performance loss might be a good price to pay
given the reduced redundancy of this alternative construction.
3The PSNR values of the BivShrink method were obtained from the tables in [6]. In the results shown in [6] the authors do
not use the “Peppers” image as a test image, hence we do not have a BivShrink column for “Peppers”.
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 21
“Lena” image, PSNR (dB)
σ Noisy NSWT-HT Curvelets [32] NSCT-HT NSWT-LAS BivShrink [6] BLS-GSM [7] NSCT-LAS
10 28.13 34.26 34.17 34.69 35.19 35.34 35.59 35.46
20 22.13 31.40 31.52 32.03 32.12 32.40 32.62 32.50
30 18.63 29.66 30.01 30.35 30.30 30.54 30.84 30.70
40 16.13 28.37 28.84 29.10 29.01 - 29.58 29.38
50 14.20 27.41 27.78 28.10 28.00 - 28.61 28.34
“Barbara” image, PSNR (dB)
σ Noisy NSWT-HT Curvelets [32] NSCT-HT NSWT-LAS BivShrink [6] BLS-GSM [7] NSCT-LAS
10 28.17 31.58 32.28 33.01 33.40 33.35 34.03 34.09
20 22.15 27.23 28.89 29.41 29.45 29.80 30.28 30.60
30 18.63 25.10 26.93 27.24 27.22 27.65 28.11 28.56
40 16.14 24.02 25.51 25.79 25.76 - 26.58 27.12
50 14.20 23.37 24.31 24.79 24.72 - 25.43 26.02
“Peppers” image, PSNR (dB)
σ Noisy NSWT-HT Curvelets [32] NSCT-HT NSWT-LAS BLS-GSM [7] NSCT-LAS
10 28.17 33.71 33.59 33.81 34.35 34.63 34.41
20 22.15 31.19 31.13 31.60 31.65 32.06 31.82
30 18.63 29.43 29.45 30.07 29.95 30.41 30.19
40 16.13 28.09 28.01 28.85 28.65 29.20 28.95
50 14.20 27.04 26.70 27.82 27.62 28.24 27.93
TABLE III
DENOISING PERFORMANCE OF THENSCT. THE LEFT-MOST COLUMNS ARE HARD THRESHOLDING AND THE RIGHT-MOST
ONES SOFT ESTIMATORS. FOR HARD THRESHOLDING, THE NSCT CONSISTENTLY OUTPERFORMS CURVELETS AND THE
NSWT. THE NSCT-LAS PERFORMS ON A PAR WITH THE MORE SOPHISTICATED ESTIMATORBLS-GSM.
σ “Lena” “Barbara” “Peppers”
10 -0.23 -0.48 -0.32
20 -0.09 -0.44 -0.09
30 -0.05 -0.37 -0.01
40 -0.04 -0.34 -0.00
50 -0.05 -0.31 -0.01
TABLE IV
RELATIVE LOSS IN PSNRPERFORMANCE(DB) WHEN USING THENSPWITH A CRITICALLY SAMPLED DFB AND THE LAS
ESTIMATOR WITH RESPECT TO THENSCT-LAS METHOD.
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 22
(a) Original (b) NSWT-HT
(c) Curvelets-HT (d) NSCT-HT
Fig. 11. Image denoising with the NSCT and hard thresholding. The noisy intensity is20. (a) Original Lena image. (b) Denoised
with the NSWT-HT, PSNR = 31.40dB. (c) Denoised with the curvelet transform and hard thresholding, PSNR = 31.52dB. (d)
Denoised with the NSCT-HT, PSNR = 32.03dB.
B. Image Enhancement
Existing image enhancement methods amplify noise when they amplify weak edges since they cannot
distinguish noise from weak edges. In the frequency domain,both weak edges and noise produce low-
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 23
(a) Original (b) BLS-GSM (c) NSCT-LAS
Fig. 12. Comparison between the NSCT-LAS and BLS-GSM denoising methods. The noise intensity is 20. (a) Original Barbara
image. (b) Denoised with the BLS-GSM method of [7], PSNR = 30.28dB. (c) Denoised with NSCT-LAS, PSNR = 30.60dB.
magnitude coefficients. The NSCT provides not only multiresolution analysis, but also geometrical and
directional representation. Since weak edges are geometricstructures and noise is not, we can use this
geometric representation to distinguish them.
The NSCT is shift-invariant so that each pixel of the transformsubbands corresponds to that of the
original image in the same location. Therefore, we gather thegeometrical information pixel by pixel from
the NSCT coefficients. We observe that there are three classes of pixels: strong edges, weak edges, and
noise. First, the strong edges correspond to those pixels with large magnitude coefficients in all subbands.
Second, the weak edges correspond to those pixels with large magnitude coefficients in some directional
subbands but small magnitude coefficients in other directional subbands within the same scale. Finally,
the noise corresponds to those pixels with small magnitude coefficients in all subbands. Based on this
observation, we can classify pixels into three categories by analyzing the distribution of their coefficients
in different subbands. One simple way is to compute the mean (denoted bymean) and the maximum
(denoted bymax) magnitude of the coefficients for each pixel, and then classify it by
strong edge if mean ≥ cσ
weak edge if mean < cσ, max ≥ cσ
noise if mean < cσ, max < cσ
, (18)
wherec is a parameter ranging from1 to 5, andσ is the noise standard deviation of the subbands at a
specific level. We first estimate the noise variance of the inputimage with the robust median operator
[34] and then compute the noise variance of each subband [32].
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 24
The goal of image enhancement is to amplify weak edges and to suppress noise. To this end, we
modify the NSCT coefficients according to the category of each pixel by a nonlinear mapping function
(similar to [35])
y(x) =
x strong edge pixels
max(( cσ|x|)
p, 1)x, weak edge pixels
0 noise pixels
, (19)
where the inputx is the original coefficient, and0 < p < 1 is the amplifying ratio. This function keeps
the coefficients of strong edges, amplifies the coefficients of weak edges, and zeros the noise coefficients.
We summarize our enhancement method using the NSCT in the following algorithm:
1) Compute the NSCT of the input image forN levels.
2) Estimate the noise standard deviation of the input image.
3) For each level DFB,
a) Estimate the noise variance.
b) Compute the threshold and the amplifying ratio.
c) At each pixel, compute the mean and the maximum magnitude of all directional subbands at
this level, and classify it by (18) into strong edges, weak edges, or noise.
d) For each directional subband, use the nonlinear mapping function given in (19) to modify the
NSCT coefficients according to the classification.
4) Reconstruct the enhanced image from the modified NSCT coefficients.
We compare the enhancement results by the proposed algorithm with those by the NSWT. In the
experiments, we choosec = 4 and p = 0.3. Figure 13 shows the results obtained for the “Barbara”
image. We observe that our proposed algorithm does a better job in enhancing the weak edges in the
textures.
V. CONCLUSION
We have proposed a fully shift-invariant version of the contourlet transform, the nonsubsampled con-
tourlet transform. The proposed construction reduces the filter design problem to that of a nonsubsampled
pyramid filter bank and a nonsubsampled fan filter bank. We have addressed the 2-D design problem
using the mapping approach, thus overcoming the need of factorization. We also developed a lifting/ladder
structure for the 2-D NSFB. This structure, when coupled with the filters designed via mapping, provides
a very efficient implementation that under some additional conditions can be reduced to 1-D filtering
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 25
(a)
(b) (c)
Fig. 13. Image enhancement with the NSCT. (a) Original “Barbara” image. (b) Enhanced by the NSWT. (c) Enhanced by the
NSCT.
operations. Applications of our proposed transform in image denoising and enhancement were studied.
In denoising, we studied the performance of the NSCT when coupled with a hard thresholding estimator
and a local adaptive shrinkage. For hard thresholding, our results indicate that the NSCT provides better
performance than competing transform such as the NSWT and curvelets. Concurrently, our local adaptive
shrinkage results are competitive to other denoising methods. In particular, our results show that a
fairly simple estimator in the NSCT domain yields comparableperformance to state-of-the-art denoising
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 26
methods that are more sophisticated and complex. In image enhancement, the results obtained with the
NSCT are visually superior to those of the NSWT.
APPENDIX
A. Proof of Proposition 1
Consider the pyramid shown in Figure 2 (a). IfJ = 1, then we have that‖y0‖2 + ‖y1‖2 ≤ Bp‖x‖2.
Now, suppose we haveJ levels and assume∑J
j=0 ‖yi‖2 ≤ BJp ‖x‖2. Then if we further splityJ into y′J
andyJ+1, noting thatBp ≥ 1 we have that
J−1∑
j=0
‖yj‖2 + ‖y′J‖2 + ‖yJ+1‖2 ≤J−1∑
j=0
‖yj‖2 + Bp‖yJ‖2
≤ Bp
J−1∑
j=0
‖yj‖2 + ‖yJ‖2
≤ BJ+1p ‖x‖2.
Thus, by induction we conclude that∑J
j=0 ‖yi‖2 ≤ BJp ‖x‖2 for any J ≥ 1. A similar argument shows
that in the NSDFB withlj stages, one has that∑2lj−1
k=0 ‖yj,k‖2 ≤ Bljq ‖yj‖2 so that
‖yJ‖2 +J−1∑
j=0
2lj−1∑
k=0
‖yj,k‖2 ≤ ‖yJ‖2 +J−1∑
j=0
Bljq ‖yj‖2
≤ ‖yJ‖2 + Bmax {lj}q
J−1∑
j=0
‖yj‖2
≤ BJBmax {lj}q ‖x‖2.
The bound forA is proved similarly, by just reversing the inequalities.�
B. Proof of Proposition 2
We prove the claim for the case in which the zeros ofG(1D)(z) are distinct. The proof for the case of
repeated roots can be handled similarly.
Denote
G(1D)(z) = g0
n∏
i=1
(z − zi), g0 ∈ C. (20)
Then sufficiency follows by direct substitution of (10) in (20).
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 27
We prove necessity by induction. SupposeG(1D)(F (x1, x2)) = (x1 − c)N1L(x1, x2) for some poly-
nomial L(x1, x2). Note thatG(1D)(F (c, x2)) = 0 for all x2 if and only if F (c, x2) = zj for all x2 and
some zerozj of G(1D)(z). So, it follows that
(F (x1, x2) − zj)x1=c= 0 for all x2, (21)
which implies thatF (x1, x2) = zj + (x1 − c)L1(x1, x2) with L1(x1, x2) a polynomial. Suppose
F (x1, x2) = zj + (x1 − c)k−1Lk−1(x1, x2),
wherek ≤ N1. By successively applying the chain rule for differentiation we get that
0 =∂k−1G(1D)(F (x1, x2))
∂xk−11
∣∣∣∣∣x1=c
= G(1D)′(F (c, x2))∂k−1F (x1, x2)
∂xk−11
∣∣∣∣∣x1=c
.
BecauseF (c, x2) = zj and thezj ’s are distinct, we have thatG(1D)′(F (c, x2)) 6= 0 and then
∂k−1F (x1, x2)
∂xk−11
∣∣∣∣∣x1=c
= 0
⇒ ∂k−1F (x1, x2)
∂xk−11
= (x1 − c)Lk−1(x1, x2)
⇒ ∂F (x1, x2)
∂x1= (x1 − c)k−1Lk−1(x1, x2). (22)
Combining (22) with (21) we obtain thatF (x1, x2) = zj + (x1 − c)kLk(x1, x2). By induction we
conclude that
F (x1, x2) = zj + (x1 − c)N1LN1(x1, x2).
If G(1D)(F (x1, x2)) = (x2 − d)N2L(x1, x2) for some polynomialL(x1, x2) then a similar argument
shows that
F (x1, x2) = zi + (x2 − d)N2LN2(x1, x2),
wherezi is a zero ofG(1D)(z). Thus,
zi − zj = (x1 − c)N1LN1(x1, x2) − (x2 − d)N2LN2
(x1, x2),
and hence we havezj = zi. Therefore,
(x1 − c)N1LN1(x1, x2) = (x2 − d)N2LN2
(x1, x2),
and so
LN1(x1, x2) = (x2 − d)N2LF (x1, x2),
and (10) is established withN ′1 = N1 andN ′
2 = N2. �
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 28
C. Proof of Proposition 3
The proof follows the same lines as for the 1-D case (see e.g., [1] pp. 245). Using the identity
∞∏
k=1
(
1 + ej 2−kω
2
)N
=
(1 + ejω
ω
)N
we have that
Φ(ω) =
(1 + ejω1
ω1
)N1(
1 + ejω2
ω2
)N2 ∞∏
j=1
R0(2−j
ω).
BecauseH0 is continuously differentiable atω = 0, the same also holds forR0. Now write R0 in polar
coordinates(r, ϕ) wherer2 = |ω1|2 + |ω2|2. SinceR0 is continuously differentiable, for eachϕ, R0(r, ϕ)
is a continuously differentiable function ofr. SinceR0(0) = 1, from the mean value theorem we have
that for ǫ > 0, and0 ≤ r < ǫ,
|R(r, ϕ)| ≤ 1 + |∂rR(ρ, ϕ)| r ≤ 1 + Kr
whereK := sup{|R(ρ, ϕ)| : 0 ≤ ρ < ǫ, ϕ ∈ [0, 2π]} < ∞. This gives
0 ≤ r < ǫ =⇒∞∏
j=1
R0(2−j
ω) ≤∞∏
j=1
(1 + K2−jr) ≤ eKǫ,
where we have used the inequality1 + r ≤ er. Now chooseJ so that2J−1ǫ ≤ r ≤ 2Jǫ. We then obtain,
for r > ǫ,
∞∏
j=1
|R0(2−j
ω)| =J∏
j=1
|R0(2−j
ω)|∞∏
j=1
|R0(2−j−J
ω)| ≤ BJeKǫ ≤ C1rlog
2BeKǫ
so that for eachω ∈ R2,∞∏
j=1
R0(2−j
ω) ≤ eKǫ(
1 + C1rlog
2B)
= eKǫ(
1 + C1(|ω1|2 + |ω2|2)1
2log
2B)
.
Now, putting all together we obtain
|Φ(ω)| ≤ C21
|ω1|N1
1
|ω2|N2
(
1 + C1(|ω1|2 + |ω2|2)1
2log
2B)
≤ C3
(1
1 + |ω1|
)N1−log2B ( 1
1 + |ω2|
)N2−log2B
,
which completes the proof.�
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 29
ACKNOWLEDGMENT
We would like to thank Dr. Jean-Luc Starck for providing the curvelet denoising software and Jason
Laska for helping with the implementation of the NSCT.
REFERENCES
[1] S. Mallat, A Wavelet Tour of Sinal Processing, 2nd ed. Academic Press, 1999.
[2] M. Vetterli and J. Kovacevic, Wavelets and Subband Coding. Englewood Cliffs, NJ, USA: Prencitce Hall, 1995.
[3] E. P. Simoncelli, W. T. Freeman, E. H. Adelson, and D. J. Heeger,“Shiftable multiscale transforms,”IEEE Trans. Info.
Theory, vol. 38, no. 2, pp. 587–607, March 1992.
[4] R. R. Coifman and D. L. Donoho, “Translation invariant de-noising,” in Wavelets and Statistics, A. Antoniadis and
G. Oppenheim, Eds. New York: Springer-Verlag, 1995, pp. 125–150.
[5] S. G. Chang, B. Yu, and M. Vetterli, “Adaptive wavelet thresholdingfor image denoising and compression,”IEEE Trans.
Img. Proc., vol. 9, no. 9, pp. 1532–1546, September 2000.
[6] L. Sendur and I. W. Selesnick, “Bivariate shrinkage with local variance estimation,”IEEE Signal Processing Letters, vol. 9,
no. 12, pp. 438–441, December 2002.
[7] J. Portilla, V. Strela, M. J. Wainwright, and E. P. Simoncelli, “Image denoising using scale mixtures of Gaussians in the
wavelet domain,”IEEE Trans. Img. Proc., vol. 12, no. 11, pp. 1338–1351, 2003.
[8] M. J. Shensa, “The discrete wavelet transform: Wedding thea trous and Mallat algorithms.”IEEE Trans. Signal Proc.,
vol. 40, no. 10, pp. 2464–2482, October 1992.
[9] D. L. Donoho, “Wedgelets: nearly minimax estimation of edges,”Ann. Statist., vol. 27, no. 3, pp. 859–897, 1999.
[10] M. B. Wakin, J. K. Romberg, H. Choi, and R. G. Baraniuk, “Wavelet-domain approximation and compression of piecewise
smooth images,” inIEEE Trans. Img. Proc., to appear, 2005.
[11] E. L. Pennec and S. Mallat, “Sparse geometric image representation with bandelets,”IEEE Tras. Img. Proc., vol. 14, no. 4,
pp. 423–438, April 2005.
[12] E. J. Candes and D. L. Donoho, “New tight frames of curvelets and optimal representations of objects with piecewiseC2
singularities,”Comm. Pure and Appl. Math, vol. 57, no. 2, pp. 219–266, February 2004.
[13] M. N. Do and M. Vetterli, “The contourlet transform: An efficient directional multiresolution image representation,”IEEE
Trans. Img. Proc., to appear, 2005,http://www.ifp.uiuc.edu/˜minhdo/publications .
[14] P. J. Burt and E. H. Adelson, “The Laplacian pyramid as a compact image code,”IEEE Trans. Communications, vol. 31,
no. 4, pp. 532–540, April 1983.
[15] M. N. Do and M. Vetterli, “Framing pyramids,”IEEE Trans. Signal Processing, vol. 51, no. 9, pp. 2329 – 2342, Sept.
2003.
[16] R. H. Bamberger and M. J. T. Smith, “A filter bank for the directional decomposition of images: Theory and design,”
IEEE Trans. Signal Processing, vol. 40, no. 4, pp. 882–893, April 1992.
[17] J. L. Starck, F. Murtagh, and A. Bijaoui,Image Processing and Data Analysis. Cambridge University Press, 1998.
[18] P. P. Vaidyanathan,Multirate Systems and Filterbanks. Prentice Hall, 1993.
[19] Z. Cvetkovic and M. Vetterli, “Oversampled filter banks,”IEEE Transactions on Signal Processing, vol. 46, no. 5, pp.
1245–1255, May 1998.
May 19, 2005 DRAFT
IEEE TRANSACTIONS ON IMAGE PROCESSING, MAY 2005 30
[20] H. Bolcskei, F. Hlawatsch, and H. G. Feichtinger, “Frame-theoretic analysisof oversampled filter banks,”IEEE Trans.
Signal Proc., vol. 46, no. 12, pp. 3256–3268, December 1998.
[21] J. H. McClellan, “The design of two-dimensional digital filters by transformation,” Proc. 7th Annual Priceton Conf.
Information Sciences and Systems, 1973.
[22] J. Kovacevic and M. Vetterli, “Nonseparable multidimensional perfect reconstruction filter banks and wavelet bases for
Rn,” IEEE Trans. Information Theory, vol. 38, no. 2, pp. 533–555, March 1992.
[23] S.-M. Phoong, C. W. Kim, P. P. Vaidyanathan, and R. Ansari, “Anew class of two-channel biorthogonal filter banks and
wavelet bases,”IEEE Transactions on Signal Processing, vol. 43, no. 3, pp. 649–661, March 1995.
[24] D. B. H. Tay and N. G. Kingsbury, “Flexible design of multidimensional perfect reconstruction FIR 2-band filters using
transformation of variables,”IEEE Trans. Img. Proc., vol. 2, no. 4, pp. 466–480, October 1993.
[25] A. L. Cunha and M. N. Do, “On two-channel filter banks with directional vanishing moments,”IEEE Trans. Image
Processing, submitted,http://www.ifp.uiuc.edu/˜minhdo/publications .
[26] S. Mitra and R. Sherwood, “Digital ladder networks,”IEEE Transactions on Audio and Electroacoustics, vol. AU-21, no. 1,
pp. 30–36, February 1973.
[27] W. Sweldens, “The lifting scheme: A custom-design construction ofbiorthogonal wavelets,”Appl. Comput. Harmon. Anal.,
vol. 3, no. 2, pp. 186–200, 1996.
[28] R. E. Blahut,Fast Algorithms for Digital Signal Processing. Addison-Wesley, 1985.
[29] R. Jia, “Approximation properties of multivariate wavelets,”Mathematics of Computation, vol. 67, pp. 647–665, 1998.
[30] I. Daubechies,Ten Lectures on Wavelets. Philadelphia, PA: SIAM, 1992.
[31] T. Cooklev, T. Yoshida, and A. Nishihara, “Maximally flat half-band diamond-shaped FIR filters using the bernstein
polynomial,” IEEE Tras. CAS-II, vol. 40, no. 11, pp. 749–751, Nov. 1993.
[32] J.-L. Starck, E. J. Candes, and D. L. Donoho, “The curvelet transform for image denoising,” IEEE Trans. Image Proc.,
vol. 11, no. 6, pp. 670–684, June 2002.
[33] D. D.-Y. Po and M. N. Do, “Directional multiscale modeling of imagesusing the contourlet transform,”IEEE Trans. Img
Proc., to appear, 2005,http://www.ifp.uiuc.edu/˜minhdo/publications .
[34] S. G. Chang, B. Yu, and M. Vetterli, “Spatially adaptive wavelet thresholding with context modeling for image denoising,”
IEEE Trans. Img. Proc., vol. 9, no. 9, pp. 1522–1531, September 2000.
[35] K. V. Velde, “Multi-scale color image enhancement,”Proc. IEEE Int. Conf. on Image Proc. (ICIP), vol. 3, pp. 584–587,
1999.
May 19, 2005 DRAFT