
IEEE TRANSACTIONS ON IMAGE PROCESSING 1

Multidimensional Directional Filter Banks and Surfacelets

Yue Lu, Student Member, IEEE, and Minh N. Do, Member, IEEE

Abstract— In 1992, Bamberger and Smith proposed the directional filter bank (DFB) for an efficient directional decomposition of two-dimensional (2-D) signals. Due to the nonseparable nature of the system, extending the DFB to higher dimensions while still retaining its attractive features is a challenging and previously unsolved problem. We propose a new family of filter banks, named NDFB, that can achieve the directional decomposition of arbitrary N-dimensional (N ≥ 2) signals with a simple and efficient tree-structured construction. In 3-D, the ideal passbands of the proposed NDFB are rectangular-based pyramids radiating out from the origin at different orientations and tiling the entire frequency space. The proposed NDFB achieves perfect reconstruction via an iterated filter bank with a redundancy factor of N in N-D. The angular resolution of the proposed NDFB can be iteratively refined by invoking more levels of decomposition through a simple expansion rule. By combining the NDFB with a new multiscale pyramid, we propose the surfacelet transform, which can be used to efficiently capture and represent surface-like singularities in multidimensional data.

Index Terms— High dimensional transforms, directional filter banks, filter design, directional decomposition, surfacelets.

I. INTRODUCTION

With the growing capabilities of modern computers and imaging devices, high-resolution 3-D and even higher dimensional volumetric data are increasingly available in a wide gamut of applications, including biomedical imaging, seismic imaging, extragalactic astronomy, computer vision, and video processing and compression. To efficiently analyze and represent such huge amounts of data, we need to create and employ new tools from various fields of engineering, including signal processing. In this work, we present a new set of tools, namely the N-dimensional directional filter banks (NDFB) and surfacelets, that can capture and represent signal singularities lying on smooth surfaces. Such singularities are often observed in 3-D medical signals, where the signals are mostly smooth except on some boundary surfaces, and in video signals, in which moving objects carve out smooth surfaces in the 3-D spatial/temporal space.

For 2-D signals, the similar problem of capturing singularities along smooth curves has been extensively studied. Without claiming to be exhaustive, we would like to mention a few examples, including the steerable pyramid [1], the directional filter bank [2], 2-D directional wavelets [3], curvelets [4], complex wavelets [5], [6], contourlets [7], bandelets [8], and shearlets [9].

Y. Lu is with the Department of Electrical and Computer Engineering and the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana IL 61801 (e-mail: [email protected]).

M. N. Do is with the Department of Electrical and Computer Engineering, the Coordinated Science Laboratory, and the Beckman Institute, University of Illinois at Urbana-Champaign, Urbana IL 61801 (e-mail: [email protected]).

This work was supported by the US National Science Foundation under Grant CCR-0237633 (CAREER).

Fig. 1. (a) Frequency partitioning of the directional filter bank with 3 levels of decomposition, drawn in the (ω1, ω2) plane between −(π, π) and (π, π). There are $2^3 = 8$ real wedge-shaped frequency bands. (b) Frequency partitioning of the proposed NDFB in 3-D, drawn in (ω1, ω2, ω3) space between −(π, π, π) and (π, π, π). The ideal passbands of the component filters are rectangular-based pyramids radiating out from the origin at $3 \times 2^l$ (l ≥ 0) different orientations and tiling the entire frequency space.

Among all these 2-D representations, one approach of particular interest to us is the directional filter bank (DFB), which was originally proposed by Bamberger and Smith [2] and subsequently improved by several authors [10]–[16]. The DFB is efficiently implemented via an l-level tree-structured decomposition that leads to $2^l$ subbands with wedge-shaped frequency partitioning, as shown in Figure 1(a). Meanwhile, the DFB is a non-redundant transform and offers perfect reconstruction, i.e., the original signal can be exactly reconstructed from its decimated channels. The directional selectivity and efficient structure of the DFB make it an attractive candidate for many image processing applications. By combining the DFB with the Laplacian pyramid, Do and Vetterli [7] constructed the contourlets, which provide a directional multiresolution transform for sparse image representation.

The major contribution of this work is extending the DFB to higher dimensions. For example, in 3-D, we want to achieve the frequency partitioning shown in Figure 1(b), where the ideal passbands of the component filters are rectangular-based pyramids radiating out from the origin at different orientations and tiling the entire frequency space. We can see that this is a natural extension of the wedge-shaped frequency partitioning in 2-D.

Filters with this type of pyramid-shaped frequency support can be used in selective filtering of plane-wave signals based on their directions of arrival [17], as well as in efficiently capturing signal singularities (discontinuities) living on surfaces with different normal directions. For example, consider an N-D continuous plane-wave signal x(t), completely determined by some 1-D function $x^{(1D)}(\cdot)$, i.e.,

$$x(t) = x^{(1D)}(d^T t),$$

where $t = (t_1, \ldots, t_N)^T$, and d is a unit-norm direction vector. As illustrated in Figure 2, when $x^{(1D)}$ is a step function, the plane wave x(t) represents an ideal singularity across a surface of co-dimension 1 in $\mathbb{R}^N$. We can verify that the support of the Fourier transform of x(t) lies on a straight line passing through the origin and oriented in direction d. If we decompose a sampled version of x(t) using the NDFB, then most of the signal energy will be captured by only a few pyramid-shaped subbands whose directions are aligned with d.

Fig. 2. A plane wave with a step profile $x^{(1D)}$ along direction d.
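The statement about the Fourier support can be checked with a short derivation. The sketch below assumes that $x^{(1D)}$ has a (possibly generalized) 1-D Fourier transform $X^{(1D)}$. The symbols $\Omega$ (continuous frequency vector), $s$, and $t_\perp$ (the components of $t$ along and orthogonal to $d$) are introduced here for illustration only.

```latex
\begin{aligned}
X(\Omega) &= \int_{\mathbb{R}^N} x^{(1D)}(d^T t)\, e^{-j \Omega^T t}\, dt,
  \qquad t = s\,d + t_\perp,\quad d^T t_\perp = 0, \\
&= \int_{\mathbb{R}} x^{(1D)}(s)\, e^{-j s\, d^T \Omega}\, ds
   \;\cdot\; \int_{d^\perp} e^{-j \Omega^T t_\perp}\, dt_\perp \\
&= (2\pi)^{N-1}\, X^{(1D)}(d^T \Omega)\;
   \delta\!\big((I - d\,d^T)\,\Omega\big).
\end{aligned}
```

The delta factor vanishes unless $\Omega = (d^T \Omega)\, d$, so the transform is confined to the line through the origin spanned by $d$, which is the line that the pyramid-shaped passbands aligned with $d$ are meant to isolate.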

Unlike the separable wavelets [18]–[20], whose multidimensional generalizations are simply the tensor products of their 1-D counterparts, the DFB has a much more involved non-separable construction. Extending the DFB to higher dimensions while still retaining its various attractive features is a challenging and, to the best of our knowledge, previously unsolved problem. In this paper, we propose a new family of filter banks, named NDFB, with the following distinctive properties:

1) Directional decomposition. The NDFB decomposes N-D (N ≥ 2) signals into directional subbands, as shown in Figs. 1(a) and 1(b) for 2-D and 3-D, respectively. In general, the ideal passbands of the NDFB in higher dimensions are hypercube-based hyperpyramids radiating out from the origin.

2) Construction. The NDFB has an efficient tree-structured implementation using iterated filter banks.

3) Angular resolution. The number of directional subbands can be increased by iteratively invoking more levels of decomposition through a simple expansion rule. In general, there can be $N \times 2^l$ (l ≥ 0) different directional subbands in the N-D case.

4) Perfect reconstruction. The original signal can be exactly reconstructed from its transform coefficients in the absence of noise or other processing.

5) Small redundancy. The NDFB is N-times expansive in the N-dimensional case.

The extension of the DFB to higher dimensions has been considered by several researchers in the past. In [10], Bamberger proposed a 3-D subband decomposition scheme implemented by applying the checkerboard filter banks separately along two orthogonal signal planes, followed by a 2-D DFB decomposition on one of the planes. However, the resulting passband shapes are 3-D triangular prisms and do not correspond to a single dominant direction. Meanwhile, the angular resolution can only be refined along one of the axes. In [12], Park proposed a 3-D velocity selective filter bank by applying two 2-D DFBs separately along two signal planes. The resulting frequency partitioning is similar to that of the NDFB. However, that construction has a redundancy factor of $2^l$ for l levels of decomposition. We would like to emphasize that our proposed NDFB has a relatively small redundancy ratio of 3 in 3-D, independent of the number of decomposition levels. More importantly, our NDFB construction can be easily generalized to arbitrarily high dimensions.

The outline of the paper is as follows. We introduce the necessary notation in Section II and review some concepts in multidimensional multirate systems. We then present our main results in two stages. Section III deals with the construction and properties of the proposed NDFB in 3-D. The generalization to arbitrary N-dimensional (N ≥ 2) cases is given in Section IV. We discuss filter design and implementation issues in Section V. To efficiently capture local surface singularities of different sizes, in Section VI we combine the NDFB with a multiscale decomposition and construct the surfacelet transform, whose basis images are local surface patches spanning a wide range of normal directions, spatial locations, and scales. We present experimental results in Section VII and conclude the paper in Section VIII with some discussions.

II. PRELIMINARIES

Since the major part of this work is about constructing a multidimensional filter bank, we start by reviewing some of the concepts and results in multidimensional multirate systems [21], [22], [19].

Notations: Throughout the paper, N represents the dimension of the signals. We are interested in cases when N ≥ 2. We use lower-case letters, e.g., x[n], to denote N-D discrete signals, where $n \overset{\text{def}}{=} (n_1, n_2, \ldots, n_N)^T$ is an integer vector. The discrete-time Fourier transform of a multidimensional signal is defined as

$$X(\omega) = \sum_{n \in \mathbb{Z}^N} x[n]\, e^{-j \omega^T n}.$$

In multidimensional multirate signal processing, the sampling operations are defined on lattices. A lattice in N dimensions is represented by an N × N nonsingular integer matrix M as

$$\mathrm{LAT}(M) = \{ m : m = M n,\ n \in \mathbb{Z}^N \}.$$

For an M-fold downsampling (or upsampling) operation, the input x[n] and the downsampling output $y_d[n]$ (or the upsampling output $y_u[n]$) are related by

$$y_d[n] = x[Mn] \quad \text{and} \quad y_u[n] = \begin{cases} x[M^{-1} n] & \text{if } n \in \mathrm{LAT}(M), \\ 0 & \text{otherwise,} \end{cases}$$

respectively. For the upsampling case, the Fourier transforms of the input and output are related by

$$Y_u(\omega) = X(M^T \omega).$$
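The sampling relations above can be tried out on a small 2-D signal. The following is a minimal sketch under stated assumptions: signals are stored as Python dictionaries keyed by integer index pairs, and the quincunx matrix `Q`, the helper names, and the test signal are illustrative choices that do not come from the paper. It checks that downsampling undoes upsampling and that $Y_u(\omega) = X(M^T \omega)$ holds numerically at one frequency.

```python
import cmath

def mat_vec(M, n):
    """Multiply a 2x2 matrix M by a length-2 vector n."""
    return (M[0][0] * n[0] + M[0][1] * n[1],
            M[1][0] * n[0] + M[1][1] * n[1])

def transpose(M):
    """Transpose of a 2x2 matrix."""
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def downsample(x, M, grid):
    """y_d[n] = x[M n] on the index set `grid`; x is zero outside its keys."""
    return {n: x.get(mat_vec(M, n), 0) for n in grid}

def upsample(x, M):
    """y_u[M n] = x[n]; samples off LAT(M) are zero and are not stored."""
    return {mat_vec(M, n): v for n, v in x.items()}

def dtft(x, omega):
    """Evaluate X(omega) = sum_n x[n] exp(-j omega^T n) for a finite signal."""
    return sum(v * cmath.exp(-1j * (omega[0] * n[0] + omega[1] * n[1]))
               for n, v in x.items())

Q = [[1, 1], [-1, 1]]                        # illustrative quincunx matrix
grid = [(a, b) for a in range(-2, 3) for b in range(-2, 3)]
x = {n: 10 * n[0] + n[1] for n in grid}      # arbitrary test signal

y_u = upsample(x, Q)
roundtrip = downsample(y_u, Q, grid)         # recovers x on the grid
omega = (0.7, -1.3)
lhs = dtft(y_u, omega)                       # Y_u(omega)
rhs = dtft(x, mat_vec(transpose(Q), omega))  # X(M^T omega)
```

Storing only the nonzero samples keeps the upsampled signal sparse. Every key of `y_u` lies on LAT(Q), whose points have an even coordinate sum.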

Multirate identities [22] are often useful in analyzing multidimensional multirate systems. The identity for the analysis part of the filter bank is shown in Figure 3; the one for the synthesis part can be inferred similarly. Downsampling by M followed by filtering with a filter $H(\omega)$ is equivalent to filtering with the filter $H(M^T \omega)$, which is obtained by upsampling $H(\omega)$ by M, before downsampling.
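The identity can be confirmed in the spatial domain, where $H(M^T \omega)$ corresponds to the impulse response of $H$ upsampled by $M$. The sketch below is an illustration under assumptions: the dictionary-based signal representation, the sampling matrix, the test signal, and the three-tap filter are all hypothetical choices made here, not taken from the paper.

```python
def mat_vec(M, n):
    """Multiply a 2x2 integer matrix by an integer vector."""
    return (M[0][0] * n[0] + M[0][1] * n[1],
            M[1][0] * n[0] + M[1][1] * n[1])

def convolve(x, h):
    """2-D convolution of two finitely supported signals stored as dicts."""
    y = {}
    for (a, b), xv in x.items():
        for (c, d), hv in h.items():
            key = (a + c, b + d)
            y[key] = y.get(key, 0) + xv * hv
    return y

def downsample(x, M, grid):
    """y[n] = x[M n] for n in grid; x is zero outside its keys."""
    return {n: x.get(mat_vec(M, n), 0) for n in grid}

def upsample(h, M):
    """Place h[k] at position M k (zeros elsewhere are not stored)."""
    return {mat_vec(M, k): v for k, v in h.items()}

def nonzero(y):
    """Drop explicit zeros so that two signals can be compared as dicts."""
    return {n: v for n, v in y.items() if v != 0}

M = [[1, 1], [-1, 1]]                                # illustrative sampling matrix
grid = [(a, b) for a in range(-6, 7) for b in range(-6, 7)]
x = {(a, b): 1 + a + 3 * b for a in range(4) for b in range(4)}
h = {(0, 0): 1, (1, 0): -2, (0, 1): 3}               # arbitrary small filter

left = convolve(downsample(x, M, grid), h)           # downsample, then H(omega)
right = downsample(convolve(x, upsample(h, M)), M, grid)  # H(M^T omega), then downsample
```

Here `grid` is chosen large enough to contain every nonzero output sample, so `nonzero(left)` and `nonzero(right)` coincide exactly in integer arithmetic.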


Fig. 3. The multidimensional multirate identity for interchanging the order of downsampling and filtering: downsampling by M followed by $H(\omega)$ is equivalent to $H(M^T \omega)$ followed by downsampling by M.

Fig. 4. The pyramid-shaped subband support (dark region) is the intersection of two wedge-shaped supports (gray regions). The axes are ω1, ω2, and ω3.

III. THE NDFB IN 3-D

To simplify the exposition, we first discuss the proposed NDFB in 3-D. Once the construction in the 3-D case is clear, the generalization to arbitrary N-dimensional (N ≥ 2) cases can easily be developed. We focus on the analysis part of the NDFB, since the synthesis part is exactly symmetric. Meanwhile, since we are mainly interested in the passband and stopband regions of the filters, we assume all the filters used in this section and Section IV are ideal, i.e., the filter frequency responses are indicator functions of the ideal passband supports. In Section V, we design real filters that approximate these conditions.

A. Key Ideas

Figure 4 illustrates one of the key ideas of the NDFB. The dark region in the figure shows the pyramid-shaped frequency support that we want to obtain. We can see that the pyramid support can be obtained as the intersection of two wedge supports, shown as gray regions. This observation leads to the idea of obtaining pyramid-shaped supports in the 3-D DFB by a concatenation of two 2-D DFBs on appropriate dimensions.

Figure 5(a) shows a wedge-shaped decomposition of the 3-D frequency spectrum. This decomposition can be achieved by applying a 2-D filter bank, e.g., the 2-D DFB with frequency decomposition as shown in Figure 1(a), on every (n1, n2)-slice in the 3-D signal. We use

$$W_i^{(l_2)}(\omega_1, \omega_2), \quad 0 \le i < 2^{l_2},$$

to denote the ideal filter whose frequency support is the ith wedge in Figure 5(a). The superscript $(l_2)$ indicates that there are $2^{l_2}$ wedge subbands (in this case, $l_2 = 2$) oriented at angles from −45° to 45°. The frequency variables ω1 and ω2 specify that the 2-D filters operate on (n1, n2)-slices. Similarly, we show in Figure 5(b) the wedge-shaped frequency decomposition along the (ω1, ω3)-plane. With the same notation as above, we can use $W_j^{(l_3)}(\omega_1, \omega_3)$ ($0 \le j < 2^{l_3}$) to represent the corresponding ideal subband filters.

Fig. 5. (a) The ideal wedge-shaped frequency support of a 2-D filter operating along the (n1, n2)-plane. (b) The wedge-shaped support of a 2-D filter along the (n1, n3)-plane. (c) The ideal pyramid-shaped frequency decomposition, with subbands indexed by pairs (i, j) for i, j = 0, ..., 3.

Now by taking the pairwise intersection of the wedge supports from Figure 5(a) and Figure 5(b), we can get 16 “thinner” pyramid supports, as shown in Figure 5(c). In general, the hourglass-shaped region can be divided into $2^{l_2} \times 2^{l_3}$ ($l_2, l_3 \ge 0$) different square-based pyramids radiating out from the origin. As illustrated in Figure 5(c), each subband can be indexed by a pair of integers (i, j) specifying the square base of the pyramid. We use

$$P_{i,j}^{(l_2, l_3)}(\omega_1, \omega_2, \omega_3), \quad 0 \le i < 2^{l_2} \text{ and } 0 \le j < 2^{l_3},$$

to represent the ideal pyramid filters.

With these notations, it is easy to check that (see Lemma 1 in Section IV)

$$P_{i,j}^{(l_2, l_3)}(\omega_1, \omega_2, \omega_3) = W_i^{(l_2)}(\omega_1, \omega_2) \cdot W_j^{(l_3)}(\omega_1, \omega_3), \tag{1}$$

for all $l_2, l_3 \ge 0$ and $0 \le i < 2^{l_2}$, $0 \le j < 2^{l_3}$.

Usually, we say a multidimensional filter $F(\omega)$ is separable when it can be written as the product of several 1-D filters, i.e., $F(\omega) = \prod_{i=1}^{N} F_i(\omega_i)$. Here, we extend this notion to Kth-order generalized separability, to describe those N-dimensional filters that can be represented as the product of several filters of dimensions up to K, with 0 < K < N. We note that the pyramid filters $P_{i,j}^{(l_2, l_3)}(\omega_1, \omega_2, \omega_3)$ defined above are second-order generalized separable.

At this point, a natural question is: since the pyramid filters are the products of two wedge filters, can we achieve the pyramid-shaped frequency decomposition by directly applying two 2-D DFBs separately along the (n1, n2) and (n1, n3) signal planes? This idea was explored by Park in [12], but at a high price.

Here is why. The flow of operations when applying two 2-D DFBs sequentially along two signal planes can be illustrated as follows:

$$\begin{aligned}
&\rightarrow W_i^{(l_2)}(\omega_1, \omega_2) \rightarrow \underbrace{(\downarrow m_1)}_{\text{along } n_1} \rightarrow \underbrace{(\downarrow m_2)}_{\text{along } n_2} \\
&\rightarrow W_j^{(l_3)}(\omega_1, \omega_3) \rightarrow \underbrace{(\downarrow m_1)}_{\text{along } n_1} \rightarrow \underbrace{(\downarrow m_2)}_{\text{along } n_3},
\end{aligned} \tag{2}$$

where $W_i^{(l_2)}(\omega_1, \omega_2)$ and $W_j^{(l_3)}(\omega_1, \omega_3)$ are two wedge-shaped filters from the two DFBs, respectively. In a critically sampled filter bank decomposition, filtering is always followed by downsampling. It can be shown [13] that the equivalent downsampling matrix for an $l_2$-level DFB operating along the (n1, n2)-plane is a diagonal matrix, implemented separately as downsampling in the n1 dimension by $m_1 = 2$ followed by downsampling in the n2 dimension by $m_2 = 2^{(l_2 - 1)}$. From the multirate identities introduced in Section II (Figure 3), the downsampling by 2 along the n1 dimension scrambles the wedge-shaped frequency decomposition provided by $W_j^{(l_3)}(\omega_1, \omega_3)$. Thus it can be easily checked that the subsequent application of $W_j^{(l_3)}(\omega_1, \omega_3)$ will not provide the desired pyramid-shaped frequency decomposition shown in Figure 5(c).
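The scrambling can be made concrete with the multirate identity: a filter $W(\omega_1, \omega_3)$ placed after downsampling by 2 along n1 acts on the input like $W(2\omega_1, \omega_3)$. The sketch below is a simplified illustration, not the construction of [12] or [13]. It assumes the coarsest wedge (the hourglass $|\omega_3| \le |\omega_1|$, i.e. $l_3 = 0$) for the second filter and uses the $2\pi$-periodicity of discrete-time frequency responses; the function names and test frequencies are hypothetical.

```python
import math

def wrap(w):
    """Map a frequency to its representative in [-pi, pi)."""
    return (w + math.pi) % (2 * math.pi) - math.pi

def hourglass(w1, w3):
    """Ideal support of the coarsest wedge in the (w1, w3) plane: |w3| <= |w1|."""
    return abs(w3) <= abs(w1)

def after_decimation(w1, w3):
    """Support of the equivalent filter W(2*w1, w3) seen from the input side
    once downsampling by 2 along n1 precedes the wedge filter."""
    return hourglass(wrap(2 * w1), w3)

inside = (0.9 * math.pi, 0.5 * math.pi)    # lies in the desired wedge
outside = (0.3 * math.pi, 0.5 * math.pi)   # lies outside the desired wedge

wanted = (hourglass(*inside), hourglass(*outside))
obtained = (after_decimation(*inside), after_decimation(*outside))
```

In this example the equivalent filter rejects a frequency inside the intended wedge and passes one outside it, so its support is no longer the wedge used in (1).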

To overcome this problem, Park [12] proposed to upsample and interpolate the decimated outputs of the first DFB to the original size before feeding them to the second DFB. With this step, the DFB essentially becomes a nonsubsampled filter bank, and hence the generalized separability described in (1) can be applied. However, this scheme leads to a highly redundant system ($2^l$-times redundant for l levels of decomposition).

In the following, we propose a new filter bank structure that can make use of the generalized separability property with a simple expansion rule, but without the excessive redundancy. Similar to (2), this new construction also consists of the concatenation of two 2-D filter banks along two signal planes. However, a distinctive feature is that all the downsampling for the first 2-D filter bank is done along the n2 dimension only (i.e., $m_1 = 1$). This ensures that the subsequent 2-D filter bank, which operates on the (n1, n3) dimensions, will not be affected by the downsampling of the previous filter bank.

B. First Level: the Hourglass Filter Bank

To obtain the first level of decomposition, we employ a three-channel undecimated filter bank, shown in Figure 6. This filter bank decomposes the 3-D frequency spectrum of the input signal into three hourglass-shaped subbands, with their dominant directions aligned with the ω1, ω2, and ω3 axes, respectively.

Despite the redundancy it introduces, the undecimated hourglass filter bank in this step offers several important advantages over a decimated filter bank. As will be seen shortly, the undecimated 3-D filter bank allows subsequent levels of the NDFB to be implemented by using only 2-D filter banks. Meanwhile, it is easier to design an undecimated filter bank with perfect reconstruction than a decimated one, since the former imposes a smaller set of constraints.

We should make one simplification before describing further levels of decomposition. By geometric symmetry, we only need to consider subsequent decomposition steps after the top branch of the hourglass filter bank, i.e., the y1 subband in Figure 6, whose dominant direction is along the ω1 axis. The decomposition after the two lower branches can be constructed by permuting the three dimensions, e.g., (n1, n2, n3) → (n2, n3, n1) and (ω1, ω2, ω3) → (ω2, ω3, ω1), from the corresponding channels in the top branch. This applies to both the sampling matrices and the filters used in the decomposition.

Fig. 6. The first level of decomposition: a three-channel undecimated filter bank in 3-D, with analysis filters $H_i(\omega_1, \omega_2, \omega_3)$ producing the subbands $y_i$ and synthesis filters $G_i(\omega_1, \omega_2, \omega_3)$, for i = 1, 2, 3. The ideal frequency-domain supports of the component filters are hourglass-shaped regions, with their corresponding dominant directions aligned with the ω1, ω2, and ω3 axes, respectively.

C. Subsequent Levels of Decomposition

Figure 7 shows the block diagram of subsequent levels of decomposition on one of the three branches. After the 3-D hourglass filter ($P_{0,0}^{(0,0)}(\omega)$), we sequentially decompose the signal by two 2-D filter banks, with the first one, denoted as $\mathrm{IRC}_{12}^{(l_2)}$, operating along the (n1, n2)-plane and the second one, $\mathrm{IRC}_{13}^{(l_3)}$, along the (n1, n3)-plane.

The 2-D filter bank $\mathrm{IRC}_{12}^{(l_2)}$, which stands for Iteratively Resampled Checkerboard filter bank, has a binary-tree structure with $l_2$ (≥ 0) levels of decomposition, and therefore has $2^{l_2}$ different output branches. The second filter bank $\mathrm{IRC}_{13}^{(l_3)}$ has the same construction as $\mathrm{IRC}_{12}^{(l_2)}$, but operates along a different signal plane, i.e., (n1, n2) → (n1, n3), and with a different decomposition depth, i.e., $l_2 \rightarrow l_3$. Note that we attach an $\mathrm{IRC}_{13}^{(l_3)}$ to every output channel of $\mathrm{IRC}_{12}^{(l_2)}$, so we have a total of $2^{l_2 + l_3}$ output channels in Figure 7. Postponing the detailed construction of the IRC filter banks to the next section, here we first study the conditions the IRCs must satisfy so that the overall filter bank in Figure 7 indeed achieves the desired pyramid-shaped frequency decomposition given in Figure 5(c).

Theorem 1: We index the $2^{l_2 + l_3}$ output channels in the NDFB from top to bottom with a pair of integers (i, j), where $0 \le i < 2^{l_2}$ and $0 \le j < 2^{l_3}$. Meanwhile, we use $\widetilde{P}_{i,j}^{(l_2, l_3)}(\omega_1, \omega_2, \omega_3)$ to denote the ideal equivalent subband filter at the (i, j)th channel, and $P_{i,j}^{(l_2, l_3)}(\omega_1, \omega_2, \omega_3)$ for the ideal pyramid-shaped filters. We have

$$\widetilde{P}_{i,j}^{(l_2, l_3)}(\omega_1, \omega_2, \omega_3) = P_{i,j}^{(l_2, l_3)}(\omega_1, \omega_2, \omega_3), \quad \text{for all } i, j \text{ and } l_2, l_3,$$

if the IRC filter banks satisfy the following two conditions:

1) Equivalent sampling matrices. Denote by $M_k^{(l_2)}$ the overall sampling matrix for the kth ($0 \le k < 2^{l_2}$) channel in the $l_2$-level filter bank $\mathrm{IRC}_{12}^{(l_2)}$; then

$$M_k^{(l_2)} = \begin{pmatrix} 1 & 0 \\ 0 & 2^{l_2} \end{pmatrix}; \tag{3}$$

2) Wedge refinement. Denote by $F_k^{(l_2)}(\omega_1, \omega_2)$ the equivalent filter for the kth ($0 \le k < 2^{l_2}$) channel in $\mathrm{IRC}_{12}^{(l_2)}$; then

$$W_k^{(l_2)}(\omega_1, \omega_2) = W_0^{(0)}(\omega_1, \omega_2) \cdot F_k^{(l_2)}(\omega_1, \omega_2), \tag{4}$$

for all $l_2 \ge 0$.


Fig. 7. One branch of the proposed filter bank structure of the NDFB in 3-D. The input signal x[n] first goes through the 3-D hourglass filter $P_{0,0}^{(0,0)}(\omega)$. The output y[n] is then fed into a 2-D filter bank, denoted by $\mathrm{IRC}_{12}^{(l_2)}$, which operates on the (n1, n2)-planes. The $l_2$-level tree-structured filter bank $\mathrm{IRC}_{12}^{(l_2)}$ produces $2^{l_2}$ output subbands, denoted as $z_i[n]$ for $0 \le i < 2^{l_2}$. Each output is then fed into another 2-D filter bank $\mathrm{IRC}_{13}^{(l_3)}$ operating on the (n1, n3)-planes. In the end, we get $2^{l_2 + l_3}$ outputs, represented by $z_{i,j}[n]$ for $0 \le i < 2^{l_2}$ and $0 \le j < 2^{l_3}$.

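Proof: The flow of operations in each channel of Figure 7 can be illustrated as follows:

$$\begin{aligned}
&\rightarrow P_{0,0}^{(0,0)}(\omega_1, \omega_2, \omega_3) \rightarrow F_i^{(l_2)}(\omega_1, \omega_2) \rightarrow \underbrace{(\downarrow M_i^{(l_2)})}_{\text{on } (n_1, n_2) \text{ planes}} \\
&\rightarrow F_j^{(l_3)}(\omega_1, \omega_3) \rightarrow \underbrace{(\downarrow M_j^{(l_3)})}_{\text{on } (n_1, n_3) \text{ planes}}.
\end{aligned}$$

In calculating the overall equivalent filter of that channel, the filter $F_j^{(l_3)}(\omega_1, \omega_3)$ from the second filter bank $\mathrm{IRC}_{13}^{(l_3)}$ will not be affected by the downsampling operation in the first filter bank $\mathrm{IRC}_{12}^{(l_2)}$, because by the sampling matrices condition in (3), the sampling matrix $M_i^{(l_2)}$ of $\mathrm{IRC}_{12}^{(l_2)}$ is a diagonal matrix with its first element (along the n1 dimension) being one. Therefore, the equivalent subband filter $\widetilde{P}_{i,j}^{(l_2, l_3)}$ of the (i, j)th channel is

$$\widetilde{P}_{i,j}^{(l_2, l_3)}(\omega_1, \omega_2, \omega_3) = P_{0,0}^{(0,0)}(\omega_1, \omega_2, \omega_3) \cdot F_i^{(l_2)}(\omega_1, \omega_2) \cdot F_j^{(l_3)}(\omega_1, \omega_3), \tag{5}$$

for all $l_2, l_3 \ge 0$ and $0 \le i < 2^{l_2}$, $0 \le j < 2^{l_3}$. Using the equality in (1), we can decompose the hourglass filter in (5) as the product of two wedge filters:

$$P_{0,0}^{(0,0)}(\omega_1, \omega_2, \omega_3) = W_0^{(0)}(\omega_1, \omega_2) \cdot W_0^{(0)}(\omega_1, \omega_3).$$

Now (5) can be rewritten as

$$\widetilde{P}_{i,j}^{(l_2, l_3)}(\omega_1, \omega_2, \omega_3) = \left( W_0^{(0)}(\omega_1, \omega_2) \cdot F_i^{(l_2)}(\omega_1, \omega_2) \right) \cdot \left( W_0^{(0)}(\omega_1, \omega_3) \cdot F_j^{(l_3)}(\omega_1, \omega_3) \right). \tag{6}$$

It then follows from the wedge refinement condition given in (4) that

$$\widetilde{P}_{i,j}^{(l_2, l_3)}(\omega_1, \omega_2, \omega_3) = W_i^{(l_2)}(\omega_1, \omega_2) \cdot W_j^{(l_3)}(\omega_1, \omega_3) = P_{i,j}^{(l_2, l_3)}(\omega_1, \omega_2, \omega_3),$$

where the second equality is due to (1) again.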
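The step that moves the second filter across the first downsampler can be written out with the multirate identity of Figure 3. The sketch below treats the 2-D sampling matrix of condition (3) as a 3×3 matrix $\bar{M}$ acting on (n1, n2, n3). This embedding and the symbol $\bar{M}$ are assumptions introduced here for illustration.

```latex
\bar{M} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2^{l_2} & 0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
\bar{M}^T \omega = \big(\omega_1,\; 2^{l_2}\omega_2,\; \omega_3\big)^T,
\qquad
F_j^{(l_3)}\big( (\bar{M}^T\omega)_1,\, (\bar{M}^T\omega)_3 \big)
  = F_j^{(l_3)}(\omega_1, \omega_3).
```

Because the filter depends only on the first and third frequency variables, which $\bar{M}^T$ leaves untouched, it commutes with the downsampler unchanged. Had the (1,1) entry of the sampling matrix been 2, the same computation would give $F_j^{(l_3)}(2\omega_1, \omega_3)$ instead.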

Fig. 8. (a) The building block of $\mathrm{IRC}_{12}^{(l_2)}$: a two-channel 2-D checkerboard filter bank with resampling, with analysis filters $F_0(\omega_1, \omega_2)$ and $F_1(\omega_1, \omega_2)$, downsampling by $D_2$, and resampling by $R_0$ and $R_1$. The dark regions represent the ideal passband. (b) The filter bank $\mathrm{IRC}_{12}^{(l_2)}$ is an $l_2$-level tree-structured expansion of the checkerboard filter bank (denoted by CBF) given in (a), with resampling matrices $U_k^{(l_2)}$ attached at the end of each channel.

D. The Construction and Properties of the IRC Filter Banks

In this section we present a detailed construction of the 2-D filter banks $\mathrm{IRC}_{12}^{(l_2)}$, and show that they indeed satisfy the two conditions required in Theorem 1.

When $l_2 = 0$, the filter bank $\mathrm{IRC}_{12}^{(0)}$ is simply the identity transform. We need to consider this degenerate case, since sometimes we want to decompose the 3-D hourglass support along only one direction, e.g., ω3.

When $l_2 \ge 1$, the analysis part of the filter bank $\mathrm{IRC}_{12}^{(l_2)}$ is constructed as an $l_2$-level binary tree by recursively attaching a copy of the diagram contents enclosed by the dashed rectangle in Figure 8(a) to every output channel from the previous level (see Figure 8(b)). Furthermore, we attach a resampling matrix $U_k^{(l_2)}$ ($0 \le k < 2^{l_2}$) to each of the $2^{l_2}$ output channels of the tree.

As illustrated in Figure 8(a), the building block of the tree is a two-channel 2-D filter bank with a checkerboard-shaped frequency partition. Note that we need to attach two resampling operations, denoted as $R_0$ and $R_1$, to channel 0 and channel 1, respectively. The sampling matrices in Figure 8(a) and Figure 8(b) are defined as

$$D_2 = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \quad R_0 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad R_1 = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}, \quad \text{and} \quad U_k^{(l_2)} = R_1^{\,2^{l_2} - 1 - 2k}.$$

We can index the channels of the $\mathrm{IRC}_{12}^{(l_2)}$ from top to bottom with integers from 0 to $2^{l_2} - 1$. Associated with each channel indexed by k ($0 \le k \le 2^{l_2} - 1$) is a sequence of path types $(t_1, t_2, \ldots, t_{l_2})$, where a type $t_i$ is either 0 for the upper branch or 1 for the lower branch, as shown in Figure 8(a) and Figure 8(b). According to the expansion rule, $(t_1, t_2, \ldots, t_{l_2})$ is the binary representation of k, i.e.,

$$k = \sum_{i=1}^{l_2} t_i\, 2^{l_2 - i}, \quad t_i \in \{0, 1\}.$$

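With this path type, the sequence of filtering and downsampling for channel k can be written as

$$\rightarrow F_{t_1} \rightarrow (\downarrow D_2 R_{t_1}) \rightarrow F_{t_2} \rightarrow (\downarrow D_2 R_{t_2}) \rightarrow \cdots \rightarrow F_{t_{l_2}} \rightarrow (\downarrow D_2 R_{t_{l_2}}) \rightarrow (\downarrow U_k^{(l_2)}),$$

where $F_0$ and $F_1$ are the two component filters in the 2-D checkerboard filter bank shown in Figure 8(a).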

From this, using the multirate identities recursively, we can transform the analysis side of channel k ($0 \le k < 2^{l_2}$) of the $\mathrm{IRC}_{12}^{(l_2)}$ into a single filtering with the equivalent filter $F_k^{(l_2)}(\omega)$ followed by downsampling by the overall sampling matrix $M_k^{(l_2)}$. The following two propositions verify that the equivalent filters and sampling matrices indeed satisfy the conditions required in Theorem 1.

Proposition 1: The overall sampling matrix for the kth ($0 \le k < 2^{l_2}$) channel in the $l_2$-level filter bank $\mathrm{IRC}_{12}^{(l_2)}$ is

$$M_k^{(l_2)} = \begin{pmatrix} 1 & 0 \\ 0 & 2^{l_2} \end{pmatrix}. \tag{7}$$

Proof: See Appendix A.

Remark: The sampling matrix $M_k^{(l_2)}$ has a determinant of $2^{l_2}$, which makes the $l_2$-level 2-D filter bank $\mathrm{IRC}_{12}^{(l_2)}$ maximally decimated (nonredundant). However, the key is that all the downsampling is done along the n2 dimension, and this ensures that the following filter bank $\mathrm{IRC}_{13}^{(l_3)}$, which operates on the (n1, n3) dimensions, will not be affected by the downsampling. This is one of the distinctive features of our proposed filter bank in comparison to other 3-D constructions [10], [12].

In terms of the wedge refinement condition in (4), we want to verify that the wedge-shaped frequency support shown in Figure 5(a) can be achieved by applying the 2-D filter bank $\mathrm{IRC}_{12}^{(l_2)}$. The special case when $l_2 = 1$ is illustrated in Figure 9, where a wedge-shaped support $W_0^{(0)}(\omega_1, \omega_2)$ (Figure 9(a)) is divided by the checkerboard filter (Figure 9(b)) provided by $\mathrm{IRC}_{12}^{(1)}$. The result is a “thinner” wedge support $W_0^{(1)}(\omega_1, \omega_2)$, shown in Figure 9(c). We can show that a general result holds for $l_2 \ge 0$, as stated in the following proposition.

Fig. 9. (a) The ideal frequency support of $W_0^{(0)}(\omega_1, \omega_2)$. (b) The ideal checkerboard-shaped frequency support of $F_0(\omega_1, \omega_2)$. (c) Multiplying the supports in (a) and (b), we get the desired wedge-shaped frequency support of $W_0^{(1)}(\omega_1, \omega_2)$. Each panel is drawn in the (ω1, ω2) plane between −(π, π) and (π, π).

Proposition 2: Assume the $l_2$-level filter bank $\mathrm{IRC}_{12}^{(l_2)}$ uses ideal filters. The equivalent filter $F_k^{(l_2)}(\omega_1, \omega_2)$ for the kth channel, $0 \le k < 2^{l_2}$, satisfies the wedge refinement condition, i.e.,

$$W_k^{(l_2)}(\omega_1, \omega_2) = W_0^{(0)}(\omega_1, \omega_2) \cdot F_k^{(l_2)}(\omega_1, \omega_2),$$

for all $l_2 \ge 0$.

Proof: See Appendix B.

IV. THE NDFB IN ARBITRARY N-DIMENSIONAL (N ≥ 2) CASES

In this section, we consider the NDFB construction in the general N-D case (N ≥ 2). It turns out that the extension is surprisingly simple and requires no further filter bank design beyond the 3-D case.

A. N-D Directional Frequency Supports

In the general N-dimensional case, the ideal frequency supports of the NDFB are hypercube-based hyperpyramids. Analogous to the 3-D case, the first level of decomposition for the NDFB is achieved by an N-channel undecimated filter bank, whose component filters are N-D “hourglass”-shaped filters aligned with the ω1, ..., ωN axes, respectively. For the special case when N = 2, this is just a 2-channel filter bank with a frequency partitioning as shown in Figure 9(a).

From the geometric symmetry among the N branches, we only need to consider subsequent decompositions on the first branch, while the construction on the other branches can be inferred via a circular shift of variables. For instance, we can obtain the construction for the ith (i ≥ 2) branch by changing the variables as follows: $(\omega_1, \ldots, \omega_N) \rightarrow (\omega_i, \ldots, \omega_N, \omega_1, \ldots, \omega_{i-1})$ and $(n_1, \ldots, n_N) \rightarrow (n_i, \ldots, n_N, n_1, \ldots, n_{i-1})$.

To describe the high-dimensional passband shapes, we also need to introduce some new notation. First, we evenly divide [−π, π) into $2^l$ (l ≥ 0) line segments, and let $B_k^{(l)}$ denote the kth segment, i.e.,

$$B_k^{(l)} \overset{\text{def}}{=} \left[ -\pi + \frac{2\pi}{2^l} k,\; -\pi + \frac{2\pi}{2^l} (k + 1) \right), \quad \text{for } 0 \le k < 2^l.$$


Hyperpyramids: We can represent the hypercube-based hyperpyramid frequency supports aligned with the ω1 axis by the following sets:

$$\mathcal{P}_{\mathbf{k}}^{(\mathbf{l})} \overset{\text{def}}{=} \left\{ \omega = \alpha\, (\pi, b_2, \ldots, b_N)^T : |\alpha| \le 1 \text{ and } b_i \in B_{k_i}^{(l_i)} \right\},$$

where $\mathbf{k} = (k_2, \ldots, k_N)$, $\mathbf{l} = (l_2, \ldots, l_N)$, and $0 \le k_i < 2^{l_i}$ for i = 2, ..., N. In the 3-D case (i.e., N = 3) and with $l_2 = l_3 = 2$, we can easily verify that the sets $\mathcal{P}_{\mathbf{k}}^{(\mathbf{l})}$ defined above are just the set-theoretic representations of the 16 different pyramid-shaped support regions shown in Figure 5(c).

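The ideal pyramid filter can be defined as

$$P_{\mathbf{k}}^{(\mathbf{l})}(\omega) \overset{\text{def}}{=} \mathbf{1}_{\mathcal{P}_{\mathbf{k}}^{(\mathbf{l})}}(\omega), \quad \text{for } \omega \in [-\pi, \pi)^N, \tag{8}$$

where $\mathbf{1}_S(\cdot)$ is the indicator function of the set S, i.e., $\mathbf{1}_S(\omega) = 1$ if $\omega \in S$ and $\mathbf{1}_S(\omega) = 0$ otherwise.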

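Wedges: The frequency support of the 2-D wedge filters operating along the (ω1, ωi) plane (i = 2, ..., N) can be represented as

$$\mathcal{W}_{k_i}^{(l_i)} \overset{\text{def}}{=} \left\{ (\omega_1, \omega_i) = \alpha\, (\pi, b_i) : |\alpha| \le 1 \text{ and } b_i \in B_{k_i}^{(l_i)} \right\}, \tag{9}$$

and hence the corresponding ideal 2-D wedge filter can be defined as

$$W_{k_i}^{(l_i)}(\omega_1, \omega_i) \overset{\text{def}}{=} \mathbf{1}_{\mathcal{W}_{k_i}^{(l_i)}}(\omega_1, \omega_i), \quad \text{for } (\omega_1, \omega_i) \in [-\pi, \pi)^2.$$

For example, $W_{k_2}^{(2)}(\omega_1, \omega_2)$ (with $k_2 = 0, 1, 2, 3$) represent the four 2-D wedge filters whose supports are shown in Figure 5(a).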

As an N-D generalization of the equality given in (1), we have the following result, simply stating that the ideal N-D hyperpyramid filters can be factorized as the product of N − 1 ideal 2-D wedge filters.

Lemma 1: The hyperpyramid filters defined in (8) are second-order generalized separable. Specifically,

$$P_{\mathbf{k}}^{(\mathbf{l})}(\omega_1, \ldots, \omega_N) = \prod_{i=2}^{N} W_{k_i}^{(l_i)}(\omega_1, \omega_i),$$

for all $\mathbf{l} = (l_2, \ldots, l_N)$, $\mathbf{k} = (k_2, \ldots, k_N)$ and $0 \le k_i < 2^{l_i}$.

Proof: See Appendix C.
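The factorization in Lemma 1 can be explored numerically. The sketch below is an illustration, not the proof of Appendix C. It builds each hyperpyramid indicator as the product of wedge indicators from (9), then checks on a small set of hand-picked frequencies (chosen away from segment boundaries and from ω1 = 0) that every point inside the hourglass along ω1 belongs to exactly one pyramid and every point outside it to none. Function names and test frequencies are hypothetical.

```python
import math
from itertools import product

def segment_index(b, l):
    """Index k with b in B_k^(l), or None when b lies outside [-pi, pi)."""
    if not (-math.pi <= b < math.pi):
        return None
    return min(int((b + math.pi) * 2 ** l / (2 * math.pi)), 2 ** l - 1)

def wedge(w1, wi, k, l):
    """Indicator of the wedge support in (9); assumes w1 != 0."""
    return abs(w1) <= math.pi and segment_index(math.pi * wi / w1, l) == k

def pyramid(w, ks, ls):
    """Hyperpyramid indicator built as the product of N - 1 wedge indicators."""
    return all(wedge(w[0], wi, k, l) for wi, k, l in zip(w[1:], ks, ls))

def containing_pyramids(w, ls):
    """All index vectors k whose pyramid contains the frequency w."""
    all_ks = product(*(range(2 ** l) for l in ls))
    return [ks for ks in all_ks if pyramid(w, ks, ls)]

levels = (2, 2)                        # 3-D case with l2 = l3 = 2: 16 pyramids
firsts = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
others = [-2.9, -1.7, -0.4, 0.6, 1.3, 2.5]
counts = {}
for w in product(firsts, others, others):
    in_hourglass = all(abs(wi) < abs(w[0]) for wi in w[1:])
    counts[w] = (in_hourglass, len(containing_pyramids(w, levels)))
```

The same functions work for any N, since `pyramid` only loops over the pairs $(\omega_1, \omega_i)$. For instance, the frequency (2.0, −1.7, 0.6) falls in the pyramid indexed by (0, 2).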

B. The NDFB Construction in Higher Dimensions

We show in Figure 10 the construction of the analysis part of the proposed NDFB in the N-D case. The synthesis part (not shown in the figure) is exactly symmetric. After the first level of decomposition, which is an N-D “hourglass”-shaped filter $P_{\mathbf{0}}^{(\mathbf{0})}(\omega_1, \omega_2, \ldots, \omega_N)$, the input signal is further decomposed by a series of 2-D iteratively resampled checkerboard filter banks $\mathrm{IRC}_{1i}^{(l_i)}$ (i = 2, 3, ..., N), where $\mathrm{IRC}_{1i}^{(l_i)}$ operates on 2-D slices of the input signal represented by the dimension pair $(n_1, n_i)$. Note that, starting from the second level, we attach an IRC filter bank to each output channel from the previous level, and hence the entire filter bank shown in Figure 10 has a total of $2^{(l_2 + \cdots + l_N)}$ output channels.
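The channel count and the resulting redundancy can be tallied with a few lines of code. The sketch below is a bookkeeping illustration only. It assumes that each of the N undecimated first-level branches is followed by the same IRC cascade, that $\mathrm{IRC}_{1i}^{(l_i)}$ reduces the number of samples per channel by $2^{l_i}$ (as its sampling matrix determinant suggests), and that the input size is divisible by $2^{l_2 + \cdots + l_N}$. The function names are hypothetical.

```python
from itertools import product

def channel_indices(levels):
    """All index vectors k = (k_2, ..., k_N) with 0 <= k_i < 2**l_i."""
    return list(product(*(range(2 ** l) for l in levels)))

def redundancy(n_dims, levels, n_samples):
    """Ratio of transform coefficients to input samples for the whole NDFB."""
    per_channel = n_samples // 2 ** sum(levels)   # samples left in each output channel
    total = n_dims * len(channel_indices(levels)) * per_channel
    return total / n_samples

channels_3d = channel_indices((2, 3))             # one branch with l2 = 2, l3 = 3
ratio_3d = redundancy(3, (2, 2), 64 ** 3)         # 64 x 64 x 64 input volume
ratio_4d = redundancy(4, (1, 2, 3), 2 ** 12)      # hypothetical 4-D input
```

The ratio equals N whatever the decomposition depths are, because the decimation inside each branch exactly offsets the growth in the number of channels.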

For completeness, we state the following theorem, showing that the N-D filter bank in Figure 10 indeed achieves the hyperpyramid-shaped frequency decomposition as defined in (8).

Theorem 2: We index the $2^{(l_2 + \cdots + l_N)}$ output channels in Figure 10 from top to bottom with a sequence of $N - 1$ integers $\mathbf{k} = (k_2, \ldots, k_N)$, where $0 \le k_i < 2^{l_i}$. Meanwhile, we use $\widetilde{P}^{(\mathbf{l})}_{\mathbf{k}}(\boldsymbol{\omega})$ to denote the equivalent subband filter at the $\mathbf{k}$th channel. Assume all filters are ideal; then we have

$$\widetilde{P}^{(\mathbf{l})}_{\mathbf{k}}(\boldsymbol{\omega}) = P^{(\mathbf{l})}_{\mathbf{k}}(\boldsymbol{\omega}),$$

for all $\mathbf{l} = (l_2, \ldots, l_N)$, $\mathbf{k} = (k_2, \ldots, k_N)$ with $0 \le k_i < 2^{l_i}$, where $P^{(\mathbf{l})}_{\mathbf{k}}$ is the ideal hyperpyramid filter defined in (8).

The proof is similar to the one for the 3-D case and is omitted here. Again, the key ideas are to use the generalized separability of the hyperpyramid support given in Lemma 1 and the two properties of the IRC filter banks, i.e., overall downsampling matrices (Proposition 1) and wedge refinement (Proposition 2).

V. FILTER DESIGN AND IMPLEMENTATION ISSUES

In previous discussions, we assume all the filters used are ideal. In this section, we will design real filters that approximate the desired frequency responses.

A. Designing the Hourglass Filter Bank

The first task is to design the undecimated filter bank with perfect reconstruction and the desired hourglass-shaped frequency decomposition. To simplify exposition, we will focus on the 3-D case, while the design can be extended to arbitrary N-D cases straightforwardly.

Generally speaking, designing three- and higher-dimensional perfect reconstruction filter banks using FIR filters is a very challenging task with few ready-to-use tools available. However, we can greatly simplify the design problem if we do not confine ourselves to using FIR filters and work in the frequency domain instead. As suggested by several independent works [23]–[25], multidimensional filters designed in the frequency domain can achieve quite satisfactory performance.

In this paper, we also design our hourglass filter bank in the frequency domain. As shown in Figure 6, we use $H_i(\omega_1, \omega_2, \omega_3)$ and $G_i(\omega_1, \omega_2, \omega_3)$ for $i = 1, 2, 3$ to represent the three analysis and synthesis filters in the hourglass filter bank, respectively. As the first step of simplification, we assume the three analysis filters are rotated versions of each other, i.e.,

$$H_2(\omega_1, \omega_2, \omega_3) = H_1(\omega_2, \omega_3, \omega_1),$$

and

$$H_3(\omega_1, \omega_2, \omega_3) = H_1(\omega_3, \omega_1, \omega_2).$$

The same constraint also applies to the synthesis filters. Meanwhile, if the filter bank implements a tight frame expansion, we need the synthesis filters to be the time-reversed versions of the corresponding analysis filters, i.e.,

$$G_i(\boldsymbol{\omega}) = H_i(-\boldsymbol{\omega}) = H_i(\boldsymbol{\omega}),$$

for $i = 1, 2, 3$, where the second equality comes from the symmetry in the ideal frequency responses of $H_i(\boldsymbol{\omega})$. Combining


[Figure 10: block diagram. The hourglass filter $P^{(0)}_{0}(\omega_1, \omega_2, \ldots, \omega_N)$ is followed by $\mathrm{IRC}^{(l_2)}_{12}$, $\mathrm{IRC}^{(l_3)}_{13}$, …, $\mathrm{IRC}^{(l_N)}_{1N}$, yielding $2^{l_2}$, $2^{(l_2+l_3)}$, …, $2^{(l_2+\cdots+l_N)}$ subbands after the successive stages.]

Fig. 10. The analysis part of the NDFB in the N-dimensional case. Only the top branch is shown, while the filter banks for the lower N − 1 branches can be obtained by permuting the variables from the corresponding channels in the top branch. Note that only the first stage, filtering by $P^{(0)}_{0}(\omega_1, \omega_2, \ldots, \omega_N)$, is in N-D; all subsequent stages are 2-D operations. More specifically, $\mathrm{IRC}^{(l_i)}_{1i}$ operates on 2-D slices of the input N-D signal represented by the dimension pair $(n_1, n_i)$.

the above two constraints, we get the condition for perfect reconstruction as

$$H_1^2(\omega_1, \omega_2, \omega_3) + H_1^2(\omega_2, \omega_3, \omega_1) + H_1^2(\omega_3, \omega_1, \omega_2) = 1. \tag{10}$$

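Inspired by the work of Feilner et al. [24] on building 2-D quincunx wavelets, we propose

$$H_1(\omega_1, \omega_2, \omega_3) = \sqrt{\frac{K(\omega_1, \omega_2, \omega_3)^{\lambda}}{K(\omega_1, \omega_2, \omega_3)^{\lambda} + K(\omega_2, \omega_3, \omega_1)^{\lambda} + K(\omega_3, \omega_1, \omega_2)^{\lambda}}}, \tag{11}$$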

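where $\lambda$ is a positive even integer and $K(\omega_1, \omega_2, \omega_3)$ is a positive and $2\pi$-periodic function of $\omega_1$, $\omega_2$ and $\omega_3$. We can verify that the perfect reconstruction condition in (10) is satisfied by arbitrary choices of $\lambda$ and $K(\omega_1, \omega_2, \omega_3)$: the denominator in (11) is invariant under cyclic permutation of the variables, so the three squared responses share the same denominator and their numerators sum to it. To control the filter frequency responses so that they approximate the desired hourglass shape, we let

$$K(\omega_1, \omega_2, \omega_3) = E(\omega_1, \omega_2) \cdot E(\omega_1, \omega_3),$$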

where $E(\cdot, \cdot)$ is a bivariate $2\pi$-periodic function such that $E(\omega_1, \omega_2)$ is approximately one in the dark region in Figure 11(a) and zero in the white region, with smooth transition regions between the two. There are many ways to design $E(\cdot, \cdot)$. In our experiment, we employ the windowing method proposed by Tay and Kingsbury [26], in which we truncate the ideal “sinc”-like 2-D signal corresponding to the fan-shaped support in Figure 11(a) with a smooth Kaiser window and take $E(\omega_1, \omega_2)$ to be the Fourier transform of the truncated signal. The parameter $\lambda$ in (11) can be used to adjust the sharpness of the frequency response. In our experiment, we find $\lambda = 4$ to be a suitable value.
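The claim that (11) satisfies (10) for any positive $K$ can be checked numerically. The sketch below is only an illustration: the function `E` is a toy positive, $2\pi$-periodic stand-in and not the Kaiser-windowed design described above; all names are hypothetical.

```python
import math

LAMBDA = 4  # the value reported as suitable in the text

def E(wa, wb):
    """Toy stand-in for E: positive and 2*pi-periodic, NOT the windowed design."""
    return 1.1 + 0.5 * math.cos(wa) - 0.5 * math.cos(wb)

def K(w1, w2, w3):
    """K(w1, w2, w3) = E(w1, w2) * E(w1, w3)."""
    return E(w1, w2) * E(w1, w3)

def H1(w1, w2, w3, lam=LAMBDA):
    """Parametrization (11) of the first hourglass filter."""
    num = K(w1, w2, w3) ** lam
    den = num + K(w2, w3, w1) ** lam + K(w3, w1, w2) ** lam
    return math.sqrt(num / den)

def pr_sum(w1, w2, w3):
    """Left-hand side of (10): squared responses of H1 and its two rotations."""
    return (H1(w1, w2, w3) ** 2
            + H1(w2, w3, w1) ** 2
            + H1(w3, w1, w2) ** 2)
```

Because the denominator is the same for all three rotations, `pr_sum` equals one up to rounding at every frequency, whatever positive `E` is substituted.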

Figure 11(b) shows the isosurface of the frequency response of one analysis filter. We can see that the frequency response approximates the ideal hourglass shape fairly well. Note that the responses of the other two hourglass filters are rotated versions of this one. Finally, to extend this design to arbitrary N-dimensional cases, we can simply modify the parametrization to be

$$H_1(\boldsymbol{\omega}) = \sqrt{\frac{\prod_{j=2}^{N} K(\omega_1, \omega_j)^{\lambda}}{\sum_{i=1}^{N} \prod_{j=1,\, j \ne i}^{N} K(\omega_i, \omega_j)^{\lambda}}}.$$

We find the hourglass filters designed here to produce satisfactory results in our numerical experiments. However,

[Figure 11: (a) fan-shaped regions in the (ω1, ω2) plane, labeled 1 (dark) and 0 (white), with corners at (π, π) and −(π, π); (b) isosurface plot.]

Fig. 11. (a) E(ω1, ω2) approximately takes the value one in the dark region and the value zero in the white region. (b) The frequency response of one hourglass filter designed by the proposed frequency-domain method. The responses of the other two filters are rotational-symmetric to this one.

since these filters are not FIR or IIR, the filtering operations have to be implemented in the Fourier domain by first taking the FFT of the input signal and then multiplying it with the frequency values of the filters. In practice, this implementation usually requires a large memory space to store the entire multidimensional input signal¹, and causes long buffer delays which are undesirable for applications such as video processing. In [27], we report a different design of the hourglass filter banks using FIR filters, which allows the filtering operations to be carried out in the spatial domain with only partial input signal available in the memory.

B. The Checkerboard Filter Bank

The other component in the NDFB construction is the 2-D checkerboard filter bank shown in Figure 8(a). Unlike the 3-D hourglass filter bank, the design of the checkerboard filter bank has already been studied by several authors, including [26], [28]–[30]. In our experiment, we choose a design based on a two-stage ladder/lifting scheme [28], [30]. Attractive features of this design include good frequency selectivity using FIR filters, and efficient 1-D separable implementation in the polyphase domain. Due to space limitations, we omit here further description of the design and refer readers to [28], [30] for more details.

C. Implementation and Computational Cost

Because of the way we have specified the hourglass filters (see (11)), our implementation of the NDFB is entirely done in

¹ It is possible to alleviate this memory requirement by employing an out-of-core implementation of the FFT, but at the expense of increased computation and I/O time.


the Fourier domain based on FFT, including subsequent levels of decomposition. Although the checkerboard filter banks use FIR filters, they are still implemented in the Fourier domain to avoid the time-consuming FFT-IFFT operations in middle steps. To further improve computational efficiency, we take advantage of the fact that the input signal and filters are all real-valued and hence only keep and operate on half of the FFT data.
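The half-spectrum idea can be sketched in 1-D with a naive DFT. This is not the C++ implementation discussed in the paper; it only illustrates, under the assumption of a real signal and a real, even frequency response, that the bins 0, …, ⌊n/2⌋ are sufficient because the remaining ones follow from conjugate symmetry. All function names are hypothetical.

```python
import cmath

def dft_bins(x, bins):
    """Naive DFT of x evaluated only at the requested frequency bins."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in bins]

def idft(X):
    """Naive inverse DFT of a full spectrum X."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def filter_half_spectrum(x, half_response):
    """Filter real x with a real, even response given only for bins 0..n//2."""
    n = len(x)
    half = dft_bins(x, range(n // 2 + 1))                 # keep half of the bins
    half = [X * h for X, h in zip(half, half_response)]   # pointwise multiply
    # Rebuild the discarded bins from X[n - k] = conj(X[k]).
    mirrored = [half[n - k].conjugate() for k in range(n // 2 + 1, n)]
    return [v.real for v in idft(half + mirrored)]
```

A practical implementation would replace the naive transforms with a real-input FFT routine; the multidimensional case applies the same symmetry along one axis.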

Table I shows the performance of a C++ implementation of the NDFB on a computer with a 2.8 GHz processor and 1.5 GB memory. We perform the NDFB decomposition on 3-D signals of sizes N × N × N, where N = 128, 192 and 256. The NDFB can also have different numbers of directional subbands, e.g., 3, 12, 48, or 192 directions. We only show the running time for the forward transform (decomposition), while the time required by the inverse transform (reconstruction) is similar.

TABLE I

RUNNING TIME (IN SECONDS) OF AN NDFB IMPLEMENTATION.

Data Size            # of Directional Subbands
N × N × N            3 × 1     3 × 4     3 × 16    3 × 64

128 × 128 × 128      0.73      0.89      1.00      1.11
192 × 192 × 192      2.88      3.78      4.03      4.33
256 × 256 × 256      8.58      8.82      8.87      9.71

As we can see from the table, the FFT-based implementation is quite efficient. Meanwhile, for a fixed data size, the running time only increases in a roughly linear fashion when we quadruple the number of directional subbands, thanks to the tree-structured nature of the NDFB.
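To make the trend in Table I concrete, the following sketch computes, for each data size, the extra time per quadrupling of the number of subbands and the overall ratio between 3 × 64 and 3 × 1 subbands. The numbers are copied from Table I; the helper name is hypothetical.

```python
# Running times in seconds from Table I (columns: 3x1, 3x4, 3x16, 3x64 subbands).
TIMES = {
    128: [0.73, 0.89, 1.00, 1.11],
    192: [2.88, 3.78, 4.03, 4.33],
    256: [8.58, 8.82, 8.87, 9.71],
}

def growth(row):
    """Return (time added per quadrupling, ratio of last to first column)."""
    steps = [round(b - a, 2) for a, b in zip(row, row[1:])]
    ratio = round(row[-1] / row[0], 2)
    return steps, ratio

for size, row in TIMES.items():
    print(size, growth(row))
```

In these measurements a 64-fold increase in the number of directional subbands costs at most about 1.5 times the running time of the coarsest decomposition.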

VI. THE SURFACELET TRANSFORM

A. Construction

As mentioned in Section I, the pyramid-shaped frequency partitioning makes NDFB a suitable tool for capturing surface singularities within multidimensional signals. In this section, we propose a multiscale version of the NDFB, called the surfacelet transform, to efficiently capture and represent local surface singularities with different sizes. This strategy is analogous to the contourlet construction [7], in which the original 2-D DFB is combined with a multiscale decomposition. However, an important distinction is that instead of using the Laplacian pyramid as in contourlets, we employ a new multiscale pyramid structure for the surfacelet transform, as shown in Figure 12, which is conceptually similar to the one used in the steerable pyramid [1].

In Figure 12, we use $L_i(\boldsymbol{\omega})$ ($i = 0, 1$) to represent the lowpass filters and $D_i(\boldsymbol{\omega})$ ($i = 0, 1$) to represent the highpass filters in the multiscale decomposition. $S(\boldsymbol{\omega})$ is an anti-aliasing filter used to cancel the aliasing caused by the upsampling operations. The NDFB is attached to the highpass branch at the finest scale and bandpass branches at coarser scales. To have more levels of decomposition, we can recursively insert at point $a_{n+1}$ a copy of the diagram contents enclosed by the dashed rectangle in the analysis part, and at point $s_{n+1}$ a copy of the diagram contents enclosed by the dotted rectangle in the synthesis part.

In the new multiscale pyramid depicted in Figure 12, the lowpass filter $L_0(\boldsymbol{\omega})$ in the first level is downsampled by a non-integer factor of 1.5 (upsampling by 2 followed by downsampling by 3) along each dimension. Although this fractional sampling factor makes the new pyramid slightly more redundant than the Laplacian pyramid (e.g., 1.34 versus 1.14 in redundancy ratios in 3-D), we find the added redundancy to be very useful in reducing the frequency-domain aliasing of the NDFB, which is concentrated on the boundaries of the frequency cell $[-\pi, \pi)^N$ and mainly caused by the $2\pi$ periodicity of the frequency spectrum of discrete signals. Intuitively, the new multiscale pyramid achieves the task of eliminating aliasing components by only keeping the middle (alias-free) portion of the NDFB filter responses. Consequently, the constructed surfacelets are well-localized in both the spatial and frequency domains. Due to space limitations, we leave a detailed explanation for this issue to [31].
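The quoted redundancy ratios can be reproduced with a short calculation. The sketch assumes an infinite number of pyramid levels, a sampling factor of 1.5 per dimension at the first level only, and dyadic downsampling at all coarser levels; these modeling choices and the function names are assumptions made for illustration.

```python
def laplacian_redundancy(n_dims):
    """Sum of 2**(-n_dims * j) over levels j >= 0 (limit of many levels)."""
    return 1.0 / (1.0 - 2.0 ** (-n_dims))

def new_pyramid_redundancy(n_dims, first_factor=1.5):
    """Full-size highpass band plus a dyadic pyramid on the 1.5x-decimated lowpass."""
    return 1.0 + first_factor ** (-n_dims) * laplacian_redundancy(n_dims)

print(round(laplacian_redundancy(3), 2))         # 1.14
print(round(new_pyramid_redundancy(3), 2))       # 1.34
print(round(3 * new_pyramid_redundancy(3), 2))   # 4.02
```

The last line multiplies the pyramid redundancy by the factor of N = 3 contributed by the NDFB, which agrees with the surfacelet redundancy of about 4.02 listed in Table II.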

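In our current implementation, we specify the lowpass filters $L_i(\boldsymbol{\omega})$ ($i = 0, 1$) in the frequency domain as $L_i(\boldsymbol{\omega}) = d_i \cdot \prod_{n=1}^{N} L^{(1D)}_i(\omega_n)$, where $d_1 = 6^{N/2}$ and $d_2 = 2^{N/2}$; $L^{(1D)}_i(\omega_n)$ is a 1-D lowpass filter along the $\omega_n$ axis with passband frequency $\omega_{p,i}$, stopband frequency $\omega_{s,i}$ and a smooth transition band, defined as

$$L^{(1D)}_i(\omega) = \begin{cases} 1 & \text{for } |\omega| \le \omega_{p,i}, \\ \dfrac{1}{2} + \dfrac{1}{2} \cos \dfrac{(|\omega| - \omega_{p,i})\,\pi}{\omega_{s,i} - \omega_{p,i}} & \text{for } \omega_{p,i} < |\omega| < \omega_{s,i}, \\ 0 & \text{for } \omega_{s,i} \le |\omega| \le \pi, \end{cases} \tag{12}$$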

for $|\omega| \le \pi$ and $i = 0, 1$. Similarly, we also specify the anti-aliasing filter $S(\boldsymbol{\omega})$ as the separable product of 1-D lowpass filters having the same parametrized form as in (12), but with a different set of passband and stopband frequencies, denoted as $\omega_{p,A}$ and $\omega_{s,A}$ respectively.

The main advantage of designing the filters in the frequency domain is that we can let their frequency responses be strictly zero beyond some cutoff frequencies. By choosing $\omega_{s,i}$ ($i = 1, 2$), $\omega_{p,A}$ and $\omega_{s,A}$ properly, we can ensure that the aliasing introduced by the upsampling and downsampling operations will be completely cancelled, and the perfect reconstruction condition for the multiscale pyramid can be simplified as

$$\frac{|L_i(\boldsymbol{\omega})|^2}{d_i^2} + |D_i(\boldsymbol{\omega})|^2 \equiv 1, \quad \text{for } i = 0, 1. \tag{13}$$

To satisfy the alias-free condition with an approximate octave-band decomposition, we choose the passband and stopband frequencies in (12) to be $\omega_{p,0} = \pi/3$, $\omega_{s,0} = 2\pi/3$, $\omega_{p,1} = \pi/4$, and $\omega_{s,1} = \pi/2$, and the frequency parameters of $S(\boldsymbol{\omega})$ to be $\omega_{p,A} = \pi/3$ and $\omega_{s,A} = 2\pi/3$. Once we have chosen the lowpass filters, the highpass filters $D_i(\boldsymbol{\omega})$ can be obtained from (13) to ensure perfect reconstruction.
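The filter specification in (12) and (13) can be sketched as follows. The scale factor `d` is passed in as a parameter rather than fixed, and taking the nonnegative real root of (13) for the highpass response is an assumption of this example; the function names are hypothetical.

```python
import math

def lowpass_1d(w, wp, ws):
    """Raised-cosine 1-D lowpass response of (12)."""
    a = abs(w)
    if a <= wp:
        return 1.0
    if a >= ws:
        return 0.0
    return 0.5 + 0.5 * math.cos((a - wp) * math.pi / (ws - wp))

def lowpass_nd(w, wp, ws, d):
    """Separable N-D lowpass: d times the product of 1-D responses."""
    out = d
    for wn in w:
        out *= lowpass_1d(wn, wp, ws)
    return out

def highpass_nd(w, wp, ws, d):
    """Highpass response solved from (13), taking the nonnegative real root."""
    l = lowpass_nd(w, wp, ws, d) / d
    return math.sqrt(max(0.0, 1.0 - l * l))

# First-level band edges from the text; d below is purely illustrative.
wp0, ws0 = math.pi / 3, 2 * math.pi / 3
w = (0.4 * math.pi, 0.1, -0.5 * math.pi)
d = 2.0 ** 1.5
check = (lowpass_nd(w, wp0, ws0, d) / d) ** 2 + highpass_nd(w, wp0, ws0, d) ** 2
```

Here `check` evaluates the left-hand side of (13) at one frequency and equals one up to rounding, independently of the value chosen for `d`.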

B. Properties of the Surfacelet Transform

By combining the multiscale pyramid with the NDFB, the surfacelet transform has ideal passband supports as pairs of concentric cubes radiating out from the origin with different directions and scales. In the spatial domain, the surfacelet basis


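[Figure 12: block diagram with an analysis part and a synthesis part. Analysis filters $D_0(\boldsymbol{\omega})$, $L_0(\boldsymbol{\omega})$, $D_1(\boldsymbol{\omega})$, $L_1(\boldsymbol{\omega})$; synthesis filters $D_0(-\boldsymbol{\omega})$, $L_0(-\boldsymbol{\omega})$, $D_1(-\boldsymbol{\omega})$, $L_1(-\boldsymbol{\omega})$; sampling blocks 2I and 3I; anti-aliasing filters $S(\boldsymbol{\omega})$ and $S(-\boldsymbol{\omega})$; NDFB and I-NDFB blocks; insertion points $a_n$, $a_{n+1}$, $s_n$, $s_{n+1}$.]

Fig. 12. The block diagram of the proposed surfacelet transform. The forward transform NDFB and its inverse I-NDFB are attached to the highpass subbands of the multiscale pyramid at each scale.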

are localized surface patches with different normal directions and spatial locations. Table II summarizes the properties of the surfacelet transform, in comparison with two related systems that can also provide directional multiresolution signal decomposition in 3-D, including the real-valued dual-tree wavelet transform (DTWT) [5], [6] and the recently reported discrete implementation of the 3-D curvelet transform [25]. Note that there are two different versions of the 3-D curvelet transform, denoted in the table by Curvelets and Curvelets-L respectively, corresponding to whether curvelets or wavelets are used in the finest scale. Compared with the “full” version, Curvelets-L has a substantially lower redundancy ratio and hence is more computationally feasible. However, since there is no directional decomposition at the finest scale, this version is more suitable for very bandlimited signals.

The surfacelet transform and the DTWT have similar redundancy ratios in 3-D. However, a potential advantage of the surfacelet transform is that its angular resolution can be refined (i.e., with more directional subbands) by invoking more levels of decomposition. In practice, we usually choose to have 192 or more directional subbands at finer scales, in contrast to the fixed 28 directional subbands provided in the DTWT.

The 3-D discrete curvelets and our proposed surfacelets aim at the same frequency partitioning, but the two transforms achieve this goal with two very different approaches. The curvelet implementation starts from defining a “mother” curvelet in the Fourier domain, whose scaled and sheared copies form a partition of unity. The curvelet coefficients are then obtained by multiplying the Fourier samples of the input signal with curvelet window functions at different scales and directions, followed by a spatial downsampling (implemented by frequency wrapping). Attractive features of this approach include its conceptual simplicity and direct connection with the continuous theory. However, an intrinsic problem is that all the downsampling is done in an alias-free fashion, requiring that the curvelet functions be strictly bandlimited in the frequency domain. This poses an inevitable trade-off between redundancy and spatial localization of the curvelets, as lower redundancy requires a higher subsampling ratio, which in turn implies narrower transition bands in the frequency supports, and from the uncertainty principle, slower decay of the resulting curvelets in the spatial domain.

In contrast, the NDFB, as the key component of surfacelets, has a tree-structured filter bank construction, in which aliasing is allowed to exist and will be cancelled by carefully designed filters (see Section V). Consequently, the surfacelet transform is much less redundant than 3-D curvelets² (see Table II). Meanwhile, in NDFB we can use filters with fast spatial decay (and hence with more gently spread-out frequency support), without losing efficiency in terms of redundancy, since we do not need the filters to be strictly bandlimited. Furthermore, as a potential advantage of the surfacelet construction, it might be feasible to implement the multiscale pyramid shown in Figure 12 using FIR filters, with the perfect reconstruction condition in (13) approximately satisfied. We leave the FIR implementation of the multiscale pyramid to future work.

VII. EXPERIMENTAL RESULTS

In this section, we present some experiments with the proposed NDFB and the surfacelet transform. All experiments use the hourglass and checkerboard filters described in Section V and the multiscale filters specified in Section VI.

A. Surfacelets Basis Images

We show four surfacelets in the frequency domain in the top row of Figure 13. The isosurface value is chosen to be half of the largest value in each basis image. As expected, the frequency supports of surfacelets are bandpass concentric cubes tiling the frequency space with different directions.

Next, we show in the bottom row of Figure 13 the same surfacelets, but in the spatial domain. In each basis image, the blue (or dark) colored isosurface is extracted at half of the

² When both use the same frequency partitioning at the finest scale.


TABLE II

PROPERTIES OF THE SURFACELET TRANSFORM COMPARED WITH TWO OTHER DIRECTIONAL MULTIRESOLUTION SYSTEMS.

System                # of directional subbands            Redundancy   Tree-structured   Implementation using
                      at each scale                        in 3-D       construction      FIR filters

Dual-tree Wavelets    28                                   4            Yes               Yes
3-D Curvelets-L       3 × 2^l, except the finest scale     4 ∼ 5        No                No
3-D Curvelets         3 × 2^l, for all scales              ≈ 40
Surfacelets           3 × 2^l, for all scales              ≈ 4.02       Yes               Possible with approximate PR

[Figure 13: eight isosurface plots. Top row: four plots with frequency axes W_X, W_Y, W_Z, each ranging from 0 to 150. Bottom row: four plots with spatial axes X, Y, Z, each ranging from 0 to 30.]

Fig. 13. Top row: the isosurfaces of four surfacelet basis images in the frequency domain. Bottom row: the isosurfaces of the same surfacelets, but in the spatial domain.

most positive value and the red (or light) colored isosurface at half of the most negative value. We can see that the surfacelets in the spatial domain are localized surface patches, smooth along the tangent planes and oscillatory along the normal directions (equal to the directions of the corresponding frequency supports).

B. 3-D Zone Plate

We apply the proposed surfacelet transform to a 3-D zone plate image, which is generated by the formula $x[\mathbf{n}] = \cos(0.02 \times r^2)$, with $r = \sqrt{n_1^2 + n_2^2 + n_3^2}$ being the distance to the image center. As shown in Figure 14(a), the zone plate $x[\mathbf{n}]$ represents a snapshot of a spherical wave at a certain time. Locally around a point $\mathbf{n}_0 = r \cdot \mathbf{u}$ with unit direction $\mathbf{u}$ and radius $r$, the zone plate image can be approximated by a plane wave function that remains constant on all planes orthogonal to $\mathbf{u}$, and oscillates along $\mathbf{u}$ at a frequency linearly proportional to $r$. In the frequency domain, the spectral support of that plane wave is localized around $\boldsymbol{\omega}_0 = c \cdot r \cdot \mathbf{u}$ for some constant $c$. Comparing $\mathbf{n}_0$ with $\boldsymbol{\omega}_0$, we can see that the spatial locations of the zone plate image directly correspond with its spectral locations.
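The zone plate and its local frequency can be sketched as follows. Placing the center at index (size − 1)/2 along each axis is an assumption of this example, as are the function names. The local frequency is taken as the gradient of the phase 0.02·r², which gives the constant c = 0.04 for this particular formula.

```python
import math

def zone_plate(size):
    """3-D zone plate x[n] = cos(0.02 * r^2), r measured from the volume center."""
    c = (size - 1) / 2.0   # ASSUMED center position
    return [[[math.cos(0.02 * ((i - c) ** 2 + (j - c) ** 2 + (k - c) ** 2))
              for k in range(size)]
             for j in range(size)]
            for i in range(size)]

def local_frequency(n):
    """Gradient of the phase 0.02 * |n|^2 at offset n from the center: 0.04 * n."""
    return tuple(0.04 * x for x in n)

volume = zone_plate(5)
center_value = volume[2][2][2]          # cos(0) at the center
omega0 = local_frequency((10.0, 0.0, 0.0))
```

The vector `omega0` is parallel to the spatial offset, which is the correspondence between spatial and spectral locations described above.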

In the experiment, we first decompose the 3-D image using the surfacelet transform, and then reconstruct the image by passing through only one of the subbands to the synthesis filter bank. As shown in Figure 14(b)-(d), different subbands of the surfacelet transform can separate and capture the localized information of the 3-D zone plate with different directions and frequencies.

C. Video Denoising

Video can be seen as a special type of 3-D signal, with two spatial dimensions and one temporal dimension. Denoising video signals using velocity selective 3-D transforms has been

Fig. 14. (a) The original 3-D zone plate image; (b)-(d) The reconstructed images from three different subbands of the surfacelet transform at different scales and directions.

studied by Selesnick et al. in [32]. In this experiment, we use the proposed surfacelet transform to remove zero-mean white Gaussian noise added to several video signals. As benchmarks, we also show the denoising performance of the 3-D undecimated separable wavelet transform (UDWT), the real-valued dual-tree wavelet transform (DTWT), and 3-D Curvelets-L. We use several standard SIF-sized test sequences for video coding and processing, including “Mobile”, “Coastguard” and “Tempete”, all of which can be obtained from www.cipr.rpi.edu.

In the test, the original sequences are truncated to the size of 192 × 192 × 192. We use 4 levels of decomposition for all transforms. For the surfacelet transform, the number of directional subbands for each scale, from fine to coarse, is set to be 192, 192, 48 and 12. For the UDWT, we use the “symlet” of length 16. For a fair comparison, we employ the hard thresholding denoising method for all 4 transforms, by truncating the transform coefficients of the noisy sequences at the ith subband with a threshold $T_i$ ($i = 1, 2, \ldots$). Although not the best denoising algorithm available, this simple hard thresholding scheme can often be a good indication of the potential of different transforms.

We choose the threshold $T_i = 3 \cdot E_i \cdot \sigma_n$, where $E_i$ is the precomputed $L^2$ norm of the equivalent filter at the ith subband, and $\sigma_n$ is the estimated standard deviation of the input noise,


TABLE III

PSNR VALUES OF THE DENOISED SEQUENCES OF CURVELETS-L, UDWT, DTWT, AND THE PROPOSED SURFACELET TRANSFORM UNDER DIFFERENT NOISE LEVELS σn. THE REDUNDANCY OF EACH TRANSFORM IS SHOWN IN PARENTHESES.

                       Mobile                  Coastguard              Tempete
σn                     30      40      50      30      40      50      30      40      50

Curvelets-L (4.12)     23.54   23.19   22.86   25.05   24.64   24.29   26.32   25.91   25.54
UDWT (29)              24.02   22.99   22.23   25.95   24.95   24.20   27.13   26.03   25.22
DTWT (4)               24.56   23.43   22.58   26.06   25.01   24.22   27.18   26.05   25.23
Surfacelets (4.02)     25.86   24.72   23.88   26.82   25.87   25.15   27.94   26.94   26.22

obtained through a robust median estimator [33].

Table III shows the PSNR (in dB) of the denoised test sequences by using different transforms. To help interpret the results, we also list the redundancy ratio of each transform in the table, since the redundancy of a transform indicates its computational and memory efficiency, which is an important practical issue to consider in the context of 3-D signal processing.
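The thresholding rule $T_i = 3 \cdot E_i \cdot \sigma_n$ can be sketched on a flat list of subband coefficients. The noise estimate below uses the median absolute value divided by 0.6745, a common form of robust median estimator; whether [33] uses exactly this form is an assumption, and the function names are hypothetical.

```python
import statistics

def estimate_sigma(finest_coeffs):
    """ASSUMED robust estimator: median(|c|) / 0.6745 on finest-scale coefficients."""
    return statistics.median(abs(c) for c in finest_coeffs) / 0.6745

def hard_threshold(coeffs, filter_norm, sigma, factor=3.0):
    """Zero every coefficient whose magnitude is below T = factor * E_i * sigma."""
    t = factor * filter_norm * sigma
    return [c if abs(c) >= t else 0.0 for c in coeffs]

# Toy subband with E_i = 1 and sigma_n = 1, so that T_i = 3.
kept = hard_threshold([0.5, -4.0, 2.9, 3.1], filter_norm=1.0, sigma=1.0)
```

Scaling the threshold by the filter norm $E_i$ accounts for the fact that a redundant transform does not preserve the noise variance uniformly across subbands.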

It is not surprising that Curvelets-L does not perform as well as other transforms at low noise levels (σn = 30 or 40), mainly because of its lack of directionality in the finest scale. We should mention that the “full” version of the 3-D curvelet transform (i.e., the one with directional decomposition at all scales) is expected to give better performance. However, its large redundancy (around 40) makes it beyond the capability of our testing platform, and prevents us from including it in the current experiment.

Among UDWT, DTWT and Surfacelets, we can see that Surfacelets outperforms the other two by a large margin (from 0.76 dB to 1.30 dB) for the tested sequences. This suggests the potential of the proposed surfacelet transform. Figure 15 shows one frame from the denoised “Mobile” sequence by using different transforms. We can see that image details are best preserved by the surfacelet transform. This difference in denoising quality is much more conspicuous when viewing the video sequences.
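The quoted margins can be recomputed from Table III. The sketch below takes, for each sequence and noise level, the PSNR of Surfacelets minus the better of UDWT and DTWT; the values are copied from the table and the variable names are hypothetical.

```python
# PSNR values (dB) from Table III: Mobile, Coastguard, Tempete at sigma = 30, 40, 50.
PSNR = {
    "UDWT":        [24.02, 22.99, 22.23, 25.95, 24.95, 24.20, 27.13, 26.03, 25.22],
    "DTWT":        [24.56, 23.43, 22.58, 26.06, 25.01, 24.22, 27.18, 26.05, 25.23],
    "Surfacelets": [25.86, 24.72, 23.88, 26.82, 25.87, 25.15, 27.94, 26.94, 26.22],
}

margins = [round(s - max(u, d), 2)
           for s, u, d in zip(PSNR["Surfacelets"], PSNR["UDWT"], PSNR["DTWT"])]
print(min(margins), max(margins))   # 0.76 1.3
```

The largest gains occur on the “Mobile” sequence and the smallest on “Coastguard” and “Tempete” at σn = 30.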

VIII. CONCLUSION AND DISCUSSIONS

In this paper, we proposed a new family of directional filter banks for arbitrary N-dimensional signals. Compared with other related systems, the proposed NDFB is built upon an efficient tree-structured construction, which leads to a low redundancy ratio and refinable angular resolution. By combining the NDFB with a new multiscale pyramid, we constructed the surfacelet transform, which has potential for efficiently capturing and representing surface-like singularities in multidimensional signals. We envision that the proposed NDFB and surfacelet transform would find applications in various areas that involve the processing of multidimensional volumetric data, including video processing, seismic image processing, and medical image analysis.

There are several interesting directions worth pursuing. A constraint in the current construction is that the multiscale pyramid in the surfacelet transform has to be implemented in the Fourier domain. Therefore, a direction for future work is to find FIR implementations of the multiscale pyramid, perhaps with near perfect reconstruction. Another issue is that

Fig. 15. Denoised frames from the “Mobile” sequence. From left to right: Curvelets-L (25.02 dB), UDWT (25.80 dB), DTWT (26.45 dB), and Surfacelets (28.29 dB). Shown in parentheses are the PSNR values calculated on each frame.

the current construction of NDFB is N-times redundant in the N-D case. Exploring new ways to further reduce this redundancy would be of practical importance, especially for higher dimensional cases.

APPENDIX

A. Proof of Proposition 1

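We can write the equivalent downsampling matrix as

$$M^{(l_2)}_k = \left( \prod_{i=1}^{l_2} (D_2 \cdot R_{t_i}) \right) \cdot U^{(l_2)}_k = \left( \prod_{i=1}^{l_2} (D_2 \cdot R_{t_i}) \right) \cdot R_1^{2^{l_2} - 1 - 2k}. \tag{14}$$

Since $R_1 = R_0^{-1}$, we can write $R_{t_i} = R_0^{-2t_i + 1}$ for $t_i = 0, 1$; thus from (14),

$$M^{(l_2)}_k = \left( \prod_{i=1}^{l_2} (D_2 \cdot R_0^{-2t_i + 1}) \right) \cdot R_1^{2^{l_2} - 1 - 2k} \tag{15}$$

$$\phantom{M^{(l_2)}_k} = D_2^{l_2} \cdot R_0^{s(l_2)} \cdot R_1^{2^{l_2} - 1 - 2k}, \quad \text{with } s(l_2) = \sum_{i=1}^{l_2} (-2t_i + 1)\, 2^{l_2 - i},$$

where the second equality is obtained by interchanging the positions of $R_0^{-2t_i + 1}$ and $D_2$ in (15) recursively and using the fact that $R_0^{n} \cdot D_2 = D_2 \cdot R_0^{2n}$ for all $n \in \mathbb{Z}$. Recall that the channel index $k = \sum_{i=1}^{l_2} t_i\, 2^{l_2 - i}$. After some simple manipulation, we can get

$$s(l_2) = \sum_{i=1}^{l_2} (-2t_i + 1)\, 2^{l_2 - i} = 2^{l_2} - 1 - 2k,$$

and therefore $M^{(l_2)}_k = D_2^{l_2} \cdot R_0^{2^{l_2} - 1 - 2k} \cdot R_1^{2^{l_2} - 1 - 2k} = D_2^{l_2}$.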

B. Proof of Proposition 2

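We first write the equivalent filter as

$$F^{(l_2)}_k(\boldsymbol{\omega}) = F_{t_1}(\boldsymbol{\omega}) \prod_{n=2}^{l_2} F_{t_n}\!\left( (Q_{n-1})^T \boldsymbol{\omega} \right), \tag{16}$$

where $\boldsymbol{\omega} = (\omega_1, \omega_2)^T$, and the matrix $Q_{n-1}$ ($n = 2, \ldots, l_2$) is understood as the partial product of the overall sampling matrix in (14), i.e.,

$$Q_{n-1} = \prod_{i=1}^{n-1} (D_2 \cdot R_{t_i}).$$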

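We will prove the proposition by induction. The cases for $l_2 = 0$ and $l_2 = 1$ can be easily verified pictorially, as in Figure 9. Now suppose that the equivalent filters $F^{(l_2)}_{k'}(\boldsymbol{\omega})$, $0 \le k' < 2^{l_2}$, for an $l_2$-level $\mathrm{IRC}^{(l_2)}_{12}$ satisfy the desired wedge refinement condition. The $(l_2 + 1)$-level filter bank $\mathrm{IRC}^{(l_2+1)}_{12}$ is obtained by appending a resampled checkerboard filter bank to each channel of the $\mathrm{IRC}^{(l_2)}_{12}$ (excluding the resampling matrices $U^{(l_2)}_{k'}$). We consider the equivalent filter $F^{(l_2+1)}_k(\boldsymbol{\omega})$ of the $k$-th channel with $k = 2k' + t_{l_2+1}$, where $t_{l_2+1} = 0$ or $1$. From (16), we have

$$\begin{aligned} F^{(l_2+1)}_k(\boldsymbol{\omega}) &= F_{t_1}(\boldsymbol{\omega}) \prod_{n=2}^{l_2+1} F_{t_n}\!\left( (Q_{n-1})^T \boldsymbol{\omega} \right) \\ &= \left( F_{t_1}(\boldsymbol{\omega}) \prod_{n=2}^{l_2} F_{t_n}\!\left( (Q_{n-1})^T \boldsymbol{\omega} \right) \right) F_{t_{l_2+1}}\!\left( (Q_{l_2})^T \boldsymbol{\omega} \right) \\ &= F^{(l_2)}_{k'}(\boldsymbol{\omega}) \cdot F_{t_{l_2+1}}\!\left( (Q_{l_2})^T \boldsymbol{\omega} \right). \end{aligned} \tag{17}$$

Now by induction, we can write

$$\begin{aligned} W^{(0)}_0(\boldsymbol{\omega}) \cdot F^{(l_2+1)}_k(\boldsymbol{\omega}) &= W^{(0)}_0(\boldsymbol{\omega}) \cdot F^{(l_2)}_{k'}(\boldsymbol{\omega}) \cdot F_{t_{l_2+1}}\!\left( (Q_{l_2})^T \boldsymbol{\omega} \right) \\ &= W^{(l_2)}_{k'}(\boldsymbol{\omega}) \cdot F_{t_{l_2+1}}\!\left( (Q_{l_2})^T \boldsymbol{\omega} \right). \end{aligned} \tag{18}$$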

Using the expression for $Q_{l_2}$ ($= D_2^{l_2} \cdot R_0^{2^{l_2} - 1 - 2k}$) in Appendix A, it can be verified that in (18), upsampling by $Q_{l_2}$ effectively squeezes and shears the basic spectrum of the

[Figure 16: wedge supports in the (ω1, ω2) plane, with corners at (π, π) and −(π, π).]

Fig. 16. An illustration of how the wedge support from the previous level can be decomposed into two finer slices by the resampled checkerboard filters. The region inside the thick lines represents one of the wedge supports from an $l_2$-level decomposition. The dark gray and light gray regions inside the thin lines are copies of the basis spectrum of the upsampled checkerboard filters $F_0\!\left( (Q_{l_2})^T \boldsymbol{\omega} \right)$ and $F_1\!\left( (Q_{l_2})^T \boldsymbol{\omega} \right)$, respectively. The resampled (squeezed and sheared) checkerboard filters divide the original wedge right in the middle and produce two “thinner” wedges.

checkerboard filter banks $F_{t_{l_2+1}}(\boldsymbol{\omega})$ (which were shown in Figure 8(a)), so that the copy that overlaps with the wedge support $W^{(l_2)}_{k'}(\boldsymbol{\omega})$ exactly divides the original wedge in the middle. As illustrated in Figure 16, the outcome is two “thinner” wedge supports that correspond to $W^{(l_2+1)}_{2k'}(\boldsymbol{\omega})$ and $W^{(l_2+1)}_{2k'+1}(\boldsymbol{\omega})$, respectively.

Now the right side of (18) can be expressed as $W^{(l_2+1)}_{2k'+t_{l_2+1}}(\boldsymbol{\omega}) = W^{(l_2+1)}_k(\boldsymbol{\omega})$, and therefore we have verified the wedge refinement condition for the $(l_2 + 1)$-level case, i.e.,

$$W^{(0)}_0(\boldsymbol{\omega}) \cdot F^{(l_2+1)}_k(\boldsymbol{\omega}) = W^{(l_2+1)}_k(\boldsymbol{\omega}).$$

C. Proof of Lemma 1

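Taking the intersection of $N - 1$ wedge supports, we have

$$\begin{aligned} \bigcap_{i=2}^{N} \mathcal{W}^{(l_i)}_{k_i} &= \bigcap_{i=2}^{N} \left\{ (\omega_1, \omega_i) = \alpha_i\,(\pi, b_i) : |\alpha_i| \le 1 \text{ and } b_i \in B^{(l_i)}_{k_i} \right\} \\ &= \bigcap_{i=2}^{N} \bigcup_{|\alpha_i| \le 1} \alpha_i \cdot \left\{ (\omega_1, \omega_i) = (\pi, b_i) : b_i \in B^{(l_i)}_{k_i} \right\} \\ &= \bigcup_{|\alpha_2| \le 1} \cdots \bigcup_{|\alpha_N| \le 1} \bigcap_{i=2}^{N} \alpha_i \cdot \left\{ (\omega_1, \omega_i) = (\pi, b_i) : b_i \in B^{(l_i)}_{k_i} \right\} \\ &= \bigcup_{|\alpha| \le 1} \bigcap_{i=2}^{N} \alpha \cdot \left\{ (\omega_1, \omega_i) = (\pi, b_i) : b_i \in B^{(l_i)}_{k_i} \right\} \\ &= \bigcup_{|\alpha| \le 1} \alpha \cdot \left\{ \boldsymbol{\omega} = (\pi, b_2, \ldots, b_N)^T : b_i \in B^{(l_i)}_{k_i} \right\} = \mathcal{P}^{(\mathbf{l})}_{\mathbf{k}}. \end{aligned}$$

Since the ideal hyperpyramid and wedge filters are just indicator functions defined on their corresponding support sets, we have

$$P^{(\mathbf{l})}_{\mathbf{k}}(\boldsymbol{\omega}) = \mathbf{1}_{\mathcal{P}^{(\mathbf{l})}_{\mathbf{k}}}(\boldsymbol{\omega}) = \prod_{i=2}^{N} \mathbf{1}_{\mathcal{W}^{(l_i)}_{k_i}}(\omega_1, \omega_i) = \prod_{i=2}^{N} W^{(l_i)}_{k_i}(\omega_1, \omega_i).$$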

ACKNOWLEDGMENT

The authors thank Laurent Demanet and Lexing Ying for helpful discussions on their 3-D curvelet software, and Ivan Selesnick for providing his 3-D dual-tree wavelet software.


REFERENCES

[1] E. P. Simoncelli, W. T. Freeman, E. H. Adelson, and D. J. Heeger,“Shiftable multiscale transforms,” IEEE Trans. Inform. Th., Special Issueon Wavelet Transforms and Multiresolution Signal Analysis, vol. 38,no. 2, pp. 587–607, March 1992.

[2] R. H. Bamberger and M. J. T. Smith, “A filter bank for the directionaldecomposition of images: theory and design,” IEEE Trans. Signal Proc.,vol. 40, no. 4, pp. 882–893, April 1992.

[3] J. P. Antoine, P. Carrette, R. Murenzi, and B. Piette, “Image analysiswith two-dimensional continuous wavelet transform,” Signal Processing,vol. 31, pp. 241–272, 1993.

[4] E. J. Candes and D. L. Donoho, “Curvelets – a suprisingly effectivenonadaptive representation for objects with edges,” in Curve and SurfaceFitting, A. Cohen, C. Rabut, and L. L. Schumaker, Eds. Saint-Malo:Vanderbilt University Press, 1999.

[5] N. Kingsbury, “Complex wavelets for shift invariant analysis and fil-tering of signals,” Journal of Appl. and Comput. Harmonic Analysis,vol. 10, pp. 234–253, 2001.

[6] I. W. Selesnick, “The double-density dual-tree DWT,” IEEE Trans.Signal Proc., vol. 52, no. 5, pp. 1304–1314, May 2004.

[7] M. N. Do and M. Vetterli, “The contourlet transform: an efficientdirectional multiresolution image representation,” IEEE Trans. ImageProc., vol. 14, no. 12, December 2005.

[8] E. L. Pennec and S. Mallat, “Sparse geometric image representationswith bandelets,” IEEE Trans. Image Proc., vol. 14, no. 4, pp. 423–438,April 2005.

[9] K. Guo, W.-Q. Lim, D. Labate, G. Weiss, and E. Wilson, “Waveletswith composite dilations and their MRA properties,” Journal of Appl.and Comput. Harmonic Analysis, vol. 20, pp. 231–249, 2006.

[10] R. H. Bamberger, “New results on two and three dimensional directionalfilter banks,” in Twenty-Seventh Asilomar Conf. on Signals, Systems andComputers, vol. 2, 1993, pp. 1286–1290.

[11] S. Park, M. J. T. Smith, and R. M. Mersereau, “Improved structures ofmaximally decimated directional filter banks for spatial image analysis,”IEEE Trans. Image Proc., vol. 13, no. 11, pp. 1424–1431, November2004.

[12] S. Park, “New directional filter banks and their applications in imageprocessing,” Ph.D. dissertation, Georgia Institute of Technology, 1999.

[13] M. N. Do, “Directional multiresolution image repre-sentations,” Ph.D. dissertation, Swiss Federal Instituteof Technology, Lausanne, Switzerland, December 2001,http://www.ifp.uiuc.edu/˜minhdo/publications.

[14] P. S. Hong and M. J. T. Smith, “An octave-band family of non-redundant directional filter banks,” in Proc. IEEE Int. Conf. Acoust., Speech, and Signal Proc., Orlando, USA, 2002, pp. 1165–1168.

[15] Y. Lu and M. N. Do, “CRISP-contourlets: a critically-sampled directional multiresolution image representation,” in Proc. of SPIE conference on Wavelet Applications in Signal and Image Processing X, San Diego, USA, August 2003.

[16] T. T. Nguyen and S. Oraintara, “Multiresolution direction filter banks: theory, design and applications,” IEEE Trans. Signal Proc., vol. 53, no. 10, pp. 3895–3905, October 2005.

[17] L. T. Bruton, “Three-dimensional cone filter banks,” IEEE Trans. Circ. and Syst., Part I: Fundamental Theory and Applications, vol. 50, no. 2, pp. 208–216, February 2003.

[18] I. Daubechies, Ten Lectures on Wavelets. Philadelphia, PA: SIAM, 1992.

[19] M. Vetterli and J. Kovacevic, Wavelets and Subband Coding. Prentice-Hall, 1995.

[20] S. Mallat, A Wavelet Tour of Signal Processing. Academic Press, 1998.

[21] E. Viscito and J. P. Allebach, “The analysis and design of multidimensional FIR perfect reconstruction filter banks for arbitrary sampling lattices,” IEEE Trans. Circ. and Syst., vol. 38, pp. 29–41, Jan. 1991.

[22] P. P. Vaidyanathan, Multirate Systems and Filter Banks. Prentice Hall, 1993.

[23] F. Nicolier, O. Laligant, and F. Truchetet, “Discrete wavelet transform implementation in Fourier domain for multidimensional signal,” J. Electron. Imag., vol. 11, no. 3, pp. 338–346, July 2002.

[24] M. Feilner, D. V. D. Ville, and M. Unser, “An orthogonal family of quincunx wavelets with continuously adjustable order,” IEEE Trans. Image Proc., vol. 14, no. 4, pp. 499–510, Apr. 2005.

[25] L. Ying, L. Demanet, and E. Candes, “3D discrete curvelet transform,” in Proc. of SPIE conference on Wavelet Applications in Signal and Image Processing XI, San Diego, USA, 2005.

[26] D. B. H. Tay and N. Kingsbury, “Flexible design of multidimensional perfect reconstruction FIR 2-band filters using transformations of variables,” IEEE Trans. Image Proc., vol. 2, pp. 466–480, 1993.

[27] Y. Lu and M. N. Do, “Multidimensional nonsubsampled hourglass filter banks: geometry of passband support and filter design,” in Fortieth Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, USA, October 2006.

[28] Y.-P. Lin and P. P. Vaidyanathan, “Theory and design of two-dimensional filter banks: A review,” Multidimensional Systems and Signal Proc., vol. 7, pp. 263–330, 1996.

[29] J. G. Rosiles, “Image and texture analysis using biorthogonal angular filter banks,” Ph.D. dissertation, Georgia Institute of Technology, 2004.

[30] Y. Lu and M. N. Do, “The finer directional wavelet transform,” in Proc. IEEE Int. Conf. Acoust., Speech, and Signal Proc., Philadelphia, March 2005.

[31] ——, “A new contourlet transform with sharp frequency localization,” in Proc. IEEE Int. Conf. on Image Proc., Atlanta, USA, October 2006.

[32] I. W. Selesnick and K. Y. Li, “Video denoising using 2D and 3D dual-tree complex wavelet transforms,” in Proc. of SPIE conference on Wavelet Applications in Signal and Image Processing X, San Diego, USA, August 2003.

[33] D. L. Donoho and I. M. Johnstone, “Ideal spatial adaptation by wavelet shrinkage,” Biometrika, vol. 81, no. 3, pp. 425–455, December 1994.

Yue Lu received the B.Eng. and M.Eng. degrees in electrical engineering, both with highest honors, from Shanghai Jiao Tong University, China, in 1999 and 2002, respectively. He is currently pursuing the Ph.D. degree in electrical engineering and the M.S. degree in mathematics at the University of Illinois at Urbana-Champaign.

He was a visiting student at Microsoft Research Asia, Beijing, China, from March to July 2002, and a visiting researcher at Siemens Corporate Research, Princeton, NJ, from January to June 2006.

His research interests include the theory, constructions, and applications of multiscale geometric representations for multidimensional signals; image and video processing; and medical image analysis for computer-aided diagnosis.

He received the Most Innovative Paper Award of ICIP in 2006 for his paper (with Minh N. Do) on the construction of directional multiresolution image representations.

Minh N. Do was born in Thanh Hoa, Vietnam, in 1974. He received the B.Eng. degree in computer engineering from the University of Canberra, Australia, in 1997, and the Dr.Sci. degree in communication systems from the Swiss Federal Institute of Technology Lausanne (EPFL), Switzerland, in 2001.

Since 2002, he has been an Assistant Professor with the Department of Electrical and Computer Engineering and a Research Assistant Professor with the Coordinated Science Laboratory and the Beckman Institute, University of Illinois at Urbana-Champaign. His research interests include wavelets, image and multidimensional signal processing, multiscale geometric analysis, and visual information representation.

He received a Silver Medal from the 32nd International Mathematical Olympiad in 1991, a University Medal from the University of Canberra in 1997, the best doctoral thesis award from the Swiss Federal Institute of Technology Lausanne in 2001, and a CAREER award from the National Science Foundation in 2003.

