evaluation of spiht coding parameters - ntut.edu.tw

Evaluation of SPIHT Coding Parameters

Shih-Hsuan Yang and Wu-Jie Liao Email: [email protected]

Department of Computer Science and Information Engineering National Taipei University of Technology

Taipei, Taiwan, ROC

Abstract

SPIHT (set partitioning in hierarchical trees) is one of the most successful compression algorithms for still images. SPIHT derives from the embedded zerotree wavelet (EZW) algorithm, which generates scalable and highly compressed bit streams. The superiority of SPIHT is mainly attributed to its ingenious integration of the wavelet transform with successive scalar quantization of spatial orientation trees. In this paper, we investigate the implementation factors crucial to SPIHT coding. We first evaluate a variety of wavelet filters commonly employed in image coding on the basis of their coding efficiency and computational complexity. Several types of image sample extension across the boundary are then examined. The presented results can provide good guidelines for the practical design of SPIHT-based codecs.

Keywords: image coding, wavelet transform, embedded zerotree wavelet (EZW), SPIHT

1. Introduction

Image compression is essential to many multimedia applications, since uncompressed pictorial information is too large to be accommodated by typical transmission and storage facilities. Most modern image coders are transform based: a linear (usually orthogonal) transformation converts pixels into uncorrelated and compacted coefficients, so that lossy quantization can be applied effectively to the transform coefficients. To date, the most commonly used transformations are the discrete cosine transform (DCT) and the discrete wavelet transform (DWT). Entropy coding of the quantized symbol sequence can further reduce the bit rate. One such paradigm is baseline JPEG coding, which combines the DCT, perceptually weighted scalar quantization, and Huffman coding of the zig-zag scanned symbols.

Shapiro’s seminal paper on embedded zerotree wavelet (EZW) coding [1] set a milestone in image coding. Within the same transform-coding framework as JPEG, EZW instead employs the wavelet transform, successive scalar quantization, and arithmetic coding. The major contribution of EZW is an efficient representation (the zerotree) for a set of insignificant wavelet coefficients that correspond to the same spatial location and orientation. Of the various improvements of EZW, set partitioning in hierarchical trees (SPIHT) coding [2] is regarded as the most efficient. Because of its excellent compression performance and implementation elegance, SPIHT has in fact become the yardstick for every new image-coding algorithm.

The design parameters of each coding stage (transformation, quantization, and entropy coding) can greatly affect the performance of EZW and SPIHT. Several issues pertaining to embedded wavelet coding have been independently addressed in the literature. The first category concerns the importance of wavelet transforms. Li et al. [3] investigate the choice of several wavelet transforms and extension methods for EZW. In [4] and [5], the authors devise good IIR filters under EZW and SPIHT, respectively. In [6], several reversible wavelet transforms are compared under JPEG-2000 on the basis of their compression performance and computational complexity. The concept of mutual information is used in [7] to model the interscale and intrascale dependencies between wavelet coefficients. The second category examines the subtleties of embedded quantization. It is shown in [8] that, under the same embedded coding structure, the wavelet transform outperforms the DCT by less than 1 dB. In [9] the authors present a unified rate-distortion analysis framework for transform and subband coding, in which the rate-distortion curve is characterized by the fraction of zeros among the quantized coefficients. Some variations of quantization schemes have been proposed for SPIHT [10], though the improvement is usually marginal.

This paper is most similar to [3], but provides a far more comprehensive evaluation for SPIHT coding. In Section 2, we review the aspects of SPIHT coding relevant to our discussion. Common wavelet filters together with possible extension types are evaluated and analyzed in Sections 3 and 4. Throughout this paper, we use the two 512×512 gray-level USC-database images Lena and Baboon [11], shown in Figure 1, as the test images.

Figure 1. Test images: (a) Lena (b) Baboon.

2. Design Parameters of SPIHT Coding

SPIHT employs a two-dimensional pyramidal DWT to generate hierarchical wavelet subbands. In this paper we use a 5-level decomposition, which produces 16 subbands indexed from 0 (DC subband) to 15 (highest detail subband) in zig-zag order. Five-level decomposition is sufficient for good coding performance. A typical parent node has four child nodes in the next layer, as shown in Figure 2. The parent and child nodes group into a spatial-orientation tree that carries the information associated with the same location and orientation. Shapiro observed an important self-similar relationship among wavelet coefficients [1]: a set of descendant nodes tends to be insignificant if their parent node is insignificant. This property results from the energy compaction and self-similarity of the wavelet transform, as is manifest in Figure 3.

Figure 2. Parent-child relationship of SPIHT. The nodes marked with asterisks have no children.

Figure 3. Three-level decomposition of Lena using the 9/7 filter. (The coefficients are properly scaled for better visual quality.)
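The parent-child relationship described above can be illustrated with a minimal sketch. It assumes the usual pyramid layout in which the coefficient at (i, j) has its four children at (2i, 2j), (2i, 2j+1), (2i+1, 2j), and (2i+1, 2j+1); the function names are hypothetical, and the special grouping of nodes inside the DC subband (the asterisked nodes of Figure 2) is deliberately left out.

```python
def num_subbands(levels):
    """Each decomposition level adds three detail subbands; one DC subband remains."""
    return 3 * levels + 1


def children(i, j, size):
    """Children of coefficient (i, j) in a size x size pyramid layout.

    Assumption: (i, j) lies outside the DC subband. Coefficients in the
    finest-level subbands have no children, so an empty list is returned.
    """
    if 2 * i >= size or 2 * j >= size:
        return []
    return [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]


def descendants(i, j, size):
    """All descendants of (i, j): the spatial-orientation tree without its root."""
    found = []
    stack = children(i, j, size)
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(children(node[0], node[1], size))
    return found


if __name__ == "__main__":
    print(num_subbands(5))                  # subbands of a 5-level decomposition
    print(children(0, 16, 512))             # a node just outside the 16x16 DC subband
    print(len(descendants(0, 16, 512)))     # size of its tree below the root
```

For a 512×512 image with five levels, a node in the coarsest detail subband roots a tree spanning four finer levels, which is why a single "insignificant tree" decision can stand in for many coefficients.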


A simple scalar deadzone quantizer is used for the DWT coefficients in SPIHT. Quantization is performed in two passes, the sorting pass and the refinement pass (depicted in Figure 4). The sorting pass identifies the coefficients that are significant with respect to a threshold T and outputs their signs. The identified significant coefficients are recorded in the list of significant pixels (LSP). Insignificant spatial-orientation trees, which correspond to the zerotrees of EZW, are stored in the list of insignificant sets (LIS). The remaining isolated insignificant coefficients are kept in the list of insignificant pixels (LIP). One of the major improvements of SPIHT over EZW is its more effective and flexible identification of zerotrees, which increases the chance of forming them. The refinement pass narrows the quantization cell size to T for those coefficients having magnitude greater than 2T. The sorting and refinement passes are then repeated, with the threshold halved in each iteration. Adaptive arithmetic coding can be applied to the quantization bits to further reduce the rate. However, it is observed [2] that the added entropy coding provides very limited rate reduction while requiring much more intensive computation. We therefore dispense with entropy coding in the following discussion.

Figure 4. SPIHT coding process: (a) overall (b) quantization alone.
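The interplay of the sorting and refinement passes with a halving threshold can be sketched as follows. This is a simplified illustration and not the SPIHT codec itself: it works on a flat list of integer coefficients, omits the LIS/LIP set partitioning entirely, and assumes the decoder reconstructs each coefficient at the midpoint of its current quantization cell. The function name and the dictionary that plays the role of the LSP are hypothetical.

```python
def successive_approximation(coeffs, num_passes):
    """Reconstruct integer coefficients after num_passes sorting/refinement passes."""
    max_mag = max(abs(c) for c in coeffs)
    if max_mag == 0:
        return [0.0] * len(coeffs)

    threshold = 1 << (max_mag.bit_length() - 1)  # largest power of two <= max |c|
    low = {}               # index -> lower bound of the magnitude cell (acts as the LSP)
    width = 2 * threshold  # current cell width of significant coefficients

    for _ in range(num_passes):
        if threshold < 1:
            break
        earlier = list(low)  # coefficients found significant in previous passes

        # Sorting pass: find coefficients that become significant at this threshold.
        for k, c in enumerate(coeffs):
            if k not in low and abs(c) >= threshold:
                low[k] = threshold

        # Refinement pass: one more magnitude bit for the earlier coefficients,
        # which narrows their cell from 2 * threshold to threshold.
        for k in earlier:
            if abs(coeffs[k]) & threshold:
                low[k] += threshold

        width = threshold
        threshold >>= 1

    recon = []
    for k, c in enumerate(coeffs):
        if k in low:
            sign = 1 if c > 0 else -1
            recon.append(sign * (low[k] + width / 2))
        else:
            recon.append(0.0)
    return recon


if __name__ == "__main__":
    data = [34, -20, 5, 0, -9]
    for passes in (1, 2, 3):
        print(passes, successive_approximation(data, passes))
```

Truncating the loop after any pass still yields a usable reconstruction, which is the property that makes the bit stream embedded.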

3. Effects of Wavelet Filters

The coding performance of wavelet-based codecs is sensitive to the choice of wavelet filters. For instance, the 5/3 and 9/7 biorthogonal filter banks have been adopted in the state-of-the-art image coding standard JPEG-2000 on the basis of a performance evaluation of various filters [6]. Favorable properties of wavelets include [12]:

♦ Desirable time-frequency localization.
♦ Compact support.
♦ Orthogonality.
♦ Smoothness, regularity, or vanishing moments.
♦ Symmetry (linear-phase constraint).

In this study, we consider the following wavelet filters commonly referred to in the literature. These filters fall into three categories: 1) the Haar wavelet D2 and the Daubechies 4-tap and 6-tap orthogonal filters D4 and D6 [13], 2) the best real biorthogonal filters 9/7 [14] and 10/18 [15], and 3) the seven integer biorthogonal wavelet transforms (IWT) 5/3, 9/7F, 9/7M, 5/11A, 5/11C, 13/7C, and 13/7T [6]. An IWT is a fixed-point approximation to its parent real wavelet transform (RWT) [16], and can be implemented in the lifting framework [17], [18] without costly floating-point operations. It is reversible and thus suitable for a unified lossy and lossless codec.
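To make the lifting idea concrete, the following minimal sketch implements one level of a one-dimensional reversible 5/3 transform using the widely documented predict and update steps with floor rounding. It is an illustration under stated assumptions, not the code used in this paper: the signal length is assumed even, the boundary samples are mirrored, and the function names are hypothetical.

```python
def forward_53(x):
    """One level of the integer 5/3 lifting transform on an even-length list."""
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch assumes an even-length signal"
    even, odd = x[0::2], x[1::2]
    half = n // 2

    # Predict step: detail = odd sample minus the floor-average of its even neighbours.
    detail = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # mirror at the right edge
        detail.append(odd[i] - ((even[i] + right) >> 1))

    # Update step: smooth = even sample plus a rounded quarter of neighbouring details.
    smooth = []
    for i in range(half):
        left = detail[i - 1] if i > 0 else detail[0]      # mirror at the left edge
        smooth.append(even[i] + ((left + detail[i] + 2) >> 2))
    return smooth, detail


def inverse_53(smooth, detail):
    """Undo forward_53 exactly by reversing the lifting steps in opposite order."""
    half = len(smooth)
    even = []
    for i in range(half):
        left = detail[i - 1] if i > 0 else detail[0]
        even.append(smooth[i] - ((left + detail[i] + 2) >> 2))

    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(detail[i] + ((even[i] + right) >> 1))
    return x


if __name__ == "__main__":
    signal = [10, 12, 14, 200, 3, 7, 255, 0]
    s, d = forward_53(signal)
    print(s, d)
    print(inverse_53(s, d) == signal)
```

Because each lifting step adds or subtracts an integer computed from samples that the inverse also has available, the rounding never destroys information, which is what makes the transform reversible using only integer additions and shifts.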

The SPIHT coding performance at various bit rates, in bits per pixel (bpp), with each of the aforementioned filters is given in Table 2. The visual quality is measured by the peak signal-to-noise ratio (PSNR). The PSNR in dB of an M-pixel 8 bpp gray-level image X = (x1, x2, …, xM) and its distorted version Y = (y1, y2, …, yM) is computed by

\[
\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\frac{1}{M} \sum_{i=1}^{M} (x_i - y_i)^2} \tag{1}
\]
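Equation (1) translates directly into code. The sketch below is a plain-Python illustration for flat sequences of 8-bit pixel values; the function name is hypothetical, and returning infinity for identical images is a convention chosen here rather than something specified in the text.

```python
import math


def psnr(original, distorted, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values, per Eq. (1)."""
    if len(original) != len(distorted) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((x - y) ** 2 for x, y in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical images: the ratio in Eq. (1) is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    print(psnr([10, 20, 30, 40], [11, 19, 31, 39]))  # mean squared error of 1
```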

The best real and integer wavelet transforms are highlighted separately in Table 2. For RWT, both 9/7 and 10/18 achieve excellent performance. For IWT, the 9/7F filter attains most of the best results, and 13/7C also behaves well at high bit rates. It should also be noted that the simple integer 5/3 filter achieves relatively good performance. In Table 3 we measure the computation time required for the transformations. The simulation is conducted on a Pentium-IV 2.4 GHz PC. The results reveal that the good coding efficiency of the 9/7 (9/7F) and 10/18 filters is achieved at the expense of more intensive computation, especially for the 10/18 filter.

Table 3. Relative computation time required for transformation.

4. Effects of Extension Types

Although applicable to all kinds of wavelet filters, periodic extension together with circular convolution usually introduces severe boundary artifacts because of the discontinuity where the two ends of the signal are joined. To circumvent this problem, symmetric extension of the input samples in conjunction with linear-phase filters can produce symmetric output sequences of the same length [19]. Unfortunately, the linear-phase constraint usually breaks orthogonality. It has been shown that the only real-valued orthogonal linear-phase wavelet with compact support is the trivial Haar filter. By relaxing the orthogonality constraint, good linear-phase biorthogonal FIR wavelets can be obtained, such as those in the last two categories mentioned in Section 3.

Depending on the intrinsic properties of the filters (even or odd number of taps, symmetric or anti-symmetric), a proper extension method should be chosen for perfect reconstruction and good coding performance. Some common types of symmetric extension are shown in Figure 5. The rule of thumb is to use the same symmetry type for both the filter and the sequence. Similar care should be taken in the down-sampling and up-sampling processes. The results shown in Table 2 were obtained with the most appropriate extension method, i.e., periodic extension for orthogonal wavelets and the best symmetric extension for biorthogonal wavelets. To see the effect of the extension type for biorthogonal filters, we recode the test images using periodic extension; the results are given in Table 4. For biorthogonal wavelets, symmetric extension provides a substantial edge over periodic extension, especially at low rates. A reasonable conjecture is that the aliasing (high-frequency) components introduced by periodic extension make the SPIHT compression less efficient.
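The difference between the extension types can be seen on a short sequence. The sketch below pads a list by k samples on each side in three ways: periodic, symmetric without repeating the boundary sample, and symmetric with the boundary sample repeated. The function names and the terms "whole-sample" and "half-sample" are assumptions of this sketch; how they map onto the panel labels of Figure 5 is not stated in the text.

```python
def periodic_extension(x, k):
    """Wrap the sequence around: the samples before x[0] are the last k samples."""
    n = len(x)
    left = [x[(i - k) % n] for i in range(k)]
    right = [x[i % n] for i in range(k)]
    return left + list(x) + right


def whole_sample_symmetric(x, k):
    """Mirror about the boundary sample itself (the boundary sample is not repeated)."""
    n = len(x)
    assert k <= n - 1, "sketch assumes the padding is shorter than the signal"
    left = [x[i] for i in range(k, 0, -1)]
    right = [x[n - 1 - i] for i in range(1, k + 1)]
    return left + list(x) + right


def half_sample_symmetric(x, k):
    """Mirror about the point just outside the boundary (the boundary sample is repeated)."""
    n = len(x)
    assert k <= n, "sketch assumes the padding is no longer than the signal"
    left = [x[i] for i in range(k - 1, -1, -1)]
    right = [x[n - 1 - i] for i in range(k)]
    return left + list(x) + right


if __name__ == "__main__":
    ramp = [1, 2, 3, 4]
    print(periodic_extension(ramp, 2))       # jumps from 4 back to 1 at the seam
    print(whole_sample_symmetric(ramp, 2))   # continues smoothly past both ends
    print(half_sample_symmetric(ramp, 2))
```

On a ramp, periodic extension places the largest sample next to the smallest, creating an artificial edge at the boundary, whereas both symmetric variants keep neighbouring samples close in value.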

RWT            D2     D4     D6     9/7    10/18
Relative time  1.0    1.56   2.06   2.03   3.59

IWT            5/3    9/7F   9/7M   5/11A  5/11C  13/7C  13/7T
Relative time  1.00   1.94   1.01   1.52   1.52   1.13   1.13

Table 2. Compression results in PSNR (dB) for (a) Lena (b) Baboon.

(a) Lena

        RWT                                IWT
bpp     D2     D4     D6     9/7    10/18  5/3    9/7F   9/7M   5/11A  5/11C  13/7C  13/7T
0.125   27.53  28.97  29.38  30.53  30.68  29.71  30.25  29.78  29.84  29.79  29.94  29.90
0.25    30.21  31.85  32.35  33.58  33.75  32.60  33.24  32.87  32.81  32.88  33.04  33.07
0.5     33.50  35.24  35.75  36.74  36.86  35.75  36.17  35.93  35.92  35.89  36.14  36.13
1.0     37.47  38.92  39.26  39.92  39.96  38.87  38.84  38.80  38.89  38.80  39.03  39.00

(b) Baboon

        RWT                                IWT
bpp     D2     D4     D6     9/7    10/18  5/3    9/7F   9/7M   5/11A  5/11C  13/7C  13/7T
0.125   20.97  21.28  21.37  21.49  21.60  20.96  21.42  20.85  20.92  20.87  21.05  20.98
0.25    22.14  22.54  22.64  22.88  22.97  22.25  22.80  22.18  22.23  22.17  22.40  22.35
0.5     24.60  24.60  24.79  25.11  25.13  24.22  25.07  24.28  24.25  24.23  24.49  24.47
1.0     27.97  27.97  28.21  28.62  28.61  27.71  28.37  27.80  27.79  27.76  28.02  27.98


Figure 5. Extension types: (a) periodic extension (b) odd-symmetric extension (c) even-symmetric extension (d) anti-symmetric extension.

The DC subband, though it contains only (1/4)^5 < 0.1% of the transform coefficients, holds most of the signal energy. Table 5 gives the sum of the squared DC coefficients as a percentage of the total sum of squared coefficients. In this evaluation we assume the most appropriate extension across boundaries. It is observed from Table 5 that the best biorthogonal wavelets, 9/7 and 10/18, possess a much higher energy compaction ratio than the other biorthogonal wavelets. It seems that the energy-compaction property is the dominant factor that determines the coding performance of biorthogonal wavelets.
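The quantity reported in Table 5 can be computed as in the following sketch. It assumes the transform coefficients are stored as a two-dimensional list in the usual pyramid layout, with the DC subband occupying the top-left corner; the function names are hypothetical.

```python
def dc_coefficient_fraction(levels):
    """Fraction of all coefficients that lie in the DC subband after `levels` levels."""
    return 0.25 ** levels


def dc_energy_percentage(coeffs, levels):
    """Energy of the DC subband as a percentage of the total coefficient energy.

    Assumption: coeffs is a list of rows in pyramid layout, so the DC subband is
    the top-left block whose sides are the full sides divided by 2**levels.
    """
    rows, cols = len(coeffs), len(coeffs[0])
    dc_rows, dc_cols = rows >> levels, cols >> levels
    total = sum(c * c for row in coeffs for c in row)
    dc = sum(coeffs[r][c] ** 2 for r in range(dc_rows) for c in range(dc_cols))
    return 100.0 * dc / total


if __name__ == "__main__":
    print(100 * dc_coefficient_fraction(5))           # percent of coefficients in the DC subband
    print(dc_energy_percentage([[3, 1], [1, 1]], 1))  # toy one-level example
```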

Table 4. Coding results in PSNR (dB) for periodic/symmetric extension: (a) Lena (b) Baboon. The orthogonal filters D2, D4, and D6 use periodic extension only.

(a) Lena

RWT
bpp     D2     D4     D6     9/7          10/18
0.125   27.53  28.97  29.38  30.06/30.53  30.20/30.68
0.25    30.21  31.85  32.35  33.21/33.58  33.58/33.75
0.5     33.50  35.24  35.75  36.52/36.74  36.46/36.86
1.0     37.47  38.92  39.26  39.77/39.92  39.75/39.96

IWT
bpp     5/3          9/7F         9/7M         5/11A        5/11C        13/7C        13/7T
0.125   29.20/29.71  29.86/30.25  29.40/29.78  29.33/29.84  29.32/29.79  29.53/29.94  29.53/29.90
0.25    32.12/32.60  32.88/33.24  32.53/32.87  32.36/32.81  32.41/32.88  32.70/33.04  32.69/33.07
0.5     35.50/35.75  35.96/36.17  35.73/35.93  35.72/35.92  35.72/35.89  35.90/36.14  35.89/36.13
1.0     38.76/38.87  38.74/38.84  38.71/38.80  38.78/38.89  38.70/38.80  38.92/39.03  38.88/39.00

(b) Baboon

RWT
bpp     D2     D4     D6     9/7          10/18
0.125   20.97  21.28  21.37  21.43/21.49  21.55/21.60
0.25    22.14  22.54  22.64  22.82/22.88  22.90/22.97
0.5     24.09  24.60  24.79  25.03/25.11  25.09/25.13
1.0     27.31  27.97  28.21  28.53/28.62  28.54/28.61

IWT
bpp     5/3          9/7F         9/7M         5/11A        5/11C        13/7C        13/7T
0.125   20.92/20.96  21.35/21.42  20.81/20.85  20.87/20.92  20.82/20.87  20.98/21.05  20.94/20.98
0.25    22.17/22.25  22.72/22.80  22.12/22.18  22.16/22.23  22.11/22.17  22.36/22.40  22.28/22.35
0.5     24.16/24.22  24.98/25.07  24.20/24.28  24.18/24.25  24.16/24.23  24.40/24.49  24.37/24.47
1.0     27.65/27.71  28.29/28.37  27.97/27.80  27.68/27.79  27.64/27.76  27.97/28.02  27.92/27.98

Table 5. Energy percentage of DC subband (5-level decomposition).

        RWT                                IWT
        D2     D4     D6     9/7    10/18  5/3    9/7F   9/7M   5/11A  5/11C  13/7C  13/7T
Lena    97.79  97.37  97.28  97.98  97.42  78.27  96.12  82.14  78.09  77.74  81.88  82.10
Baboon  98.85  98.96  98.90  99.18  98.68  88.69  98.40  91.21  88.58  88.39  91.69  91.64


5. Conclusions

The coding performance of SPIHT codecs is largely influenced by their intrinsic design parameters. In this paper, we investigate these parameters in depth. We evaluate a variety of wavelet filters and the extension types across boundaries. The presented results can provide guidelines for choosing the best tradeoff in a particular SPIHT-based image compression system. For example, the 9/7 and 10/18 biorthogonal wavelets with symmetric extension provide the best lossy compression performance, at the cost of the highest complexity. For low-complexity codecs, the 5/3 filter may be a reasonably good choice.

References

[1] J. M. Shapiro, “Embedded image coding using zerotrees of wavelet coefficients,” IEEE Trans. Signal Processing, vol. 41, no. 12, pp. 3445-3462, Dec. 1993.

[2] A. Said and W. A. Pearlman, “A new, fast, and efficient image codec based on set partitioning in hierarchical trees,” IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 3, pp. 243-250, June 1996.

[3] J. Li, P.-Y. Cheng, and C.-C. J. Kuo, “On the improvements of embedded zerotree wavelet (EZW) coding,” SPIE vol. 2501, pp. 1490-1501, 1994.

[4] C. D. Creusere and S. K. Mitra, “Image coding using wavelets based on perfect reconstruction IIR filter banks,” IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 5, pp. 447-458, Oct. 1996.

[5] N. Polyak and W. A. Pearlman, “A new flexible biorthogonal filter design for multiresolution filterbanks with application to image compression,” IEEE Trans. Signal Processing, vol. 48, no. 8, pp. 2279-2288, Aug. 2000.

[6] M. D. Adams and F. Kossentini, “Reversible integer-to-integer wavelet transforms for image compression: performance and analysis,” IEEE Trans. Image Processing, vol. 9, no. 6, pp. 1010-1024, June 2000.

[7] J. Liu and P. Moulin, “Information-theoretic analysis of interscale and intrascale dependencies between image wavelet coefficients,” IEEE Trans. Image Processing, vol. 10, no. 11, pp. 1647-1658, Nov. 2001.

[8] Z. Xiong, K. Ramchandran, M. T. Orchard, and Y.-Q. Zhang, “A comparative study of DCT- and wavelet- based image coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 9, no. 5, pp. 692-695, Aug. 1999.

[9] Z. He and S. K. Mitra, “A unified rate-distortion analysis framework for transform coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 12, pp. 1221-1236, Dec. 2001.

[10] A. Abu-Hajar and R. Sankar, “Enhanced partial-SPIHT for lossless and lossy image compression,” ICASSP 2003.

[11] http://sipi.usc.edu/services/database/Database.html

[12] C. S. Burrus, R. A. Gopinath, and H. Guo, Introduction to Wavelets and Wavelet Transforms: A Primer, Prentice-Hall, Inc., 1998.

[13] I. Daubechies, Ten Lectures on Wavelets, SIAM, CBMS series, April 1992.

[14] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, “Image coding using wavelet transform,” IEEE Trans. Image Processing, vol. 1, no. 2, pp. 205-220, Apr. 1992.

[15] M. J. Tsai, J. D. Villasenor, and F. Chen, “Stack-run image coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 5, pp. 519-521, Oct. 1996.

[16] J. Reichel, G. Menegas, M. J. Nadenau, and M. Kunt, “Integer wavelet transform for embedded lossy to lossless image compression,” IEEE Trans. Image Processing, vol. 10, no. 3, pp. 383-392, Mar. 2001.

[17] I. Daubechies and W. Sweldens, “Factoring wavelet transforms into lifting steps,” J. Fourier Anal. Appl., vol. 4, no. 3, pp. 247-269, 1998.

[18] A. R. Calderbank, I. Daubechies, W. Sweldens, and B.-L. Yeo, “Wavelet transforms that map integers to integers,” Appl. Comput. Harmon. Anal., vol. 5, pp. 332-369, July 1998.

[19] H. Kiya, K. Nishikawa, and M. Iwahashi, “A development of symmetric extension method for subband image coding,” IEEE Trans. Image Processing, vol. 3, no. 1, pp. 78-81, Jan. 1994.
