Image and Vision Computing 15 (1997) 867-871

Short communication

Fast encoding algorithms for tree-structured vector quantization

Jim Z.C. Lai*

Department of Information Engineering and Computer Science, Feng-Chia University, Taichung, Taiwan, ROC

Received 8 November 1995; revised 21 March 1997; accepted 26 March 1997

Abstract

To overcome the extensive computation of searching codewords in the encoding process of vector quantization (VQ), several fast coding approaches have been proposed. As far as we know, the tree-structured search method is the fastest. In this paper, we develop the encoding algorithm with fast search (EAWFS) and the encoding algorithm with fast comparison (EAWFC) to speed up the encoding process of tree-structured vector quantization (TSVQ). These two approaches are also applicable to pruned tree-structured vector quantization (PTSVQ). The computational complexity of EAWFC is about half of that of the available method for TSVQ. For several encoded real images, EAWFS and EAWFC have a shorter execution time than the available algorithm. For smooth images with few edges, EAWFS will outperform EAWFC. EAWFS may theoretically obtain a lower distortion than the conventional method or EAWFC; however, in our experiments these three methods produced the same distortion. © 1997 Elsevier Science B.V.

Keywords: Fast search; Tree-structured vector quantization (TSVQ); Codeword

1. Introduction

Vector quantization (VQ) techniques have been used for a number of years for data compression. With its relatively simple structure and computation, VQ has received great attention in the last decade [1-10]. Many types of VQ, such as classified VQ, mean-removed classified VQ, multistage VQ, address VQ, variable-rate VQ, finite-state VQ, and tree-structured VQ, have been used for various purposes. VQ has also been applied to many other applications, such as progressive image transmission and video compression. More recent work on VQ can be found in references [2,3]. VQ requires dividing the signal to be compressed into vectors (or blocks). Each vector is compared to the codewords of a codebook containing representative vectors. The index of the codeword that is most similar to the input vector is transmitted to the receiver. At the receiver, the index is used to access a codeword from an identical codebook.

VQ faces extensive computations of searching codewords in the encoding process. To overcome this problem, several fast coding approaches have been proposed [4-10]. As far as we know, the tree-structured search method is the fastest: it needs only $2\log_2(N)$ searches for $N$ codewords when the search path is arranged as a binary tree structure. In TSVQ (tree-structured vector quantization), the search for a codeword is performed in stages; in each stage a subset of codewords is eliminated from consideration. In a binary tree search, an input vector is compared with two predesigned test vectors, and the nearest (minimum distortion) test vector decides which of the two paths is followed to reach the next test.

* Fax: +886 4 451 6101.

0262-8856/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0262-8856(97)00032-2


In this paper, fast algorithms are presented for TSVQ with the codebook arranged as a binary tree. It is noted that no additional storage is required by one of the approaches presented here. The current method can easily be extended to pruned tree-structured VQ. This paper is organized as follows. Section 2 describes tree-structured vector quantization (TSVQ). Section 3 presents the algorithms developed in this paper. Some experimental results are shown in Section 4, and concluding remarks are given in Section 5.

2. Tree-structured vector quantization

Vector quantization is a generalization of scalar quantization. Basically, the operations of VQ can be divided into two phases, encoding and decoding. A vector quantizer can be defined as a mapping function $Q$ in $k$-dimensional Euclidean space. That is, $Q: R^k \rightarrow C$, where $C = \{C_i \mid i = 1, 2, \ldots, N\}$ is the set of reproduction vectors (codewords) and $N$ is the number of codewords in $C$. The nearest codeword is selected by computing the distortion between an input vector and all codewords.


Fig. 1. Basic structure of tree-structured encoding.

The distortion between the input vector $X = (x_1, x_2, \ldots, x_k)^t$ and a codeword $C = (c_1, c_2, \ldots, c_k)^t$ is defined as

$$d(X, C) = \sum_{i=1}^{k} |x_i - c_i|^2 \qquad (1)$$

where $t$ denotes matrix transposition. To decide the best match (nearest codeword) of an input vector, we need $N$ distortion computations; this operation of 'mapping a vector to the nearest codeword' is the computationally expensive part of the encoding process.
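As a point of reference, the exhaustive (full) search that TSVQ is designed to avoid can be sketched in a few lines. This is an illustrative sketch only, not code from the paper; the NumPy representation and the function name are our own.

```python
import numpy as np

def full_search_encode(x, codebook):
    """Return the index of the codeword nearest to input vector x.

    x:        input vector of shape (k,)
    codebook: array of shape (N, k), one codeword per row

    Evaluates the squared-error distortion of Eq. (1) against all N
    codewords, i.e. the N distortion computations mentioned above.
    """
    distortions = np.sum((codebook - x) ** 2, axis=1)
    return int(np.argmin(distortions))
```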

To reduce the search complexity, TSVQ is widely used to increase the encoding speed. A binary tree with $d$ stages is said to have breadth 2 and depth $d$. The basic structure of a binary tree search of depth 3 is shown in Fig. 1, where each node is associated with a test vector $C_i$. A node is said to be at depth $L$ if it can be reached from the root node through $L$ stages. Each terminal node corresponds to a codeword in the codebook.

Given an input vector, the encoder determines the corresponding codeword at a leaf node of the tree. Referring to Fig. 1, the traditional (available) encoding process of TSVQ for an input vector $X$ is as follows.

1. Set $k \leftarrow 0$.
2. Set $l \leftarrow 2k + 1$ and $r \leftarrow 2k + 2$.
3. Calculate $d(X, C_l)$ and $d(X, C_r)$.
4. If $d(X, C_r) \geq d(X, C_l)$, set $k \leftarrow l$. Otherwise set $k \leftarrow r$.
5. Repeat steps 2 to 4 until a terminal node is reached.
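The steps above map directly onto code. The following sketch assumes the complete binary tree of test vectors is stored as a flat array indexed as in the steps (root at index 0, children of node $k$ at $2k + 1$ and $2k + 2$, leaves holding the codewords); this layout and the names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def tsvq_encode(x, tree):
    """Traditional TSVQ encoding over a complete binary tree.

    x:    input vector of shape (k,)
    tree: array of shape (num_nodes, k); node k's children are at
          indices 2k + 1 and 2k + 2 (assumed flat-array layout)

    Returns the index of the terminal (leaf) node that is reached.
    """
    num_nodes = tree.shape[0]
    k = 0
    while 2 * k + 1 < num_nodes:            # step 5: stop at a terminal node
        l, r = 2 * k + 1, 2 * k + 2         # step 2
        d_l = np.sum((x - tree[l]) ** 2)    # step 3: two distortion
        d_r = np.sum((x - tree[r]) ** 2)    #         computations per level
        k = l if d_r >= d_l else r          # step 4
    return k
```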

From the encoding process described above, it is clear that we need $2\log_2 N$ distortion computations and $\log_2 N$ comparisons to locate a codeword from a codebook of $N$ codewords. The major effort of encoding is calculating the distortions. In the next section, we present some methods to reduce the computational complexity.

3. Fast search algorithms

3.1. Fast comparison

For a given node, the key encoding step of TSVQ is to determine which child has the lower distortion. For a binary tree, we present a fast comparison process in this section. Referring to Fig. 1, let $C_i$ be the test vector associated with node $i$ of a binary tree. Suppose that nodes $r$ and $l$ are the two child nodes of node $i$, where $l = 2i + 1$ and $r = 2i + 2$.

Let $C_m$ and $C_d$, respectively, be the mean and difference vectors of $C_r$ and $C_l$. That is,

$$C_m = (c_{m1}, \ldots, c_{mk})^t = (C_r + C_l)/2, \qquad C_d = (c_{d1}, \ldots, c_{dk})^t = C_r - C_l \qquad (2)$$

It is easy to show that

$$(X - C_m)^t C_d \geq 0 \qquad (3)$$

implies that

$$d(X, C_r) \leq d(X, C_l) \qquad (4)$$

since expanding Eq. (1) gives $d(X, C_l) - d(X, C_r) = 2(X - C_m)^t C_d$, where

$$(X - C_m)^t C_d = \sum_{i=1}^{k} (x_i - c_{mi}) c_{di} \qquad (5)$$

From Eqs. (1) and (5), it is easy to see that computing the inner product of two vectors has the same computational complexity as computing their distortion. We now present the encoding algorithm with fast comparison (EAWFC) for an input vector $X$ as follows.

1. Set $k \leftarrow 0$.
2. Set $l \leftarrow 2k + 1$ and $r \leftarrow 2k + 2$.
3. Calculate $(X - C_m)^t C_d$.
4. If $(X - C_m)^t C_d \geq 0$, set $k \leftarrow r$. Otherwise set $k \leftarrow l$.
5. Repeat steps 2 to 4 until a terminal node is reached.
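EAWFC replaces the two distortion computations per level with the single inner-product test of Eq. (3). A minimal sketch follows, assuming the mean and difference vectors $C_m$ and $C_d$ of each node's two children are precomputed offline and stored per internal node (for instance in place of the two child test vectors themselves, which would keep total storage unchanged; this reading of the storage layout is our assumption).

```python
import numpy as np

def eawfc_encode(x, c_mean, c_diff, num_nodes):
    """EAWFC: one inner product and one comparison per level.

    c_mean[k], c_diff[k]: precomputed (C_r + C_l)/2 and C_r - C_l for
    the two children of internal node k (assumed precomputed offline).
    """
    k = 0
    while 2 * k + 1 < num_nodes:                    # stop at a terminal node
        l, r = 2 * k + 1, 2 * k + 2                 # step 2
        # steps 3-4: Eq. (3) holds iff d(X, C_r) <= d(X, C_l), Eq. (4)
        k = r if np.dot(x - c_mean[k], c_diff[k]) >= 0 else l
    return k
```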

From the modified encoding process described above, the EAWFC needs $\log_2 N$ inner-product computations (each of the same cost as a distortion computation) and $\log_2 N$ comparisons to locate a codeword from a codebook of $N$ codewords. That is, the computational complexity of the EAWFC is almost half of that of the traditional encoding process of TSVQ.

3.2. Fast search

There already exist some algorithms that speed up the encoding process of VQ by reducing the search complexity [6-10]. Assume that we want to determine the vector closest to an input vector $X$ from a set of vectors $\{C_i : i = 1, 2, \ldots, N\}$. As shown in Fig. 2, if the distance between a vector $C_i$ and the vector $X$ is smaller than half of the minimum distance between $C_i$ and the other codewords, then $C_i$ must be the best match (the closest vector) to $X$. Thus, we may stop the searching process and choose $C_i$ as the closest vector when $C_i$ satisfies

$$d(X, C_i) < 0.5 \min\{d(C_i, C_j) : j = 1, 2, \ldots, i-1, i+1, \ldots, N\} \qquad (6)$$

The above formula can also be rewritten as

$$d(X, C_i) < 0.5 r_i \qquad (7)$$

J.Z.C. LA/Image and Vision Computing 15 (1997) 867-871 869

where $r_i$ is the minimum distance between $C_i$ and $C_j$, $j = 1, 2, \ldots, i-1, i+1, \ldots, N$.

Fig. 2. The vector $C_i$ and its nearest vector $C_j$.
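The radii $r_i$ depend only on the codewords, so they can be computed once offline. A sketch under that assumption follows (names are illustrative). Note that the half-distance rule of Eqs. (6) and (7) is a geometric statement about Euclidean distances, so when working with the squared distortion of Eq. (1) the test should compare against $(0.5 r_i)^2$.

```python
import numpy as np

def nearest_neighbor_radii(codebook):
    """r_i = min over j != i of the Euclidean distance between C_i and
    C_j, used in the Eq. (7) test. O(N^2) cost, but paid once offline."""
    n = codebook.shape[0]
    radii = np.empty(n)
    for i in range(n):
        sq = np.sum((codebook - codebook[i]) ** 2, axis=1)
        sq[i] = np.inf                  # exclude C_i itself
        radii[i] = np.sqrt(sq.min())
    return radii
```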

The search process in the encoding process of TSVQ usually starts from the root node, passes through a series of nodes (one node for each level of the tree) and ends at a leaf node. However, this kind of search is not always necessary. Owing to spatial continuity, neighboring blocks in many cases have the same reproduction vectors; that is, neighboring blocks may follow the same path map describing how the encoder went from the root node to the terminal node. In other words, we may use Eq. (7) to check whether a vector is the closest vector to an input vector $X$, and in this way avoid always starting the encoding of $X$ from the root node of the tree. A similar idea was used by Neuhoff and Moayeri [9] and Orchard [10] to speed up the search for codewords. We now present the encoding algorithm with fast search (EAWFS) for a set of input vectors $\{X_i : i = 1, 2, \ldots, N\}$ as follows.

1. Set $i \leftarrow 1$.
2. If $i = 1$, set $k \leftarrow 0$ and go to step 4. Otherwise check whether $d(X_i, C_{i-1}) < 0.5 r_{i-1}$, where $C_{i-1}$ is the closest test vector to $X_{i-1}$ at level $c$ and $r_{i-1}$ is its minimum distance to the other test vectors at that level. At this step, we check whether $C_{i-1}$ is also the closest test vector to $X_i$ at level $c$.
3. If $d(X_i, C_{i-1}) < 0.5 r_{i-1}$, set $k$ to the index of the node associated with $C_{i-1}$. Otherwise set $k \leftarrow 0$.
4. Set $l \leftarrow 2k + 1$ and $r \leftarrow 2k + 2$.
5. Calculate $(X_i - C_m)^t C_d$.
6. If $(X_i - C_m)^t C_d \geq 0$, set $k \leftarrow r$. Otherwise set $k \leftarrow l$.
7. Repeat steps 4 to 6 until a terminal node is reached, and set $i \leftarrow i + 1$.
8. Repeat steps 2 to 7 if $i \leq N$.
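Putting the pieces together, here is a sketch of EAWFS under the same assumed flat-tree layout as the EAWFC sketch above; `radii` holds the Eq. (7) radii of the level-c test vectors (computed against the other test vectors at that level), and all names are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def eawfs_encode(blocks, tree, c_mean, c_diff, radii, c):
    """EAWFS: encode a sequence of input vectors (image blocks), reusing
    the previous block's level-c test vector as the starting point when
    the Eq. (7) check confirms it is still the closest at that level.

    tree:   test vectors in flat complete-binary-tree layout
    radii:  indexed by node number; radii[m] is r for the level-c test
            vector at node m, w.r.t. the other level-c test vectors
    """
    num_nodes = tree.shape[0]
    first_c, last_c = 2 ** c - 1, 2 ** (c + 1) - 2  # node range of level c
    indices, prev = [], None        # prev: level-c node of previous block
    for x in blocks:
        k = 0                       # steps 2-3: default is the root
        if prev is not None:
            d = np.sqrt(np.sum((x - tree[prev]) ** 2))
            if d < 0.5 * radii[prev]:   # Eq. (7): still closest at level c
                k = prev                # skip levels 0 .. c-1
        c_node = prev
        while 2 * k + 1 < num_nodes:    # steps 4-7: EAWFC descent
            l, r = 2 * k + 1, 2 * k + 2
            k = r if np.dot(x - c_mean[k], c_diff[k]) >= 0 else l
            if first_c <= k <= last_c:  # remember the level-c node passed
                c_node = k
        prev = c_node
        indices.append(k)
    return indices
```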

For each current input vector, the algorithm first checks (step 2) whether the test vector at level c of the tree that was closest to the previous input vector is still the closest test vector at the same level. If the checking condition is confirmed, the encoding process starts at level c of the tree; otherwise, it starts at the root node. The advantage of the EAWFS is that the calculations of $(X_i - C_m)^t C_d$ in step 5 from the root node down to level c may be avoided. However, the method needs an extra distortion calculation of $d(X_i, C_{i-1})$.

4. Experimental results

To evaluate the performance of the proposed algorithms, four real images (peppers, lena, baboon, airplane) were used as our training set to generate codebooks. In the following tables, the encoding algorithm with fast comparison (EAWFC) and the encoding algorithm with fast search (EAWFS) are compared with the available encoding scheme of TSVQ. The experiments were run on a PC with an Intel 486 33 MHz processor and 16 MB of RAM.

Table 1 gives the execution times of encoding the image 'lena' for the available encoding algorithm, EAWFC, and EAWFS with the encoding process started at level 3. The results show that the encoding time of EAWFC is somewhat higher than one half of the computing time of the available encoding algorithm for TSVQ. This seems to contradict the analysis given in Section 3, which says that the encoding time of EAWFC should be about one half of that of the traditional algorithm; the reason is that both methods spend the same time writing codewords to a file, and this time is part of the measured execution time. Table 1 also reveals that EAWFC and EAWFS with c = 3 take the same time to encode 'lena'. Table 2 gives the computing times of encoding the image 'tiffany', which is outside the training set. The conclusions drawn for Table 1 also hold for this example.

Table 1
The execution time of encoding a real image (lena) for 3 different algorithms^a

Number of codewords   Available encoding algorithm (seconds)   EAWFC (seconds)   EAWFS with c = 3 (seconds)   Speedup compared to traditional algorithm
64                    6                                        4                 4                            1.50
256                   8                                        5                 5                            1.60
1024                  10                                       6                 6                            1.67

^a The training set with dimension 16 comprises 4 real images (peppers, lena, baboon, airplane).

From Tables 1 and 2 we find that more encoding time is saved when more codewords are used: the saving is 4 seconds in the case of 1024 codewords, while only 2 seconds are saved with 64 codewords. It is noted that codebooks with more than 1024 codewords are seldom used in practical applications.

To see the effect of the training set on the performance of the proposed methods, seven real images were used to generate the codebook. Table 3 gives the execution times of encoding 'tiffany'. Comparing Tables 2 and 3, we find that the encoding times using four and seven images, respectively, as the training set are the same. This is because the codewords generated using four and seven images differ little.



Table 2
The execution time of encoding a real image (tiffany) for 3 different algorithms^a

Number of codewords   Available encoding algorithm (seconds)   EAWFC (seconds)   EAWFS with c = 3 (seconds)   Speedup compared to traditional algorithm
64                    6                                        4                 4                            1.50
256                   8                                        5                 5                            1.60
1024                  10                                       6                 6                            1.67

^a The training set with dimension 16 comprises 4 real images (peppers, lena, baboon, airplane).

Table 3
The execution time of encoding a real image (tiffany) for 3 different algorithms^a

Number of codewords   Available encoding algorithm (seconds)   EAWFC (seconds)   EAWFS with c = 3 (seconds)
64                    6                                        4                 4
256                   8                                        5                 5
1024                  10                                       6                 6

^a The training set with dimension 16 comprises 7 real images (peppers, lena, baboon, airplane, milk, bridge, zelda).

Table 4
The execution time (in seconds) of encoding real images for different algorithms with the number of codewords = 256^a

Image      Available encoding algorithm   EAWFS with c = 3   EAWFS with c = 4   EAWFS with c = 5   EAWFS with c = 6
Lena       8                              5                  4                  6                  5
Airplane   8                              5                  6                  5                  5
Baboon     8                              5                  5                  5                  5
Peppers    8                              5                  6                  6                  5
Tiffany    8                              5                  5                  5                  5
Milk       8                              5                  5                  4                  4

^a The training set with dimension 16 comprises 4 real images (peppers, lena, baboon, airplane).

Table 5
The execution time (in seconds) of encoding real images for different algorithms with the number of codewords = 1024^a

Image      Available encoding algorithm   EAWFS with c = 4   EAWFS with c = 5   EAWFS with c = 6   EAWFS with c = 8
Lena       10                             7                  7                  6                  –
Airplane   10                             7                  6                  6                  –
Baboon     10                             7                  6                  6                  –
Peppers    10                             7                  7                  6                  –
Tiffany    10                             6                  6                  6                  –
Milk       10                             6                  5                  5                  –

^a The training set with dimension 16 comprises 7 real images (peppers, lena, baboon, airplane, milk, bridge, zelda).


Table 4 gives the performance of the EAWFS with encoding started at various levels of the tree for the case in which the number of codewords is 256. The results show that the EAWFS is better than the traditional algorithm for encoded images both inside and outside the training set. Table 5 gives the encoding times of the EAWFS with encoding started at various levels of the tree when the number of codewords is 1024. Tables 4 and 5 show that the performances of the EAWFC and EAWFS are about the same. However, for smooth images with few edges, such as 'tiffany' and 'milk', the EAWFS has shorter encoding times, because neighboring blocks of a smooth image are highly correlated. In the best case, the EAWFS with c = 8 reduces the encoding time from 10 seconds for the traditional encoding algorithm to 4 seconds, while the EAWFC only reduces it from 10 seconds to 6 seconds.

Table 6 gives the average distortion per vector for the available encoding algorithm of TSVQ and the EAWFS. It is noted that the EAWFC and the available encoding algorithm have the same distortion by construction, since the sign test of Eq. (3) is equivalent to the distortion comparison of Eq. (4). The EAWFS encodes an image by checking whether a test vector at level c of the tree is the closest vector to an input vector X.


Table 6
The average distortion of encoding real images for different algorithms with the number of codewords = 256^a

Image      Available encoding algorithm   EAWFS with c = 3   EAWFS with c = 4   EAWFS with c = 5   EAWFS with c = 8
Lena       156.23                         156.23             156.23             156.23             156.23
Airplane   193.75                         193.75             193.75             193.75             193.75
Baboon     561.98                         561.98             561.98             561.98             561.98
Peppers    160.12                         160.12             160.12             160.12             160.12
Tiffany    268.22                         268.22             268.22             268.22             268.22

^a The training set with dimension 16 comprises 4 real images (peppers, lena, baboon, airplane).

Therefore the EAWFS has the potential of obtaining a lower distortion than the traditional method and the EAWFC. However, Table 6 shows that the EAWFS with various values of c gives neither improvement nor degradation compared with either the traditional method or the EAWFC for five decoded images that are inside and outside the training set.

5. Conclusions

In this paper, we develop the EAWFS and EAWFC to speed up the encoding process of TSVQ. The computational complexity of the EAWFC is about half of that of the available method. For many real images, the EAWFS and EAWFC have about the same performance, and both approaches give a shorter encoding time than the available algorithm for TSVQ. However, for smooth images with few edges, the EAWFS may outperform the EAWFC. The EAWFS may theoretically obtain a lower distortion than the conventional method or the EAWFC; however, in our experiments these three methods produced the same distortions for real images both inside and outside the training set.

References

[1] J.Z.C. Lai, C.-S. Lo, Mean-removed classified vector quantization, IPPR Conference on Computer Vision, Graphics and Image Processing, Nantou, Taiwan, August 1993, pp. 152-159.
[2] C.F. Barnes, S.A. Rizvi, N.M. Nasrabadi, Advances in residual vector quantization of images: a review, IEEE Trans. Image Processing 5 (2) (1996) 226-262.
[3] P.C. Cosman, R.M. Gray, M. Vetterli, Vector quantization of image subbands: a survey, IEEE Trans. Image Processing 5 (2) (1996) 202-225.
[4] R.M. Gray, H. Abut, Full search and tree searched vector quantization of speech waveforms, Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Paris, May 1982, pp. 593-596.
[5] X. Wu, K. Zhang, A better tree-structured vector quantizer, in: J.C. Tilton (Ed.), Space and Earth Science Data Compression Workshop, Snowbird, Utah, April 1991, NASA, Washington, DC, 1991, pp. 392-401.
[6] C.M. Huang, Q. Bi, G.S. Stiles, R.W. Harris, Fast full search equivalent encoding algorithms for image compression using vector quantization, IEEE Trans. Image Processing 1 (3) (1992) 413-416.
[7] I. Katsavounidis, C.-C.J. Kuo, Z. Zhang, Fast tree-structured nearest neighbor encoding for vector quantization, IEEE Trans. Image Processing 5 (2) (1996) 398-404.
[8] T.W. Chiang, Image compression using classified vector quantization with mean value prediction, Master Thesis, Institute of Automatic Control, Feng-Chia University, 1993.
[9] D.L. Neuhoff, N. Moayeri, Tree searched vector quantization with interblock coding, Proc. 1988 Conf. on Information Science and Systems, March 1988, pp. 781-783.
[10] M.T. Orchard, A fast nearest-neighbor search algorithm, Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, Toronto, ON, May 1991, pp. 2297-2310.