color image coding and indexing using btcpszqiu/online/ieee-trans-ip-jan-2001.pdf · block...

1

Abstract-- Image indexing and retrieval is a very importantresearch topic which has attracted great interests in recent years.Image coding is a successful field which has been studied for morethan three decades. All image coding techniques attempt toextract visually most important features and represent them ascompactly as possible. Content-based image indexing involves theextraction of compact and discriminative features and uses themas indexing keys. We believe the two fields, image coding andimage indexing, are closely related, especially, image indexing canexploit fruitfully the results and experiences of over 30 yearsresearch in the field of image coding. This paper presents a newapplication of a well-studied image coding technique, namelyblock truncation coding (BTC). It is shown that BTC can not onlybe used for compressing colour images, it can also be convenientlyused for content-based image retrieval from image databases. Westudy a number of single bitmap BTC techniques for colour imagecoding in different colour spaces. From the BTC compressedstream (without performing decoding), we derive two imagecontent description features, one termed the Block Colour Co-occurrence Matrix (BCCM) and the other Block PatternHistogram (BPH). We use BCCM and BPH to compute thesimilarity measures of images in the retrieval of images indatabase applications. Experimental results demonstrate thatBCCM and BPH are comparable to similar state of the arttechniques in image retrieval.

Index Terms--BTC, color imaging, image database, imageindexing, image coding, content-based image retrieval

1. INTRODUCTION

The rapid expansion of the Internet and fast advancement in colourimaging technologies have made digital colour images more and more readilyavailable to professional and amateur users. The large amounts of imagecollections available from a variety of sources (digital camera, digital video,scanner, the Internet etc.) have posed increasing technical challenges tocomputer systems to store/transmit and index/manage the image dataeffectively and efficiently to make such collections easily accessible.

The storage and transmission challenge is tackled by imagecoding/compression, which has been studied for more than 30 years andsignificant advancements have been made. Many successful, efficient andeffective image-coding techniques have been developed and the body ofliterature on image coding is huge. Well-developed and popular internationalstandards, e.g. [6], on image coding have also long been available and widelyused in many applications.

The challenge to image indexing/management is studied in the contextof image database, which has also been actively researched by researchersfrom a wide range of disciplines including those from computer vision, imageprocessing and traditional database areas for over a decade [14]. Oneparticularly promising approach to image database indexing and retrieval isthe query by image content (QBIC) method [1], whereby the visual contentsof the images, such as colour distribution (colour histogram), textureattributes and other image features are extracted from the image usingcomputer vision/image processing techniques and used as indexing keys. Inan image database, these visual keys are stored along with the actual imagery

Manuscript received February 1, 2001G. Qiu is with the School of Computer Science, University of Nottingham,

United Kingdom (telephone: +44 115 8466507, e-mail: [email protected]).

data, and image retrieval from the database is based on the matching of themodels visual keys with those of the query images. Because extra informationhas to be stored with the images, traditional approach to QBIC is not efficientin terms of data storage. Not only is it inefficient, it is also inflexible in thesense that image matching/retrieval can only based on the pre-computed setof image features.

Many image coding methods developed over the years are essentiallybased on the extraction and retention of the most important (visual)information of the image. The retained important information, such as theDCT coefficients of JPEG can be used for image indexing and objectrecognition [2]. However, since JPEG and other similar methods are notexplicitly designed for image indexing purpose, models and features have tobe derived from the transform coefficients, which generally involvescomplicated and complex computation, and also leads to an expansion ofdata. For example, it has been demonstrated that colour is an excellent cue forimage indexing [3], however, it is difficult to explicitly exploit colourinformation from the transform coefficients without decoding. On the otherhand, non-transform based image coding can have image features such ascolour more easily available. For example, recent work on colour imagecoding using vector quantization has demonstrated that colour as well aspattern information can be readily available in the compressed image stream(without performing decoding) to be used as image indices for effective andefficient image retrieval [4].

Block Truncation Coding (BTC) is a relatively simple image codingtechnique developed in the early years of digital imaging more than 20 yearsago [5]. Although it is a simple technique, BTC has played an important rolein the history of digital image coding in the sense that many advanced codingtechniques have been developed based on BTC or inspired by the success ofBTC. Even though the compression ratios achievable by BTC have long beensurpassed by many newer image-coding techniques such as DCT (JPEG) [6]and Wavelet [7], the computational simplicity of BTC has made it and BTC-like image coding techniques attractive in applications whereby real time fastimplementation is desirable. Further more, with the rapid advancement inprocessor speed, storage device technology and faster network connection,low bit rate coding is no longer a critical factor in many practicalapplications. Based on a certain trade-off, higher bit rate and complexity canbe acceptable. On the other hand, with very large image collections becomingmore and more common, effectively managing large image database, makingimages easily accessible have becoming a challenge. Modern imagingsystems not only require efficient coding, but also easy manipulation,indexing, and retrieval, the so-called “4th criterion” in image coding [18].

In this paper we shall show another attractive feature of BTC-typesimage coding methods. We shall demonstrate that BTC can not only be usedfor image coding, it can also be conveniently and effectively used in contentbased image indexing and retrieval for image database applications. In section2, we first briefly review the original BTC idea and its application in colourimage coding; we then describe a number of implementations of single bitmap BTC for colour image coding. In section 3, we develop a method toderive image content description features directly from the BTC streamwithout performing decoding. Two types of features are derived, onecharacterises the statistics of the colour correlation within the blocks, and theother characterises the statistics of the spatial patterns of the blocks. Insection 4, we present experimental results on content-based image retrieval todemonstrate that the new method is comparable to state of the art techniques.Concluding remarks are given in Section 5.

2. BTC FOR COLOUR IMAGE CODING

Block Truncation Coding (BTC) was first developed in 1979 for grey-scale image coding [5]. This method first divides the image to be coded intosmall non-overlapping image blocks (typically of size 4 x 4 pixels to achievereasonable quality). The small blocks are coded one at a time. For each block,the original pixels within the block are coded using a binary bit-map the same

Color Image Coding and Indexing using BTCGuoping Qiu, Member, IEEE

2

size as the original block and two mean pixel values1. The method firstcomputes the mean pixel value of the whole block, and then each pixel in thatblock is compared to the block mean. If a pixel is greater than or equal to theblock mean, the corresponding pixel position of the bitmap will have a valueof 1, otherwise it will have a value of 0. Two mean pixel values, one for thepixels greater than or equal to the block mean and the other for the pixelssmaller than the block mean are also calculated. At decoding stage, the smallblocks are decoded one at a time. For each block, the pixel positions wherethe corresponding bitmap has a value of 1 is replaced by one mean pixel valueand those pixel positions where the corresponding bitmap has a value of 0 isreplaced by another mean pixel value. Formally, let x(i, j,), i = 1,2,…,m, j =1, 2, …, n be the pixel values of an m x n image block to be coded usingBTC, the block mean M is first calculated as

( )∑∑= =×

=m

i

n

j

jixnm

M1 1

,1

(1)

A binary bitmap b(i,j,), i = 1,2,…,m, j = 1, 2, …, n and two mean valuesM1, M2 are computed as

( )( )

( )

<

≥=

Mji, xif 0

Mji, xif 1, jib (2)

( )( ) ( )∑∑

∑∑ = =

= =

=m

i

n

j

n

i

m

j

jixjibjib

M1 1

1 1

1,,

,

1(3)

( )( )( ) ( )∑∑

∑∑ = =

= =

−−×

=m

i

n

j

n

i

m

j

jixjibjibnm

M1 1

1 1

2,,1

,

1(4)

The decoded output of the block y(i,j,), i = 1,2,…,m, j = 1, 2, …, n is asfollows

( )( )

( )

∀

=

== ji

M

Mjiy ,

0ji,b if

1ji,b if ,

2

1

(5)

It was quite nature to extend BTC to multi-spectrum images such ascolour images, and a number of authors have suggested various methods [8 -10]. The simplest extension was to view a colour image as consists of 3independent grey scale images and apply BTC to each colour planeindependently. The disadvantage of this method is that three bit planes areneeded hence the compression ratios achievable are low. A more aggressiveapproach is to exploit the redundancy exits between spectrum bands by usingonly a single bit map for all the spectrum bands. In this work, we study singlebit plane BTC.

2.1 Colour Space SelectionMost colour images are recorded in RGB space, which is perhaps the

most well known colour space. However, RGB colour space is often notsuited for image processing purposes due to a number of reasons. Firstly,there exits significant information redundancy amongst the colour planes,therefore it may not be suitable for image processing tasks such as imagecoding. Secondly, it is not a uniform colour space in the sense that the metricdifferences (e.g. Euclidean distance) of two colours in this space do not relateto the perceived difference of the two colours. This is very inconvenient whena computational measure of colour differences is required. Thirdly, it is verydifficult to predict the visual appearance of the image using RGBs signals; forexample it is very difficult to determine from the RGBs values whether animage is well contrast, colourful or visually appealing. This is veryinconvenient when the task is to enhance the image to make it “look good”.Fortunately, researchers have developed various colour spaces that are suitedfor different image processing tasks. In this paper, we have used threedifference colour spaces in our experiments. Apart from RGB space, theYCbCr space defined by the International Radio Consultative Committee(CCIR) for HDTV systems [11] and the CIE L*a*b* uniform colour space[12] are also used.

1 In the original implementation, the block mean and the variance of thepixels are used to preserve the first and second moments of the block. Thedescriptions here follow a later version of BTC, which was shown to givebetter performance [8], and suitable for our application.

The relation between RGB and YCbCr is as follow [11]

−−−−=

BGR

CCY

r

b

0813.04187.0500.0500.03313.01687.0114.0587.0299.0

(6)

The CIE L*a*b* and RGB is related in the following way

−

=

−

=

−

=

nn

nn

n

ZZ

fYY

fb

YY

fXX

fa

YY

fL

200*

500*

16116*

(7)

where BGRZBGRYBGRX

5943.50565.00000.00601.05907.40000.11300.17518.17690.2

++=++=++=

( )

≤+

>=

0088560 11616

7877

0088560 31

.x x.

. x xxf

nnnZYX ,, are the tri-stimuli of the white stimulus [12].

2.2 Single Bit Plane BTC for Colour Image CodingAs described previously, BTC divides the image to be coded into small

blocks and code them one at a time. For single bit map BTC of colour image,a single binary bitmap the same size as the block is created and two coloursare computed to approximate the pixels within the block. In this work we usetwo approaches to create the binary bitmap, one based on thresholding andthe other based on a clustering technique.

Thresholding Bitmap Generation in RGB Space: To create a binarybitmap in the RGB space, an inter-band average image (IBAI) is first createdand a single scalar value is found as the threshold value. The bitmap is thencreated by comparing the pixels in the IBAI with the threshold value.Formally, Let X = r(i,j), g(i,j), b(i,j), i = 1, 2, …, m, j = 1, 2, …, n be an mx n colour image block in RGB space. Let I = ib(i,j), i = 1, 2, …, m, j = 1, 2,…, n be the inter-band average image, then we have

( ) ( ) jijibjigjirjiib , ),(),(),(31

, ∀++= (8)

and the threshold is computed as the mean of I, i.e.

( )∑∑= =×

=m

i

n

j

jiibnm

T1 1

,1

(9)

The binary bitmap b(i,j,), i = 1,2,…,m, j = 1, 2, …, n, is computed as

( )( )

( )

<

≥=

Tji,ib if 0

Tji,ib if 1, jib (10)

Thresholding Bitmap Generation in YCbCr Space: This is similar tothat of RGB space except that Y plane is used to create the threshold and thebinary bitmap. Here Y will replace the inter-band average image I and the restof the procedures are identical.

Thresholding Bitmap Generation in L*a*b* Space: This is also similarto that of RGB space except that here L* plane is used to create the thresholdand the binary bitmap. In this space L* will replace the inter-band averageimage I and the rest of the procedures are again identical.

After the creation of the bitmap, two representative (mean) colours arethen computed. Let Z = z1(i,j), z2(i,j), z3(i,j), i = 1, 2, …, m, j = 1, 2, …, nbe an m x n colour image block, where zi represents the ith colour plane inany of those three colour spaces described above. Two mean colours, MC1 =(c11, c12, c13), and MC2 = (c21, c22, c23) are computed as follows

( )( ) ( )∑∑

∑∑ = =

= =

=m

i

n

j

kn

i

m

j

kjizjib

jibc

1 1

1 1

1,,

,

1k = 1, 2, 3 (11)

3

( )( )( ) ( )∑∑

∑∑ = =

= =

−−×

=m

i

n

j

kn

i

m

j

kjizjib

jibnmc

1 1

1 1

2,,1

,

1k = 1, 2, 3 (12)

The decoded output of the block Y = y1(i,j), y2(i,j), y3(i,j), i = 1, 2, …, m, j =1, 2, …, n is as follows

( )( )

( )kji

c

cjiy

k

k

k,,

0ji,b if

1ji,b if ,

2

1

∀

=

== (13)

Data Clustering Approach to BTC Coding of Colour Image: Let Z =z1(i,j), z2(i,j), z3(i,j), i = 1, 2, …, m, j = 1, 2, …, n be an m x n colour imageblock, where zi represents the ith colour plane in any of those three colourspaces (RGB, YCbCr or L*a*b*) described above. Each of the m x n pixels inthe block can be regarded as a point in the 3D colour space and the BTCproblem can be reformulated as a data clustering problem. The task is tocluster the m x n points in the 3D colour space into two clusters. Dataclustering has been well studied, well-know algorithms such K-means, fuzzyk-means and neural networks [13] can be used to cluster the data. Since thenumber of points is typically small (16 data points if a 4 x 4 block were used),we use the simple k-means algorithm [13] in this paper. The tworepresentative colours MC1 = (c11, c12, c13), and MC2 = (c21, c22, c23) arefound using an iterative procedure as follows:

Step 1 Initialise MC1(0) = (c11(0), c12(0), c13(0)), and MC2(0) = (c21(0),c22(0), c23(0)) to two of the randomly picked pixels from the block

Step2 Assign all the pixels in the block, Z(i,j) to one of the two classesaccording

( )1

, Ω∈jiZ if ( ) ( ) ( ) ( )tMCjiZtMCjiZ21

,, −≤−

( )2

, Ω∈jiZ if ( ) ( ) ( ) ( )tMCjiZtMCjiZ21

,, −>−

Step 3 Update the mean colours according to

( ) ( )( )

( ) ( )( )∑

∑

Ω∈∀

Ω∈∀

Ω=+

Ω=+

2

1

,2

2

,1

1

,1

1

,1

1

jiZ

jiZ

jiZtMC

jiZtMC

where 1

Ω and 2

Ω denote the number of pixels in Ω1 and Ω2

respectively.

Step 4 Repeat by going to Step 2.

After the iterative process has converged (we found 3 iterations workvery well) the two mean colours are fixed. The binary bitmap can now beeasily generated as

( )( ) ( )

( ) ( )

−>−

−≤−=

21

21

,, if 0

,, if 1,

MCjiZMCjiZ

MCjiZMCjiZjib (14)

Decoding is the same as (13).

Please note we always name the mean colour has a larger brightness as MC1,this is so because we will exploited this in the next section to construct imagecontent description features.

2.3 Post-BTC CodingThe compression ratios achievable by the above-described BTC coding

scheme are not very high. In the case whereby 4 x 4 block size is used,assuming the original image is 24-bit true colour image, and 24 bits are usedto code each of the two representative colours, each block will take(16+24+24) bits to code thus achieving a compression of 6 : 1. Of course,there are redundancies can be exploited to achieve higher compression ratios.However, this is beyond the interest of this paper, readers are referred to, e.g.[9] [10] for various techniques applicable to post BTC coding for reducingthe bit rates and achievable compression ratios.

However, since the images in our application are visual images, i.e., thefinal judgement of the image quality lies on the human observers. Therefore,objective image quality measure, such as mean squared errors which does notrelate proportionally to the visual quality of the image, is not a meaningfulmeasure. Through psychovisual experiment, we found that the visual qualityof the BTC coded images were not greatly affected by colour quantizing the

mean colours [19]. Therefore, for each BTC coded image, a colour table(palette) [20] consists of 256 colours are constructed and stored. The meancolours are coded (8 bits per mean colour) as the indices of the colours in thecolour table that are the closest to the mean colours. Therefore the bit rate inour scheme is 2 bits per pixel or 12:1 compression, plus 768 bytes (256 x 3bytes) colour table overhead in each image.

3. DERIVING IMAGE CONTENT DESCRIPTION FEATURES FROMBTC CODE

It is clear from above descriptions of single bit map BTC for colourimage coding that the BTC code for each small block consists of two coloursand a binary bitmap. In this section, we describe methods to derive imagecontent description features from the mean colours and the bitmap pattern forimage retrieval in database applications.

Content based image indexing and retrieval for image database is animportant topic of research which has attracted extensive interests in recentyears [1, 14]. Early approaches used colour alone for indexing [3]. Thisapproach has been very successful and is extensively used in many of today’sresearch and commercial systems. However, it is well known that the biggestdrawback of colour histogram based method is that histogram is a globalmeasure, it does not contain any spatial information. Recently researchershave developed a better and more discriminative technique known as theColour Correlogram (CC) technique [15], which has been shown to providesignificant improvement over the original colour histogram approach. ColourCorrelogram is very similar to co-occurrence matrix [16] developed some 20year ago for grey scale texture classification. Formally, the CC of imageZ(x,y), x = 1, 2…M, y = 1, 2, …, N is defined as

( ) ( )21212211

,max;),(|),(Pr,, yyxxkCyxZCyxZkjiCCji

−−=∈∈= (15)

where the original image Z(x,y) is quantized to J colours C1, C2, …, CJ andthe distance between the two pixel k∈ [minM, N] is fixed a priori. Inwords, the CC of an image is the probability of joint occurrence of two pixelssome k distance apart that one pixel belongs to colour Ci and the otherbelongs to colour Cj. The size of CC is ( )KLO 2 . To reduce storagerequirement, [15] concentrated on auto-correlogram whereby i = j and its sizeis of ( )LKO .

3.1 BTC Colour Co-occurrence Matrix (BCCM)Similar to CC described above, we exploit the code of BTC described in

the last section by introducing a similar statistical measure, which we termBTC or Block Colour Co-occurrence matrix (BCCM). As in the case of CC, aglobal colour table consists of J colours is first designed. It is important tounderstand that there is only ONE global colour table (its design will bedescribed in the next section) which is used by all images, the query as wellas all images in the database. This global colour table is different from thecolour tables used by each individual image in the post BTC coding describedin the last section. To make the distinction, we shall call the individualimage’s colour table local colour table. The BCCM of an image is constructedas follow:

Let Cg = Cg(0), Cg(1), …, Cg(J) be the J-colour global colour table, Cf

= Cf(0), Cf(1), …, Cf(255)be the 256-colour local colour table of a BTCcoded image. For a BTC coded image block, B, its bit stream consists of abitmap and two 8-bit integer numbers, i.e. B = b(x, y), I1, I2. These twointegers, I1, I2 ∈ [0, 255], are the colour indices of the two mean colourspointing to two colours in the local colour table. We construct a hash tableindexing the global colour indices with the local colour indices. Let IHash bea colour indices hash table which has the local colour indices, If, as its keyand the global colour indices, Ig, as its value. The hash table can beconstructed as:

( )[ ]

( ) ( )( )( )

=

−=⇒=−∈∀

gf

ggffJIggf

IIIHash

ICICIIIIHashg 1,0minarg and

for all If = 0, 1,2, …, 255 (16)i.e, each local colour index is mapped to a global colour index based on aminimum distance criterion (it is possible that two or more local colourindices may be mapped to the same global index). This hash table can eitherbe constructed on-line or stored with the encoded image data which will incura slight overhead (we opt for storing this hash table in our experiments).

4

The BCCM of an image is defined as the probability of the co-occurrence of two mean colours within the BTC blocks. Formally

( ) ( ) [ ]10 , |Pr,21

,J-i,jjIIHashiIIHashjiBCCM ∈∀=== (17)

where I1 and I2 are the pair of local colour indices of the BTC mean colourswithin the same block. Equivalently, ( )jiBCCM , gives the probability that aBTC coded block having one mean colour MC1 (local colour quantizedcolour) being mapped to the i-th colour in the global colour table and othermean colour MC2 (local colour quantized colour) being mapped to the j-thcolour in the global colour table.

Obviously, since the mean colours of the block are coded into localcolour indices, the operation only involves hash table retrieval, and istherefore computationally very efficient. If an image is of the size M x N andthe block size used by the BTC is m x n, then the number of blocks (pairs ofcolours) used to compute BCCM is ( ) ( )nmNM ×× and can be constructed

“on-line” as and when it is required. The size of BCCM is ( )2JO . As havebeen noted in [15], BCCM is also very sparse and can be stored veryefficiently. However, since the computation is so trivia, BCCM is computed"on the fly" and there is no need to store it permanently in a image database.It is also worth noting that there is no need to decode the BTC code in thecomputation of BCCM.

3.2 Block Pattern HistogramWe further introduce another measure, termed block pattern histogram

(BPH) to characterise the image content. Firstly, we create a codebook ofbinary patterns for the bitmap of BTC code using binary vector quantization.Let Q = Q1, Q2, …QN be the bitmap codebook consists of N codewords andQi is an m x n binary pattern2. Let B = b(i, j), i = 1,2,…,m, j = 1,2,…,n bethe bitmap of the BTC code, the pattern index of the bitmap, PI, is the indexof the bitmap codeword that is most similar to B, formally defined as

( ) ( )jjI

QBDHP ,minarg∀

= (18)

where DH(B, Qj) denotes the Hamming distance between the two binarypatterns (vectors). The BPH is defined as

BPH( i ) = Pr(PI = i), i = 1,2,…, N (19)

For the binary vector quantizer, we have developed a robust binary vectorquantization scheme based on a neural network learning technique, theFrequency Sensitive Competitive Leaning (FSCL) [21], to design the bitmapcodebook. In the author’s experience, FSCL is the most reliable VQ codebookdesign technique and is described briefly here. Let X(t) be K dimensional(K = m x n) binary vectors, which form the training samples. The FSCLtechnique for designing a codebook, Qi, l =1,2,…N, of N codewordsfollows the following steps:

1. Initialise the codewords to Qi(0), i = 1, 2, …, N, to random 0 and 1patterns and set the counters associated with each codeword to 1,Counti(0) = 1, i = 1, 2, …, N

2. Present the training sample, X(t), where t is the sequential index, andcalculate the Hamming distance between X(t) and the codewords Di(t) =DH(X(t), Qi(t)), and modify the distance according to

D*i (t) = Counti(t) x Di(t) for all i =1,2,…N

3. Find j, such that D*j <= D*i, for all i = 1, 2, …, N, update the codewordand counter

Qj(t+1) = Qj(t) + (X- Qj(t))Countj(t+1) = Countj(t) + 1

4. Repeat by going to 2. The algorithm stops after a pre-set number ofiterations (10~20 iteration have been found to work very well)

4. EXPERIMENTAL RESULTS

4.1 Image Coding

2 Notice that there is only ONE (global) binary bitmap quantizer which isnot used in coding.

We first tested the single bitmap BTC for image coding. It is worthymentioning that our interest is not to compete with state of the art imagecoding/compression techniques for a fraction of dB or bpp advantage.Instead, our argument is that image coding techniques such as BTC, eventhough they can only achieve moderate compression (in our case 12 : 1whichmaybe acceptable in many applications as have been argued at the beginningof the paper), they may provide other advantages (in our case, provide aconvenient, efficient and effective way for the retrieval of images in databaseapplication as demonstrated later in this section) over state of the artcompression techniques. Nevertheless, it is interesting to see how well themethod can compress images. In the experiments, we have tried all themethods and colour space described in Section 2 on a variety of colourimages. In all the experiments, the block size used was 4 x 4 pixels, the twomean colours were each coded with 8 bit with a local colour table designedbased on the method described in [20] and therefore the compression ratiowas 12 : 1. No post processing was applied to the image. Visual assessmentwas applied to compare the performance of different colour spaces anddifferent techniques. Five images and 5 subjects (undergraduate studentswithout particular knowledge about the purpose of the exercise) were used inthe experiment. Each of the images were coded in RGB, YCbCr and L*a*b*spaces using both thresholding and data clustering methods, thus each imagehad six coded version which were printed in the same size on a single piece ofpaper. The subjects were asked to rate the visual quality of each version of thecoded image using following scales: Excellent, very good, good, satisfactoryand unsatisfactory. Overall, all methods work satisfactorily with the clusteringmethod in L*a*b* had a slight edge. The rating seemed to be affected by theimage contents with colourful image with large areas of busy textures andedges got higher rating and images that were less colourful and withhomogeneous areas got lower rating. Fig. 1 shows an example (BTC based onRGB and thresolding). As can be seen, the visual quality of BTC codedimages is satisfactory. It is also seen that at the same compression ratio, JPEGcompressed image has a better visual quality3.

Fig.1 An example of BTC coded colour image. Top-left: Original image;Top-right: BTC coded image with 24 bit mean colours (6:1 compression);Bottom-left: BTC coded image with 8 bit mean colours (12:1 compression);Bottom-right: JPEG compressed image at a compression ratio of 12:1

4.2 Image Retrieval in Database Applications

3 Please note that the due to the limitations of the reproduction facility(printer), the better quality of the JPEG image is not easily discernible in theprinted copy of this manuscript, but can be clearly seen on a CRT. Anelectronic version of this paper can be viewed inhttp://www.cs.nott.ac.uk/~qiu/Online/Publications.html

5

We can use the BTC coding described in this paper to construct imagedatabase which will enable image retrieval using content-based querying byexample images. Since the different BTC methods described in section 2 givevery similar performances, the results presented here are those based on RGBspace and thresholding. The images in the database are coded by BTC andfrom the coded stream, we derive the images’ BCCM and BPH. In order toconstruct these two image features, we first need to design a global colourquantization table and a global bitmap codebook. To do so, we have used twohundred and thirty five 512 x 512 pixel, 24 bits/pixel true colour textureimages from MIT Media Lab’s VisTex collection4. We designed a 64-colourglobal colour table (please see Appendix for the values of these 64 colours)using all the pixels of these images based on the FSCL [21] algorithm. Todesign the bitmap codebook, we created the training bitmap patterns using allthese images based on equation (10) and a 256-pattern codebook was created.Fig. 2 shows the 4 x 4 bitmap patterns of the 256 codebook. In constructingthe image database, we have also created a colour index hash table (15) foreach image.

Fig. 2 Block bitmap codebook patterns. These 256 4 x 4 bitmap patterns werederived using binary vector quantization trained with over 3.5 millionsamples.

To retrieve images from the database coded by BTC based on queryingby image example, the query image is first coded with BTC and its BCCMand BPH are derived, and then compared with those of the images in thedatabase. Those images that are most similar to the query image are returnedto the user. To compare the similarity of two images based on BCCM andBPH, we can use a relative distance measure [15]. Let BCCMp, BPHp andBCCMq, BPHq be the BCCM and BPH of images p and q respectively. Thesimilarity of the images is measured as the distances between the BCCMs andBPHs d(p,q) and is calculated as follows

( ) ( ) ( )( ) ( )

( ) ( )( ) ( )∑

∑

∀

∀

++

−

+++

−=

k qp

qp

CC qp

qp

kBPHkBPH

kBPHkBPH

jiBCCMjiBCCM

jiBCCMjiBCCMqpd

ji

1

,,1

,,,

2

,

1

λ

λ

(20)

where λ1 and λ2 are weighting constants.

Texture Image DatabaseWe first used a texture database consists of 120 difference texture classes, asubset of which is shown in Fig.3 as thumbnails. The database was formed by

4

http://vismod.www.media.mit.edu/vismod/imagery/VisionTexture/vistex.htm

dividing 120 single textured images (each image was judged by humanobservers as consisting of roughly a single type of colour texture) of size 300x 200 pixels into 6 100 x 100 non-overlapping sub-images, thus creating adatabase of 720 texture images. All 720 images were coded using BTC. In theexperiment, each of the 720 images in the database was used in turn as thequery image. For each query image q, the distances between q and images inthe database, pi, d(q, pi) where i = 1,2,…, 720, pi ≠ q were calculated andsorted in increasing order and the closest set of images were then retrieved. Inthe ideal case the top 5 retrievals should come from the same large image asthe query image. The performance was measured in terms of the retrieval rate(RR) defined as follow: Let q be the query image from texture class Τk, i.e.q∈Τk. and pj, j = 1, 2, 3, 4, 5, be the first 5 retrievals, the retrieval rate forquery image q is

( ) ( )∑=

=5

151

j

jpSqRR where S(pj) = 1, if pj ∈Τk and S(pj) = 0, if pj∉Τk (21)

As a comparison, the colour auto-correlogram of [15] was also implementedusing D = 1,3,5,7 exactly as those described in the original paper. Bothmethods used a single colour table consists of 64 colours.

Fig. 3 A sub set of the 120 texture classes used in the first experiment.

Table 1 shows the number of queries and their corresponding retrievalrates for different methods. For the colour correlogram method of [15], 482queries (out of 720) achieved an RR of 100%, 130 queries achieved an RR of80%, 68 queries achieved an RR of 60%, 33 queries achieved an RR of 40%and 7 queries only achieved an RR of 20%. For BCCM only (λ1 = 1 and λ2 =0), 512 queries (out of 720) achieved an RR of 100%, 115 queries achievedan RR of 80%, 60 queries achieved an RR of 60%, 25 queries achieved anRR of 60% and 8 queries only achieved an RR of 20%. Combining BCCMand BPH together (λ1 = 1 and λ2 = 1), 566 queries (out of 720) achieved anRR of 100%, 86 queries achieved an RR of 80%, 46 queries achieved an RRof 60%, 17 queries achieved an RR of 60% and 4 queries only achieved anRR of 20%. It is seen that the new method provides a better performance. Fig.4, 5, 6 show some examples of the retrieved images by various methods.

RETRIEVAL RATES

METHODS 5/5 4/5 3/5 2/5 1/5CC 482 130 68 33 7

BCCM only 512 115 60 25 8BCCM + BPH 566 86 46 17 4

Table 1. The number of queries and their retrieval rates for the BTC andcolour correlogram methods performed on a texture image database consistsof 720 images and 120 different texture classes. Each image was used in turnas the query image, and therefore 720 queries were performed. This tableshould be interpreted as follow: for CC, 482 queries achieved a retrieval rateof 100% and 130 queries achieved a retrieval rate of 80% etc…

6

(a)

(b)

(c)Fig. 4 Examples of top 48 retrieved images by different methods from a 720-texture image database. The first sub-image on the top left-hand corner is thequery image, and subsequent images from left to right and top to bottom areretrieved image ordered according to their similarity to the query image. (a)colour correlogram, (b) BTC with BCCM only, (c) BTC with BCCM andBPH

(a)

(b)


7

(a)

(b)


Photographic Image DatabaseIn this second experiment, we tested the method on a database consists

of 20,000 photographic colour images from the commercially available CorelPhoto Collections. We hand picked 96 pairs of similar images which wereembedded into the database. Fig. 7 shows a sub set of these images, whichwas divided into set A and set B. When using an image from set A as query,the goal was to retrieve the target image in set B, or vice versa. As acomparison, we have also implemented the colour correlogram method. Foreach querying operation, all images in the database are ranked according totheir similarity to the querying image. In the ideal case, the target imageshould be in the 1st rank. Table 2 shows the result of using set A as query andset B as target. It is seen that the colour correlogram method has more queriesretrieved the target in the first rank and all methods found similar number oftargets within the first 10 retrieved images. On the other hand, the averagerank of the new method is much higher. Notice also that whilst the colourcorrelogram method has the lowest rank at 7299 position, the new method hasthe lowest rank at 5126 position. Therefore, on the whole, the new methodhas a better performance. Fig. 7, 8, 9 show some examples of retrievedimages. It is seen that the new method performed as least as well as the colourcorrelogram method.

Set A

Set BFig. 7 A sub set of the 96 querying image pairs. These pairs are divided intoset A and B. For each image in set A there is a corresponding similar targetimage in set B

8

(a)

(b)

(c)Fig. 8 Examples of top 48 retrieved images by different methods from aphotographic image database consists of 20,000 images. The first sub-imageon the top left-hand corner is the query image, and subsequent images fromleft to right and top to bottom are retrieved image ordered according to theirsimilarity to the query image. (a) colour correlogram, (b) BTC with BCCMonly, (c) BTC with BCCM and BPH.

(a)

(b)


9

(a)

(b)


RanksMethods 1 <=10 <=30 <=100 >100 Lowest Mean

CC 72 81 85 87 9 7299 300BCCM Only 66 80 89 91 5 6864 96

BCCM + BPH 67 81 89 91 5 5126 77

Table 2, Image retrieval performances of colour correlogram and BTC on aphotographic image database consists of 20,000 images. 96 queries wereperformed. The table shows the number of queries and their target images’rank range in the returned list. The table should be interpreted as follow: forthe CC method, 72 queries found their corresponding target image in the firstrank, 81 queries found their corresponding target images within the first 10returned image etc…

5. CONCLUDING REMARKS

In this paper, we have presented a method for using a well known imagecoding technique to achieve both image coding and content based imageretrieval in the compressed domain. We studied a number of methods for theconstruction of the BTC code. Two image content description image featuresdirectly derived from the compressed stream have been developed. Todemonstrate the effectiveness of using the BTC code for image retrieval, wepresented experimental results using a texture database and a largephotographic image database. Results showed that the new method performedat least as well as, sometime even better than state of the art technique. Onesignificant advantage of the current method is that it achieves coding andretrieval simultaneously. As a final remark, image coding has been studied fora long time and has also been very successful. All image coding methodsessentially try to extract the most important visual information and representthem in compact manner. Indexing and retrieval for image databaseapplications have becoming increasingly important. We believe results andexperiences of more than 30 years image coding/compression research canplay a fruitful role in the development of effective and efficient methods forimage indexing and retrieval in large image database. We hope the presentpaper has shown a glimpse of such potential.

References

1. W. Niblack et al, “Querying images by content using color, texture andshape”, Proc. SPIE Conference on Storage and Retrieval for Image andVideo Databases, vol. 1908, pp. 173 – 187, 1993

2. W. B. Seales et al, “Object recognition in compressed imagery”, Imageand Vision Computing, 16, pp. 337-353, 1998

3. M. Swain and D. Ballard, "Color Indexing", International Journal ofComputer Vision, Vol. 7, pp. 11-32, Year 1991

4. G Qiu, “Image indexing using a coloured pattern appearance model”,SPIE Conference on Storage and Retrieval for Media Databases 2001,January 2001, San Jose, CA USA

5. E. J. Delp and O. R. Mitchell, “Image coding using block truncationcoding”, IEEE Trans on Communications, 27, 1335-1342, 1979

6. W. B. Pennebaker and J. L. Mitchell, JPEG still image compressionstandard, Van Nostrand Reinhold, New York, 1993

7. J. M. Shappiro, “Embedded image coding using zero trees of waveletcoefficients”, IEEE Trans Signal Processing, vol. 41, pp. 3445-3462,1993

8. M. D. Lema and O. R. Mitchell, “Absolute moment block truncationcoding and its application to colour images”, IEEE Trans onCommunications, vol. 32, pp. 1148-1157, 1984

9. T. Kurita and N. Otsu, “A method of block truncation coding for colorimage compression”, IEEE Trans. On Communications, vol. 41, pp.1270 –1274, 1993

10. Y. Wu and D. C. Coll, “Single bit-map block truncation coding of colorimages”, IEEE Journal on Selected Areas in Communications, vol. 10,pp. 952 – 959, 1992

11. CCIR, “Encoding parameters of digital television for studios”, CCIRRecommendation 601-2, Int. adio Consult. Committee, Geneva, 1990

12. G. Sharma and H. J. Trussell, “Digital color imaging – Tutorial”, IEEETrans Image Processing, vol. 6 pp. 901 – 932, 1997

13. C. Bishop, “Neural Networks for Pattern Recognition”, OxfordUniversity Press, UK, 1995

14. Y. Rui, T.S. Huang, T.S. and S. F. Chang, “Image Retrieval: CurrentTechniques, Promising Directions, and Open Issues”, Journal of VisualCommunication and Image Representation, Vol. 10, pp. 39-62, 1999

10

15. J. Huang et al, “Image indexing using color correlogram”, Proc.CVPR97, pp. 762-768, 1997

16. R. M. Haralick, “Statistical and structural approaches to texture”, Proc.of IEEE, 67, pp. 786-804, 1979

17. B. S. Manjunath and W. Y. Ma, “Texture features for browsing andretrieval of image data”, IEEE Trans Pattern Analysis and MachineIntelligence, vol. 18, pp. 837 – 842, 1996

18. R.W. Picard, "Content Access for Image/Video Coding: "The FourthCriterion"", MIT Media Lab TR No. 295. 1994

19. G. Qiu, "Coding colour quantized image by local colour quantization",Proc. of the 6th Color Imaging Conference, Nov. 1998, Scottsdale,Arizona, USA

20. Gervautz, M. & Purgathofer, W. "A Simple Method for ColorQuantization: Octree Quantization", Graphics Gems, Academic Press,1990

21. S. C. Ahalt, et al, “Competitive learning algorithms for vectorquantization”, Neural Networks, vol. 3, pp. 277-290, 1990

APPENDIX

This is the 64-colour colour table designed using over 62 million pixels (235512 x512 24 bit true colour image from MIT Media Lab VisTex database).Each line represents a colour and the values are R, G, B values. We used thistable for the construction of colour correlogram and BCCM for imageindexing and retrieval.

74,67,5845,30,25234,197,145159,150,136112,44,29241,219,18750,89,153106,71,41118,103,85108,93,75253,254,254148,133,113229,168,8869,78,7652,24,9179,156,10536,41,43136,120,10070,75,3089,55,24203,191,173102,105,88139,144,12992,91,73167,109,59141,153,15722,13,22187,175,15816,11,11106,151,214145,75,85114,117,101174,161,144239,238,22653,45,36140,75,3749,51,2194,96,44205,144,184170,146,12131,18,12121,87,578,4,630,31,22

184,76,34157,174,18090,78,60124,123,58124,132,116211,208,199143,194,250159,131,87139,107,7573,49,34222,109,4286,64,45204,172,12769,37,1924,57,105190,132,80189,224,24457,58,5091,123,137104,139,156

This 64 colours are plotted in Fig. A1 (please note due to thelimitations of the reproduction devices, the reproduction ofthis colours are for illustration purpose only)

Fig. A1, The sixty four colours in the colour table

color image coding and indexing using btcpszqiu/online/ieee-trans-ip-jan-2001.pdf · block...

Documents