1. block truncation coding for image compression...

1. BLOCK TRUNCATION CODING FOR IMAGE COMPRESSION

2.1 DIGITAL IMAGE FUNDAMENTALS

This chapter deals with the fundamentals of digital image representation and the basic
Block Truncation Coding (BTC) technique for image compression. A frame of a digital
image can be visualized as an orderly arrangement of ‘picture elements’ (pixels) in
horizontal lines, with many such lines stacked one below the other. Equivalently, it is a
matrix of pixels arranged in rows and columns. For example, a ‘512 x 512’ image has 512
horizontal lines in a frame, each with 512 pixels. A pixel is the smallest visible part of an
image, having its own color (hue) and brightness (intensity of light). The brightness is
referred to as ‘luminance’ (luma) and the color as ‘chrominance’ (chroma). Any color can
be represented as a mixture of the three primary colors red, green and blue. When an
image is scanned electronically, each pixel produces its own R, G, B (red, green, blue)
signals corresponding to the intensities of the primary colors in that pixel [17], [28]. In
digital processors, each of the R, G, B signals is represented by 8 bits, corresponding to
256 quantization levels from zero intensity to full intensity. It is customary to explain
image processing techniques using a monochrome (black and white) image; the same
processing can then be applied separately to each of the R, G, B components of a color
image [32].

2.2 NEED FOR IMAGE COMPRESSION

Digital images are in general stored in memory, preprocessed, transmitted and
reprocessed for final applications. The quantity of binary data to be handled by an image
processor is enormous. For example, a ‘256 x 256’ frame of a monochrome image has
524288 (256 x 256 x 8) bits at 8 bits per pixel. A 5-minute video at 25 such frames per
second has 3932160000 (nearly 4 billion) bits. It is therefore advantageous to reduce the
number of bits before transmission, while retaining the capability of reproducing an
image of acceptable quality at the receiver. This process is known as ‘Lossy Image
Compression’. It reduces both the transmission time and the storage memory required.
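The data volumes quoted above can be checked with a short Python sketch. The function name `raw_bits` and its parameters are illustrative assumptions, not part of the original text.

```python
def raw_bits(width, height, bits_per_pixel=8, frames=1):
    """Number of bits needed to store uncompressed monochrome frames."""
    return width * height * bits_per_pixel * frames

# One 256 x 256 monochrome frame at 8 bits per pixel.
frame_bits = raw_bits(256, 256)

# A 5-minute video at 25 frames per second.
video_frames = 5 * 60 * 25
video_bits = raw_bits(256, 256, frames=video_frames)

print(frame_bits)   # 524288
print(video_bits)   # 3932160000
```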

Images can also be compressed without any loss of quality by employing suitable
coding techniques. Such ‘Lossless Image Compression’ methods [9] inherently yield
less compression than ‘Lossy’ methods [30].

The ‘Compression Ratio’ (CR) and ‘Bit Rate’ (BR) are used to measure the amount of
image compression, while the ‘Peak Signal to Noise Ratio’ (PSNR) and ‘Root Mean
Square Error’ (RMSE) are used to measure the error introduced by compression.
Contrast (C) is a measure of the visual quality of an image.

Both time-domain [20] and transform-based frequency-domain [37], [38], [39]
techniques are employed for image compression. Block Truncation Coding (BTC) is an
elegant and efficient time-domain compression technique.

2.3 TRADITIONAL BLOCK TRUNCATION CODING

Block Truncation Coding (BTC) was introduced by Delp and Mitchell [10] in 1979. The
coding is based on dividing the image into non-overlapping blocks of equal size. In
digital signal processors, an image is divided into smaller blocks of ‘k x k’ pixels for
processing. For example, a ‘512 x 512’ frame may be divided into blocks of ‘8 x 8’ pixels.
Sometimes microblocks of ‘2 x 2’ pixels, miniblocks of ‘4 x 4’ pixels, maxiblocks of
‘16 x 16’ pixels and macroblocks of ‘32 x 32’ pixels are also used.
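The division into non-overlapping blocks can be sketched in Python as follows. This is a minimal illustration that assumes the image is held as a list of rows and that both dimensions are multiples of k; the function name `split_into_blocks` is hypothetical.

```python
def split_into_blocks(image, k):
    """Split a 2-D list of pixel intensities into non-overlapping k x k blocks.

    Assumes, for simplicity, that both image dimensions are multiples of k.
    Blocks are returned in row-major order.
    """
    rows, cols = len(image), len(image[0])
    if rows % k or cols % k:
        raise ValueError("image dimensions must be multiples of k")
    blocks = []
    for r in range(0, rows, k):
        for c in range(0, cols, k):
            blocks.append([row[c:c + k] for row in image[r:r + k]])
    return blocks

# A 512 x 512 frame (all zeros here) divided into 8 x 8 blocks.
frame = [[0] * 512 for _ in range(512)]
blocks = split_into_blocks(frame, 8)
print(len(blocks))  # 4096
```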

BTC replaces the original intensity value of each pixel in a block by either a ‘low mean’
intensity value ‘a’ or a ‘high mean’ intensity value ‘b’, depending on whether the pixel
intensity is below, or at or above, a threshold intensity value. This threshold is the mean
intensity of the pixels in the block. A ‘bit plane’ is created by representing the ‘a’ value
pixels by ‘0’s and the ‘b’ value pixels by ‘1’s.

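$$ a = \bar{x} - \sigma \sqrt{\frac{q}{m - q}} \tag{2.3.1} $$

$$ b = \bar{x} + \sigma \sqrt{\frac{m - q}{q}} \tag{2.3.2} $$

Here, ‘m’ is the total number of pixels in the block, equal to $k^2$ (16 for a 4x4 block),
‘q’ is the number of ‘1’s in the bit plane,
$\bar{x}$ is the mean intensity of the ‘m’ pixels, and
‘σ’ is the standard deviation of the intensities of the ‘m’ pixels:

$$ \bar{x} = \frac{1}{m} \sum_{i,j=1}^{k} x_{i,j} \tag{2.3.3} $$

$$ \sigma = \left[ \overline{x^2} - (\bar{x})^2 \right]^{0.5} \tag{2.3.4} $$

$$ \overline{x^2} = \frac{1}{m} \sum_{i,j=1}^{k} x_{i,j}^2 \tag{2.3.5} $$

where $m = k^2$, $x_{i,j}$ is the intensity value of the pixel (i,j) of the block, $\bar{x}$ is the
mean intensity, $\overline{x^2}$ is the mean of the squared intensities and σ is the standard
deviation (SD).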

The encoder transmits the ‘bit plane’ of ‘m’ bits, along with $\bar{x}$ and ‘σ’, each coded
with 8 bits. In the decoder, the ‘0’s and ‘1’s of the bit plane are replaced by 8-bit ‘a’s and
‘b’s calculated from Eqns. 2.3.1 and 2.3.2 to reproduce the BTC image, which is a close
approximation of the original image.
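The encoding and decoding of a single block can be sketched in Python as below. This is a minimal illustration, not the author's implementation: the function names and the sample block are hypothetical, and ‘q’ is taken as the number of 1s in the bit plane (pixels at or above the block mean), which is the convention that reproduces the ‘a’ and ‘b’ values of the worked example in Section 2.5. The guard for flat blocks is an added assumption.

```python
import math

def btc_encode(block):
    """Encode one block (2-D list of intensities) as (mean, sigma, bit_plane)."""
    pixels = [x for row in block for x in row]
    m = len(pixels)
    mean = sum(pixels) / m                              # Eqn. 2.3.3
    mean_sq = sum(x * x for x in pixels) / m            # Eqn. 2.3.5
    sigma = math.sqrt(max(mean_sq - mean * mean, 0.0))  # Eqn. 2.3.4
    bit_plane = [[1 if x >= mean else 0 for x in row] for row in block]
    return mean, sigma, bit_plane

def btc_decode(mean, sigma, bit_plane):
    """Rebuild the two-level block from the transmitted mean, sigma and bit plane."""
    bits = [bit for row in bit_plane for bit in row]
    m, q = len(bits), sum(bits)                         # q = number of 1s
    if q == 0 or q == m:                                # flat block (assumed guard)
        a = b = mean
    else:
        a = mean - sigma * math.sqrt(q / (m - q))       # Eqn. 2.3.1
        b = mean + sigma * math.sqrt((m - q) / q)       # Eqn. 2.3.2
    return [[b if bit else a for bit in row] for row in bit_plane]

# Hypothetical 4 x 4 block with a dark left half and a bright right half.
block = [[12, 14, 200, 210],
         [11, 13, 205, 198],
         [10, 15, 190, 202],
         [9, 16, 195, 207]]
mean, sigma, plane = btc_encode(block)
recon = btc_decode(mean, sigma, plane)
print(plane)
print(recon[0])
```

With these formulas the reconstructed block has the same mean and standard deviation as the original block, which is the property that the choice of ‘a’ and ‘b’ is designed to preserve.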

2.4 ‘CR’, ‘BR’, ‘RMSE’, ‘PSNR’ & ‘C’ PARAMETERS OF COMPRESSION

As noted earlier in this chapter, the ‘compression ratio’ (CR) and ‘bit rate’ (BR) are
used to measure the amount of image compression, while the ‘Root Mean Square
Error’ (RMSE) and the ‘Peak Signal to Noise Ratio’ (PSNR) are used to measure the
error introduced by compression. Contrast (C) is a measure of the visual quality of an
image.

The ‘compression ratio’ (CR) is defined as the ratio of the number of bits of the original
image to the number of bits after compression. A block of ‘m’ pixels occupies 8m bits in
the original image and m + 16 bits after BTC (‘m’ bits for the bit plane plus 8 bits each
for $\bar{x}$ and σ). Hence, the ‘compression ratio’ is

$$ CR = \frac{8m}{m + 16} \tag{2.4.1} $$


The ‘Bit Rate’ (BR) is defined as the ratio of the number of bits generated after BTC,
including the bits for $\bar{x}$ and σ, to the number of pixels in the image. Hence, for a
block of ‘m’ pixels,

BR = (m + 16) / m bits per pixel.

It follows that (BR) x (CR) = 8, the number of bits per pixel in the original image.
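The CR and BR expressions depend only on the block size, so they can be evaluated directly. The following Python sketch assumes 8 bits per original pixel and 16 bits of side information per block, as stated above; the function name `btc_rates` is hypothetical.

```python
def btc_rates(k, bits_per_pixel=8, overhead_bits=16):
    """Return (CR, BR) for k x k blocks with 16 side-information bits per block."""
    m = k * k
    coded_bits = m + overhead_bits          # bit plane + mean + sigma
    cr = bits_per_pixel * m / coded_bits    # Eqn. 2.4.1
    br = coded_bits / m                     # bits per pixel after BTC
    return cr, br

for k in (2, 4, 8, 16, 32, 64):
    cr, br = btc_rates(k)
    print(f"{k}x{k}: CR = {cr:.7f}, BR = {br:.7f}")
```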

The ‘Root Mean Square Error’ (RMSE) is defined as

$$ RMSE = \left[ \frac{1}{262144} \sum_{i=1}^{512^2} d_i^2 \right]^{0.5} \tag{2.4.2} $$

where ‘$d_i$’ is the difference between the intensity of the $i$th pixel in the original image
and in the reconstructed image, and 262144 is equal to 512 x 512.

The Peak Signal to Noise Ratio (PSNR) is defined as

$$ PSNR = 20 \log_{10} \left( \frac{X_{max}}{RMSE} \right) \ \text{dB} \tag{2.4.3} $$

wherein ‘$X_{max}$’ is the maximum pixel intensity in the 512x512 image.
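The two error measures can be sketched in Python for images of any size. This is an illustrative sketch: the function names and the tiny 2 x 2 images are hypothetical, and the peak value is taken from the original image, which is one reading of ‘maximum pixel intensity in the image’ (a fixed peak such as 255 is another common choice).

```python
import math

def rmse(original, reconstructed):
    """Root mean square error between two equally sized 2-D intensity arrays."""
    diffs = [o - r
             for row_o, row_r in zip(original, reconstructed)
             for o, r in zip(row_o, row_r)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def psnr(original, reconstructed):
    """PSNR in dB, taking X_max as the largest intensity in the original image."""
    err = rmse(original, reconstructed)
    if err == 0:
        return math.inf                     # identical images
    x_max = max(max(row) for row in original)
    return 20 * math.log10(x_max / err)

# Hypothetical 2 x 2 images: two pixels differ, by 3 and by 4.
orig = [[10, 200], [30, 250]]
recon = [[13, 196], [30, 250]]
print(rmse(orig, recon), psnr(orig, recon))  # 2.5 40.0
```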

The contrast (C) of an image is equal to the standard deviation of the intensity values
of all the pixels of the image. On a block-by-block basis, for a ‘512x512’ image of 4096
blocks of ‘8x8’ pixels, the contrast ‘C’ of the image is

$$ C = \frac{1}{64} \sqrt{ \sum_{n=1}^{4096} \sigma_n^2 } \tag{2.4.4} $$

where ‘$\sigma_n$’ is the standard deviation of the $n$th ‘8x8’ block, given by

$$ \sigma_n = \frac{1}{8} \sqrt{ \sum_{i=1}^{64} (x_i - \bar{x})^2 } \tag{2.4.5} $$


where $x_i$ is the intensity of the $i$th pixel of the $n$th ‘8x8’ block and $\bar{x}$ is the
mean intensity of the $n$th ‘8x8’ block.

The above equations for ‘C’ and ‘$\sigma_n$’ are applicable to both the original and the
reconstructed images.

After the application of BTC, ‘p’ pixels, represented by 0s in the bit plane, are assigned
the low-mean intensity ‘a’, and ‘q’ (= $k^2 - p$) pixels, represented by 1s in the bit plane,
are assigned the high-mean intensity ‘b’. The contrast of this BTC block is equal to the
standard deviation of the ‘p’ values of ‘a’ and the ‘q’ values of ‘b’. Using Eqn. (2.4.5),
we get

$$ \sigma_n = \left[ \frac{b - a}{p + q} \right] [pq]^{1/2} \tag{2.4.6} $$

where,
a = low-mean intensity corresponding to 0s in the BTC bit plane,
b = high-mean intensity corresponding to 1s in the BTC bit plane,
p = the number of 0s in the bit plane corresponding to low-mean ‘a’, and
q = the number of 1s in the bit plane corresponding to high-mean ‘b’.
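The step from Eqn. (2.4.5) to Eqn. (2.4.6) can be spelled out as follows. This is a sketch of the intermediate algebra; the symbol μ for the mean of the reconstructed block is introduced here for convenience and does not appear in the original text.

```latex
\begin{align*}
\mu &= \frac{pa + qb}{p + q}, \qquad
a - \mu = -\frac{q\,(b - a)}{p + q}, \qquad
b - \mu = \frac{p\,(b - a)}{p + q} \\
\sigma_n^2 &= \frac{p\,(a - \mu)^2 + q\,(b - \mu)^2}{p + q}
            = \frac{(pq^2 + qp^2)\,(b - a)^2}{(p + q)^3}
            = \frac{pq\,(b - a)^2}{(p + q)^2} \\
\sigma_n &= \left[ \frac{b - a}{p + q} \right] [pq]^{1/2}
\end{align*}
```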

While CR and BR are dependent only on the image block size, PSNR, RMSE and C are

dependent on the intensities of pixels. The CR and BR values are listed in Table 2.1 for

various block sizes of the image.

Table 2.1: CR and BR values for various block sizes of a 512x512 image.

Block size (pixels)   2x2    4x4    8x8    16x16       32x32       64x64
m = k^2               4      16     64     256         1024        4096
CR                    1.6    4      6.4    7.5294118   7.8769231   7.9688716
BR (bits per pixel)   5      2      1.25   1.0625      1.015625    1.0039063

Table 2.1 is shown graphically in Figure 2.1.

Figure 2.1: Graph showing the variation of Compression Ratio and Bit Rate for various

block sizes.

As the block size increases, the bit rate decreases and the CR increases as

shown in figure 2.1.

2.5 ILLUSTRATION OF BTC APPLIED TO AN ARBITRARY 4X4 BLOCK

For illustration, a 4x4 block of pixels having arbitrary gray level intensities is shown
in Figure 2.2, along with its corresponding bit plane.

[Plot for Figure 2.1: CR and Bit Rate (vertical axis ‘CR and BR’, 0 to 9) versus block size (horizontal axis, 2x2 to 64x64).]

(a) (b)

Figure 2.2: (a) 4x4 Pixels Block, (b) Corresponding Bit Plane

Using Eqns. (2.3.3) and (2.3.4), the mean ($\bar{x}$) of the 4x4 block is 3 and the
standard deviation (σ) is 2.64. The encoder develops a single bit plane of 4x4 size by
representing all $x_{i,j} < 3$ by 0s and all $x_{i,j} \geq 3$ by 1s. This bit plane, along with
$\bar{x}$ and σ, is transmitted to the receiver.

Using Eqns. (2.3.1) and (2.3.2), the decoder in the receiver estimates a low-mean value
‘a’ (0.007) to replace the 0s, and a high-mean value ‘b’ (5.328) to replace the 1s, in the
received bit plane. The 4x4 block of the image reconstructed by the decoder is shown in
Figure 2.3.
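The decoder's estimate of ‘a’ and ‘b’ can be reproduced with a few lines of Python. The sketch below uses the mean and standard deviation quoted in the text as inputs. The pixel values are those that appear in the RMSE expression of this section; their row-major arrangement in the block is an assumption, and ‘q’ is counted as the number of 1s in the bit plane.

```python
import math

# Block values read off the RMSE expression in this section
# (row-major arrangement is an assumption).
block = [[1, 3, 5, 1],
         [6, 2, 3, 4],
         [5, 1, 2, 3],
         [6, 1, 2, 3]]

mean, sigma = 3.0, 2.64     # values quoted in the text

bit_plane = [[1 if x >= mean else 0 for x in row] for row in block]
m = sum(len(row) for row in block)          # 16 pixels
q = sum(sum(row) for row in bit_plane)      # number of 1s

a = mean - sigma * math.sqrt(q / (m - q))   # Eqn. 2.3.1
b = mean + sigma * math.sqrt((m - q) / q)   # Eqn. 2.3.2

print(bit_plane)
print(q, round(a, 3), round(b, 3))   # 9 0.007 5.328
```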

Figure 2.3: Reconstructed 4x4 Block with $\bar{x}$ ≈ 3.

Thus a two-gray-level truncation of the original block is created. This Block Truncation
Coding procedure is applied to all the blocks of the image. The decoder recreates the
truncated version of every block of the original image by estimating the block’s ‘a’ and
‘b’ values from the block’s $\bar{x}$ and σ values using Eqns. 2.3.1 and 2.3.2. In any 4x4
block, the 8 bits of each pixel intensity are coded by a single bit. Additionally, 8 bits each
are needed to code the $\bar{x}$ and σ values of the block.

Thus the 128 bits of the block are compressed to 32 bits, giving a Compression Ratio
(CR) of 128 / 32 = 4.

A total of 32 bits is transmitted by the encoder for a 4x4 block of 16 pixels, so the Bit
Rate (BR) is 32 / 16 = 2 bits per pixel.

The RMSE for this 4x4 block is:

RMSE4x4 = [0.25] [ (1-0.007)^2 + (3-5.328)^2 + (5-5.328)^2 + (1-0.007)^2 +
                   (6-5.328)^2 + (2-0.007)^2 + (3-5.328)^2 + (4-5.328)^2 +
                   (5-5.328)^2 + (1-0.007)^2 + (2-0.007)^2 + (3-5.328)^2 +
                   (6-5.328)^2 + (1-0.007)^2 + (2-0.007)^2 + (3-5.328)^2 ]^0.5

RMSE4x4 ≈ 1.5894

The PSNR for this 4x4 block, whose maximum pixel intensity is 6, is:

PSNR4x4 = 20 log10 ( 6 / 1.5894 ) ≈ 11.54 dB

Contrast of the original 4x4 block is:

C4x4 = [0.25] [ (1-3)^2 + (3-3)^2 + (5-3)^2 + (1-3)^2 +
                (6-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 +
                (5-3)^2 + (1-3)^2 + (2-3)^2 + (3-3)^2 +
                (6-3)^2 + (1-3)^2 + (2-3)^2 + (3-3)^2 ]^0.5

C4x4 ≈ 1.6956

Contrast of the BTC reconstructed block is calculated as shown below:

C4x4BTC = [0.25] [ (0.007-3)^2 + (5.328-3)^2 + (5.328-3)^2 + (0.007-3)^2 +
                   (5.328-3)^2 + (0.007-3)^2 + (5.328-3)^2 + (5.328-3)^2 +
                   (5.328-3)^2 + (0.007-3)^2 + (0.007-3)^2 + (5.328-3)^2 +
                   (5.328-3)^2 + (0.007-3)^2 + (0.007-3)^2 + (5.328-3)^2 ]^0.5

C4x4BTC ≈ 2.6396
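The block-level sums above can be evaluated with a short Python sketch. It takes the pixel values as they appear in the expressions (row-major arrangement assumed) and the quoted ‘a’ and ‘b’ values as inputs, and also checks the closed form of Eqn. (2.4.6) against the direct sum. The helper names are hypothetical.

```python
import math

# Original block (row-major order assumed) and its BTC reconstruction
# with the quoted values a = 0.007 and b = 5.328.
original = [[1, 3, 5, 1], [6, 2, 3, 4], [5, 1, 2, 3], [6, 1, 2, 3]]
a, b, mean = 0.007, 5.328, 3.0
recon = [[b if x >= mean else a for x in row] for row in original]

def flatten(block):
    return [x for row in block for x in row]

def contrast(block, centre):
    """Standard deviation about `centre`, in the form of Eqn. 2.4.5."""
    px = flatten(block)
    return math.sqrt(sum((x - centre) ** 2 for x in px) / len(px))

# RMSE between the original block and its reconstruction.
pairs = list(zip(flatten(original), flatten(recon)))
rmse = math.sqrt(sum((o - r) ** 2 for o, r in pairs) / len(pairs))

c_orig = contrast(original, mean)
c_btc = contrast(recon, mean)

# Closed form of Eqn. 2.4.6 with p zeros and q ones in the bit plane.
q = sum(1 for x in flatten(original) if x >= mean)
p = len(pairs) - q
c_closed = (b - a) / (p + q) * math.sqrt(p * q)

print(rmse)
print(round(c_orig, 4), round(c_btc, 4), round(c_closed, 4))  # 1.6956 2.6396 2.6396
```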

2.6 BTC APPLICATION ON SAMPLE IMAGES

BTC is applied to all the blocks of an image, and the image is reconstructed and
compared with the original image.

Four sample images, copya.jpg, city.jpg, hurricane.jpg and boat.jpg, are subjected to the
traditional BTC for various block sizes and the results are displayed in Figures 2.4, 2.5,
2.6 and 2.7 respectively. The parameters RMSE, PSNR and Contrast are measured for
various block sizes and tabulated in Table 2.3. Figures 2.8, 2.9 and 2.10 show the
graphical representation of Table 2.3.

(a) (b) (c)

(d) (e) (f)

Figure 2.4: (a) Original Image copya.jpg (b) – (f) Traditional BTC for block size 4x4, 8x8,

16x16, 32x32 and 64x64 for copya.jpg.

(a) (b) (c)

(d) (e) (f)

Figure 2.5: (a) Original Image city.jpg, (b) – (f) Traditional BTC for block size 4x4, 8x8,

16x16, 32x32 and 64x64 for city.jpg.

(a) (b) (c)

(d) (e) (f)

Figure 2.6: (a) Original Image hurricane.jpg, (b) – (f) Traditional BTC for block size 4x4,

8x8, 16x16, 32x32 and 64x64 for hurricane.jpg.

(a) (b) (c)

(d) (e) (f)

Figure 2.7: (a) Original Image boat.jpg, (b) – (f) Traditional BTC for block size 4x4, 8x8,

16x16, 32x32 and 64x64 for boat.jpg.

The contrast of each original image is measured and tabulated. These values are
compared with the contrast of the Traditional BTC images in this chapter. Table 2.2
shows the contrast values of the four sample images.

Table 2.2: Contrast values for the original images.

S.No.   Image            Contrast
1       copya.jpg        55.7629
2       city.jpg         55.9065
3       hurricane.jpg    51.4933
4       boat.jpg         46.6772

Table 2.3: RMSE, PSNR and Contrast values for Traditional BTC for various block sizes.

Image           Block Size   RMSE     PSNR (dB)   Contrast
copya.jpg       4x4          1.4936   45.24       78.8770
                8x8          1.4114   45.85       79.5406
                16x16        1.2784   46.21       79.6656
                32x32        1.1014   46.44       80.7490
                64x64        1.0007   46.98       80.9710
city.jpg        4x4          1.4884   45.25       69.0979
                8x8          1.4107   45.90       70.4971
                16x16        1.2056   46.68       72.3398
                32x32        1.0640   46.87       72.5891
                64x64        1.0006   46.99       73.0150
hurricane.jpg   4x4          1.4954   45.27       88.7278
                8x8          1.4106   45.86       88.8158
                16x16        1.2526   46.65       88.8159
                32x32        1.1152   46.83       88.9196
                64x64        1.0061   46.97       89.2277
boat.jpg        4x4          1.4887   45.29       66.5552
                8x8          1.4130   45.83       67.6581
                16x16        1.2423   46.64       69.5588
                32x32        1.1064   46.88       69.5962
                64x64        1.0078   47.01       69.6929

The graphical representations of the parameters tabulated in Table 2.3 are shown in
Figures 2.8, 2.9 and 2.10 respectively. The sample images taken for processing are of
300 dpi (dots per inch).

Figure 2.8: Graph showing RMSE values for copya.jpg, city.jpg, hurricane.jpg and

boat.jpg.

In figure 2.8, the RMSE decreases as the block size increases.

[Plot for Figure 2.8: RMSE (vertical axis, 0 to 1.6) versus block size (horizontal axis, 4x4 to 64x64) for copya.jpg, city.jpg, hurricane.jpg and boat.jpg.]

Figure 2.9: Graph showing PSNR values for copya.jpg, city.jpg, hurricane.jpg and

boat.jpg.

In figure 2.9, the PSNR increases as the block size increases.

Figure 2.10: Graph showing Contrast values for copya.jpg, city.jpg, hurricane.jpg and

boat.jpg.

[Plot for Figure 2.10: Contrast (vertical axis, 60 to 100) versus block size (horizontal axis, 4x4 to 64x64) for copya.jpg, city.jpg, hurricane.jpg and boat.jpg.]

[Plot for Figure 2.9: Peak Signal to Noise Ratio (vertical axis, 44 to 47.5) versus block size (horizontal axis, 4x4 to 64x64) for copya.jpg, city.jpg, hurricane.jpg and boat.jpg.]

The BTC image, thus reconstructed block by block, is not the exact original image but
an approximation, with only low-mean and high-mean intensities in any block. Although
the compression increases with block size, as already shown in Table 2.1, the quality of
the reconstructed BTC image degrades rapidly, as shown in Table 2.3 and as seen in
Figures 2.4, 2.5, 2.6 and 2.7. Annoying blocking artifacts and false contours are visible
in the BTC images of larger block sizes in these figures.

2.7 COMPUTATIONAL TIME FOR IMAGE PROCESSING

Another parameter of importance in image processing is the processing time. Since
BTC is a simple algorithm, its processing time is short. Further, the time to process an
entire image decreases as the block size increases.

CPU time (or process time) is the amount of time for which a central processing unit
(CPU) is used for processing the instructions of a computer program or operating
system.

The elapsed time is the total time taken from the start to the end of processing. In
addition to the CPU time it includes, for example, time spent waiting for input/output
(I/O) operations or in low-power (idle) mode.
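The difference between the two measurements can be illustrated with Python's standard `time` module. This is only a sketch under stated assumptions: the original text does not say which tool produced its timings, and the `workload` function is a hypothetical stand-in for processing an image.

```python
import time

def timed(func, *args):
    """Run func(*args) and return (result, elapsed_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock (elapsed) timer
    cpu_start = time.process_time()    # CPU time of the current process
    result = func(*args)
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return result, elapsed, cpu

def workload(n):
    """Hypothetical stand-in for processing an image."""
    time.sleep(0.05)                   # waiting: counts towards elapsed time only
    return sum(i * i for i in range(n))

_, elapsed, cpu = timed(workload, 200_000)
print(f"elapsed = {elapsed:.4f} s, CPU = {cpu:.4f} s")
```

Because the sleep contributes to the elapsed time but not to the CPU time, the elapsed time reported by this sketch is normally the larger of the two, which matches the pattern in the tables that follow.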

Tables 2.4, 2.5, 2.6 and 2.7 show the CPU time and elapsed time for the Traditional BTC
for various block sizes.

Table 2.4: CPU time and elapsed time for various block sizes of Traditional BTC for
copya.jpg.

Block Size   Elapsed Time (in Seconds)   CPU Time (in Seconds)
4x4          7.4964                      4.0716
8x8          3.9663                      1.1076
16x16        2.9917                      0.7020
32x32        2.8148                      0.6396
64x64        2.6026                      0.5421

Table 2.5: CPU time and elapsed time for various block sizes of Traditional BTC for
city.jpg.

Block Size   Elapsed Time (in Seconds)   CPU Time (in Seconds)
4x4          7.7197                      3.0888
8x8          5.0510                      1.1544
16x16        4.5638                      0.6708
32x32        4.0711                      0.6084
64x64        3.8789                      0.4056

Table 2.6: CPU time and elapsed time for various block sizes of Traditional BTC for
hurricane.jpg.

Block Size   Elapsed Time (in Seconds)   CPU Time (in Seconds)
4x4          7.1648                      3.4593
8x8          5.0324                      1.1604
16x16        4.0001                      0.5072
32x32        3.5744                      0.4656
64x64        3.1979                      0.4385

Table 2.7: CPU time and elapsed time for various block sizes of Traditional BTC for
boat.jpg.

Block Size   Elapsed Time (in Seconds)   CPU Time (in Seconds)
4x4          6.1134                      3.2760
8x8          4.1570                      1.2792
16x16        3.5266                      0.6864
32x32        3.1266                      0.4836
64x64        3.0701                      0.4801

[Plot: elapsed time and CPU time (vertical axis, time in seconds) versus block size (horizontal axis, 4x4 to 64x64).]

Figure 2.11: Graph showing the elapsed time and CPU Time for processing the image

copya.jpg (Traditional BTC) for various block sizes.

In Figure 2.11, the elapsed time and the CPU time decrease as the block size
increases.

Figure 2.12: Graph showing the elapsed time and CPU Time for processing the image
city.jpg (Traditional BTC) for various block sizes.

In Figure 2.12, the elapsed time and the CPU time decrease as the block size
increases.

[Plots: elapsed time and CPU time (vertical axis, time in seconds) versus block size (horizontal axis, 4x4 to 64x64).]

Figure 2.13: Graph showing the elapsed time and CPU Time for processing the image
hurricane.jpg (Traditional BTC) for various block sizes.

In Figure 2.13, the elapsed time and the CPU time decrease as the block size
increases.

Figure 2.14: Graph showing the elapsed time and CPU Time for processing the image
boat.jpg (Traditional BTC) for various block sizes.

In Figure 2.14, the elapsed time and the CPU time decrease as the block size
increases.

2.8 CONCLUSION


In this chapter, the traditional BTC method has been applied to four sample

images and the performance parameters such as PSNR, RMSE, Contrast, Elapsed

Time and CPU Time have been estimated and compared. It is seen that as the block

size is increased for processing, the visual quality of the image degrades rapidly with

severe blocking artifacts and blurred edges.
