1. BLOCK TRUNCATION CODING FOR IMAGE COMPRESSION
1.1 DIGITAL IMAGE FUNDAMENTALS
This chapter deals with the fundamentals of digital image signal representation
and the basic Block Truncation Coding (BTC) for image compression. A frame of a
digital image can be visualized as an orderly arrangement of ‘picture elements’ (pixels)
in horizontal lines, with many such lines stacked one below the other. It
may also be visualized as a matrix of pixels arranged in rows and columns. For example,
a ‘512 x 512’ image has 512 horizontal lines in a frame, each with 512 pixels. A pixel is
the smallest visible part of an image, having its own color (hue) and brightness (intensity
of light). The brightness is referred to as ‘luminance’ (luma) and the color is referred to as
‘chrominance’ (chroma). Any color can be represented as a mixture of the three primary
colors, namely red, green and blue. When an image is scanned electronically, each pixel
of the image produces its own R, G, B (red, green, blue) signals corresponding to the
intensity of the primary colors in that pixel [17], [28]. In digital processors, each of the
R, G, B signals is represented by 8 bits, corresponding to 256 quantization levels,
from zero intensity to full intensity. It is customary to explain image
processing using a monochrome (black and white) image; the treatment can then be
extended to each of the R, G, B components of a color image separately [32].
1.2 NEED FOR IMAGE COMPRESSION
Digital images are in general stored in memory, preprocessed, transmitted and
reprocessed for final applications. The quantity of binary data
to be handled by an image processor is enormous. For example, a ‘256 x 256’ frame of
a monochrome image has 524288 (256 x 256 x 8) bits at the rate of 8 bits per pixel.
A 5-minute video at the rate of 25 such frames per second has 3932160000
(nearly 4 billion) bits. It is therefore advantageous to reduce the number of bits
before transmission, while retaining the capability of reproducing an acceptable image
quality at the receiver. This process is known as ‘Lossy Image Compression’. It primarily
reduces the transmission time and also the storage memory required.
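The bit counts quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch; the constant names are illustrative and do not come from the original text.

```python
# Raw (uncompressed) size of the monochrome example; all names are illustrative.
WIDTH, HEIGHT = 256, 256      # pixels per line, lines per frame
BITS_PER_PIXEL = 8            # 256 gray levels
FRAME_RATE = 25               # frames per second
DURATION_S = 5 * 60           # five minutes, in seconds

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL
bits_per_video = bits_per_frame * FRAME_RATE * DURATION_S

print(bits_per_frame)   # 524288
print(bits_per_video)   # 3932160000
```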
Images can also be compressed without any reduction in quality by employing
suitable coding techniques. Inherently, such ‘Lossless Image Compression’ methods [9]
yield less compression than ‘Lossy’ methods [30].
The ‘Compression Ratio’ (CR) and ‘Bit Rate’ (BR) are used to measure the
amount of image compression, while the ‘Peak Signal to Noise Ratio’ (PSNR) and
‘Root Mean Square Error’ (RMSE) are used to measure the resulting error of image
compression. Contrast (C) is a measure of image visual quality.
Both time-domain [20] and transform-based frequency-domain [37], [38], [39]
techniques are employed in image compression. Block Truncation
Coding (BTC) is an elegant and efficient time-domain compression
technique.
2.3 TRADITIONAL BLOCK TRUNCATION CODING
Block Truncation Coding (BTC) was introduced by Delp and Mitchell [10] in
1979. This coding is based on dividing the image into non-overlapping blocks of equal
size. In digital signal processors, an image is divided into smaller blocks of ‘k x k’ pixels
for processing. For example, a ‘512 x 512’ frame may be divided into blocks of ‘8 x 8’
pixels. Sometimes microblocks of ‘2 x 2’ pixels, miniblocks of ‘4 x 4’ pixels, maxiblocks
of ‘16 x 16’ pixels and macroblocks of ‘32 x 32’ pixels are also used.
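The division into non-overlapping ‘k x k’ blocks can be sketched as below. This is only an illustration: the function name `split_into_blocks` and the list-of-rows image layout are assumptions made for the example, and the image dimensions are assumed to be exact multiples of k.

```python
def split_into_blocks(image, k):
    """Yield (block_row, block_col, block) for each non-overlapping k x k block.

    `image` is a list of equal-length rows of pixel intensities.
    """
    height, width = len(image), len(image[0])
    if height % k or width % k:
        raise ValueError("image dimensions must be multiples of k")
    for top in range(0, height, k):
        for left in range(0, width, k):
            block = [row[left:left + k] for row in image[top:top + k]]
            yield top // k, left // k, block

# Hypothetical 4 x 4 image whose pixel value encodes its position (10*row + col).
image = [[10 * r + c for c in range(4)] for r in range(4)]
blocks = list(split_into_blocks(image, 2))
print(len(blocks))      # 4
print(blocks[1][2])     # [[2, 3], [12, 13]]
```

With a ‘512 x 512’ frame and k = 8, the same generator yields 64 x 64 = 4096 blocks.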
BTC involves replacing the original intensity value of each pixel in a block either
by a ‘low mean’ intensity value ‘a’ or a ‘high mean’ intensity value ‘b’ based on a
threshold intensity value. This threshold is the mean intensity of the pixels in the block.
A ‘bit plane’ is created by representing the ‘a’ value pixels by ‘0’s and ‘b’ value pixels by
‘1’s.
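$$a = \bar{x} - \sigma \sqrt{\frac{q}{m - q}} \qquad (2.3.1)$$
$$b = \bar{x} + \sigma \sqrt{\frac{m - q}{q}} \qquad (2.3.2)$$
Here, ‘m’ is the total number of pixels in the block, equal to $k^2$ (16 for a 4x4 block),
‘q’ is the number of ‘1’s in the bit plane (pixels whose intensity is not less than the mean),
$\bar{x}$ is the mean intensity of the ‘m’ pixels, and
‘σ’ is the standard deviation of the intensities of the ‘m’ pixels.
$$\bar{x} = \frac{1}{m} \sum_{i=1}^{k} \sum_{j=1}^{k} x_{i,j} \qquad (2.3.3)$$
$$\sigma = \left[ \overline{x^2} - (\bar{x})^2 \right]^{0.5} \qquad (2.3.4)$$
$$\overline{x^2} = \frac{1}{m} \sum_{i=1}^{k} \sum_{j=1}^{k} x_{i,j}^2 \qquad (2.3.5)$$
where $m = k^2$ and $x_{i,j}$ is the intensity value of the pixel (i, j) of the block, $\bar{x}$ is the mean
intensity, $\overline{x^2}$ is the mean of squared intensities and σ is the standard deviation (SD).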
The encoder transmits the ‘bit plane’ of ‘m’ bits, along with $\bar{x}$ and ‘σ’, each coded
with 8 bits. In the decoder, the ‘0’s and ‘1’s of the bit plane are replaced by 8-bit ‘a’s and ‘b’s
calculated from Eqns. 2.3.1 and 2.3.2 to reproduce the BTC image, which is a close
approximation of the original image.
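The encoder and decoder steps described above can be sketched for a single block as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results in this chapter: the function names `btc_encode` and `btc_decode` are hypothetical, ‘q’ is taken as the number of 1s in the bit plane (consistent with Section 2.4), the mean and standard deviation are kept as floating-point values rather than being quantized to 8 bits, and a flat block is decoded to its mean.

```python
import math

def btc_encode(block):
    """Encode one block (flat list of intensities) as (mean, sigma, bit_plane)."""
    m = len(block)
    mean = sum(block) / m                                   # Eqn. 2.3.3
    mean_sq = sum(x * x for x in block) / m                 # Eqn. 2.3.5
    sigma = math.sqrt(max(mean_sq - mean * mean, 0.0))      # Eqn. 2.3.4
    bit_plane = [1 if x >= mean else 0 for x in block]      # threshold at the mean
    return mean, sigma, bit_plane

def btc_decode(mean, sigma, bit_plane):
    """Rebuild the two-level block from the transmitted mean, sigma and bit plane."""
    m = len(bit_plane)
    q = sum(bit_plane)                                      # number of 1s
    if q in (0, m):                                         # flat block: one level only
        return [mean] * m
    a = mean - sigma * math.sqrt(q / (m - q))               # Eqn. 2.3.1
    b = mean + sigma * math.sqrt((m - q) / q)               # Eqn. 2.3.2
    return [b if bit else a for bit in bit_plane]

# Hypothetical 2 x 2 block, flattened row by row.
block = [12, 14, 20, 30]
mean, sigma, bit_plane = btc_encode(block)
decoded = btc_decode(mean, sigma, bit_plane)
print(mean, bit_plane)   # 19.0 [0, 0, 1, 1]
print(decoded)           # [12.0, 12.0, 26.0, 26.0]
```

In this small case the decoded block has the same mean (19) and standard deviation (7) as the input, which is the property Eqns. 2.3.1 and 2.3.2 are built to preserve.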
2.4. ‘CR’, ‘BR’, ‘RMSE’, ‘PSNR’ & ‘C’ PARAMETERS OF COMPRESSION
As indicated earlier, the ‘compression ratio’ (CR) and ‘bit rate’ (BR) are
used to measure the amount of image compression, while the ‘Root Mean Square
Error’ (RMSE) and the ‘Peak Signal to Noise Ratio’ (PSNR) are used to measure the
resulting error of image compression. Contrast (C) is a measure of image visual quality.
The ‘compression ratio’ (CR) is defined as the ratio of the number of bits of the
original image to the number of bits after compression. A block of ‘m’ pixels has 8m bits
before compression; after BTC it has ‘m’ bits for the bit plane and 16 bits for $\bar{x}$ and σ.
Hence the ‘compression ratio’ is
$$CR = \frac{8m}{m + 16} \qquad (2.4.1)$$
The ‘Bit Rate’ (BR) is defined as the ratio of the number of bits
generated after BTC, including the bits for $\bar{x}$ and σ, to the number of pixels in the
image.
Hence ‘Bit Rate’ (BR) = (m + 16) / m bits per pixel.
(BR) x (CR) = 8 bits per pixel, the bit rate of the original image.
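As a quick check of the two definitions, the sketch below evaluates CR and BR for square blocks of side k. The function names and default arguments are illustrative assumptions; the 16 overhead bits are the 8 bits each for the block mean and standard deviation.

```python
def compression_ratio(k, bits_per_pixel=8, overhead_bits=16):
    """CR = 8m / (m + 16) for a k x k block, with m = k * k."""
    m = k * k
    return bits_per_pixel * m / (m + overhead_bits)

def bit_rate(k, overhead_bits=16):
    """BR = (m + 16) / m bits per pixel for a k x k block."""
    m = k * k
    return (m + overhead_bits) / m

for k in (2, 4, 8, 16, 32, 64):
    print(f"{k}x{k}: CR = {compression_ratio(k):.7f}, BR = {bit_rate(k):.7f}")
```

For every block size the product of the two values is 8, the number of bits per pixel of the original image.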
The ‘Root Mean Square Error’ (RMSE) is defined as
$$RMSE = \left[ \frac{1}{262144} \sum_{i=1}^{262144} d_i^2 \right]^{0.5} \qquad (2.4.2)$$
where $d_i$ is the difference between the intensity of the $i$th pixel in the original image
and in the reconstructed image, and 262144 is equal to 512 x 512.
The Peak Signal to Noise Ratio (PSNR) is defined as
$$PSNR = 20 \log_{10} \left( \frac{X_{max}}{RMSE} \right) \ \text{dB} \qquad (2.4.3)$$
wherein $X_{max}$ is the maximum pixel intensity in the 512x512 image.
The contrast (C) of an image is equal to the standard deviation of the intensity
values of all the pixels of the image. Based on a block-by-block approach, for a
‘512x512’ image of 4096 blocks of ‘8x8’ pixels, the contrast ‘C’ of the image is
$$C = \frac{1}{64} \sqrt{\sum_{n=1}^{4096} \sigma_n^2} \qquad (2.4.4)$$
where $\sigma_n$ is the standard deviation of the $n$th ‘8x8’ block, given by
$$\sigma_n = \frac{1}{8} \sqrt{\sum_{i=1}^{64} (x_i - \bar{x})^2} \qquad (2.4.5)$$
where $x_i$ is the intensity of the $i$th pixel of the $n$th ‘8x8’ block and $\bar{x}$ is the
mean intensity of the $n$th ‘8x8’ block.
The above equations for ‘C’ and $\sigma_n$ are applicable to both the original and the
reconstructed images.
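The error and contrast measures defined above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, images and blocks are passed as flat lists of intensities, $X_{max}$ is taken as the largest intensity in the original, and the contrast formula is written for N blocks so that it reduces to Eqn. 2.4.4 when N = 4096.

```python
import math

def rmse(original, reconstructed):
    """Root mean square error between two equally long lists of intensities."""
    n = len(original)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / n)

def psnr_db(original, reconstructed):
    """PSNR in dB, with X_max taken as the largest intensity of the original."""
    error = rmse(original, reconstructed)
    if error == 0:
        return math.inf                     # identical images
    return 20 * math.log10(max(original) / error)

def block_sd(block):
    """Standard deviation of one block (Eqn. 2.4.5 for a block of any size)."""
    m = len(block)
    mean = sum(block) / m
    return math.sqrt(sum((x - mean) ** 2 for x in block) / m)

def contrast(blocks):
    """Block-by-block contrast: sqrt(sum of sigma_n^2) / sqrt(number of blocks)."""
    return math.sqrt(sum(block_sd(b) ** 2 for b in blocks) / len(blocks))

# Hypothetical two-pixel example: every pixel is off by 2.
print(rmse([0, 20], [2, 18]))       # 2.0
print(contrast([[0, 2], [0, 2]]))   # 1.0
```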
After the application of BTC, ‘p’ pixels, represented by 0s in the
bit plane, are assigned the low-mean intensity ‘a’, and ‘q’ (= $k^2 - p$) pixels,
represented by 1s in the bit plane, are assigned the high-mean intensity ‘b’. The
contrast ‘C’ of this BTC block is equal to the standard deviation of the ‘p’ ‘a’s
and the ‘q’ (= $k^2 - p$) ‘b’s. Using Eqn. (2.4.5), we get
$$\sigma_n = \left[ \frac{b - a}{p + q} \right] [pq]^{1/2} \qquad (2.4.6)$$
where,
a = low-mean intensity corresponding to 0s in the BTC bit plane,
b = high-mean intensity corresponding to 1s in the BTC bit plane,
p = the number of 0s in the bit plane corresponding to low-mean ‘a’,
and
q = the number of 1s in the bit plane corresponding to high-mean ‘b’.
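The step from Eqn. (2.4.5) to Eqn. (2.4.6) can be written out as follows. The symbol $\mu$ for the mean of the two-level block is introduced here only for the derivation and does not appear in the original text.

```latex
\begin{align*}
\mu &= \frac{pa + qb}{p + q}, \qquad
a - \mu = -\frac{q\,(b - a)}{p + q}, \qquad
b - \mu = \frac{p\,(b - a)}{p + q} \\
\sigma_n^2 &= \frac{p\,(a - \mu)^2 + q\,(b - \mu)^2}{p + q}
            = \frac{(pq^2 + qp^2)\,(b - a)^2}{(p + q)^3}
            = \frac{pq\,(b - a)^2}{(p + q)^2} \\
\sigma_n &= \frac{b - a}{p + q}\,\sqrt{pq}
\end{align*}
```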
While CR and BR are dependent only on the image block size, PSNR, RMSE and C are
dependent on the intensities of pixels. The CR and BR values are listed in Table 2.1 for
various block sizes of the image.
Table 2.1: CR and BR values for various block sizes of a 512x512 image.
Block size (pixels)   2 x 2   4 x 4   8 x 8   16 x 16     32 x 32     64 x 64
m = k^2               4       16      64      256         1024        4096
CR                    1.6     4       6.4     7.5294118   7.8769231   7.9688716
BR                    5       2       1.25    1.0625      1.015625    1.0039063
Table 2.1 is shown graphically in Figure 2.1.
Figure 2.1: Graph showing the variation of Compression Ratio and Bit Rate for various
block sizes.
As the block size increases, the bit rate decreases and the CR increases as
shown in figure 2.1.
2.5 ILLUSTRATION OF BTC APPLIED TO AN ARBITRARY 4X4 BLOCK
For illustration, a 4x4 block of pixels having arbitrary gray level intensities is
shown in Figure 2.2, along with its corresponding bit plane.
(a)            (b)
1 3 5 1        0 1 1 0
6 2 3 4        1 0 1 1
5 1 2 3        1 0 0 1
6 1 2 3        1 0 0 1
Figure 2.2: (a) 4x4 Pixels Block, (b) Corresponding Bit Plane
Using equations (2.3.3) and (2.3.4), the mean ($\bar{x}$) of the 4x4 pixels block is 3
and the standard deviation (σ) is 1.6956. The encoder develops a single bit plane of 4x4
size by representing all $x_{i,j} < 3$ by 0s and all $x_{i,j} \geq 3$ by 1s, which gives q = 9 ones
and m − q = 7 zeros. This bit plane, along with $\bar{x}$ and σ, is transmitted to the receiver.
Using equations (2.3.1) and (2.3.2), the decoder in the receiver estimates a
low-mean value ‘a’ (1.077) to replace the 0s, and a high-mean value ‘b’ (4.495) to
replace the 1s, in the received bit plane. The 4x4 block of the image
reconstructed by the decoder is shown in Figure 2.3.
1.077 4.495 4.495 1.077
4.495 1.077 4.495 4.495
4.495 1.077 1.077 4.495
4.495 1.077 1.077 4.495
Figure 2.3: Reconstructed 4x4 Block with $\bar{x}$ ≈ 3.
Thus a 2-gray level truncation of the original block is created. This Block
Truncation Coding procedure is applied to all the blocks of the image. The decoder
recreates the truncated version of every block of the original image by estimating the
block’s ‘a’ and ‘b’ values from the block’s $\bar{x}$ and σ values using Eqns. 2.3.1 and 2.3.2. In
any 4x4 block, the 8 bits of any pixel intensity are coded by a single bit. Additionally, 8
bits each are needed to code the $\bar{x}$ and σ values of the block.
Thus 128 bits of the block are compressed to 32 bits, so that
Compression Ratio (CR) = 128 / 32 = 4.
A total of 32 bits are transmitted by the encoder for a 4x4 block of 16 pixels, so that
Bit Rate (BR) is 32 / 16 = 2.
The RMSE for this 4x4 block is:
$$
\begin{aligned}
RMSE_{4x4} = [0.25]\,[\,&(1-1.077)^2 + (3-4.495)^2 + (5-4.495)^2 + (1-1.077)^2 + \\
&(6-4.495)^2 + (2-1.077)^2 + (3-4.495)^2 + (4-4.495)^2 + \\
&(5-4.495)^2 + (1-1.077)^2 + (2-1.077)^2 + (3-4.495)^2 + \\
&(6-4.495)^2 + (1-1.077)^2 + (2-1.077)^2 + (3-4.495)^2\,]^{0.5}
\end{aligned}
$$
$RMSE_{4x4}$ ≈ 1.0248
The PSNR for this 4x4 block, with $X_{max}$ = 6, is:
$PSNR_{4x4} = 20 \log_{10} (6 / 1.0248)$ ≈ 15.35 dB
Contrast of the original 4x4 block is:
$$
\begin{aligned}
C_{4x4} = [0.25]\,[\,&(1-3)^2 + (3-3)^2 + (5-3)^2 + (1-3)^2 + \\
&(6-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + \\
&(5-3)^2 + (1-3)^2 + (2-3)^2 + (3-3)^2 + \\
&(6-3)^2 + (1-3)^2 + (2-3)^2 + (3-3)^2\,]^{0.5}
\end{aligned}
$$
$C_{4x4}$ ≈ 1.6956
Contrast of the BTC reconstructed block is calculated as shown below:
$$
\begin{aligned}
C_{4x4\,BTC} = [0.25]\,[\,&(1.077-3)^2 + (4.495-3)^2 + (4.495-3)^2 + (1.077-3)^2 + \\
&(4.495-3)^2 + (1.077-3)^2 + (4.495-3)^2 + (4.495-3)^2 + \\
&(4.495-3)^2 + (1.077-3)^2 + (1.077-3)^2 + (4.495-3)^2 + \\
&(4.495-3)^2 + (1.077-3)^2 + (1.077-3)^2 + (4.495-3)^2\,]^{0.5}
\end{aligned}
$$
$C_{4x4\,BTC}$ ≈ 1.6956, which equals the contrast of the original block, since Eqns. 2.3.1
and 2.3.2 preserve the mean and the standard deviation of the block.
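The arithmetic of this illustration can be checked by evaluating Eqns. 2.3.1 to 2.3.4 directly on the sixteen pixel values that appear in the RMSE expansion. The sketch below is only a checking aid under stated assumptions: the variable names are illustrative, the block is flattened row by row, ‘q’ counts the 1s in the bit plane, and no 8-bit quantization of the mean, standard deviation, ‘a’ or ‘b’ is applied.

```python
import math

# Pixel values of the 4 x 4 block, row by row, as used in the RMSE expansion.
block = [1, 3, 5, 1,
         6, 2, 3, 4,
         5, 1, 2, 3,
         6, 1, 2, 3]
m = len(block)

mean = sum(block) / m                                          # Eqn. 2.3.3
sigma = math.sqrt(sum(x * x for x in block) / m - mean ** 2)   # Eqns. 2.3.4, 2.3.5
bit_plane = [1 if x >= mean else 0 for x in block]
q = sum(bit_plane)                                             # number of 1s

a = mean - sigma * math.sqrt(q / (m - q))                      # Eqn. 2.3.1
b = mean + sigma * math.sqrt((m - q) / q)                      # Eqn. 2.3.2
recon = [b if bit else a for bit in bit_plane]

rmse = math.sqrt(sum((x - y) ** 2 for x, y in zip(block, recon)) / m)
psnr = 20 * math.log10(max(block) / rmse)                      # X_max = 6 here
recon_mean = sum(recon) / m
recon_sd = math.sqrt(sum((y - recon_mean) ** 2 for y in recon) / m)

print(f"mean = {mean}, q = {q}")                               # mean = 3.0, q = 9
print(f"sigma = {sigma:.4f}, a = {a:.3f}, b = {b:.3f}")
print(f"RMSE = {rmse:.4f}, PSNR = {psnr:.2f} dB")
print(f"contrast: original = {sigma:.4f}, BTC = {recon_sd:.4f}")
```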
2.6 BTC APPLICATION ON SAMPLE IMAGES
BTC is applied to all the blocks of an image, and the images are
reconstructed and compared with the original image.
Four sample images, copya.jpg, city.jpg, hurricane.jpg and boat.jpg, are subjected
to the traditional BTC for various block sizes and the results are displayed in Figures 2.4,
2.5, 2.6 and 2.7 respectively. The parameters RMSE, PSNR and Contrast
are measured for various block sizes and tabulated in Table 2.3. Figures 2.8, 2.9 and
2.10 show the graphical representation of Table 2.3.
(a) (b) (c)
(d) (e) (f)
Figure 2.4: (a) Original Image copya.jpg (b) – (f) Traditional BTC for block size 4x4, 8x8,
16x16, 32x32 and 64x64 for copya.jpg.
(a) (b) (c)
(d) (e) (f)
Figure 2.5: (a) Original Image city.jpg, (b) – (f) Traditional BTC for block size 4x4, 8x8,
16x16, 32x32 and 64x64 for city.jpg.
(a) (b) (c)
(d) (e) (f)
Figure 2.6: (a) Original Image hurricane.jpg, (b) – (f) Traditional BTC for block size 4x4,
8x8, 16x16, 32x32 and 64x64 for hurricane.jpg.
(a) (b) (c)
(d) (e) (f)
Figure 2.7: (a) Original Image boat.jpg, (b) – (f) Traditional BTC for block size 4x4, 8x8,
16x16, 32x32 and 64x64 for boat.jpg.
The contrast of each original image is measured and tabulated in Table 2.2. These values
are compared with the contrast of the Traditional BTC images in this chapter.
Table 2.2: Contrast values for the original images.
S.No.   Image           Contrast
1       Copya.jpg       55.7629
2       City.jpg        55.9065
3       Hurricane.jpg   51.4933
4       Boat.jpg        46.6772
Table 2.3: RMSE, PSNR and Contrast values for Traditional BTC for various block sizes.
Image           Block Size   RMSE     PSNR (dB)   Contrast
copya.jpg       4x4          1.4936   45.24       78.8770
                8x8          1.4114   45.85       79.5406
                16x16        1.2784   46.21       79.6656
                32x32        1.1014   46.44       80.7490
                64x64        1.0007   46.98       80.9710
city.jpg        4x4          1.4884   45.25       69.0979
                8x8          1.4107   45.90       70.4971
                16x16        1.2056   46.68       72.3398
                32x32        1.0640   46.87       72.5891
                64x64        1.0006   46.99       73.0150
hurricane.jpg   4x4          1.4954   45.27       88.7278
                8x8          1.4106   45.86       88.8158
                16x16        1.2526   46.65       88.8159
                32x32        1.1152   46.83       88.9196
                64x64        1.0061   46.97       89.2277
boat.jpg        4x4          1.4887   45.29       66.5552
                8x8          1.4130   45.83       67.6581
                16x16        1.2423   46.64       69.5588
                32x32        1.1064   46.88       69.5962
                64x64        1.0078   47.01       69.6929
The graphical representations of the parameters tabulated in Table 2.3 are
shown in Figures 2.8, 2.9 and 2.10 respectively. The sample images taken
for processing are of 300 dpi (dots per inch).
Figure 2.8: Graph showing RMSE values for copya.jpg, city.jpg, hurricane.jpg and
boat.jpg.
In figure 2.8, the RMSE decreases as the block size increases.
Figure 2.9: Graph showing PSNR values for copya.jpg, city.jpg, hurricane.jpg and
boat.jpg.
In figure 2.9, the PSNR increases as the block size increases.
Figure 2.10: Graph showing Contrast values for copya.jpg, city.jpg, hurricane.jpg and
boat.jpg.
The BTC image, thus reconstructed block by block, is not the exact original
image but an approximation, with only the low-mean and high-mean intensities in any
block. Although the compression increases with block size, as already shown in Table
2.1, the visual quality of the reconstructed BTC image degrades rapidly, as seen in
Figures 2.4, 2.5, 2.6 and 2.7, even though the RMSE and PSNR values in Table 2.3
improve slightly with block size. Annoying blocking artifacts and false
contours are visible in the BTC images of larger block sizes in Figures 2.4, 2.5, 2.6 and 2.7.
2.7 COMPUTATIONAL TIME FOR IMAGE PROCESSING
Another parameter of importance in image processing is the processing time.
Since BTC is a simple algorithm, its processing time is low. Further, the time to
process an entire image decreases with increase in block size.
CPU time (or process time) is the amount of time for which a central processing
unit (CPU) was used for processing the instructions of a computer program or operating
system.
The elapsed time is the total time taken from the start to the end of processing. In
addition to the CPU time, it includes, for example, the time spent waiting for
input/output (I/O) operations or in low-power (idle) mode.
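The difference between the two timings can be illustrated with Python's standard `time` module, where `time.perf_counter()` reads a wall-clock timer and `time.process_time()` reads the CPU time of the current process. This is a sketch only: the helper `timed` and the workload `busy_work` are hypothetical, and the source does not state which tool was used to obtain the timings in this section.

```python
import time

def timed(func, *args):
    """Return (result, elapsed_seconds, cpu_seconds) for one call of func."""
    wall_start = time.perf_counter()     # wall-clock (elapsed) timer
    cpu_start = time.process_time()      # CPU time used by this process
    result = func(*args)
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return result, elapsed, cpu

def busy_work(n):
    """Stand-in workload; a real measurement would call the BTC routine here."""
    return sum(i * i for i in range(n))

result, elapsed, cpu = timed(busy_work, 200_000)
print(f"elapsed = {elapsed:.4f} s, CPU = {cpu:.4f} s")
```

Any time the process spends sleeping or waiting for I/O adds to the elapsed time but not to the CPU time, which is why the elapsed times reported in this section exceed the CPU times.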
Tables 2.4, 2.5, 2.6 and 2.7 show the CPU
processing time and elapsed time for the Traditional BTC, for various block sizes.
Table 2.4: CPU time and elapsed time for various block sizes of Traditional BTC for
copya.jpg.
Block Size   Elapsed Time (in Seconds)   CPU Time (in Seconds)
4x4          7.4964                      4.0716
8x8          3.9663                      1.1076
16x16        2.9917                      0.7020
32x32        2.8148                      0.6396
64x64        2.6026                      0.5421
Table 2.5: CPU time and elapsed time for various block sizes of Traditional BTC for
city.jpg.
Block Size   Elapsed Time (in Seconds)   CPU Time (in Seconds)
4x4          7.7197                      3.0888
8x8          5.0510                      1.1544
16x16        4.5638                      0.6708
32x32        4.0711                      0.6084
64x64        3.8789                      0.4056
Table 2.6: CPU time and elapsed time for various block sizes of Traditional BTC for
hurricane.jpg.
Block Size   Elapsed Time (in Seconds)   CPU Time (in Seconds)
4x4          7.1648                      3.4593
8x8          5.0324                      1.1604
16x16        4.0001                      0.5072
32x32        3.5744                      0.4656
64x64        3.1979                      0.4385
Table 2.7: CPU time and elapsed time for various block sizes of Traditional BTC for
boat.jpg.
Block Size   Elapsed Time (in Seconds)   CPU Time (in Seconds)
4x4          6.1134                      3.2760
8x8          4.1570                      1.2792
16x16        3.5266                      0.6864
32x32        3.1266                      0.4836
64x64        3.0701                      0.4801
Figure 2.11: Graph showing the elapsed time and CPU Time for processing the image
copya.jpg (Traditional BTC) for various block sizes.
In Figure 2.11, both the elapsed time and the CPU time decrease as the block size
increases.
Figure 2.12: Graph showing the elapsed time and CPU Time for processing the image
city.jpg (Traditional BTC) for various block sizes.
In Figure 2.12, both the elapsed time and the CPU time decrease as the block size
increases.
Figure 2.13: Graph showing the elapsed time and CPU Time for processing the
hurricane.jpg (Traditional BTC) using various block sizes.
In Figure 2.13, both the elapsed time and the CPU time decrease as the block size
increases.
Figure 2.14: Graph showing the elapsed time and CPU Time for processing the boat.jpg
(Traditional BTC) using various block sizes.
In Figure 2.14, both the elapsed time and the CPU time decrease as the block size
increases.
2.8 CONCLUSION
In this chapter, the traditional BTC method has been applied to four sample
images and the performance parameters such as PSNR, RMSE, Contrast, Elapsed
Time and CPU Time have been estimated and compared. It is seen that as the block
size is increased for processing, the visual quality of the image degrades rapidly with
severe blocking artifacts and blurred edges.