image compression (chapter 8)site.iugaza.edu.ps/rsalamah/files/2012/02/chapter_8.pdfa reversible...

Image Compression (Chapter 8)

Introduction

The goal of image compression is to reduce the

amount of data required to represent a digital image.

Important for reducing storage requirements and

improving transmission rates.

Approaches

Lossless – Information preserving

– Low compression ratios

– e.g., Huffman

Lossy – Does not preserve information

– High compression ratios

– e.g., JPEG

Tradeoff: image quality vs compression ratio

Data vs Information

Data and information are not synonymous terms!

Data is the means by which information is conveyed.

Data compression aims to reduce the amount of data

required to represent a given quantity of information

while preserving as much information as possible.

Data vs Information (cont’d)

The same amount of information can be represented

by various amount of data, e.g.:

Your wife, Helen, will meet you at Logan Airport

in Boston at 5 minutes past 6:00 pm tomorrow

night

Your wife will meet you at Logan Airport at 5

minutes past 6:00 pm tomorrow night

Helen will meet you at Logan at 6:00 pm

tomorrow night

Ex1:

Ex2:

Ex3:

Data Redundancy

Data redundancy is a mathematically quantifiable entity!

compression

Data Redundancy (cont’d)

Example:

Types of Data Redundancy

1. Coding redundancy

2. Interpixel redundancy

3. Psychovisual redundancy

The role of compression is to reduce one or more of

these redundancy types.

Coding Redundancy

Data compression can be achieved using an

appropriate encoding scheme.

Example: binary encoding

Encoding Schemes

Elements of an encoding scheme:

– Code: a list of symbols (letters, numbers, bits etc.)

– Code word: a sequence of symbols used to represent a piece

of information or an event (e.g., gray levels)

– Code word length: number of symbols in each code word

Definitions

Example: variable length coding

Variable length coding

Interpixel redundancy

This type of redundancy – sometimes called spatial

redundancy, interframe redundancy, or geometric

redundancy – exploits the fact that an image very

often contains strongly correlated pixels, in other

words, large regions whose pixel values are the same

or almost the same.


Interpixel redundancy implies that any pixel value can be

reasonably predicted by its neighbors (i.e., correlated).


This redundancy can be explored in several ways, one

of which is by predicting a pixel value based on the

values of its neighboring pixels.

In order to do so, the original 2-D array of pixels is

usually mapped into a different format, e.g., an array

of differences between adjacent pixels.

If the original image pixels can be reconstructed from

the transformed data set the mapping is said to be

reversible.

Example: Run length coding (reversible)

Example: Run length coding (reversible)

For line 100:

(1, 63), (0, 87), (1, 37), (0, 5), (1, 4), (0, 556), (1, 62), (0, 210)

Note that 21=2 and 210= 1024 ⇒ 1+10=11 bits per run

Suppose that there are 12 166 runs in total

Psychovisual redundancy

Takes into advantage the peculiarities of the human visual

system.

The eye does not respond with equal sensitivity to all visual

information.

Humans search for important features (e.g., edges, texture,

etc.) and do not perform quantitative analysis of every pixel

in the image.

Elimination of psychovisual redundancy results in loss

of quantitative information.

Quantization: mapping of a broad range of input values to a

limited number of output values.

Results in lossy data compression

Psychovisual redundancy (cont’d)

256 gray levels 16 gray levels

improved gray-scale quantization 16 gray levels

8/4

=

2:1

i.e., add to each pixel a

pseudo-random number

prior to quantization(IGS)

Fidelity Criteria

How close is to ?

Criteria

– Subjective: based on human observers

– Objective: mathematically defined criteria

Subjective Fidelity Criteria

Measuring image quality by subjective evaluations of a

human observer.

Objective Fidelity Criteria

The level of inf. loss expressed as a function of the

original and compressed image.

Root mean square error (RMS)

Mean-square signal-to-noise ratio (SNR)

Example

original RMS=5.17 RMS=15.67 RMS=20.17

Image Compression Model

Image Compression Model

Source encoder: Removes redundant data

Channel encoder:

– Source-encoded data contains only the essential

and is therefore sensitive to noise.

– Channel encoder increases noise-immunity of

source-encoded data when the channel is noisy.

– Reduces impact of errors by adding controlled

redundancy to source-encoded data (Hamming

algorithm)

– When channel is noise free, this is omitted...

Source Encoder

Mapper: reduces interpixel redundancy

• Run length coding: reversible

• Calculation of DCT: reversible

Quantizer: reduces psycho-visual redundancy

• Reduction of grey scales: not reversible

Symbol encoder: reduces coding redundancy

• Variable length coding: reversible

Decoder

The inverse operations are performed.

But … quantization is irreversible in general.

How do we measure information?

What is the information content of a message/image?

What is the minimum amount of data that is sufficient

to describe completely an image without loss of

information?

Answers are provided by information theory.

Modeling the Information Generation Process

Assume that information generation process is a

probabilistic process.

A random event E which occurs with probability P(E)

contains:

The base for the logarithm depends on the units for

measuring information. Usually, we use base 2, which

gives the information in units of “binary digits” or “bits.”

The amount of self-information attributed to event E

is inversely related to the probability of E.

Suppose that the gray level value of pixels is

generated by a random variable, then rk contains

Entropy: the average information content of each

pixel in an image.

As E increases more information is associated with

the image.

Average information of an image

units of information

Bits/pixel

Redundancy:

Average information of an image

where:

Entropy Estimation

First order

estimate of H:

The first-order estimate gives only a lower-bound on the

compression that can be achieved.

Lossless Compression

Error-free compression techniques generally

composed of:

1. Devising an alternative representation (mapping) of the

image to reduce its interpixel redundancy.

2. Coding the representation to eliminate coding

redundancy.

Huffman, Golomb, Arithmetic coding redundancy

LZW, Run-length, Symbol-based, Bit-plane

interpixel redundancy

Huffman Coding

It is a variable-length coding technique.

Most probable symbol is assigned the shortest code word.

Optimal code: minimizes the number of code symbols per source symbol.

Huffman Coding (cont’d)

Forward Pass

1. Sort probabilities per symbol from top to bottom.

2. Combine the lowest two probabilities.

3. Repeat Step2 until only two probabilities remain.


Backward Pass

Assign code symbols going backwards


Lavg using Huffman coding:

Lavg assuming binary codes:

Example

Huffman Coding: Properties

The resulting code is called a Huffman code. It

has some interesting properties:

(1)The source symbols can be encoded (and decoded)

one at time (with a lookup table).

(2) It is called a block code because each source symbol

is mapped into a fixed sequence of code symbols.

(3) It is instantaneous because each codeword in a

string of code symbols can be decoded without

referencing succeeding symbols.

(4) It is uniquely decodable because any string of

code symbols can be decoded in only one way.

Huffman Coding: Properties

Disadvantage: For a source with J symbols, we need J - 2

source reductions. This can be computationally intensive for

large J (ex. J = 256 for an image with 256 gray levels).

Near Optimal Variable Length Codes

2B

LZW Coding

Also attack inter-pixel redundancies

Assigns fixed-length code words to variable length

sequences of source symbols

Requires no a priori knowledge of probability of

occurrence of symbols

Integrated into GIF, TIFF and PDF file formats

LZW Coding

A codebook or a dictionary has to be constructed.

– Single pixel values and blocks of pixel values

For an 8-bit image, the first 256 entries are assigned

to the gray levels 0,1,2,..,255.

As the encoder examines image pixels, gray level

sequences (i.e., pixel combinations) that are not in the

dictionary are assigned to a new entry.

Example

Consider the following 4 x 4 8 bit image

39 39 126 126

39 39 126 126

39 39 126 126

39 39 126 126 Dictionary Location Entry

0 0

1 1

. .

255 255

256 -

511 -

Initial Dictionary

Example

39 39 126 126

39 39 126 126

39 39 126 126

39 39 126 126

- Is 39 in the dictionary……..Yes

- What about 39-39………….No

- Then add 39-39 in entry 256

Dictionary Location Entry

0 0

1 1

. .

255 255

256 -

511 -

39-39

Example

39 39 126 126

39 39 126 126

39 39 126 126

39 39 126 126

concatenated sequence (CS)

If CS is found:

(1) No Output

(2) CR=CS

(3) No D entry

(CR) (P)

If CS not found:

(1) Output D(CR)

(2) Add CS to D

(3) CR=P

Decoding LZW:

Let the bit stream received be:

39 39 126 126 256 258 260 259 257 126

In LZW, the dictionary which was used for

encoding need not be sent with the image. A

separate dictionary is built by the decoder, as it

reads the received code words.

Decoding LZW

LZW Decoding Example

(1)output the dictionary entry for the pixel value(s)

(2) add a new dictionary entry whose content is the RS plus the 1rst

element of the encoded value being processed

(3) set the RS to the en-coded value being processed.

Bit-plane Coding

Attack inter-pixel redundancy

First decompose the original image

into bit-planes(series of binary images)

Compress each binary image via a

binary compression method

– Run-length coding (RLC)

Bit-plane Decomposition

Bit-plane slicing

Problem: small changes in gray level can have a

significant impact on the complexity of the bit

plane

– 127 vs. 128 0111 1111 vs. 1000 0000

Solution: (gray code)

Example:

– 127 0111 1111 0100 0000

– 128 1000 0000 1100 0000

Run-Length Coding

This technique is very effective in encoding binary images

with large contiguous black and white regions, which

would give rise to a small number of large runs of 1s and

0s.

The run-lengths can in turn be encoded using a variable

length code (ex. Huffman code), for further compression.

Binary Image Compression - RLC

Developed in 1950s

Standard compression approach in FAX coding

Approach

– Code each contiguous group of 0’s or 1’s encountered in a left to right scan of a row by its length

1 1 1 1 1 0 0 0 0 0 0 1 (1,5) (0, 6) (1, 1)

– Establish a convention for determining the value of the run

– Code black and white run lengths separately using variable-length coding

A Complete Example

2 127 128

3 32 33

0000 0010 0111 1111 1000 0000

0000 0011 0010 0000 0010 0001

0000 0011 0100 0000 1100 0000

0000 0010 0011 0000 0011 0001

001 011 000 000 000 000 100 100 2 1 1 2 3 3 3 3 0 1 2 0 1 2

000 000 011 011 000 000 100 001 3 3 1 2 1 2 3 3 0 1 2 2 1

1 0.34 (1) 0.66 (0) 3 1.00 (0) 0 0.34 (1) 1 0.4 (1) 0.6 (0)

2 0.33 (00) 0.34 (1) 1 0.33 (00) 2 0.4 (00) 0.4 (1)

3 0.33 (01) 2 0.33 (01) 0 0.2 (01)

001 100 01 01 0 0 10001 01100

01 01 100 100 0 0 10001 001

Original image

Binary code

XOR binary

8 bit planes

Huffman

coding

Final code

Run-length coding

Lossy Compression

Spatial domain methods

– Lossy predictive coding (not discussed)

Transform coding

– Transform the image into a domain where

compression can be performed more efficiently

– Operate on the transformed image

Transform Coding

A reversible linear transform is used to map the image

into a set of coefficients, which are then be quantized and

coded.

For most natural images, a significant number of coef.

Have small magnitudes and can be quantized or discarded

with little image distortion...

Transform Coding

The goal of the transformation process is to pack as much

information as possible into the smallest number of

transform coefficients.

The quantization stage then selectively eliminates

(quantizes) the coefficients that carry the least

information.

The symbol encoder uses a compression coding (normally

variable length code) to represent the quantized

coefficients.

Transform Selection

The transform of an image can be computed using

various transformations, for example:

– DFT

– DCT (Discrete Cosine Transform)

– WHT (Walsh Hadmard Transformation)

– DWT

Discrete Fourier Transform

Due to its computational efficiency the DFT is very popular

however, it has strong disadvantages for some Applications:

– It is complex

– It has poor energy compaction

Energy compaction:

– Is the ability to pack the energy of the spatial sequence into as

few frequency coefficients as possible.

– most of the signal information tends to be concentrated in a few

low-frequency components of the DCT

–This is very important for image compression if compaction is

high we only have to transmit a few coefficients instead of the

whole set of pixels

Discrete Fourier Transform

The amplitude spectra of the image above

D FT D CT

Discrete Cosine Transform

if v=0

if v>0

forward

inverse

Energy Compaction

8 x 8 subimages

64 coefficients

per subimage

50% of the

coefficients

truncated

Subimage size selection

Usually, images are subdivided so that the correlation

(redundancy) between adjacent subimages is reduced

and so that the subimage size n = 2m for computational

efficiency.

In general, both the level of compression and

computational complexity increase with subimage size.

The most popular subimage sizes are 8x8 and 16x16.

Subimage size selection

Approximation of the original image using 25%

of the DCT coefficients with subimage sizes 2x2,

4x4 and 8x8

Bit Allocation

The reconstruction error is a function of the relative

importance of the transform coefficients that are discarded

and of the precision that is used to present the retained

coefficients.

In most transform coding systems, the retained coefficients

are selected based on maximum variance (zonal coding) or

based on maximum magnitude (threshold coding).

The overall process of truncating, quantizing, and coding the

coefficients of a transformed sub-image is called bit

allocation.

Zonal Coding

Transform coefficients with large variance carry most of the

information about the image. Hence a fraction of the

coefficients with the largest variance is retained

For each subimage,

(1) Compute the variance of each of the transform

coefficients; use the subimages to compute this.

(2) Keep X% of their coeff. which have maximum variance.

(3) Variable length coding (proportional to variance)

Threshold Coding

In each subimage, the transform coefficients of largest

magnitude contribute most significantly and are therefore

retained.

For each subimage:

(1)Arrange the transform coefficients in decreasing order of

magnitude .

(2) Keep only the top X% of the coefficients and discard rest.

(3) Encode the retained coefficient using variable length

code.

Threshold Coding

There are three basic ways to threshold a transformed

subimage:

1. A single global threshold applied to all subimages

– the compression level differs from image to image;

2. Individual thresholds can be used for each

subimage( N-largest )the same number of coefficients

is discarded from each subimage – constant code rate;

3. The threshold can be varied as a function of the

location of each coefficient within the subimage –

variable code rate. Thresholding and quantization are

combined.

Threshold Coding

Example

This array is a typical normalization array, which has

been used in the JPEG standardization efforts.

JPEG Standard

JPEG is the first image compression standard

Acronym for Joint Photographic Experts Group

Goal of the standard is to support a variety of

applications for compression of continuous-tone

still images of most image sizes in any color

space in order to achieve compression

performance at or near the state-of-the-art with

user-adjustable compression ratios and with very

good to excellent reconstructed quality

JPEG Compression

JPEG uses DCT for handling interpixel redundancy.

It defines three different coding systems:

1. A lossy baseline coding system based on DCT

(adequate for most compression applications)

2. An extended coding system for greater compression, higher precision, or progressive reconstruction applications

3. A lossless independent coding system for reversible compression

JPEG Compression (Sequential DCT-based encoding)

1. Divide the image into 8x8 subimages;

For each subimage do:

2. Shift the gray-levels in the range [-128, 127]

3. Apply DCT (64 coefficients will be obtained: 1 DC

coefficient F(0,0), 63 AC coefficients F(u,v)).

4. Quantize the coefficients (i.e., reduce the amplitude of

coefficients that do not contribute a lot).

5. Order the coefficients using zig-zag ordering

- Place non-zero coefficients first

- Create long runs of zeros (i.e., good for run-length

encoding)

JPEG Steps

JPEG Steps (cont’d)

6. Encode coefficients as follows:

Example: Implementing the JPEG

Baseline Coding System *

Example: Level Shifting

Example: Computing the DCT

Example: The Quantization Matrix

Example: Quantization

Zig-Zag Scanning of the Coefficients

Example: Coding the Coefficients

The DC coefficient is coded (difference between the DC coefficient of the

previous block and current block)

Ex. -26-(-17)=-9

The AC coefficients are mapped to runlength pairs

– (0,-3) (0,1) ……………………..(5,-1),(0,-1), EOB

These are then Huffman coded (codes are specified in the JPEG scheme)

Example: Decoding the Coefficients

Example: Denormalization

Example: IDCT

Example: Shifting Back the Coefficients

F(x,y)-F^(x,y)

F^(x,y)

image compression (chapter 8)site.iugaza.edu.ps/rsalamah/files/2012/02/chapter_8.pdfa reversible...

Documents