Source: site.iugaza.edu.ps/rsalamah/files/2012/02/chapter_8.pdf
Image Compression (Chapter 8)
Introduction
The goal of image compression is to reduce the
amount of data required to represent a digital image.
Important for reducing storage requirements and
improving transmission rates.
Approaches
Lossless – Information preserving
– Low compression ratios
– e.g., Huffman
Lossy – Does not preserve information
– High compression ratios
– e.g., JPEG
Tradeoff: image quality vs compression ratio
Data vs Information
Data and information are not synonymous terms!
Data is the means by which information is conveyed.
Data compression aims to reduce the amount of data
required to represent a given quantity of information
while preserving as much information as possible.
Data vs Information (cont’d)
The same amount of information can be represented
by various amounts of data, e.g.:
Ex1: Your wife, Helen, will meet you at Logan Airport
in Boston at 5 minutes past 6:00 pm tomorrow night.
Ex2: Your wife will meet you at Logan Airport at 5
minutes past 6:00 pm tomorrow night.
Ex3: Helen will meet you at Logan at 6:00 pm
tomorrow night.
Data Redundancy
Data redundancy is a mathematically quantifiable entity.
If n1 and n2 denote the number of information-carrying units
in two data sets that represent the same information, the
compression ratio is C = n1/n2 and the relative data
redundancy of the first set is R = 1 - 1/C.
Data Redundancy (cont’d)
Example: if C = 10 (a 10:1 compression), then R = 0.9,
i.e., 90% of the data in the first set is redundant.
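These definitions are easy to check numerically; a minimal sketch, where the byte counts are hypothetical:

```python
# Compression ratio and relative redundancy (standard definitions).
# The data sizes used below are hypothetical, for illustration only.

def compression_ratio(n1: int, n2: int) -> float:
    """C = n1/n2: size of the original vs. the compressed representation."""
    return n1 / n2

def relative_redundancy(n1: int, n2: int) -> float:
    """R = 1 - 1/C: the fraction of the first data set that is redundant."""
    return 1.0 - n2 / n1

C = compression_ratio(1_000_000, 100_000)   # hypothetical: 1 MB -> 100 KB
R = relative_redundancy(1_000_000, 100_000)
print(f"C = {C}:1, R = {R:.0%}")            # 10:1 compression, 90% redundant
```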
Types of Data Redundancy
1. Coding redundancy
2. Interpixel redundancy
3. Psychovisual redundancy
The role of compression is to reduce one or more of
these redundancy types.
Coding Redundancy
Data compression can be achieved using an
appropriate encoding scheme.
Example: binary encoding
Encoding Schemes
Elements of an encoding scheme:
– Code: a list of symbols (letters, numbers, bits etc.)
– Code word: a sequence of symbols used to represent a piece
of information or an event (e.g., gray levels)
– Code word length: number of symbols in each code word
Definitions
The average number of bits required to represent each pixel is
L_avg = Σ l(r_k) p_r(r_k), summed over all gray levels r_k,
where l(r_k) is the length of the code word assigned to r_k
and p_r(r_k) is its probability.
Example: variable length coding
Interpixel redundancy
This type of redundancy – sometimes called spatial
redundancy, interframe redundancy, or geometric
redundancy – exploits the fact that an image very
often contains strongly correlated pixels, in other
words, large regions whose pixel values are the same
or almost the same.
Interpixel redundancy
Interpixel redundancy implies that any pixel value can be
reasonably predicted by its neighbors (i.e., correlated).
Interpixel redundancy
This redundancy can be exploited in several ways, one
of which is by predicting a pixel value based on the
values of its neighboring pixels.
In order to do so, the original 2-D array of pixels is
usually mapped into a different format, e.g., an array
of differences between adjacent pixels.
If the original image pixels can be reconstructed from
the transformed data set the mapping is said to be
reversible.
Example: Run length coding (reversible)
For line 100:
(1, 63), (0, 87), (1, 37), (0, 5), (1, 4), (0, 556), (1, 62), (0, 210)
Note that 2^1 = 2 and 2^10 = 1024, so 1 bit encodes the run value and
10 bits encode the run length ⇒ 1 + 10 = 11 bits per run
Suppose that there are 12,166 runs in total
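The reversible mapping can be sketched in plain Python, using the first runs of the example line above as a check:

```python
# Run-length coding as a reversible mapping (plain-Python sketch).

def rle_encode(line):
    """Map a sequence of 0s and 1s to (value, run_length) pairs."""
    runs = []
    for pixel in line:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([pixel, 1])   # start a new run
    return [tuple(r) for r in runs]

def rle_decode(runs):
    """Invert rle_encode by expanding each (value, length) pair."""
    return [v for v, n in runs for _ in range(n)]

# First three runs of the example line above.
line = [1] * 63 + [0] * 87 + [1] * 37
print(rle_encode(line))               # [(1, 63), (0, 87), (1, 37)]
```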
Psychovisual redundancy
Takes advantage of the peculiarities of the human visual
system.
The eye does not respond with equal sensitivity to all visual
information.
Humans search for important features (e.g., edges, texture,
etc.) and do not perform quantitative analysis of every pixel
in the image.
Elimination of psychovisual redundancy results in loss
of quantitative information.
Quantization: mapping of a broad range of input values to a
limited number of output values.
Results in lossy data compression
Psychovisual redundancy (cont’d)
Figure: 256 gray levels vs. 16 gray levels (8 bits/pixel reduced to 4,
i.e., 8/4 = 2:1 compression) vs. improved gray-scale (IGS) quantization
to 16 gray levels. IGS adds a pseudo-random number to each pixel prior
to quantization.
Fidelity Criteria
How close is the reconstruction f^(x,y) to the original f(x,y)?
Criteria
– Subjective: based on human observers
– Objective: mathematically defined criteria
Subjective Fidelity Criteria
Measuring image quality by subjective evaluations of a
human observer.
Objective Fidelity Criteria
The level of information loss expressed as a function of the
original image f(x,y) and the compressed-then-decompressed
image f^(x,y).
Root mean square error (RMS):
e_rms = sqrt( (1/MN) Σ_x Σ_y [ f^(x,y) - f(x,y) ]^2 )
Mean-square signal-to-noise ratio (SNR):
SNR_ms = Σ_x Σ_y f^(x,y)^2 / Σ_x Σ_y [ f^(x,y) - f(x,y) ]^2
Example
Figure: original image and three reconstructions with RMS = 5.17, 15.67, and 20.17.
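The RMS criterion can be computed directly; a minimal sketch for images stored as nested lists (the sample pixel values are made up):

```python
import math

# RMS error between an original and a reconstructed image, matching the
# objective fidelity criterion above. Images are nested lists; the sample
# pixel values are made up for illustration.

def rms_error(f, f_hat):
    """e_rms = sqrt((1/MN) * sum of squared pixel differences)."""
    m, n = len(f), len(f[0])
    total = sum((f_hat[x][y] - f[x][y]) ** 2
                for x in range(m) for y in range(n))
    return math.sqrt(total / (m * n))

original = [[10, 20], [30, 40]]
decoded  = [[12, 18], [30, 44]]
print(rms_error(original, decoded))   # sqrt((4 + 4 + 0 + 16) / 4) ≈ 2.449
```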
Image Compression Model
Source encoder: Removes redundant data
Channel encoder:
– Source-encoded data contains only the essential
information and is therefore sensitive to transmission errors.
– The channel encoder increases the noise immunity of the
source-encoded data when the channel is noisy.
– It reduces the impact of errors by adding controlled
redundancy to the source-encoded data (e.g., Hamming codes).
– When the channel is noise free, this stage is omitted.
Source Encoder
Mapper: reduces interpixel redundancy
• Run length coding: reversible
• Calculation of DCT: reversible
Quantizer: reduces psychovisual redundancy
• Reduction of grey scales: not reversible
Symbol encoder: reduces coding redundancy
• Variable length coding: reversible
Decoder
The inverse operations are performed.
But … quantization is irreversible in general.
How do we measure information?
What is the information content of a message/image?
What is the minimum amount of data that is sufficient
to describe completely an image without loss of
information?
Answers are provided by information theory.
Modeling the Information Generation Process
Assume that information generation process is a
probabilistic process.
A random event E which occurs with probability P(E)
contains I(E) = log(1/P(E)) = -log P(E) units of
self-information.
The base for the logarithm depends on the units for
measuring information. Usually, we use base 2, which
gives the information in units of “binary digits” or “bits.”
The amount of self-information attributed to event E
is inversely related to the probability of E.
Suppose that the gray level value of a pixel is
generated by a random variable; then gray level r_k
contains I(r_k) = -log p(r_k) units of information.
Entropy: the average information content of each
pixel in an image. As H increases, more information
is associated with the image.
Average information (entropy) of an image:
H = -Σ_{k=0}^{L-1} p_r(r_k) log2 p_r(r_k)   bits/pixel
Redundancy:
R = 1 - 1/C
where C = L_avg / H, the ratio of the average number of
bits per pixel actually used to the entropy.
Entropy Estimation
First-order estimate of H: estimate p_r(r_k) from the
image histogram and compute
H^ = -Σ_{k=0}^{L-1} p_r(r_k) log2 p_r(r_k)
The first-order estimate gives only a lower bound on the
compression that can be achieved, since it ignores
interpixel correlation.
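The first-order estimate follows directly from the histogram; a minimal sketch:

```python
import math
from collections import Counter

# First-order entropy estimate from the gray-level histogram:
# H = -sum p(r_k) * log2 p(r_k), in bits/pixel.

def first_order_entropy(pixels):
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A 4x4 image with two equally likely gray levels gives 1 bit/pixel.
img = [39, 39, 126, 126] * 4
print(first_order_entropy(img))   # 1.0
```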
Lossless Compression
Error-free compression techniques generally
composed of:
1. Devising an alternative representation (mapping) of the
image to reduce its interpixel redundancy.
2. Coding the representation to eliminate coding
redundancy.
– Huffman, Golomb, Arithmetic: attack coding redundancy
– LZW, Run-length, Symbol-based, Bit-plane: attack
interpixel redundancy
Huffman Coding
It is a variable-length coding technique.
Most probable symbol is assigned the shortest code word.
Optimal code: minimizes the number of code symbols per source symbol.
Huffman Coding (cont’d)
Forward Pass
1. Sort probabilities per symbol from top to bottom.
2. Combine the lowest two probabilities.
3. Repeat Step 2 until only two probabilities remain.
Huffman Coding (cont’d)
Backward Pass
Assign code symbols going backwards
Huffman Coding (cont’d)
Lavg using Huffman coding:
Lavg assuming binary codes:
Example
Huffman Coding: Properties
The resulting code is called a Huffman code. It
has some interesting properties:
(1) The source symbols can be encoded (and decoded)
one at a time (with a lookup table).
(2) It is called a block code because each source symbol
is mapped into a fixed sequence of code symbols.
(3) It is instantaneous because each codeword in a
string of code symbols can be decoded without
referencing succeeding symbols.
(4) It is uniquely decodable because any string of
code symbols can be decoded in only one way.
Huffman Coding: Properties
Disadvantage: For a source with J symbols, we need J - 2
source reductions. This can be computationally intensive for
large J (ex. J = 256 for an image with 256 gray levels).
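The forward-pass reductions and backward-pass code assignment can be sketched with a min-heap, which performs the repeated "merge the two lowest probabilities" step without explicit re-sorting. The symbol probabilities below are illustrative, not taken from the slides:

```python
import heapq
import itertools

# Huffman coding via a min-heap: repeatedly merge the two least probable
# nodes (the "source reductions"), prefixing one bit at each merge.

def huffman_codes(probabilities):
    """Return {symbol: binary code word} for a {symbol: probability} dict."""
    tiebreak = itertools.count()          # avoids comparing dicts on ties
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # lowest probability
        p2, _, c2 = heapq.heappop(heap)   # second lowest
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

probs = {"a": 0.4, "b": 0.3, "c": 0.1, "d": 0.1, "e": 0.06, "f": 0.04}
codes = huffman_codes(probs)
l_avg = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, l_avg)   # most probable symbol gets the shortest code word
```

The resulting code is prefix-free (instantaneous) and, for this distribution, achieves an average length of 2.2 bits/symbol versus 3 bits for a fixed-length code.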
Near Optimal Variable Length Codes
LZW Coding
Also attacks interpixel redundancy.
Assigns fixed-length code words to variable-length
sequences of source symbols.
Requires no a priori knowledge of probability of
occurrence of symbols
Integrated into GIF, TIFF and PDF file formats
LZW Coding
A codebook or a dictionary has to be constructed.
– Single pixel values and blocks of pixel values
For an 8-bit image, the first 256 entries are assigned
to the gray levels 0,1,2,..,255.
As the encoder examines image pixels, gray level
sequences (i.e., pixel combinations) that are not in the
dictionary are assigned to a new entry.
Example
Consider the following 4 x 4, 8-bit image:
39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126
Initial Dictionary:
Location   Entry
0          0
1          1
...        ...
255        255
256        -
511        -
Example
39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126
- Is 39 in the dictionary? Yes.
- What about 39-39? No.
- Then add 39-39 at entry 256.
Location   Entry
0          0
1          1
...        ...
255        255
256        39-39
511        -
Example
39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126
Let CR be the currently recognized sequence, P the next
pixel, and CS = (CR)(P) their concatenated sequence.
If CS is found in the dictionary D:
(1) no output
(2) CR = CS
(3) no entry added to D
If CS is not found:
(1) output D(CR), the code for CR
(2) add CS to D
(3) CR = P
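The CR/P/CS rules above translate almost line for line into code; a sketch using tuples of pixel values as dictionary keys:

```python
# LZW encoding sketch: dictionary keys are tuples of pixel values, and
# entries 0..255 hold the single gray levels of an 8-bit image.

def lzw_encode(pixels):
    dictionary = {(v,): v for v in range(256)}
    next_code = 256
    output, cr = [], ()
    for p in pixels:
        cs = cr + (p,)
        if cs in dictionary:              # CS found: keep growing CR
            cr = cs
        else:                             # CS not found: emit D(CR), add CS
            output.append(dictionary[cr])
            dictionary[cs] = next_code
            next_code += 1
            cr = (p,)
    if cr:
        output.append(dictionary[cr])     # flush the final sequence
    return output

image = [39, 39, 126, 126] * 4            # the 4 x 4 image, row by row
print(lzw_encode(image))
# [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
```

Note that the 16-pixel image is emitted as only 10 code words, matching the code stream decoded on the next slides.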
Decoding LZW:
Let the bit stream received be:
39 39 126 126 256 258 260 259 257 126
In LZW, the dictionary which was used for
encoding need not be sent with the image. A
separate dictionary is built by the decoder, as it
reads the received code words.
Decoding LZW
LZW Decoding Example
(1)output the dictionary entry for the pixel value(s)
(2) add a new dictionary entry whose content is the RS plus the 1rst
element of the encoded value being processed
(3) set the RS to the en-coded value being processed.
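The three decoding steps can likewise be sketched; the one subtlety is a received code not yet in the dictionary, which can only stand for RS plus its own first element:

```python
# LZW decoding sketch: the decoder rebuilds the encoder's dictionary from
# the code stream alone, following the three steps above.

def lzw_decode(codes):
    dictionary = {v: (v,) for v in range(256)}
    next_code = 256
    rs = dictionary[codes[0]]             # recognized sequence
    output = list(rs)
    for code in codes[1:]:
        # A not-yet-defined code must be RS plus its own first element.
        entry = dictionary.get(code, rs + (rs[0],))
        output.extend(entry)                      # step (1)
        dictionary[next_code] = rs + (entry[0],)  # step (2)
        next_code += 1
        rs = entry                                # step (3)
    return output

stream = [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
print(lzw_decode(stream))   # the original 4 x 4 image, row by row
```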
Bit-plane Coding
Attacks interpixel redundancy.
First decompose the original image into bit planes
(a series of binary images).
Compress each binary image via a
binary compression method
– Run-length coding (RLC)
Bit-plane Decomposition
Bit-plane slicing
Problem: small changes in gray level can have a
significant impact on the complexity of the bit
planes:
– 127 vs. 128: 0111 1111 vs. 1000 0000 (all 8 bits change)
Solution: Gray code
Example (binary → Gray):
– 127: 0111 1111 → 0100 0000
– 128: 1000 0000 → 1100 0000
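The binary-reflected Gray code used here is g = b XOR (b >> 1); a small sketch with its inverse:

```python
# Binary-reflected Gray code: g = b XOR (b >> 1). Successive gray levels
# then differ in exactly one bit, which keeps the bit planes simpler.

def to_gray(b: int) -> int:
    return b ^ (b >> 1)

def from_gray(g: int) -> int:
    b = 0
    while g:              # undo the XOR cascade: b = g ^ (g>>1) ^ (g>>2) ^ ...
        b ^= g
        g >>= 1
    return b

print(f"{to_gray(127):08b}")   # 01000000
print(f"{to_gray(128):08b}")   # 11000000
```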
Run-Length Coding
This technique is very effective in encoding binary images
with large contiguous black and white regions, which
would give rise to a small number of large runs of 1s and
0s.
The run-lengths can in turn be encoded using a variable
length code (ex. Huffman code), for further compression.
Binary Image Compression - RLC
Developed in the 1950s
Standard compression approach in FAX coding
Approach
– Code each contiguous group of 0’s or 1’s encountered in a left to right scan of a row by its length
1 1 1 1 1 0 0 0 0 0 0 1 → (1, 5) (0, 6) (1, 1)
– Establish a convention for determining the value of the run
– Code black and white run lengths separately using variable-length coding
A Complete Example
Original image (2 x 3, 8 bits/pixel):
2 127 128
3 32 33
Binary code:
0000 0010   0111 1111   1000 0000
0000 0011   0010 0000   0010 0001
XOR (Gray-coded) binary:
0000 0011   0100 0000   1100 0000
0000 0010   0011 0000   0011 0001
The Gray-coded image is decomposed into 8 bit planes; each bit
plane is run-length coded, and the run lengths are in turn
Huffman coded to produce the final code.
Lossy Compression
Spatial domain methods
– Lossy predictive coding (not discussed)
Transform coding
– Transform the image into a domain where
compression can be performed more efficiently
– Operate on the transformed image
Transform Coding
A reversible linear transform is used to map the image
into a set of coefficients, which are then be quantized and
coded.
For most natural images, a significant number of coef.
Have small magnitudes and can be quantized or discarded
with little image distortion...
Transform Coding
The goal of the transformation process is to pack as much
information as possible into the smallest number of
transform coefficients.
The quantization stage then selectively eliminates
(quantizes) the coefficients that carry the least
information.
The symbol encoder uses a compression coding (normally
variable length code) to represent the quantized
coefficients.
Transform Selection
The transform of an image can be computed using
various transformations, for example:
– DFT
– DCT (Discrete Cosine Transform)
– WHT (Walsh-Hadamard Transform)
– DWT
Discrete Fourier Transform
Due to its computational efficiency the DFT is very popular;
however, it has strong disadvantages for some applications:
– It is complex-valued
– It has poor energy compaction
Energy compaction:
– The ability to pack the energy of the spatial sequence into as
few frequency coefficients as possible.
– Most of the signal information tends to be concentrated in a few
low-frequency components (of the DCT, for example).
– This is very important for image compression: if compaction is
high, we only have to transmit a few coefficients instead of the
whole set of pixels.
Discrete Fourier Transform
Figure: amplitude spectra of the image above under the DFT and the DCT.
Discrete Cosine Transform
Forward transform of an N x N subimage f(x,y):
C(u,v) = α(u) α(v) Σ_{x=0}^{N-1} Σ_{y=0}^{N-1} f(x,y) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]
Inverse:
f(x,y) = Σ_{u=0}^{N-1} Σ_{v=0}^{N-1} α(u) α(v) C(u,v) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]
where α(v) = sqrt(1/N) if v = 0, and α(v) = sqrt(2/N) if v > 0.
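The forward equation can be evaluated directly; a minimal O(N^4) sketch (real coders use fast factorizations instead):

```python
import math

# Direct evaluation of the forward 2-D DCT; O(N^4), fine for 8 x 8
# subimages but only a sketch, not an optimized implementation.

def alpha(u, n):
    return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)

def dct2(f):
    n = len(f)
    t = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(f[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            t[u][v] = alpha(u, n) * alpha(v, n) * s
    return t

# A constant 8 x 8 block packs all of its energy into the DC coefficient.
coeffs = dct2([[10.0] * 8 for _ in range(8)])
print(round(coeffs[0][0], 6))   # 80.0; every other coefficient is ~0
```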
Energy Compaction
Figure: 8 x 8 subimages, 64 coefficients per subimage;
reconstruction after 50% of the coefficients are truncated.
Subimage size selection
Usually, images are subdivided so that the correlation
(redundancy) between adjacent subimages is reduced
and so that the subimage size n = 2m for computational
efficiency.
In general, both the level of compression and
computational complexity increase with subimage size.
The most popular subimage sizes are 8x8 and 16x16.
Subimage size selection
Approximation of the original image using 25%
of the DCT coefficients with subimage sizes 2x2,
4x4 and 8x8
Bit Allocation
The reconstruction error is a function of the relative
importance of the transform coefficients that are discarded
and of the precision used to represent the retained
coefficients.
In most transform coding systems, the retained coefficients
are selected based on maximum variance (zonal coding) or
based on maximum magnitude (threshold coding).
The overall process of truncating, quantizing, and coding the
coefficients of a transformed sub-image is called bit
allocation.
Zonal Coding
Transform coefficients with large variance carry most of the
information about the image. Hence a fraction of the
coefficients with the largest variance is retained
For each subimage,
(1) Compute the variance of each of the transform
coefficients; use the subimages to compute this.
(2) Keep X% of their coeff. which have maximum variance.
(3) Variable length coding (proportional to variance)
Threshold Coding
In each subimage, the transform coefficients of largest
magnitude contribute most significantly and are therefore
retained.
For each subimage:
(1) Arrange the transform coefficients in decreasing order of
magnitude.
(2) Keep only the top X% of the coefficients and discard the rest.
(3) Encode the retained coefficient using variable length
code.
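Steps (1) and (2) can be sketched as follows; the coefficient values are made up, and ties at the cutoff magnitude may keep a few extra coefficients:

```python
# Threshold coding sketch: keep the top X% of a subimage's transform
# coefficients by magnitude, zeroing the rest.

def threshold_coefficients(t, keep_fraction):
    magnitudes = sorted((abs(c) for row in t for c in row), reverse=True)
    n_keep = max(1, round(keep_fraction * len(magnitudes)))
    cutoff = magnitudes[n_keep - 1]       # magnitude of the n_keep-th largest
    return [[c if abs(c) >= cutoff else 0 for c in row] for row in t]

block = [[90, 3, -1, 0],
         [12, -2, 0, 0],
         [-8, 1, 0, 0],
         [2, 0, 0, 0]]
kept = threshold_coefficients(block, 0.25)   # retain the largest 4 of 16
print(kept)
```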
Threshold Coding
There are three basic ways to threshold a transformed
subimage:
1. A single global threshold applied to all subimages
– the compression level differs from image to image;
2. Individual thresholds can be used for each subimage
(N-largest coding): the same number of coefficients
is discarded from each subimage – constant code rate;
3. The threshold can be varied as a function of the
location of each coefficient within the subimage –
variable code rate; thresholding and quantization are
combined.
Threshold Coding
Example
This array is a typical normalization (quantization) array,
which has been used in the JPEG standardization efforts:
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99
JPEG Standard
JPEG is the first image compression standard
Acronym for Joint Photographic Experts Group
The goal of the standard is to support a variety of
applications for compression of continuous-tone
still images – of most image sizes, in any color
space – achieving compression performance at or
near the state of the art, with user-adjustable
compression ratios and very good to excellent
reconstructed quality.
JPEG Compression
JPEG uses DCT for handling interpixel redundancy.
It defines three different coding systems:
1. A lossy baseline coding system based on DCT
(adequate for most compression applications)
2. An extended coding system for greater compression, higher precision, or progressive reconstruction applications
3. A lossless independent coding system for reversible compression
JPEG Compression (Sequential DCT-based encoding)
1. Divide the image into 8x8 subimages;
For each subimage do:
2. Shift the gray levels to the range [-128, 127];
3. Apply DCT (64 coefficients will be obtained: 1 DC
coefficient F(0,0), 63 AC coefficients F(u,v)).
4. Quantize the coefficients (i.e., reduce the amplitude of
coefficients that do not contribute a lot).
5. Order the coefficients using zig-zag ordering
- Place non-zero coefficients first
- Create long runs of zeros (i.e., good for run-length
encoding)
JPEG Steps
JPEG Steps (cont’d)
6. Encode coefficients as follows:
Example: Implementing the JPEG
Baseline Coding System *
Example: Level Shifting
Example: Computing the DCT
Example: The Quantization Matrix
Example: Quantization
Zig-Zag Scanning of the Coefficients
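The scan pattern can be generated programmatically; a sketch that walks the anti-diagonals of the block, alternating direction:

```python
# Zig-zag ordering (step 5 above): traverse the anti-diagonals u+v = d of
# an N x N block, alternating direction, so low-frequency coefficients
# come first and trailing zeros form long runs.

def zigzag(block):
    n = len(block)
    order = []
    for d in range(2 * n - 1):
        cells = [(u, d - u) for u in range(n) if 0 <= d - u < n]
        if d % 2 == 0:
            cells.reverse()           # even diagonals are walked upward
        order.extend(block[u][v] for u, v in cells)
    return order

# The values 1..9 are placed so that the zig-zag scan reads them in order.
block = [[1, 2, 6],
         [3, 5, 7],
         [4, 8, 9]]
print(zigzag(block))                  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```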
JPEG
Example: Coding the Coefficients
The DC coefficient is coded differentially, as the difference between
the DC coefficient of the current block and that of the previous block.
Ex. -26 - (-17) = -9
The AC coefficients are mapped to run-length pairs:
– (0,-3) (0,1) …………………….. (5,-1), (0,-1), EOB
These are then Huffman coded (the code tables are specified in the
JPEG standard).
Example: Decoding the Coefficients
Example: Denormalization
Example: IDCT
Example: Shifting Back the Coefficients
Figure: the reconstruction F^(x,y) and the error image F(x,y) - F^(x,y).