CT50A6100 Machine Vision and DIA
Prof. Heikki Kälviäinen, Prof. P. Fränti, Dr. A. Kolesnikov
Lectures 5: Image Compression
Professor Heikki Kälviäinen
Machine Vision and Pattern Recognition Laboratory Department of Information Technology
Faculty of Technology Management
Lappeenranta University of Technology (LUT)
[email protected]
http://www.lut.fi/~kalviai
http://www.it.lut.fi/ip/research/mvpr/
Content
• Introduction.
• Fundamentals in data compression.
• Binary image compression.
• Continuous tone images.
• Video image compression.
• Material from the following site is used: http://cs.joensuu.fi/pages/franti/imagecomp/
Special thanks to the authors of the material, Prof. Pasi Fränti and Dr. Alexander Kolesnikov from the University of Joensuu, Finland.
Introduction
• Why do we need to compress images?
• Image types.
• Parameters of digital images.
• Lossless vs. lossy compression.
• Measures: rate, distortion, etc.
What is data and image compression?
• Data compression is the art and science of representing information in a compact form.
• Data is a sequence of symbols taken from a discrete alphabet.
• Still image data is a collection of 2-D arrays (one for each color plane) of values representing the intensity (color) of the point at the corresponding spatial location (pixel).
Why do we need image compression?
Still image:
• One page of A4 format at 600 dpi is > 100 MB.
• One color image in a digital camera generates 10-30 MB.
• A scanned 3” × 7” photograph at 300 dpi is 30 MB.
Digital cinema:
• 4K × 2K × 3 × 12 bits/pel = 48 MB/frame, or 1 GB/sec, or 70 GB/min.
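Raw image sizes like these come from multiplying the pixel count by the number of bytes per pixel. The sketch below is only an illustration under stated assumptions: the A4 page is taken as 8.27 × 11.69 inches with 24-bit color, and the 12-bit cinema samples are assumed to be stored in 16-bit words. Neither assumption is stated on the slide, and the helper name `raw_size_bytes` is hypothetical.

```python
def raw_size_bytes(width_px, height_px, components, bytes_per_sample):
    """Uncompressed size of an image in bytes."""
    return width_px * height_px * components * bytes_per_sample

# A4 page (assumed 8.27 x 11.69 inches) at 600 dpi, 3 color components, 1 byte each.
a4 = raw_size_bytes(round(8.27 * 600), round(11.69 * 600), 3, 1)

# Digital cinema frame: 4096 x 2048 pixels, 3 components,
# 12-bit samples assumed to occupy 2 bytes each.
cinema = raw_size_bytes(4096, 2048, 3, 2)

print(a4 / 1e6)        # size in MB (10^6 bytes)
print(cinema / 2**20)  # size in MiB (2^20 bytes)
```

Under these assumptions the A4 page comes out slightly above 100 MB and the cinema frame at 48 MiB, in line with the slide.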
Why do we need image compression? (cont.)
1) Storage.
2) Transmission.
3) Data access.
1990-2000: disc capacities grew from 100 MB to 20 GB (200 times!), but seek time only improved from 15 milliseconds to 10 milliseconds, and transfer rate from 1 MB/sec to 2 MB/sec.
Compression improves overall response time in some applications.
Source of images
• Image scanner.
• Digital camera.
• Video camera.
• Ultrasound (US), computed tomography (CT), magnetic resonance imaging (MRI), digital X-ray (XR), infrared.
• Etc.
Image types
[Diagram labels: image compression, universal compression; video images, gray-scale images, true colour images, binary images, colour palette images, textual data.]
Why do we need special algorithms for images?
Binary image: 1 bit/pixel
Grayscale image: 8 bits/pixel
Intensity = 0-255
Parameters of digital images
Bit depth: 6 bits (64 gray levels), 4 bits (16 gray levels), 2 bits (4 gray levels).
Resolution: 384 × 256, 192 × 128, 96 × 64, 48 × 32.
True color image: 3*8 bits/pixel
RGB color space
Red Green Blue
YUV color space
Y U V
RGB → YUV
R, G, B: red, green, blue.
Y: the luminance.
U, V: the chrominance components.
Most of the information is concentrated in the Y component, while the information content in U and V is less.

$$Y = 0.3R + 0.6G + 0.1B$$
$$U = B - Y$$
$$V = R - Y$$
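The simplified transform on this slide (Y = 0.3R + 0.6G + 0.1B, U = B − Y, V = R − Y) can be sketched in a few lines. The function names are hypothetical, and the inverse is derived here by solving the three equations for R, G and B; it is not given on the slide.

```python
def rgb_to_yuv(r, g, b):
    """Simplified transform: Y = 0.3R + 0.6G + 0.1B, U = B - Y, V = R - Y."""
    y = 0.3 * r + 0.6 * g + 0.1 * b
    return y, b - y, r - y

def yuv_to_rgb(y, u, v):
    """Inverse: recover B and R from the differences, then solve the Y equation for G."""
    b = u + y
    r = v + y
    g = (y - 0.3 * r - 0.1 * b) / 0.6
    return r, g, b

# For a gray pixel (R = G = B) the chrominance components are (close to) zero.
print(rgb_to_yuv(128, 128, 128))
```

A gray pixel carries all of its information in Y, which illustrates why U and V usually contain less information than Y.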
Palette color image
Each pixel of the image stores an index (for example 42, 98, 55, 19) into a look-up table (LUT) with entries 0-255. Each entry holds an R, G, B triple.
[R,G,B] = LUT[Index]
Example: [64,64,0] = LUT[98]
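The look-up step [R,G,B] = LUT[Index] can be illustrated as follows. This is a minimal sketch: only the entry LUT[98] = [64,64,0] comes from the slide; the all-black default palette and the 2 × 2 arrangement of the index values are assumptions made for the example.

```python
# Hypothetical 256-entry palette; every entry is an (R, G, B) triple.
lut = [(0, 0, 0)] * 256
lut[98] = (64, 64, 0)  # the entry shown on the slide

# A tiny index image; the 2 x 2 layout is assumed for illustration.
index_image = [[42, 98],
               [55, 19]]

# Replace every index by its palette color.
rgb_image = [[lut[i] for i in row] for row in index_image]
print(rgb_image[0][1])
```

The image itself needs only 8 bits per pixel; the full color information lives in the table.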
Multicomponent image: n*8 bits/pixel
Spectral image: n components according to wavelengths.
Three components R, G, B => “usual” color image.
Multicomponent image: n*8 bits/pixel (cont.)
Spectral components and spatial components.
For example, remote sensing (satellite images).
Why can we compress images?
Statistical redundancy:
1) Spatial correlation
a) Local: pixels at neighboring locations have similar intensities.
b) Global: reoccurring patterns.
2) Spectral correlation – between color planes.
3) Temporal correlation – between consecutive frames.
Tolerance to fidelity (reproduction accuracy):
1) Perceptual redundancy.
2) Limitation of rendering hardware.
Lossy vs. lossless compression
Lossless compression: reversible, information preserving. Used for text compression algorithms, binary images, palette images.
Lossy compression: irreversible. Used for grayscale, color, video.
Near-lossless compression: medical imaging, remote sensing.
1) Why do we need lossy compression?
2) When can we use lossy compression?
Lossy vs. lossless compression (cont.)
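[Bar chart, axis 0 % to 100 %: compressed size as a percentage of the original.]

| Image | Type | Method | Size |
|---|---|---|---|
| CCITT-3 | binary | JBIG (lossless) | 4.3 % |
| LENA | gray-scale | JPEG (lossless) | 53.4 % |
| LENA | gray-scale | JPEG (lossy) | 6.7 % |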
What measures?
• Bit rate: How much per pixel?
• Compression ratio: How much smaller?
• Computation time: How fast?
• Distortion: How much error in the presentation?
Rate measures
Bit rate:
$$\frac{C}{N} = \frac{\text{size of the compressed file}}{\text{pixels in the image}} \quad \text{bits/pixel}$$
Compression ratio:
$$\frac{kN}{C} = \frac{\text{size of the original file}}{\text{size of the compressed file}}$$
k = the number of bits per pixel in the original image
C/N = the bit rate of the compressed image
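The two rate measures, bit rate C/N and compression ratio kN/C, can be computed directly. The sketch below assumes C is given in bits; the function names and the example image (512 × 512 pixels, 8 bits per pixel, compressed to 16384 bytes) are hypothetical.

```python
def bit_rate(compressed_bits, num_pixels):
    """C / N: bits per pixel of the compressed image."""
    return compressed_bits / num_pixels

def compression_ratio(bits_per_pixel, num_pixels, compressed_bits):
    """kN / C: size of the original file divided by size of the compressed file."""
    return bits_per_pixel * num_pixels / compressed_bits

N = 512 * 512   # pixels in the image
k = 8           # bits per pixel in the original image
C = 16384 * 8   # compressed size in bits

print(bit_rate(C, N))              # 0.5 bits/pixel
print(compression_ratio(k, N, C))  # 16.0
```

The two measures are tied together: the compression ratio equals k divided by the bit rate.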
Distortion measures
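Mean absolute error (MAE):
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} |y_i - x_i|$$
Mean square error (MSE):
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} (y_i - x_i)^2$$
Signal-to-noise ratio (SNR):
$$\mathrm{SNR} = 10\log_{10}\frac{\sigma^2}{\mathrm{MSE}} \quad \text{(decibels)}$$
Peak signal-to-noise ratio (PSNR):
$$\mathrm{PSNR} = 10\log_{10}\frac{A^2}{\mathrm{MSE}} \quad \text{(decibels)}$$
σ² is the variance of the original signal. A is the amplitude of the signal: A = 2^8 − 1 = 255 for an 8-bit signal.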
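The distortion measures can be sketched as follows, with x the original samples and y the reconstructed samples. The function names and the four sample values are made up for illustration; PSNR uses A = 255, the amplitude of an 8-bit signal.

```python
import math

def mae(x, y):
    """Mean absolute error between original x and reconstruction y."""
    return sum(abs(b - a) for a, b in zip(x, y)) / len(x)

def mse(x, y):
    """Mean square error between original x and reconstruction y."""
    return sum((b - a) ** 2 for a, b in zip(x, y)) / len(x)

def psnr(x, y, amplitude=255):
    """Peak signal-to-noise ratio in decibels; infinite for a perfect reconstruction."""
    err = mse(x, y)
    if err == 0:
        return math.inf
    return 10 * math.log10(amplitude ** 2 / err)

original = [100, 120, 140, 160]
decoded = [101, 118, 140, 163]

print(mae(original, decoded))   # 1.5
print(mse(original, decoded))   # 3.5
print(psnr(original, decoded))  # in decibels
```

Because MSE squares each difference, a few large errors lower PSNR more than many small ones with the same MAE.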
Other issues
• Coder and decoder computation complexity.
• Memory requirements.
• Fixed rate or variable rate.
• Error resilience (sensitivity).
• Symmetric or asymmetric.
• Decompress at multiple resolutions.
• Decompress at various bit rates.
• Standard or proprietary (application based).
Fundamentals in data compression
• Modeling and coding:
  – How, and in what order, is the image processed?
  – What are the symbols (pixels, blocks) to be coded?
  – What is the statistical model of these symbols?
• Requirement:
  – Uniquely decodable: different input => different output.
  – Instantaneously decodable: the symbol can be recognized after its last bit has been received.
Modeling: Segmentation and order of processing
• Segmentation:
  – Local (pixels) or global (fractal compression).
  – Compromise: block coding.
• Order of processing:
  – In what order are the blocks (or the pixels) processed?
  – In what order are the pixels inside the block processed?
Modeling: Order of processing
• Order of processing:
  – Row-major order: top-to-bottom, left-to-right.
  – Zigzag scanning:
    • Pixel-wise processing (a).
    • DCT-transformed block (Discrete Cosine Transform) (b).
  – Progressive modeling:
    • The quality of an image increases gradually as data are received.
    • For example in pyramid coding: first the low resolution version, then increasing the resolution.
[Figure: zigzag scanning orders (a) and (b).]
Modeling: Order of processing
[Figure: four image panels labeled 0.1 %, 0.5 %, 2.1 % and 8.3 %.]
Modeling: Statistical modeling
Set of symbols (alphabet) S = {s1, s2, …, sN}, where N is the number of symbols in the alphabet.
Probability distribution of the symbols: P = {p1, p2, …, pN}
According to Shannon, the entropy H of an information source S is defined as follows:
$$H = -\sum_{i=1}^{N} p_i \log_2(p_i)$$
Modeling: Statistical modeling
The amount of information in symbol s_i, i.e., the number of bits to code (the code length) for the symbol s_i:
$$H(s_i) = -\log_2(p_i)$$
The average number of bits for the source S:
$$H = -\sum_{i=1}^{N} p_i \log_2(p_i)$$
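Both quantities, the information content −log2(p_i) of one symbol and the entropy H of the source, are easy to evaluate numerically. The sketch below uses the five-symbol distribution from the Huffman example later in these slides (counts 15, 7, 6, 6, 5 out of 39); the function names are hypothetical.

```python
import math

def information(p):
    """Information content of a symbol with probability p, in bits."""
    return -math.log2(p)

def entropy(probs):
    """Average number of bits per symbol for the source (zero-probability symbols contribute nothing)."""
    return sum(p * information(p) for p in probs if p > 0)

counts = [15, 7, 6, 6, 5]
probs = [c / sum(counts) for c in counts]

print([round(information(p), 2) for p in probs])
print(round(entropy(probs), 2))
```

For this distribution the entropy evaluates to roughly 2.19 bits, the value of H quoted on the Huffman example slide.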
Modeling: Statistical modeling
• Modeling schemes:
  – Static modeling:
    • Static model (code table).
    • One-pass method: encoding.
    • ASCII data: p(‘e’) = 10 %, p(‘t’) = 8 %.
  – Semi-adaptive modeling:
    • Two-pass method: (1) analysis, (2) encoding.
  – Adaptive (or dynamic) modeling:
    • Symbol by symbol on-line adaptation during coding/decoding.
    • One-pass method: analysis and encoding.
Modeling: Statistical modeling
• Modeling schemes:
  – Context modeling:
    • Spatial dependencies between the pixels.
    • For example, what is the most probable symbol after a known sequence of symbols?
  – Predictive modeling (for coding prediction errors):
    • Prediction of the current pixel value.
    • Calculating the prediction error.
    • Modeling the error distribution.
    • Differential pulse code modulation (DPCM).
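Predictive modeling can be illustrated with the simplest possible predictor. The sketch below is an assumption-laden example, not the scheme of any particular standard: each pixel is predicted by the previous pixel in the row (the first pixel is predicted as 0), and the prediction errors are kept without quantization, so the scheme is lossless.

```python
def dpcm_encode(row):
    """Previous-pixel predictor: error = pixel - previous pixel."""
    errors, prev = [], 0
    for pixel in row:
        errors.append(pixel - prev)
        prev = pixel
    return errors

def dpcm_decode(errors):
    """Rebuild the row by adding each error to the previous reconstructed pixel."""
    row, prev = [], 0
    for e in errors:
        prev += e
        row.append(prev)
    return row

row = [100, 102, 103, 103, 101]
errors = dpcm_encode(row)
print(errors)               # [100, 2, 1, 0, -2]
print(dpcm_decode(errors))  # [100, 102, 103, 103, 101]
```

After the first sample the errors cluster around zero, so their distribution is much more skewed than that of the pixel values and is cheaper to entropy code.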
Coding: Huffman coding
INIT: Put all nodes in an OPEN list and keep it sorted at all times according to their probabilities.
REPEAT until only one node is left in OPEN:
a) From OPEN pick the two nodes having the lowest probabilities, and create a parent node for them.
b) Assign the sum of the children’s probabilities to the parent node and insert it into OPEN.
c) Assign codes 0 and 1 to the two branches of the tree, and delete the children from OPEN.
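The construction above can be sketched with a priority queue standing in for the sorted OPEN list. This is a minimal illustration rather than a reference implementation: it tracks only the code length of each symbol (the depth in the tree), since the actual 0/1 labels depend on how ties are broken. The function name is hypothetical; the counts are those of the example on the next slide.

```python
import heapq
from itertools import count

def huffman_code_lengths(weights):
    """Return {symbol: code length} for a dict of symbol weights (counts or probabilities)."""
    tie = count()  # tie-breaker so the heap never has to compare the dict payloads
    heap = [(w, next(tie), {sym: 0}) for sym, w in weights.items()]
    heapq.heapify(heap)  # plays the role of the sorted OPEN list
    while len(heap) > 1:
        # a) pick the two nodes with the lowest weights
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # c) every symbol below the new parent gets one more bit
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        # b) the parent carries the summed weight and goes back into OPEN
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]

counts = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
lengths = huffman_code_lengths(counts)
total_bits = sum(counts[s] * lengths[s] for s in counts)
print(lengths, total_bits)
```

For these counts the sketch gives one bit for A and three bits for each of B, C, D and E, for a total of 87 bits.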
Huffman Coding: Example
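| Symbol | p_i | −log2(p_i) | Code | Subtotal |
|---|---|---|---|---|
| A | 15/39 | 1.38 | 0 | 1*15 |
| B | 7/39 | 2.48 | 100 | 3*7 |
| C | 6/39 | 2.70 | 101 | 3*6 |
| D | 6/39 | 2.70 | 110 | 3*6 |
| E | 5/39 | 2.96 | 111 | 3*5 |

Total: 87 bits
H = 2.19 bits
L = 87/39 = 2.23 bits
[Figure: binary tree with branches labeled 0 and 1; leaf A on the 0 branch of the root, leaves B, C, D, E below the 1 branch.]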
Huffman Coding: Decoding
Code table:
A - 0
B - 100
C - 101
D - 110
E - 111
[Figure: the binary tree of the code.]
Bit stream: 1000100010101010110111 (22 bits)
Codes: 100 0 100 0 101 0 101 0 110 111
Message: B A B A C A C A D E
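Because the code is prefix-free, the decoder can emit a symbol as soon as the bits read so far match a codeword. The sketch below decodes the bit stream of this slide with the code table A = 0, B = 100, C = 101, D = 110, E = 111; the function name is hypothetical, and a dictionary look-up replaces the tree walk for brevity.

```python
CODES = {"A": "0", "B": "100", "C": "101", "D": "110", "E": "111"}

def huffman_decode(bits, codes):
    """Decode a string of '0'/'1' characters with a prefix-free code table."""
    inverse = {code: sym for sym, code in codes.items()}
    message, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free: the first match is the symbol
            message.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit stream ends in the middle of a codeword")
    return "".join(message)

decoded = huffman_decode("1000100010101010110111", CODES)
print(decoded)  # BABACACADE
```

A single flipped bit can shift all later codeword boundaries, which is why Huffman codes are sensitive to bit errors.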
Properties of Huffman coding
• Optimum code for a given data set requires two passes.
• Code construction complexity O(N log N).
• Fast lookup table based implementation.
• Requires at least one bit per symbol.
• Average codeword length is within one bit of zero-order entropy (tighter bounds are known): H ≤ R ≤ H + 1 bit.
• Susceptible to bit errors.
Coding: Arithmetic coding
• Alphabet extension (blocking symbols) can improve coding efficiency.
• How about treating the entire sequence as one symbol?
• Not practical with Huffman coding.
• Arithmetic coding allows you to do precisely this.
• Basic idea: map data sequences to sub-intervals in [0,1) with lengths equal to the probability of the corresponding sequence.
• QM-coder is an arithmetic coder tailored for binary data.
1) Huffman coder: H ≤ R ≤ H + 1 bit/pel
2) Block coder: H_n ≤ R_n ≤ H_n + 1/n bit/pel
3) Arithmetic coder: H ≤ R ≤ H + 1 bit/message (!)
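The basic idea of mapping a sequence to a sub-interval of [0,1) can be sketched with exact fractions. This shows only the interval narrowing, not a complete coder (no bit output, no finite-precision handling, and nothing specific to the QM-coder). The two-symbol model with p(a) = 0.7 and p(b) = 0.3 and the function name are assumptions made for the example.

```python
from fractions import Fraction

def final_interval(message, probs):
    """Narrow [0, 1) once per symbol; return the final (low, high) as exact fractions."""
    # Lower bound of each symbol's slot inside [0, 1), in dictionary order.
    starts, acc = {}, Fraction(0)
    for sym, p in probs.items():
        starts[sym] = acc
        acc += p
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        low += width * starts[sym]  # move to the symbol's slot in the current interval
        width *= probs[sym]         # the interval shrinks by the symbol's probability
    return low, low + width

model = {"a": Fraction(7, 10), "b": Fraction(3, 10)}  # hypothetical model
low, high = final_interval("aab", model)
print(low, high, high - low)  # 343/1000 49/100 147/1000
```

The final width equals the probability of the whole sequence (0.7 × 0.7 × 0.3), so any number inside the interval identifies the message using about −log2(width) bits.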
Arithmetic coding: Example
[Figure: arithmetic coding example (label: 0.70).]
Binary image compression
• Binary images consist only of two colors, black and white.
• The probability distribution of the alphabet is often very skewed: p(white) = 0.98 and p(black) = 0.02.
• Moreover, the images usually have large homogeneous areas of the same color.
Binary image compression: Methods
• Run-length encoding.
• Predictive encoding.
• READ code.
• CCITT group 3 and group 4 standards.
• Block coding.
• JBIG, JBIG2 (Joint Bilevel Image Experts Group).
  • Standard by CCITT and ISO.
  • Context-based compression pixel by pixel.
  • QM-coder (arithmetic coder).
Run-length coding: Idea
• Pre-processing method, good when one symbol occurs with high probability or when symbols are dependent.
• Count how many times each symbol is repeated.
• Source ’symbol’ = length of run.
Example: …, 4b, 9w, 2b, 2w, 6b, 6w, 2b, ...
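Run-length coding of one image row can be sketched as below. The row is built to reproduce the runs listed in the example (4b, 9w, 2b, 2w, 6b, 6w, 2b), with "b" and "w" standing for black and white pixels; the function name is hypothetical.

```python
from itertools import groupby

def run_lengths(pixels):
    """Collapse consecutive equal symbols into (run length, symbol) pairs."""
    return [(len(list(group)), value) for value, group in groupby(pixels)]

row = "b" * 4 + "w" * 9 + "b" * 2 + "w" * 2 + "b" * 6 + "w" * 6 + "b" * 2
runs = run_lengths(row)
print(runs)
```

In a binary image the colors alternate, so in practice only the run lengths need to be stored; these lengths are the source symbols that a later entropy coder (for example a Huffman table) encodes.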
Run-length encoding: CCITT standard
Huffman code table
JBIG
• Bilevel (binary) documents.
• Both graphics and pictures (halftone).
[Figure: two example images, graphic (line art) and halftone.]
Comparison of algorithms
[Figure: bar chart of compression ratio (axis 0.0 to 25.0) for COMPRESS, GZIP, PKZIP, BLOCK, RLE, 2D-RLE, ORLE, G3, G4 and JBIG. Bar labels in the figure: 7.9, 9.8, 18.0, 18.9, 17.9, 23.3, 10.3, 8.9, 10.8, 11.2.]
COMPRESS = Unix standard compression software
GZIP = Gnu compression software
PKZIP = Pkware compression software
BLOCK = Hierarchical block coding [KJ80]
RLE = Run-length coding [NM80]
2D-RLE = 2-dimensional RLE [WW92]
ORLE = Ordered RLE [NM80]
G3 = CCITT Group 3 [YA85]
G4 = CCITT Group 4 [YA85]
JBIG = ISO/IEC Standard draft [PM93]
Continuous tone images: lossless compression
• Lossless and near-lossless compression.
  – Bit-plane coding: applied to the bit-planes of a grayscale image.
  – Lossless JPEG (Joint Photographic Experts Group).
    • Pixel by pixel, by predicting the current pixel on the basis of the neighboring pixels.
    • Prediction errors coded by Huffman or arithmetic coding (QM-coder).
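Bit-plane coding starts by splitting a grayscale image into binary images, one per bit position, which can then be handed to a binary image coder. The sketch below shows only that decomposition and its inverse, using plain binary representation of the pixel values (an assumption; other bit mappings exist). The function names and the 2 × 2 test image are hypothetical.

```python
def bit_planes(image, bits=8):
    """Split a grayscale image (list of rows) into binary images; plane 0 is the least significant bit."""
    return [[[(v >> b) & 1 for v in row] for row in image] for b in range(bits)]

def merge_planes(planes):
    """Rebuild the grayscale image from its bit planes."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[sum(planes[b][r][c] << b for b in range(len(planes)))
             for c in range(cols)] for r in range(rows)]

image = [[0, 255],
         [128, 5]]
planes = bit_planes(image)
print(planes[7])  # most significant bit plane
print(planes[0])  # least significant bit plane
```

The decomposition is exactly invertible, so any loss can only come from how the individual planes are coded.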
Continuous tone images: lossy compression
• Vector quantization: codebooks.
• JPEG (Joint Photographic Experts Group).
• Lossy coding of continuous tone still images (color and grayscale).
• Based on Discrete Cosine Transform (DCT):
  0) Image is divided into N × N blocks.
  1) The blocks are transformed with 2-D DCT.
  2) DCT coefficients are quantized.
  3) The quantized coefficients are encoded.
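Steps 1 and 2 of the DCT-based scheme can be sketched for a single 8 × 8 block in plain Python. This is an illustration under stated assumptions, not JPEG itself: it uses an orthonormal DCT-II, a single uniform quantization step instead of JPEG's quantization tables, no level shift, and no entropy coding. All function names are hypothetical.

```python
import math

N = 8

# Orthonormal DCT-II matrix: C[k][n] = c_k * cos(pi * (2n + 1) * k / (2N)).
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2(block):
    """2-D DCT of an N x N block: Y = C X C^T."""
    return matmul(matmul(C, block), transpose(C))

def idct2(coeffs):
    """Inverse 2-D DCT: X = C^T Y C."""
    return matmul(matmul(transpose(C), coeffs), C)

def quantize(coeffs, step):
    """Uniform quantizer (assumed; JPEG uses a table with one step per coefficient)."""
    return [[round(v / step) for v in row] for row in coeffs]

def dequantize(levels, step):
    return [[q * step for q in row] for row in levels]

block = [[128 + 4 * c for c in range(N)] for _ in range(N)]  # smooth horizontal ramp
levels = quantize(dct2(block), step=16)
restored = idct2(dequantize(levels, step=16))
print(levels[0])    # the only row with nonzero levels for this block
print(restored[0])  # approximate reconstruction of the first row
```

For a smooth block most quantized coefficients are zero, which is what the final entropy coding step exploits.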
JPEG: Encoding and Decoding
Encoder: source image data → 8x8 blocks → FDCT → quantizer → entropy encoder → compressed image data. The quantizer and the entropy encoder each use table specifications.
Decoder: compressed image data → entropy decoder → dequantizer → IDCT → reconstructed image data. The entropy decoder and the dequantizer each use table specifications.
Divide image into N × N blocks
[Figure: input image and one 8x8 block.]
2-D DCT basis functions: N=8
[Figure: basis functions of the 2-D DCT for an 8x8 block, arranged from low to high frequency along both axes.]
2-D Transform Coding
[Figure: an image block written as a sum of basis images weighted by the coefficients y00, y01, y10, y12, y23, ...]
Zig-zag ordering of DCT coefficients
Converting a 2-D matrix into a 1-D array, so that the frequency (horizontal and vertical) increases in this order and the variance of the coefficients decreases in this order.
AC: Alternating current
DC: Direct current
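The zig-zag order can be generated by walking the anti-diagonals of the block and alternating the direction on each one. The sketch below assumes the convention in which the scan starts at the DC coefficient (0, 0) and takes its first step to the right; the function names are hypothetical.

```python
def zigzag_order(n):
    """(row, col) pairs of an n x n block in zig-zag order, starting at the DC coefficient (0, 0)."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Sort by anti-diagonal r + c; on odd diagonals go down (row ascending),
    # on even diagonals go up (row descending).
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def zigzag_scan(block):
    """Read a square block into a 1-D list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

print(zigzag_scan([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]]))  # [1, 2, 4, 7, 5, 3, 6, 8, 9]
```

After quantization the high-frequency coefficients at the end of this 1-D array are mostly zero, so they form long runs that are cheap to code.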
Example of DCT for image block
Matlab: y=dct(x)
Performance of the JPEG algorithm
[Figure: the same image at 8 bpp, 0.6 bpp, 0.37 bpp and 0.22 bpp.]
Continuous tone images: more methods
• Pyramid coding.
• Fractal coding.
• Wavelet transform.
  – JPEG 2000.
JPEG JPEG2000
JPEG: 0.25 bpp JPEG2000: 0.25 bpp
Video image compression
• Video images are a three-dimensional generalization of still images (spatial correlation), where the third dimension is time (spatial and temporal correlation).
• Each frame of a video sequence can be compressed by any image compression algorithm.
• Motion JPEG (M-JPEG).
  – Images separately JPEG coded.
• MPEG (Moving Picture Experts Group).
  – Temporal correlations used.
  – Two basic techniques:
    • Block based motion compensation.
    • DCT based compression.
Video images: compression ratios
| Channel | Bit rate | NTSC TV (168 Mb/s) | HDTV (933 Mb/s) | Film quality (2300 Mb/s) |
|---|---|---|---|---|
| PC LAN | 30 kb/s | 5,600:1 | 31,000:1 | 76,000:1 |
| Modems | 56 kb/s | 3,000:1 | 17,000:1 | 41,000:1 |
| ISDN | 64 - 144 kb/s | 1,166:1 | 6,400:1 | 16,000:1 |
| T-1, DSL | 1.5 Mb/s | 112:1 | 622:1 | 1,500:1 |
| Ethernet | 10 Mb/s | 17:1 | 93:1 | 230:1 |
| T-3 | 42 Mb/s | 4:1 | 22:1 | 54:1 |
| Fiber optic | 200 Mb/s | 1:1 | 5:1 | 11:1 |
MPEGs
• MPEG-1 (1992): VideoCD.
• MPEG-2 (1994): DVD, digital TV, SVCD.
  * about 50:1 compression, typically 3-10 Mbps.
• MPEG-3: was abandoned.
• MPEG-4 (1999+): DivX (starting from Version 5).
  * designed specially for low bandwidth.
• MPEG-7 (>1998):
  * searching and indexing of a/v data, using Description Tools.
MPEG-1: Blocks
• The pictures are divided into 16x16 macroblocks, each consisting of four 8x8 elementary blocks.
• The prediction method is chosen for each macroblock separately.
• The intra-coded blocks are quantized differently from the predicted blocks:
  * Intra-coded blocks contain information in all frequencies.
  * The predicted blocks contain mostly high frequencies and can be quantized with coarser quantization tables.
MPEG-1: Inter-block Prediction
• Bidirectional prediction.
• Forward prediction.
• Backward prediction.
• Intra coding.
[Figure: frame sequence I B B B P B B B P B B B I, annotated with forward prediction and bidirectional prediction.]
MPEG-1: Prediction schemes
I: Intra pictures are coded as still images by DCT.
P: Predicted pictures are coded with reference to a past picture. The difference between the prediction and the original picture is then compressed by DCT.
B: Bidirectional pictures: the prediction can be made from both a past and a future frame. Bidirectional pictures are never used as reference.
[Figure: frame sequence I B B B P B B B P B B B I, annotated with forward prediction and bidirectional prediction.]
Motion estimation and compensation
• The prediction block in the reference frame is not necessarily at the same coordinates as the block in the current frame.
• Because of motion in the image sequence, the most suitable predictor for the current block may exist anywhere in the reference frame.
• Motion estimation specifies where the best prediction (best match) is found.
• Motion compensation consists of calculating the difference between the reference and the current block.
Motion estimation: 1
• Exhaustive search block matching.
Slow!
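Exhaustive search block matching tries every displacement inside a search window and keeps the one with the smallest matching error. The sketch below uses the sum of absolute differences (SAD) as the error measure, which is an assumption since the slide does not name one; the function names, the search range and the synthetic frame are likewise hypothetical.

```python
def sad(frame, top, left, block):
    """Sum of absolute differences between `block` and the same-sized area of `frame`."""
    return sum(abs(frame[top + r][left + c] - block[r][c])
               for r in range(len(block)) for c in range(len(block[0])))

def exhaustive_search(reference, block, top, left, search_range):
    """Try every displacement within +/- search_range; return (dy, dx, best SAD)."""
    rows, cols = len(reference), len(reference[0])
    bh, bw = len(block), len(block[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            t, l = top + dy, left + dx
            if 0 <= t <= rows - bh and 0 <= l <= cols - bw:  # stay inside the frame
                cost = sad(reference, t, l, block)
                if best is None or cost < best[2]:
                    best = (dy, dx, cost)
    return best

# Synthetic 8 x 8 reference frame with a unique value at every position.
reference = [[10 * r + c for c in range(8)] for r in range(8)]
# Current 2 x 2 block located at (2, 2) whose content sits at (3, 4) in the reference.
current_block = [row[4:6] for row in reference[3:5]]

print(exhaustive_search(reference, current_block, top=2, left=2, search_range=2))
```

With a search range of ±d the search evaluates (2d + 1)² candidate positions per block, which is the reason for the "Slow!" remark and the motivation for hierarchical search.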
Motion estimation: 2
• Hierarchical block matching.
Multicomponent (spectral) image compression
Compression: Spectral reduction and clustering
Spectral video: MPEG for multicomponent images
Summary: Image compression
[Diagram labels: image compression, universal compression; video images, gray-scale images, true colour images, binary images, colour palette images, multicomponent images, textual data.]
Why do we need special image compression algorithms?
Methods named in the diagram: MPEG, M-JPEG, JPEG, JBIG, Huffman coding, arithmetic coding, fractal coding, pyramid coding, bit-plane coding, vector quantization, DCT, RLE, wavelet transform.