Image compression
Introduction to Image and Video Compression
2
Need for Image & Video Compression
Uncompressed video
640 x 480 resolution, 8-bit (1 byte) colour, 24 fps
  307.2 Kbytes per image (frame)
  7.37 Mbytes per second
  442 Mbytes per minute
  26.5 Gbytes per hour
640 x 480 resolution, 24-bit (3 bytes) colour, 30 fps
  921.6 Kbytes per image (frame)
  27.6 Mbytes per second
  1.66 Gbytes per minute
  99.5 Gbytes per hour
Given a 100 Gigabyte disk (100 x 10^9 bytes), can store about 1-4 hours of high quality video
MPEG-1 compresses video to 187 Kbytes/second
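The figures on this slide follow from simple multiplication. The sketch below reproduces them; the function name `raw_video_rates` and the rounding of the MPEG-1 rate to 187,000 bytes/s are assumptions made for illustration.

```python
def raw_video_rates(width, height, bytes_per_pixel, fps):
    """Return raw video size in bytes per (frame, second, minute, hour)."""
    per_frame = width * height * bytes_per_pixel
    per_second = per_frame * fps
    per_minute = per_second * 60
    per_hour = per_minute * 60
    return per_frame, per_second, per_minute, per_hour

low = raw_video_rates(640, 480, 1, 24)    # 8-bit colour, 24 fps
high = raw_video_rates(640, 480, 3, 30)   # 24-bit colour, 30 fps

disk_bytes = 100 * 10**9                  # 100 Gigabyte disk
hours_low = disk_bytes / low[3]           # hours of 8-bit video that fit
hours_high = disk_bytes / high[3]         # hours of 24-bit video that fit

mpeg1_bytes_per_second = 187_000          # assumed reading of "187 Kbytes/second"
ratio_vs_mpeg1 = high[1] / mpeg1_bytes_per_second

print(low, high)
print(round(hours_low, 2), round(hours_high, 2), round(ratio_vs_mpeg1))
```

The two `hours_*` values bracket the "about 1-4 hours" quoted above.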
3
Need for Image & Video Compression
Raw video contains an immense amount of data
Communication and storage capabilities are limited and expensive
Example HDTV video signal:
  720 x 1280 pixels/frame, progressive scanning at 60 frames/s: about 1.3 Gb/s raw, assuming 24 bits/pixel
  20 Mb/s HDTV channel bandwidth
  Requires compression by a factor of about 70 (equivalent to 0.35 bits/pixel)
4
Compression
Based on Information Theory as described by Claude Shannon in 1948
Lossless and lossy methods
Efficient binary representation of information
Remove redundancy
5
Compression Theory (lossy/lossless)
[Figure: original data (5 4 2) passed through a lossless codec decodes to 5 4 2 exactly; passed through a lossy codec it decodes to 4.9 3.9 1.9 (mild error) or 3.1 2.1 0 (high error)]
6
Compression theory
Original picture, 800 x 600: 1.37 MB
Decoded picture (low losses): 85 KB
7
Compression theory
Original picture, 800 x 600: 1.37 MB
Decoded picture (high losses): 16 KB
8
Compression Theory (Entropy)
[Figure: a block of sample data values, shown as made up of entropy plus redundancy]
9
Compression Theory (Entropy)
Optimal lossless compression: the full entropy is preserved
Lossy compression: only part of the entropy is preserved
10
Image Compression and Formats
RLE
Huffman
LZW
GIF
JPEG
Fractals
TIFF, PICT, BMP, etc.
11
Video Compression and Formats
H.261/H.263
Cinepak
Sorenson
Indeo
Real Video
MPEG-1, MPEG-2, MPEG-4, etc.
QuickTime, AVI
12
Special Coding Requirements
Viewing real-time source information
  End-to-end delay (EED) should not exceed 150-200 ms
  Face-to-face applications need an EED of 50 ms (including compression and decompression)
Interactive viewing: random access
  Random access to single images and audio frames; access time should be less than 0.5 s
  Decompression of images, video, and audio should not be linked to other data units, to support random access
13
Compression Steps
Uncompressed picture -> Picture preparation -> Picture processing -> Quantization -> Entropy coding -> Compressed picture
[Figure: the stages above, with an adaptive feedback loop]
14
Picture Preparation
Analog-to-digital conversion
Generate appropriate digital representation
Divide picture into blocks (usually 8 x 8 pixels)
Fix the number of bits per pixel
15
Picture Processing
Transform to frequency domain
  E.g., use the Discrete Cosine Transform (DCT)
Compute motion vectors for each block
16
Quantization and Coding
Map real numbers to integers
  E.g., the DC and AC coefficients from the DCT are real numbers, but we only want to store integers
Entropy coding
  Compress a sequential bit stream without loss
17
Types of Compression
Symmetric compression
  Requires same time for encoding and decoding
  Used for dialog mode applications (teleconference)
Asymmetric compression
  Performed once when enough time is available
  Two-pass encoding
  Used for retrieval mode applications (e.g., an interactive CD-ROM)
18
Broad Classification of Compression Techniques
Entropy coding
  Lossless encoding
  Used regardless of media's specific characteristics
  Data taken as a simple digital sequence
  Decompression process regenerates data completely
  E.g., run-length coding, Huffman coding, LZW, arithmetic coding
Source coding
  Lossy encoding
  May take into account the semantics of the data
  Degree of compression depends on data content
  E.g., DPCM, ADPCM
Hybrid coding (used by most multimedia systems)
  Combines entropy with source encoding
  E.g., JPEG, H.263, MPEG-1, MPEG-2, MPEG-4
19
Compression Techniques
Statistical techniques (Entropy)
Predictive techniques (Source)
Transform techniques (Hybrid)
20
Statistical Techniques
Entropy (H) refers to how much variability is in the data
  Low/high entropy means low/high variability
  Zero-order entropy model
  First-order entropy model
Huffman encoding, for example
  Use statistical profile to determine encoding scheme
  Use fewer/more bits to encode more/less frequent data
  Must transmit a codebook
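Zero-order entropy: $H = \log_2 M$, where $M$ is the number of distinct symbols
First-order entropy: $H = -\sum_{i=1}^{m} P(i) \log_2 P(i)$, where $P(i)$ is the probability of symbol $i$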
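As a rough illustration of the two entropy models, the sketch below evaluates both for a five-symbol alphabet. The counts are borrowed from the Huffman slide in this deck; the function names are hypothetical.

```python
import math

def zero_order_entropy(num_symbols):
    """Bits per symbol if every symbol is treated as equally likely."""
    return math.log2(num_symbols)

def first_order_entropy(counts):
    """Bits per symbol given the observed symbol counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

counts = [20, 8, 6, 6, 4]            # symbol counts for A, B, C, D, E
h0 = zero_order_entropy(len(counts))
h1 = first_order_entropy(counts)
print(f"zero-order: {h0:.3f} bits/symbol, first-order: {h1:.3f} bits/symbol")
```

The first-order value is lower because it exploits the skewed symbol frequencies; a lossless coder that uses only these symbol statistics cannot average fewer bits per symbol than that.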
21
Huffman Coding
Source data: ABABCCDBBABBAAABBAACCDDEEAAC...
Total size: 44 bytes / 352 bits

Huffman table
Symbol  Count  Code
A       20     0
B       8      100
C       6      101
D       6      110
E       4      111

Compressed data: 010001001011011101001001100100000100100…
Total size: 92 bits
Compression ratio: 3.8
≅ 2 bits/pixel
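The table above can be reproduced with a standard heap-based Huffman construction. This is a minimal sketch, not necessarily how the slide's table was produced: the helper name `huffman_codes` is hypothetical, and the exact bit patterns depend on how ties are broken, so only the code lengths and the total size are expected to match.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a prefix code from a {symbol: count} mapping."""
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(f, next(tie), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {sym: "0" for sym in freqs}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

freqs = {"A": 20, "B": 8, "C": 6, "D": 6, "E": 4}
codes = huffman_codes(freqs)
total_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
ratio = 8 * sum(freqs.values()) / total_bits   # 8 bits per source symbol
print(codes, total_bits, round(ratio, 1))
```

The most frequent symbol gets a 1-bit code and the other four get 3-bit codes, which gives the 92-bit total quoted on the slide.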
22
Predictive Techniques
DPCM
  Compare adjacent pixels and only transmit the difference between them
  E.g., use 8 bits per pixel and use 4 bits for each difference
ADPCM
  Similar to DPCM, but use a variable number of bits to transmit differences
  E.g., use 8 bits per pixel and use 1-5 bits for each difference
23
Transform Techniques
Convert data to an alternate form that better supports specific operations
Typically operates on blocks of data
  Larger blocks give better results, but require more computational overhead
Discrete Cosine Transform
  Converts an image from spatial to frequency domain
  Operates on 8 x 8 blocks (64 pixels) of data
  DC coefficient represents zero spatial frequency, which is the average value for all the pixels in the 8 x 8 block
  AC coefficients represent amplitudes of progressively higher horizontal and vertical spatial frequency components
24
Run Length Encoding (RLE)
Form of entropy coding (lossless)
Content-dependent coding scheme
Series of repeated values replaced by a single value and a count
Example:
  The sequence abbbbbbbccddddeeddd
  would be replaced by 1a7b2c4d2e3d
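The count-then-value form used in the example can be sketched in a few lines. The function names are hypothetical, and the decoder assumes the data symbols themselves are never digits.

```python
from itertools import groupby

def rle_encode(text):
    """Replace each run of identical symbols by '<count><symbol>'."""
    return "".join(f"{len(list(group))}{value}" for value, group in groupby(text))

def rle_decode(encoded):
    """Invert rle_encode; assumes symbols are not digits."""
    out, digits = [], ""
    for ch in encoded:
        if ch.isdigit():
            digits += ch          # counts may span several digits
        else:
            out.append(ch * int(digits))
            digits = ""
    return "".join(out)

encoded = rle_encode("abbbbbbbccddddeeddd")
print(encoded, rle_decode(encoded))
```

Note that runs of length 1 cost two characters here, which is one reason practical schemes treat non-repeating values specially.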
25
Example Implementation
Use 8-bit (1 byte) data elements
  Unsigned [0, 255]; signed [-127, 127]
Encode repeated values using two bytes
  First byte provides a count (N) in the range [-127, -1]
  The repetition count is -N + 1
  Second byte contains the data value to repeat
Repeated patterns cannot be longer than 128
  Must be broken into multiple runs
A non-repeating data value will have a positive first byte, which is the data value
26
Exercise
Assume the same sequence as before: abbbbbbbccddddeeddd
Given the previous implementation, what would be the transmitted code?
Answer: a, -6 b, -1 c, -3 d, -1 e, -2 d
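The signed-count scheme from the example implementation can be sketched as follows. The function names are hypothetical, and the sketch assumes data values are positive so that a negative element can only be a count.

```python
def rle_signed_encode(data):
    """Runs of 2..128 equal values become (N, value) with N = -(run - 1)."""
    out, i = [], 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 128:
            run += 1
        if run >= 2:
            out.append(-(run - 1))   # N, so the repetition count is -N + 1
        out.append(data[i])          # the value (a lone value is sent as-is)
        i += run
    return out

def rle_signed_decode(code):
    """Invert rle_signed_encode."""
    out, i = [], 0
    while i < len(code):
        if code[i] < 0:
            out.extend([code[i + 1]] * (-code[i] + 1))
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out

seq = [ord(ch) for ch in "abbbbbbbccddddeeddd"]
code = rle_signed_encode(seq)
print(code)
```

With 'a'..'e' mapped to their character codes, the output has the same structure as the exercise answer, and a run longer than 128 is split into several runs.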
28
Differential Encoding (Source)
Consider a sequence of values S1, S2, S3, etc. that differ in value, but not dramatically
Encode differences from a specific value
  E.g., S1, S2-S1, S3-S2, etc.
E.g., still image
  Calculate difference between nearby pixels
  Areas of rapid color change are characterized by large values, other areas by small values
  After differential encoding, apply RLE
29
Differential Encoding Example
0 0 0 0 0
0 255 250 253 251
0 255 251 254 255
0 0 0 0 0
DPCM (differences taken along each row, restarting at the start of each row): 0, 0, 0, 0, 0, 0, 255, -5, 3, -2, 0, 255, -4, 3, 1, …
RLE: 6(0), 255, -5, 3, -2, 0, 255, -4, 3, 1, …
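The example values can be checked with a short sketch. It assumes, as the numbers on the slide suggest, that each row is differenced on its own starting from a predictor of 0; the function names are hypothetical.

```python
from itertools import groupby

def dpcm_rows(image):
    """Difference each value from its left neighbour, row by row."""
    out = []
    for row in image:
        prev = 0                     # assumed predictor at the start of a row
        for value in row:
            out.append(value - prev)
            prev = value
    return out

def runs(values):
    """Collapse a sequence into (count, value) pairs."""
    return [(len(list(group)), value) for value, group in groupby(values)]

image = [
    [0, 0, 0, 0, 0],
    [0, 255, 250, 253, 251],
    [0, 255, 251, 254, 255],
    [0, 0, 0, 0, 0],
]
diffs = dpcm_rows(image)
print(diffs)
print(runs(diffs))
```

The first run covers the six leading zeros, matching the "6(0)" at the start of the RLE line above.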
31
JPEG Compression Steps
Prepare the image for compression
  Transform color space
  Down-sample components
  Interleave the color planes
  Partition into smaller blocks
Apply the Discrete Cosine Transform
Quantize the DC and AC coefficients
Apply entropy encoding to quantized numbers
32
JPEG Algorithm
[Figure: DCT-based encoding. Source image (R, G, B planes) -> 8x8 blocks -> FDCT -> Quantizer (with table) -> Entropy encoder (with table) -> Compressed image data]
33
Color Transformation
May want to transform RGB to YUV or YCbCr
  This is an optional step, so why do it?
Human visual system detects changes better in luminance than in chrominance components
YUV or YCbCr better supports this than RGB
Facilitates better compression by enabling
  Down-sampling of the image in chrominance
  Elimination of more of the high-frequency changes in the chrominance dimensions
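One common form of this transform is the full-range YCbCr conversion used with JPEG files (JFIF). The slide does not say which variant is meant, so the coefficients below are an assumption, and the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each component to the 8-bit range
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))

white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
red = rgb_to_ycbcr(255, 0, 0)
print(white, black, red)
```

Grey pixels map to Cb = Cr = 128, so all of their information sits in the luminance channel, which is why the chrominance planes tolerate coarser treatment.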
34
Down-Sampling Chrominance
Optional step, but increases compression ratio
  Only down-sample the chrominance components, never the luminance component
Average groups of pixels in the horizontal and/or vertical direction of the image
Referred to as H2V1, H2V2, 4:2:2, 4:2:0, etc.
  E.g., H2V2 means average every 2 pixels in horizontal and vertical dimensions
  H4V4 means use the SAME resolution in the horizontal and vertical dimensions
35
Color sub-sampling
[Figure: colour sub-sampling from 4:4:4 to 4:2:2]
36
Color sub-sampling
[Figure: colour sub-sampling from 4:4:4 to 4:1:1]
37
Color sub-sampling
[Figure: colour sub-sampling from 4:4:4 to 4:2:0]
38
How does colour sub-sampling aid compression?
A 100 x 100 pixel frame requires:
  100 x 100 Y pixels
  100 x 100 Cr pixels
  100 x 100 Cb pixels
Using 4:2:0 colour sub-sampling:
  100 x 100 Y pixels
  50 x 50 Cr pixels
  50 x 50 Cb pixels
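The saving can be made concrete with a small sketch that averages each 2 x 2 group of chroma samples, which is one simple way to produce 4:2:0 data. The function name and the synthetic test plane are assumptions for illustration only.

```python
def subsample_420(plane):
    """Average each 2x2 group of samples (even dimensions assumed)."""
    h, w = len(plane), len(plane[0])
    return [
        [(plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]

# Synthetic 100 x 100 chroma plane, for illustration only
cb = [[(x + y) % 256 for x in range(100)] for y in range(100)]
cb_small = subsample_420(cb)

full_samples = 3 * 100 * 100                                 # Y + Cb + Cr at full size
sub_samples = 100 * 100 + 2 * len(cb_small) * len(cb_small[0])
print(full_samples, sub_samples)
```

The frame drops from 30,000 samples to 15,000, a factor of two before any transform or entropy coding is applied.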
39
Blocks and Pixel Shift
Divide each component into 8x8 blocks
  Send to the FDCT
Shift each pixel from unsigned range [0, 255] into signed range [-128, 127]
  DCT requires range be centered around 0
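A minimal sketch of these two preparation steps, assuming 8-bit samples (so the shift subtracts 128) and plane dimensions that are multiples of the block size. The function names are hypothetical.

```python
def level_shift(plane):
    """Centre 8-bit samples around zero by subtracting 128."""
    return [[p - 128 for p in row] for row in plane]

def split_blocks(plane, n=8):
    """Split a plane into n x n blocks, left to right, top to bottom."""
    h, w = len(plane), len(plane[0])
    return [
        [row[x:x + n] for row in plane[y:y + n]]
        for y in range(0, h, n)
        for x in range(0, w, n)
    ]

# Synthetic 16-row by 24-column plane, for illustration only
plane = [[(x * y) % 256 for x in range(24)] for y in range(16)]
blocks = split_blocks(level_shift(plane))
print(len(blocks), len(blocks[0]), len(blocks[0][0]))
```

Real encoders also pad planes whose dimensions are not multiples of 8, which this sketch leaves out.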
40
Forward DCT
Convert from spatial to frequency domain
  Convert intensity function into weighted sum of elementary frequency components
  Identify pieces of spectral information that can be thrown away without noticeable loss of quality
Intensity values in each color plane often change slowly (see next example)
  Contributions from higher frequency bands in the frequency domain can be ignored
  Better compression without noticeable loss of quality
41
Equations for 2D DCT
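Forward DCT:

$$F(u,v) = \frac{2}{\sqrt{nm}}\, C(u)\, C(v) \sum_{y=0}^{m-1} \sum_{x=0}^{n-1} I(x,y)\, \cos\!\left(\frac{(2x+1)u\pi}{2n}\right) \cos\!\left(\frac{(2y+1)v\pi}{2m}\right)$$

Inverse DCT:

$$I(x,y) = \frac{2}{\sqrt{nm}} \sum_{v=0}^{m-1} \sum_{u=0}^{n-1} F(u,v)\, C(u)\, C(v)\, \cos\!\left(\frac{(2x+1)u\pi}{2n}\right) \cos\!\left(\frac{(2y+1)v\pi}{2m}\right)$$

where $C(k) = 1/\sqrt{2}$ for $k = 0$ and $C(k) = 1$ otherwise.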
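The transform pair can be evaluated directly from its definition. The sketch below is a naive, unoptimised implementation that assumes the usual normalisation, a 2/sqrt(nm) scale with C(0) = 1/sqrt(2) and C(k) = 1 otherwise; the function names are hypothetical, and real codecs use fast factorisations instead.

```python
import math

def c(k):
    """Normalisation factor: 1/sqrt(2) for the zero-frequency term, else 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def basis(x, u, n):
    return math.cos((2 * x + 1) * u * math.pi / (2 * n))

def dct2(block):
    """Forward 2D DCT; block[y][x] holds I(x, y), result[v][u] holds F(u, v)."""
    m, n = len(block), len(block[0])
    scale = 2 / math.sqrt(n * m)
    return [[scale * c(u) * c(v) * sum(block[y][x] * basis(x, u, n) * basis(y, v, m)
                                       for y in range(m) for x in range(n))
             for u in range(n)] for v in range(m)]

def idct2(coeffs):
    """Inverse 2D DCT, the exact inverse of dct2 up to rounding error."""
    m, n = len(coeffs), len(coeffs[0])
    scale = 2 / math.sqrt(n * m)
    return [[scale * sum(coeffs[v][u] * c(u) * c(v) * basis(x, u, n) * basis(y, v, m)
                         for v in range(m) for u in range(n))
             for x in range(n)] for y in range(m)]

flat = [[10] * 8 for _ in range(8)]          # constant block: only DC survives
f_flat = dct2(flat)
ramp = [[(3 * x + 5 * y) - 28 for x in range(8)] for y in range(8)]
restored = idct2(dct2(ramp))
print(round(f_flat[0][0], 6))
```

For an 8 x 8 block under this scaling the DC coefficient is 8 times the block average, so it is proportional to the average rather than equal to it.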
42
Visualization of Basis Functions
[Figure: the 8 x 8 DCT basis functions, with frequency increasing along both the horizontal and the vertical axis]
43
Quantization
Divide each coefficient in a block by an integer in the range [1, 255]
  Comes in the form of a table, same size as a block
  Divide the block of coefficients element-wise by the table, and then round the result to the nearest integer
In the decoding process, multiply the quantized coefficients by the corresponding table entries
  Get back a number close to, but not the same as, the original
  Error is always at most half of the quantization number
Larger numbers in the quantization table cause more loss
  This is the main source of loss in JPEG
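The divide, round, and multiply-back cycle can be sketched on a small fragment. The coefficient values and the 2 x 2 table here are made up for illustration; a real JPEG block and table are 8 x 8.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(cf / q) for cf, q in zip(crow, trow)]
            for crow, trow in zip(coeffs, table)]

def dequantize(levels, table):
    """Multiply each quantized level by its table entry."""
    return [[lv * q for lv, q in zip(lrow, trow)]
            for lrow, trow in zip(levels, table)]

coeffs = [[-415.4, 30.2], [-4.6, 1.9]]   # hypothetical DCT coefficients
table = [[16, 11], [12, 12]]             # hypothetical quantization steps

levels = quantize(coeffs, table)
restored = dequantize(levels, table)
errors = [[abs(r - cf) for r, cf in zip(rrow, crow)]
          for rrow, crow in zip(restored, coeffs)]
print(levels, restored)
```

Small coefficients collapse to zero, which is what later makes the run-length stage effective, and no reconstruction error exceeds half of the corresponding step.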
44
De facto Quantization Table
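16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99

The eye becomes less sensitive toward the right and toward the bottom of the table (higher horizontal and vertical frequencies)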
45
Entropy Encoding
Compress the sequence of quantized DC and AC coefficients from the quantization step
  Further increase compression, but without loss
Separate DC from AC components
  DC components change slowly, thus will be encoded using difference encoding
46
DC Encoding
DC represents average intensity of a block
  Because image intensity tends to change slowly, DC values tend to change slowly
  Encode using difference encoding scheme
  Use 3x3 pattern of blocks
Because the difference tends to be near zero, can use fewer bits in the encoding
  Categorize difference into difference classes
  Send the index of the difference class, followed by the bits representing the difference
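A sketch of the class-plus-bits idea, assuming the baseline JPEG convention: the class index is the number of bits needed for the magnitude of the difference, and a negative difference is sent as the value plus 2^size - 1. The function names and sample DC values are hypothetical, and the first block is assumed to be predicted from 0.

```python
def dc_differences(dc_values):
    """Difference each DC value from the previous block's DC value."""
    out, prev = [], 0                # assumed initial predictor of 0
    for dc in dc_values:
        out.append(dc - prev)
        prev = dc
    return out

def dc_category(diff):
    """Difference class: number of bits needed for |diff|."""
    return abs(diff).bit_length()

def dc_extra_bits(diff):
    """Bits that follow the class index and pin down the exact difference."""
    size = dc_category(diff)
    if size == 0:
        return ""
    value = diff if diff >= 0 else diff + (1 << size) - 1
    return format(value, f"0{size}b")

diffs = dc_differences([52, 55, 50, 50])
encoded = [(dc_category(d), dc_extra_bits(d)) for d in diffs]
print(diffs, encoded)
```

A difference of zero needs only its class index and no extra bits, which is where the saving on slowly changing DC values comes from.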
47
AC Encoding
Use a zig-zag ordering of coefficients, why?
  Orders frequency components from low to high
  Should produce a maximal series of 0s at the end
  Lends itself well to RLE
Apply RLE to the ordering
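The zig-zag scan and the run-length step can be sketched together. The pair format below, (number of preceding zeros, value) followed by an end-of-block marker, is a simplification of what JPEG actually emits; the function names and the sample block are hypothetical.

```python
def zigzag_indices(n=8):
    """(row, col) positions of an n x n block in zig-zag order."""
    def key(rc):
        r, col = rc
        s = r + col
        # Odd anti-diagonals run downward (row increasing), even ones upward
        return (s, r if s % 2 else -r)
    return sorted(((r, col) for r in range(n) for col in range(n)), key=key)

def run_length_ac(block):
    """Encode the AC coefficients as (zero_run, value) pairs plus 'EOB'."""
    seq = [block[r][col] for r, col in zigzag_indices(len(block))][1:]  # skip DC
    pairs, run = [], 0
    for value in seq:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")              # trailing zeros collapse into one marker
    return pairs

block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = -26, 3, -2   # hypothetical quantized block
order = zigzag_indices()
pairs = run_length_ac(block)
print(order[:7], pairs)
```

Because the non-zero values cluster at low frequencies, the whole tail of the scan reduces to a single end-of-block marker.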
48
Huffman Encoding
String together the RLE of the AC coefficients along with the DC difference indices and values
Apply Huffman encoding to the resulting sequence
Attach appropriate headers
Finally have the JPEG image!