Image compression

Introduction to Image and Video Compression

Need for Image & Video Compression

Uncompressed video:

• 640 x 480 resolution, 8-bit (1 byte) colour, 24 fps
  – 307.2 Kbytes per image (frame)
  – 7.37 Mbytes per second
  – 442 Mbytes per minute
  – 26.5 Gbytes per hour
• 640 x 480 resolution, 24-bit (3 bytes) colour, 30 fps
  – 921.6 Kbytes per image (frame)
  – 27.6 Mbytes per second
  – 1.66 Gbytes per minute
  – 99.5 Gbytes per hour

Given a 100 Gigabyte disk (100 x 10^9 bytes), one can store only about 1-4 hours of high-quality video

MPEG-1 compresses video to 187 Kbytes/second
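The figures above follow from simple multiplication. A minimal sketch that reproduces them, assuming decimal units (1 Kbyte = 1000 bytes); the function name `raw_video_rate` is only illustrative:

```python
def raw_video_rate(width, height, bytes_per_pixel, fps):
    """Return the uncompressed data volume in bytes per frame, second, minute and hour."""
    frame = width * height * bytes_per_pixel
    second = frame * fps
    return {"frame": frame, "second": second, "minute": second * 60, "hour": second * 3600}

DISK_BYTES = 100 * 10**9  # the 100 Gigabyte disk from the slide

for label, bytes_per_pixel, fps in [("8-bit, 24 fps", 1, 24), ("24-bit, 30 fps", 3, 30)]:
    rate = raw_video_rate(640, 480, bytes_per_pixel, fps)
    hours = DISK_BYTES / rate["hour"]
    print(f"{label}: {rate['frame'] / 1e3:.1f} Kbytes/frame, "
          f"{rate['second'] / 1e6:.2f} Mbytes/s, "
          f"{rate['hour'] / 1e9:.1f} Gbytes/hour, "
          f"{hours:.1f} hours on the disk")
```

The two cases give roughly 3.8 and 1.0 hours of storage, which is where the "1-4 hours" range comes from.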

Need for Image & Video Compression

– Raw video contains an immense amount of data
– Communication and storage capabilities are limited and expensive
• Example HDTV video signal:
  – 720 x 1280 pixels/frame, progressive scanning at 60 frames/s: about 1.3 Gb/s of raw data, assuming 24 bits/pixel
  – 20 Mb/s HDTV channel bandwidth
  → Requires compression by a factor of about 70 (equivalent to 0.35 bits/pixel)

Compression

Based on Information Theory as described by Claude Shannon in 1948

Lossless and lossy methods

Efficient binary representation of information

Remove redundancy

Compression Theory (lossy/lossless)

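[Figure: original data (5 4 2) passed through a lossless and a lossy codec]

• Lossless codec: decoded data 5 4 2 (identical to the original)
• Lossy codec, mild error: decoded data 4.9 3.9 1.9
• Lossy codec, high error: decoded data 3.1 2.1 0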

Compression theory

• Original picture, 800 x 600: 1.37 MB
• Decoded picture (low losses): 85 KB

Compression theory

• Original picture, 800 x 600: 1.37 MB
• Decoded picture (high losses): 16 KB

Compression Theory (Entropy)

[Figure: a raw data stream shown as the sum of two parts, entropy and redundancy]

Compression Theory (Entropy)

• Optimal lossless compression: the full entropy is preserved
• Lossy compression: only part of the entropy is preserved

Image Compression and Formats

• RLE
• Huffman
• LZW
• GIF
• JPEG
• Fractals
• TIFF, PICT, BMP, etc.

Video Compression and Formats

• H.261/H.263
• Cinepak
• Sorenson
• Indeo
• Real Video
• MPEG-1, MPEG-2, MPEG-4, etc.
• QuickTime, AVI

Special Coding Requirements

• Viewing real-time source information
  – End-to-end delay (EED) should not exceed 150-200 ms
  – Face-to-face applications need an EED of 50 ms (including compression and decompression)
• Interactive viewing: random access
  – Random access to single images and audio frames; access time should be less than 0.5 s
  – Decompression of images, video, and audio should not depend on other data units, so that random access is possible

Compression Steps

Uncompressed picture → Picture preparation → Picture processing → Quantization → Entropy coding → Compressed picture

(An adaptive feedback loop connects the later stages back to the earlier ones.)

Picture Preparation

• Analog-to-digital conversion
• Generate an appropriate digital representation
• Divide the picture into blocks (usually 8x8)
• Fix the number of bits per pixel

Picture Processing

• Transform to the frequency domain
  – E.g., use the Discrete Cosine Transform (DCT)
• Compute motion vectors for each block

Quantization and Coding

• Map real numbers to integers
  – E.g., the DC and AC coefficients from the DCT are real numbers, but we only want to store integers
• Entropy coding
  – Compress a sequential bit stream without loss

Types of Compression

• Symmetric compression
  – Requires the same time for encoding and decoding
  – Used for dialog mode applications (teleconference)
• Asymmetric compression
  – Encoding is performed once, when enough time is available
  – Two-pass encoding
  – Used for retrieval mode applications (e.g., an interactive CD-ROM)

Broad Classification of Compression Techniques

• Entropy coding
  – Lossless encoding
  – Used regardless of the media's specific characteristics
  – Data taken as a simple digital sequence
  – Decompression process regenerates the data completely
  – E.g., run-length coding, Huffman coding, LZW, arithmetic coding
• Source coding
  – Lossy encoding
  – May take into account the semantics of the data
  – Degree of compression depends on the data content
  – E.g., DPCM, ADPCM
• Hybrid coding (used by most multimedia systems)
  – Combines entropy coding with source coding
  – E.g., JPEG, H.263, MPEG-1, MPEG-2, MPEG-4

Compression Techniques

• Statistical techniques (entropy)
• Predictive techniques (source)
• Transform techniques (hybrid)

Statistical Techniques

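• Entropy (H) refers to how much variability is in the data
  – Low/high entropy means low/high variability
  – Zero-order entropy model, for M equally likely symbols:
    $$H = \log_2 M$$
  – First-order entropy model, where P(i) is the probability of symbol i out of m symbols:
    $$H = -\sum_{i=1}^{m} P(i) \log_2 P(i)$$
• Huffman encoding, for example
  – Use a statistical profile to determine the encoding scheme
  – Use fewer bits to encode more frequent data and more bits to encode less frequent data
  – Must transmit a codebook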
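The two entropy models can be compared numerically. A minimal sketch, assuming the symbol probabilities P(i) are estimated from symbol frequencies in the data itself; the function names and the two sample strings are illustrative only:

```python
import math
from collections import Counter

def zero_order_entropy(data):
    """H = log2(M): all M distinct symbols are treated as equally likely."""
    return math.log2(len(set(data)))

def first_order_entropy(data):
    """H = -sum P(i) * log2 P(i), with P(i) taken from the symbol frequencies."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

low_variability = "AAAAAAAAAAAAAAAB"   # one symbol dominates
high_variability = "ABCDABCDABCDABCD"  # four symbols, equally frequent

for name, data in [("low", low_variability), ("high", high_variability)]:
    print(name, zero_order_entropy(data), round(first_order_entropy(data), 3))
```

When all symbols are equally frequent the two models agree; when one symbol dominates, the first-order entropy drops well below the zero-order value, and that gap is what a statistical coder can exploit.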

Huffman Coding

Source data: ABABCCDBBABBAAABBAACCDDEEAAC...

Symbol   A    B    C    D    E
Count    20   8    6    6    4

Total size: 44 bytes / 352 bits

Huffman table:

Symbol   Count   Code
A        20      0
B        8       100
C        6       101
D        6       110
E        4       111

Compressed data (encoder output): 010001001011011101001001100100000100100…

Total size: 92 bits
Compression ratio: 3.8 (≅ 2 bits/pixel)
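The totals in this example can be checked by building a Huffman code from the symbol counts. A minimal sketch using a priority queue; it computes only code lengths (the exact bit patterns depend on how ties are broken, so they may differ from the table above), and the helper name `huffman_code_lengths` is an assumption of this example:

```python
import heapq

def huffman_code_lengths(counts):
    """Return {symbol: code length in bits} for a Huffman code built from counts."""
    # Heap entries are (weight, tie_breaker, {symbol: depth in the subtree}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # the two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

counts = {"A": 20, "B": 8, "C": 6, "D": 6, "E": 4}
lengths = huffman_code_lengths(counts)
compressed_bits = sum(counts[s] * lengths[s] for s in counts)
original_bits = 8 * sum(counts.values())
ratio = original_bits / compressed_bits
print(lengths, compressed_bits, original_bits, round(ratio, 1))
```

The most frequent symbol gets a 1-bit code and the other four get 3-bit codes, so the 44 symbols need 20 + 24 x 3 = 92 bits.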

Predictive Techniques

• DPCM
  – Compare adjacent pixels and transmit only the difference between them
  – E.g., use 8 bits per pixel and 4 bits for each difference
• ADPCM
  – Similar to DPCM, but uses a variable number of bits to transmit differences
  – E.g., use 8 bits per pixel and 1-5 bits for each difference

Transform Techniques

• Convert data to an alternate form that better supports specific operations
• Typically operates on blocks of data
  – Larger blocks give better results, but require more computational overhead
• Discrete Cosine Transform
  – Converts an image from the spatial to the frequency domain
  – Operates on 8 x 8 blocks (64 pixels) of data
  – The DC coefficient represents zero spatial frequency, which corresponds to the average value of all the pixels in the 8 x 8 block
  – The AC coefficients represent amplitudes of progressively higher horizontal and vertical spatial frequency components

Run Length Encoding (RLE)

• Form of entropy coding (lossless)
• Content-dependent coding scheme
• A series of repeated values is replaced by a single value and a count
• Example: the sequence abbbbbbbccddddeeddd would be replaced by 1a7b2c4d2e3d
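The count-and-value idea can be sketched in a few lines. This is only an illustration of the textual notation used in the example, assuming the symbols themselves are never digits; the function names are hypothetical:

```python
from itertools import groupby

def rle_encode(text):
    """Replace each run of a repeated symbol by '<count><symbol>'."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))

def rle_decode(encoded):
    """Invert rle_encode; digits build up the count, any other character is the symbol."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

sequence = "abbbbbbbccddddeeddd"
encoded = rle_encode(sequence)
print(encoded)
```

Note that the isolated `a` grows from one character to two, which is why practical schemes treat non-repeating values specially.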

Example Implementation

• Use 8-bit (1 byte) data elements
  – Unsigned [0, 255]; signed [-128, 127]
• Encode repeated values using two bytes
  – First byte provides a count N in the range [-127, -1]
  – The repetition count is -N + 1
  – Second byte contains the data value to repeat
• A run cannot be longer than 128 values
  – Longer runs must be broken into multiple runs
• A non-repeating data value has a non-negative first byte, which is the data value itself

Exercise

Assume the same sequence as before: abbbbbbbccddddeeddd

Given the previous implementation, what would be the transmitted code?

Answer: a, -6 b, -1 c, -3 d, -1 e, -2 d
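The answer can be checked mechanically. A minimal sketch of the two-byte scheme described earlier, assuming data values are limited to [0, 127] so that a negative byte always marks a count; the code is modelled as a list of signed integers rather than packed bytes, and the function names are illustrative:

```python
from itertools import groupby

MAX_RUN = 128  # longest run one count byte can describe (-N + 1 with N = -127)

def rle_encode_bytes(values):
    """Encode values in [0, 127] as signed byte values."""
    out = []
    for value, run in groupby(values):
        remaining = len(list(run))
        while remaining > 0:
            chunk = min(remaining, MAX_RUN)
            if chunk == 1:
                out.append(value)                  # literal: non-negative byte
            else:
                out.extend([-(chunk - 1), value])  # count byte N, repetition = -N + 1
            remaining -= chunk
    return out

def rle_decode_bytes(code):
    """Invert rle_encode_bytes."""
    out, i = [], 0
    while i < len(code):
        if code[i] < 0:
            out.extend([code[i + 1]] * (-code[i] + 1))
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out

sequence = [ord(ch) for ch in "abbbbbbbccddddeeddd"]
code = rle_encode_bytes(sequence)
print(code)
```

With `a` to `e` standing for their character codes 97 to 101, the printed list reads a, -6 b, -1 c, -3 d, -1 e, -2 d.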

Differential Encoding (Source)

Consider a sequence of values S1, S2, S3, etc. that differ in value, but not dramatically

• Encode differences from a specific value
  – E.g., S1, S2-S1, S3-S2, etc.
• E.g., still image
  – Calculate the difference between nearby pixels
  – Areas of rapid color change are characterized by large values, other areas by small values
  – After differential encoding, apply RLE

Differential Encoding Example

Image:

0   0    0    0    0
0   255  250  253  251
0   255  251  254  255
0   0    0    0    0

DPCM: 0, 0, 0, 0, 0, 0, 255, -5, 3, -2, 0, 255, -4, 3, 1, …
RLE: 6(0), 255, -5, 3, -2, 0, 255, -4, 3, 1, …
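The numbers in this example are consistent with differencing each row separately, sending the first pixel of a row as is and every other pixel as the difference from its left neighbour. That reading is an assumption drawn from the values shown; a minimal sketch under it:

```python
from itertools import groupby

def dpcm_rows(image):
    """Per row: first pixel unchanged, then each pixel minus its left neighbour."""
    out = []
    for row in image:
        out.append(row[0])
        out.extend(cur - prev for prev, cur in zip(row, row[1:]))
    return out

def run_lengths(values):
    """Group equal neighbours into (count, value) pairs."""
    return [(len(list(run)), value) for value, run in groupby(values)]

image = [
    [0, 0, 0, 0, 0],
    [0, 255, 250, 253, 251],
    [0, 255, 251, 254, 255],
    [0, 0, 0, 0, 0],
]
diffs = dpcm_rows(image)
runs = run_lengths(diffs)
print(diffs)
print(runs)
```

The first run is six zeros (the whole first row plus the first pixel of the second row), matching the 6(0) in the RLE line.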

JPEG Compression Steps

• Prepare the image for compression
  – Transform color space
  – Down-sample components
  – Interleave the color planes
  – Partition into smaller blocks
• Apply the Discrete Cosine Transform
• Quantize the DC and AC coefficients
• Apply entropy encoding to the quantized numbers

JPEG Algorithm

DCT-based encoding:

Source image (R, G, B components, split into 8x8 blocks) → FDCT → Quantizer (with table) → Entropy encoder (with table) → Compressed image data

Color Transformation

• May want to transform RGB to YUV or YCbCr
  – This is an optional step, so why do it?
• The human visual system detects changes in luminance better than changes in chrominance
• YUV or YCbCr supports this better than RGB
• Facilitates better compression by enabling
  – Down-sampling of the image in chrominance
  – Elimination of more of the high-frequency changes in the chrominance dimensions
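The slides do not give the conversion itself. As an illustration, the sketch below uses the full-range YCbCr coefficients commonly associated with JPEG/JFIF files; treat the constants as an assumption of this example rather than something stated in the deck:

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr) as floats, without rounding or clamping."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                 # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b      # blue-difference chrominance
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b      # red-difference chrominance
    return y, cb, cr

for name, pixel in [("white", (255, 255, 255)), ("grey", (100, 100, 100)), ("red", (255, 0, 0))]:
    print(name, tuple(round(v, 2) for v in rgb_to_ycbcr(*pixel)))
```

Any grey pixel maps to Cb = Cr = 128, so all of its information sits in Y; this separation is what makes it possible to treat the chrominance planes more coarsely.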

Down-Sampling Chrominance

• Optional step, but increases the compression ratio
  – Only down-sample the chrominance components, never the luminance component
• Average groups of pixels in the horizontal and/or vertical direction of the image
• Referred to as H2V1, H2V2, 4:2:2, 4:2:0, etc.
  – E.g., H2V2 means average every 2 pixels in both the horizontal and vertical dimensions
  – H1V1 (4:4:4) means use the same resolution in the horizontal and vertical dimensions as the luminance component

Color sub-sampling

[Figures: 4:4:4 sampling reduced by sub-sampling to 4:2:2, to 4:1:1, and to 4:2:0]

How does colour sub-sampling aid compression?

• A 100 x 100 pixel frame requires:
  – 100 x 100 Y pixels
  – 100 x 100 Cr pixels
  – 100 x 100 Cb pixels
• Using 4:2:0 colour sub-sampling:
  – 100 x 100 Y pixels
  – 50 x 50 Cr pixels
  – 50 x 50 Cb pixels
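Counting the samples shows the saving directly. A minimal sketch, assuming even image dimensions and simple averaging of each 2 x 2 group for the chrominance planes; the function names are illustrative:

```python
def sample_count(width, height, h_factor, v_factor):
    """Total samples for one Y plane plus two chrominance planes reduced by the given factors."""
    luma = width * height
    chroma = (width // h_factor) * (height // v_factor)
    return luma + 2 * chroma

def downsample_2x2(plane):
    """Average every 2 x 2 group of a chrominance plane (H2V2)."""
    return [
        [(plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
         for x in range(0, len(plane[0]), 2)]
        for y in range(0, len(plane), 2)
    ]

full = sample_count(100, 100, 1, 1)        # no sub-sampling
subsampled = sample_count(100, 100, 2, 2)  # 4:2:0
print(full, subsampled, subsampled / full)
```

The frame drops from 30,000 to 15,000 samples, a 50% reduction before any transform or entropy coding is applied.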

Blocks and Pixel Shift

• Divide each component into 8x8 blocks
  – Send each block to the FDCT
• Shift each pixel from the unsigned range [0, 255] into the signed range [-128, 127]
  – The DCT requires the range to be centered around 0

Forward DCT

• Convert from the spatial to the frequency domain
  – Convert the intensity function into a weighted sum of elementary frequency components
  – Identify pieces of spectral information that can be thrown away without visible loss of quality
• Intensity values in each color plane often change slowly (see next example)
  – Contributions from higher frequency bands in the frequency domain can be ignored
  – Better compression without visible loss of quality

Equations for 2D DCT

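Forward DCT:

$$F(u,v) = \frac{2}{\sqrt{nm}}\, C(u)\, C(v) \sum_{y=0}^{m-1} \sum_{x=0}^{n-1} I(x,y)\, \cos\frac{(2x+1)u\pi}{2n}\, \cos\frac{(2y+1)v\pi}{2m}$$

Inverse DCT:

$$I(x,y) = \frac{2}{\sqrt{nm}} \sum_{v=0}^{m-1} \sum_{u=0}^{n-1} C(u)\, C(v)\, F(u,v)\, \cos\frac{(2x+1)u\pi}{2n}\, \cos\frac{(2y+1)v\pi}{2m}$$

where $C(k) = 1/\sqrt{2}$ for $k = 0$ and $C(k) = 1$ otherwise.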
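The transform pair can be evaluated directly from its definition. The sketch below is a deliberately naive, unoptimised implementation for small blocks, assuming the usual normalisation C(0) = 1/sqrt(2) and C(k) = 1 otherwise; real codecs use fast factorised versions instead:

```python
import math

def c(k):
    """Normalisation factor C(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def basis(x, u, n):
    """One cosine basis term, cos((2x + 1) * u * pi / (2n))."""
    return math.cos((2 * x + 1) * u * math.pi / (2 * n))

def forward_dct(block):
    """block[y][x] holds I(x, y); the result holds F(u, v) at result[v][u]."""
    m, n = len(block), len(block[0])
    scale = 2 / math.sqrt(n * m)
    return [[scale * c(u) * c(v) * sum(block[y][x] * basis(x, u, n) * basis(y, v, m)
                                       for y in range(m) for x in range(n))
             for u in range(n)] for v in range(m)]

def inverse_dct(coeffs):
    """coeffs[v][u] holds F(u, v); the result holds I(x, y) at result[y][x]."""
    m, n = len(coeffs), len(coeffs[0])
    scale = 2 / math.sqrt(n * m)
    return [[scale * sum(c(u) * c(v) * coeffs[v][u] * basis(x, u, n) * basis(y, v, m)
                         for v in range(m) for u in range(n))
             for x in range(n)] for y in range(m)]

flat = [[10] * 8 for _ in range(8)]   # a block with no spatial variation
F = forward_dct(flat)
print(round(F[0][0], 6))              # DC coefficient; every AC coefficient is ~0
```

For a flat block all the energy lands in the DC coefficient, which is the behaviour the later quantization and entropy coding steps rely on.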

Visualization of Basis Functions

[Figure: DCT basis functions; frequency increases along both the horizontal and the vertical axis]

Quantization

• Divide each coefficient in a block by an integer in the range [1, 255]
  – The divisors come in the form of a table, the same size as a block
  – Divide the block of coefficients by the table, entry by entry, and then round each result to the nearest integer
• In the decoding process, multiply the quantized coefficients by the corresponding table entries
  – Get back a number close to, but not the same as, the original
  – The error is never more than half of the quantization number
• Larger numbers in the quantization table cause more loss
  – This is the main source of loss in JPEG
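A small numeric example makes the rounding loss visible. A minimal sketch on a 2 x 2 corner of a block; the coefficient values are made up for illustration and the function names are hypothetical:

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    return [[round(cf / q) for cf, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]

def dequantize(levels, table):
    """Decoder side: multiply each quantized level by its table entry."""
    return [[lv * q for lv, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]

coeffs = [[-415.4, 30.2],
          [4.6, -21.9]]   # illustrative DCT coefficients
table = [[16, 11],
         [12, 12]]        # quantization numbers for the same positions

levels = quantize(coeffs, table)
restored = dequantize(levels, table)
print(levels)
print(restored)
```

The small coefficient 4.6 is quantized to 0 and cannot be recovered, while the large DC value comes back within 0.6 of the original; each error stays within half of its quantization number.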

De facto Quantization Table

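16   11   10   16   24   40   51   61
12   12   14   19   26   58   60   55
14   13   16   24   40   57   69   56
14   17   22   29   51   87   80   62
18   22   37   56   68  109  103   77
24   35   55   64   81  104  113   92
49   64   78   87  103  121  120  101
72   92   95   98  112  100  103   99

[Arrows along both axes: the eye becomes less sensitive toward the right and toward the bottom of the table]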

Entropy Encoding

Compress the sequence of quantized DC and AC coefficients from the quantization step

Further increase compression, but without loss

• Separate DC from AC components
  – DC components change slowly, and thus will be encoded using difference encoding

DC Encoding

• DC represents the average intensity of a block
  – Because image intensity tends to change slowly, DC values tend to change slowly from block to block
  – Encode using a difference encoding scheme
  – Use a 3x3 pattern of blocks
• Because the difference tends to be near zero, fewer bits are needed in the encoding
  – Categorize each difference into a difference class
  – Send the index of the difference class, followed by the bits representing the difference
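One way to realise the class-plus-bits idea, and as far as I know the one baseline JPEG uses, is to take the class as the number of bits needed for the magnitude of the difference. The sketch below assumes that convention, and assumes negative differences are stored offset by 2^size - 1; the function name is hypothetical:

```python
def dc_difference_code(diff):
    """Return (class index, extra bits as a string) for one DC difference."""
    size = abs(diff).bit_length()          # class = bits needed for |diff|
    if size == 0:
        return 0, ""                       # a zero difference needs no extra bits
    value = diff if diff > 0 else diff + (1 << size) - 1
    return size, format(value, f"0{size}b")

dc_values = [52, 55, 54, 54]               # illustrative DC coefficients of successive blocks
previous = 0
for dc in dc_values:
    print(dc, dc_difference_code(dc - previous))
    previous = dc
```

Only the class index would then be Huffman coded; the extra bits are sent as they are, so small differences cost very little.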

AC Encoding

• Use a zig-zag ordering of coefficients. Why?
  – Orders frequency components from low to high
  – Should produce a maximal series of 0s at the end
  – Lends itself well to RLE
• Apply RLE to the ordering
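The zig-zag scan can be sketched by sorting block positions along anti-diagonals and alternating the direction on each one. The quantized block below is made up for illustration, and both function names are hypothetical:

```python
def zigzag(block):
    """Return the entries of a square block in zig-zag order, starting at the top-left."""
    n = len(block)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Sort by anti-diagonal; odd diagonals run downward, even ones upward.
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [block[r][c] for r, c in order]

def split_trailing_zeros(seq):
    """Return (prefix without the trailing zeros, number of trailing zeros)."""
    end = len(seq)
    while end > 0 and seq[end - 1] == 0:
        end -= 1
    return seq[:end], len(seq) - end

quantized = [[0] * 8 for _ in range(8)]
quantized[0][0], quantized[0][1], quantized[1][0] = -26, -3, 1   # only low frequencies survive

scanned = zigzag(quantized)
prefix, zeros = split_trailing_zeros(scanned)
print(prefix, zeros)
```

All 61 zeros end up in one run at the tail of the scan, which a run-length coder can represent in a single step.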

Huffman Encoding

• String together the RLE of the AC coefficients along with the DC difference indices and values
• Apply Huffman encoding to the resulting sequence
• Attach appropriate headers
• Finally have the JPEG image!
