Multimedia Technology, 2. Compression Algorithms, © Wolfgang Effelsberg

2.3 Video Compression

2.3.1 MPEG

MPEG stands for Moving Picture Experts Group (a committee of ISO).

The main goal of MPEG-1 was: compress a video signal (with audio) to a data stream of 1.5 Mbit/s, the data rate of a T1 link in the U.S., and the rate that can be streamed from a CD-ROM.
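To get a feeling for what the 1.5 Mbit/s target means, the following sketch estimates the required compression ratio. The source format is an assumption that the text does not state: CIF resolution (352x288), 25 frames per second and 4:2:0 color subsampling, i.e. 12 bits per pixel on average. The audio share of the stream is ignored.

```python
# Rough estimate of the compression ratio MPEG-1 has to achieve.
# ASSUMPTIONS (not from the text): CIF frames, 25 frames/s, 4:2:0 subsampling.
WIDTH, HEIGHT = 352, 288
FRAMES_PER_SECOND = 25
BITS_PER_PIXEL = 12  # 8 bits for Y plus, on average, 2 bits each for U and V

TARGET_BIT_RATE = 1_500_000  # 1.5 Mbit/s, the MPEG-1 goal stated in the text


def raw_bit_rate(width, height, fps, bits_per_pixel):
    """Bit rate of the uncompressed video in bit/s."""
    return width * height * fps * bits_per_pixel


raw = raw_bit_rate(WIDTH, HEIGHT, FRAMES_PER_SECOND, BITS_PER_PIXEL)
ratio = raw / TARGET_BIT_RATE

print(f"uncompressed: {raw / 1e6:.1f} Mbit/s")
print(f"required compression ratio: about {ratio:.0f}:1")
```

Under these assumptions the uncompressed video needs about 30.4 Mbit/s, so the encoder has to reduce the data by a factor of roughly 20.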


Goals of the MPEG-1 Compression Algorithm

• Random access within 0.5 s while maintaining a good image quality for the video

• Fast forward / fast rewind

• Possibility to play the video backwards

• Allow easy and precise editing.


MPEG Frame Types

MPEG distinguishes four types of frames:

I-Frame (Intra Frame)

Intra-coded full image, very similar to the JPEG image, encoded with DCT, quantization, run-length coding and Huffman coding

P-Frame (Predicted Frame)

Uses delta encoding. The P frame refers to preceding I- and P-frames. DPCM encoded macro blocks, motion vectors possible.

B-Frame (Interpolated Frame)

"Bidirectionally predictive coded picture". The B frame refers to preceding and succeeding frames, interpolates the data and encodes the differences.

D-Frame

"DC coded picture", only the DC coefficient of each block is coded (upper left-hand corner of the matrix), e.g., for previews.

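The role of the DC coefficient in I-frames and D-frames can be illustrated with a small sketch. It computes an 8x8 DCT with the JPEG-style normalization and then quantizes the coefficients. The uniform quantizer step of 16 is an assumption made for brevity; real JPEG and MPEG encoders use a quantization table. For a perfectly flat block, all the information ends up in the DC coefficient in the upper left-hand corner.

```python
import math

N = 8  # block size used by the DCT


def dct_2d(block):
    """Forward 8x8 DCT-II with JPEG-style normalization (plain, unoptimized)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs


def quantize(coeffs, step):
    """Uniform quantization (ASSUMPTION: one step size instead of a table)."""
    return [[round(value / step) for value in row] for row in coeffs]


flat_block = [[100] * N for _ in range(N)]  # a block with constant brightness
coeffs = dct_2d(flat_block)
quantized = quantize(coeffs, step=16)

print("DC coefficient:", round(coeffs[0][0], 3))  # 8 times the mean value
print("quantized DC:", quantized[0][0])
```

A D-frame would keep only `quantized[0][0]` of each block, which is enough for a coarse preview.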

"Group of Pictures" in MPEG

The sequence of I, P and B frames is not standardized but can be chosen according to the requirements of the application. This allows the user to choose his/her own compromise between video quality, compression rate, ease of editing, etc.
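Because a B frame refers to a succeeding frame, that reference frame has to be coded and transmitted before the B frames that depend on it. The following sketch reorders a group of pictures from display order into such a coding order. The pattern IBBPBBP is only an example, and the function name `coding_order` is hypothetical.

```python
def coding_order(display_order):
    """Reorder frames so that every B frame follows both of its references.

    `display_order` is a string of frame types, e.g. "IBBPBBP".
    Returns a list of (display_index, frame_type) tuples.
    """
    ordered = []
    pending_b = []  # B frames waiting for their succeeding reference frame
    for index, frame_type in enumerate(display_order):
        if frame_type == "B":
            pending_b.append((index, frame_type))
        else:
            # An I or P frame is a reference: emit it first, then the
            # B frames that were displayed before it.
            ordered.append((index, frame_type))
            ordered.extend(pending_b)
            pending_b = []
    # Trailing B frames would need a reference from the next group.
    ordered.extend(pending_b)
    return ordered


gop = "IBBPBBP"
for index, frame_type in coding_order(gop):
    print(f"{frame_type}{index}", end=" ")
print()
```

For the example pattern the coding order is I0 P3 B1 B2 P6 B4 B5. More B frames improve compression but increase the reordering delay and make precise editing harder.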


MPEG Encoder

[Figure: block diagram of the MPEG encoder. Components: DCT, quantizer, inverse quantizer, IDCT, two frame memories, motion estimation, motion compensation and entropy encoder. Signals: motion vectors, predictive frame.]


MPEG Decoder

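[Figure: block diagram of the MPEG decoder. Components: entropy decoder, inverse quantizer, IDCT, previous picture store, future picture store, motion compensation and a multiplexer (Mux) with inputs labelled 0 and 1/2.]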


Temporal Redundancy and Motion Vectors

"Motion Compensated Interpolation"

On the encoder side the search range can be chosen as a parameter: the larger the search range, the higher the potential for compression, but the longer the run time of the algorithm.

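[Figure: block-matching technique across the previous frame, the current frame and the future frame, with blocks A, B and C.]

A block B of the current frame can be predicted in three ways:

1. block B = block A

2. block B = block C

3. block B = (block A + block C) / 2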
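The block-matching search and the three prediction options can be sketched as follows. This is a minimal illustration, not the search strategy of any particular encoder: it assumes an exhaustive search and the sum of absolute differences (SAD) as the matching criterion, and all function names are hypothetical. The number of candidate positions grows with (2 * search_range + 1)^2, which is why a larger search range costs more run time.

```python
def sad(frame, top, left, block, size):
    """Sum of absolute differences between `block` and a region of `frame`."""
    return sum(abs(frame[top + r][left + c] - block[r][c])
               for r in range(size) for c in range(size))


def find_motion_vector(reference, block, top, left, size, search_range):
    """Exhaustive search for the best match of `block` in `reference`.

    Returns (cost, dy, dx), where (dy, dx) is the motion vector.
    """
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= height - size and 0 <= x <= width - size:
                cost = sad(reference, y, x, block, size)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best


def interpolate(block_a, block_c):
    """Option 3: average of the matching blocks in both reference frames."""
    return [[(a + c) // 2 for a, c in zip(row_a, row_c)]
            for row_a, row_c in zip(block_a, block_c)]


def make_frame(height, width, patch, top, left):
    """Black frame with a small patch placed at (top, left)."""
    frame = [[0] * width for _ in range(height)]
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            frame[top + r][left + c] = value
    return frame


patch = [[10, 20], [30, 40]]
previous = make_frame(8, 8, patch, top=2, left=3)
# In the current frame the patch has moved one row down and two columns right.
block_b = patch
cost, dy, dx = find_motion_vector(previous, block_b, top=3, left=5,
                                  size=2, search_range=2)
print("motion vector:", (dy, dx), "cost:", cost)
```

In the example the best match is found with motion vector (-1, -2) and a cost of 0, so only the vector has to be encoded for this block.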


MPEG-2

MPEG-2 extends MPEG-1 for higher bandwidths and better image qualities, up to HDTV. It was developed jointly by ISO and ITU-T (where the standard is called H.262).

MPEG-2 defines scalable data streams which allow receivers with different bandwidth and processing power to receive and decode only parts of the data stream.


Scalability in MPEG-2 (1)

• SNR scalability: Each frame is encoded in several layers. A receiver that decodes only the base layer gets a low image quality. A receiver decoding additional (higher) layers gets a better image quality. An example is color subsampling: the base layer contains only one quarter of the values for the U and V components, compared to the Y component. The enhancement layer contains the U and V components in full resolution, for better color quality.

• Spatial scalability: The frames are encoded with different pixel resolutions (e.g., for a standard TV set and for an HDTV set). Both encodings are transmitted in the same data stream.


Scalability in MPEG-2 (2)

• Temporal scalability: The base layer contains only very few frames per second; the enhancement layers contain additional frames. Receivers decoding the higher layers will thus get a higher frame rate (i.e., a higher temporal resolution).

• Data partitioning: The data stream is decomposed into several streams with different amounts of redundancy for error correction. The most important parts of the stream are encoded in the base layer, e.g., the low-frequency coefficients of the DCT and the motion vectors. This layer can then be protected by an error-correcting code for better error resilience than the enhancement layers, where errors are not as harmful.


MPEG-2 Video Profiles

| Level | Simple profile | Main profile | SNR scalable profile | Spatially scalable profile | High profile |
|---|---|---|---|---|---|
| Features | no B frames, not scalable | B frames, not scalable | B frames, SNR scaling | B frames, spatial scaling | B frames, spatial or SNR scaling |
| High level (1920x1152x60) | - | <=80 Mbit/s | - | - | <=100 Mbit/s |
| High-1440 level (1440x1152x60) | - | <=60 Mbit/s | - | <=60 Mbit/s | <=80 Mbit/s |
| Main level (720x576x30) | <=15 Mbit/s | <=15 Mbit/s | <=15 Mbit/s | - | <=20 Mbit/s |
| Low level (352x288x30) | - | <=4 Mbit/s | <=4 Mbit/s | - | - |


MPEG-4/ASP (1)

Originally, ISO and ITU-T had planned a standard MPEG-3 for HDTV at very high bit rates. This work was later integrated into MPEG-2. This explains why there is no MPEG-3 standard.

MPEG-4 is also known as MPEG-4 part 2 or MPEG-4/ASP (Advanced Simple Profile). It was originally planned for video at very low bit rates (e.g., for wireless PDAs). Later the ISO committee decided to concentrate on an entirely new technology, namely encoding in the form of sets of objects overlaid to form an image. The encoding technique can be chosen separately for each object. This object-oriented encoding also opens up much richer possibilities for processing on the receiver side.


MPEG-4/ASP (2)

History of MPEG-4

• Competitive tests for video functionalities in November 1995

• Development of the standards from 1996 to 1998

• Stable final committee draft for the video part in March 1998

• The standard was fixed at the end of 1999.

Features of MPEG-4

• A scene is constructed of multiple independent objects.

• Objects are merged into a scene on the decoder side.

• A combination of different object types and different coding methods is possible.

• An improved coding efficiency with bit rates between 5 kbit/s and 50 Mbit/s is provided.

• The error-resilient video coding is optimized for mobile and packet networks.


MPEG-4/ASP (3)

Separate encoding of background and foreground. The background is static.

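[Figure: object-based encoding. The input video passes through object formation, which yields object 1 and object 2. Each object is coded separately (object 1 coding, object 2 coding), and a multiplexer (MUX) merges the results into the bitstream.]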


MPEG-4/ASP (4)

Decoding of an MPEG-4 system stream

[Figure: decoding of an MPEG-4 system stream. Stages from the network layer onwards: demultiplex, elementary streams, decompression, composition and rendering. Decompression delivers primitive AV objects, the scene description (script or classes) and composition information. Upstream data (user events, class requests, ...) flows back to the network layer.]


MPEG-4/ASP (5)

Object hierarchy for our decoding example

[Figure: object hierarchy. Root: scene. Second level: person, 2D background, furniture, audiovisual presentation. Leaves: voice, sprite, globe, desk.]


MPEG-4/ASP (6)

Scalability by "layered encoding" in MPEG-4


MPEG-4/ASP (7)

The MPEG-4/ASP standard describes how to encode the objects. The automatic segmentation of the objects (i.e., how to identify single objects) is not specified.

Object segmentation techniques:

• Blue screening

• Automatic segmentation based on color or motion

• Semi-automatic segmentation: A user selects an object, the tracking is then done automatically.


MPEG-4/ASP (8)

Features of MPEG-4/ASP

Similarity to previous standards like MPEG-1 or MPEG-2:

• based on the DCT

• quantization is similar to MPEG-2

• supports B-frames.

Improvements:

• additional coding of objects which can overlay an image

• global motion compensation

• ¼ pixel precision for motion compensation

• better image quality for low-bandwidth encoding.

Critical:

• computationally intensive, in particular on the encoding side


MPEG-4/ASP (9): Resolutions and Bitrates

[Figure: resolution plotted against bit rate. Resolution axis: QCIF (176x144), CIF (352x288), Standard TV (720x576), 720P (HD Ready, 1280x720), 1080P (Full HD, 1920x1080), 2160P (Quad HDTV, 3840x2160). Bit rate axis: 64 kbit/s, 384 kbit/s, 1.5, 2, 4, 15, 38, 60, 90 and 180 Mbit/s. Regions shown: MPEG-1 Constrained Parameter Bitstream (CPB); MPEG-2 Low Level, Main Level and High Level; MPEG-4 Part 2 Simple, Core, Main Profile and Studio Profile.]


2.3.2 ITU Recommendation H.261

Also known as "p*64 kbit/s"

• A video coding technique for video data at p x 64 kbit/s.

• Originally developed for ISDN

• Parameter p in [1,30]

• p small implies low image quality at low data rates. An example is video telephony with p=1 or p=2.

• p larger implies better video quality at higher data rates. Typical is p=6 for company video conferencing over six parallel ISDN B-channels.

• Intraframe coding: based on the DCT. Very similar to JPEG, but there is only one quantization factor for all values of the block (no quantization table).

• Interframe coding: very similar to the P frames in MPEG-1.

• There are no B frames in H.261.
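The relation between the parameter p and the data rate is simple arithmetic. The short sketch below shows it for the values of p mentioned above; the function name `h261_bit_rate` is hypothetical.

```python
B_CHANNEL_KBIT_S = 64  # data rate of one ISDN B-channel


def h261_bit_rate(p):
    """Data rate in kbit/s for a given p, with p restricted to [1, 30]."""
    if not 1 <= p <= 30:
        raise ValueError("p must be in [1, 30]")
    return p * B_CHANNEL_KBIT_S


for p in (1, 2, 6, 30):
    print(f"p = {p:2d}: {h261_bit_rate(p)} kbit/s")
```

The typical video conferencing setup with p=6 thus runs at 384 kbit/s, and the maximum of p=30 corresponds to 1920 kbit/s.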


Important Parameters of H.261

CIF = Common Intermediate Format

Hierarchy of the elements of the data stream:

| Structure element | Description |
|---|---|
| picture | a full frame |
| group of blocks | 33 macro blocks |
| macro block | 16 x 16 Y, 8 x 8 Cb, Cr |
| block | 8 x 8 pixels (unit for the DCT) |
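The hierarchy can be turned into concrete numbers for the two image formats. The sketch below assumes the CIF (352x288) and QCIF (176x144) luminance resolutions; the helper name `picture_structure` is hypothetical. A macro block consists of four 8x8 Y blocks plus one Cb and one Cr block.

```python
MACRO_BLOCK_SIZE = 16        # luminance pixels per macro block side
MACRO_BLOCKS_PER_GOB = 33    # from the hierarchy table
BLOCKS_PER_MACRO_BLOCK = 6   # four 8x8 Y blocks, one Cb block, one Cr block

FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}


def picture_structure(width, height):
    """Count the structure elements of one picture."""
    macro_blocks = (width // MACRO_BLOCK_SIZE) * (height // MACRO_BLOCK_SIZE)
    return {
        "macro_blocks": macro_blocks,
        "groups_of_blocks": macro_blocks // MACRO_BLOCKS_PER_GOB,
        "blocks": macro_blocks * BLOCKS_PER_MACRO_BLOCK,
    }


for name, (width, height) in FORMATS.items():
    print(name, picture_structure(width, height))
```

A CIF picture therefore contains 396 macro blocks in 12 groups of blocks, a QCIF picture 99 macro blocks in 3 groups of blocks.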


The H.261 Encoder

[Figure: block diagram of the H.261 encoder. Components: DCT, quantizer, inverse quantizer, IDCT, two frame memories, motion estimation, motion compensation, loop filter, entropy encoder and two multiplexers (mux). Labels: INTRA, INTER, 0. Signal: predictive frame.]


Status of H.261

H.261 was very widely used in practice, and many products were available on the market from many manufacturers. It replaced earlier proprietary standards for video telephony and was itself superseded by newer standards such as H.263 and H.264.

Pure software implementations are available as well as stand-alone hardware solutions ("black boxes") and combined solutions, mainly for the PC.


H.263

H.263 is the successor standard of H.261 at ITU-T, incorporating much of the experience gained with MPEG-1.

Some differences between H.263 and H.261 are:

• There are five image sizes instead of two.

• There is a bi-directional interpolation where exactly one B frame follows each P frame.

• There are negotiable options that allow the algorithm to be tailored to specific applications. For example, arithmetic coding can be chosen instead of run-length/Huffman coding in the entropy encoding step.


H.264, MPEG-4/AVC (1)

H.264 is also known as MPEG-4 Part 10, MPEG-4/AVC (Advanced Video Coding), or ISO/IEC 14496-10.

• Developed by the Joint Video Team (JVT): joint work by experts of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG)

• First version of the standard completed in May 2003.

Goals

• Good video quality at significantly lower bit rates than previous standards (MPEG-2, H.263, MPEG-4 Part 2/ASP)

• The coding complexity should not be much higher in comparison with previous standards.

• High flexibility: support of very low/high bit rates, very low/high video resolutions (HDTV), DVD storage, ITU-T multimedia telephony.


H.264, MPEG-4/AVC (2)

Comparison of H.264 with older standards

| Previous standards | H.264 / MPEG-4/AVC | Effect |
|---|---|---|
| Discrete cosine transform | Integer transformation: integer approximation of the DCT, 16-bit integer arithmetic precision, based on addition, subtraction and binary shift operations | much faster, easier to implement in hardware |
| Entropy coding: run-length, VLC codes (variable length coding, similar to Huffman) | Supports both VLC codes and arithmetic coding | more efficient, but requires considerably more processing to decode |
| Block size: 8x8 pixels | Variable block sizes: 4x4 to 16x16 pixels | reduction of blocking artifacts; better segmentation of moving objects; significant improvement of the visual quality |
| One motion vector for each macro block (16x16 pixels) | A different motion vector can be used for each sub-block | better compression/quality in case of complex motion |
| Precision for motion compensation: MPEG-2: ½ pixel precision; H.263, MPEG-4/ASP: ¼ pixel precision (optional) | Always ¼ pixel precision for motion compensation | more precise description of moving objects |
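The integer transformation can be illustrated with the 4x4 core transform matrix commonly given for H.264. The sketch below is a simplified illustration, not a conforming implementation: it leaves out the scaling that a real codec folds into the quantization step. It computes the transform once with additions, subtractions and shifts only, and once as a plain matrix product to check the result. The function names are hypothetical.

```python
# 4x4 forward core transform matrix (integer approximation of the DCT).
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def butterfly_1d(x):
    """Compute CF * x using only additions, subtractions and shifts."""
    s0, s1 = x[0] + x[3], x[1] + x[2]
    d0, d1 = x[0] - x[3], x[1] - x[2]
    return [s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1)]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def forward_transform(block):
    """Y = CF * X * CF^T, computed column-wise and then row-wise."""
    columns = [butterfly_1d(column) for column in transpose(block)]
    intermediate = transpose(columns)  # equals CF * X
    return [butterfly_1d(row) for row in intermediate]


def matmul(a, b):
    """Plain integer matrix product, used only as a reference."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


block = [[5, 11, 8, 10],
         [9, 8, 4, 12],
         [1, 10, 11, 4],
         [19, 6, 15, 7]]

fast = forward_transform(block)
reference = matmul(matmul(CF, block), transpose(CF))
print("results agree:", fast == reference)
```

Because all operations are exact integer operations, encoder and decoder compute identical values, which is not guaranteed for a floating-point DCT.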


H.264, MPEG-4/AVC (3)

Comparison of H.264 with older standards (cont.)

The variable block size and the deblocking filter improve the visual quality most.

| Previous standards | H.264 / MPEG-4/AVC | Effect |
|---|---|---|
| Prediction of DC component in Intra-frames | Pixel values are predicted based on already decoded pixels in neighbour blocks. Only the differences are encoded. | reduces the size of I-frames |
| A P-frame refers to the last I- or P-frame. A B-frame refers to two I- or P-frames. | P- or B-frames can refer to up to five different frames at the same time | better compression in case of periodic changes in an image |
| | Separate weights can be defined for the referenced blocks | better encoding of special effects such as fades or dissolves |
| No deblocking filter | Deblocking is mandatory, P- and B-frames refer to deblocked images | significant improvement of the visual quality |
| Sample bit depth precision: MPEG-2: 8 bits/sample; MPEG-4: up to 12 bits/sample | Sample bit depth precision: 8 bits/pixel (Baseline Profile, Main Profile, High Profile), up to 14 bits/pixel (High 4:4:4 Predictive Profile) | better quality of high-contrast videos (medical, surveillance) |
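Why separate weights help with fades can be shown with a toy example. The sketch assumes a fade to half brightness and uses a floating-point weight for readability; this is a simplification, and the weight representation of a real H.264 codec differs. The function names are hypothetical.

```python
def predict(reference_block, weight=1.0, offset=0):
    """Prediction of a block from a reference block with weight and offset."""
    return [[round(weight * sample + offset) for sample in row]
            for row in reference_block]


def residual_energy(current, prediction):
    """Sum of squared differences that remain to be encoded."""
    return sum((c - p) ** 2
               for row_c, row_p in zip(current, prediction)
               for c, p in zip(row_c, row_p))


reference = [[200, 180], [160, 140]]
current = [[100, 90], [80, 70]]  # same content at half brightness (a fade)

plain = residual_energy(current, predict(reference))
weighted = residual_energy(current, predict(reference, weight=0.5))
print("residual without weighting:", plain)
print("residual with weight 0.5:  ", weighted)
```

Without weighting the residual energy is 29400; with the matching weight it drops to 0, so only the weight itself has to be transmitted for this block.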


H.264, MPEG-4/AVC (4)

The acceptance of H.264 is very high due to its high coding efficiency. Many applications have been developed based on H.264:

• HDTV: used for HD DVD, Blu-ray Disc and high definition TV (DVB-S2)

• Portable video: DVB-H and DMB (Digital Multimedia Broadcasting)

• Codecs are available for PCs (e.g., QuickTime V.7), video conferencing systems and camcorders.
