Multimedia Technology, 2. Compression Algorithms, © Wolfgang Effelsberg
2.3 Video Compression
2.3.1 MPEG
MPEG stands for Moving Picture Experts Group (a committee of ISO).
The main goal of MPEG-1 was: compress a video signal (with audio) to a data stream of 1.5 Mbit/s, the data rate of a T1 link in the U.S., and the rate that can be streamed from a CD-ROM.
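To get a feeling for how demanding the 1.5 Mbit/s target is, the following back-of-the-envelope sketch compares it with an uncompressed video signal. The source format (352x288 pixels, 25 frames/s, 8-bit 4:2:0 video, i.e. 12 bits per pixel on average) is an assumption chosen for illustration; it is not stated on the slide.

```python
def raw_video_bitrate(width, height, fps, bits_per_pixel=12):
    """Uncompressed bit rate in bit/s (12 bits/pixel assumes 8-bit 4:2:0 video)."""
    return width * height * bits_per_pixel * fps


MPEG1_TARGET = 1_500_000  # bit/s for video plus audio, as stated above

raw = raw_video_bitrate(352, 288, 25)  # assumed source format
ratio = raw / MPEG1_TARGET

print(f"raw video: {raw / 1e6:.1f} Mbit/s")
print(f"required compression: about {ratio:.0f}:1")
```

Under these assumptions the raw signal needs roughly 30 Mbit/s, so the encoder has to remove about 95 % of the data.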
Goals of the MPEG-1 Compression Algorithm
• Random access within 0.5 s while maintaining a good image quality for the video
• Fast forward / fast rewind
• Possibility to play the video backwards
• Allow easy and precise editing.
MPEG Frame Types
MPEG distinguishes four types of frames:
I-Frame (Intra Frame)
Intra-coded full image, very similar to the JPEG image, encoded with DCT, quantization, run-length coding and Huffman coding
P-Frame (Predicted Frame)
Uses delta encoding. The P frame refers to preceding I- and P-frames. DPCM encoded macro blocks, motion vectors possible.
B-Frame (Interpolated Frame)
"Bidirectionally predictive coded pictures". The B frame refers to preceding and succeeding frames, interpolates the data and encodes the differences.
D-Frame
"DC coded picture", only the DC coefficient of each block is coded (upper left-hand corner of the matrix), e.g., for previews.
"Group of Pictures" in MPEG
The sequence of I, P and B frames is not standardized but can be chosen according to the requirements of the application. This allows the user to choose his/her own compromise between video quality, compression rate, ease of editing, etc.
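A minimal sketch of how such a frame sequence could be generated and reordered for transmission. The parameters `n` (frames per group of pictures) and `m` (distance between I/P anchor frames) are names assumed here for illustration. Since a B frame needs both of its reference frames before it can be decoded, the anchors are sent first.

```python
def gop_pattern(n, m):
    """Frame types in display order: an I frame, then an anchor every m frames."""
    return ["I" if i == 0 else ("P" if i % m == 0 else "B") for i in range(n)]


def coding_order(types):
    """Display-order indices rearranged so that each anchor precedes the
    B frames that are displayed before it but depend on it."""
    order, pending_b = [], []
    for index, frame_type in enumerate(types):
        if frame_type == "B":
            pending_b.append(index)
        else:
            order.append(index)       # send the anchor first ...
            order.extend(pending_b)   # ... then the B frames it closes off
            pending_b = []
    # Trailing B frames would need the next group's I frame;
    # in this sketch they are simply appended at the end.
    order.extend(pending_b)
    return order


pattern = gop_pattern(9, 3)
print("display order:", "".join(pattern))
print("coding order: ", coding_order(pattern))
```

A larger `m` means more B frames and usually better compression, at the cost of more buffering and delay; a smaller `n` means more I frames and therefore finer random access and easier editing.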
MPEG Encoder
[Figure: block diagram of the MPEG encoder. Blocks: DCT, quantizer, entropy encoder, inverse quantizer, IDCT, frame memory (twice), motion estimation, motion compensation. Signal labels: motion vectors, predictive frame.]
MPEG Decoder
[Figure: block diagram of the MPEG decoder. Blocks: entropy decoder, inverse quantizer, IDCT, previous picture store, future picture store, motion compensation, Mux. Labels at the Mux: 1/2, 0.]
Temporal Redundancy and Motion Vectors
"Motion Compensated Interpolation"
On the encoder side the search range can be chosen as a parameter: the larger the search range, the higher the potential for compression, but the longer the run time of the algorithm.
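[Figure: block-matching technique. Block A lies in the previous frame, block B in the current frame and block C in the future frame.]
Block B of the current frame can be predicted in three ways:
1. block B = block A
2. block B = block C
3. block B = (block A + block C) / 2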
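The role of the search range and of the three prediction options can be sketched as follows. This is a simplified illustration, not the procedure of any particular encoder: the exhaustive search, the sum of absolute differences (SAD) as matching criterion and all function names are assumptions made here. Block A is taken from the previous frame and block C from the future frame.

```python
def sad(ref, rx, ry, cur, cx, cy, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(size) for i in range(size))


def best_match(ref, cur, bx, by, size, search_range):
    """Exhaustive search in ref for the block of cur located at (bx, by).

    Returns (dx, dy, cost) with the smallest SAD within +/- search_range.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(ref, rx, ry, cur, bx, by, size)
                if best is None or cost < best[2]:
                    best = (dx, dy, cost)
    return best


def choose_b_mode(block_a, block_b, block_c):
    """Pick the option (A, C or their average) that predicts block B best.

    Blocks are given as flat lists of sample values.
    """
    candidates = {
        "forward": block_a,
        "backward": block_c,
        "interpolated": [(a + c) / 2 for a, c in zip(block_a, block_c)],
    }
    errors = {name: sum(abs(b - p) for b, p in zip(block_b, pred))
              for name, pred in candidates.items()}
    return min(errors, key=errors.get)


# Synthetic frames: the current frame is the previous one shifted right by 2.
previous = [[5 * x + 11 * y for x in range(16)] for y in range(16)]
current = [[row[x - 2] if x >= 2 else 0 for x in range(16)] for row in previous]

vector = best_match(previous, current, bx=6, by=4, size=4, search_range=3)
mode = choose_b_mode([10, 10], [15, 15], [20, 20])
print(vector, mode)
```

The number of candidate positions grows with (2 * search_range + 1) squared, which is why a larger search range costs run time. With `search_range=1` the true displacement of two pixels in the example can no longer be found, so the remaining prediction error is larger.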
MPEG-2
MPEG-2 extends MPEG-1 for higher bandwidths and better image quality, up to HDTV. It was developed jointly by ISO and ITU-T (where the standard is called H.262).
MPEG-2 defines scalable data streams which allow receivers with different bandwidth and processing power to receive and decode only parts of the data stream.
Scalability in MPEG-2 (1)
• SNR scalability: Each frame is encoded in several layers. A receiver that decodes only the base layer gets a low image quality. A receiver that also decodes additional (higher) layers gets a better image quality. An example is color subsampling: the base layer contains only one quarter as many values for the U and V components as for the Y component. The enhancement layer contains the U and V components in full resolution, for better color quality.
• Spatial scalability: The frames are encoded with different pixel resolutions (e.g., for a standard TV set and for an HDTV set). Both encodings are transmitted in the same data stream.
Scalability in MPEG-2 (2)
• Temporal scalability: The base layer contains only very few frames per second; the enhancement layers contain additional frames. Receivers decoding the higher layers thus get a higher frame rate (i.e., a higher temporal resolution).
• Data partitioning: The data stream is decomposed into several streams with different amounts of redundancy for error correction. The most important parts of the stream are encoded in the base layer, e.g., the low-frequency coefficients of the DCT and the motion vectors. This layer can then be enriched with an error correcting code for better error resilience than the enhancement layers where errors are not as harmful.
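The layering idea behind SNR scalability can be illustrated with a generic two-layer quantizer: the base layer carries coarsely quantized coefficients, the enhancement layer carries a finer quantization of what the base layer missed. This is only a sketch of the principle; the step sizes, the sample values and the function names are assumptions, and the actual MPEG-2 bitstream syntax is not modelled.

```python
def encode_layers(coeffs, base_step, enh_step):
    """Split coefficients into a coarse base layer and a refinement layer."""
    base = [round(c / base_step) for c in coeffs]
    residual = [c - q * base_step for c, q in zip(coeffs, base)]
    enhancement = [round(r / enh_step) for r in residual]
    return base, enhancement


def decode_layers(base, enhancement, base_step, enh_step, use_enhancement):
    """Reconstruct from the base layer, optionally refined by the enhancement."""
    values = [q * base_step for q in base]
    if use_enhancement:
        values = [v + e * enh_step for v, e in zip(values, enhancement)]
    return values


def max_error(original, reconstructed):
    return max(abs(o - r) for o, r in zip(original, reconstructed))


coeffs = [803, -121, 46, 9, -3]          # made-up DCT coefficients
base, enh = encode_layers(coeffs, base_step=32, enh_step=4)

low_quality = decode_layers(base, enh, 32, 4, use_enhancement=False)
high_quality = decode_layers(base, enh, 32, 4, use_enhancement=True)

print("base only:     ", low_quality, "max error", max_error(coeffs, low_quality))
print("base + enhance:", high_quality, "max error", max_error(coeffs, high_quality))
```

A receiver with little bandwidth or processing power simply ignores the enhancement layer and still obtains a valid, if coarser, reconstruction.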
MPEG-2 Video Profiles
Profile | Simple | Main | SNR scalable | Spatially scalable | High
Features | no B frames, not scalable | B frames, not scalable | B frames, SNR scaling | B frames, spatial scaling | B frames, spatial or SNR scaling
High level (1920x1152x60) | - | <=80 Mbit/s | - | - | <=100 Mbit/s
High-1440 level (1440x1152x60) | - | <=60 Mbit/s | - | <=60 Mbit/s | <=80 Mbit/s
Main level (720x576x30) | <=15 Mbit/s | <=15 Mbit/s | <=15 Mbit/s | - | <=20 Mbit/s
Low level (352x288x30) | - | <=4 Mbit/s | <=4 Mbit/s | - | -
MPEG-4/ASP (1)
Originally, ISO and ITU-T had planned a standard MPEG-3 for HDTV at very high bit rates. This work was later integrated into MPEG-2. This explains why there is no MPEG-3 standard.
The video part of MPEG-4 is known as MPEG-4 Part 2; it is often called MPEG-4/ASP after its Advanced Simple Profile. It was originally planned for video at very low bit rates (e.g., for wireless PDAs). Later the ISO committee decided to concentrate on an entirely new technology, namely encoding a scene as a set of objects that are overlaid to form an image. The encoding technique can be chosen separately for each object. This object-oriented encoding also opens up much richer possibilities for processing on the receiver side.
MPEG-4/ASP (2)
History of MPEG-4
• Competitive tests for video functionalities in November 1995
• Development of the standards from 1996 to 1998
• Stable final committee draft for the video part in March 1998
• The standard was fixed at the end of 1999.
Features of MPEG-4
• A scene is constructed of multiple independent objects.
• Objects are merged into a scene on the decoder side.
• A combination of different object types and different coding methods is possible.
• An improved coding efficiency with bit rates between 5 kbit/s and 50 Mbit/s is provided.
• The error-resilient video coding is optimized for mobile and packet networks.
MPEG-4/ASP (3)
Separate encoding of background and foreground. The background is static.
[Figure: the input video passes through object formation; Object 1 and Object 2 are coded separately and combined by a MUX into one bitstream.]
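On the receiving side the separately decoded objects have to be merged into one image again. The following sketch shows the principle with a static background and foreground objects that carry a binary shape mask. The dictionary layout (`pos`, `pixels`, `mask`) is a hypothetical representation chosen for this example, not a structure defined by MPEG-4.

```python
def compose(background, objects):
    """Overlay decoded objects onto a copy of the background, in list order."""
    scene = [row[:] for row in background]
    for obj in objects:
        ox, oy = obj["pos"]
        for j, (pixel_row, mask_row) in enumerate(zip(obj["pixels"], obj["mask"])):
            for i, (pixel, inside) in enumerate(zip(pixel_row, mask_row)):
                if inside:                      # only pixels inside the shape
                    scene[oy + j][ox + i] = pixel
    return scene


background = [[0] * 4 for _ in range(4)]        # static background, sent once
foreground = {
    "pos": (1, 1),                              # top-left corner in the scene
    "pixels": [[9, 9], [9, 9]],
    "mask": [[1, 0], [1, 1]],                   # arbitrary, non-rectangular shape
}

scene = compose(background, [foreground])
for row in scene:
    print(row)
```

Because the background is static, only the foreground object has to be updated from frame to frame, and the receiver is free to move, replace or drop objects before composing the scene.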
MPEG-4/ASP (4)
Decoding of an MPEG-4 system stream
[Figure: decoding of an MPEG-4 system stream. Data from the network layer is demultiplexed into elementary streams; decompression yields primitive AV objects, the scene description (script or classes) and composition information, which feed composition and rendering. Upstream data (user events, class requests, ...) flows back to the network layer.]
MPEG-4/ASP (5)
Object hierarchy for our decoding example
scene
  person
    voice
    sprite
  2D background
  furniture
    globe
    desk
  audiovisual presentation
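Such an object hierarchy maps naturally onto a tree. The sketch below builds the example scene and lists its leaf objects, i.e. the primitive objects that actually have to be decoded. The `Node` class and the grouping of the leaves under "person" and "furniture" are assumptions made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def leaves(node):
    """Return the names of all primitive (childless) objects below node."""
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in leaves(child)]


scene = Node("scene", [
    Node("person", [Node("voice"), Node("sprite")]),
    Node("2D background"),
    Node("furniture", [Node("globe"), Node("desk")]),
    Node("audiovisual presentation"),
])

print(leaves(scene))
```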
MPEG-4/ASP (6)
Scalability by "layered encoding" in MPEG-4
MPEG-4/ASP (7)
The MPEG-4/ASP standard describes how to encode the objects. The automatic segmentation of the objects (i.e., how to identify single objects) is not specified.
Object segmentation techniques:
• Blue screening
• Automatic segmentation based on color or motion
• Semi-automatic segmentation: A user selects an object, the tracking is then done automatically.
MPEG-4/ASP (8)
Features of MPEG-4/ASP
Similarity to previous standards like MPEG-1 or MPEG-2:
• based on the DCT
• quantization is similar to MPEG-2
• supports B-frames.
Improvements:
• additional coding of objects which can overlay an image
• global motion compensation
• ¼ pixel precision for motion compensation
• better image quality for low-bandwidth encoding.
Critical:
• computationally intensive, in particular on the encoding side
MPEG-4/ASP (9): Resolutions and Bitrates
[Figure: chart of resolution over bit rate showing the operating ranges of MPEG-1 (Constrained Parameter Bitstream, CPB), MPEG-2 (Low Level, Main Level, High Level) and MPEG-4 Part 2 (Simple, Core, Main Profile, Studio Profile).]
Resolution axis: QCIF (176x144), CIF (352x288), Standard TV (720x576), 720p (HD Ready, 1280x720), 1080p (Full HD, 1920x1080), 2160p (Quad HDTV, 3840x2160).
Bit rate axis: 64 kbit/s, 384 kbit/s, 1.5 Mbit/s, 2 Mbit/s, 4 Mbit/s, 15 Mbit/s, 38 Mbit/s, 60 Mbit/s, 90 Mbit/s, 180 Mbit/s.
2.3.2 ITU Recommendation H.261
Also known as "p x 64 kbit/s"
• A video coding technique for video data at p x 64 kbit/s.
• Originally developed for ISDN
• Parameter p in [1,30]
• p small implies low image quality at low data rates. An example is video telephony with p=1 or p=2.
• p larger implies better video quality at higher data rates. Typical is p=6 for company video conferencing over six parallel ISDN B-channels.
• Intraframe coding: based on the DCT. Very similar to JPEG but there is only one quantization factor for all values of the block (no quantization table).
• Interframe coding: very similar to the P frames in MPEG-1.
• There are no B frames in H.261.
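The two numerical ideas in this list, the data rate of p x 64 kbit/s and the single quantization factor per block, can be sketched as follows. The DCT is written in its direct, unoptimized form, and the quantizer is a plain uniform rounding quantizer; the step size of 16 and the function names are assumptions, and details of the real H.261 quantizer are not modelled.

```python
import math

N = 8  # block size used by the DCT


def h261_bitrate_kbit(p):
    """Data rate of an H.261 stream in kbit/s for parameter p in [1, 30]."""
    if not 1 <= p <= 30:
        raise ValueError("p must be in [1, 30]")
    return p * 64


def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct form)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def quantize(coeffs, step):
    """Quantize every coefficient with the same step size (no table)."""
    return [[round(c / step) for c in row] for row in coeffs]


flat_block = [[100] * N for _ in range(N)]      # uniform grey block
levels = quantize(dct2(flat_block), step=16)

print("p=6:", h261_bitrate_kbit(6), "kbit/s")
print("DC level:", levels[0][0])
```

For a uniform block only the DC coefficient in the upper left-hand corner is non-zero after quantization; all 63 AC coefficients become zero and compress very well in the subsequent run-length and entropy coding.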
Important Parameters of H.261
CIF = Common Intermediate Format
Hierarchy of the elements of the data stream:
Structure element | Description
picture | a full frame
group of blocks | 33 macro blocks
macro block | 16 x 16 Y, 8 x 8 Cb, Cr
block | 8 x 8 pixels (unit for the DCT)
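The hierarchy can be turned into simple arithmetic. The sketch below counts the elements of one picture, assuming the CIF size of 352x288 luminance pixels and the QCIF size of 176x144. A macro block contains four 8x8 luminance blocks (16x16 Y) plus one 8x8 block each for Cb and Cr, i.e. six blocks. The function name is chosen for this example.

```python
def picture_structure(width, height):
    """Number of macro blocks, groups of blocks and 8x8 blocks per picture."""
    macro_blocks = (width // 16) * (height // 16)
    return {
        "macro_blocks": macro_blocks,
        "groups_of_blocks": macro_blocks // 33,   # 33 macro blocks per group
        "blocks": macro_blocks * 6,               # 4 Y + 1 Cb + 1 Cr
    }


print("CIF: ", picture_structure(352, 288))
print("QCIF:", picture_structure(176, 144))
```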
The H.261 Encoder
[Figure: block diagram of the H.261 encoder. Blocks: DCT, quantizer, entropy encoder, inverse quantizer, IDCT, frame memory (twice), motion estimation, motion compensation, loop filter, mux (twice). Labels: INTRA, INTER, 0, predictive frame.]
Status of H.261
H.261 was very widely used in practice; many products from many manufacturers were available on the market. It replaced earlier proprietary standards for video telephony and was in turn replaced by newer standards such as H.263 and H.264.
Pure software implementations are available, as well as stand-alone hardware solutions ("black boxes") and combined solutions, mainly for the PC.
H.263
H.263 is the successor standard of H.261 at ITU-T, incorporating much of the experience gained with MPEG-1.
Some differences between H.263 and H.261 are:
• There are five image sizes instead of two.
• There is a bi-directional interpolation where exactly one B frame follows each P frame.
• There are negotiable options that allow the algorithm to be tailored to specific applications. For example, arithmetic coding can be chosen instead of run-length/Huffman coding in the entropy encoding step.
H.264, MPEG-4/AVC (1)
H.264 is also known as MPEG-4 Part 10, MPEG-4/AVC (Advanced Video Coding), or ISO/IEC 14496-10.
• Developed by the Joint Video Team (JVT): joint work by experts of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG)
• First version of the standard completed in May 2003.
Goals
• Good video quality at significantly lower bit rates than previous standards (MPEG-2, H.263, MPEG-4 Part 2/ASP)
• The coding complexity should not be much higher in comparison with previous standards.
• High flexibility: support of very low/high bit rates, very low/high video resolutions (HDTV), DVD storage, ITU-T multimedia telephony.
H.264, MPEG-4/AVC (2)
Comparison of H.264 with older standards
• Transform
  Previous standards: discrete cosine transform.
  H.264 / MPEG-4/AVC: integer transformation, i.e. an integer approximation of the DCT with 16-bit integer arithmetic precision, based on addition, subtraction and binary shift operations.
  Effect: much faster, easier to implement in hardware.
• Entropy coding
  Previous standards: run-length coding, VLC codes (variable length coding, similar to Huffman).
  H.264 / MPEG-4/AVC: supports both VLC codes and arithmetic coding.
  Effect: more efficient, but requires considerably more processing to decode.
• Block size
  Previous standards: 8x8 pixels.
  H.264 / MPEG-4/AVC: variable block sizes, 4x4 to 16x16 pixels.
  Effect: reduction of blocking artifacts, better segmentation of moving objects, significant improvement of the visual quality.
• Motion vectors
  Previous standards: one motion vector for each macro block (16x16 pixels).
  H.264 / MPEG-4/AVC: a different motion vector can be used for each sub-block.
  Effect: better compression/quality in case of complex motion.
• Precision for motion compensation
  Previous standards: MPEG-2: ½ pixel precision; H.263, MPEG-4/ASP: ¼ pixel precision (optional).
  H.264 / MPEG-4/AVC: always ¼ pixel precision.
  Effect: more precise description of moving objects.
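The integer transformation can be illustrated with the 4x4 core transform commonly given for H.264. The sketch computes it twice: once with additions, subtractions and shifts only, and once as the matrix product CF * X * CF^T for comparison. The scaling and quantization steps that follow the transform in a real encoder are omitted, and the sample block is made up.

```python
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]


def transform_1d(a, b, c, d):
    """One row of the transform using only add, subtract and shift."""
    s0, s1 = a + d, b + c
    d0, d1 = a - d, b - c
    return [s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1)]


def forward_4x4(block):
    """Apply the 1-D transform to all rows, then to all columns."""
    rows = [transform_1d(*row) for row in block]
    cols = [transform_1d(*col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]      # transpose back


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_4x4_matrix(block):
    """Reference implementation: CF * X * CF^T with plain multiplications."""
    cf_transposed = [list(row) for row in zip(*CF)]
    return matmul(matmul(CF, block), cf_transposed)


block = [[5, 11, 8, 10],
         [9, 8, 4, 12],
         [1, 10, 11, 4],
         [19, 6, 15, 7]]

assert forward_4x4(block) == forward_4x4_matrix(block)
print(forward_4x4(block))
```

Since all operations are exact integer operations, encoder and decoder cannot drift apart through rounding differences, which is one reason why the approach suits hardware implementations.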
H.264, MPEG-4/AVC (3)
Comparison of H.264 with older standards (cont.)
The variable block size and the deblocking filter improve the visual quality most.
• Intra prediction
  Previous standards: prediction of the DC component in intra-frames.
  H.264 / MPEG-4/AVC: pixel values are predicted based on already decoded pixels in neighbouring blocks. Only the differences are encoded.
  Effect: reduces the size of I-frames.
• Reference frames
  Previous standards: a P-frame refers to the last I- or P-frame; a B-frame refers to two I- or P-frames.
  H.264 / MPEG-4/AVC: P- or B-frames can refer to up to five different frames at the same time.
  Effect: better compression in case of periodic changes in an image.
• Weighted prediction
  H.264 / MPEG-4/AVC: separate weights can be defined for the referenced blocks.
  Effect: better encoding of special effects such as fades or dissolves.
• Deblocking
  Previous standards: no deblocking filter.
  H.264 / MPEG-4/AVC: deblocking is mandatory; P- and B-frames refer to deblocked images.
  Effect: significant improvement of the visual quality.
• Sample bit depth
  Previous standards: MPEG-2: 8 bits/sample; MPEG-4: up to 12 bits/sample.
  H.264 / MPEG-4/AVC: 8 bits/pixel (Baseline Profile, Main Profile, High Profile), up to 14 bits/pixel (High 4:4:4 Predictive Profile).
  Effect: better quality of high-contrast videos (medical, surveillance).
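Why weights on the referenced blocks help with fades can be seen in a small numerical sketch. The model used here (prediction = weight * reference + offset, applied to a flat list of samples) and the sample values are simplifying assumptions; the actual H.264 weighted prediction syntax is not reproduced.

```python
def predict(reference, weight=1.0, offset=0.0):
    """Predict a block from a reference block with an optional weight/offset."""
    return [weight * sample + offset for sample in reference]


def residual_energy(current, prediction):
    """Sum of squared differences that would remain to be encoded."""
    return sum((c - p) ** 2 for c, p in zip(current, prediction))


reference = [200, 180, 160, 140]    # block in the reference frame
current = [100, 90, 80, 70]         # same block after fading to half brightness

plain = residual_energy(current, predict(reference))
weighted = residual_energy(current, predict(reference, weight=0.5))

print("residual energy without weighting:", plain)
print("residual energy with weight 0.5:  ", weighted)
```

Without weighting, every sample of the faded block differs from its reference and a large residual has to be coded even though nothing has moved; with a suitable weight the prediction matches and the residual vanishes.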
H.264, MPEG-4/AVC (4)
The acceptance of H.264 is very high due to its high coding efficiency. Many applications have been developed based on H.264:
• HDTV: used for HD DVD, Blu-ray Disc and high-definition TV (DVB-S2)
• Portable video: DVB-H and DMB (Digital Multimedia Broadcasting)
• Codecs are available for PCs (e.g., QuickTime V.7), video conferencing systems and camcorders.