low bit rate h.263 + video coding: efficiency, scalability and error resilience faouzi kossentini...
TRANSCRIPT
Low Bit Rate H.263 + Video Low Bit Rate H.263 + Video Coding: Efficiency, Scalability and Coding: Efficiency, Scalability and
Error ResilienceError Resilience
Faouzi Kossentini
Signal Processing and Multimedia GroupDepartment of Electrical and Computer Engineering
University of British Columbia
http://spmg.ece.ubc.ca/
OutlineOutline Introduction: Low bit rate video coding The H.263/H.263+ standards and their optional modes Efficiency: Performance and complexity of individual modes
and combinations of modes Efficiency: Real-time software-only encoding Efficiency: Rate-distortion optimized video coding Scalability: Description & Characteristics Scalability: Rate-distortion optimized framework Error resilience: Synchronization & error concealment Error resilience: Multiple description video coding Conclusions: H.263/H.263+, MPEG-4 and research directions
Low Bit Rate H.263 + Video Coding: Efficiency, Low Bit Rate H.263 + Video Coding: Efficiency, Scalability and Error ResilienceScalability and Error Resilience
2
Copyright (C) 1998Faouzi Kossentini
3
H.263 version 2 (H.263+): H.263 version 2 (H.263+): Higher coding efficiency, more flexibility,
scalability support, error resilience support
Video coding standards:Video coding standards: ITU-T H.261(1990), H.262 (1994), H.263 (1995), H.263+ (1998) ISO/IEC MPEG1 (1992), MPEG2 (1994), MPEG4 (1999)
Low Bit Rate Video codingLow Bit Rate Video coding
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
Video coding algorithms:Video coding algorithms: Waveform based coding: MC+DCT/wavelets, 3D subband, etc. Object- and model-based coding : shape coding, wireframes, etc.
Why?: Why?: Increasing demand for video conferencing and telephony applications, limited bandwidth in PSTN and wireless networks
Copyright (C) 1998Faouzi Kossentini
ME
IQ
MUXVLC
0
Bit StreamVideo in
Inter/Intra
QDCT
IDCT
PRED
The H.263 StandardThe H.263 Standard
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
4
Copyright (C) 1998Faouzi Kossentini
5
The H.263 StandardThe H.263 StandardStructure
Inter picture prediction to reduce temporal redundancies DCT coding to reduce spatial redundancies in difference frames
I P P P
Major enhancements to H.261More standardized input picture formatsHalf pixel motion compensationOptimized VLC tablesBetter motion vector predictionFour (optional) negotiable modes
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
Copyright (C) 1998Faouzi Kossentini
6
Unrestricted Motion Vector (UMV) mode, Annex D: Motion vectors (MVs) allowed to point outside the picture, MV range extended to ±31.5
Syntax-based Arithmetic Coding (SAC) mode, Annex E : SAC used instead of VLCs
Advanced Prediction (AP) mode, Annex F: Four MVs per macroblock, over-lapped motion compensation, MVs allowed to point outside the picture area
PB-frames (PB) mode, Annex G: A Frame with a P-picture and a B- picture used as a unit
I B P B P
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
The H.263 Standard: Optional ModesThe H.263 Standard: Optional Modes
Copyright (C) 1998Faouzi Kossentini
The H.263+ Standard: Optional ModesThe H.263+ Standard: Optional Modes
Unrestricted Motion Vector (UMV) mode, Annex D: MV range extended to ±256 depending on the picture size, reversible VLCs used for MVs
Advanced Intra Coding (AIC) mode, Annex I: Inter block prediction from neighboring intra coded blocks, modified quantization, optimized VLCs
Deblocking Filter (DF) mode, Annex J: Deblocking filter inside the coding loop, four motion vectors per macroblock, MVs outside picture boundaries
Slice Structured (SS) mode, Annex K: Slices used instead of GOBs
Supplemental Enhancement Information (SEI) mode, Annex L: Supplemental information included in the stream to offer display capabilities
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
7
Copyright (C) 1998Faouzi Kossentini
Improved PB-frames (IBP) mode, Annex M: Forward, backward and bi-directional prediction supported, delta vector not transmitted
Reference Picture Selection (RPS) mode, Annex N: Reference picture selected for prediction to suppress temporal error propagation
Temporal, SNR, and Spatial Scalability mode, Annex O: Syntax to support temporal, SNR, and spatial scalability
Reference Picture Resampling (RPR) mode, Annex P: Warping of the reference picture prior to its use for prediction
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
8
The H.263+ Standard: Optional ModesThe H.263+ Standard: Optional Modes
Copyright (C) 1998Faouzi Kossentini
The H.263+ Standard: Optional ModesThe H.263+ Standard: Optional Modes
Reduced Resolution Update (RRU) mode, Annex Q: Encoder allowed to send update information for a picture encoded at a lower resolution, while still maintaining a higher resolution for the reference picture
Independently Segmented Decoding (ISD) mode, Annex R: Dependencies across the segment boundaries not allowed
Alternative Inter VLC (AIV) mode, Annex S: The intra VLC table designed for encoding quantized intra DCT coefficients in the AIC mode used for inter coding
Modified Quantization (MQ) mode, Annex T: Quantizer allowed to change at the macroblock layer, finer chrominance quantization employed, range of representable quantized DCT coefficients extended to [-127, +127]
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
9
Copyright (C) 1998Faouzi Kossentini
Performance & Complexity: Compression efficiency for intra macroblocks increased, encoding time increased by ~5%, required memory slightly increased
Advanced Intra Coding ModeFirst Intra frame of Akiyo
29
31
33
35
37
4000 6000 8000 10000 12000 14000Y bits
Y P
SN
RAdvanced Intra CodingNo option
DCDC
DC
Block AboveVertical Prediction
Block to the LeftHorizontalPrediction
CurrentBlock
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
10
Efficiency: Performance & Complexity Efficiency: Performance & Complexity of Individual Modes of Individual Modes
Advanced Intra Coding (AIC) mode, Annex I: Inter block prediction fromneighboring intra coded blocks, modified quantization, optimized VLCs
Copyright (C) 1998Faouzi Kossentini
(a) (b)Without (a) and with (b) DF mode set on
Performance: Subjective quality improved (blocking & mosquito artifacts removed), 5-10% additional encoding time
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
11
Efficiency: Performance & Complexity Efficiency: Performance & Complexity of Individual Modes of Individual Modes
Deblocking Filter (DF) mode, Annex J: Deblocking filter inside the coding loop, four motion vectors per macroblock, MVs outside picture boundaries
Copyright (C) 1998Faouzi Kossentini
Performance: Picture rate doubled, complexity increased, more memory required, one frame delay incurred
Improved PB Frames mode: Y PSNR vs Rate for FOREMAN
26
28
30
32
34
36
20 40 60 80 100 120 140Rate (kbps)
Y P
SN
R
P picture, w/o mode
B picture, PB mode
P picture, PB mode
B picture, IPB mode
P picture, IPB mode
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
12
Efficiency: Performance & Complexity Efficiency: Performance & Complexity of Individual Modes of Individual Modes
Improved PB-frames (IBP) mode, Annex M: Forward, backward and bi-directional prediction supported, delta vector not transmitted
Copyright (C) 1998Faouzi Kossentini
Performance: Useful at high bit rates, bit savings of as much as 10%
achieved, less than 2% additional encoding/decoding time required
Sequence QuantizerStep Size
Bit savingsusing AIV mode
AKIYO 4 5%12 0%
FOREMAN 4 7%8 4%16 1%
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
13
Efficiency: Performance & Complexity Efficiency: Performance & Complexity of Individual Modes of Individual Modes
Alternative Inter VLC (AIV) mode, Annex S: The intra VLC table designed for encoding quantized intra DCT coefficients in the AIC mode used for inter coding
Copyright (C) 1998Faouzi Kossentini
Performance: Chrominance PSNR increased substantially at low bit rates, more flexible rate control supported, very little complexity and computation time added to the coder
MQ mode : Chrominance PSNR vs Rate
35
36
37
38
39
10 30 50 70 90 110Rate (kbps)
Ch
rom
inan
ce
PS
NR
w/o modeMQ mode
MQ mode : Average PSNR vs Rate
29
31
33
35
10 30 50 70 90 110Rate (kbps)
Ave
rage
P
SN
R
w/o modeMQ mode
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
14
Efficiency: Performance & Complexity Efficiency: Performance & Complexity of Individual Modes of Individual Modes
Modified Quantization (MQ) mode, Annex T: Quantizer allowed tochange at the macroblock layer, finer chrominance quantization employed,range of representable quantized DCT coefficients extended to [-127, +127]
Copyright (C) 1998Faouzi Kossentini
H.263 PSNR = 32.44 H.263+ PSNR = 33.29
H.263+ modes: AIC, MQ, UMV, DF, AP
H.261 PSNR = 31.91
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
15
Efficiency: Performance & Complexity Efficiency: Performance & Complexity of Combinations of Modes of Combinations of Modes
Copyright (C) 1998Faouzi Kossentini
FOREMAN (fps) AKIYO (fps)
H.263+ Baseline coding (Full ME) 1.8 2.4
H.263+ Baseline coding (Fast ME) 4.3 5.8
H.263+: MQ, DF, AIC, AIV, IPB,UMV, AP modes turned on (Fast ME)
3 3.8
Required for real-time video telephony and conferencing:
encoding at a minimum of 8-10 fps for QCIF resolution
Efficiency: Real-time Software-only Encoding Efficiency: Real-time Software-only Encoding
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
16
Copyright (C) 1998Faouzi Kossentini
Algorithmic approach: Developing efficient platform-
independent video coding algorithms
Hardware dependent approach: Mapping SIMD oriented
coding components onto Intel’s MMX architecture
Function Computational Load
Reference IDCT 25%
Integer Pixel ME 12%
Fast DCT 11%
Half Pixel ME 10%
Quantization 9%
Efficiency: Real-time Software-only Encoding Efficiency: Real-time Software-only Encoding
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
Copyright (C) 1998Faouzi Kossentini
17
18
Efficiency: Real-time Software-only Encoding Efficiency: Real-time Software-only Encoding
Copyright (C) 1998Faouzi Kossentini
Efficient Video Coding Algorithms:Efficient Video Coding Algorithms:
IDCT for blocks with all or most of the coefficients equal to zero avoided or efficiently computed (respectively)
SAD for most macroblocks computed efficiently via prediction
DCT & Quantization for most blocks avoided via prediction
Linear approximation methods used for half-pixel MEQuantization: look-up tables employed
Some Image and Video Processing Research Projects @ Some Image and Video Processing Research Projects @ the UBC Signal Processing and Multimedia Laboratorythe UBC Signal Processing and Multimedia Laboratory
Intel’s MMX Architecture:Intel’s MMX Architecture:
Features :
SIMD structure
Four new data types
57 new instructions
Saturation arithmetic
Packed byte (Eight 8-bit elements) 63 56 55 48 47 40 39 32 31 24 23 16 15 8 7 0
Packed word (Four 16-bit elements) 63 48 47 32 31 16 15 0
Packed doubleword (Two 32-bit elements) 63 32 31 0
Quadword (64-bit element) 63 0
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
19
Efficiency: Real-time Software-only Encoding Efficiency: Real-time Software-only Encoding
Copyright (C) 1998Faouzi Kossentini
a1a2a3a4a5a6a7a8
- (psubusb)
b1b2b3b4b5b6b7b8
b1b2b3b4b5b6b7b8 a1a2a3a4a5a6a7a8
- (psubusb)
= =y1000y50000x2x3x40x6x7x8
por
y1x2x3x4y5x6x7x8
MMX Mapping of SAD Computations:MMX Mapping of SAD Computations:
Saturation arithmeticis used to performabsolute differenceoperations
Result: 4-5 times faster SAD computations
SAD computation function implemented in MMX for 16x16 blocks (baseline coding) and 8x8 blocks (optional modes)
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
20
Efficiency: Real-time Software-only Encoding Efficiency: Real-time Software-only Encoding
Copyright (C) 1998Faouzi Kossentini
DCT MMX Implementation:DCT MMX Implementation:
Floating point arithmetic to fixed point arithmeticFour rows processed at a timeResults: 4 times faster using fixed point arithmetic, 3 times faster using MMX as compared to the integer “C” implementation Foreman
48 Kb/sec
Akiyo
8 Kb/sec
Foreman
64 Kb/sec
Akiyo
28 Kb/sec
PSNR with floating point DCT 30.47 32.63 31.56 37.92
Variation in PSNR with
integer or MMX DCT
-0.01 +0.02 -0.01 0.00
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
21
Efficiency: Real-time Software-only Encoding Efficiency: Real-time Software-only Encoding
Other MMX Implementations:Other MMX Implementations: Interleaved transfers for half pixel
ME (7x), interp. (3x), motion comp. (2x), block data transfers (2x)
Copyright (C) 1998Faouzi Kossentini
Overall Performance:Overall Performance: Function Computational Load
DCT 9%
IDCT 9%
Quantization 8%
Load ME Area 7%
Scan 6%
Interpolation 5.5%
Fast ME 5%
Reconstruct Full Image 5%
Half Pixel ME 4%
Motion Compensation 4%
encoded frame ratebefore optimization
encoded frame rateafter optimization
FOREMAN AKIYO FOREMAN AKIYO
H.263+ Baseline coding (Full ME) 2.0 2.4 4.1 4.4
H.263+ Baseline coding (Fast ME) 6.4 7.5 14.3 17.0
14-17 fps baseline H.263+ encoding with fast ME
8-10 fps with all of the H.263+ optional modes
More than a 100% increase in speed using the efficient algorithms and more than a 25% additional increase in speed using MMX
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
22
Copyright (C) 1998Faouzi Kossentini
Motivation:Best possible tradeoffs between quality (distortion) and bit rate
obtained via Rate-Distortion (RD) optimization, through the use of the Lagrangian cost measure
Solution:Motion vector selected that yields the best rate-quality tradeoff
Coding mode (Intra, Inter16, Inter8, Skipped) selected that yields the best rate-quality tradeoff based on the already determined motion vectors
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
23
Efficiency: Rate-Distortion Optimized Video Coding Efficiency: Rate-Distortion Optimized Video Coding
Copyright (C) 1998Faouzi Kossentini
Result:10 - 30% savings in bit rate obtained using the RD optimized coder
for most sequences over our reference software
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
24
Comparison of RD Performance RD Optimized Coder vs TMN-3.1.2
Foreman CIF
28.00
30.00
32.00
34.00
36.00
38.00
0.00 100.00 200.00 300.00 400.00 500.00 600.00 700.00 800.00 900.00 1000.00
Rate
Y P
SN
R RD Coder
TMN-3.1.2 Coder
Efficiency: Rate-Distortion Optimized Video Coding Efficiency: Rate-Distortion Optimized Video Coding
Copyright (C) 1998Faouzi Kossentini
Description:Multi-representation coding at different quality and resolution
levels allowed
Three general types (SNR, spatial, temporal) of scalability supported
Characteristics:Each layer built incrementally on its previous layer, increasing
the decoded picture quality, resolution, or both
Only the first (base) layer independently decoded
Information from temporally previous pictures in the same layer or temporally simultaneous pictures in a lower layer used
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
25
Scalability: Description & CharacteristicsScalability: Description & Characteristics
Copyright (C) 1998Faouzi Kossentini
Example:Enhancement layer 1: Example of SNR scalability
Enhancement layer 2: Example of spatial and temporal scalability
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
26
EI EP
EnhancementLayer 2
EnhancementLayer 1
EI B
I PBaseLayer
EP
EI
P
EP
EP
P
Scalability: Description & CharacteristicsScalability: Description & Characteristics
Copyright (C) 1998Faouzi Kossentini
Motivation:Compression efficiency of a higher layer dependent on that of all
preceding (lower) layers
Higher compression efficiency and more flexible joint rate-distortion control achieved via an RD optimized framework
Current research direction:Employ (1) multiple rate constraints (explicit for each layer) OR (2) a
single rate constraint with explicit distortion constraints for each layer.
Apply RD optimization to temporal, spatial and SNR scalable coding algorithms
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
27
Scalability: RD Optimized FrameworkScalability: RD Optimized Framework
Copyright (C) 1998Faouzi Kossentini
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,
Scalability and Error ResilienceScalability and Error Resilience28
Error Resilience: Synchronization& Error Concealment
Use of Synchronization Markers:
Different synchronization markers (between GOBs, slices, etc.) used for higher error resilience
Error concealment:
Missing motion vectors for a macroblock received in error recovered from neighboring macroblocks (via spatial methods)
Texture information from the previous frame is compensated using the recovered motion vectors
Copyright (C) 1998Faouzi Kossentini
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,
Scalability and Error ResilienceScalability and Error Resilience29
Experimental Results:Packetization: one packet consists of a complete GOB.Using synchronization markers and error concealment yields PSNR gains of as much as 9 dB.
Error Concealment and Packetization Performance for Foreman
1214161820222426283032
0 5 10 15 20 25
Packet Loss Rate (PLR)
Y-P
SN
R
Concealment
No Concealment
Error Resilience: Synchronization& Error Concealment
Copyright (C) 1998Faouzi Kossentini
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,
Scalability and Error ResilienceScalability and Error Resilience30
Use of Reference Picture Selection Mode: Use of different pictures (or picture segments) for prediction allowed by H.263+, hence multiple description video coding
Multiple Description:Different descriptions of the source sent on different channels, may be available at the decoderRedundancy among descriptions minimizedUseful even if only one description is available
Temporal Reference for Prediction:Different temporal reference pictures employed for prediction of different picturesError propagation confined to those pictures dependent on the missing reference
Error Resilience: Multiple Description Video Coding
Copyright (C) 1998Faouzi Kossentini
H.263+ versus H.263:H.263+ backward compatible with the H.263 (same baseline)
Efficient ways provided by H.263+ of trading additional complexity for more compression efficiency, scalability, resilience and flexibility
H.263/H.263+ versus MPEG-4:H.263 output decodable by MPEG-4, H.263+/MPEG-4 sharing tools
Although somewhat available via chroma keying in H.263+, object-based functionality better provided by MPEG-4
H.263/H.263+ public-domain implementation: http://spmg.ece.ubc.ca/
Future directions: Still higher coding efficiency and better scalability/resilience possible
More object-based tools and capabilities, better (efficient) shape coding
Low Bit Rate H.263 + Video Coding: Efficiency,Low Bit Rate H.263 + Video Coding: Efficiency,Scalability and Error ResilienceScalability and Error Resilience
Copyright (C) 1998 Faouzi Kossentini
31
Conclusions