Multimedia Communications
Lecture 10: Video Standards
Part I. Videophone and video conferencing: H.261/H.263

Dr. Tian-Sheuan Chang [email protected]
Dept. Electronics Engineering, National Chiao Tung University
Adapted from Prof. Hang’s slides


Post on 01-Oct-2021


Introduction to Video Standard


Behind the Scene

• Why can we do compression?
  – Observation
    • Significant amount of statistical and subjective redundancy within and between frames
  – Statistical redundancy
    • Lossless compression
    • e.g. 000000000000000… -> run-length coding, arithmetic coding, Huffman coding
  – Subjective redundancy
    • Lossy compression
    • Exploit characteristics of the Human Visual System
      – Not sensitive to high-frequency components
    • Spatial redundancy
      – DCT transform, quantized high-frequency components
    • Temporal redundancy
      – Motion estimation
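The run of zeros mentioned under statistical redundancy can be illustrated with a minimal run-length coding sketch. The function names are hypothetical, and this is only the simplest lossless scheme of the three listed, not the entropy coder used by any of the standards.

```python
from itertools import groupby


def run_length_encode(symbols):
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    return [(sym, len(list(run))) for sym, run in groupby(symbols)]


def run_length_decode(pairs):
    """Expand (symbol, count) pairs back into the original string."""
    return "".join(sym * count for sym, count in pairs)


source = "0" * 15 + "111"
encoded = run_length_encode(source)
print(encoded)                              # [('0', 15), ('1', 3)]
print(run_length_decode(encoded) == source)  # True
```

Decoding recovers the input exactly, which is what makes the removal of statistical redundancy lossless.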


The Scope of Picture and Video Coding Standardization

• Only the Syntax and Decoder are standardized:
  – Permits optimization beyond the obvious
  – Permits complexity reduction for implementability
  – Provides no guarantees of quality

[Figure: processing chain Source → Pre-Processing → Encoding → Decoding → Post-Processing & Error Recovery → Destination, with a box marking the "Scope of Standard"]


Development of Coding Tools and Standards

Coding tools:
• Entropy Coding: 1949-1976
• DPCM: 1952-1980
• Transform Coding: 1965-1980
• Motion Compensated Prediction: 1972-1989

Standards:
• H.261: 1984-1990
• JPEG: 1984-1992
• MPEG1: 1988-1992
• MPEG2: 1991-1994
• MPEG4, H.263: shown on the timeline without dates

[Figure: timeline spanning the 1950s to the 1990s]


ITU/MPEG Standards

• H.261
  – ITU H.261
  – Optimized for CIF@384 kbps, focus on videophone over ISDN
  – First design (late ‘90) embodying the typical structure that dominates today
    • 16x16 macroblock motion compensation, 8x8 DCT, scalar quantization, and variable-length coding
• MPEG-1
  – ISO/IEC 11172
  – 1993 IS, design focus on VHS quality (352x240)@1.5 Mbps
• MPEG-2
  – ISO/IEC 13818
  – 1994 IS, optimized for “NTSC quality” CCIR 601 video@6-10 Mbps
• H.263
  – ITU H.263
  – Focus on videophone over phone lines/wireless
• MPEG-4
  – Officially ISO/IEC 14496
  – Part 2, Video: 2001 IS, content-based video coding, interactive video
  – Part 10, Advanced Video Coding (AVC), also ITU H.264
    • 2004 IS, 50% bit-rate reduction relative to other video standards


H.261


ITU-T Video Standard: H.261 - History

• CCITT Study Group (SG) XV: videophone and videoconferencing at bit rates from ~40 kb/s to 2 Mb/s
• Defines only the decoder; a reference encoder model was developed to test the decoder.
• History:
  – Dec. 1984: The specialists group established.
  – 1984~1988: Algorithm developed for n x 384 kb/s, n = 1, …, 5.
  – 1989: Modified for p x 64 kb/s, p = 1, …, 30.
  – Dec. 1990: Standard approved.


ITU-T Multimedia Communications Standards


H.324 Terminal (multimedia communication over PSTN)


H.261 Overall Codec System


Quick View of H.261

• ITU-T H.261: The basis of modern video compression
  – The first widespread practical success
  – First design (late ‘90) embodying the typical structure that dominates today
    • 16x16 macroblock motion compensation, 8x8 DCT, scalar quantization, and variable-length coding
  – Other key aspects
    • Loop filter, integer-pel motion compensation accuracy, 2-D VLC for coefficients
  – Operated at 64-2048 kbps
  – Still in use
    • although mostly as a backward-compatibility feature, overtaken by H.263


Picture Partition (1)

• Picture size: CIF, QCIF
• Macroblock (MB): contains six 8x8 blocks (motion compensation, quantizer adjustment, …)

[Figure: block order within an MB: Y blocks 1-4 in a 2x2 arrangement, Cb block 5, Cr block 6]

• Group of Blocks (GOB): contains 33 MBs (synchronization, quantizer adjustment, …)

[Figure: MB order within a GOB, three rows of 11]
   1  2  3  4  5  6  7  8  9 10 11
  12 13 14 15 16 17 18 19 20 21 22
  23 24 25 26 27 28 29 30 31 32 33


Picture Partition (2)

• Picture: contains 3 (QCIF) or 12 (CIF) GOBs (picture sync., time reference, …)
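The counts of 33 MBs per GOB and 3 or 12 GOBs per picture can be checked with a little arithmetic. The sketch below assumes the usual luminance sizes of 352x288 for CIF and 176x144 for QCIF, which the slides do not state; the function name is hypothetical.

```python
MB_SIZE = 16       # a macroblock covers 16x16 luminance pels
MBS_PER_GOB = 33   # from the slide: one GOB contains 33 MBs


def partition(width, height):
    """Return (MBs per row, MB rows, total MBs, GOBs) for a luminance picture size."""
    mbs_per_row = width // MB_SIZE
    mb_rows = height // MB_SIZE
    total_mbs = mbs_per_row * mb_rows
    return mbs_per_row, mb_rows, total_mbs, total_mbs // MBS_PER_GOB


print(partition(352, 288))  # CIF:  (22, 18, 396, 12)
print(partition(176, 144))  # QCIF: (11, 9, 99, 3)
```

A QCIF row holds exactly 11 MBs, so one GOB (3 rows of 11) spans the full picture width; in CIF each GOB covers half of the width.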


Syntax

• Picture Layer: Picture Start Code (PSC: 20 bits), Temporal Reference (TR: 5 bits), Picture Type (PTYPE), Picture Extra Insertion (PEI), Picture Spare (PSPARE)

• GOB Layer: Group Start Code (GBSC), Group Number (GN), Quantizer (GQUANT), Extra Insertion (GEI), ...


Syntax (cont.)

• Macroblock (MB) Layer: MB Address, differential (MBA); MB Type (MTYPE); Quantizer (MQUANT); Motion Vector Data, differential (MVD); Coded Block Pattern (CBP)


Syntax (cont.)

• Block Layer: (DCT) Transform Coefficients (TCOEFF), End of Block (EOB: ‘10’)


H.261 Frame Sequence


H.261 Frame Sequence

• Two types of image frames are defined: Intra-frames (I-frames) and Inter-frames (P-frames):
  – I-frames are treated as independent images. A transform coding method similar to JPEG is applied within each I-frame, hence “intra”.
  – P-frames are not independent: they are coded by a forward predictive coding method (prediction from a previous P-frame is allowed, not just from a previous I-frame).
  – Temporal redundancy removal is included in P-frame coding, whereas I-frame coding performs only spatial redundancy removal.
  – To avoid propagation of coding errors, an I-frame is usually sent a couple of times in each second of the video (intra refresh).
• Motion vectors in H.261 are always measured in units of full pixels and have a limited range of ±15 pixels, i.e., p = 15.


Intra-frame (I-frame) Coding

• Macroblocks are of size 16x16 pixels for the Y frame and 8x8 for the Cb and Cr frames, since 4:2:0 chroma subsampling is employed. A macroblock consists of four Y, one Cb, and one Cr 8x8 blocks.
• For each 8x8 block a DCT transform is applied; the DCT coefficients then go through quantization, zigzag scan, and entropy coding.
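The zigzag scan step can be sketched as follows. This is a minimal illustration of the conventional scan order (low frequencies first), with hypothetical function names; the DCT and quantization stages are left out.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for d in range(2 * n - 1):  # d = row + col identifies one anti-diagonal
        cells = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the other way.
        order.extend(reversed(cells) if d % 2 == 0 else cells)
    return order


def zigzag_scan(block):
    """Read a square block of coefficients into a 1-D list in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


print(zigzag_order()[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

Because quantization zeroes most high-frequency coefficients, this ordering tends to group the zeros into long runs at the end of the list, which suits the run-length based entropy coding that follows.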


P-Frame (Inter-frame) Coding


P-Frame (Inter-frame) Coding

• The P-frame coding encodes the difference macroblock (not the Target macroblock itself).
• Sometimes a good match cannot be found, i.e., the prediction error exceeds a certain acceptable level.
  – The MB itself is then encoded (treated as an Intra MB); in this case it is termed a non-motion-compensated MB.

• For the motion vector, the difference MVD is sent for entropy coding:
  – MVD = MV_Current − MV_Preceding
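How an encoder might find the best match and form the difference macroblock can be sketched with an exhaustive block-matching search. The standard does not prescribe a search method, so the SAD criterion, the full search, and all function names here are assumptions made for illustration; frames are plain lists of rows.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))


def full_search(cur, ref, bx, by, n=16, p=15):
    """Try every integer displacement in [-p, p] that stays inside `ref`."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside = (0 <= bx + dx and bx + dx + n <= width and
                      0 <= by + dy and by + dy + n <= height)
            if inside:
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if cost < best_cost:
                    best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def difference_block(cur, ref, bx, by, mv, n=16):
    """Prediction error that would be transform coded for an inter MB."""
    dx, dy = mv
    return [[cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i]
             for i in range(n)] for j in range(n)]


# Toy frames: the current frame is the reference shifted by (2, 1).
ref = [[7 * x + 13 * y for x in range(12)] for y in range(12)]
cur = [[ref[y + 1][x + 2] if y + 1 < 12 and x + 2 < 12 else 0
        for x in range(12)] for y in range(12)]
mv, cost = full_search(cur, ref, 4, 4, n=4, p=2)
print(mv, cost)  # (2, 1) 0
```

When the smallest cost is still above some encoder-chosen threshold, the encoder would fall back to coding the MB in intra mode, as the slide describes.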


H.261 Encoder (Nonstandard)

[Figure: encoder block diagram, with the loop filter labeled]


H.261 Decoder (Standard)


A Glance at Syntax of H.261 Video Bitstream


Parameter Selection and Rate Control

• MTYPE (intra vs. inter, zero vs. non-zero MV in inter, loop filter on/off)

• CBP (which blocks in a MB have non-zero DCT coefficients)

• MQUANT (allows changing the quantizer step size at the MB level)
  – should be varied to satisfy the rate constraint

• MV (ideally should be determined not only by prediction error but also the total bits used for coding MV and DCT coefficients of prediction error)


Quantization

• 8x8 DCT; zig-zag scan
• Uniform quantizer with a dead-zone:
  – Odd QUANT
    REC = QUANT•(2•level + 1); for level > 0
    REC = QUANT•(2•level – 1); for level < 0
  – Even QUANT
    REC = QUANT•(2•level + 1) – 1; for level > 0
    REC = QUANT•(2•level – 1) + 1; for level < 0
  – Either case
    REC = 0; for level = 0
• QUANT value: 1 – 31 (5 bits); may be changed for every MB and/or GOB
• Exception: Intra-block DC coeff.: step size = 8 (fixed) and no dead-zone
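The reconstruction rules above translate directly into code. The sketch below follows the four REC formulas as written on the slide; the function names are hypothetical, and clipping of the reconstructed value is left out.

```python
def reconstruct(level, quant):
    """Inverse-quantize one coefficient level (any coefficient except intra DC).

    `quant` is the QUANT value in 1..31.
    """
    if level == 0:
        return 0
    sign = 1 if level > 0 else -1
    rec = quant * (2 * level + sign)
    if quant % 2 == 0:
        rec -= sign  # even QUANT: pull the value one step toward zero
    return rec


def reconstruct_intra_dc(level):
    """Intra DC uses a fixed step size of 8 and no dead-zone."""
    return 8 * level


print(reconstruct(1, 5), reconstruct(-1, 5))  # 15 -15
print(reconstruct(1, 4), reconstruct(-1, 4))  # 11 -11
print(reconstruct(2, 4))                      # 19
```

The smallest non-zero reconstruction magnitude is about 3•QUANT while neighbouring levels are 2•QUANT apart, which is the dead-zone around zero.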


Quantization

• The quantization in H.261 uses a constant step size for all DCT coefficients within a macroblock.
• If we use DCT and QDCT to denote the DCT coefficients before and after quantization, then for DC coefficients in Intra mode:
    QDCT = round(DCT / step_size) = round(DCT / 8)
  – For other coefficients (floor function for center deadzone):
    QDCT = floor(DCT / step_size) = floor(DCT / (2•scale))
  – Scale: an integer in the range of [1, 31].
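A minimal sketch of the encoder-side quantizer, under stated assumptions: intra DC is rounded with the fixed step size of 8, and every other coefficient is divided by a step size of 2•scale. The slide only says "floor function"; applying the floor to the magnitude (truncation toward zero) is an assumption made here so that the dead-zone is centered on zero. Function names are hypothetical, and encoders are free to quantize differently since only the decoder is standardized.

```python
import math


def quantize_intra_dc(dct):
    """Round to the nearest multiple of the fixed step size 8 (halves round up)."""
    return math.floor(dct / 8 + 0.5)


def quantize_other(dct, scale):
    """Dead-zone quantizer with step size 2 * scale, truncating toward zero."""
    magnitude = abs(dct) // (2 * scale)
    return magnitude if dct >= 0 else -magnitude


print(quantize_intra_dc(20))   # 3
print(quantize_other(15, 4))   # 1
print(quantize_other(-7, 4))   # 0
```

Any coefficient with magnitude below 2•scale maps to zero, so small values near zero are discarded.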


DCT Coefficient Quantization

Deadzone: avoids coding too many small coefficients, which are typically due to noise


Variable Length Coding

• DCT coefficients are converted into run-length representations and then coded using VLC (Huffman coding for each pair of symbols)
  – Symbol: (zero run-length, non-zero value)
• Other information is also coded using VLC (Huffman coding)

Code length in bits, by zero run (rows) and absolute level (columns):

Run \ Level    1     2   3   4   5   6   7   8  …  15  16  …  128
0            2 (3)   5   6   8   9   9  11  13  …  14  20  …   20
1              4     7   9  11  13  14  14  20  …
2              5     8  11  13  14  20  …
3              6     9  13  14  20  …
4              6    11  13  20  …
5              7    11  14  20  …
6              7    13  20  …
7              7    13  20  …
8              8    13  20  …
…              (rows continue up to run 63)

• 20 bits: fixed-length codes, Escape (6 bits) + Run (6) + Level (8)
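The conversion of a scanned coefficient list into (run, level) symbols can be sketched as below. The function name and the "EOB" marker string are hypothetical; the mapping of each symbol to an actual codeword is not shown.

```python
ESCAPE_CODE_BITS = 6 + 6 + 8  # Escape + Run + Level = 20-bit fixed-length code


def run_level_events(scanned):
    """Turn zigzag-scanned coefficients into (zero run, level) pairs plus an
    end-of-block marker; trailing zeros are absorbed by the marker."""
    events = []
    run = 0
    for coeff in scanned:
        if coeff == 0:
            run += 1
        else:
            events.append((run, coeff))
            run = 0
    events.append("EOB")
    return events


print(run_level_events([5, 0, 0, -2, 1, 0, 0, 0]))
# [(0, 5), (2, -2), (0, 1), 'EOB']
```

Frequent pairs such as a short run with a small level get short codewords, while any pair outside the table falls back to the 20-bit escape form.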


Motion Estimation and Compensation

• Integer-pel accuracy in the range [-16,16]
• Methods for generating the MVs are not specified in the standard
  – (Standards only define the bitstream syntax, or the decoder operation)
• MVs coded differentially (DMV)
• Encoder and decoder use the decoded MVs to perform motion compensation
• Loop filtering can be applied to suppress propagation of coding noise temporally
  – Separable filter [1/4, 1/2, 1/4]
  – Loop filter can be turned on or off
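The separable [1/4, 1/2, 1/4] loop filter can be sketched as a horizontal pass followed by a vertical pass over a prediction block. Two details are assumptions of this sketch rather than statements from the slides: samples on the block edge are copied unfiltered in the direction that would need a neighbour outside the block, and results are kept as floats without integer rounding. Function names are hypothetical.

```python
def filter_1d(values):
    """Apply taps [1/4, 1/2, 1/4] along one row or column; end samples are copied."""
    out = list(values)
    for k in range(1, len(values) - 1):
        out[k] = (values[k - 1] + 2 * values[k] + values[k + 1]) / 4
    return out


def loop_filter(block):
    """Separable 2-D filter: filter every row, then every column of the result."""
    rows = [filter_1d(row) for row in block]
    cols = [filter_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


impulse = [[0] * 5 for _ in range(5)]
impulse[2][2] = 16
for row in loop_filter(impulse):
    print(row)
```

An isolated spike is spread over its 3x3 neighbourhood with weights 1-2-1 / 2-4-2 / 1-2-1 (divided by 16), which shows the low-pass smoothing applied to the prediction.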


H.263


Very Low Bit Rate Coding

• ITU-T Study Group (SG) 15/16: Very Low Bit-Rate Visual Telephony (LBC)
• History:
  – Sept. 1993: Started new work item.
  – Near-term: Improving H.261
    • Nov. 1995: H.263 decided
    • Jan. 1998: H.263+ (H.263 Ver. 2) decided
    • 2000: Finished H.263++ (H.263 Ver. 3)
  – Long-term: Draft H.26L → H.264 (2003)
    • Different from H.261 (H.263)
    • Collaborate with MPEG-4 (JVT = AVC)
• Goal: Improved quality at lower rates
• Result: Significantly better quality at lower rates
  – Better video at 18-24 kbps than H.261 at 64 kbps
  – Enables videophone over regular phone lines (28.8 kbps) or wireless modem


Different From H.261

• A combination of H.261 and MPEG
• Various picture formats such as sub-QCIF, 4CIF, etc.
• Half-pel motion compensation (~MPEG)
• No loop filter
• No macroblock addressing (included in MB header)
• Quantizer stepsize: 5-bit in picture and GOB layers; differential MQUANT stepsize: 2-bit in MB layer
• 3D VLC for transform coeffs.
• Four negotiable options


Video Format and Picture Partition in H.263

• Wider application range from sub-QCIF to 16CIF


H.263 Typical Encoder (Nonstandard)

• A general source coder model


DCT, Quantization and 3-D VLC

• DCT and zig-zag scan: same as H.261 (JPEG)
• Inverse quantization: same as that of H.261
• At MB layer, the QUANT value can only be increased / decreased by 1 or 2
• 3-D VLC: An event (symbol) is made of (Last, Run, Level).
  – ‘Last’ = 1 indicates the last coeff.


3-D VLC

Last  Run  Level  (Bits)  VLC Code
0     0    1      3       10s
0     0    2      5       1111s
0     0    3      7       0101 01s
…
1     0    1      5       0111s
1     0    2      10      0000 1100 1s
1     0    3      12      0000 0000 101s
…
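Forming (Last, Run, Level) events from a scanned block can be sketched as follows. The function names are hypothetical, and the small lookup holds only the six code lengths listed in the table above, not the full H.263 table; the lengths include the sign bit `s`.

```python
# Code lengths (bits, including the sign bit) for the six table entries shown,
# keyed by (last, run, absolute level).
CODE_BITS_EXCERPT = {
    (0, 0, 1): 3, (0, 0, 2): 5, (0, 0, 3): 7,
    (1, 0, 1): 5, (1, 0, 2): 10, (1, 0, 3): 12,
}


def last_run_level_events(scanned):
    """Turn zigzag-scanned coefficients into (last, run, level) events.
    No separate end-of-block symbol is needed: last = 1 marks the final event."""
    nonzero = [k for k, coeff in enumerate(scanned) if coeff != 0]
    events = []
    prev = -1
    for k in nonzero:
        last = 1 if k == nonzero[-1] else 0
        events.append((last, k - prev - 1, scanned[k]))
        prev = k
    return events


def bits_for(event):
    """Look up the code length of an event, or None if it is not in the excerpt."""
    last, run, level = event
    return CODE_BITS_EXCERPT.get((last, run, abs(level)))


events = last_run_level_events([5, 0, 0, -2, 1, 0, 0, 0])
print(events)               # [(0, 0, 5), (0, 2, -2), (1, 0, 1)]
print(bits_for(events[-1]))  # 5
```

Compared with the 2-D (run, level) symbols of H.261, folding the end-of-block information into the event removes the separate EOB codeword.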


Motion Estimation: Median Prediction for MV

1. Horizontal and vertical components are calculated separately
2. The difference between the MV and the predictor is VLC-coded
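The two steps can be sketched in a few lines. Which three neighbouring MVs serve as candidates is shown in the slide's figure, which did not survive extraction, so the sketch simply takes three candidate vectors as arguments; the function names are hypothetical.

```python
def median_predictor(mv1, mv2, mv3):
    """Component-wise median of three candidate motion vectors given as (x, y)."""
    pred_x = sorted([mv1[0], mv2[0], mv3[0]])[1]
    pred_y = sorted([mv1[1], mv2[1], mv3[1]])[1]
    return (pred_x, pred_y)


def mv_difference(mv, predictor):
    """The vector difference that would be VLC-coded."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])


predictor = median_predictor((1, 4), (3, -2), (2, 0))
print(predictor)                           # (2, 0)
print(mv_difference((2.5, 1), predictor))  # (0.5, 1)
```

Because each component is taken separately, the predictor need not equal any one of the three candidates.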


Motion Estimation: Half-Pel Precision

• Half-pixel prediction by bilinear interpolation
  – to reduce the prediction error
  – the default range of MV (u, v) is now [−16, 15.5]
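Bilinear half-pel interpolation can be sketched as below. Coordinates are given in half-pel units, the reference picture is a list of rows, and the `+ 1` and `+ 2` rounding offsets are an assumption of this sketch (the slides only say "bilinear"). The function name is hypothetical and non-negative coordinates are assumed.

```python
def half_pel_sample(ref, x2, y2):
    """Sample `ref` at position (x2 / 2, y2 / 2), with x2 and y2 in half-pel units."""
    x, y = x2 // 2, y2 // 2          # integer-pel position
    half_x, half_y = x2 % 2, y2 % 2  # 1 when the half-pel offset is present
    a = ref[y][x]
    if not half_x and not half_y:
        return a                                  # integer position
    if half_x and not half_y:
        return (a + ref[y][x + 1] + 1) // 2       # between two horizontal neighbours
    if half_y and not half_x:
        return (a + ref[y + 1][x] + 1) // 2       # between two vertical neighbours
    return (a + ref[y][x + 1] + ref[y + 1][x] + ref[y + 1][x + 1] + 2) // 4


ref = [[10, 20], [30, 40]]
print(half_pel_sample(ref, 1, 0))  # 15
print(half_pel_sample(ref, 0, 1))  # 20
print(half_pel_sample(ref, 1, 1))  # 25
```

A motion vector component such as 15.5 then corresponds to `x2 = 31` relative to the block position.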


H.263 Negotiable Options

Negotiable between encoder and decoder:
• Unrestricted motion vectors (UMV) mode: motion vectors are allowed to point outside the picture
• Syntax-based arithmetic coding (SAC) mode: VLC is replaced by arithmetic coding
• Advanced prediction (AP) mode: one MV for each 8x8 block
• PB-frame (PB) mode: introduces a ‘constrained version’ of the (MPEG) B-frame


Advanced Prediction Mode

• Four MVs can be used in an MB: the 1st (differential) MV is MVD and the rest are MVD2-4
• The MV predictor for each 8x8 block is formed by using 3 nearby MVs (figure not recovered)


AP Mode: Overlapped Motion Compensation

• Each pel in the current 8x8 luminance block is predicted using the weighted sum of the pels of three previous-frame predictors: current, left (or right), and top (or bottom). For example, the upper-left 4x4 corner uses the current, top, and left predictors; the upper-right 4x4 corner uses the current, top, and right predictors; etc.
• The current predictor is the previous-frame pels displaced using the current MV, the left predictor is displaced using the left block's MV, etc.
• Four MVs enable a more accurate MV for each block. Overlapped compensation achieves a smooth transition between nearby blocks.


Overlapped Motion Compensation

$$p(i,j) = \bigl( q(i,j) \times H_0(i,j) + r(i,j) \times H_r(i,j) + s(i,j) \times H_s(i,j) + 4 \bigr) / 8$$

where $q(i,j)$ is the pel displaced by the current MV ($MV_0$), $r(i,j)$ is the pel displaced by $MV_r$ (MV of the top or the bottom block), and $s(i,j)$ is the pel displaced by $MV_s$ (MV of the left or the right block).
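The weighted sum for overlapped motion compensation can be sketched as below, with integer division by 8 after adding the rounding offset 4. The weight matrices are passed in by the caller: the real H.263 weights vary with pel position and are not reproduced here, so the uniform weights 4, 2, 2 in the demo are purely hypothetical values chosen so that the three weights add up to 8. The function name is hypothetical as well.

```python
def obmc_block(q, r, s, h0, hr, hs):
    """Blend three predictions of one block pel by pel.

    q, r, s: predictions displaced by the current MV, the top/bottom MV and
             the left/right MV; h0, hr, hs: matching weight matrices.
    """
    n = len(q)
    return [[(q[i][j] * h0[i][j] + r[i][j] * hr[i][j] + s[i][j] * hs[i][j] + 4) // 8
             for j in range(n)] for i in range(n)]


def const(value, n=2):
    """Helper for the demo: an n x n matrix filled with one value."""
    return [[value] * n for _ in range(n)]


# Hypothetical uniform weights (4 + 2 + 2 = 8), for illustration only.
blended = obmc_block(const(80), const(40), const(0), const(4), const(2), const(2))
print(blended)  # [[50, 50], [50, 50]]
```

When all three predictions agree, the output equals that common value, since the weights sum to 8 at every position.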


Overlapped MC (cont.)


Motion Estimation: PB-Picture Mode

PB-picture mode codes two pictures as a group. The second picture (P) is coded first, then the first picture (B) is coded using both the P-picture and the previously coded picture. This avoids the reordering of pictures required in the normal B-mode, but it still incurs more coding delay than using P-frames only.

In a B-block, forward prediction (from the previous frame) can be used for all pixels; backward prediction (from the future frame) is used only for those pels where the backward motion vector aligns with pels of the current MB. Pixels in the “white area” use only forward prediction.

Under large motions, PB-frames do not compress as well as B-frames. An improved PB-frame mode, defined in H.263+, removes this restriction.


Performance of H.261 and H.263

[Figure: performance curves for the Foreman sequence, QCIF, 12.5 Hz. Curves: Integer MC, +/- 16; Integer MC, +/- 16, loop filter; Integer MC, +/- 32; Half-pel MC, +/- 32; OBMC, 4 MVs, etc.]


Advantages of Options

(Girod et al., “Performance of the H.263 Video Compression Standard,” VLSI Signal Proc., 1997)
• Test conditions: 64 kbps, QCIF pictures, ~12.5 frames/sec
• H.261 vs. H.263: (1) without options, ~2 dB PSNR improvement; (2) with all options, ~3 dB
• Key factor: half-pel motion estimation
• H.263 SAC option: 0.2 dB improvement (vs. without)
• H.263 AP option: 1.2 dB (vs. without)
• H.263 PB option: P-picture PSNR is higher but B-picture PSNR is lower; better subjective quality


H.263+ (H.263 v2)

• Enhances H.263 with additional options (Draft 20, Sept. ‘97)
• Coding efficiency:
  – Advanced intra coding mode
  – Deblocking filter mode
  – Improved PB-frames mode
  – Reference picture resampling mode
  – Alternative inter VLC mode
  – Modified quantization mode


H.263+ (cont.)

• Error robustness:
  – Slice structured mode
  – Reference picture selection mode
  – Independently segmented decoding mode
• Enhanced communication:
  – Temporal, SNR, and spatial scalability mode
  – Reduced-resolution update mode