international telecommunication union the h.264/mpeg-4 ... · itu-t vica workshop 22-23 july 2005,...

29
ITU-T VICA Workshop 22-23 July 2005, ITU Headquarter, Geneva International Telecommunication Union The H.264/MPEG The H.264/MPEG - - 4 4 Advanced Video Coding Advanced Video Coding (AVC) Standard (AVC) Standard Gary J. Sullivan, Ph.D. Gary J. Sullivan, Ph.D. ITU ITU- T VCEG Rapporteur | Chair T VCEG Rapporteur | Chair ISO/IEC MPEG Video Rapporteur | Co ISO/IEC MPEG Video Rapporteur | Co- Chair Chair ITU/ISO/IEC JVT Rapporteur | Co ITU/ISO/IEC JVT Rapporteur | Co- Chair Chair July 2005 July 2005

Upload: buithuy

Post on 08-Feb-2019

214 views

Category:

Documents


0 download

TRANSCRIPT

ITU-T VICA Workshop22-23 July 2005, ITU Headquarter, Geneva

International Telecommunication Union

The H.264/MPEGThe H.264/MPEG--4 4 Advanced Video Coding Advanced Video Coding

(AVC) Standard(AVC) StandardGary J. Sullivan, Ph.D.Gary J. Sullivan, Ph.D.

ITUITU--T VCEG Rapporteur | ChairT VCEG Rapporteur | ChairISO/IEC MPEG Video Rapporteur | CoISO/IEC MPEG Video Rapporteur | Co--ChairChair

ITU/ISO/IEC JVT Rapporteur | CoITU/ISO/IEC JVT Rapporteur | Co--Chair Chair

July 2005July 2005

H.264/AVC July ‘05 Gary Sullivan 1

The Advanced Video CodingThe Advanced Video Coding ProjectProjectAVC = ITUAVC = ITU--T H.264 / MPEGT H.264 / MPEG--4 part 104 part 10§ History: ITU-T Q.6/SG16 (VCEG - Video Coding Experts Group)

“H.26L” standardization activity (where the “L” stood for “long-term”)

§ August 1999: 1st test model (TML-1)

§ July 2001: MPEG open call for technology: H.26L demo’ed

§ December 2001: Formation of the Joint Video Team (JVT)between VCEG and MPEG to finalize H.26L as a new joint project (similar to MPEG-2/H.262)

§ July 2002: Final Committee Draft status in MPEG

§ Dec ‘02 technical freeze, FCD ballot approved

§ May ’03 completed in both orgs

§ July ’04 Fidelity Range Extensions (FRExt) completed

§ January ’05 Scalable Video Coding launched

H.264/AVC July ‘05 Gary Sullivan 2

§ Primary technical objectives:• Significant improvement in coding efficiency• High loss/error robustness• “Network Friendliness” (carry it well on MPEG-2 or RTP or

H.32x or in MPEG-4 file format or MPEG-4 systems or …)• Low latency capability (better quality for higher latency)• Exact match decoding

§ Additional version 2 objectives (in FRExt):• Professional applications (more than 8 bits per sample,

4:4:4 color sampling, etc.)• Higher-quality high-resolution video• Alpha plane support (a degree of “object” functionality)

AVC ObjectivesAVC Objectives

H.264/AVC July ‘05 Gary Sullivan 3

§ Same design to be approved in both ITU-T VCEG and ISO/IEC MPEG

§ In ITU-T VCEG this is a new & separate standard• ITU-T Recommendation H.264• ITU-T Systems (H.32x) support it

§ In ISO/IEC MPEG this is a new “part” in the MPEG-4 suite• Separate codec design from prior MPEG-4 visual• New part 10 called “Advanced Video Coding” (AVC – similar to

“AAC” position in MPEG-2 as separate codec)• Not backward or forward compatible with prior standards (incl.

the prior MPEG-4 visual spec. – core technology is different)• MPEG-4 Systems / File Format supports it

§ H.222.0 | MPEG-2 Systems also supports it

Relating to Other ITU & MPEG StandardsRelating to Other ITU & MPEG Standards

H.264/AVC July ‘05 Gary Sullivan 4

§ Test of different standards (ICIP 2002 study)§ Using same rate-distortion optimization techniques for all codecs§ Streaming test: High-latency (included B frames)

• Four QCIF sequences coded at 10 Hz and 15 Hz (Foreman, Container, News, Tempete) and

• Four CIF sequences coded at 15 Hz and 30 Hz (Bus, Flower Garden, Mobile and Calendar, and Tempete)

§ Real-time conversation test: No B frames• Four QCIF sequences encoded at 10Hz and 15Hz (Akiyo, Foreman, Mother and

Daughter, and Silent Voice)• Four CIF sequences encoded at 15Hz and 30Hz (Carphone, Foreman, Paris,

and Sean)§ Compare four codecs using PSNR measure:

• MPEG-2 (in high-latency/streaming test only)• H.263 (high-latency profile, conversational high-compression profile, baseline

profile)• MPEG-4 Visual (simple and advanced simple profiles with & without B pictures)• H.264/AVC (with & without B pictures)

§ Note: These test results are from a private study and are not an endorsed report of the JVT, VCEG or MPEG organizations.

A Comparison of PerformanceA Comparison of Performance

H.264/AVC July ‘05 Gary Sullivan 5

ComparisonComparison to MPEGto MPEG--2, H.263, MPEG2, H.263, MPEG--4p24p2

2728

2930

31

3233

34

3536

3738

39

0 50 100 150 200 250

Bit-rate [kbit/s]

Foreman QCIF 10Hz

QualityY-PSNR [dB]

MPEG-2H.263

MPEG-4 VisualJVT/H.264/AVC

H.264/AVC July ‘05 Gary Sullivan 6

ComparisonComparison to MPEGto MPEG--2, H.263, MPEG2, H.263, MPEG--4p24p2Tempete CIF 30Hz

2526272829303132333435363738

0 500 1000 1500 2000 2500 3000 3500

Bit-rate [kbit/s]

QualityY-PSNR [dB]

MPEG-2H.263

MPEG-4JVT/H.264/AVC

Visual

H.264/AVC July ‘05 Gary Sullivan 7

§ This encoding software may not represent implementation quality

§ These tests only up to CIF (quarter-standard-definition) resolution

§ Interlace, SDTV, and HDTV not tested in this test§ Test sequences may not be representative of the

variety of content encountered by applications§ These tests so far not aligned with profile designs§ This study reports PSNR, but perceptual quality is what

matters

Caution: Your Mileage Caution: Your Mileage WillWill VaryVary

H.264/AVC July ‘05 Gary Sullivan 8

§ New design includes relaxation of traditional bounds on computing resources – rough guess 2-3x the MIPS, ROM & RAM requirements of MPEG-2 for decoding, 3-4x for encoding

§ Particularly an issue for low-power (e.g., mobile) devices§ Problem areas:

• Smaller block sizes for motion compensation (cache access issues)

• Longer filters for motion compensation (more memory access)• Multi-frame motion compensation (more memory for reference

frame storage)• In-loop deblocking filter (more processing & memory access)• More segmentations of macroblock to choose from (more

searching in the encoder)• More methods of predicting intra data (more searching)• Arithmetic coding (adaptivity, computation on output bits)

Computing Resources for the New DesignComputing Resources for the New Design

H.264/AVC July ‘05 Gary Sullivan 9

AVC StructureAVC Structure

EntropyCoding

Scaling & Inv. Transform

Motion-Compensation

ControlData

Quant.Transf. coeffs

MotionData

Intra/Inter

CoderControl

Decoder

MotionEstimation

Transform/Scal./Quant.-

InputVideoSignal

Split intoMacroblocks16x16 pixels

Intra-frame Prediction

DeblockingFilter

OutputVideoSignal

H.264/AVC July ‘05 Gary Sullivan 10

EntropyCoding

Scaling & Inv. Transform

Motion-Compensation

ControlData

Quant.Transf. coeffs

MotionData

Intra/Inter

CoderControl

Decoder

MotionEstimation

Transform/Scal./Quant.-

InputVideoSignal

Split intoMacroblocks16x16 pixels

Intra-frame Prediction

De-blockingFilter

OutputVideoSignal

Motion Compensation AccuracyMotion Compensation Accuracy

Motion vector accuracy 1/4 sample(6-tap filter)

8x8

0

4x8

0 10 1

2 3

4x48x4

1

08x8Types

0

16x16

0 1

8x16MB

Types

8x80 1

2 3

16x8

1

0

H.264/AVC July ‘05 Gary Sullivan 11

EntropyCoding

Scaling & Inv. Transform

Motion-Compensation

ControlData

Quant.Transf. coeffs

MotionData

Intra/Inter

CoderControl

Decoder

MotionEstimation

Transform/Scal./Quant.-

InputVideoSignal

Split intoMacroblocks16x16 pixels

Intra-frame Prediction

De-blockingFilter

OutputVideoSignal

MotionData

OutputVideoSignal

Multiple Reference FramesMultiple Reference Frames

§Multiple Reference Frames§ Generalized B Frames

H.264/AVC July ‘05 Gary Sullivan 12

EntropyCoding

Scaling & Inv. Transform

Motion-Compensation

ControlData

Quant.Transf. coeffs

MotionData

Intra/Inter

CoderControl

Decoder

MotionEstimation

Transform/Scal./Quant.-

InputVideoSignal

Split intoMacroblocks16x16 pixels

Intra-frame Prediction

De-blockingFilter

OutputVideoSignal

IntraIntra PredictionPrediction§ Directional spatial prediction

(9 types for luma, 1 chroma)

• e.g., Mode 3: diagonal down/right predictiona, f, k, p are predicted by (A + 2Q + I + 2) >> 2

1

2

3456

7

8

0

Q A B C D E F G HI a b c dJ e f g hK i j k lL m n o pMNOP

H.264/AVC July ‘05 Gary Sullivan 13

EntropyCoding

Scaling & Inv. Transform

Motion-Compensation

ControlData

Quant.Transf. coeffs

MotionData

Intra/Inter

CoderControl

Decoder

MotionEstimation

Transform/Scal./Quant.-

InputVideoSignal

Split intoMacroblocks16x16 pixels

Intra-frame Prediction

De-blockingFilter

OutputVideoSignal

TransformTransform CodingCoding

§ 4x4 Block Integer Transform

§ Repeated transform of DC coeffsfor 8x8 chroma and 16x16 Intra luma blocks

1 1 1 12 1 1 2

1 1 1 11 2 2 1

− − = − − − −

H

H.264/AVC July ‘05 Gary Sullivan 14

Deblocking FilterDeblocking Filter

EntropyCoding

Scaling & Inv. Transform

Motion-Compensation

ControlData

Quant.Transf. coeffs

MotionData

Intra/Inter

CoderControl

Decoder

MotionEstimation

Transform/Scal./Quant.-

InputVideoSignal

Split intoMacroblocks16x16 pixels

Intra-frame Prediction

DeblockingFilter

OutputVideoSignal

H.264/AVC July ‘05 Gary Sullivan 15

Entropy CodingEntropy Coding

Deq./Inv. Transform

Motion-Compensated

Predictor

ControlData

Quant.Transf. coeffs

MotionData

0

Intra/Inter

CoderControl

Decoder

MotionEstimator

Transform/Quantizer-

EntropyCoding

H.264/AVC July ‘05 Gary Sullivan 16

§ Three profiles in version 1: Baseline, Main, and Extended§ Baseline (esp. Videoconferencing & Wireless)

• I and P progressive-scan picture coding (not B)• In-loop deblocking filter• 1/4-sample motion compensation• Tree-structured motion segmentation down to 4x4 block size• VLC-based entropy coding• Some enhanced error resilience features

– Flexible macroblock ordering/arbitrary slice ordering– Redundant slices

AVC Version 1 ProfilesAVC Version 1 Profiles

H.264/AVC July ‘05 Gary Sullivan 17

§ Main Profile (esp. Broadcast)• All Baseline features except enhanced error resilience features• Interlaced video handling• Generalized B pictures• Adaptive weighting for B and P picture prediction• CABAC (arithmetic entropy coding)

§ Extended Profile (esp. Streaming)• All Baseline features• Interlaced video handling• Generalized B pictures• Adaptive weighting for B and P picture prediction• More error resilience: Data partitioning• SP/SI switching pictures

NonNon--Baseline AVC Version 1 ProfilesBaseline AVC Version 1 Profiles

H.264/AVC July ‘05 Gary Sullivan 18

Amendment 1: FidelityAmendment 1: Fidelity--Range ExtensionsRange Extensions

§ AVC standard finished 2003• ITU-T/H.264 finalized May, 2003• MPEG-4 AVC finalized July, 2003 (same thing)• Only corrigenda (bug fixes) since then

§ Fidelity-Range Extensions (FRExt)• New work item initiated in July 2003• More than 8 bits, color other than 4:2:0• Alpha coding• More coding efficiency capability• Also new supplemental information

H.264/AVC July ‘05 Gary Sullivan 19

FRExtFRExt Finished July 04Finished July 04

§ Project initiated July 2003• Motivations

– Higher quality, higher rates – 4:4:4, 4:2:2, and also 4:2:0– 8, 10, or 12 bits (14 bits considered and not

included)– Lossless– Stereo

§ Finished in one year! (July 04)

H.264/AVC July ‘05 Gary Sullivan 20

New Things in New Things in FRExtFRExt –– Part 1Part 1

§ Larger transforms• 8x8 transform (again!)• Drop 4x8, 8x4, or larger, 16-point…

§ Filtered intra prediction modes for 8x8 block size§ Quantization matrix

• 4x4, 8x8, intra, inter trans. coefficients weighted differently• Old idea, dating to JPEG and before (circa 1986?)• Full capabilities not yet explored (visual weighting)

§ Coding in various color spaces• 4:4:4, 4:2:2, 4:2:0, Monochrome, with/without Alpha• New integer color transform (a VUI-message item)

H.264/AVC July ‘05 Gary Sullivan 21

New Things in New Things in FRExtFRExt –– Part 2Part 2

§ Efficient lossless interframe coding§ Film grain characterization for analysis/synthesis

representation§ Stereo-view video support§ Deblocking filter display preference

H.264/AVC July ‘05 Gary Sullivan 22

8x8 168x8 16--Bit (Bit (BossenBossen) Transform) Transform

−−−−−−−−

−−−−−−−−

−−−−−−−−

−−−−

3610121210634884488461231010312688888888

10312661231084488448

12106336101288888888

H.264/AVC July ‘05 Gary Sullivan 23

8x8 Transform Advantage8x8 Transform Advantage(JVT(JVT--K028, IBBP coding, K028, IBBP coding, progprog. scan). scan)

12.48Average

15.65Riverbed

10.93Crawford

13.46Movie 5

11.06Movie 4

12.01Movie 3

12.71Movie 2

11.59Movie 1

% BD bit-rate reductionSequence

H.264/AVC July ‘05 Gary Sullivan 24

Quantization MatrixQuantization Matrix

§ Similar concept to MPEG-2 design§ Vary step size based on frequency§ Adapted to modified transform structure§ More efficient representation of weights§ Eight downloadable matrices (at least 4:2:0)

• Intra 4x4 Y, Cb, Cr• Intra 8x8 Y• Inter 4x4 Y, Cb, Cr• Inter 8x8 Y

H.264/AVC July ‘05 Gary Sullivan 25

New Profiles Created by New Profiles Created by FRExtFRExt

§ 4:2:0, 8-bit: “High” (HP)§ 4:2:0, 10-bit: “High 10” (Hi10)§ 4:2:2, 10-bit: “High 4:2:2” (Hi422)§ 4:4:4, 12-bit: “High 4:4:4” (Hi444)

§ Effectively the same tools, but acting on different input data (with a couple of exceptions in the 4:4:4 profile)

H.264/AVC July ‘05 Gary Sullivan 26

Some Notes on Quality TestingSome Notes on Quality Testing

§ Use appropriate “High” profile (incl. adaptive transform)§ If testing for PSNR, use “flat” quant matrices§ Otherwise, use “non-flat” quant matrices§ Use more than 1 or 2 reference pictures§ Use hierarchical reference frames coding structure§ Use CABAC entropy coding§ If testing high-quality PSNR, use adaptive quantization§ Use rate-distortion optimization in encoder§ Use large-range good-quality motion search

*

* = See G. Sullivan & S. Sun, “On Dead-Zone…”, VCIP 2005/JVT-N011

H.264/AVC July ‘05 Gary Sullivan 27

A Performance Test for High Profile A Performance Test for High Profile (from JVT(from JVT--L033 L033 -- Panasonic)Panasonic)

§ Subjective tests by Blu-Ray Disk Founders of FRExt HP

• 4:2:0/8 (HP) 1920x1080x24p (1080p), 3 clips. • Nominal 3:1 advantage to MPEG-2

– 8 Mbps HP scored better than 24 Mbps MPEG-2• Apparent transparency at 16 Mbps

H.264/AVC FRExt 8Mbps

Original MPEG2 24 Mbs, DVHS

emulation

Mean OpinionScore

2

2.5

3

3.5

4

4.5

5

H.264/AVC FRExt

12Mbps

H.264/AVC FRExt

16Mbps

H.264/AVC FRExt

20Mbps

3.65 3.71 4.00

4.03 3.90

3.59

Figure 1: Results of subjective test with studio participants (Blu-Ray Disk Founders)

5: Perfect4: Good3: Fair (OK for DVD)2: Poor1: Very Poor

H.264/AVC July ‘05 Gary Sullivan 28

§ JVT, MPEG, and VCEG management team members:• Gary Sullivan ([email protected])• Jens Ohm ([email protected]) • Ajay Luthra ([email protected])• Thomas Wiegand ([email protected])

§ IEEE Transactions on Circuits and Systems for Video Technology Special Issue on H.264/AVC (July 2003)

§ Paper in Proceedings of IEEE January 2005§ I. Richardson, H.264 and MPEG-4 Video Compression§ Overview including FRExt: SPIE Aug 2004 (Sullivan, Topiwala, and

Luthra)§ Paper at VCIP 2005: Meta-overview and deployment

For Further InformationFor Further Information