overview: image and video coding standards · 2007-05-08 · bernd girod: ee398b image...
Bernd Girod: EE398B Image Communication II Video Coding Standards: H.264/AVC no. 1
Overview: Video Coding Standards
• Video coding standards: applications and common structure
• Relevant standards organizations
• ITU-T Rec. H.261
• ITU-T Rec. H.263
• ISO/IEC MPEG-1
• ISO/IEC MPEG-2
• ISO/IEC MPEG-4
• Recent progress: H.264/AVC
The JVT Project
• ITU-T Q.6/SG16 (VCEG - Video Coding Experts Group): formed for ITU-T standardization activity for video compression since 1997
• August 1999: 1st test model (TML-1) of H.26L
• December 2001: formation of the Joint Video Team (JVT) between VCEG and ISO/IEC JTC 1/SC 29/WG 11 (MPEG) to establish a joint standard project: H.264 / MPEG-4 AVC
• ITU-T approval: May 2003
• ISO/IEC approval: October 2003
[source: G. Sullivan, VCEG]
JVT Goals
• Improved coding efficiency
  • Average bit rate reduction of 50% given fixed fidelity compared to any other standard
  • Trade-off complexity vs. coding efficiency
• Improved network friendliness
  • Anticipate error-prone transport over mobile networks and the wired and wireless Internet
  • Further improve robustness techniques in H.263 and MPEG-4
• Simple syntax specification
  • Avoid excessive quantity of optional features
  • Minimize number of “profiles” for distinct application areas
[source: G. Sullivan, VCEG]
H.264/JVT Applications
• Entertainment Video
  • Broadcast: Terrestrial / Satellite / Cable . . .
  • Storage: DVD / HD-DVD / PVR . . .
• Conversational Services
  • H.320 Conversational
  • 3GPP Conversational H.324/M
  • H.323 Conversational Internet/best effort IP/RTP
  • 3GPP Conversational IP/RTP/SIP
• Video Streaming
  • 3GPP Streaming IP/RTP/RTSP
  • Streaming IP/RTP/RTSP (without TCP fallback)
• Other Applications
  • 3GPP Multimedia Messaging Services
  • Digital camcorder
[source: G. Sullivan, VCEG]
Relationship to Other Standards
• Identical specifications have been approved in both ITU-T / VCEG and ISO/IEC / MPEG
• In ITU-T / VCEG this is a new & separate standard
  • ITU-T Recommendation H.264
  • ITU-T Systems (H.32x) will be modified to support it
• In ISO/IEC / MPEG this is a new “part” in the MPEG-4 suite
  • Separate codec design from prior MPEG-4 visual
  • New Part 10 called “Advanced Video Coding” (AVC; similar to “AAC” in MPEG-2 as separate audio codec)
  • MPEG-4 Systems / File Format has been modified to support it
  • H.222.0 | MPEG-2 Systems also modified to support it
• IETF: RTP payload packetization
[source: G. Sullivan, VCEG]
H.264/AVC Profiles
• Baseline: core compression capabilities, plus error resilience, e.g., for videoconferencing, mobile video
• Main: high compression and quality, e.g., for broadcasting
• Extended: added features for efficient streaming
[source: G. Sullivan, VCEG]
H.264/AVC Coder
[Figure: coder block diagram. Blocks: Coder Control, Split into Macroblocks (16x16 pixels), Transform/Scal./Quant., Scaling & Inv. Transform, Entropy Coding, Intra-frame Prediction, Motion Compensation, Motion Estimation, Deblocking Filter, with the Decoder embedded in the coder. Signals: Input Video Signal, Output Video Signal, Control Data, Quant. Transf. coeffs, Motion Data, Intra/Inter switch.]
[source: G. Sullivan, VCEG]
Input Video Signal
• Progressive and interlaced frames can be coded as one unit
• Progressive vs. interlace frame is signaled but has no impact on decoding
• Each field can be coded separately
• Dangling fields
[Figure: a progressive frame next to an interlaced frame (top field first) made up of a top field and a bottom field separated in time by Δt]
[source: G. Sullivan, VCEG]
Partitioning of the Picture
Slices:
• A picture is split into 1 or several slices
• Slices are self-contained
• Slices are a sequence of macroblocks
Macroblocks:
• Basic syntax & processing unit
• Contains 16x16 luma samples and 2 x 8x8 chroma samples
• Macroblocks within a slice depend on each other
• Macroblocks can be further partitioned
[Figure: picture divided into macroblocks numbered 0, 1, 2, … and grouped into Slice #0, Slice #1 and Slice #2; Macroblock #40 is labeled]
[source: G. Sullivan, VCEG]
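The partitioning above can be sketched in a few lines of Python. This is only an illustration: the function names and the fixed number of macroblocks per slice are assumptions made for the example, and an encoder is free to place slice boundaries differently.

```python
MB_SIZE = 16  # a macroblock covers 16x16 luma samples


def macroblock_grid(width, height):
    """Return (columns, rows) of macroblocks for a picture whose
    dimensions are multiples of 16 luma samples."""
    if width % MB_SIZE or height % MB_SIZE:
        raise ValueError("picture dimensions must be multiples of 16")
    return width // MB_SIZE, height // MB_SIZE


def samples_per_macroblock():
    """16x16 luma samples plus two 8x8 chroma blocks."""
    return 16 * 16 + 2 * 8 * 8


def split_into_slices(num_macroblocks, mbs_per_slice):
    """Assign consecutive macroblock addresses to slices.
    The fixed slice length is a hypothetical policy, not a rule of the standard."""
    return [
        list(range(start, min(start + mbs_per_slice, num_macroblocks)))
        for start in range(0, num_macroblocks, mbs_per_slice)
    ]


cols, rows = macroblock_grid(352, 288)        # CIF picture
slices = split_into_slices(cols * rows, 150)  # three slices for 396 macroblocks
```

Because each slice is self-contained, losing one of the three slices in this example would still leave the other two decodable.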
Flexible Macroblock Ordering (FMO)
Slice Group:
• Pattern of macroblocks defined by a macroblock allocation map
• A slice group may contain 1 to several slices
Macroblock allocation map types:
• Interleaved slices
• Dispersed macroblock allocation
• Explicitly assign a slice group to each macroblock location in raster scan order
• One or more “foreground” slice groups and a “leftover” slice group
[Figures: example macroblock allocation maps with two and three slice groups (Slice Group #0, #1, #2)]
[source: G. Sullivan, VCEG]
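To make the dispersed allocation type more concrete, here is a minimal Python sketch that builds a macroblock allocation map in raster scan order. The function name and the exact formula are assumptions modeled on the dispersed map type; the standard holds the normative definition.

```python
def dispersed_map(width_mbs, height_mbs, num_groups):
    """Return the slice group index of every macroblock in raster scan order.

    Assumed rule: shift the group pattern from row to row so that
    neighboring macroblocks tend to fall into different slice groups.
    """
    mb_map = []
    for addr in range(width_mbs * height_mbs):
        col = addr % width_mbs
        row = addr // width_mbs
        mb_map.append((col + (row * num_groups) // 2) % num_groups)
    return mb_map


# Two slice groups on a 4x2 macroblock picture give a checkerboard.
checkerboard = dispersed_map(4, 2, 2)
```

With two slice groups, every macroblock of one group is surrounded by macroblocks of the other group, which is what makes concealment possible when one group is lost.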
Interlaced Processing
Field coding: each field is coded as a separate picture using fields for motion compensation
Frame coding:
• Type 1: the complete frame is coded as a separate picture
• Type 2: the frame is scanned as macroblock pairs; for each macroblock pair, switch between frame and field coding
[Figure: macroblock pairs, each made of two vertically adjacent macroblocks, scanned in the order (0, 1), (2, 3), (4, 5), …, (36, 37), …]
[source: G. Sullivan, VCEG]
Scanning of a Macroblock
• Coded Block Pattern for luma in 8x8 block order: signals which of the 8x8 blocks contains at least one 4x4 block with non-zero transform coefficients
   0  1
   2  3
• Luma 4x4 block order for 4x4 intra prediction and 4x4 residual coding
   0  1  4  5
   2  3  6  7
   8  9 12 13
  10 11 14 15
• Intra_16x16 macroblock type only: luma 4x4 DC, shown as block -1
• Chroma 4x4 block order for 4x4 residual coding, shown as 16-25, and intra 4x4 prediction, shown as 18-21 and 22-25
  2x2 DC:  Cb 16      Cr 17
  AC:      Cb 18 19   Cr 22 23
              20 21      24 25
[source: G. Sullivan, VCEG]
H.264/AVC Coder
[Figure: coder block diagram. Blocks: Coder Control, Split into Macroblocks (16x16 pixels), Transform/Scal./Quant., Scaling & Inv. Transform, Entropy Coding, Intra-frame Estimation, Intra-frame Prediction, Motion Estimation, Motion Compensation, Deblocking Filter. Signals: Input Video Signal, Output Video Signal, Control Data, Quant. Transf. coeffs, Intra Prediction Data, Motion Data, Intra/Inter MB select.]
[source: G. Sullivan, VCEG]
Common Elements with other Standards
• Macroblocks: 16x16 luma + 2 x 8x8 chroma samples
• Input: association of luma and chroma and conventional sub-sampling of chroma (4:2:0)
• Block-wise motion compensation
• Motion vectors over picture boundaries
• Variable block-size motion
• Block transforms
• Scalar quantization
• I, P, and B coding types
[source: G. Sullivan, VCEG]
H.264 Motion Compensation Accuracy
[Figure: H.264/AVC coder block diagram]
• Motion vector accuracy 1/4 (6-tap filter)
• MB types (partition indices in parentheses): 16x16 (0), 16x8 (0, 1), 8x16 (0, 1), 8x8 (0, 1, 2, 3)
• 8x8 types (partition indices in parentheses): 8x8 (0), 8x4 (0, 1), 4x8 (0, 1), 4x4 (0, 1, 2, 3)
[source: G. Sullivan, VCEG]
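The quarter-sample accuracy mentioned above relies on interpolating the reference picture. The following Python sketch shows one-dimensional interpolation with a 6-tap filter. The kernel (1, -5, 20, 20, -5, 1)/32 and the rounded averaging for the quarter-sample position are assumptions, since the slide only states that a 6-tap filter is used; the function names are hypothetical.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # assumed 6-tap kernel, normalized by 32


def clip8(value):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel(samples, i):
    """Half-sample value between samples[i] and samples[i + 1]."""
    if i < 2 or i + 3 >= len(samples):
        raise IndexError("need two samples to the left and three to the right")
    window = samples[i - 2:i + 4]
    acc = sum(t * s for t, s in zip(TAPS, window))
    return clip8((acc + 16) >> 5)  # divide by 32 with rounding


def quarter_pel(samples, i):
    """Quarter-sample value between samples[i] and the half-sample
    position, obtained by rounded averaging (assumed rule)."""
    return (samples[i] + half_pel(samples, i) + 1) >> 1


ramp = [0, 10, 20, 30, 40, 50, 60, 70]
mid = half_pel(ramp, 3)         # position halfway between 30 and 40
quarter = quarter_pel(ramp, 3)  # position a quarter of the way from 30 to 40
```

A long kernel such as this one preserves more high-frequency detail than bilinear interpolation, at the cost of reading more reference samples per prediction.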
H.264 Multiple Reference Frames
[Figure: H.264/AVC coder block diagram]
• Multiple Reference Frames
• Generalized B Frames
• Weighted Prediction
[source: G. Sullivan, VCEG]
H.264 Intra Prediction
[Figure: H.264/AVC coder block diagram]
• Directional spatial prediction (9 types for luma, 1 chroma)
• e.g., Mode 3: diagonal down/right prediction: a, f, k, p are predicted by (A + 2Q + I + 2) >> 2
[Figure: prediction directions numbered 0 to 8]
Samples a to p of the 4x4 block and their neighbors (A to H above, I to L to the left, Q at the top-left corner):
  Q A B C D E F G H
  I a b c d
  J e f g h
  K i j k l
  L m n o p
[source: G. Sullivan, VCEG]
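The diagonal down/right example can be written out for a whole 4x4 block. In this Python sketch only the formula for a, f, k and p comes from the slide; applying the same 3-tap (1, 2, 1) smoothing along every other diagonal is an assumption, as are the function and argument names.

```python
def diagonal_down_right(above, left, corner):
    """Predict a 4x4 block along the down/right diagonal.

    above  = [A, B, C, D]  samples above the block
    left   = [I, J, K, L]  samples to the left of the block
    corner = Q             sample at the top-left corner
    """
    # Lay the neighbors out along one line: L K J I Q A B C D
    edge = list(reversed(left)) + [corner] + list(above)
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):
        for x in range(4):
            c = 4 + x - y  # index of the neighbor this diagonal points at
            pred[y][x] = (edge[c - 1] + 2 * edge[c] + edge[c + 1] + 2) >> 2
    return pred


block = diagonal_down_right(above=[10, 0, 0, 0], left=[30, 0, 0, 0], corner=20)
# The main diagonal (a, f, k, p) equals (A + 2Q + I + 2) >> 2.
main_diagonal = [block[i][i] for i in range(4)]
```

All samples on one diagonal share a single predicted value, which is why this mode suits content with edges running from the top left to the bottom right.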
H.264 4x4 Transform
[Figure: H.264/AVC coder block diagram]
• 4x4 Block Integer Transform

$$
H = \begin{bmatrix}
1 & 1 & 1 & 1 \\
2 & 1 & -1 & -2 \\
1 & -1 & -1 & 1 \\
1 & -2 & 2 & -1
\end{bmatrix}
$$

• Repeated transform of DC coeffs for 8x8 chroma and some 16x16 Intra luma blocks
[source: G. Sullivan, VCEG]
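As a small worked example, the Python sketch below applies the integer matrix H with rows (1, 1, 1, 1), (2, 1, -1, -2), (1, -1, -1, 1), (1, -2, 2, -1) to a 4x4 block. It assumes the usual separable form Y = H X Hᵀ and leaves out the normalization, which is typically folded into quantization; the helper names are hypothetical.

```python
H = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    """Transpose a 4x4 matrix."""
    return [[m[j][i] for j in range(4)] for i in range(4)]


def forward_transform(block):
    """Unnormalized 4x4 integer transform, Y = H X H^T (assumed form)."""
    return matmul(matmul(H, block), transpose(H))


flat_block = [[1] * 4 for _ in range(4)]
coeffs = forward_transform(flat_block)  # all energy lands in the DC coefficient
gram = matmul(H, transpose(H))          # diagonal: rows of H are orthogonal
```

Only additions, subtractions and shifts are needed for these integer entries, so encoder and decoder can compute exactly the same result without floating-point drift.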
Quantization of Transform Coefficients
• Scalar quantization
• Logarithmic step size control
• Smaller step size for chroma (per H.263 Annex T)
• Extended range of step sizes
• Can change to any step size at macroblock level
• Quantization reconstruction is one multiply, one add, one shift
[source: G. Sullivan, VCEG]
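Logarithmic step size control can be illustrated with a short Python sketch. It assumes that the step size doubles for every increase of 6 in the quantization parameter (QP), so that an increase of 2 raises the step size by roughly 25%. The function names and the rounding rule in `quantize` are assumptions for the example, not the integer arithmetic of the standard.

```python
def step_size_ratio(delta_qp):
    """Factor by which the step size grows when QP increases by delta_qp,
    assuming a doubling every 6 QP steps."""
    return 2.0 ** (delta_qp / 6.0)


def quantize(coeff, step):
    """Uniform scalar quantization with round-to-nearest (assumed rule)."""
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / step + 0.5)


def reconstruct(level, step):
    """Scalar reconstruction: scale the level back by the step size."""
    return level * step


ratio_plus_2 = step_size_ratio(2)  # about 1.26
level = quantize(-7.4, 2.0)        # quantized level
value = reconstruct(level, 2.0)    # reconstructed coefficient
```

A logarithmic scale gives roughly constant relative changes in bit rate per QP step over the whole range of step sizes.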
Deblocking Filter
• Improves subjective quality and PSNR of the decoded picture
• Significantly superior to post filtering
• Filtering affects the edges of the 4x4 block structure
• Adaptive filtering removes blocking artifacts, but does not unnecessarily blur the visual content

• On slice level, the global filtering strength can be adjusted to the individual characteristics of the video sequence
• On edge level, filtering strength is made dependent on inter/intra, motion, and coded residuals
• On sample level, quantizer-dependent thresholds can turn off filtering for every individual sample
• Specially strong filter for macroblocks with very flat characteristics almost removes “tiling artifacts”
[source: G. Sullivan, VCEG]
Deblocking Filter
One-dimensional visualization of an edge position
[Figure: sample values p2, p1, p0 on one side and q0, q1, q2 on the other side of a 4x4 block edge]
Filtering of p0 and q0 only takes place if:
1. |p0 - q0| < α(QP)
2. |p1 - p0| < β(QP)
3. |q1 - q0| < β(QP)
where β(QP) is considerably smaller than α(QP).
Filtering of p1 or q1 takes place if additionally:
1. |p2 - p0| < β(QP) or |q2 - q0| < β(QP)
(QP = quantization parameter)
[source: G. Sullivan, VCEG]
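The sample-level conditions can be expressed as a small decision function. In this Python sketch the thresholds α(QP) and β(QP) are passed in as plain numbers because their tables are not given here, and reading the last condition as "p1 is filtered if |p2 - p0| < β, q1 is filtered if |q2 - q0| < β" is an assumption. Only the decision is shown, not the filter itself.

```python
def deblock_decisions(p, q, alpha, beta):
    """Decide which samples across a block edge get filtered.

    p = (p0, p1, p2) and q = (q0, q1, q2) are the samples on either side
    of the edge, ordered by increasing distance from it.
    Returns (filter_p0_q0, filter_p1, filter_q1).
    """
    p0, p1, p2 = p
    q0, q1, q2 = q
    filter_edge = (
        abs(p0 - q0) < alpha    # step across the edge is small
        and abs(p1 - p0) < beta  # p side is smooth
        and abs(q1 - q0) < beta  # q side is smooth
    )
    filter_p1 = filter_edge and abs(p2 - p0) < beta
    filter_q1 = filter_edge and abs(q2 - q0) < beta
    return filter_edge, filter_p1, filter_q1


# A small step between smooth regions looks like a blocking artifact.
artifact = deblock_decisions((100, 101, 102), (104, 105, 106), alpha=10, beta=3)
# A large step is treated as a real image edge and left untouched.
real_edge = deblock_decisions((50, 50, 50), (200, 200, 200), alpha=10, beta=3)
```

Because the thresholds grow with QP, coarse quantization enables filtering across larger steps, while at fine quantization most edges are left untouched.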
Deblocking: Subjective Result for Intra
[Figure: side-by-side comparison, without filter vs. with H.264/AVC deblocking]
Highly compressed first decoded intra picture at 0.28 bit/sample
[source: G. Sullivan, VCEG]
Deblocking: Subjective Result for Inter
[Figure: side-by-side comparison, without filter vs. with H.264/AVC deblocking]
Highly compressed decoded inter picture
[source: G. Sullivan, VCEG]
Entropy coding
[Figure: H.264/AVC coder block diagram]
[source: G. Sullivan, VCEG]
Variable length coding
• Exp-Golomb code for almost all symbols except for transform coefficients
• Context adaptive VLCs for coding of transform coefficients
  • Number of coefficients is decoded
  • Special treatment of values +1 and -1
  • Contexts are built dependent on transform coefficients
[source: G. Sullivan, VCEG]
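For readers unfamiliar with the Exp-Golomb code, here is a minimal Python sketch of the unsigned variant operating on strings of "0" and "1". The function names are hypothetical, and bit strings are used only for readability; a real codec would pack bits.

```python
def ue_encode(value):
    """Unsigned Exp-Golomb code word: the binary form of value + 1,
    preceded by as many zeros as it has bits after the leading one."""
    if value < 0:
        raise ValueError("value must be non-negative")
    code = bin(value + 1)[2:]
    return "0" * (len(code) - 1) + code


def ue_decode(bits, pos=0):
    """Decode one code word starting at bits[pos].
    Returns (value, position of the next code word)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end


stream = ue_encode(0) + ue_encode(1) + ue_encode(3)
decoded = []
pos = 0
while pos < len(stream):
    value, pos = ue_decode(stream, pos)
    decoded.append(value)
```

The code needs no table: the number of leading zeros tells the decoder how many bits follow, and small values get short code words.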
Context-Adaptive Binary Arithmetic Coding (CABAC)
[Figure: CABAC block diagram with the stages Context modeling, Binarization, and an adaptive binary arithmetic coder (Probability estimation and Coding engine), with a feedback path that updates the probability estimation]
• Context modeling: chooses a model conditioned on past observations
• Binarization: maps non-binary symbols to a binary sequence
• Adaptive binary arithmetic coder: uses the provided model for the actual encoding and updates the model
[source: G. Sullivan, VCEG]
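The interplay of the three stages can be imitated with a toy model in Python. This is a loose sketch under several assumptions: unary binarization, a context chosen from the bin position and the previous symbol, an exponentially decaying probability estimate, and the ideal code length -log2(p) standing in for the arithmetic coding engine. None of these details is the table-driven design of the standard, and all names are hypothetical.

```python
import math


def unary_binarize(value):
    """Map a non-negative integer to bins: `value` ones followed by a zero."""
    return [1] * value + [0]


class AdaptiveBinaryModel:
    """Running estimate of P(bin == 1), updated after every coded bin."""

    def __init__(self, p_one=0.5, rate=0.05):
        self.p_one = p_one
        self.rate = rate

    def cost_bits(self, bin_value):
        """Ideal code length of this bin under the current estimate."""
        p = self.p_one if bin_value == 1 else 1.0 - self.p_one
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep the logarithm finite
        return -math.log2(p)

    def update(self, bin_value):
        """Move the estimate toward the bin that was just observed."""
        self.p_one += self.rate * (bin_value - self.p_one)


def estimate_bits(symbols):
    """Estimated total code length of a symbol sequence."""
    models = {}  # one adaptive model per context
    total = 0.0
    previous = 0
    for symbol in symbols:
        for index, bin_value in enumerate(unary_binarize(symbol)):
            # Context: capped bin position plus whether the last symbol was zero.
            context = (min(index, 2), previous > 0)
            model = models.setdefault(context, AdaptiveBinaryModel())
            total += model.cost_bits(bin_value)
            model.update(bin_value)
        previous = symbol
    return total


bits_for_zeros = estimate_bits([0] * 100)  # a highly predictable source
```

For a predictable source the estimated cost falls well below one bit per symbol, which a variable length code with at least one bit per symbol cannot achieve.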
S Pictures
General description
• Allows identical reconstruction of frames even when different reference frames are being used
• SP pictures use motion-compensated prediction
• SI pictures can exactly reproduce SP pictures
Applications
• Bitstream switching or splicing
• Random access
• Fast-forward, fast-backward
• Error recovery and/or resiliency
• Resynchronization such as in Video Redundancy Coding
[source: G. Sullivan, VCEG]
SP and SI Pictures
[Figure: SP/SI picture coding block diagram. Blocks: Entropy Decoding, Scaling & Inv. Transform, Transform, Scaling, Quantization, Motion Compensation, Motion Estimation, Intra-frame Prediction, Deblocking Filter. Signals: Control Data, Quant. Transf. coeffs, Motion Data, Intra/Inter, l_rec, l_pred, Output Video Signal.]
[source: G. Sullivan, VCEG]
Comparison of H.264 to MPEG-4
MPEG-4: Advanced Simple Profile (ASP)
• Motion compensation: 1/4 pel
• Global motion compensation
H.264:
• Motion compensation: 1/4 pel
• Using CABAC entropy coding
• 5 reference frames (News: 17)
Both
• Sequence structure IBBPBBP...
• QPB = QPP + 2 (step size: +25%)
• Search range: 32x32 around 16x16 predictor
• Lagrangian D+λR coder control
[source: ITU-T VCEG]
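Lagrangian coder control means that each coding decision minimizes the cost D + λR. The Python sketch below shows this for a choice between macroblock modes; the mode names and the distortion and rate figures are invented purely for illustration.

```python
def choose_mode(candidates, lam):
    """Pick the candidate with the smallest Lagrangian cost D + lam * R.

    candidates: iterable of (name, distortion, rate_in_bits) tuples.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])


# Hypothetical measurements for one macroblock.
modes = [
    ("skip", 400.0, 1),
    ("inter16x16", 120.0, 40),
    ("intra4x4", 60.0, 150),
]

best_at_low_rate = choose_mode(modes, lam=10.0)     # rate is expensive
best_at_mid_rate = choose_mode(modes, lam=1.0)
best_at_high_quality = choose_mode(modes, lam=0.1)  # rate is cheap
```

A larger λ, which goes with coarser quantization, pushes decisions toward cheap modes, while a smaller λ favors modes with low distortion.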
RD Curves: Foreman (QCIF, 10 Hz)
[Figure: rate-distortion curves, average PSNR(Y) [dB] (26 to 39) versus bit-rate [kbit/s] (0 to 128), for MPEG-4 and H.26L; annotation: >30%]
[source: ITU-T VCEG]
Performance Streaming Application
Average bit-rate savings relative to:
Coder           MPEG-4 ASP   H.263 HLP   MPEG-2
H.264/AVC MP    37.44%       47.58%      63.57%
MPEG-4 ASP      -            16.65%      42.95%
H.263 HLP       -            -           30.61%
[Wiegand, et al. 2003]
Example Streaming Test Result
[Figure: Tempete CIF 15 Hz; Y-PSNR [dB] (24 to 38) versus bit-rate [kbit/s] (0 to 1792) for MPEG-2, H.263 HLP, MPEG-4 ASP and H.264/AVC MP, with test points marked]
[Wiegand, et al. 2003]
Example Streaming Test Result
[Figure: Tempete CIF 15 Hz; rate saving relative to MPEG-2 (0% to 80%) versus Y-PSNR [dB] (26 to 38) for H.263 HLP, MPEG-4 ASP and H.264/AVC MP]
[Wiegand, et al. 2003]
Test Results for Real-Time Conversation
Average bit-rate savings relative to:
Coder           H.263 CHC   MPEG-4 SP   H.263 Base
H.264/AVC BP    27.69%      29.37%      40.59%
H.263 CHC       -           2.04%       17.63%
MPEG-4 SP       -           -           15.69%
[Wiegand, et al. 2003]
Example Real-Time Conversation Result
[Figure: Paris CIF 15 Hz; Y-PSNR [dB] (24 to 39) versus bit-rate [kbit/s] (0 to 768) for H.263 Base, H.263 CHC, MPEG-4 SP and H.264/AVC BP, with test points marked]
[Wiegand, et al. 2003]
Example Real-Time Test Result
[Figure: Paris CIF 15 Hz; rate saving relative to H.263 Baseline (0% to 50%) versus Y-PSNR [dB] (24 to 38) for H.264/AVC BP, H.263 CHC and MPEG-4 SP]
[Wiegand, et al. 2003]
Test Results Entertainment-Quality Applications
Average bit-rate savings relative to:
Coder           MPEG-2
H.264/AVC MP    45%
[Wiegand, et al. 2003]
Example Entertainment-Quality Applications Result
[Figure: Entertainment SD (720x576i) 25 Hz; Y-PSNR [dB] (24 to 39) versus bit-rate [Mbit/s] (0 to 10) for MPEG-2 and H.264/AVC MP]
[Wiegand, et al. 2003]
Example Entertainment-Quality Applications Result
[Figure: Entertainment SD (720x576i) 25 Hz; rate saving relative to MPEG-2 (0% to 60%) versus Y-PSNR [dB] (26 to 38) for H.264/AVC MP]
[Wiegand, et al. 2003]
Further reading
IEEE Transactions on Circuits and Systems for Video Technology, Special Issue on the H.264/AVC Video Coding Standard, July 2003.