ENM172 Multimedia Coding School of Engineering

Dr. Yafan Zhao

H.264 Other coding techniques

Note: The following notes are extracted from a new H.264 book written by Iain Richardson. Copyright belongs to Iain Richardson, 2009.

Acknowledgement: I would like to express my appreciation to Iain Richardson for kindly providing the following text as lecture notes.

1. Introduction

This document describes a number of coding techniques for an H.264 encoder: rate control, mode selection and the deblocking filter.

The number of bits produced for each macroblock (MB) and each frame is not constant, because the characteristics of the sequence vary. This causes problems when the coded sequence is transmitted over a constant bitrate channel (that is, a channel whose available bandwidth is fixed). A rate control algorithm is required to control the bitrate of the coded sequence.

H.264 supports various MB modes, and an encoder has to choose the right mode for each MB in order to achieve the best coding performance. Rate-distortion optimisation (RDO) is used in the mode selection process to find the best trade-off between high video quality and low bitrate.

H.264 is a block-based video coding standard, and artefacts such as blockiness can appear in the compressed video. A deblocking filter is therefore applied to every decoded macroblock in order to reduce blocking distortion.

2. Rate Control

The number of bits produced when an encoder codes a macroblock is not constant. For example, Figure 0-1 plots the number of bits per macroblock in a frame of Foreman coded as a P slice. Lighter blocks are MBs with more bits; darker blocks contain fewer bits. Typically, more bits are required to code MBs that contain significant movement and/or detail, since these contain non-zero motion vector differences and non-zero transform coefficients.

Figure 0-1 Frame from "Foreman" sequence showing macroblock sizes


In a similar way, the number of bits per coded frame is not constant. If all encoding parameters are kept constant, variations in motion and detail cause the bitrate to vary. Practical applications of H.264/AVC require a constant bitrate output, or at least a constrained bitrate output. Some examples are listed in Table 0-1.

Table 0-1 Bitrate and delay constraints

Application                                  Bitrate and delay constraints
Video broadcast over fixed bitrate channel   Constant bitrate, medium delay
IP video streaming                           Variable bitrate (within limits), medium delay
IP videoconferencing                         Variable bitrate (within limits), low delay
DVD recording                                Variable bitrate (within limits), medium delay, fixed maximum file size

Controlling the output bitrate is typically achieved by measuring the rate and/or the encoder buffer fullness level and feeding this back to control the encoder (Figure 0-2). Many of the encoder parameters can affect output bitrate (e.g. type of slice, motion search range, mode selection algorithm) but the most useful parameter for bitrate control is the Quantizer Parameter (QP).

Figure 0-2 Encoder with rate feedback

One way of controlling bitrate is simply to try to enforce a constant number of bits per coded frame, by measuring the output bitrate and feeding it back to control QP. Increasing QP will tend to reduce coded bitrate and decreasing QP will increase coded bitrate. However, this approach is problematic because (i) it does not take into account the fact that coded I, P and B slices generate significantly different numbers of bits and (ii) it will tend to lead to unpleasant variations in image quality as the encoder increases or decreases QP rapidly to try to maintain bitrate.
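The naive per-frame scheme described above can be sketched as a simple feedback loop. This is a minimal illustration only, not the algorithm used by any real encoder: the function `toy_frame_bits`, the complexity values and the "step QP by 1" rule are all hypothetical. The only codec fact relied on is that the H.264 quantizer step size doubles for every increase of 6 in QP; treating coded bits as inversely proportional to the step size is a rough simplifying assumption.

```python
def toy_frame_bits(complexity, qp):
    """Hypothetical encoder model: bits fall as the quantizer step grows.

    In H.264 the quantizer step size doubles for every increase of 6 in QP.
    Assuming bits are inversely proportional to step size is a simplification.
    """
    step = 2 ** ((qp - 4) / 6)
    return complexity / step


def naive_rate_control(complexities, target_bits, qp=30, qp_min=0, qp_max=51):
    """Raise QP by 1 after an oversized frame, lower it by 1 after an undersized one."""
    history = []
    for complexity in complexities:
        bits = toy_frame_bits(complexity, qp)
        history.append((qp, bits))
        if bits > target_bits:
            qp = min(qp + 1, qp_max)
        elif bits < target_bits:
            qp = max(qp - 1, qp_min)
    return history


# A quiet scene followed by a busy one (complexity values are made up).
frames = [40_000] * 10 + [120_000] * 10
for qp, bits in naive_rate_control(frames, target_bits=2_600):
    print(f"QP={qp:2d}  bits={bits:8.0f}")
```

Running the sketch shows QP hunting up and down around the target in the quiet scene and then climbing steadily once the busy scene starts, which is the kind of quality variation that point (ii) warns about.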

A more flexible approach is outlined in Figure 0-3. The available channel bitrate (in bits per second) is used to determine a target number of bits for a Group of Pictures (GOP), typically an I slice followed by a number of P and/or B slices. The bits available for the GOP are then allocated to I, P and B slices, with the allocation changing depending on the slice type. An I slice would typically be allocated most bits (because intra prediction tends to be less efficient than inter prediction), followed by P slices and then B slices. Within each slice, a certain number of bits are allocated to each macroblock. The rate control algorithm then attempts to control the encoder to produce the target number of bits.
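The GOP-level allocation described above can be sketched as follows. This is an illustration under stated assumptions rather than a specific rate control algorithm: the function names, the channel rate, the GOP pattern and the relative weights for I, P and B slices are all hypothetical, chosen only so that I slices receive the most bits, followed by P and then B.

```python
def gop_target_bits(channel_bps, frame_rate, gop_length):
    """Bits available for one GOP at the given channel rate."""
    return channel_bps * gop_length / frame_rate


def allocate_gop_bits(gop_bits, slice_types, weights):
    """Split the GOP budget between frames in proportion to a per-type weight."""
    total_weight = sum(weights[t] for t in slice_types)
    return [gop_bits * weights[t] / total_weight for t in slice_types]


# Hypothetical relative weights: I slices get most bits, then P, then B.
weights = {"I": 8.0, "P": 2.0, "B": 1.0}
gop = ["I", "B", "B", "P", "B", "B", "P", "B", "B", "P"]

budget = gop_target_bits(channel_bps=256_000, frame_rate=25, gop_length=len(gop))
targets = allocate_gop_bits(budget, gop, weights)
for slice_type, bits in zip(gop, targets):
    print(slice_type, round(bits))
```

A practical controller would update the weights from the bits actually produced by earlier slices and would go on to divide each slice target between macroblocks; both steps are omitted here.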


Figure 0-3 Bitrate allocation for rate control

Example: Foreman, QCIF, 100 frames

100 frames of the Foreman QCIF sequence were encoded at a frame rate of 10 frames per second using Baseline profile, coded as one I-slice followed by P-slices with a target bit rate of 26 kbps. Figure 0-4 shows the coded bitrate. After the first (I) slice, the encoder maintains a roughly constant number of bits per frame.

Foreman is a ten-second clip that contains a relatively high amount of motion, particularly in

the last 2-3 seconds. Figure 0-5 plots the variation of QP throughout the sequence. The large

variation, particularly in the final seconds, is necessary in order to compensate for the changing

motion and detail. This variation in QP leads to a variation in per-frame quality, measured as

PSNR (Y) in Figure 0-6. As the QP increases, PSNR decreases and vice versa.

This example illustrates the classic trade-off of video codec rate control: a constant or near-constant bitrate is typically achieved at the expense of varying decoded quality.
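The luma PSNR plotted in Figure 0-6 is not defined in these notes. Assuming the usual definition for 8-bit samples, PSNR = 10 log10(255^2 / MSE), it can be computed as in the sketch below. The sample values are made up for illustration and do not come from the Foreman sequence.

```python
import math


def psnr(original, decoded, max_value=255):
    """PSNR in dB between two equal-length sequences of sample values."""
    if len(original) != len(decoded):
        raise ValueError("inputs must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)


original = [52, 55, 61, 66, 70, 61, 64, 73]
mild = [53, 55, 60, 66, 71, 61, 64, 72]      # small errors, as with a low QP
coarse = [48, 60, 56, 70, 64, 66, 60, 78]    # larger errors, as with a high QP
print(round(psnr(original, mild), 2), round(psnr(original, coarse), 2))
```

The larger the reconstruction error, the lower the PSNR, which is why PSNR falls as QP rises.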


Figure 0-4 Foreman, QCIF, 100 frames: coded bitrate

Figure 0-5 Foreman, QCIF, 100 frames: QP per frame


Figure 0-6 Foreman, QCIF, 100 frames: Luma PSNR per frame

3. Mode Selection

An H.264/AVC encoder can choose from many different options or modes when it codes a macroblock. Figure 0-7 shows the main prediction choices for a macroblock. These include:

Skip mode (don't send any information for this macroblock)

Four intra-16x16 modes

Nine intra-4x4 modes, with a different choice possible for each 4x4 block

16x16 inter mode with prediction from reference picture(s) from one (P, B MB) or two (B MB) lists

8x16 inter mode (prediction from multiple reference pictures as above, with the option of different reference picture(s) for each partition)

16x8 inter mode (as above)

8x8 inter mode (as above), with further sub-division of each 8x8 partition into 8x4, 4x8 or 4x4 sub-macroblock partitions

As well as the choice of prediction mode, the encoder can choose to change the quantization parameter (QP); within each inter mode the encoder has a wide choice of possible motion vectors; and so on. There are a huge number of options for coding each macroblock. Each coding mode (i.e. each combination of coding parameters) will tend to generate a different number of coded bits, ranging from very low (P-Skip or B-Skip) to high (Intra), and a different distortion or (conversely) reconstructed quality.
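To give a feel for the size of the search space, the short calculation below counts two of the choices listed above. The figures are upper bounds that ignore restrictions such as unavailable neighbours at picture edges, and the constant names are illustrative only.

```python
# Upper bounds on the number of prediction choices for one macroblock,
# counting only the options listed above (motion vectors, reference
# pictures and QP changes would multiply these further).
INTRA_4X4_MODES = 9
BLOCKS_4X4_PER_MB = 16          # a 16x16 luma macroblock holds 16 4x4 blocks
SUB_PARTITION_SHAPES = 4        # 8x8, 8x4, 4x8 or 4x4 for each 8x8 partition
PARTITIONS_8X8_PER_MB = 4

intra_4x4_combinations = INTRA_4X4_MODES ** BLOCKS_4X4_PER_MB
inter_8x8_layouts = SUB_PARTITION_SHAPES ** PARTITIONS_8X8_PER_MB

print(intra_4x4_combinations)   # 1853020188851841
print(inter_8x8_layouts)        # 256
```

Even before motion vectors are considered, exhaustively testing every combination is clearly impractical, which motivates the mode selection strategies discussed next.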


A video encoder aims to minimize coded bitrate and maximise decoded quality (or minimize decoded distortion). However, choosing the coding mode of a macroblock to achieve this is a difficult problem, because of (a) the huge number of possible combinations of encoding parameters and (b) the question of deciding the best trade-off between minimizing bitrate and minimizing distortion. Rate-distortion optimisation (RDO) is used to select the mode that gives the best trade-off between video quality and bitrate. RDO is carried out for each MB.
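The notes do not give the cost function used for RDO. A common formulation, assumed here, is to pick the mode that minimises the Lagrangian cost J = D + lambda * R, where D is the distortion, R is the rate in bits and lambda sets the trade-off between them. The sketch below applies that rule to made-up distortion and rate figures; the function name `best_mode`, the candidate values and the lambda values are hypothetical.

```python
def best_mode(candidates, lagrange_multiplier):
    """Return the candidate with the lowest cost J = D + lambda * R.

    `candidates` maps a mode name to (distortion, rate_in_bits).
    """
    def cost(item):
        _, (distortion, rate) = item
        return distortion + lagrange_multiplier * rate

    name, _ = min(candidates.items(), key=cost)
    return name


# Made-up distortion (sum of squared error) and rate figures for one macroblock.
candidates = {
    "skip": (5200, 1),
    "inter_16x16": (2100, 38),
    "inter_8x8": (1500, 95),
    "intra_4x4": (1200, 210),
}

for lam in (2, 20, 200):
    print(lam, best_mode(candidates, lam))
```

With a small lambda the low-distortion but expensive mode wins; as lambda grows, cheaper modes are preferred, ending with skip. In an encoder, lambda would normally be tied to QP.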

The JM reference software implements a number of RDO methods, including low complexity RDO, high complexity RDO and fast high complexity RDO. In high complexity RDO, the actual distortion and bitrate of each candidate mode of an MB are calculated by carrying out the decoding process for that mode, and these values are used in the mode selection process to choose the best mode for the MB. This method is very accurate because the actual bitrate and distortion are used in the selection process, but it increases the complexity of the encoder dramatically. Low complexity RDO, by contrast, uses an estimate of the bitrate and distortion of each mode rather than the actual values. This saves the significant time needed to decode each MB coded in the various modes, but coding performance decreases too. Fast high complexity RDO is based on high complexity RDO and uses additional techniques to predict the best mode for each MB early. It aims to reduce the complexity of high complexity RDO with very small or negligible loss of R-D performance.


Figure 0-7 Available macroblock prediction modes

4. Deblocking Filter

In H.264, a filter is applied to every decoded macroblock in order to reduce blocking distortion. The deblocking filter is applied after the inverse transform in the encoder (before reconstructing and storing the macroblock for future predictions) and in the decoder (before reconstructing and displaying the macroblock). The filter has two benefits: (1) block edges are smoothed, improving the appearance of decoded images (particularly at higher compression ratios) and (2) the filtered macroblock is used for motion-compensated prediction of further frames in the encoder, resulting in a smaller residual after prediction. (Note: intra-coded macroblocks are filtered, but intra prediction is carried out using unfiltered reconstructed macroblocks to form the prediction.)

Picture edges are not filtered.

Filtering is applied to vertical or horizontal edges of 4x4 blocks in a macroblock, in the following

order:

1. Filter 4 vertical boundaries of the luma component (in order a, b, c, d in Figure 0-8)

2. Filter 4 horizontal boundaries of the luma component (in order e, f, g, h in Figure 0-8)

3. Filter 2 vertical boundaries of each chroma component (i, j)

4. Filter 2 horizontal boundaries of each chroma component (k, l)
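The filtering order listed above can be written out as a short enumeration. This is a sketch under stated assumptions: the function name and the tuple layout are illustrative, the chroma planes are assumed to be 8x8 (4:2:0 sampling, as in Figure 0-8), and boundaries are assumed to lie every 4 samples starting at the macroblock edge.

```python
def edge_filter_order(luma_size=16, chroma_size=8, block=4):
    """List (plane, direction, offset) in the filtering order described above.

    Offsets are sample positions inside the macroblock. The chroma planes
    are assumed to be 8x8 (4:2:0 sampling), giving two boundaries each way.
    """
    order = []
    # Steps 1 and 2: luma vertical boundaries, then luma horizontal boundaries.
    for direction in ("vertical", "horizontal"):
        for offset in range(0, luma_size, block):
            order.append(("luma", direction, offset))
    # Steps 3 and 4: chroma vertical boundaries, then chroma horizontal ones.
    for direction in ("vertical", "horizontal"):
        for plane in ("Cb", "Cr"):
            for offset in range(0, chroma_size, block):
                order.append((plane, direction, offset))
    return order


for entry in edge_filter_order():
    print(entry)
```

The enumeration yields 8 luma boundaries followed by 8 chroma boundaries (4 per chroma component) for each macroblock.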

Each filtering operation affects up to three pixels on either side of the boundary. Figure 0-9 shows 4 pixels on either side of a vertical or horizontal boundary in adjacent blocks p and q (p0, p1, p2, p3 and q0, q1, q2, q3). Depending on the current quantizer, the coding modes of neighbouring blocks and the gradient of image samples across the boundary, several outcomes are possible, ranging from (a) no pixels are filtered to (b) p0, p1, p2, q0, q1, q2 are filtered to produce output pixels P0, P1, P2, Q0, Q1 and Q2.
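The notes do not spell out the decision rule. In the H.264 standard it takes the form sketched below: a boundary strength (derived from the coding modes of the neighbouring blocks) must be non-zero, and the sample differences must fall below two thresholds, alpha and beta, that the standard looks up from QP. The function name is illustrative and the threshold values in the usage lines are placeholders, not values from the standard's tables.

```python
def edge_is_filtered(p1, p0, q0, q1, boundary_strength, alpha, beta):
    """Decide whether the samples across one boundary position are filtered.

    The step across the boundary must be below alpha and the gradient on each
    side below beta, so that large steps are treated as real image edges.
    """
    if boundary_strength == 0:
        return False
    return (
        abs(p0 - q0) < alpha
        and abs(p1 - p0) < beta
        and abs(q1 - q0) < beta
    )


# Placeholder thresholds; in the standard they increase with QP.
print(edge_is_filtered(60, 62, 66, 67, boundary_strength=2, alpha=10, beta=4))
print(edge_is_filtered(60, 62, 140, 141, boundary_strength=2, alpha=10, beta=4))
```

The first call describes a small step in a smooth region, which is likely to be a blocking artefact and is filtered. The second describes a large step, which is treated as genuine picture content and left alone.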

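[Figure: boundary filtering in a 16x16 luma block (vertical boundaries a, b, c, d; horizontal boundaries e, f, g, h) and an 8x8 chroma block (vertical boundaries i, j; horizontal boundaries k, l)]

Figure 0-8 Edge filtering order in a macroblock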

[Figure: samples p3, p2, p1, p0 and q0, q1, q2, q3 on either side of a vertical boundary and of a horizontal boundary]

Figure 0-9 Pixels adjacent to vertical and horizontal boundaries

Filtering example

A video clip is encoded using the AVC reference software with a fixed Quantization Parameter of 36 (relatively high quantization). Figure 0-10 shows an original frame from the clip; Figure 0-11 shows the same frame after inter coding and reconstruction, with the loop filter disabled. Note the obvious blocking artefacts; note also the effect of varying motion-compensation block sizes (for example, 16x16 blocks in the background to the left of the picture; 4x4 blocks around the arm).


With the loop filter enabled (Figure 0-12) the appearance is considerably better; there is still some obvious distortion but most of the block edges have disappeared or faded. Note that sharp contrast boundaries (such as the line of the arm against the dark piano) are preserved by the filter, whilst block edges in smoother regions of the picture (such as the background to the left) are smoothed.

In this example the loop filter makes only a small contribution to compression efficiency: the

encoded bitrate is around 1.5% smaller and the PSNR around 1% larger for the sequence with the

filter. However, the subjective quality of the filtered sequence is significantly better. The coding

performance gain provided by the filter depends on the bitrate and sequence content.

Figure 0-10 Original frame (violin frame 2)

Figure 0-11 Reconstructed, QP=36 (no filter)


Figure 0-12 Reconstructed, QP=36 (with filter)
