enm172 h264 other coding techniques

Upload: susheel-kumar

Post on 09-Apr-2018


  • 8/8/2019 Enm172 H264 Other Coding Techniques

    1/10

    ENM172 Multimedia Coding School of Engineering

    Dr. Yafan Zhao [email protected] Page 1 of 10

    H.264 Other coding techniques

    Note: The following notes are extracted from a new H.264 book written by Iain Richardson. Copyright belongs to Iain Richardson, 2009.

    Acknowledgement: I would like to express my appreciation to Iain Richardson for kindly providing the following text as lecture notes.

    1. Introduction

    This document describes a number of coding techniques used in H.264 encoding.

    The number of bits produced for each macroblock (MB) and each frame is not constant, because the characteristics of the sequence vary. This causes problems when the coded sequence is transmitted through a constant bitrate channel (that is, a channel whose available bandwidth is fixed). A rate control algorithm is required to control the bitrate of the coded sequence.

    H.264 supports various MB modes, and an encoder has to choose the right mode for each MB in order to achieve the best coding performance. Rate-distortion optimisation (RDO) is used in the mode selection process to find the best trade-off between maximising video quality and minimising bitrate.

    H.264 is a block-based video coding standard, and artefacts such as blockiness can appear in the compressed video. A deblocking filter is therefore applied to every decoded macroblock in order to reduce blocking distortion.

    2. Rate Control

    The number of bits produced when an encoder codes a macroblock is not constant. For example, Figure 0-1 plots the number of bits per macroblock in a frame of Foreman coded as a P slice. Lighter blocks are MBs with more bits; darker blocks contain fewer bits. Typically, more bits are required to code MBs that contain significant movement and/or detail, since these contain non-zero motion vector differences and non-zero transform coefficients.

    Figure 0-1 Frame from "Foreman" sequence showing macroblock sizes

    In a similar way, the number of bits per coded frame is not constant. If all encoding parameters are kept constant, variations in motion and detail cause the bitrate to vary. Practical applications of H.264/AVC require a constant bitrate output, or at least a constrained bitrate output. Some examples are listed in Table 0-1.

    Table 0-1 Bitrate and delay constraints

    Application                                  Bitrate and delay constraints
    -------------------------------------------  -----------------------------------------------------------------------
    Video broadcast over fixed bitrate channel   Constant bitrate, medium delay
    IP video streaming                           Variable bitrate (within limits), medium delay
    IP videoconferencing                         Variable bitrate (within limits), low delay
    DVD recording                                Variable bitrate (within limits), medium delay, fixed maximum file size

    Controlling the output bitrate is typically achieved by measuring the rate and/or the encoder buffer fullness level and feeding this back to control the encoder (Figure 0-2). Many of the encoder parameters can affect output bitrate (e.g. type of slice, motion search range, mode selection algorithm) but the most useful parameter for bitrate control is the Quantizer Parameter (QP).

    Figure 0-2 Encoder with rate feedback
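QP is such an effective control because it sets the quantizer step size. The notes do not give the mapping, so the sketch below is an addition: in H.264 the step size (Qstep) takes six base values for QP 0 to 5 and doubles for every increase of 6 in QP. The function name `qstep` is hypothetical.

```python
# Quantizer step sizes for QP 0..5. For higher QP the same pattern repeats,
# doubled for every increase of 6 in QP.
BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp: int) -> float:
    """Return the quantizer step size for an H.264 QP in the range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return BASE_QSTEP[qp % 6] * (2 ** (qp // 6))


if __name__ == "__main__":
    for qp in (4, 10, 36, 51):
        print(qp, qstep(qp))
```

A larger step size discards more of each transform coefficient, so fewer bits are needed to code the residual, at the cost of higher distortion.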

    One way of controlling bitrate is simply to try and enforce a constant number of bits per coded frame, by measuring the output bitrate and feeding it back to control QP. Increasing QP will tend to reduce coded bitrate and decreasing QP will increase coded bitrate. However, this approach is problematic because (i) it does not take into account the fact that coded I, P and B slices generate significantly different numbers of bits and (ii) it will tend to lead to unpleasant variations in image quality as the encoder increases or decreases QP rapidly to try and maintain bitrate.
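The simple feedback scheme described above can be sketched as follows. This is a toy illustration, not an encoder: `toy_frame_bits` is a hypothetical rate model that assumes coded bits halve for every increase of 6 in QP, and the tolerance, QP step and starting QP are arbitrary choices.

```python
def toy_frame_bits(complexity: float, qp: int) -> float:
    """Hypothetical rate model (an assumption): bits halve for every +6 in QP."""
    return complexity * 2.0 ** (-qp / 6.0)


def constant_bits_feedback(complexities, target_bits, qp_start=30,
                           tolerance=0.1, qp_step=2, qp_min=0, qp_max=51):
    """Code each frame at the current QP, then adjust QP from the measured bits.

    Returns a list of (qp, bits) pairs, one per frame.
    """
    qp = qp_start
    history = []
    for complexity in complexities:
        bits = toy_frame_bits(complexity, qp)
        history.append((qp, bits))
        if bits > target_bits * (1.0 + tolerance):
            qp = min(qp_max, qp + qp_step)   # too many bits: quantize more coarsely
        elif bits < target_bits * (1.0 - tolerance):
            qp = max(qp_min, qp - qp_step)   # too few bits: quantize more finely
    return history


if __name__ == "__main__":
    # Three "easy" frames followed by six frames with twice the complexity.
    frames = [83200.0] * 3 + [166400.0] * 6
    for qp, bits in constant_bits_feedback(frames, target_bits=2600.0):
        print(qp, round(bits))
```

When the frame complexity doubles, the loop takes several frames to climb to a QP that restores the target, and quality changes with every QP step. This is the rapid QP variation that the text identifies as a weakness of the approach.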

    A more flexible approach is outlined in Figure 0-3. The available channel bitrate (in bits per second) is used to determine a target number of bits for a Group of Pictures (GOP), typically an I slice followed by a number of P and/or B slices. The bits available for the GOP are then allocated to I, P and B slices, with the allocation changing depending on the slice type. An I slice would typically be allocated most bits (because intra prediction tends to be less efficient than inter prediction), followed by P slices and then B slices. Within each slice, a certain number of bits are allocated to each macroblock. The rate control algorithm then attempts to control the encoder to produce the target number of bits.
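The first two allocation steps (channel bitrate to GOP target, GOP target to per-slice targets) can be sketched as below. The relative weights for I, P and B slices are hypothetical values chosen for illustration; a practical rate controller would adapt them to the content and to the bits actually produced.

```python
def gop_target_bits(channel_bps: float, frame_rate: float, gop_length: int) -> float:
    """Bits available for one GOP: channel rate multiplied by the GOP duration."""
    return channel_bps * gop_length / frame_rate


def allocate_gop_bits(gop_bits: float, slice_types: str, weights=None):
    """Split the GOP budget between slices in proportion to per-type weights."""
    if weights is None:
        # Hypothetical weights (an assumption): I slices get most, B slices least.
        weights = {"I": 8.0, "P": 2.0, "B": 1.0}
    total_weight = sum(weights[t] for t in slice_types)
    return [gop_bits * weights[t] / total_weight for t in slice_types]


if __name__ == "__main__":
    # Illustrative numbers: 26 kbps channel, 10 frames per second, 10-frame GOP.
    budget = gop_target_bits(26000, 10, 10)
    for slice_type, bits in zip("IPPPPPPPPP", allocate_gop_bits(budget, "IPPPPPPPPP")):
        print(slice_type, round(bits))
```

With these numbers the GOP budget is 26000 bits, of which the I slice receives 8000 and each P slice 2000. The same proportional idea can be applied again within a slice to set per-macroblock targets.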

    Figure 0-3 Bitrate allocation for rate control

    Example: Foreman, QCIF, 100 frames

    100 frames of the Foreman QCIF sequence were encoded at a frame rate of 10 frames per second using Baseline profile, coded as one I-slice followed by P-slices with a target bit rate of 26 kbps. Figure 0-4 shows the coded bitrate. After the first (I) slice, the encoder maintains a roughly constant number of bits per frame.

    Foreman is a ten-second clip that contains a relatively high amount of motion, particularly in the last 2-3 seconds. Figure 0-5 plots the variation of QP throughout the sequence. The large variation, particularly in the final seconds, is necessary in order to compensate for the changing motion and detail. This variation in QP leads to a variation in per-frame quality, measured as PSNR (Y) in Figure 0-6. As the QP increases, PSNR decreases and vice versa.

    This example illustrates the classic trade-off of video codec rate control: a constant or near-constant bitrate is typically achieved at the expense of varying decoded quality.
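The per-frame quality measure used in this example, luma PSNR, can be computed as in the sketch below. It assumes 8-bit samples (peak value 255) supplied as flat sequences of luma values; the function name `luma_psnr` is hypothetical.

```python
import math


def luma_psnr(original, decoded, peak=255):
    """PSNR in dB between two equal-length sequences of luma samples."""
    if len(original) != len(decoded) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the original and decoded samples.
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    original = [100, 102, 104, 106]
    print(luma_psnr(original, [101, 103, 105, 107]))  # small error, higher PSNR
    print(luma_psnr(original, [104, 106, 108, 110]))  # larger error, lower PSNR
```

A higher QP produces a larger reconstruction error and hence a lower PSNR, which is the inverse relationship between the QP and PSNR plots described above.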

    Figure 0-4 Foreman, QCIF, 100 frames: coded bitrate

    Figure 0-5 Foreman, QCIF, 100 frames: QP per frame

    Figure 0-6 Foreman, QCIF, 100 frames: Luma PSNR per frame

    3. Mode selection

    An H.264/AVC encoder can choose from many different options or modes when it codes a macroblock. Figure 0-7 shows the main prediction choices for a macroblock. These include:

    - Skip mode (don't send any information for this macroblock)
    - Four intra-16x16 modes
    - Nine intra-4x4 modes, with a different choice possible for each 4x4 block
    - 16x16 inter mode with prediction from reference picture(s) from one list (P or B MB) or two lists (B MB)
    - 8x16 inter mode (prediction from multiple reference pictures as above, with the option of different reference picture(s) for each partition)
    - 16x8 inter mode (as above)
    - 8x8 inter mode (as above), with further sub-division of each 8x8 partition into 8x4, 4x8 or 4x4 sub-macroblock partitions

    As well as the choice of prediction mode, the encoder can choose to change the quantization parameter (QP); within each inter mode the encoder has a wide choice of possible motion vectors; and so on. There are a huge number of options for coding each macroblock. Each coding mode (i.e. each combination of coding parameters) will tend to generate a different number of coded bits, ranging from very low (P-Skip or B-Skip) to high (Intra), and a different distortion or (conversely) reconstructed quality.

    A video encoder aims to minimize coded bitrate and maximise decoded quality (or minimize decoded distortion). However, choosing the coding mode of a macroblock to achieve this is a difficult problem, because of (a) the huge number of possible combinations of encoding parameters and (b) the question of deciding the best trade-off between minimizing bitrate and minimizing distortion. Rate-distortion optimisation (RDO) is used to select the mode that offers the best trade-off between video quality and bitrate. RDO is carried out for each MB.
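The notes do not state the cost function used for the trade-off. One widely used formulation, assumed here, is a Lagrangian cost J = D + lambda * R, where D is the distortion of a candidate mode, R its rate in bits and lambda a multiplier that sets the exchange rate between the two. The mode names and numbers below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModeResult:
    name: str
    distortion: float  # e.g. sum of squared differences for the macroblock
    rate_bits: float   # bits needed to code the macroblock in this mode


def rd_cost(mode: ModeResult, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed formulation)."""
    return mode.distortion + lam * mode.rate_bits


def select_mode(candidates, lam: float) -> ModeResult:
    """Pick the candidate with the lowest rate-distortion cost."""
    return min(candidates, key=lambda m: rd_cost(m, lam))


if __name__ == "__main__":
    # Hypothetical measurements for one macroblock.
    candidates = [
        ModeResult("skip", distortion=900.0, rate_bits=1.0),
        ModeResult("inter16x16", distortion=400.0, rate_bits=40.0),
        ModeResult("intra4x4", distortion=150.0, rate_bits=220.0),
    ]
    for lam in (0.5, 5.0, 50.0):
        print(lam, select_mode(candidates, lam).name)
```

A small lambda favours low distortion (the expensive intra mode wins), while a large lambda favours low rate (skip wins). Encoders typically tie lambda to QP so that mode decisions stay consistent with the quantizer setting.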

    In the JM reference software, a number of RDO methods have been implemented, including low complexity RDO, high complexity RDO and fast high complexity RDO. In high complexity RDO, the actual distortion and bitrate of each candidate mode of an MB are calculated by coding and decoding the MB in that mode, and these values are used in the mode selection process to choose the best mode for the MB. This method is very accurate because the actual bitrate and distortion are used, but it increases the complexity of the encoder dramatically. Low complexity RDO, by contrast, uses an estimate of the bitrate and distortion of each mode rather than the actual values. This saves the significant time otherwise spent coding and decoding each MB in every mode, but the coding performance decreases too. Fast high complexity RDO is based on high complexity RDO and uses additional techniques to predict the best mode for each MB early. It aims to reduce the complexity of high complexity RDO with very small or negligible loss of R-D performance.
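The idea of predicting the best mode early can be illustrated with a simple early-termination loop. This is a generic sketch and not the algorithm implemented in the JM software: the threshold `early_exit_cost`, the candidate order and the cost values are all hypothetical.

```python
def select_mode_fast(mode_names, evaluate_cost, early_exit_cost):
    """Evaluate modes in order and stop once the best cost is good enough.

    evaluate_cost(name) stands in for the expensive step (coding and decoding
    the macroblock in that mode). Returns (best_name, best_cost, evaluated).
    """
    best_name, best_cost, evaluated = None, float("inf"), 0
    for name in mode_names:
        cost = evaluate_cost(name)
        evaluated += 1
        if cost < best_cost:
            best_name, best_cost = name, cost
        if best_cost <= early_exit_cost:
            break  # good enough: skip the remaining, more expensive modes
    return best_name, best_cost, evaluated


if __name__ == "__main__":
    # Hypothetical rate-distortion costs for one macroblock.
    costs = {"skip": 950.0, "inter16x16": 600.0, "inter8x8": 640.0, "intra4x4": 1250.0}
    order = ["skip", "inter16x16", "inter8x8", "intra4x4"]
    print(select_mode_fast(order, costs.__getitem__, early_exit_cost=700.0))
    print(select_mode_fast(order, costs.__getitem__, early_exit_cost=0.0))
```

With the threshold set to 700 only two of the four modes are evaluated and the same mode is chosen as in the exhaustive search. A threshold that is too generous would stop on a poor mode, which is the source of the small R-D loss mentioned above.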

    Figure 0-7 Available macroblock prediction modes

    4. Deblocking Filter

    In H.264, a filter is applied to every decoded macroblock in order to reduce blocking distortion. The deblocking filter is applied after the inverse transform in the encoder (before reconstructing and storing the macroblock for future predictions) and in the decoder (before reconstructing and displaying the macroblock). The filter has two benefits: (1) block edges are smoothed, improving the appearance of decoded images (particularly at higher compression ratios) and (2) the filtered macroblock is used for motion-compensated prediction of further frames in the encoder, resulting in a smaller residual after prediction. (Note: intra-coded macroblocks are filtered, but intra prediction is carried out using unfiltered reconstructed macroblocks to form the prediction.)

    Picture edges are not filtered.

    Filtering is applied to vertical or horizontal edges of 4x4 blocks in a macroblock, in the following order:

    1. Filter 4 vertical boundaries of the luma component (in order a, b, c, d in Figure 0-8)
    2. Filter 4 horizontal boundaries of the luma component (in order e, f, g, h in Figure 0-8)
    3. Filter 2 vertical boundaries of each chroma component (i, j)
    4. Filter 2 horizontal boundaries of each chroma component (k, l)
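The four steps above can be written out as an explicit edge list. The sketch assumes 4:2:0 sampling (one 16x16 luma block and two 8x8 chroma blocks per macroblock) with edges on a 4-sample grid, so offset 0 is the macroblock boundary shared with the neighbouring macroblock. The tuple layout and the component names "Cb" and "Cr" are choices made for this illustration.

```python
def macroblock_edge_order():
    """Return the edges of one macroblock in filtering order.

    Each entry is (component, direction, offset, label), where offset is the
    sample position of the edge inside the block and label is the letter used
    in Figure 0-8.
    """
    order = []
    # Step 1: four vertical luma boundaries (a, b, c, d).
    for label, x in zip("abcd", (0, 4, 8, 12)):
        order.append(("luma", "vertical", x, label))
    # Step 2: four horizontal luma boundaries (e, f, g, h).
    for label, y in zip("efgh", (0, 4, 8, 12)):
        order.append(("luma", "horizontal", y, label))
    # Step 3: two vertical boundaries of each chroma component (i, j).
    for component in ("Cb", "Cr"):
        for label, x in zip("ij", (0, 4)):
            order.append((component, "vertical", x, label))
    # Step 4: two horizontal boundaries of each chroma component (k, l).
    for component in ("Cb", "Cr"):
        for label, y in zip("kl", (0, 4)):
            order.append((component, "horizontal", y, label))
    return order


if __name__ == "__main__":
    for edge in macroblock_edge_order():
        print(edge)
```

Filtering all vertical edges before the horizontal ones matters because the horizontal pass operates on samples that the vertical pass may already have modified.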

    Each filtering operation affects up to three pixels on either side of the boundary. Figure 0-9 shows 4 pixels on either side of a vertical or horizontal boundary in adjacent blocks p and q (p0, p1, p2, p3 and q0, q1, q2, q3). Depending on the current quantizer, the coding modes of neighbouring blocks and the gradient of image samples across the boundary, several outcomes are possible, ranging from (a) no pixels are filtered to (b) p0, p1, p2, q0, q1, q2 are filtered to produce output pixels P0, P1, P2, Q0, Q1 and Q2.
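The decision of whether a line of samples across a boundary is filtered at all can be sketched as below. In the H.264 standard, a boundary strength value (bS, derived from the coding modes of the neighbouring blocks) must be non-zero, and three sample differences must fall below thresholds alpha and beta that the standard looks up from tables indexed by QP. Those tables are not reproduced here, so the threshold values in the example are hypothetical.

```python
def should_filter(p1: int, p0: int, q0: int, q1: int,
                  bs: int, alpha: int, beta: int) -> bool:
    """Return True if the samples next to the boundary should be filtered.

    bs is the boundary strength (0 means never filter). alpha limits the step
    across the boundary; beta limits the gradient on each side of it.
    """
    return (bs > 0
            and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)


if __name__ == "__main__":
    alpha, beta = 10, 4  # hypothetical thresholds; the standard derives them from QP
    # Small step between two smooth regions: likely a blocking artefact.
    print(should_filter(100, 101, 106, 107, bs=2, alpha=alpha, beta=beta))
    # Large step: likely a real image edge, so it is left unfiltered.
    print(should_filter(100, 101, 180, 181, bs=2, alpha=alpha, beta=beta))
```

Because the thresholds grow with QP, more edges qualify for filtering at coarse quantization, where blocking is strongest, while genuine high-contrast edges in the picture content are left alone.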

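    [Figure: boundary filtering order. 16x16 luma: vertical boundaries a, b, c, d and horizontal boundaries e, f, g, h. 8x8 chroma: vertical boundaries i, j and horizontal boundaries k, l.]

    Figure 0-8 Edge filtering order in a macroblock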

    [Figure: samples p3, p2, p1, p0 | q0, q1, q2, q3 arranged left to right across a vertical boundary, and top to bottom across a horizontal boundary.]

    Figure 0-9 Pixels adjacent to vertical and horizontal boundaries

    Filtering example

    A video clip is encoded using the AVC reference software with a fixed Quantization Parameter of 36 (relatively high quantization). Figure 0-10 shows an original frame from the clip; Figure 0-11 shows the same frame after inter coding and reconstruction, with the loop filter disabled. Note the obvious blocking artefacts; note also the effect of varying motion-compensation block sizes (for example, 16x16 blocks in the background to the left of the picture; 4x4 blocks around the arm).

    With the loop filter enabled (Figure 0-12) the appearance is considerably better; there is still some obvious distortion but most of the block edges have disappeared or faded. Note that sharp contrast boundaries (such as the line of the arm against the dark piano) are preserved by the filter, whilst block edges in smoother regions of the picture (such as the background to the left) are smoothed.

    In this example the loop filter makes only a small contribution to compression efficiency: the encoded bitrate is around 1.5% smaller and the PSNR around 1% larger for the sequence with the filter. However, the subjective quality of the filtered sequence is significantly better. The coding performance gain provided by the filter depends on the bitrate and sequence content.

    Figure 0-10 Original frame (violin frame 2)

    Figure 0-11 Reconstructed, QP=36 (no filter)

    Figure 0-12 Reconstructed, QP=36 (with filter)