Adaptive Region-Based Multi-Scaled Motion-Compensated Video Coding for Error Prone Communication Channels

Paramvir Bahl

Microsoft Corporation

Wei-Lien [email protected]

Digital Equipment Corporation

SPIE ’97, Dallas, USA

November 4, 1997

Outline

Video encoder description

Transmission model

Simulation methodology

Performance (experimental) results

Conclusions

Objective and Approach

Objective:

– Design a low-complexity video compression algorithm for robust transmission over hostile communication channels.

Approach:

– Spatially segment video frames into video regions and decompose each region into sub-bands.

– Apply unequal error protection to different regions and different sub-bands.

– Carry out prioritized transmission and apply a novel reconstruction scheme to guarantee a minimum spatial and temporal resolution at the receiver.

Description of the Video Encoder

Our codec vs. ITU’s H.263

– Spatial Segmentation (Split-and-Merge algorithm)

– Frequency Segmentation (Discrete Wavelet Transform)

Characteristics of our algorithm

– Compression is achieved by removing temporal redundancy via classical motion estimation and removing spatial redundancy via DWT, DCT, quantization and entropy coding.

– A new spatial region-segmentation map is generated for every intra-frame, and the same map is used until strong changes appear in the incoming frames.

Proposed Video Encoder

[Figure: block diagram of the proposed video encoder. Blocks: inter/intra classifier, spatial segmentation (intra), 2D discrete wavelet transformation and DCT, quantizer, quantizer adapter, RLE + Huffman coder, inverse quantizer, inverse 2D WT/DCT, motion estimator, motion compensation predictor, previous picture store, future picture store, network subsystem. Signals: picture type, inter/intra decision, region map, region Id., subband Id., motion vectors.]

Video Frame Segmentation

Intra-Frame Region Segmentation

– The (intra) frame is first partitioned into blocks of size 16 x 16.

– The variance of each block i is computed.

– All adjacent blocks of similar variances are merged.

Inter-Frame Region Segmentation

– A region index is assigned to each inter-block based on motion estimation.

– If the map generated for the frame is different from the one for the previous frame, or if the frame contains intra blocks, then intra-frame segmentation is performed.
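The intra-frame steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the similarity test (absolute difference of log-variances against a threshold), the `+ 1.0` offset that avoids `log(0)`, 4-connectivity, and the function names `block_variances` and `merge_blocks` are all hypothetical choices, not taken from the slides.

```python
import math
from statistics import pvariance

BLOCK = 16  # block size stated on the slide


def block_variances(frame, block=BLOCK):
    """Return a 2-D list holding the pixel variance of each block x block tile."""
    rows, cols = len(frame) // block, len(frame[0]) // block
    out = []
    for br in range(rows):
        row = []
        for bc in range(cols):
            pixels = [frame[r][c]
                      for r in range(br * block, (br + 1) * block)
                      for c in range(bc * block, (bc + 1) * block)]
            row.append(pvariance(pixels))
        out.append(row)
    return out


def merge_blocks(variances, threshold):
    """Label 4-connected blocks whose log-variances differ by <= threshold.

    Assumption: similarity is measured on log(variance + 1); the slides do
    not spell out the exact criterion in recoverable form.
    """
    rows, cols = len(variances), len(variances[0])
    log_var = [[math.log(v + 1.0) for v in row] for row in variances]
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] != -1:
                continue  # block already belongs to a region
            labels[r0][c0] = next_label
            stack = [(r0, c0)]
            while stack:  # grow the region over similar neighbours
                r, c = stack.pop()
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] == -1
                            and abs(log_var[nr][nc] - log_var[r][c]) <= threshold):
                        labels[nr][nc] = next_label
                        stack.append((nr, nc))
            next_label += 1
    return labels
```

The returned label grid plays the role of the region map: blocks sharing a label form one video region.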

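Video Frame Segmentation

[Equations: the merging threshold T is expressed in terms of N and the logarithms of the maximum and minimum block variances; blocks i and j are compared through the logarithms of their variances against T.]

[Figure: segmentation results for Threshold (T) = 0.278 and Threshold (T) = 0.293.]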

Discrete Wavelet Transformation

A two-tap Haar filter decomposes each region of the luminance (Y) component into 4 bands:

– low complexity

– capability to decompose arbitrarily shaped regions that are multiples of macro-blocks without causing any undesirable boundary effects
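A one-level 2-D Haar decomposition can be sketched as follows. This is a minimal sketch, not the authors' implementation: the averaging normalization (division by 4), the band names LL/LH/HL/HH, and the function names are assumptions made for illustration.

```python
def haar_2d(region):
    """One-level 2-D Haar decomposition of a region with even width and height.

    Returns four half-size bands (LL, LH, HL, HH) as lists of lists.
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(region), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, len(region[0]), 2):
            a, b = region[r][c], region[r][c + 1]
            d, e = region[r + 1][c], region[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 4.0)  # local average (DC band)
            row_lh.append((a - b + d - e) / 4.0)  # left/right difference
            row_hl.append((a + b - d - e) / 4.0)  # top/bottom difference
            row_hh.append((a - b - d + e) / 4.0)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def inverse_haar_2d(ll, lh, hl, hh):
    """Rebuild the region from its four bands (exact inverse of haar_2d)."""
    region = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, x, y, z = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [s + x + y + z, s - x + y - z]
            bottom += [s + x - y - z, s - x - y + z]
        region.append(top)
        region.append(bottom)
    return region
```

Because each output coefficient depends only on one 2 x 2 pixel group, a region built from 16 x 16 macro-blocks never needs samples from outside its own boundary, which is one way to read the "no boundary effect" claim above.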

Quantization and Bitstream Packing

Quantization

– Different quantization steps are applied to the DC and AC subbands.

Bit Stream

– Five layers:

  – Picture Layer: HEADER, REGION MAP, REGION LAYER

  – Region Layer

  – Subband Layer

  – Macroblock Layer

  – Block Layer

– The DC and AC subbands for each video frame are transmitted in different slices.
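As an illustration of applying different quantization steps to the DC and AC subbands, here is a small sketch. It assumes a plain uniform quantizer with rounding; the step sizes used in the example and the function names are hypothetical, since the slides do not give the actual quantizer design.

```python
def quantize(band, step):
    """Uniformly quantize a band (list of rows) with the given step size."""
    return [[int(round(v / step)) for v in row] for row in band]


def dequantize(band, step):
    """Map quantizer indices back to reconstruction levels."""
    return [[q * step for q in row] for row in band]


def quantize_subbands(bands, dc_step, ac_step):
    """Quantize the first band (taken to be DC) with dc_step, the rest with ac_step."""
    return [quantize(band, dc_step if i == 0 else ac_step)
            for i, band in enumerate(bands)]
```

A finer step for the DC subband than for the AC subbands keeps more precision where most of the visible energy sits; whether the original codec makes exactly that choice is not stated in the slides.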

Error Concealment

Classical problems in video reconstruction

– Transmission errors due to channel imperfections corrupt some of the transmitted video regions, rendering them undecodable.

– Dynamic reduction in non-reserved bandwidth causes some of the regions not to reach the decoder in a timely manner.

Solution

– The complete frame is reconstructed at the receiver by using a combination of the current and previous video regions that were received correctly.

Region Reconstruction

[Figure: frame i at the transmitter is sent as N video regions R_{i,1}, ..., R_{i,N}. In the example, region 2 did not reach the receiver, so the Region Store substitutes R_{i-1,2}; frame i at the receiver is assembled from R_{i,1}, R_{i-1,2}, R_{i,3}, ..., R_{i,N}.]

When some of the regions R_{i,j} are received incorrectly, the frame R_i is formed by using the last corresponding j-th video region that was received correctly.
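The substitution rule can be sketched with a small region store. This is a minimal sketch under assumptions: regions are keyed by an integer index, a lost or corrupted region is represented by `None`, and the names `reconstruct_frame` and `region_store` are hypothetical.

```python
def reconstruct_frame(received, region_store):
    """Assemble a frame from received regions, falling back to the store.

    received: dict mapping region index -> region data, or None if the
              region was lost or failed its error check.
    region_store: dict holding the last correctly received copy of each
                  region; it is updated in place.
    """
    frame = {}
    for region_id, data in received.items():
        if data is not None:
            region_store[region_id] = data  # remember the newest good copy
        # Use the current region if good, else the last good one (may be None
        # if this region has never been received correctly).
        frame[region_id] = region_store.get(region_id)
    return frame
```

For example, if region 2 of frame i is lost, the assembled frame uses region 2 from the most recent frame in which it arrived intact:

```python
store = {}
reconstruct_frame({1: "R(i-1,1)", 2: "R(i-1,2)", 3: "R(i-1,3)"}, store)
frame_i = reconstruct_frame({1: "R(i,1)", 2: None, 3: "R(i,3)"}, store)
# frame_i[2] is "R(i-1,2)"
```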

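Simulation Methodology

[Figure: block diagram of the simulation. Transmitter model: video sequence, compressor, Reed-Solomon coder, interleaver, 16-bit CRC, QDPSK modulator, disk write. Channel model: error generator with burst error (BE) and random error (RE) states. Receiver model: disk read, hard decision, CRC decoding, de-interleaver, RS decoder, decompressor, with error warning and statistics gathering. The diagram is annotated with BER = 10^-2.]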

Transmitter:

– video compressor

– Reed-Solomon forward error correcting encoder

– burst error correcting interleaver

– CRC error detector

The compressed video data is fragmented into blocks of 48 octets, packaged and transmitted in packets of 53 octets (the ATM cell size):

– 5 octets for header information (2 octets for CRC, 3 octets for miscellaneous information such as connection number, priority, ...)
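The 48 + 5 octet packaging can be sketched as below. Several details are assumptions made only for illustration: the CRC polynomial (CRC-CCITT via Python's `binascii.crc_hqx`), computing the CRC over the payload only, the split of the 3 miscellaneous octets into a 2-octet connection number and a 1-octet priority, and zero-padding of the last block. The slides specify only the octet counts.

```python
import binascii
import struct

PAYLOAD_OCTETS = 48  # payload size stated on the slide
HEADER_OCTETS = 5    # 2 octets CRC + 3 octets miscellaneous


def packetize(data, connection, priority):
    """Split compressed data into 53-octet packets (5 header + 48 payload)."""
    packets = []
    for offset in range(0, len(data), PAYLOAD_OCTETS):
        # Assumption: the final short block is padded with zero octets.
        payload = data[offset:offset + PAYLOAD_OCTETS].ljust(PAYLOAD_OCTETS, b"\x00")
        crc = binascii.crc_hqx(payload, 0xFFFF)  # 16-bit CRC over the payload
        # Hypothetical header layout: connection (2), priority (1), CRC (2).
        header = struct.pack(">HBH", connection, priority, crc)
        packets.append(header + payload)
    return packets


def packet_is_valid(packet):
    """Recompute the CRC at the receiver and compare with the header field."""
    _connection, _priority, crc = struct.unpack(">HBH", packet[:HEADER_OCTETS])
    return binascii.crc_hqx(packet[HEADER_OCTETS:], 0xFFFF) == crc
```

A packet that fails `packet_is_valid` would be treated as a corrupted region fragment at the receiver.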

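Error generator

– Modeled as a modified Gilbert model with two states: the Burst Error State (BE) and the Random Error State (RE).

– Transitions between the two states are Poisson distributed, with one mean transition rate for Random to Burst (RB) and one for Burst to Random (BR).

[Figure: two-state diagram with Burst (BE) and Random (RE) states; garbled labels in source: "05.2101.2", "sec1.0RB".]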
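A two-state error generator of this kind can be sketched as follows. Note the swapped-in simplification: instead of Poisson-distributed transition times, this sketch uses a per-bit transition probability (a discrete-time approximation). The parameter names `p_rb`, `p_br`, `ber_random`, and `ber_burst` are hypothetical, and no numeric values from the slides are assumed.

```python
import random


def gilbert_errors(n_bits, p_rb, p_br, ber_random, ber_burst, seed=None):
    """Generate a 0/1 error pattern from a two-state (RE/BE) channel model.

    p_rb: per-bit probability of moving from the Random to the Burst state.
    p_br: per-bit probability of moving from the Burst to the Random state.
    ber_random, ber_burst: bit error probability inside each state.
    """
    rng = random.Random(seed)
    state = "RE"  # assumption: the channel starts in the random error state
    errors = []
    for _ in range(n_bits):
        ber = ber_burst if state == "BE" else ber_random
        errors.append(1 if rng.random() < ber else 0)
        # Decide whether to switch state before the next bit.
        if state == "RE":
            if rng.random() < p_rb:
                state = "BE"
        elif rng.random() < p_br:
            state = "RE"
    return errors
```

XOR-ing the returned pattern with the transmitted bit stream gives the corrupted stream seen by the receiver model.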

Experimental Results

Performance

[Figure: Bit Rate vs. Frame Number. x-axis: frame number (1 to 49); y-axis: bits (0 to 20000).]

[Figure: PSNR vs. Frame Number. x-axis: frame number (1 to 49); y-axis: PSNR in dB (34 to 38).]

Software Performance

Function Name        Region-Segmented Codec   ITU’s H.263 Codec
Segmentation         11                       --
DWT / IDWT           1.8                      --
DCT / IDCT           4.7                      5.3
Motion Estimation    55.3                     62.8
FindHalfPel          9.9                      11.3
Quant / Dequant.     2.4                      2.7
Clip                 1.5                      1.7
InterpolateImage     1.1                      1.2
Predict_P            2.2                      2.5
MB_Recon_P           1.1                      1.3
Miscellaneous        9                        11.2

(Each column sums to 100, consistent with percentages of total run time.)


Bounded Error Propagation

Corruption in the H.263 bitstream

Corruption in the AC Sub-bands of the unprotected Regions 25.34 dB

Improved Temporal Resolution with changing error characteristics (K represents the number of re-transmissions allowed)

[Figure: PSNR vs. Offered Load. x-axis: offered load (Erlangs), 10 to 130; y-axis: PSNR (dB), 24 to 32; curves: Region-based, H.263.]

[Figure: frame rate vs. average channel SNR. x-axis: average channel SNR (dB), 5 to 45; y-axis: frame rate (fps), 0 to 16; curves: Region-based and H.263, with K = 1 to K = 3.]

Improved Temporal Resolution with changing bandwidth constraints

[Figure: frames skipped vs. time. x-axis: time (seconds), 1 to 20; y-axis: frames skipped, 0 to 10.]

[Figure: Frame Rate vs. Offered Load. x-axis: offered load (Erlangs), 10 to 130; y-axis: frames/second, 0 to 20; curves: Region-based, H.263.]

Efficient Bandwidth Utilization

– The average bit rate for the DC sub-bands was 8 kbps.

– The second, third, and fourth (AC) sub-bands had average bit rates of 3, 2.5, and 2.8 kbps respectively.

[Figure: Statistical Multiplexing within a Frame. x-axis: time (seconds), 0 to 26; y-axis: total bits (Mbits), 0 to 0.9; curves: Cumulative Arrivals, Link Bandwidth.]

Conclusions

Advantages of the proposed region-based multi-resolution video compression algorithm:

– allows the transmitter to apply unequal error protection

– allows the transmitter to dynamically adjust the order and transmission priority of individual regions

– allows for improved temporal resolution at the receiver

– limits error spreading in both the spatial and the temporal domain

– reduces coding delays, as transmission can begin as soon as the first region is compressed

– good for QoS

Can be used with near-optimum reserved bandwidth utilization

– Software performance comparable to H.263

Thanks!
