Page 1: Lecture3 Compressed Domain Video Processing

Compressed-Domain Image & Video Processing

MIT 6.344, April 25, 2002

John G. Apostolopoulos and Susie J. Wee

Streaming Media Systems Group, Hewlett-Packard Laboratories

Page 2: Lecture3 Compressed Domain Video Processing

Problem Statement

Goal: Efficient signal processing of compressed image and video streams

[Diagram: MPEG stream (010001101001...) → Process → MPEG stream]

High-quality image and video processing with low computational complexity and low memory requirements.

Page 3: Lecture3 Compressed Domain Video Processing

Compressed Domain Processing and Transcoding

[Diagram labels: Capture and Encode; Video Server; Transcoder / Processor; High-bandwidth LAN; Low-bandwidth wireless link; Full resolution Decode and Display; Low resolution Decode and Display]

Transcoding: Media streaming over packet networks

Processing: Media processing in the compressed domain

Page 4: Lecture3 Compressed Domain Video Processing

Outline

• Compressed-Domain Image Processing

• Compressed-Domain Video Processing

– Review of MPEG basics

– Manipulating temporal dependencies

– Splicing

– Reverse Play

– MPEG-2 to H.263/MPEG-4 Transcoding

• Review and Demo

Page 5: Lecture3 Compressed Domain Video Processing

CDP of Images

Page 6: Lecture3 Compressed Domain Video Processing

CDP Example

Basic translation problem: Given four 8x8 DCT blocks, find the DCT of any 8x8 subblock.

[Diagram: conventional route is DCT domain → four IDCTs → pixel domain → DCT → DCT domain]

DCT is good for compression, but complicates many image processing tasks.
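The translation problem can be solved without leaving the DCT domain. The following is a minimal sketch, not the authors' fast algorithm: it assumes an orthonormal 8x8 DCT-II, and the helper names (`dct_matrix`, `shift_pair`, `dct_domain_subblock`) are hypothetical. It only illustrates that the DCT of the subblock is a sum of four products of the input DCT blocks with fixed matrices that can be precomputed for each offset.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that X = C @ x @ C.T."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def shift_pair(offset, n=8):
    """Selection matrices for a window that starts `offset` samples into the first block."""
    upper = np.eye(n, k=offset)       # takes samples offset..n-1 from the first block
    lower = np.eye(n, k=offset - n)   # takes samples 0..offset-1 from the second block
    return upper, lower

def dct_domain_subblock(X11, X12, X21, X22, r, c):
    """DCT of the 8x8 subblock at row offset r, column offset c (0 <= r, c <= 7)."""
    C = dct_matrix()
    # Transform the selection matrices once; they depend only on (r, c).
    U, L = (C @ m @ C.T for m in shift_pair(r))
    Uc, Lc = (C @ m @ C.T for m in shift_pair(c))
    return (U @ X11 @ Uc.T + U @ X12 @ Lc.T +
            L @ X21 @ Uc.T + L @ X22 @ Lc.T)
```

In this sketch the matrix products are dense; the speedups reported in the lecture come from sparse factorizations and sparse quantized coefficients, which are not modeled here.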

Page 7: Lecture3 Compressed Domain Video Processing

Goal of DCT-Domain Processing

GOAL: Perform equivalent spatial-domain image processing operations using fast algorithms that operate directly on the compressed data.

Conventional Approach: JPEG bitstream → Decode → SPATIAL-DOMAIN PROCESSING → Encode → JPEG bitstream

Compressed-Domain Processing Approach: JPEG bitstream → Huffman Decode → DCT-DOMAIN PROCESSING → Huffman Encode → JPEG bitstream, or BITSTREAM PROCESSING applied to the JPEG bitstream directly

Page 8: Lecture3 Compressed Domain Video Processing

DCT-Domain Processing

• Many DCT-domain algorithms have been developed [Vasudev, Merhav, Shen] including: translation, downscaling, filtering, masking, blue screen editing, etc.

• Algorithms exploit:

– Signal processing properties of DCT

– Matrix structures used in fast DCT algorithms

– Sparseness of quantized DCT coefficients

• Exact and approximate algorithms

Page 9: Lecture3 Compressed Domain Video Processing

DCT-Domain Processing

[Diagram: JPEG bitstream → Huffman Decode → DCT-DOMAIN PROCESSING (driven by a processing command) → Huffman Encode → JPEG bitstream]

Basic Concepts:
– Express the spatial-domain processing as a sequence of matrix operators
– Use the distributive property of the DCT to develop the equivalent DCT-domain operation
– Use a sparse matrix factorization to derive a fast algorithm

Page 10: Lecture3 Compressed Domain Video Processing

DCT-Domain Processing

• Spatial Domain: IDCT-Process-DCT
– Fast DCT/IDCT based on the efficient factorization by Arai et al. requires 13 mults, 29 adds.
– IDCT: T' x = A3' A2' A1' M' B2' B1' P' D' x
– Process: y = F T' x
– DCT: T y = D P B1 B2 M A1 A2 A3 y

• Compressed Domain: partial decompression
– T F T' x = D P B1 B2 M A1 A2 A3 F A3' A2' A1' M' B2' B1' P' D' x

Page 11: Lecture3 Compressed Domain Video Processing

DCT-Domain Downscaling

[Diagram: Huffman Decode → DCT blocks A1, A2, A3, A4 → Downscale by 2 → DCT Block Y[i,j] → Huffman Encode]

Method:

$$Y = \begin{pmatrix} S & T \end{pmatrix} \begin{pmatrix} A_1 & A_2 \\ A_3 & A_4 \end{pmatrix} \begin{pmatrix} S^T \\ T^T \end{pmatrix}$$

S and T are fixed 8x8 matrices that differ only in the signs of their entries. For fast implementations, entries in S and T can be approximated by rounding off the entries to 0, 1/8, 1/4 and 1/2.

Downscale by 3 (use 9 blocks A1, A2, ..., A9) and Downscale by 4 (use 16 blocks A1, A2, ..., A16) can be performed in a similar manner.
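A small numerical sketch of downscaling by 2 in the DCT domain follows. It assumes an orthonormal 8x8 DCT-II and plain 2x2 pixel averaging as the spatial-domain operation; the lecture does not spell out the exact filter behind S and T, so the matrices built here are one plausible instance, and the function names are hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that X = C @ x @ C.T."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def downscale_operators(n=8):
    """Build S and T from an n x 2n pair-averaging matrix (assumed filter)."""
    C = dct_matrix(n)
    P = np.zeros((n, 2 * n))
    for i in range(n):
        P[i, 2 * i] = P[i, 2 * i + 1] = 0.5   # average neighbouring samples
    S = C @ P[:, :n] @ C.T                    # acts on the first block of a pair
    T = C @ P[:, n:] @ C.T                    # acts on the second block of a pair
    return S, T

def dct_downscale_by_2(A1, A2, A3, A4):
    """Y = (S T) [[A1, A2], [A3, A4]] (S^T; T^T), expanded term by term."""
    S, T = downscale_operators()
    return (S @ A1 @ S.T + S @ A2 @ T.T +
            T @ A3 @ S.T + T @ A4 @ T.T)
```

With this choice of filter, S and T have entries of equal magnitude and differ only in sign, which matches the property stated for the method.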

Page 12: Lecture3 Compressed Domain Video Processing

DCT-Domain Downscaling

Downscaling by two (cycles per output 8x8 block):

Spatial domain
- four 8x8 inverse DCT, pixel average, one 8x8 DCT: 5376

DCT Domain
- Smith: > 7700
- Chang: 6192
- Our approach #1: 2824
- Our approach (exact S, T): 3392
- Our approach (approx. S, T): 880

DCT Domain - sparse data
- Smith: > 5250
- Chang: 2096
- Our approach #1: 902
- Our approach (exact S, T): 1752
- Our approach (approx. S, T): 528

Downscaling by 3:
- Spatial domain approach: 9392 ops
- DCT domain approach: 5728 ops

Downscaling by 4:
- Spatial domain approach: 16224 ops
- DCT domain approach: 8224 ops

Page 13: Lecture3 Compressed Domain Video Processing

DCT-Domain Image Enhancement

[Diagram: Huffman Decode → DCT Block X[i,j] → Modify DCT Block (Brightness Adjustment or Detail Enhancement) → DCT Block Y[i,j] → Huffman Encode]

Brightness Adjustment:
– Spatial Domain: y = x + b (64 adds per 8x8 block)
– Transform Domain: Y[i,j] = X[i,j] + B for i = j = 0, and Y[i,j] = X[i,j] elsewhere (1 add per 8x8 block)

Detail Enhancement:
– Transform Domain: Manipulate X[i,j] for all (i,j) except (i = j = 0)
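The brightness adjustment can be checked numerically. This is a minimal sketch assuming an orthonormal 8x8 DCT-II, for which a uniform pixel offset b corresponds to a DC offset B = 8b; other DCT normalizations change that factor. Clipping and quantization are ignored, and `adjust_brightness_dct` is a hypothetical name.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that X = C @ x @ C.T."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def adjust_brightness_dct(X, b):
    """Add brightness b to every pixel by changing only the DC coefficient."""
    n = X.shape[0]
    Y = X.copy()
    Y[0, 0] += n * b   # one add per block instead of n*n adds in the pixel domain
    return Y
```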

Page 14: Lecture3 Compressed Domain Video Processing

JPEG CDP Examples

[Diagram: JPEG bitstream → Huffman Decode → DCT-DOMAIN PROCESSING (driven by a processing command) → Huffman Encode → JPEG bitstream]

[Figure panels: Original JPEG Image and processed results]

• Downscaling: 14X speedup
• Tiling: 4.5X speedup
• Sharpening: 8X speedup
• Edge detection: 5X speedup
• Filtering: 2-4X speedup
• Rotate: 1.5-3X speedup

Speedup data based on actual implementations.

Page 15: Lecture3 Compressed Domain Video Processing

CDP of Video

Page 16: Lecture3 Compressed Domain Video Processing

Video Compression

[Diagram: Raw video → Encode → MPEG stream (100110101011100010)]

HDTV quality:
– Raw video: 720 x 1280 pixels, 60 fps, ~ 1 Gbps
– MPEG stream: 20 Gbytes / 2 hour movie, ~ 20 Mbps

Broadcast quality:
– Raw video: 480 x 720 pixels, 30 fps, ~ 250 Mbps
– MPEG stream: 5 Gbytes / 2 hour movie, ~ 5 Mbps
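The bitrates can be reproduced with a few lines of arithmetic. The sketch below assumes 24 bits per pixel for raw video and 1 Gbyte = 10^9 bytes; neither assumption is stated in the lecture, and the function names are hypothetical.

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel=24):
    """Raw video bitrate in bits per second (24 bits/pixel is an assumption)."""
    return width * height * fps * bits_per_pixel

def stored_bitrate_bps(gigabytes, hours):
    """Average bitrate of a stored stream, taking 1 Gbyte = 1e9 bytes."""
    return gigabytes * 1e9 * 8 / (hours * 3600)

hdtv_raw = raw_bitrate_bps(1280, 720, 60)      # about 1.3e9, i.e. roughly 1 Gbps
broadcast_raw = raw_bitrate_bps(720, 480, 30)  # about 2.5e8, i.e. roughly 250 Mbps
hdtv_mpeg = stored_bitrate_bps(20, 2)          # about 2.2e7, i.e. roughly 20 Mbps
broadcast_mpeg = stored_bitrate_bps(5, 2)      # about 5.6e6, i.e. roughly 5 Mbps
```

Under these assumptions both cases correspond to a compression ratio on the order of 50:1.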

Page 17: Lecture3 Compressed Domain Video Processing

MPEG Video Compression

Encode (20,000 MOPS): Raw video, 1 Gbps → MPEG stream, 20 Mbps

Decode (2,000 MOPS): MPEG stream, 20 Mbps → Raw video, 1 Gbps

Page 18: Lecture3 Compressed Domain Video Processing

Problem statement

How do we process compressed video streams?

[Diagram: MPEG stream, 20 Mbps → Decode (2,000 MOPS) → 1 Gbps → Process → 1 Gbps → Encode (20,000 MOPS) → MPEG stream, 20 Mbps]

• Difficulties:
– Computational requirements: 22,000 MOPS overhead from decode and encode operations (plus additional processing)
– Bandwidth requirements: Need to process uncompressed data at 1 Gbps
– Quality issues: Even without processing, the decode/encode cycle causes quality degradation.

Page 19: Lecture3 Compressed Domain Video Processing

MPEG CDP

How do we process compressed video streams?

Naive Solution: MPEG stream, 20 Mbps → Decode (2,000 MOPS) → 1 Gbps → Process → 1 Gbps → Encode (20,000 MOPS) → MPEG stream, 20 Mbps

Ideal Solution: MPEG stream → Transcode → MPEG stream. Use efficient compressed-domain processing (CDP) algorithms.

Page 20: Lecture3 Compressed Domain Video Processing

Outline

• Compressed-Domain Image Processing

• Compressed-Domain Video Processing

– Review of MPEG basics

– Manipulating temporal dependencies

– Splicing

– Reverse Play

– MPEG-2 to H.263/MPEG-4 Transcoding

• Review and Demo

Page 21: Lecture3 Compressed Domain Video Processing

MPEG Group of Pictures (GOP) Structure

• Composed of I, P, and B frames
• Arrows show prediction dependencies
• Periodic I-frames enable random access

MPEG GOP: I0 B1 B2 P3 B4 B5 P6 B7 B8 I9

Page 22: Lecture3 Compressed Domain Video Processing

MPEG Structures

[Figure: hierarchy of MPEG structures on the frame sequence P B B I B B P B B P B B I: GOP Layer, Picture Layer, Slice Layer, Macroblock Layer (4 8x8 DCT, 1 MV), Block Layer (8x8 DCT)]

Page 23: Lecture3 Compressed Domain Video Processing

MPEG Structures

• GOPs*
• Pictures*: Intraframe (I frame), Forward predicted (P frame), Bidirectionally predicted (B frame)
• Slices*: (16x16n pixel blocks)
• Macroblocks (16x16 pixel blocks):
– 1 MV and 4 DCT blocks per MB (plus chrominance)
– I frame: Intra MBs
– P frame: Intra or Forward MBs
– B frame: Intra, Forward, Backward, or Bidirectional MBs
• Blocks (8x8 pixel blocks): 8x8 DCT

(* start codes)

P B B I B B P B B P B B I

Page 24: Lecture3 Compressed Domain Video Processing

MPEG Coding Order

• Preceding and following I or P frames (anchors) are used to predict each B frame.

• “Anchor” frames must be decoded before B data. (Note: In coding order, all arrows point forward.)

Display order: I0 B1 B2 P3 B4 B5 P6 B7 B8 I9

Coding order: I0 P3 B1 B2 P6 B4 B5 I9 B7 B8
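The reordering from display order to coding order can be sketched in a few lines. This is an illustration only: frames are represented as labels such as "I0" or "B4", and `display_to_coding_order` is a hypothetical helper, not part of any MPEG library.

```python
def display_to_coding_order(frames):
    """Move each anchor (I or P) ahead of the B frames that precede it in display order."""
    coded, pending_b = [], []
    for frame in frames:
        if frame.startswith("B"):
            pending_b.append(frame)    # B frames wait for their following anchor
        else:
            coded.append(frame)        # the anchor is coded first
            coded.extend(pending_b)    # then the B frames that depend on it
            pending_b = []
    coded.extend(pending_b)            # trailing B frames (their anchor is in the next GOP)
    return coded

display = "I0 B1 B2 P3 B4 B5 P6 B7 B8 I9".split()
coding = display_to_coding_order(display)
```

Because every anchor is emitted before the B frames that reference it, all prediction arrows point forward in the resulting order.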

Page 25: Lecture3 Compressed Domain Video Processing

MPEG Bitstream

• GOP header
• Picture header
• Picture data
• Pictures in coding order
• Start codes (seq, GOP, pic, and slice start codes; seq end code)
– Unique byte-aligned 32-bit patterns
– 0x000001nm: 23 zeros, 1 one, 1-byte identifier
– Enables random access

[Diagram: bitstream (10110010) and frame sequence I B B P B B P B B I]
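Because start codes are byte-aligned and unique, they can be located with a plain byte search, which is what makes random access cheap. The sketch below scans a byte string for the 0x000001 prefix and reports the identifier byte that follows. The identifier values in `describe` are those of MPEG-2 video (picture 0x00, slices 0x01 to 0xAF, sequence header 0xB3, sequence end 0xB7, GOP 0xB8); the function names are hypothetical.

```python
def find_start_codes(data: bytes):
    """Return (byte offset, identifier byte) for every 0x000001nm start code."""
    found = []
    pos = data.find(b"\x00\x00\x01")
    while pos != -1 and pos + 3 < len(data):
        found.append((pos, data[pos + 3]))
        pos = data.find(b"\x00\x00\x01", pos + 3)
    return found

def describe(code: int) -> str:
    """Map an identifier byte to a coarse start-code type."""
    if 0x01 <= code <= 0xAF:
        return "slice"
    names = {0x00: "picture", 0xB3: "sequence header",
             0xB7: "sequence end", 0xB8: "GOP"}
    return names.get(code, "other")

# Synthetic bytes, not a valid MPEG stream: headers separated by filler.
sample = (b"\x00\x00\x01\xb3" + b"\xff" * 4 + b"\x00\x00\x01\xb8" + b"\x12" +
          b"\x00\x00\x01\x00" + b"\xaa" + b"\x00\x00\x01\x01")
codes = find_start_codes(sample)
```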

Page 26: Lecture3 Compressed Domain Video Processing

Goal of Compressed-Domain Video Processing

GOAL: Perform equivalent spatial-domain video processing operations using fast algorithms that operate directly on the compressed data.

Conventional Approach: MPEG bitstream → Decode (2,000 MOPS) → SPATIAL-DOMAIN PROCESSING → Encode (> 20,000 MOPS) → MPEG bitstream

Compressed-Domain Processing: MPEG bitstream → Huffman Decode (200 MOPS) → MV+DCT-DOMAIN PROCESSING → Huffman Encode (200 MOPS) → MPEG bitstream, or BITSTREAM PROCESSING applied to the MPEG bitstream directly

Page 27: Lecture3 Compressed Domain Video Processing

Splicing

Reverse Play

Frame-Level Video Processing

Page 28: Lecture3 Compressed Domain Video Processing

Difficulties and Solutions

• High compression causes temporal dependencies
– Manipulate the temporal dependencies
• Coding order vs. display order
– Data reordering
• Rate control / Buffer limitations
– Requantization, Frame conversion
• MPEG header information (e.g. time stamps, vbv_delay)
– Rewrite header bits

Page 29: Lecture3 Compressed Domain Video Processing

Outline

• Compressed-Domain Image Processing

• Compressed-Domain Video Processing

– Review of MPEG basics

– Manipulating temporal dependencies

– Splicing

– Reverse Play

– MPEG-2 to H.263/MPEG-4 Transcoding

• Review and Demo

Page 30: Lecture3 Compressed Domain Video Processing

Manipulating Temporal Dependencies in Compressed Video

• Video compression uses temporal prediction → Prediction dependencies in coded data
• Many applications require changes in the dependencies
• Manipulating (modifying) temporal dependencies in the compressed video [Wee]
– Adding
– Removing
– Changing

Page 31: Lecture3 Compressed Domain Video Processing

Temporal Prediction

Original Video Frames: $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$

Coded Video Frames: $\hat{\mathcal{F}} = \{\hat{F}_1, \hat{F}_2, \ldots, \hat{F}_n\}$

Each frame $F_i$ is coded with two components:

1. Prediction: $P(\hat{A}_i, S_i)$
– Anchor frames: $\hat{A}_i \subseteq \{\hat{F}_1, \hat{F}_2, \ldots, \hat{F}_{i-1}\}$
– Side information: $S_i$

2. Residual: $R_i = F_i - P(\hat{A}_i, S_i)$
– Coded Residual: $\hat{R}_i$

Reconstructed Frame: $\hat{F}_i = P(\hat{A}_i, S_i) + \hat{R}_i$

Page 32: Lecture3 Compressed Domain Video Processing

Prediction Dependencies

Each coded frame is dependent upon its anchor frames.

$$\hat{F}_i = P(\hat{A}_i, S_i) + \hat{R}_i$$

[Figure: Prediction Depth. Frames F1 to F10 arranged by dependency level, from Level 0 to Level 4, with F8 and F9 at Level 4.]
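Prediction depth can be computed directly from the anchor relation. The sketch below assumes a ten-frame I B B P B B P B B P pattern, with each P frame anchored on the previous I or P frame and each B frame anchored on its two surrounding anchors; the anchor table and the name `prediction_depths` are illustrative assumptions, not taken from the lecture.

```python
def prediction_depths(anchors):
    """Depth 0 for frames with no anchors, otherwise 1 + the deepest anchor."""
    depths = {}

    def depth(frame):
        if frame not in depths:
            deps = anchors.get(frame, [])
            depths[frame] = 0 if not deps else 1 + max(depth(a) for a in deps)
        return depths[frame]

    for frame in anchors:
        depth(frame)
    return depths

# Assumed dependency structure for I B B P B B P B B P (frames F1..F10).
anchors = {
    "F1": [],
    "F4": ["F1"], "F7": ["F4"], "F10": ["F7"],
    "F2": ["F1", "F4"], "F3": ["F1", "F4"],
    "F5": ["F4", "F7"], "F6": ["F4", "F7"],
    "F8": ["F7", "F10"], "F9": ["F7", "F10"],
}
levels = prediction_depths(anchors)
```

Under this assumed structure the deepest frames are F8 and F9 at level 4, so decoding them requires four earlier frames to be reconstructed in sequence.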

Page 33: Lecture3 Compressed Domain Video Processing

Manipulating Dependencies

• Fewer levels of dependence
• Remove dependence between frames

[Figures: two modified dependency graphs for frames F1 to F10, one using Levels 0 and 1 and one using Levels 0 to 2]

Page 34: Lecture3 Compressed Domain Video Processing

Problem Formulation

Compression algorithm has a set of prediction rules.

Choose prediction modes during encoding.

Each choice yields:
– different compressed representation of frame $F_i$,
– different distribution between P & R components,
– different set of prediction dependencies.

Choice $(\hat{A}_i, S_i)$: $\hat{F}_i = P(\hat{A}_i, S_i) + \hat{R}_i$

Choice $(\hat{A}_i', S_i')$: $\hat{F}_i' = P(\hat{A}_i', S_i') + \hat{R}_i'$

Page 35: Lecture3 Compressed Domain Video Processing

Manipulating Dependencies

Given a compressed representation with dependencies $(\hat{A}_i, S_i)$, how do we create a new compressed representation with dependencies $(\hat{A}_i', S_i')$?

• Reconstruct frames: $\hat{F}_i = P(\hat{A}_i, S_i) + \hat{R}_i$
• Compressed-domain approximation: $\hat{F}_i \approx F_i$
• Calculate new prediction and residual: $R_i' = F_i - P(\hat{A}_i', S_i') \approx \hat{R}_i + P(\hat{A}_i, S_i) - P(\hat{A}_i', S_i')$
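The residual update can be illustrated with a toy model. In this sketch a "prediction" is just an integer translation of the anchor frame (via `np.roll`), and a missing anchor means intra coding with zero prediction; both are simplifying assumptions, and `predict` and `convert_residual` are hypothetical names. The example performs a P-to-I conversion on one 8x8 block.

```python
import numpy as np

def predict(anchor, mv, shape=(8, 8)):
    """Toy prediction: translate the anchor by an integer MV; no anchor means zero prediction."""
    if anchor is None:
        return np.zeros(shape)
    return np.roll(anchor, shift=mv, axis=(0, 1))

def convert_residual(r_hat, old_pred, new_pred):
    """R'_i is approximated by R^_i + P(A_i, S_i) - P(A'_i, S'_i)."""
    return r_hat + old_pred - new_pred

rng = np.random.default_rng(0)
anchor = rng.integers(0, 256, size=(8, 8)).astype(float)
r_hat = rng.integers(-4, 5, size=(8, 8)).astype(float)

old_pred = predict(anchor, (1, 2))     # original P-type prediction
f_hat = old_pred + r_hat               # reconstructed frame
new_pred = predict(None, None)         # intra: no dependency on the anchor
r_new = convert_residual(r_hat, old_pred, new_pred)
```

Before the new residual is requantized, the new representation reconstructs exactly the same frame as the old one; any quality loss comes from recoding the residual.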

Page 36: Lecture3 Compressed Domain Video Processing

Frame Conversions: Remove Dependencies

Original: P B B I B B P
P to I: P B B I B B I
B to Bfor: P Bf Bf I B B P
B to Bback: P Bb Bb I B B P

Page 37: Lecture3 Compressed Domain Video Processing

Frame Conversion: Add Dependencies

Original: P B B I B B P
I to P: P B B P B B P

1. Find and code new MVs (MV resampling).
2. Calculate new prediction.
3. Calculate and code new residual.

Page 38: Lecture3 Compressed Domain Video Processing

Compressed-Domain Processing

MPEG standard addresses prediction rules, buffer requirements, coding order, and bitstream syntax.

MPEG CDP steps
• Determine and perform appropriate frame conversions (i.e. manipulate dependencies).
• Reorder data.
• Perform rate matching.
• Update header information.

[Diagram: 001011011010011001 → CDP → 110001011010110100]

Page 39: Lecture3 Compressed Domain Video Processing

Outline

• Compressed-Domain Image Processing

• Compressed-Domain Video Processing

– Review of MPEG basics

– Manipulating temporal dependencies

– Splicing

– Reverse Play

– MPEG-2 to H.263/MPEG-4 Transcoding

• Review and Demo

Page 40: Lecture3 Compressed Domain Video Processing

Compressed-Domain Splicing

[Diagram: Conventional: Decode + Decode → Cut & Paste → Encode. Compressed domain: Transcode.]

Application: Ad Insertion for DTV

Page 41: Lecture3 Compressed Domain Video Processing

Splicing: SMPTE standard

• SMPTE formed a committee on splicing.
• Disadvantages:
– User must predefine splice points during encoding ⇒ Complicated encoders.
– Splice points can only occur on I frames: not frame accurate.
• Advantages:
– Simple cut-and-paste operation.

Page 42: Lecture3 Compressed Domain Video Processing

The TV Newsroom IEEE Spectrum

Page 43: Lecture3 Compressed Domain Video Processing

Splicing: Conventional approach

[Diagram: two MPEG input streams (20 Mbps each) → Decode (2,000 MOPS each) → raw video (1 Gbps each) → Cut & Paste → raw video (1 Gbps) → Encode (20,000 MOPS) → spliced MPEG stream (20 Mbps)]

> 24,000 MOPS for one splice point every sec

Page 44: Lecture3 Compressed Domain Video Processing

Compressed-Domain Processing

Can we do better by exploiting properties of
1) the MPEG compression algorithm and
2) the splicing operation?

Page 45: Lecture3 Compressed Domain Video Processing

Splicing: Our approach

[Diagram: CD Splice. Head stream → Process Head; tail stream → Process Tail; both → Match & Merge → spliced stream]

< 2,000 MOPS for one splice point every sec

Page 46: Lecture3 Compressed Domain Video Processing

Splicing Algorithm (simplified overview)

Only process the GOPs affected by the splice.

Step 1: Process the head stream.
• Remove (backward) dependencies on dropped frames.

Step 2: Process the tail stream.
• Remove (forward) dependencies on dropped frames.

Step 3: Match & Merge the head and tail data.
• Perform rate matching.
• Reorder data appropriately.
• Update header information.
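Steps 1 and 2 can be illustrated at the level of frame types. The sketch below is a simplified reading of the algorithm, not the authors' implementation: frames are given in display order as "I", "P", or "B", the labels "Bf" (forward-only) and "Bb" (backward-only) follow the frame-conversion naming used in the lecture, and `process_head` and `process_tail` are hypothetical names. Rate matching, reordering, and header updates (Step 3) are not modeled.

```python
def process_head(types, splice_out):
    """Keep frames 0..splice_out; B frames that lost their future anchor become forward-only."""
    kept = list(types[:splice_out + 1])
    anchors = [i for i, t in enumerate(kept) if t in ("I", "P")]
    last_anchor = anchors[-1] if anchors else -1
    for i in range(last_anchor + 1, len(kept)):
        kept[i] = "Bf"          # backward dependency on a dropped frame removed
    return kept

def process_tail(types, splice_in):
    """Keep frames from splice_in on; remove dependencies on dropped earlier frames."""
    kept = list(types[splice_in:])
    first_anchor = next((i for i, t in enumerate(kept) if t in ("I", "P")), len(kept))
    for i in range(first_anchor):
        kept[i] = "Bb"          # forward dependency on a dropped frame removed
    if first_anchor < len(kept) and kept[first_anchor] == "P":
        kept[first_anchor] = "I"   # P frame whose anchor was dropped becomes intra
    return kept

head = process_head(list("IBBPBBPBBP"), splice_out=4)
tail = process_tail(list("IBBPBBP"), splice_in=2)
```

Only the frames adjacent to the splice point change type, which is why the cost and the quality impact stay local to the affected GOPs.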

Page 47: Lecture3 Compressed Domain Video Processing

Frame Conversion: Remove Dependencies

Original: P B B I B B P
P to I: P B B I B B I
B to Bfor: P Bf Bf I Bf Bf P
B to Bback: P Bb Bb I Bb Bb P

1. Eliminate temporal dependencies.
2. Calculate new prediction (DCT-domain).
3. Calculate and code new residual.

Page 48: Lecture3 Compressed Domain Video Processing

Splicing

Head Input Stream: ... Hn-2 Hn-1 Hn (GOP with splice-out frame; frames after the splice-out point are dropped)

Tail Input Stream: T0 T1 T2 T3 ... (GOP with splice-in frame; frames before the splice-in point are dropped)

Spliced Output Stream: Hn-2 Hn-1 F(Hn,T0) T1 T2

Page 49: Lecture3 Compressed Domain Video Processing

Compressed-Domain Splicing

[Diagram: Conventional: Decode (2,000 MOPS) + Decode (2,000 MOPS) → Cut & Paste → Encode (20,000 MOPS). Compressed domain: Transcode, < 2,000 MOPS for one splice point every sec.]

Page 50: Lecture3 Compressed Domain Video Processing

Splicing: Remarks

• Proposed Algorithm
– Frame-accurate splice points
– Only uses MPEG stream (Encoder is not affected.)
– Frame conversions in MV+sparse DCT domain.
– Rate control by requantization and frame conversion. (Do not insert synthetic frames.)
– Quality only affected near splice points.
– Computational scalability: Video quality can be improved if extra computing power is available.

Page 51: Lecture3 Compressed Domain Video Processing

Outline

• Compressed-Domain Image Processing

• Compressed-Domain Video Processing

– Review of MPEG basics

– Manipulating temporal dependencies

– Splicing

– Reverse Play

– MPEG-2 to H.263/MPEG-4 Transcoding

• Review and Demo

Page 52: Lecture3 Compressed Domain Video Processing

Reverse-Play Transcoding

Develop computation- and memory-efficient transcoding algorithms for reverse play of a given forward-play MPEG video stream.

[Diagram: Conventional: Decode → Store & Reorder → Encode. Compressed domain: Transcode.]

Page 53: Lecture3 Compressed Domain Video Processing

Reverse-Play Architecture #1

[Diagram #1, GOP 15: I, P, B → Decode (15) → Frame Buffer / Reorder (15) → Encode (15) with Motion Estimation → I, P, B]

• High memory requirements
– Frame buffer must store 15 frames
• High computational requirements
– Motion estimation dominates computational needs

Page 54: Lecture3 Compressed Domain Video Processing

Compressed-Domain Processing

Can we do better by exploiting properties of
1) the MPEG compression algorithm and
2) the reverse-play operation?

Page 55: Lecture3 Compressed Domain Video Processing

B-Frame Symmetry: Swap MVs

[Figure: Anchor Frame 1, B Frame, and Anchor Frame 2, each drawn as a grid of blocks numbered 1 to 16. In Forward Play, block 6 of the B frame is predicted from the two anchors with MV1 and MV2. In Reverse Play, the same two vectors are used with their roles swapped (MV2, MV1).]
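The B-frame symmetry amounts to exchanging the two motion vectors of each macroblock, since in reverse play the former "future" anchor is displayed first. A minimal sketch follows, with a hypothetical `BMacroblock` record that holds only the motion vectors; the coded residual is left untouched.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MV = Tuple[int, int]

@dataclass
class BMacroblock:
    """Motion data of one B-frame macroblock; None means that direction is unused."""
    forward_mv: Optional[MV]    # points to the anchor displayed earlier
    backward_mv: Optional[MV]   # points to the anchor displayed later

def reverse_b_macroblock(mb: BMacroblock) -> BMacroblock:
    """For reverse play the two anchors trade places, so the MVs swap roles."""
    return BMacroblock(forward_mv=mb.backward_mv, backward_mv=mb.forward_mv)

mb = BMacroblock(forward_mv=(1, -2), backward_mv=(-3, 4))
rev = reverse_b_macroblock(mb)
```

A forward-only macroblock becomes backward-only and vice versa, and no motion estimation or DCT work is needed for B frames.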

Page 56: Lecture3 Compressed Domain Video Processing

Reverse-Play Architecture #2

Exploit symmetry of B frames.

[Diagram #2: Huffman Decode → I, P frames: Decode (5) → Frame Buffer / Reorder (5) → Encode (5) with Motion Estimation; B frames: Swap MVs → Huffman Encode]

• Reduced memory requirements
– Frame buffer must store 5 frames
• Reduced computational requirements
– ME still dominates computational needs

Page 57: Lecture3 Compressed Domain Video Processing

Reverse-Play Architecture #3

Exploit forward MVF in coded data.

[Diagram #3: Huffman Decode → I, P frames: Decode (5) → Frame Buffer / Reorder (5) → Encode (5) with Reverse MVs; B frames: Swap MVs → Huffman Encode]

• Reduced memory requirements
• Reduced computational requirements

Page 58: Lecture3 Compressed Domain Video Processing

Forward versus Reverse MV’s

Forward MV

Reverse MV

?

?

Page 59: Lecture3 Compressed Domain Video Processing

Search Area: No correspondence

[Figure: forward search area and reverse search area between Time T0 and Time T1]

Interpret MVs as specifying a match between blocks.

Develop motion vector resampling methods.

Page 60: Lecture3 Compressed Domain Video Processing

In-place Reversal Method

[Figure: Forward MV; Reverse MV: in-place reversal]

Page 61: Lecture3 Compressed Domain Video Processing

Maximum Overlap Method

[Figure: Forward MVs overlapping block in question; Reverse MV: reversal of forward MV with max overlap]
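The two reversal methods can be contrasted in a short sketch. It assumes a forward motion vector field stored as a dictionary from block index (row, column) to an integer vector (dy, dx), square blocks of 16 pixels, and a fallback to in-place reversal when no forward vector overlaps a block; these representation choices and the function names are assumptions for illustration.

```python
def in_place_reversal(fwd):
    """Reverse MV of each block = negated forward MV of the same block."""
    return {pos: (-dy, -dx) for pos, (dy, dx) in fwd.items()}

def max_overlap_reversal(fwd, block=16):
    """Reverse MV of each block = negated forward MV whose matched region overlaps it most."""
    rev = {}
    for (by, bx) in fwd:
        y0, x0 = by * block, bx * block
        best_area, best_mv = 0, fwd[(by, bx)]          # fallback: in-place reversal
        for (jy, jx), (dy, dx) in fwd.items():
            ry, rx = jy * block + dy, jx * block + dx  # region matched by block (jy, jx)
            area = (max(0, block - abs(ry - y0)) *
                    max(0, block - abs(rx - x0)))
            if area > best_area:
                best_area, best_mv = area, (dy, dx)
        rev[(by, bx)] = (-best_mv[0], -best_mv[1])
    return rev

# Two horizontally adjacent blocks; the left block's match lands mostly on the right block.
fwd = {(0, 0): (0, 15), (0, 1): (0, 6)}
simple = in_place_reversal(fwd)
overlap = max_overlap_reversal(fwd)
```

In this example the right-hand block inherits the reversed vector of the left-hand block under maximum overlap, while in-place reversal simply negates its own vector. This version compares every pair of blocks; a practical implementation would only examine forward vectors from nearby blocks.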

Page 62: Lecture3 Compressed Domain Video Processing

Computational Requirements

#1 (Decode, Reorder, Encode 15 frames, GOP 15; Log Search ME): 4200 Mcycles
#2 (Decode, Reorder, Encode 5 frames; Swap MVs for B frames; Log Search ME): 1030 Mcycles
#3a (as #2, with Reverse MVs by Maximum Overlap in place of motion estimation): 420 Mcycles
#3b (as #2, with Reverse MVs by In-place Reverse in place of motion estimation): 330 Mcycles

Page 63: Lecture3 Compressed Domain Video Processing

Experimental Results

[Figure: results for the Girl, Bus, Carousel, and Football sequences, with annotated losses of 0.65 dB and 0.2 dB]

Page 64: Lecture3 Compressed Domain Video Processing

Observations

• Girl sequence showed largest PSNR loss (0.65 dB) due to high detail and texture in source.
– MV accuracy is important!
• Bus sequence benefits from maximum overlap because of large motions in source.
• Carousel has little performance loss because block MC does not match motion in source.
• Football has little performance loss due to blurred source.
– When MV accuracy is not important, fast approximate methods do not sacrifice quality.

Page 65: Lecture3 Compressed Domain Video Processing

Reverse Play Summary

• Developed several compressed-domain reverse-play transcoding algorithms.
• CDP: Exploit properties of compression algorithm and reverse-play operation
– Exploit symmetry of B frames.
– Exploit MVF information given in forward stream.
• Order of magnitude reduction in computational requirements.
• Worst case 0.65 dB loss in PSNR quality over baseline transcoding.

Page 66: Lecture3 Compressed Domain Video Processing

Outline

• Compressed-Domain Image Processing

• Compressed-Domain Video Processing

– Review of MPEG basics

– Manipulating temporal dependencies

– Splicing

– Reverse Play

– MPEG-2 to H.263/MPEG-4 Transcoding

• Review and Demo

Page 67: Lecture3 Compressed Domain Video Processing

Motivation

• Video communication requires the seamless delivery of video content to users with a broad range of bandwidth and resource constraints.
• However, the source signal, communication channel, and client device may be incompatible.
• Therefore, efficient transcoding algorithms must be designed.

[Diagram: Standard-compliant input bitstream → Transcoder → Standard-compliant output bitstream]

Page 68: Lecture3 Compressed Domain Video Processing

MPEG-2 to H.263/MPEG-4 Transcoding

MPEG-2 to H.263/MPEG-4 Transcoder
• Transcode MPEG video streams to lower-rate H.263 video streams.
• Interlace-to-progressive conversion.
• Order of magnitude reduction in computational requirements.

Stream DVD movies to portable multimedia devices.

Stream DTV program material over the internet.

[Diagram: Media Server (DTV or DVD content) → Transcoder → Low-bandwidth wireless link]

Page 69: Lecture3 Compressed Domain Video Processing

Problem Statement

Develop a fast transcoding algorithm that adapts the bitrate, frame rate, spatial resolution, scanning format, and/or coding standard while preserving video quality.

Important features:
– Interlaced-to-progressive transcoder
– Inter-standard transcoder, e.g. MPEG-2 to H.263/MPEG-4 (Note: H.263 and MPEG-4 simple profile are essentially the same)

MPEG-2 input
• High bitrates
• DVD, Digital TV
• >1.5 Mbps, 30 fps
• Interlaced and progressive

H.263/MPEG-4 output
• Low bitrates
• Wireless, internet
• <500 kbps, 10 fps
• Progressive

Contributors: Wee, Apostolopoulos, Feamster (MIT)

Page 70: Lecture3 Compressed Domain Video Processing

Difficulties

• High cost of conventional transcoding
• Differences in video compression standards
• Interlaced vs. progressive formats

[Diagram: MPEG-2 bitstream → MPEG-2 Decode → Process Frames → H.263 Encode → H.263 bitstream]

Page 71: Lecture3 Compressed Domain Video Processing

Issue: Standard Differences

                      MPEG-2                         H.263
Video Formats         Progressive and Interlaced     Progressive Only
I frames              More (enable random access)    Fewer (compression)
Frame Coding Types    I, P, B frames                 I, P, Optional PB frames
Prediction Modes      Field, Frame, 16x8             Frame Only
Motion Vectors        Inside Picture Only            Can point outside picture

Page 72: Lecture3 Compressed Domain Video Processing

Development of Proposed Approach

Conventional: MPEG Decode → Temporal Processing → Spatial Processing → H.263 Encode, with Motion Estimation

Improved Approach: Exploit bitstream syntax! Temporal Processing → MPEG Decode → Spatial Processing → H.263 Encode, with Motion Estimation

Proposed Approach: Also exploit input coded data! Temporal Processing → MPEG Decode → Spatial Processing → H.263 Encode, with Calculate MVs in place of Motion Estimation

Page 73: Lecture3 Compressed Domain Video Processing

Block Diagram

[Diagram: Drop B frames → MPEG IP Decoder → Spatial Downsample → H.263 IP Encoder. MPEG MVs & Coding Modes → Estimate MV → Refine Search → H.263 MVs & Coding Modes.]

Page 74: Lecture3 Compressed Domain Video Processing

Issue: Match Prediction Modes

• Match MPEG-2 Fields to H.263 Frames
– Vertical downsampling => discard bottom field
– Horizontal downsampling => downsample top field
– Temporal downsampling => drop B frames
• Match MPEG-2 IPPPPIPPPP to H.263 IPPPPPPPPP
• Problems
– Convert MPEG-2 P fields to H.263 P frames
– Convert MPEG-2 I fields to H.263 P frames

MPEG-2 Fields: I(0,t) P(0,b) B(1,t) B(1,b) B(2,t) B(2,b) P(3,t) P(3,b) B(4,t) B(4,b) B(5,t) B(5,b) I(6,t) P(6,b) B(7,t) B(7,b)

H.263 Frames: i(0) p(1) p(2)

Page 75: Lecture3 Compressed Domain Video Processing

Outline

• Compressed-Domain Image Processing

• Compressed-Domain Video Processing

– Review of MPEG basics

– Manipulating temporal dependencies

– Splicing

– Reverse Play

– MPEG-2 to H.263/MPEG-4 Transcoding

• Review and Demo

Page 76: Lecture3 Compressed Domain Video Processing

Compressed-Domain Splicing

[Diagram: Conventional: Decode (2,000 MOPS) + Decode (2,000 MOPS) → Cut & Paste → Encode (20,000 MOPS). Compressed domain: Transcode, < 2,000 MOPS for one splice point every sec.]

SMPTE splicing solution
• User must define splice points during encoding ⇒ Specialized complex encoders.
• Restricted splice points

Splicing solution
• Works with any MPEG stream.
• Frame-accurate splice points

Enables Ad Insertion for DTV

Page 77: Lecture3 Compressed Domain Video Processing

Reverse-Play Transcoding

[Diagram: Conventional: Decode (2,000 MOPS) → Store & Reorder → Encode (20,000 MOPS). Compressed domain: Transcode, < 1,000 MOPS.]

Reverse-play solution
• Frame-by-frame reverse play.
• Works with any MPEG stream.
• Order of magnitude reduction in computational requirements.

VCR functionality for DTV and Set Tops

Page 78: Lecture3 Compressed Domain Video Processing

MPEG-2 to H.263/MPEG-4 Transcoding

MPEG-2 to H.263/MPEG-4 Transcoder
• Transcode MPEG video streams to lower-rate H.263 video streams.
• Interlace-to-progressive conversion.
• Order of magnitude reduction in computational requirements.

Stream DVD movies to portable multimedia devices.

Stream DTV program material over the internet.

[Diagram: HP Media Server → Transcoder / Processor → Low-bandwidth wireless link]

Page 79: Lecture3 Compressed Domain Video Processing

Demo

Page 80: Lecture3 Compressed Domain Video Processing

References

• “Manipulating Temporal Dependencies in Compressed Video Data with Applications to Compressed-Domain Processing of MPEG Video”, S. Wee, ICASSP 1999.
• “Splicing MPEG Video Streams in the Compressed Domain”, S. Wee, B. Vasudev, IEEE Workshop on Multimedia Signal Processing, 1997.
• “Compressed-Domain Reverse Play of MPEG Video Streams”, S. Wee, B. Vasudev, SPIE Inter. Sym. on Voice, Video, and Data Communications, 1998.
• “Reversing Motion Vector Fields”, S.J. Wee, IEEE International Conference on Image Processing, October 1998.
• “Field-to-frame Transcoding with Spatial and Temporal Downsampling”, S. Wee, J. Apostolopoulos, N. Feamster, ICIP 1999.