
Post on 21-Dec-2015

A Picture is Worth a Thousand Words

Milton Chen

What’s a Picture Worth?

• A thousand words - Descartes (1596-1650)

• A thousand bytes - modern translation
  – 1000 * 5 * 5 / 3 ≈ 8,000 bits

• 75,000 bytes - ATSC/MPEG-2
  – 20 M / 30 ≈ 600,000 bits
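
The arithmetic on this slide can be reproduced directly. The slide does not label its factors, so the reading below (1000 words, 5 letters per word, 5 bits per letter, roughly 3:1 text compression, and a 20 Mbit/s channel at 30 frames per second) is an assumption, as are all the variable names.

```python
# Assumed reading of the slide's unlabeled factors (not stated in the original).
WORDS = 1000
LETTERS_PER_WORD = 5   # assumption
BITS_PER_LETTER = 5    # assumption: 2**5 = 32 symbols covers an alphabet
TEXT_COMPRESSION = 3   # assumption: roughly 3:1 lossless text compression

text_bits = WORDS * LETTERS_PER_WORD * BITS_PER_LETTER / TEXT_COMPRESSION
text_bytes = text_bits / 8

CHANNEL_BITS_PER_SECOND = 20_000_000  # "20 M" on the slide
FRAMES_PER_SECOND = 30

frame_bits = CHANNEL_BITS_PER_SECOND / FRAMES_PER_SECOND
frame_bytes = frame_bits / 8

print(f"text:  {text_bits:.0f} bits, {text_bytes:.0f} bytes")
print(f"frame: {frame_bits:.0f} bits, {frame_bytes:.0f} bytes")
```

The unrounded values are about 8,333 and 666,667 bits. The slide rounds them down to 8,000 and 600,000 bits, which is where the 1,000 and 75,000 byte figures come from.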

Frequency Response of the Eye

• Lens - low pass

• Photoreceptors - low pass

• Lateral inhibition - high pass
  – edge is important
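
A toy illustration of why a high-pass stage makes edges stand out. The three-tap center-surround kernel below is an assumption chosen for simplicity, not a model taken from the slides: each sample is reduced by half the sum of its two neighbours, so flat regions cancel and only the edge survives.

```python
def center_surround(signal, surround_weight=0.5):
    """Toy lateral inhibition: each sample minus a weighted sum of its two neighbours.

    Only interior samples are returned, so the output is two samples shorter.
    """
    return [
        signal[i] - surround_weight * (signal[i - 1] + signal[i + 1])
        for i in range(1, len(signal) - 1)
    ]

# A step edge: a dark region followed by a bright region.
step = [10] * 5 + [50] * 5
response = center_surround(step)
print(response)
```

The response is zero everywhere except for a negative and a positive spike on either side of the step.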

Today’s Video Coding

YUV (lossy) -> Motion -> DCT -> Quantize (lossy) -> Order -> Entropy

Designed for natural scenes
=> Higher-frequency DCT coefficients are quantized more
=> Sharp edges are not well preserved
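
The effect can be seen on a single 8-sample row. This is a minimal sketch, assuming a 1-D orthonormal DCT and a made-up table of step sizes that grows with frequency; `STEPS` is hypothetical and is not an MPEG quantization matrix.

```python
import math

N = 8  # block length; a 1-D row stands in for an 8x8 block

def _scale(k):
    return math.sqrt((1 if k == 0 else 2) / N)

def dct(x):
    """Orthonormal DCT-II of a length-N sequence."""
    return [
        _scale(k) * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
        for k in range(N)
    ]

def idct(X):
    """Inverse of dct()."""
    return [
        sum(_scale(k) * X[k] * math.cos(math.pi * (n + 0.5) * k / N) for k in range(N))
        for n in range(N)
    ]

# Hypothetical step sizes: coarser quantization at higher frequency index k.
STEPS = [8 * (1 + 4 * k) for k in range(N)]

def quantize(X):
    """Round each coefficient to the nearest multiple of its step size."""
    return [round(c / s) * s for c, s in zip(X, STEPS)]

def max_error(x):
    """Largest absolute sample error after DCT, quantization, and inverse DCT."""
    y = idct(quantize(dct(x)))
    return max(abs(a - b) for a, b in zip(x, y))

edge = [0, 0, 0, 0, 255, 255, 255, 255]       # sharp edge (text, graphics)
ramp = [255 * n / (N - 1) for n in range(N)]  # smooth gradient (natural scene)

print("edge error:", max_error(edge))
print("ramp error:", max_error(ramp))
```

The sharp edge puts much of its energy into the high-frequency coefficients that the coarse steps round away, so its reconstruction error is several times larger than that of the smooth ramp.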

What’s Wrong with Today’s Video Coding

• Poor performance for
  – text (channel logo, stock ticks)
  – graphics
  – anything with sharp edges

Desirable Features

• Postproduction support

• Personalized delivery / presentation

• Interactive

• Error resilience

• More compression

• Facilitate search / indexing (MPEG-7)

Outline

• Why

• MPEG-4 Overview

• Systems Layer

• Visual Coding
  – Arbitrarily shaped video
  – Meshed video
  – Face and body

Goals of MPEG-4

• One content
  – convergence of DTV, computer graphics, and WWW
  – broadcast, internet, local

• User interactivity

• Higher compression rates

• Robustness in mobile environment

MPEG-4 Applications

• Interactive TV (broadcast)
  – Home-shopping, Interactive game show

• Virtual workspace (internet)
  – virtual meeting, collaborative design

• Infotainment (local)
  – Virtual-City-Guide

MPEG-4 Key Concepts

• Independent coding of objects
  – allow user interactivity (client & server)
  – higher compression rates

• Provide tools as well as solutions
  – allow content specific and user defined compression algorithms

MPEG-4 History

• Started in July 1993

• Originally for low-bit-rate applications

• Version 1 to be standardized by January 1999

• Continue work on version 2, etc.

MPEG-4 Standard

1) Systems (manage streams, composition)

2) Visual (natural and synthetic)

3) Audio (natural and synthetic)

4) Conformance Testing

5) Reference Software

6) Delivery Multimedia Integration Framework (medium abstraction layer)

[Figure: an MPEG-4 audiovisual scene. Audiovisual objects (3D objects, 2D background, voice, sprite) sit in a scene coordinate system (x, y, z). A video compositor projects them onto a projection plane for a hypothetical viewer and an audio compositor handles the sound, producing the audiovisual presentation on the display and speaker. User input generates user events. Control and data travel as hierarchically multiplexed downstream and upstream streams.]

[Figure: MPEG-4 Systems layer model, from the audiovisual interactive scene down to the medium]

Display and User Interaction (Audiovisual Interactive Scene)
Composition and Rendering
Compression Layer: Primitive AV Objects, Scene Description Information, Object Descriptor, ..., Return Channel Coding
  – Elementary Stream Interface (Elementary Streams)
Access Unit Layer: AL, AL, ..., AL
  – Stream Multiplex Interface (AL-Packetized Streams)
FlexMux Layer: FlexMux, FlexMux, ...
  – TransMux Interface (FlexMux Streams)
TransMux Layer: (RTP) UDP IP, (PES) MPEG-2 TS, AAL2 ATM, H223 PSTN, DAB Mux, ...
  – TransMux Streams
Transmission/Storage Medium

Previous Work in Object Coding

• Synthetic High System (Schreiber ‘59)

• Contour-Texture Approach (Kocher & Kunt ‘82)

• Object-Based Video Coder (Musmann et al. ‘89)

• Talisman (Torborg & Kajiya ‘96)

• Blue screen matting (Vlahos ‘64)

Shape Coding

• Bitmap-based
  – 1 means in, 0 means out
  – Chroma-keying, GIF89a
  – G4 fax standard

• Contour-based
  – chain code
  – polygon/curve approximation
  – Fourier descriptor

Chain Code

• Follows the contour and encodes the direction of the next boundary pel

• 4 or 8 directions for an avg. of 1.2 or 1.4 bits per boundary pel

• Extensions
  – length
  – angular resolution
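
A minimal sketch of an 8-direction chain code, assuming the boundary is already available as an ordered, closed list of 8-connected pels. The direction numbering (0 = east, counting counterclockwise, with y pointing down) and the function names are illustrative choices, not taken from the slides.

```python
# Code k maps to the step (dx, dy); y grows downward as in image coordinates.
DIRS8 = [(1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)]

def chain_code(boundary):
    """Encode a closed boundary as one direction code per pel."""
    codes = []
    for (x0, y0), (x1, y1) in zip(boundary, boundary[1:] + boundary[:1]):
        codes.append(DIRS8.index((x1 - x0, y1 - y0)))
    return codes

def decode_chain(start, codes):
    """Rebuild the pel positions from a start point and the direction codes."""
    points = [start]
    x, y = start
    for code in codes:
        dx, dy = DIRS8[code]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points

# Boundary of a 3x3 square, starting at its top-left pel.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
codes = chain_code(square)
print(codes)
```

Stored at fixed length, each code costs 2 bits for 4 directions or 3 bits for 8. The averages quoted on the slide are lower than that, which presumably reflects entropy coding of the direction codes; that step is not shown here.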

Polygon Approximation

• Add control points until maximum error is below threshold

• Threshold <= 1.4 pel for CIF (352*288) video

• Extension
  – curves of various order
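
A minimal sketch of the refinement loop, assuming an open contour and a greedy rule that always adds the contour point farthest from the current polygon (in the spirit of Ramer-Douglas-Peucker). The slides do not say how control points are chosen, so this selection rule and the function names are assumptions.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def approximate(contour, threshold):
    """Return indices of control points whose polygon stays within threshold."""
    control = [0, len(contour) - 1]
    while True:
        worst_err, worst_idx = 0.0, None
        for i0, i1 in zip(control, control[1:]):
            for i in range(i0 + 1, i1):
                err = point_segment_distance(contour[i], contour[i0], contour[i1])
                if err > worst_err:
                    worst_err, worst_idx = err, i
        if worst_idx is None or worst_err <= threshold:
            return control
        control.append(worst_idx)  # add the worst-fitting point as a control point
        control.sort()

# An L-shaped contour: along the x axis, then up the line x = 5.
l_shape = [(x, 0) for x in range(6)] + [(5, y) for y in range(1, 6)]
print(approximate(l_shape, threshold=1.4))
```

With the 1.4 pel threshold from the slide, the L-shaped contour needs only its two endpoints and the corner.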

Fourier Descriptor

• Translation, rotation, and scale invariant

• Sample contour -> ( x_i, y_i )

• Slope at sample i: ( y_{i+1} - y_i ) / ( x_{i+1} - x_i )

• Compute Fourier Series coefficients

• Good for recognition, but not an efficient shape coder
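
A minimal sketch of the invariance properties. Note that it uses the complex-coordinate variant (each sample treated as x + iy), not the slope-based formulation on the slide, because it keeps the example short. The function names and the normalization by the first harmonic are assumptions of this sketch.

```python
import cmath
import math

def dft(z):
    """Plain O(n^2) discrete Fourier transform of a complex sequence."""
    n = len(z)
    return [
        sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def fourier_descriptor(contour):
    """Translation-, rotation-, and scale-invariant descriptor of a closed contour."""
    coeffs = dft([complex(x, y) for x, y in contour])
    scale = abs(coeffs[1])
    if scale < 1e-12:
        raise ValueError("first harmonic is zero; try reversing the contour order")
    # Dropping coeffs[0] removes translation, dividing by |coeffs[1]| removes
    # scale, and taking magnitudes removes rotation.
    return [abs(c) / scale for c in coeffs[2:]]

def similarity_transform(contour, angle, scale, shift):
    """Rotate, scale, and translate a contour (helper for the demonstration)."""
    a = scale * cmath.exp(1j * angle)
    moved = [a * complex(x, y) + complex(*shift) for x, y in contour]
    return [(w.real, w.imag) for w in moved]

square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
moved = similarity_transform(square, angle=math.pi / 6, scale=2.5, shift=(10, -4))

d1 = fourier_descriptor(square)
d2 = fourier_descriptor(moved)
print(max(abs(p - q) for p, q in zip(d1, d2)))  # difference is at rounding-error level
```

The two descriptors agree up to floating-point rounding, which is what makes this representation attractive for recognition.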

MPEG-4 Experiments

• Chroma-keying
  – color bleeding
  – need to decode whole frame to get shape

• Bitmap and contour-based coding are similar in:
  – error resilience
  – coding efficiency

• Bitmap-based is simpler for hardware due to regular memory access

MPEG-4 Shape Coding

• Three types of macroblocks
  – transparent, opaque, and object boundary

• Context-based arithmetic encoder

• Macroblocks can be subsampled

• Texture padded with 0 or mean value

• Transparency
  – constant: one 8 bit value
  – arbitrary: treat it like color
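
A minimal sketch of the three-way macroblock classification, assuming the shape is given as a binary mask (1 inside the object, 0 outside) and a default block size of 16 pels. The function name is illustrative. Only the boundary blocks would then need their shape coded pel by pel; the context-based arithmetic coder itself is not shown.

```python
def classify_macroblocks(mask, block=16):
    """Label each block of a binary mask as transparent, opaque, or boundary."""
    rows, cols = len(mask), len(mask[0])
    labels = []
    for by in range(0, rows, block):
        row_labels = []
        for bx in range(0, cols, block):
            pels = [
                mask[y][x]
                for y in range(by, min(by + block, rows))
                for x in range(bx, min(bx + block, cols))
            ]
            if not any(pels):
                row_labels.append("transparent")  # entirely outside the object
            elif all(pels):
                row_labels.append("opaque")       # entirely inside the object
            else:
                row_labels.append("boundary")     # the contour crosses this block
        labels.append(row_labels)
    return labels

# Small example with 2x2 blocks so the result is easy to check by eye.
mask = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
for row in classify_macroblocks(mask, block=2):
    print(row)
```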

Meshed Video

• 2D mesh tessellates the video into patches

• Motion vector for each vertex

• Texture warped in each patch
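
A minimal sketch of the per-patch warp, assuming triangular patches and an affine warp obtained by blending the three vertex motion vectors with barycentric weights. The slides do not specify the patch shape or the interpolation, so both are assumptions, and the function names are illustrative.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle a, b, c."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb

def warp_point(p, triangle, motion):
    """Move p by the motion vector interpolated from the triangle's vertices."""
    weights = barycentric(p, *triangle)
    dx = sum(w * mv[0] for w, mv in zip(weights, motion))
    dy = sum(w * mv[1] for w, mv in zip(weights, motion))
    return (p[0] + dx, p[1] + dy)

# One patch whose second vertex moves 4 pels to the right:
# the patch is stretched horizontally by a factor of two.
triangle = [(0, 0), (4, 0), (0, 4)]
motion = [(0, 0), (4, 0), (0, 0)]

print(warp_point((4, 0), triangle, motion))  # the moving vertex itself
print(warp_point((1, 1), triangle, motion))  # an interior point
```

Because each point gets its own interpolated displacement, a patch can stretch, rotate, or shear, which a single translational vector per block cannot express.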

Meshed Video - Motivation

• Motion Modeling
  – Translational-block motion does not model rotation, scaling, reflection, and shear

• Shape Modeling
  – Possible without depth

Meshed Video - Applications

• Compression
  – better motion compensation
  – transmit texture only at key frames
  – spatio-temporal interpolation (zooming, frame-rate up-conversion)

• Manipulation
  – augmented reality
  – transfiguration (replace billboards)

• Indexing / searching

Face

• Face object
  – Default face model with terminal
  – Facial Definition Parameter or user supplied model/texture
  – Facial Animation Parameter plus Amplification and Filters
  – Lip Shape Animation from phoneme

Facial Definition Parameter

[Figure: facial feature points, numbered by group from 2.x to 11.x and drawn on head views with X, Y, Z axes, with detail views of the left eye, right eye, nose, mouth, tongue, and teeth. The legend distinguishes feature points affected by FAPs from other feature points.]

Facial Animation Parameter

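[Figure: neutral face marked with the reference distances ES0 (eye separation), ENS0 (eye-nose separation), MNS0 (mouth-nose separation), MW0 (mouth width), and IRISD0 (iris diameter)]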

Body

• Like the face

Ultimate Compression Technique: Computer Graphics ???

• Block based DCT (MPEG-1/2)

• Arbitrarily shaped video (MPEG-4)

• Meshed video (MPEG-4)

• Image based rendering

• Textured 3D graphics

• Geometry only 3D graphics