2004.11.18 - SLIDE 1 IS 202 – FALL 2004 Lecture 23: Multimedia Information Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2003 http://www.sims.berkeley.edu/academics/courses/ is202/f04/ SIMS 202: Information Organization and Retrieval


2004.11.18 - SLIDE 1 - IS 202 – FALL 2004

Lecture 23: Multimedia Information

Prof. Ray Larson & Prof. Marc Davis

UC Berkeley SIMS

Tuesday and Thursday 10:30 am - 12:00 pm

Fall 2003
http://www.sims.berkeley.edu/academics/courses/is202/f04/

SIMS 202: Information Organization and Retrieval

SLIDE 2

Today’s Agenda

• Problem Setting
• New Solutions
  – Media Streams
  – Active Capture
  – Adaptive Media
• Discussion Questions
• Action Items for Next Time


SLIDE 4

Global Media Network

• Digital media produced anywhere by anyone accessible to anyone anywhere

• Today’s media users become tomorrow’s media producers

• Not 500 Channels — 500,000,000 multimedia Web sources

SLIDE 5

Media Asset Management and Reuse

• Media Asset Management
  – Corporate
    • Media companies, media archives, training, sales, catalogs, etc.
  – Government
    • Military, surveillance, law enforcement, etc.
  – Academia
    • Libraries, research, instruction, etc.
  – Consumer
    • Home video and photos, fan reuse of popular content, etc.

SLIDE 6

Applications of Analysis and Retrieval

• Professional and educational applications
  – Automated authoring of Web content
  – Searching and browsing large video archives
  – Easy access to educational materials
  – Indexing and archiving multimedia presentations
  – Indexing and archiving multimedia collaborative sessions
• Consumer domain applications
  – Video overview and access
  – Video content filtering
  – Enhanced access to broadcast video

SLIDE 7

The Media Opportunity

• Vastly more media will be produced
• Without ways to manage it (metadata creation and use) we lose the advantages of digital media
• Most current approaches are insufficient and perhaps misguided
• Great opportunity for innovation and invention
• Need interdisciplinary approaches to the problem

SLIDE 8

What is the Problem?

• Today people cannot easily find, edit, share, and reuse media
• Computers don’t understand media content
  – Media is opaque and data rich
  – We lack structured representations
• Without content representation (metadata), manipulating digital media will remain like word-processing with bitmaps

SLIDE 9

Signal-to-Symbol Problems

• Semantic Gap
  – Gap between low-level signal analysis and high-level semantic descriptions
  – “Vertical off-white rectangular blob on blue background” does not equal “Campanile at UC Berkeley”

SLIDE 10

Signal-to-Symbol Problems

• Sensory Gap
  – Gap between how an object appears and what it is
  – Different images of same object can appear dissimilar
  – Images of different objects can appear similar
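A toy illustration of the sensory gap follows. It is a minimal sketch, not something from the lecture: the `color_histogram` helper and the three tiny "images" of named colors are hypothetical stand-ins for a low-level signal feature and real pixel data.

```python
from collections import Counter


def color_histogram(image):
    """Count how often each color label occurs, ignoring pixel positions."""
    return Counter(pixel for row in image for pixel in row)


# Hypothetical 2x2 "images": a white tower on sky, and a white sail on water.
tower = [["blue", "white"],
         ["blue", "white"]]
sail = [["white", "blue"],
        ["white", "blue"]]
# The same tower photographed at dusk: same object, different signal.
tower_dusk = [["navy", "gray"],
              ["navy", "gray"]]

# Images of different objects can appear similar to the feature extractor.
same_signal = color_histogram(tower) == color_histogram(sail)
# Different images of the same object can appear dissimilar.
same_object_differs = color_histogram(tower) != color_histogram(tower_dusk)

print(same_signal, same_object_differs)
```

The histogram alone cannot tell the tower from the sail, and it cannot tell that the two tower images show the same thing, which is why signal features by themselves do not yield symbols.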

SLIDE 11

Computer Vision and Context

• You go out drinking with your friends
• You get drunk
• Really drunk
• You get hit over the head and pass out
• You are flown to a city in a country you’ve never been to with a language you don’t understand and an alphabet you can’t read
• You wake up face down in a gutter with a terrible hangover
• You have no idea where you are or how you got there
• This is what it’s like to be most computer vision systems: they have no context and no memory
• Context and memory are what enable us to understand what we see

SLIDE 12

Disabling Assumptions

1. Media capture and media analysis are separated in time and space
   – Therefore removed from their context of creation and the users who created them
2. Contextual metadata about the capture and use of media are not available to media analysis
   – Therefore all analysis of media content must be focused on the media signal alone
3. Multimedia content analysis must be fully automatic
   – Therefore missing out on the possibility of “human-in-the-loop” approaches to algorithm design and network effects of groups of users

SLIDE 13

Enabling Assumptions

1. Integrate media capture and analysis at the point of capture and throughout the media lifecycle
2. Leverage contextual metadata (spatial, temporal, social, etc.) about the capture and use of media content
3. Design systems that incorporate human beings as functional components and aggregate user behavior
   • Human-in-the-loop algorithms
   • Network effects of the aggregation and analysis of human activity and media use
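One way to picture the second assumption is to let capture context narrow the labels a signal-only analysis cannot separate. The sketch below is an illustration under stated assumptions, not the lecture's method: the `LANDMARKS` gazetteer, its approximate coordinates, and the `rank_by_capture_location` helper are all hypothetical.

```python
import math

# Hypothetical gazetteer of tower-like landmarks: name -> (latitude, longitude).
# Coordinates are approximate and for illustration only.
LANDMARKS = {
    "Campanile at UC Berkeley": (37.8721, -122.2578),
    "Coit Tower": (37.8024, -122.4058),
    "Hoover Tower": (37.4277, -122.1670),
}


def rank_by_capture_location(lat, lon):
    """Order candidate labels by distance from where the photo was taken."""
    def distance(name):
        lm_lat, lm_lon = LANDMARKS[name]
        # Plain Euclidean distance in degrees: crude, but enough to rank
        # candidates that are tens of kilometers apart.
        return math.hypot(lm_lat - lat, lm_lon - lon)
    return sorted(LANDMARKS, key=distance)


# A photo of a "vertical off-white rectangular blob" taken on the Berkeley campus.
ranked = rank_by_capture_location(37.8719, -122.2585)
print(ranked[0])
```

The signal is the same blob in every case; the spatial metadata recorded at capture time is what makes one label far more likely than the others.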

SLIDE 14

[Figure: Traditional Media Production Chain compared with Metadata-Centric Production Chain. Stage labels: Pre-Production, Production, Post-Production, Distribution. Additional label: Metadata.]

SLIDE 15

Automated Media Production Process

[Figure: process diagram. Numbered stages: 1 Active Capture, 2 Annotation and Retrieval, 3 Automatic Editing, 4 Personalized/Customized Delivery. Other labels: Adaptive Media Engine; Reusable Online Asset Database; Annotation of Media Assets; Asset Retrieval and Reuse; Web Integration and Streaming Media Services; Flash Generator; MMS; XHTML; Print/Physical Media.]

SLIDE 16

Chang: Content-Based Media Analysis

• “Traditional views of content-based technologies focus on search and retrieval—which is important but relatively narrow.”

• “[…] emphasizing the end-to-end content chain and the many issues evolving around it. What’s the best way to integrate manual and automatic solutions in different parts of the chain?”

SLIDE 17

Chang: Content-Based Media Technology

• Practical impact criteria for evaluating multimedia research directions
  – Generating metadata not available from production
  – Providing metadata that humans aren’t good at generating
  – Focusing on content with large volume and low individual value
  – Adopting well-defined tasks and performance metrics

SLIDE 18

Chang: Content-Based Media Technology

• Areas of research
  – Reverse engineering of the media capturing and editing processes
  – Extracting and matching objects
  – Meaning decoding and automatic annotation
  – Analysis and retrieval with user feedback
  – Generating time-compressed skims
  – Efficient indexing for large databases
  – Content adaptation for accessing multimedia over heterogeneous devices
  – Standards for specifying content description languages and schemes, like MPEG-7

SLIDE 19

Computational Media Aesthetics

• “ […] the algorithmic study of a variety of image and aural elements in media (based on their use in film grammar). It is also the computational analysis of the principles that have emerged underlying their manipulation in the creative art of clarifying, intensifying, and interpreting an event for an audience.”

• “Our research systematically uses film grammar to inspire and underpin an automated process of analyzing, characterizing, and structuring professionally produced videos.”

SLIDE 20

CMA Challenges

• Can we dynamically detect successful aesthetic principles with accuracy and consistency using computational analysis?

• Can we build new postproduction tools based on this analysis for rapid, cost-efficient, and effective moviemaking and consistent evaluation?

• How can we use these successful audio–visual strategies for improved training and education in mass communication?

• How do we raise the quality of media annotation and improve the usability of content-based video search and retrieval systems?


SLIDE 22

Garage Cinema Research

• Research and develop technology and applications that will enable daily media consumers to become daily media producers
• Theory, design, and development of digital media systems that
  – Create descriptions of media content and structure (metadata)
  – Use metadata to automate media production and reuse

SLIDE 23

Research Projects

• Media Streams
  – A framework for creating metadata throughout the media production cycle to enable media reuse
• Active Capture
  – Automates direction and cinematography using real-time audio-video analysis in an interactive control loop to create reusable media assets
• Adaptive Media
  – Uses adaptive media templates and automatic editing functions to mass customize and personalize media
• Mobile Media Metadata
  – Leverages the spatio-temporal context and social community of media capture to automate metadata creation for mobile media
• Social Uses of Personal Media
  – Analysis of social uses of media to predict future uses and shape the design of next-generation personal media devices and applications



SLIDE 26

Media Metadata: Media Streams

SLIDE 27

Media Streams Features

• Key features
  – Stream-based representation (better segmentation)
  – Semantic indexing (what things are similar to)
  – Relational indexing (who is doing what to whom)
  – Temporal indexing (when things happen)
  – Iconic interface (designed visual language)
  – Universal annotation (standardized markup schema)
• Key benefits
  – More accurate annotation and retrieval
  – Global usability and standardization
  – Reuse of rich media according to content and structure
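The stream-based, relational, and temporal indexing features can be sketched together as time-span annotations on a single stream. This is a minimal illustration in Python and not the Media Streams data model: the `Annotation` class, its field names, and the `overlapping` query are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: float    # seconds from the start of the stream
    end: float      # seconds, exclusive
    subject: str    # who
    action: str     # is doing what
    obj: str = ""   # to whom (optional)


def overlapping(annotations, start, end):
    """Return the annotations whose time span intersects [start, end)."""
    return [a for a in annotations if a.start < end and start < a.end]


# Annotations overlap freely instead of being tied to pre-cut clips.
stream = [
    Annotation(0.0, 12.0, "woman", "walks"),
    Annotation(8.0, 15.0, "man", "waves at", "woman"),
    Annotation(20.0, 30.0, "dog", "runs"),
]

hits = overlapping(stream, 10.0, 11.0)
print([(a.subject, a.action, a.obj) for a in hits])
```

Because each annotation carries its own start and end, a query can carve out whatever segment fits the question instead of relying on a fixed segmentation chosen in advance.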



SLIDE 30

Creating Metadata During Capture

• Current Capture Paradigm: Multiple Captures To Get 1 Good Capture
• New Capture Paradigm: 1 Good Capture Drives Multiple Uses

SLIDE 31

Active Capture

[Figure: diagram placing Active Capture among Capture, Interaction, and Processing, with the labels Direction/Cinematography, Human Computer Interaction, and Computer Vision/Audition.]

SLIDE 32

Active Capture

• Active engagement and communication among the capture device, agent(s), and the environment

• Re-envision capture as a control system with feedback

• Use multiple data sources and communication to simplify the capture scenario

• Use HCI to support “human-in-the-loop” algorithms for computer vision and audition

SLIDE 33

Human-In-The-Loop Algorithms

• Leverage what humans and computers are respectively good at
  – Example:
    • Object recognition and tracking
• Leverage interaction with the situated human agent
  – Examples:
    • Activity recognition (Jump detector with “Simon Says” interaction)
    • Object recognition (Car finder with “Treasure Hunt” interaction)
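The “Simon Says” jump detector can be read as a small feedback loop: prompt the person, analyze what the camera saw, and prompt again if the detector did not fire. The sketch below is a simulation under stated assumptions; `simon_says_jump`, the threshold-based `detect_jump`, and the displacement values are hypothetical and do not describe the actual Active Capture implementation.

```python
def simon_says_jump(detect_jump, observations, max_prompts=3):
    """Prompt for a jump until the detector fires or the prompts run out.

    detect_jump: callable taking one observation and returning True or False.
    observations: iterable of what the camera measured after each prompt.
    Returns (captured observation or None, list of prompts issued).
    """
    prompts = []
    for attempt, seen in zip(range(1, max_prompts + 1), observations):
        if attempt == 1:
            prompts.append("Simon says: jump!")
        else:
            prompts.append("Try again: jump higher!")
        if detect_jump(seen):
            return seen, prompts
    return None, prompts


def detect_jump(displacement_cm, threshold_cm=20):
    """Stand-in detector: vertical displacement above a threshold is a jump."""
    return displacement_cm > threshold_cm


# Simulated vertical displacements measured after each of three prompts.
shot, prompts = simon_says_jump(detect_jump, [5, 12, 31])
print(shot, len(prompts))
```

The detector stays simple because the interaction does the hard part: the system asked for a jump, so it only needs to confirm one rather than recognize arbitrary activity.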

SLIDE 34

Active Capture Setup

SLIDE 35

Active Capture

SLIDE 36

Active Capture: Reusable Shots



SLIDE 39

Marc Davis in T2 Trailer

SLIDE 40

Evolution of Media Production

• Customized production
  – Skilled creation of one media product
• Mass production
  – Automatic replication of one media product
• Mass customization
  – Skilled creation of adaptive media templates
  – Automatic production of customized media

SLIDE 41

Editing Paradigm Has Not Changed

SLIDE 42

Computational Media

• More intimately integrate two great 20th century inventions

SLIDE 43

Central Idea: Movies as Programs

• Movies change from being static data to programs
• Shots are inputs to a program that computes new media based on content representation and functional dependency (US Patents 6,243,087 & 5,969,716)

[Figure: diagram with the labels Media, Parser, Content Representation, and Producer.]

SLIDE 44

Automatic Video and Audio Editing

Automatically edit the output movie based on content representation of dialogue and sound

Example of editing based on dialogue

Example of synchronizing video to music

SLIDE 45

[Video examples: 1-Shot/2-Shot/Cutaway; L-Cutting]

SLIDE 46

Automatic Audio-Video Synchronization

[Video examples: Raw Celery Chopping Video; U2 “Numb” Audio; Unsynched Numb Celery Music Video; Synched Numb Celery Music Video]
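The slide does not say how the synchronization is computed. One simple way to picture it is to move each visual event (a chop) to the closest musical beat. The sketch below assumes that approach; `snap_to_beats` and all the timing values are hypothetical.

```python
def snap_to_beats(event_times, beat_times):
    """Map each video event time to the closest audio beat time."""
    return [min(beat_times, key=lambda b: abs(b - t)) for t in event_times]


# Hypothetical content representation: when chops occur in the raw video
# and when beats occur in the song, both in seconds.
chops = [0.4, 1.1, 2.2, 2.9]
beats = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

synced = snap_to_beats(chops, beats)
# How far each chop has to be shifted to land on its beat.
offsets = [round(b - t, 2) for t, b in zip(chops, synced)]
print(synced, offsets)
```

The point of the example is that the edit is computed from descriptions of the two media (event times and beat times) rather than from the raw signals, so the same function would work for any footage and any song that come with such descriptions.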

SLIDE 47

Adaptive Media Design Space

[Figure: design space with the axes Content and Structure, marked Author-Generated and Not Author-Generated. Regions labeled: Compilation Movie Making, Traditional Movie Making, Historical Documentary Movie Making.]

SLIDE 48

Adaptive Media Design Space

[Figure: design space with the axes Content and Structure, marked Author-Generated and Not Author-Generated. Regions labeled: Compilation Movie Making, Traditional Movie Making, Historical Documentary Movie Making, Video Lego (structure is constrained), Video MadLibs (structure is determined).]

SLIDE 49

The Blank Page Approach

SLIDE 50

Captain Zoom IV MadLib™

SLIDE 51

Constructing With Lego™ Blocks

SLIDE 52

Video MadLibs and Video Lego

• Video MadLibs
  – Adaptive media template with open slots
  – Structure is fixed
  – Content can be varied
• Video Lego
  – Reusable media components that know how to fit together
  – Structure is constrained
  – Content can be varied
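The difference between a fixed and a constrained structure can be sketched in a few lines. This is an illustration only: the slot names, file names, and the `entry`/`exit` fields used to decide whether components fit are hypothetical, not the lecture's representation.

```python
def fill_madlib(template, clips):
    """Video MadLibs: the slot order is fixed; only the content varies."""
    missing = [slot for slot in template if slot not in clips]
    if missing:
        raise KeyError(f"no clip for slots: {missing}")
    return [clips[slot] for slot in template]


def fits(sequence):
    """Video Lego: any order is allowed as long as neighbors fit together.

    Here "fit" means a component's exit state matches the next one's entry.
    """
    return all(a["exit"] == b["entry"] for a, b in zip(sequence, sequence[1:]))


# A fixed three-slot template filled with one person's clips.
template = ["hero_enters", "hero_screams", "explosion"]
clips = {
    "hero_enters": "anna_walk.mov",
    "hero_screams": "anna_scream.mov",
    "explosion": "stock_boom.mov",
}
movie = fill_madlib(template, clips)

# Components that describe how they connect to their neighbors.
walk = {"name": "walk", "entry": "outside", "exit": "doorway"}
enter = {"name": "enter", "entry": "doorway", "exit": "inside"}

print(movie, fits([walk, enter]), fits([enter, walk]))
```

In the MadLibs case the author decides the sequence once and users swap content in; in the Lego case the sequence itself is open, and the metadata on each component is what rules out combinations that do not make sense.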


SLIDE 54

Discussion Questions (Chang)

• Jen King on “The Holy Grail of Content-Based Media Analysis”
  – Chang mentions three projects his lab has been working on:
    • Live sports video filtering
    • Medical video indexing and summarizing
    • Computational parsing and skimming of films
  – What types of consumer-focused applications would benefit from content-based media analysis?

SLIDE 55

Discussion Questions (Chang)

• Jen King on “The Holy Grail of Content-Based Media Analysis”

– One of the impact criteria Chang mentions is “focusing on content with large volume and low individual value,” such as home/family videos. What value is there to be gained by annotating millions of hours of weddings, graduations, and birthday parties?

SLIDE 56

Discussion Questions (CMA)

• Tim Dennis on “Computational Media Aesthetics”
  – Dorai and Venkatesh propose a “computational media aesthetics” and its potential use of film grammar to create future tools that will allow mass adoption of “successful techniques” of media production. Is it possible to use film grammar (pacing, tempo, lighting) to identify “successful aesthetic” principles?

SLIDE 57

Discussion Questions (CMA)

• Tim Dennis on “Computational Media Aesthetics”
  – Dorai and Venkatesh describe creating a framework for computationally determined elements based on “basic devices” of film grammar (shot, motion, recording distances, and practices) and using these primitive features to build up higher-order semantics based on production knowledge and film grammar. Will the notion of production knowledge and film grammar be something that is culturally situated? Will there be different film grammars for each film production milieu, e.g., Bollywood, Hollywood, etc.?

SLIDE 58

Discussion Questions (Davis)

• Andrew Iskandar on “Editing Out Video Editing”
  – The article mentions briefly the concept of “Video Lego” where a “set of reusable media components” will “know how to fit together.” This shifts the idea of computational video creation from merely switching around paradigmatic media elements in a fixed syntagmatic structure (Video Mad Libs) to creation of syntagmatic structure. What issues and challenges do you foresee in this idea? Will ‘Video Lego’ ever ‘know’ enough to make video media productions?

SLIDE 59

Discussion Questions (Davis)

• Andrew Iskandar on “Editing Out Video Editing”
  – Active Capture is used to create different paradigmatic media elements outside of a particular context (e.g., screaming, turning of head, etc.). This creates the building blocks for computational media production. Does the fact that these media elements are created outside of a particular context present any problems and challenges? How can they be resolved?

SLIDE 60

Discussion Questions (Davis)

• Andrew Iskandar on “Editing Out Video Editing”
  – This concept of pulling content out of context so that video media elements can be reused has further applications. We’ve seen it in document engineering through the XML class. To what other areas, in media creation or elsewhere, can this idea of pulling content out of context to promote reusability apply? Graphic Art? Music? Poetry? Why is it more challenging in certain fields than others?


SLIDE 62

Assignment 7

• Metadata Consolidation
  – Excel Phase
  – RDF Phase
• Taxonomy Items
  – Ontology
• MMMBase Items
  – Facet Syntax
• AnnotationBase Items
  – Exposed in UI
• Relations
  – Semantics beyond subclasses

SLIDE 63

Assignment 7

• Protégé
  – Workshop
    • Monday at 1:00 pm

SLIDE 64

Next Time

• Metadata for Motion Pictures: Media Streams and MPEG-7
• Readings for next time
  – “Media Streams: An Iconic Visual Language for Video Representation” (Davis)
    • Jennifer
  – “MPEG-7 (Part 1)” (Martinez, Koenen, Pereira)
    • JingHua
  – “MPEG-7 (Part 2)” (Martinez, Koenen, Pereira)
    • Sarita