semantics and multimedia

Advances in Semantic Analysis of Multimedia

Dr. Gerald Friedland
International Computer Science Institute
Berkeley, CA
friedland@icsi.berkeley.edu

The Internet Today

Internet Use Today

Raphaël Troncy: Linked Media: Weaving non-textual content into the Semantic Web, MozCamp, 03/2009.

Types of Videos

Addressable Market for Enterprise Video Applications

• Security: $1.2 billion (total market $7.8B, 2005; source: JP Freeman) ($7B in 2006; source: Lehman)
• Asset Tracking: $480M by 2010 (RFID in 2006: $2.4B; total asset protection: $14.7B; source: Lehman report 2006)
• QA/Operational Efficiency: $700M (source: Envysion, Arrowsight, corporate analysis)
• Training: $600M (source: Forrester Enterprise Software report 2005)
• Compliance: $450M (source: JP Freeman)
• BI: $400M (Reporting and Analysis: $4B; total BI market: $13.3B; source: IDC BI tools 03-08)
• Intelligent Marketing: $200M (source: T3CI corporate analysis)
• Government (Intelligence, Defense, Homeland Security)

$4.0 Billion Commercially
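As a quick plausibility check, the commercial segment figures on this slide can be added up. The sketch below assumes that the "$4.0 Billion Commercially" figure is the sum of the seven commercial segments and that Government is excluded because the slide gives no number for it; the slide itself does not state this, and the variable names are illustrative only.

```python
# Segment sizes in millions of US dollars, as listed on the slide.
# Assumption: Government is left out because the slide gives no figure for it.
segments_musd = {
    "Security": 1200,
    "Asset Tracking": 480,
    "QA/Operational Efficiency": 700,
    "Training": 600,
    "Compliance": 450,
    "BI": 400,
    "Intelligent Marketing": 200,
}

total_musd = sum(segments_musd.values())
total_busd = total_musd / 1000  # convert millions to billions

print(f"Commercial total: ${total_busd:.2f}B")  # Commercial total: $4.03B
```

The segments sum to about $4.03 billion, which is consistent with reading the $4.0 billion figure as the rounded commercial total.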

Multimedia Capabilities: 1985

• Record
• Store
• Play
• Random Seek
• Annotate Manually

Multimedia Capabilities: 2009

• Record
• Store
• Stream
• Play
• Random Seek
• Annotate Manually

Multimedia Capabilities: Wanted

• Semantic Navigation
• Search
• Content Compare
• Object Cut & Paste
• Annotate Automatically
• Infer over Content

=> Make multimedia “understandable” for computers.

Problems

• Multimedia data is very dense, so manual annotation is not feasible.

• Multimedia content analysis is difficult and rarely good enough to create reliable products.

My Research...

[Diagram of research areas. Labels: Features; Recognition; Understanding; Filtering; Machine Learning; Context; Audio, Images, Video, Text; Semantic Computing; Artificial Intelligence; Signal/Text Processing; Knowledge Network; Semantic Web]

My Research...

Hypotheses:

• Multimedia content analysis works better when every cue is taken into account (e.g., video AND audio).
• Semantics is enabled through context. This converts AI research into products.
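One common way to take several cues into account is late fusion: each modality produces its own confidence score and the scores are combined afterwards. The following is a minimal sketch of weighted late fusion, not the method used in the talk; the function name `late_fusion`, the modality names, and the weights are all assumptions made for illustration.

```python
def late_fusion(scores, weights):
    """Combine per-modality confidence scores into one weighted average.

    scores:  dict mapping a modality name to a score in [0, 1]
    weights: dict mapping a modality name to a non-negative weight
    Only modalities present in `scores` contribute to the result.
    """
    total_weight = sum(weights[m] for m in scores)
    if total_weight == 0:
        raise ValueError("at least one modality needs a positive weight")
    return sum(weights[m] * s for m, s in scores.items()) / total_weight


# Hypothetical detector outputs for "person is speaking" in one video frame.
frame_scores = {"audio": 0.9, "video": 0.5}
equal_weights = {"audio": 1.0, "video": 1.0}

fused = late_fusion(frame_scores, equal_weights)
print(round(fused, 3))
```

With equal weights the fused score is simply the mean of the two cues; giving a more reliable modality a larger weight pulls the result toward that modality.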

Context

Sources of Context:

• Inclusion of prior knowledge
• Combination of algorithms
• Multimodality:
  – audio+video+...
  – extra hardware
• Human interaction
• ...

Context as Key: Example 1

Visual Object Extraction

Simple Interactive Object Extraction (SIOX)

Image → User Input → Output

Context delivered by human interaction

SIOX: Algorithm Idea

Color Signatures from image retrieval:

Y. Rubner, C. Tomasi, and L. J. Guibas: The Earth Mover’s Distance as a Metric for Image Retrieval. Int. Journal of Computer Vision, 40(2):99–121, 2000.

Idea: Instead of searching an image database, use Color Signatures to search inside an image.
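The idea can be illustrated with a heavily simplified sketch: build one set of representative colors from pixels the user marked as foreground and another from known background pixels, then label every other pixel by whichever set contains the closest color. This is not the SIOX implementation. The coarse grid quantization used here is a stand-in for real color signatures, plain RGB is used instead of a perceptual color space, and the names `color_signature` and `classify` are hypothetical.

```python
from collections import defaultdict
from math import dist


def color_signature(pixels, cell=32):
    """Return one mean color per occupied cube of side `cell` in RGB space.

    This grid quantization is a simplified stand-in for a color signature.
    """
    buckets = defaultdict(list)
    for p in pixels:
        key = tuple(channel // cell for channel in p)
        buckets[key].append(p)
    return [
        tuple(sum(channel) / len(group) for channel in zip(*group))
        for group in buckets.values()
    ]


def classify(pixel, fg_signature, bg_signature):
    """Label a pixel by the nearest representative color (Euclidean distance)."""
    d_fg = min(dist(pixel, c) for c in fg_signature)
    d_bg = min(dist(pixel, c) for c in bg_signature)
    return "foreground" if d_fg < d_bg else "background"


# Hypothetical user input: a few reddish foreground and greenish/gray background samples.
fg_samples = [(220, 30, 30), (210, 40, 35)]
bg_samples = [(30, 180, 40), (35, 170, 50), (120, 120, 120)]

fg_sig = color_signature(fg_samples)
bg_sig = color_signature(bg_samples)

print(classify((200, 50, 40), fg_sig, bg_sig))   # foreground
print(classify((40, 160, 45), fg_sig, bg_sig))   # background
```

In this toy version the user's rough marking is the only context the classifier gets, which mirrors the role of human interaction described on the preceding slide.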

SIOX in GIMP

SIOX Button

G. Friedland, K. Jantz, T. Lenz, F. Wiesel, R. Rojas: “Object Cut and Paste in Images and Videos”, International Journal of Semantic Computing Vol 1, No 2, pp. 221-247, World Scientific, USA, June 2007.

SIOX in Inkscape

SIOX in Blender

Extensions

Extracting multiple similar objects at once:

Sub-Pixel Refinement

Problem: Spill colors and foreground disappearance

Original SIOX GraphCut

Sub-Pixel Refinement

Detail Refinement Brush: Coarse Interaction

VideoSIOX

1st Frame:

Subsequent Frames:

More Information

http://www.siox.org

Shoesurfer

Context as Key: Example 2

Speaker Diarization: Who Spoke When?

Audiotrack:

Segmentation:

Clustering:

G. Friedland, O. Vinyals, Y. Huang, C. Müller: “Prosodic and other Long-Term Features for Speaker Diarization”, IEEE Transactions on Audio, Speech, and Language Processing, Vol 17, No 5, pp 985--993, July 2009.
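The two stages named on the slide, segmentation and clustering, can be sketched on toy data. The code below is not the method of the cited paper: it assumes a single hypothetical feature value per audio frame, splits the track wherever that value jumps, and then greedily groups segments with similar mean values. The function names and thresholds are illustrative assumptions.

```python
def segment(features, threshold):
    """Split the frame sequence wherever consecutive values differ by more than `threshold`.

    Returns a list of (start, end) frame index pairs, end exclusive.
    """
    bounds = [0]
    for i in range(1, len(features)):
        if abs(features[i] - features[i - 1]) > threshold:
            bounds.append(i)
    bounds.append(len(features))
    return list(zip(bounds, bounds[1:]))


def cluster(features, segments, threshold):
    """Greedy clustering: assign each segment to the first cluster whose
    center lies within `threshold` of the segment mean, else open a new cluster.

    Cluster centers are fixed at the mean of their first segment.
    """
    centers, labels = [], []
    for start, end in segments:
        mean = sum(features[start:end]) / (end - start)
        for k, center in enumerate(centers):
            if abs(mean - center) <= threshold:
                labels.append(k)
                break
        else:
            centers.append(mean)
            labels.append(len(centers) - 1)
    return labels


# Hypothetical per-frame feature: low values for one speaker, high for another.
track = [1.0, 1.1, 0.9, 5.0, 5.2, 1.0, 1.05]

segments = segment(track, threshold=2.0)
speakers = cluster(track, segments, threshold=2.0)

for (start, end), speaker in zip(segments, speakers):
    print(f"frames {start}-{end - 1}: speaker {speaker}")
```

The output answers "who spoke when" for the toy track: the first and last segments are assigned to the same speaker label, and the middle segment to a second one.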

Analyzing Meetings

Dominance Estimation

I Know You...

http://www.icsi.berkeley.edu/~fractor/ioda_demo.avi

Narrative Theme Navigation

G. Friedland, L. Gottlieb, A. Janin: “Joke-o-mat: Browsing Sitcoms Punchline by Punchline”, Proceedings of ACM Multimedia, Beijing, China, October 2009.

Joke-O-Mat: Demo

http://www.youtube.com/watch?v=1qfa84Ulm5s

Connecting Multimedia and Semantic Technologies

[Architecture diagram: Appscio Semantic Media Framework. Labels: GStreamer; Source Recorder; Components 1 to n; Device Driver; Custom Event Sources 1 to n; C/C++/Java Interface; Pipeline Framework; Video Application Server; Scripting & Logic Engine; Web Technology Interface; Events; Integrated Development Environment; Services Connector]

http://www.appscio.com

Semantic Analysis of Multimedia Data

• enables automatic logical inference on perceptually encoded data
• enables more “natural” interaction with the computer: “do what the user means”
• interfaces nicely with Semantic Web technologies

A note...

James A. Hendler

Upcoming...

• Open-source, open-model, state-of-the-art speech recognizer for multiparty conversations. Release date: February 2010
• 4th IEEE International Conference on Semantic Computing 2010. Paper deadline: May 3rd, 2010

Thank You!

Questions?

Contact:
Dr. Gerald Friedland
International Computer Science Institute
Berkeley, CA
http://www.gerald-friedland.org
friedland@icsi.berkeley.edu
