semantics and multimedia

Advances in Semantic Analysis of Multimedia Dr. Gerald Friedland International Computer Science Institute Berkeley, CA [email protected]

Upload: peter-berger

Post on 18-Dec-2014


DESCRIPTION

This is Gerald Friedland's presentation for SVST's Multi-Media and the Semantic Web.

TRANSCRIPT

Page 1: Semantics And Multimedia

Advances in Semantic Analysis of Multimedia

Dr. Gerald Friedland
International Computer Science Institute
Berkeley, CA
[email protected]

Page 2: Semantics And Multimedia

The Internet Today


Page 3: Semantics And Multimedia

Internet Use Today


Raphaël Troncy: Linked Media: Weaving non-textual content into the Semantic Web, MozCamp, 03/2009.

Page 4: Semantics And Multimedia

Types of Videos


Page 5: Semantics And Multimedia

Addressable Market for Enterprise Video Applications

• Security: $1.2 Billion (total market $7.8B, 2005; source: JP Freeman) ($7B in 06; source: Lehman)
• Asset Tracking: $480m by 2010 (RFID in 2006: 2.4B; total asset protection: $14.7B; source: Lehman report 2006)
• QA/Operational Efficiency: $700m (source: Envysion, Arrowsight, corporate analysis)
• Training: $600m (source: Forrester Enterprise Software report 2005)
• Compliance: $450m (source: JP Freeman)
• BI: $400m (Reporting and Analysis: 4B; total BI market: $13.3B; source: IDC BI tools 03-08)
• Intelligent Marketing: $200m (source: T3CI corporate analysis)
• Government (Intelligence, Defense, Homeland Security)

$4.0 Billion Commercially

Page 6: Semantics And Multimedia

Multimedia Capabilities: 1985

• Record
• Store
• Play
• Random Seek
• Annotate Manually

Page 7: Semantics And Multimedia

Multimedia Capabilities: 2009

• Record
• Store
• Stream
• Play
• Random Seek
• Annotate Manually

Page 8: Semantics And Multimedia

Multimedia Capabilities: Wanted

• Semantic Navigation
• Search
• Content Compare
• Object Cut & Paste
• Annotate Automatically
• Infer over Content

=> Make multimedia “understandable” for computers.

Page 9: Semantics And Multimedia

Problems

• Multimedia data is very dense, so manual annotation is not feasible.

• Multimedia content analysis is difficult and rarely good enough to create reliable products.

Page 10: Semantics And Multimedia

My Research...

[Diagram labels: Features, Recognition, Understanding, Filtering, Machine Learning, Context; Audio, Images, Video, Text; Semantic Computing, Artificial Intelligence, Signal/Text Processing, Knowledge Network, Semantic Web]

Page 11: Semantics And Multimedia

My Research...

Hypotheses:
• Multimedia content analysis works better when every cue is taken into account (e.g., video AND audio).
• Semantics is enabled through context.

Converts AI research into products.
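The first hypothesis can be illustrated with a small late-fusion sketch: each modality produces its own confidence score, and the scores are combined before a decision is made. This is only one simple way to combine cues, not necessarily the method used in the research presented here. The function name `fuse_scores`, the weights, and the example scores are all hypothetical.

```python
def fuse_scores(scores, weights=None):
    """Weighted average of per-modality confidence scores in [0, 1]."""
    if weights is None:
        # Assumption: with no prior knowledge, trust every modality equally.
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


# Hypothetical detector outputs for one video segment.
segment_scores = {"audio": 0.55, "video": 0.85}
fused = fuse_scores(segment_scores)
print(f"audio only: {segment_scores['audio']:.2f}, fused: {fused:.2f}")
```

Passing a `weights` dictionary lets one modality count more than another, for example when the audio track is known to be noisy.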

Page 12: Semantics And Multimedia

Context

Sources of Context:
• Inclusion of prior knowledge
• Combination of algorithms
• Multimodality:
  – audio+video+...
  – extra hardware
• Human interaction
• ...

Page 13: Semantics And Multimedia

Context as Key: Example 1

Visual Object Extraction

[Figure: three images joined by arrows; labels: Cut, Paste, Horse, Meadow]

Page 14: Semantics And Multimedia

Simple Interactive Object Extraction (SIOX)

Image → User Input → Output

Context delivered by human interaction

Page 15: Semantics And Multimedia

SIOX: Algorithm Idea

Color Signatures from image retrieval:

Y. Rubner, C. Tomasi, and L. J. Guibas: The Earth Mover’s Distance as a Metric for Image Retrieval. Int. Journal of Computer Vision, 40(2):99–121, 2000.

Idea: Instead of searching an image database, use Color Signatures to search inside an image.
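A minimal sketch of this idea, under simplifying assumptions: a color signature is approximated here by the mean color of each occupied coarse RGB bin, and an unknown pixel is assigned to whichever signature (foreground or background) holds the nearest representative color. This is not the SIOX implementation, which uses a different clustering scheme and color space. The function names, `bin_size`, and the sample colors are hypothetical.

```python
from collections import defaultdict


def color_signature(pixels, bin_size=32):
    """Group RGB pixels into coarse bins and return the mean color of each bin."""
    bins = defaultdict(list)
    for r, g, b in pixels:
        bins[(r // bin_size, g // bin_size, b // bin_size)].append((r, g, b))
    return [
        tuple(sum(channel) / len(members) for channel in zip(*members))
        for members in bins.values()
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(pixel, fg_signature, bg_signature):
    """Label a pixel by the signature containing its nearest representative color."""
    d_fg = min(sq_dist(pixel, c) for c in fg_signature)
    d_bg = min(sq_dist(pixel, c) for c in bg_signature)
    return "foreground" if d_fg < d_bg else "background"


# Hypothetical user input: a few pixels marked as object and as background.
fg_sig = color_signature([(130, 80, 40), (125, 75, 35)])   # brownish samples
bg_sig = color_signature([(40, 160, 50), (50, 170, 60)])   # greenish samples

print(classify((120, 70, 30), fg_sig, bg_sig))
print(classify((45, 150, 60), fg_sig, bg_sig))
```

The user's rough marking supplies the context: it tells the classifier which colors belong to the object, so no model of the object itself is needed.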

Page 16: Semantics And Multimedia

SIOX in GIMP

[Screenshot label: SIOX Button]

G. Friedland, K. Jantz, T. Lenz, F. Wiesel, R. Rojas: “Object Cut and Paste in Images and Videos”, International Journal of Semantic Computing, Vol 1, No 2, pp. 221-247, World Scientific, USA, June 2007.

Page 17: Semantics And Multimedia


SIOX in Inkscape

Page 18: Semantics And Multimedia


SIOX in Blender

Page 19: Semantics And Multimedia


Extensions

Extracting multiple similar objects at once:

Page 20: Semantics And Multimedia

Sub-Pixel Refinement

Problem: Spill colors and foreground disappearance

[Figure panels: Original, SIOX, GraphCut]

Page 21: Semantics And Multimedia


Sub-Pixel Refinement

Detail Refinement Brush: Coarse Interaction

Page 22: Semantics And Multimedia


VideoSIOX

1st Frame:

Subsequent Frames:

Page 23: Semantics And Multimedia

More Information

http://www.siox.org


Page 24: Semantics And Multimedia


Shoesurfer

Page 25: Semantics And Multimedia


Shoesurfer

Page 26: Semantics And Multimedia


Shoesurfer

Page 27: Semantics And Multimedia


Shoesurfer

Page 28: Semantics And Multimedia


Shoesurfer

Page 29: Semantics And Multimedia

Context as Key: Example 2


Page 30: Semantics And Multimedia

Speaker Diarization: Who Spoke When?


Audiotrack:

Segmentation:

Clustering:

G. Friedland, O. Vinyals, Y. Huang, C. Müller: “Prosodic and other Long-Term Features for Speaker Diarization”, IEEE Transactions on Audio, Speech, and Language Processing, Vol 17, No 5, pp 985--993, July 2009.
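The two stages named on this slide, segmentation and clustering, can be sketched with a toy example. It assumes a single hypothetical feature value per audio frame (something pitch-like), cuts the track wherever a frame departs sharply from the current segment, and then groups segments with similar averages into speakers. Real diarization systems use much richer features and statistical models. The function names and thresholds below are illustrative only.

```python
def segment(frames, jump=30.0):
    """Split a non-empty feature sequence where a frame deviates from the current segment's mean."""
    segments, start = [], 0
    for i in range(1, len(frames)):
        mean = sum(frames[start:i]) / (i - start)
        if abs(frames[i] - mean) > jump:
            segments.append((start, i))
            start = i
    segments.append((start, len(frames)))
    return segments


def cluster(frames, segments, tol=30.0):
    """Greedily assign each segment to the first cluster whose center is within tol."""
    centers, labels = [], []
    for s, e in segments:
        mean = sum(frames[s:e]) / (e - s)
        for k, center in enumerate(centers):
            if abs(mean - center) <= tol:
                labels.append(k)
                break
        else:
            # No existing cluster is close enough: open a new one.
            centers.append(mean)
            labels.append(len(centers) - 1)
    return labels


# Hypothetical per-frame feature values for two alternating speakers.
track = [120, 118, 122, 210, 205, 208, 119, 121, 212, 209]
segs = segment(track)
for (start, end), speaker in zip(segs, cluster(track, segs)):
    print(f"frames {start}-{end - 1}: speaker {speaker}")
```

The output answers "who spoke when" without knowing who the speakers are: the labels are arbitrary cluster indices, not identities.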

Page 31: Semantics And Multimedia

Analyzing Meetings


Page 32: Semantics And Multimedia

Dominance Estimation

Page 34: Semantics And Multimedia

Narrative Theme Navigation


G. Friedland, L. Gottlieb, A. Janin: “Joke-o-mat: Browsing Sitcoms Punchline by Punchline”, Proceedings of ACM Multimedia, Beijing, China, October 2009.

Page 35: Semantics And Multimedia

Joke-O-Mat: Demo


http://www.youtube.com/watch?v=1qfa84Ulm5s

Page 36: Semantics And Multimedia

Connecting Multimedia and Semantic Technologies

[Diagram labels: GStreamer; Source Recorder (File, Device, Driver); User Component 1, User Component 2, ..., User Component n; Appscio]
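The slide shows a source feeding a chain of user components. A minimal sketch of that chaining pattern, written in plain Python with generators, follows. It does not use the GStreamer or Appscio APIs; the buffer layout, component names, and thresholds are hypothetical and serve only to show how independent analysis components can each annotate a media stream in turn.

```python
def source(buffers):
    """Stand-in for the source stage (file, device, or driver): yields media buffers."""
    for samples in buffers:
        yield {"samples": samples, "tags": []}


def loudness_component(stream, threshold=0.5):
    """Hypothetical user component: tags buffers whose peak amplitude exceeds threshold."""
    for buf in stream:
        if max(abs(s) for s in buf["samples"]) > threshold:
            buf["tags"].append("loud")
        yield buf


def length_component(stream, min_samples=3):
    """Hypothetical user component: tags buffers shorter than min_samples."""
    for buf in stream:
        if len(buf["samples"]) < min_samples:
            buf["tags"].append("short")
        yield buf


def build_pipeline(src, *components):
    """Chain components so each one consumes the previous stage's output."""
    stream = src
    for component in components:
        stream = component(stream)
    return stream


pipeline = build_pipeline(
    source([[0.1, -0.2, 0.1], [0.9, 0.3]]),
    loudness_component,
    length_component,
)
for buf in pipeline:
    print(buf["tags"])
```

Because every component has the same stream-in, stream-out shape, components can be added, removed, or reordered without changing the others.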

Page 37: Semantics And Multimedia

Semantic Media Framework

[Architecture diagram labels: Custom Event Source 1, Custom Event Source 2, ..., Custom Event Source n; C/C++/Java Interface; Pipeline Framework; Video Application Server; Scripting & Logic Engine; Web Technology Interface; Events; Integrated Development Environment; Services Connector; Code]

http://www.appscio.com

Page 38: Semantics And Multimedia

Semantic Analysis of Multimedia Data
• enables automatic logical inference on perceptually encoded data
• enables more “natural” interaction with the computer: “do what the user means”
• interfaces nicely with Semantic Web technologies
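As a rough illustration of logical inference over analysis results, the sketch below stores detector output as subject-predicate-object triples, the data shape used by Semantic Web formats, and applies one hand-written rule. The segment names, predicates, and the rule itself (a segment followed by laughter is labeled a punchline) are hypothetical and chosen only to show the mechanism.

```python
# Hypothetical output of audio analysis, expressed as triples.
facts = {
    ("seg1", "spokenBy", "speaker_A"),
    ("seg1", "followedBy", "seg2"),
    ("seg2", "type", "Laughter"),
    ("seg3", "spokenBy", "speaker_B"),
    ("seg3", "followedBy", "seg4"),
    ("seg4", "type", "Music"),
}


def infer_punchlines(triples):
    """Rule: if X is followed by Y and Y is laughter, then X is a punchline."""
    laughter = {s for s, p, o in triples if p == "type" and o == "Laughter"}
    return {
        (s, "type", "Punchline")
        for s, p, o in triples
        if p == "followedBy" and o in laughter
    }


for triple in sorted(infer_punchlines(facts)):
    print(triple)
```

Once perceptual results are in this form, they can be merged with other triple-based data and queried or reasoned over with standard tools.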


Page 39: Semantics And Multimedia

A note...


James A. Hendler

Page 40: Semantics And Multimedia


MySTT

Open-Source, open-model, state-of-the-art speech recognizer for multiparty conversations.

Release Date: February 2010

Page 41: Semantics And Multimedia


4th IEEE International Conference on Semantic Computing 2010

Paper Deadline: May 3rd, 2010

Page 42: Semantics And Multimedia

Upcoming...


Page 43: Semantics And Multimedia

Thank You!

Questions?

Contact:
Dr. Gerald Friedland
International Computer Science Institute
Berkeley, CA
http://[email protected]