M4 / September 22 2004 1 Integrating multimodal descriptions to index large video collections M4 meeting – Munich Nicolas Moënne-Loccoz, Bruno Janvier, Stéphane Marchand-Maillet and Eric Bruno Viper group Computer Vision & Multimedia Laboratory University of Geneva


Integrating multimodal descriptions to index large video collections

M4 meeting – Munich

Nicolas Moënne-Loccoz, Bruno Janvier, Stéphane Marchand-Maillet and Eric Bruno
Viper group
Computer Vision & Multimedia Laboratory
University of Geneva


Outline

i. Motivations

ii. Modelling video documents

iii. Indexing video documents

iv. Retrieving video documents

v. ViCoDE interface demonstration

vi. Conclusion


Motivations

Goals of the indexing framework:
• Storing & querying video documents
• Integrating multimodal descriptions
• Efficient retrieval algorithms

[Figure: video documents are stored in an indexing framework (shown as "?") that the end-user queries.]


Motivations

MPEG-7 multimedia document model
• Standard
• Weak ontology support

Native XML database storage
• Document-centred
• Inefficient cross-document multidimensional indexing


Overview

Indexing framework components:
• Generic data model (Data Model)
• Relational DBMS (MaxDB)
• Raw data access (OVAL)
• External indexing & retrieval processor (Indexing Processor)


Document Model

Subset of the MPEG-7 model
• MPEG-7 compliance (XML-enabled model)

Temporal segments
• Central data unit (data-centred model)

Ontology capabilities
• W3C Web Ontology Language (OWL) compliant

Feature space
• Indexing structures


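Document Model

[Figure: entity diagram of the document model, with layers labelled Information, Structure and Description. Entities: Document, Media Stream, Temporal Segment, Ontology, Annotation, Descriptor, Feature Space, Index. Relations: "Composed by", "Role", "Uses".]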
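As an illustration only, the entities named on this slide could be sketched as plain Python data classes with the temporal segment as the central unit. The class layout, field names (`start_s`, `annotations`, `descriptors`) and the helper `segments_annotated` are assumptions made for this sketch; the slides do not give the actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TemporalSegment:
    """Central data unit: a time interval of one media stream (hypothetical layout)."""
    start_s: float
    end_s: float
    # Ontology concept labels attached to this segment (assumed representation).
    annotations: List[str] = field(default_factory=list)
    # Feature-space name -> descriptor vector (assumed representation).
    descriptors: Dict[str, List[float]] = field(default_factory=dict)

    def duration(self) -> float:
        return self.end_s - self.start_s


@dataclass
class MediaStream:
    kind: str  # e.g. "video" or "audio"
    segments: List[TemporalSegment] = field(default_factory=list)


@dataclass
class Document:
    name: str
    streams: List[MediaStream] = field(default_factory=list)

    def segments_annotated(self, concept: str) -> List[TemporalSegment]:
        """Return every segment of the document annotated with `concept`."""
        return [
            seg
            for stream in self.streams
            for seg in stream.segments
            if concept in seg.annotations
        ]


shot = TemporalSegment(
    0.0, 4.5,
    annotations=["meeting"],
    descriptors={"color_histogram": [0.5, 0.25, 0.25]},
)
doc = Document("example", streams=[MediaStream("video", [shot])])
```

Attaching both annotations and descriptors to the segment, rather than to the whole document, mirrors the "data-centred" choice stated for the model.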


Document Storage

Video documents:
• TRECVid-2003, MPEG-7, IM2/M4 Corpus
• ~100 hours of MPEG-1 video

Semantic descriptions:
• TRECVid-2003 annotations, manual annotations

Multimodal feature-based descriptions:
• Shot segmentation
• Global color & motion histogram
• Activity descriptors (wavelet motion, salient regions)
• ASR histogram
• Audio descriptors
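To make one of the listed descriptors concrete, here is a minimal sketch of a global color histogram for a frame given as a list of RGB pixels. The bin count, the L1 comparison and the function names are assumptions for illustration; the slides do not specify how the histogram is quantized or compared.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B)


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Normalized joint RGB histogram with bins_per_channel**3 bins."""
    if not pixels:
        raise ValueError("at least one pixel is required")
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def l1_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """L1 distance between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


red_frame = [(255, 0, 0)] * 4
blue_frame = [(0, 0, 255)] * 4
h_red = color_histogram(red_frame)
h_blue = color_histogram(blue_frame)
distance = l1_distance(h_red, h_blue)
```

Each such descriptor defines one feature space with its own distance, which is what the retrieval stage later combines.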


Document Index

Document meta-data & structure
• DBMS access facilities (SQL queries, query optimization, indexes)

Document raw data
• Object-based Video Access Library (OVAL) (C++, Java, Matlab)
• Exact and random access to media stream raw data
• Plugin-based (MPEG-1, MPEG-2)
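The kind of SQL access to document structure mentioned on this slide might look like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for MaxDB, and the table and column names (`document`, `temporal_segment`, `start_s`, `end_s`) are hypothetical; the actual schema is not shown in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE document (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE temporal_segment (
        id          INTEGER PRIMARY KEY,
        document_id INTEGER NOT NULL REFERENCES document(id),
        start_s     REAL NOT NULL,
        end_s       REAL NOT NULL
    );
    -- A DBMS index lets time-range lookups avoid a full table scan.
    CREATE INDEX idx_segment_document ON temporal_segment(document_id, start_s);
    """
)
conn.execute("INSERT INTO document (id, name) VALUES (1, 'meeting_01')")
conn.executemany(
    "INSERT INTO temporal_segment (document_id, start_s, end_s) VALUES (?, ?, ?)",
    [(1, 0.0, 4.5), (1, 4.5, 12.0), (1, 12.0, 30.0)],
)


def segments_overlapping(conn, document_name, t0, t1):
    """Return (start, end) of the segments of a document overlapping [t0, t1]."""
    rows = conn.execute(
        """
        SELECT s.start_s, s.end_s
        FROM temporal_segment AS s
        JOIN document AS d ON d.id = s.document_id
        WHERE d.name = ? AND s.start_s < ? AND s.end_s > ?
        ORDER BY s.start_s
        """,
        (document_name, t1, t0),
    )
    return rows.fetchall()


hits = segments_overlapping(conn, "meeting_01", 4.0, 10.0)
```

Structural queries of this kind stay inside the DBMS, while frame-level pixel or sample data would be fetched through the raw-data access library.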


Document Index

Document descriptions: external indexing
• Semantic description: use of the ontology structure
• Feature-based description: multi-dimensional index structure (distance matrix, vantage point tree…)

Index processor: MATLAB

[Figure: Index Processor diagram with the labels Index file, Index data, Access script and Query data.]
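For readers unfamiliar with the vantage point tree named on this slide, the following is a minimal, generic sketch of the structure in Python. It is not the authors' MATLAB index processor; the function names and the choice of a random vantage point with a median split are common textbook choices assumed here.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Point = Sequence[float]
Metric = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points with distance < radius
        self.outside = outside  # subtree of points with distance >= radius


def build_vp_tree(points: List[Point], dist: Metric, rng: random.Random) -> Optional[VPNode]:
    """Recursively split the points around a randomly chosen vantage point."""
    if not points:
        return None
    rest = list(points)
    vantage = rest.pop(rng.randrange(len(rest)))
    if not rest:
        return VPNode(vantage, 0.0, None, None)
    distances = [dist(vantage, p) for p in rest]
    radius = sorted(distances)[len(distances) // 2]
    inside = [p for p, d in zip(rest, distances) if d < radius]
    outside = [p for p, d in zip(rest, distances) if d >= radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, dist, rng),
                  build_vp_tree(outside, dist, rng))


def nearest(node: Optional[VPNode], query: Point, dist: Metric,
            best: Optional[Tuple[float, Point]] = None) -> Optional[Tuple[float, Point]]:
    """Return (distance, point) of the nearest neighbour of `query`."""
    if node is None:
        return best
    d = dist(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    if d < node.radius:
        best = nearest(node.inside, query, dist, best)
        # Triangle inequality: the outside part can only help if the
        # search ball around the query reaches the radius.
        if d + best[0] >= node.radius:
            best = nearest(node.outside, query, dist, best)
    else:
        best = nearest(node.outside, query, dist, best)
        if d - best[0] < node.radius:
            best = nearest(node.inside, query, dist, best)
    return best


points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (9.0, 9.0)]
tree = build_vp_tree(points, euclidean, random.Random(0))
best_distance, best_point = nearest(tree, (4.6, 5.2), euclidean)
```

The tree only needs a metric, not coordinates, which is why it suits feature spaces where only a distance between descriptors is defined. A distance matrix is the simpler alternative when the collection is small enough to store all pairwise distances.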


Document Retrieval

Multiple feature spaces relevance feedback

• Set of positive & negative examples: $S = \{S^+, S^-\}$

• Set of feature space distances: $\mathbf{d} = (d_{f_1}, d_{f_2}, \ldots)$

• Look for the best linear combination of distances according to the user query:

$$d_{query} = \sum_i w_i \, d_{f_i} = \mathbf{w} \cdot \mathbf{d}$$
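The linear combination of feature-space distances can be illustrated with a short sketch. The function names and the example values are hypothetical; only the weighted sum itself comes from the slide.

```python
from typing import Dict, List, Sequence


def combined_distance(weights: Sequence[float], distances: Sequence[float]) -> float:
    """Weighted sum of per-feature-space distances (one weight per space)."""
    if len(weights) != len(distances):
        raise ValueError("one weight per feature space is required")
    return sum(w * d for w, d in zip(weights, distances))


def rank_documents(weights: Sequence[float],
                   distances_to_query: Dict[str, Sequence[float]]) -> List[str]:
    """Sort document names by increasing combined distance to the query."""
    return sorted(
        distances_to_query,
        key=lambda name: combined_distance(weights, distances_to_query[name]),
    )


# Hypothetical distances of three documents to a query in two
# feature spaces, e.g. (color, motion).
distances_to_query = {"a": (0.2, 0.9), "b": (0.8, 0.1), "c": (0.5, 0.5)}
by_color = rank_documents((1.0, 0.0), distances_to_query)
by_motion = rank_documents((0.0, 1.0), distances_to_query)
```

Changing the weights changes the ranking, so choosing them from the user's examples is what adapts the search to a particular query.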


Document Retrieval

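• Linear discriminant analysis:

$$\tilde{\mathbf{w}} = \arg\max_{\mathbf{w} \in W} \frac{\sum_{i \in S^+,\, j \in S^-} \mathbf{w} \cdot \mathbf{d}_{ij}}{\sum_{l, m \in S^+} \mathbf{w} \cdot \mathbf{d}_{lm}}$$

• Ranked result:

$$\tilde{d}_k = \sum_{i \in S^+} \tilde{\mathbf{w}} \cdot \mathbf{d}_{i,k}$$

• Relevance feedback loop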
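One round of the relevance feedback loop can be sketched as follows. This is a simplified stand-in, not the authors' linear discriminant analysis: each feature space is weighted by the ratio of its mean positive-to-negative distance to its mean positive-to-positive distance, which is an assumption made for illustration. The L1 distance, the `eps` regularizer and all names are likewise hypothetical.

```python
from itertools import combinations, product
from typing import Dict, List, Sequence

Doc = Dict[str, Sequence[float]]  # feature-space name -> descriptor vector


def l1(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_vector(a: Doc, b: Doc) -> List[float]:
    """Per-feature-space distances between two documents (spaces in sorted order)."""
    return [l1(a[name], b[name]) for name in sorted(a)]


def mean_distance_vector(pairs) -> List[float]:
    vectors = [distance_vector(a, b) for a, b in pairs]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def estimate_weights(positives: List[Doc], negatives: List[Doc],
                     eps: float = 1e-6) -> List[float]:
    """Heuristic weights: favour spaces that separate positives from negatives."""
    if len(positives) < 2 or not negatives:
        raise ValueError("need at least two positive and one negative example")
    between = mean_distance_vector(list(product(positives, negatives)))
    within = mean_distance_vector(list(combinations(positives, 2)))
    scores = [b / (w + eps) for b, w in zip(between, within)]
    total = sum(scores)
    if total == 0.0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def rank(candidates: Dict[str, Doc], positives: List[Doc],
         weights: Sequence[float]) -> List[str]:
    """Order candidates by summed weighted distance to the positive examples."""
    def score(doc: Doc) -> float:
        return sum(
            w * d
            for p in positives
            for w, d in zip(weights, distance_vector(p, doc))
        )
    return sorted(candidates, key=lambda name: score(candidates[name]))


# The positives agree on color but not on motion, so color should dominate.
positives = [
    {"color": [1.0, 0.0], "motion": [0.0, 1.0]},
    {"color": [1.0, 0.0], "motion": [1.0, 0.0]},
]
negatives = [{"color": [0.0, 1.0], "motion": [0.0, 1.0]}]
candidates = {
    "x": {"color": [1.0, 0.0], "motion": [0.5, 0.5]},
    "y": {"color": [0.0, 1.0], "motion": [0.0, 1.0]},
}
weights = estimate_weights(positives, negatives)  # order: color, motion
ranking = rank(candidates, positives, weights)
```

In a feedback loop, the user would mark items of the ranking as positive or negative and the two steps would be repeated with the enlarged example sets.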


ViCoDE demo

ViCoDE: Video Content Description & Exploration
Front-end prototype of the presented framework

• Java Server Pages web content generation

• OVAL-based on-the-fly generation of key-frames

• Matlab indexing and retrieval algorithms


Conclusion

Generic multimedia document management framework
• Generic multimedia document model, MPEG-7 compliant
• Standard ontology support
• Generic multimodal indexing & retrieval engine

Integration of meeting features
• Content-based querying of meeting collections

Questions?