PISA - Proof of Concept

medialab PISA – Proof of Concept: Production, Indexing and Search of Audiovisual Material

Upload: vrt-medialab

Post on 05-Dec-2014

Category:

Technology


DESCRIPTION

In the research project PISA we investigated how powerful search engines can be built, given a library of audiovisual material that has been analysed objectively and intelligently.

TRANSCRIPT

Page 1: PISA - Proof of Concept

medialab

PISA – Proof of Concept

Production, Indexing and Search of Audiovisual Material

Page 2: PISA - Proof of Concept

PISA - Positioning

PISA – Production and Indexing of Audiovisual Media

• 30 man-years

• Virtual Modelling

• Computer Assisted Manufacturing

• Unsupervised Feature Extraction

• Search Engine Technology

Page 3: PISA - Proof of Concept

Context - Digital Media Production

[Diagram: Production Platform. Layers: Infrastructure (Networks and Storage), Suprastructure (Metadata Management), Production and distribution. Stages: Ingest, Media Asset Management, Editing, Playout, Mastering.]

Page 4: PISA - Proof of Concept

Digital Asset Management, Content Management…

[Diagram: Production Platform. Layers: Infrastructure (Networks and Storage), Suprastructure (Metadata Management), Production and distribution.]

Page 5: PISA - Proof of Concept

User Expectations

[Diagram: Production Platform. Layers: Infrastructure (Networks and Storage), Suprastructure (Metadata Management), Production and distribution. Labels: Data General, Metadata, Communication (Information).]

Media Production

• Mass-production

• Anywhere, anytime, on any device

• Personalisation

The ideal search engine

• retrieves all relevant items (recall 100%)

• without false positives (precision 100%)

• enables instant access to digital media

• while respecting intellectual property.
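The two percentages above follow the standard definitions of recall and precision. As a minimal sketch, assuming a query's results and a human relevance judgement are both available as sets of item ids (the clip ids and the function name are hypothetical, not taken from the project):

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: ids returned by the search engine
    relevant:  ids judged relevant by a person (ground truth)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items that were actually found
    # Assumed convention: an empty denominator counts as a perfect score.
    precision = len(hits) / len(retrieved) if retrieved else 1.0
    recall = len(hits) / len(relevant) if relevant else 1.0
    return precision, recall


# Hypothetical clip ids, for illustration only.
relevant = {"clip-1", "clip-2", "clip-3", "clip-4"}
retrieved = {"clip-1", "clip-2", "clip-9"}

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.50
```

The ideal engine described on the slide would return exactly the relevant set, giving 1.0 for both values.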

Page 6: PISA - Proof of Concept

Archiving – Disclosure, Annotation,…

archive number : ALG 20010813 1

fragment number : 1

series : 1000 ZONNEN EN GARNALEN

tape number : E03024404

format : DBCM

fragment title : 1000 ZONNEN & GARNALEN

picture : KL/PALPLUS

fragment duration : 18 20

text : 0'00" TOURIST REPORTAGE MAGAZINE, OVERVIEW OF TOPICS; TITLE SEQUENCE OF TOURIST REPORTAGE MAGAZINE, OVERVIEW OF TOPICS

0'50" TODAY : ARTIST LUC HOFKENS DESIGNED AN OASIS ON HIS ROOF TERRACE IN BORGERHOUT THAT RECALLS THE GRAND CANYON. INTERVIEW WITH LUC AND HIS WIFE MARILOU. EXTERIOR SHOTS: ROOF WITH SURROUNDINGS, OUTSIDE OF WORKER'S HOUSE, PAN ACROSS ROCK WALLS, CRATES OF WATER, PLANTING, PHOTO ALBUM SHOWING PROGRESS OF THE WORKS

4'00" JUNIOR : KLAARTJE ALAERTS, 13 YEARS OLD, WANTS TO BECOME AN ASTRONAUT. SHE VISITS THE EURO SPACE CENTER WITH SPACE SHUTTLES, ROCKETS. SIMULATION IN SPACE SHUTTLE, INTERVIEW, HAS SEEN A UFO, BUILDS A SMALL ROCKET HERSELF, LAUNCHES IT

7'50" DE SCHEURKALENDER : ARCHIVE ADVERTISING FILM IBM, INTERVIEW MAURICE DE WILDE, FIRST PERSONAL COMPUTER

keywords : BELGIUM; BORGERHOUT; ARTIST; OASIS; ART; GRAND CANYON (NATURE AREA); ROOF; TERRACE; INTERVIEW; EURO SPACE CENTER; SPACE TRAVEL; PC; BOAT TRIP; WEALTH; PASSENGER; GASTRONOMY; RESTAURANT; STAFF; HOLIDAY; INTERIOR SHOT; SHIP; BECKERS LEEN; VRT; LOTTO; RADIO ANNOUNCER (FEMALE); SOUND STUDIO; INVENTION; BARBECUE; CONCRETE MIXER; IBM; COMMERCIAL

rights holder : VRT

Search screen FILM Set: 16 Count: 1

page 1 of 3

keywords: ibm and vrt

archive number: -

broadcast year: month: day:

fragment number: fragment duration:

series:

format: tape number:

episode: episode number:

programme: broadcast date:

fragment title:

text:

category:

recording date: recording number:

journalist: rights holder:

SETS

The strings required for the operation are not defined

F11 F12 F13 F14 F17 F18 F19 F20 Ent

Exit Sets Refset Show Previous Next/Clear Thesaurus Command Search
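The search screen above combines keywords with "and". A minimal sketch of that kind of exact-match Boolean lookup follows. It is an illustration only, not the legacy system's implementation: the function name is hypothetical, the first record reuses a few keywords from the archive record, and the second record is invented.

```python
def matches_all(query, keywords):
    """True if every term of an 'x and y' query is among the record's keywords."""
    terms = [term.strip().upper() for term in query.lower().split(" and ")]
    return all(term in keywords for term in terms)


records = {
    # A few keywords taken from the archive record shown on this slide.
    "ALG 20010813 1": {"BORGERHOUT", "INTERVIEW", "PC", "VRT", "LOTTO", "IBM"},
    # Invented record, for contrast.
    "HYPOTHETICAL 2": {"IBM", "PC"},
}

hits = [rid for rid, kws in records.items() if matches_all("ibm and vrt", kws)]
print(hits)  # ['ALG 20010813 1']
```

A record is found only if the annotator happened to assign exactly the keywords the searcher types, which is why this style of retrieval depends so heavily on manual annotation.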

Page 7: PISA - Proof of Concept

Page 8: PISA - Proof of Concept

Web 2.0 – « User Generated Content », « Social Tagging »?

Page 9: PISA - Proof of Concept

Catch-22

-> “Annotation” is a subjective interpretation, and thus it is not scalable

-> Automated processing of information is a key discriminator, but it requires correct and structured metadata

-> Product Engineering is the source of structured and meaningful information, but creative staff are not receptive to technology

Page 10: PISA - Proof of Concept

Objectives - Proof of Concept

• One Set of Numbers(!)

• Model Driven Development

• Computer Assisted Manufacturing

• Unsupervised Feature Extraction

• Efficient Search and Retrieval

Develop an extensible data-model and a consistent application framework, accessible via an intuitive user-interface

(Digitizing analogue and disintegrated information flows)

Page 11: PISA - Proof of Concept

PISA - Overview

Abstract

Information

Footage

Concept

Virtual

Model

Model Driven Development:

• Setting (Stage properties, light)

• Character

• Synthetic Speech

• Sound effects

• Character animation

• Virtual camera

Virtual Modelling

Automated Production

Realisation

• Ingest

• Editing

• Mastering

• Reproduction to alternative distribution channels

Computer Assisted Design

Computer Assisted Manufacturing

Script Editing

• Parse scenario

• Shooting script editor

• Storyboard

Reverse Engineering

• Shot segmentation

• Video footprint and reuse detection

• Biometric face detection

• Background analysis

• Speech-to-text

Interpretation

• Character identification

• Background categorisation and identification

• Topic and event detection

Intelligent Analysis and

Quantization

Quantization

Analysis

Indexing

Retrieval

• Timecode based indexing

• Geo-temporal reference

• Taxonomy based indexing and search

• Faceted search

Search Engine
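The "taxonomy based indexing and search" item above can be read as: a query for a broad term should also find items annotated with narrower terms. A minimal sketch of that idea, assuming a simple broader-to-narrower mapping; the taxonomy, the index contents and the function names are all hypothetical, and the slides do not describe how the project implemented this.

```python
# Hypothetical taxonomy: broader term -> narrower terms.
TAXONOMY = {
    "transport": ["space travel", "shipping"],
    "space travel": ["rocket", "space shuttle"],
    "shipping": ["boat trip"],
}


def expand(term, taxonomy=TAXONOMY):
    """Return the term followed by all narrower terms below it (depth-first)."""
    result = [term]
    for child in taxonomy.get(term, []):
        result.extend(expand(child, taxonomy))
    return result


def search(term, index):
    """index maps a keyword to the set of item ids annotated with it."""
    hits = set()
    for expanded_term in expand(term):
        hits |= index.get(expanded_term, set())
    return hits


# Hypothetical keyword index.
index = {"rocket": {"item-a"}, "boat trip": {"item-b"}, "lotto": {"item-c"}}
print(sorted(search("transport", index)))  # ['item-a', 'item-b']
```

The sketch assumes the taxonomy is a tree; a cyclic or shared-node structure would need a visited set.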

Page 12: PISA - Proof of Concept

The Search Client

Page 13: PISA - Proof of Concept

The Search Engine

[Diagram labels: Media Asset Management System (Ardome); Search Engine (Lucene/SOLR)]

• Search federation by system integration

• Faceted search

• Integrated application of keywords

• Intuitive and structured presentation of results

• Random access to audiovisual material
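Faceted search, as listed above, amounts to counting the result set per attribute value and letting the user narrow the results by picking a value. A minimal sketch in plain Python rather than through the Lucene/SOLR API; the result records, field names and function names are hypothetical, and only the source names are borrowed from the slide.

```python
from collections import Counter

# Hypothetical federated result list; "source" reuses library names from the slide.
results = [
    {"title": "Item A", "source": "Basisplus", "year": 2001},
    {"title": "Item B", "source": "Ardome", "year": 2008},
    {"title": "Item C", "source": "Ardome", "year": 2008},
    {"title": "Item D", "source": "EBU Superpop", "year": 2001},
]


def facet_counts(items, field):
    """Count how many items fall under each value of one facet field."""
    return Counter(item[field] for item in items)


def refine(items, field, value):
    """Narrow the result list to items matching the selected facet value."""
    return [item for item in items if item[field] == value]


print(facet_counts(results, "source"))
narrowed = refine(results, "source", "Ardome")
print(facet_counts(narrowed, "year"))
```

Because the items from all three libraries sit in one list, the same counting step also illustrates what federation buys: one set of facets across sources.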

[Diagram labels: Search Client (Custom Development); Legacy Video Library (Basisplus); Current news items (Ardome); Raw Material (EBU Superpop); <NewsML-G2>]

Page 14: PISA - Proof of Concept

The Annotation Client

Page 15: PISA - Proof of Concept

Computer Assisted Analysis

Page 16: PISA - Proof of Concept

Intelligent Analysis

[Diagram labels: Media Asset Management (Ardome)]

Unsupervised feature extraction provides time-coded attributes:

• Shot segmentation and keyframe extraction

• Audio segmentation and speaker recognition

• Subtitle processing and speech recognition

• Taxonomy-driven topic detection

• Face recognition

• Scene recognition

• Copy detection
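The slides do not say which algorithms were used for these extractors. As an illustration of the first item only, here is a sketch of a classic approach to shot segmentation: compare grey-level histograms of consecutive frames and declare a cut when they differ strongly. Frames are modelled as flat lists of pixel values, and the bin count and threshold are assumed, untuned values.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalised grey-level histogram of a frame (flat list of pixel values)."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [count / len(frame) for count in counts]


def shot_boundaries(frames, threshold=0.5):
    """Return the indices i at which frame i starts a new shot.

    A cut is declared when the L1 distance between the histograms of
    consecutive frames exceeds the (assumed) threshold.
    """
    if not frames:
        return []
    cuts = []
    previous = histogram(frames[0])
    for i in range(1, len(frames)):
        current = histogram(frames[i])
        if sum(abs(a - b) for a, b in zip(previous, current)) > threshold:
            cuts.append(i)
        previous = current
    return cuts


# Toy "video": three dark frames followed by three bright frames.
dark = [10, 20, 30, 40]
bright = [200, 210, 220, 230]
frames = [dark, dark, dark, bright, bright, bright]
print(shot_boundaries(frames))  # [3]
```

Each detected boundary carries a frame index, which converts directly to a timecode; that is what makes the extracted attribute "time-coded".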

[Diagram labels: Shot Segmentation; Speech Recognition; Face Detection; Topic Detection; Media Production; Media Asset Management System (Ardome); Search Engine (Lucene/SOLR); Legacy Video Library (Basisplus); Current news items (Ardome); Raw Material (EBU Superpop); <NewsML-G2>]

Page 17: PISA - Proof of Concept

Conclusion

• Enterprise search – structured metadata, limited number of libraries, limited number of records per library, dependencies between objects

• Intelligent search federation is aware of the media production process - scripts, webpages, subtitles and formal annotation may represent the same editorial object

• Random access to audiovisual material requires an index based on timecode and not on « word position in a document »

• Ontology-driven application logic is essential to enable semantic awareness, i.e. resolving synonyms and disambiguating homonyms

• The perfect search engine is not for sale yet and requires design and development from the ground up.
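The point about timecode versus word position can be made concrete with a small inverted index whose postings are (asset, start time) pairs instead of word offsets. This is a sketch under stated assumptions, not the project's index: the function names are hypothetical, and the segments are an English paraphrase loosely based on the archive record shown earlier.

```python
from collections import defaultdict


def build_timecode_index(segments):
    """Map each word to the (asset_id, start_seconds) positions where it occurs.

    segments: iterable of (asset_id, start_seconds, text), for example
    subtitle lines or time-coded annotations.
    """
    index = defaultdict(list)
    for asset_id, start, text in segments:
        for word in sorted(set(text.lower().split())):
            index[word].append((asset_id, start))
    return index


def format_timecode(seconds):
    """Render whole seconds as M'SS", the notation of the archive record."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}'{secs:02d}\""


# Illustrative segments; the text is a paraphrase, not project data.
segments = [
    ("ALG 20010813 1", 50, "artist roof terrace oasis"),
    ("ALG 20010813 1", 240, "astronaut rocket space"),
    ("ALG 20010813 1", 470, "ibm first personal computer"),
]

index = build_timecode_index(segments)
for asset_id, start in index["ibm"]:
    print(asset_id, "seek to", format_timecode(start))
```

A hit then tells a player where to seek inside the asset, which is the "random access" the conclusion refers to.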

Page 18: PISA - Proof of Concept

From « Metadata » to CAD/CAM

?

Page 19: PISA - Proof of Concept

Scoop

Page 20: PISA - Proof of Concept

Hype Cycle 2008…

Page 21: PISA - Proof of Concept

• http://medialab.vrt.be/pisa

• http://projects.ibbt.be/pisa