Integrating Structure and Semantics into Audio-visual Documents. Tuesday 21st of October, 2003. Raphaël Troncy. 2nd International Semantic Web Conference (ISWC2003)


Integrating Structure and Semantics into

Audio-visual Documents

Tuesday 21st of October, 2003

Raphaël Troncy

2nd International Semantic Web Conference (ISWC2003)


Description of the AV content

• A three-step process:
  – identification of the content creator and the content provider: Dublin Core metadata, VRA Core categories, …
  – structural decomposition into video segments corresponding to the logical structure of the program: time-code, spatial coordinates
  – semantic description of these segments: controlled vocabulary, thesaurus, free-text annotation


Description of the AV content

• Segmentation: describe the logical structure
  – locate and date some events
• Description: describe the semantics of the content
  – type each segment with an AV genre
  – type each segment with a general theme
  – describe the scene (who, when, where, what, …)

[Figure: a segment on the time axis t, typed with the genre "report" and the theme "athletics", and described as: "Michael Johnson smashed the 200m world record, completing the 200m in 19''32 in Atlanta at the Olympic Games"]


Example

Q: Find all AV sequences of type interview with Sandy Casar and concerning the Paris-Nice cycling race
  – noisy answer: there is other sports news in the sequence
  – incomplete answer: the interview was broadcast in two parts and began in a previous sequence
  – the query cannot be extended!

13 [Indoor Set: 6th part] at 18:43:56:00 - 00:09:06:00. – Eurosport

In studio, the second part of the interview, from Nice, of Sandy CASAR by Jean René GODART about the Paris-Nice cycling race and a few sports news with pictures commented by Alexandre BOYON and Laurent PUYAT.

Q : Find all AV sequences of type dialog sequence with a rider and concerning any cycling race with several stages


• Requirements:
  – express models that constrain the logical structure
    • identify an interview inside a report of a sports magazine
  – represent the meaning contained in this structure
    • a cartoon is a fiction with no real characters
  – describe semantically the content of each sequence
    • the Prologue is always an individual time trial numbered stage 0

Which languages are the most suitable to perform all these tasks?

Problems
• Weak use of the logical structures
• Descriptions are not made for reasoning

Make the AV descriptions accessible to automated processes

"Pure" documentary approaches

• General bibliographic description languages (DC, VRA)
• MPEG-7: the new multimedia description language?
  – three components: D (Descriptors), DS (Description Schemes) and DDL (Description Definition Language)
  – structure: segment = abstract unit defined by temporal localization or masks
  – semantics: entity–attribute–relation model + thesaurus for structuring the knowledge (Classification Schemes)
  – tools: Videto (ZGDV), Vizard (EU-IST Project), MovieTool (© Ricoh)
• Extensions
  – XML Schema: add structure without semantics
    • TV Anytime, Mdéfi [Tran Thuong, 2003]
  – Classification Schemes: very poor expressivity
    • COALA [Fatemi, 2003]


KR approaches: OWL+RDF

• Definition of concepts and relations
  StudioProgram ≡ and (HomogeneousProgram (all hasPart StudioSequence))
• Definition of axioms
  HomogeneousProgram ⊓ HeterogeneousProgram = ⊥

Problem: the structure of the document (i.e. the context) is lost!

Let us merge the two approaches!
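A minimal sketch of what the definition and the axiom on this slide express, reading the definition as "a StudioProgram is a HomogeneousProgram all of whose parts are StudioSequences" and the axiom as a disjointness constraint (both readings are assumptions). The individuals `programA`, `programB` and `seq1` to `seq4` are hypothetical, and the check is closed-world plain Python: an OWL reasoner works under the open-world assumption and would need the list of parts to be declared complete before drawing the same conclusion.

```python
# Toy description base. All individual names are hypothetical.
TYPES = {
    "programA": {"HomogeneousProgram"},
    "programB": {"HomogeneousProgram"},
    "seq1": {"StudioSequence"},
    "seq2": {"StudioSequence"},
    "seq3": {"StudioSequence"},
    "seq4": {"Report"},
}
HAS_PART = {"programA": ["seq1", "seq2"], "programB": ["seq3", "seq4"]}


def is_studio_program(program, types, has_part):
    """HomogeneousProgram AND (all hasPart StudioSequence), closed-world."""
    is_homogeneous = "HomogeneousProgram" in types.get(program, set())
    parts = has_part.get(program, [])
    return is_homogeneous and all(
        "StudioSequence" in types.get(part, set()) for part in parts
    )


def disjointness_violations(types):
    """Individuals typed as both Homogeneous- and HeterogeneousProgram."""
    both = {"HomogeneousProgram", "HeterogeneousProgram"}
    return sorted(name for name, classes in types.items() if both <= classes)


print(is_studio_program("programA", TYPES, HAS_PART))  # all parts are studio sequences
print(is_studio_program("programB", TYPES, HAS_PART))  # seq4 is a Report
print(disjointness_violations(TYPES))
```

The sketch only looks at class membership: which sequence sits inside which report, and in what order, is not represented, which is the loss of context the slide points out.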


General architecture

[Figure: architecture diagram. Labels: document schemes and document instances (valid; MPEG-7 / XML Schema), AV ontology, domain-specific ontology and statements base (OWL / RDF), transformation, documentalists, users, query]


The Audio-visual Ontology

• Methodology of construction [Bachimont et al., EKAW’02]:
  – Conceptualization: differential principles
  – Formalization: formal definitions, axioms
  – Operationalization: export into a KR language
• AV domain:
  – Production objects (program, sequence, AV genre), Properties (theme), Persons, Technical Process (shooting, recording, post-production), Signal descriptors (audio, video), etc.
• Tools:
  – Conceptualization: DOE [Bachimont et al., EKAW’02]
  – Formalization: OilEd [Bechhofer, KI’01]
  – Languages: DAML+OIL … OWL
• DOE and ontologies are available at: http://opales.ina.fr/public/ontologies/


The Audio-visual Ontology


Generate XML Schema types

OWL → XML Schema
• Class → Complex type
• Sub-class → Extension
• Restriction on properties → Element of the content model
• Union of classes → Choice in the content model

XSLT?

Some concepts (program, sequence) extend the MPEG-7 Segment type, hence the descriptions are MPEG-7 valid
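The correspondence on this slide can be sketched as a small generator. This is an illustration in Python rather than the XSLT the slide envisages, and several names are assumptions: the `Type` suffix, the base name `mp7:SegmentType`, and the choice of `Interview` and `FilmClip` as union members of `Report`. The output is a schema fragment, so the `mp7` prefix is left undeclared.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def complex_type(name, base, properties=(), union=()):
    """Build an xs:complexType from an OWL-like class description.

    name       -- class name            -> complex type
    base       -- super-class type name -> extension base
    properties -- (property, range) pairs from property restrictions
                  -> elements of the content model
    union      -- member classes of a union -> choice in the content model
    """
    ctype = ET.Element(f"{{{XS}}}complexType", {"name": f"{name}Type"})
    content = ET.SubElement(ctype, f"{{{XS}}}complexContent")
    extension = ET.SubElement(content, f"{{{XS}}}extension", {"base": base})
    sequence = ET.SubElement(extension, f"{{{XS}}}sequence")
    for prop, rng in properties:
        ET.SubElement(sequence, f"{{{XS}}}element",
                      {"name": prop, "type": f"{rng}Type"})
    if union:
        choice = ET.SubElement(sequence, f"{{{XS}}}choice")
        for member in union:
            ET.SubElement(choice, f"{{{XS}}}element",
                          {"name": member, "type": f"{member}Type"})
    return ctype


# Hypothetical class description: a Report is a kind of segment whose
# content is either an Interview or a FilmClip.
report = complex_type("Report", base="mp7:SegmentType",
                      union=("Interview", "FilmClip"))
xml_text = ET.tostring(report, encoding="unicode")
print(xml_text)
```

Because the generated type extends a segment type, instances of it remain usable wherever a segment is expected, which is the sense in which the slide calls the descriptions MPEG-7 valid.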


Build description schemes for the documents

• Let us watch some sports magazines
  – construction of a simple schema based on StudioSequence, Report and Interview
  – a Report contains some FilmClips of Broadcast Live Sports
• The schema provides the description skeleton for several sports magazines:
  – Téléfoot (soccer)
  – VéloClub (cycling)
  – 3 Partout (multisports)


SegmenTool [French project CHAPERON]


Instantiate a document content model

<ina:Report id="aa23c647c-6517-4aee-8bce-870ae52a01af">
  ...
  <mp7:TemporalDecomposition>
    <ina:Interview id="adb23ab65-f8e7-4b2a-8c98-807197da600a">
      <mp7:Semantic>...</mp7:Semantic>
      <mp7:MediaTime>
        <mp7:MediaTimePoint>T00:24:19</mp7:MediaTimePoint>
        <mp7:MediaDuration>PT00H00M07S</mp7:MediaDuration>
      </mp7:MediaTime>
      <ina:Themes value="Cycling"/>
    </ina:Interview>
  </mp7:TemporalDecomposition>
  ...
</ina:Report>

KB (RDF triples):
Interview hasStartTime 24m19s
Interview hasDuration 7s
Interview hasThemes Cycling
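The step from the XML description to the triples can be sketched as follows. The talk performs this step with an XSLT transformation; the Python version below is only a stand-in to show the mapping. The namespace URIs, the short `id` values and the `rdf:type` triple are assumptions added for the example, and times are converted to seconds for simplicity.

```python
import re
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption: the slide shows only the prefixes).
NS = {"ina": "http://example.org/ina", "mp7": "http://example.org/mp7"}

DOC = """\
<ina:Report xmlns:ina="http://example.org/ina"
            xmlns:mp7="http://example.org/mp7" id="report3">
  <mp7:TemporalDecomposition>
    <ina:Interview id="interview4">
      <mp7:MediaTime>
        <mp7:MediaTimePoint>T00:24:19</mp7:MediaTimePoint>
        <mp7:MediaDuration>PT00H00M07S</mp7:MediaDuration>
      </mp7:MediaTime>
      <ina:Themes value="Cycling"/>
    </ina:Interview>
  </mp7:TemporalDecomposition>
</ina:Report>
"""


def to_seconds(text):
    """Convert 'T00:24:19' or 'PT00H00M07S' into a number of seconds."""
    hours, minutes, secs = (int(n) for n in re.findall(r"\d+", text))
    return 3600 * hours + 60 * minutes + secs


def to_triples(xml_text):
    """Emit (subject, property, value) triples for each temporal segment."""
    root = ET.fromstring(xml_text)
    out = []
    for seg in root.iterfind(".//mp7:TemporalDecomposition/*", NS):
        subject = seg.get("id")
        out.append((subject, "rdf:type", seg.tag.split("}")[1]))
        start = seg.findtext("mp7:MediaTime/mp7:MediaTimePoint", namespaces=NS)
        duration = seg.findtext("mp7:MediaTime/mp7:MediaDuration", namespaces=NS)
        if start is not None:
            out.append((subject, "hasStartTime", to_seconds(start)))
        if duration is not None:
            out.append((subject, "hasDuration", to_seconds(duration)))
        for theme in seg.iterfind("ina:Themes", NS):
            out.append((subject, "hasThemes", theme.get("value")))
    return out


RESULT = to_triples(DOC)
for triple in RESULT:
    print(triple)
```

Keeping the segment `id` as the triple subject lets a query result be traced back to the exact place in the structured description.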


The Cycling Ontology


Knowledge base population

Cycling Domain: texts + base of facts

<rdf about="{URI}/SportsMagazine/Report3/Interview4">
  <!-- formal statements from a base of facts -->
</rdf>

[Figure: RDF graph with the nodes Rider, SeveralStagesRace, 2, "Sandy Casar" and "Paris-Nice", linked by the properties hasName, overallResults, position and cyclingRace]

<rdf:Description rdf:about="http://../Stade2-17_03_2002.xml#ina:Interview[@id=interview4]"> ..... </rdf:Description>


Implementation of the KB

• Sesame: architecture for the storage of RDF triples [Broekstra, 2002]
  – Supports different query languages: RQL, RDQL and SeRQL
  – Implements the RDFS semantics (RDF-MT engine)
• BOR: reasoner for the DAML+OIL language [Simov & Jordanov, 2002]
• SeBOR: integration of the two systems, done in the On-To-Knowledge EU-IST Project
  – Enhanced inference services are provided
  – Close to what an OWL DL reasoner will perform
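To illustrate why subclass inference matters for the extended query from the Example slide (dialog sequences with a rider about a race with several stages), here is a plain-Python stand-in. It does not use Sesame, BOR or any of their query languages. The class `DialogSequence`, the properties `hasParticipant` and `concerns`, and all instance names are hypothetical; `Interview`, `Rider` and `SeveralStagesRace` come from the slides.

```python
TRIPLES = {
    # Schema fragment (the subclass link is an assumption for illustration).
    ("Interview", "subClassOf", "DialogSequence"),
    # Facts about one interview and one unrelated studio sequence.
    ("interview4", "type", "Interview"),
    ("interview4", "hasParticipant", "casar"),
    ("interview4", "concerns", "parisNice"),
    ("casar", "type", "Rider"),
    ("parisNice", "type", "SeveralStagesRace"),
    ("news7", "type", "StudioSequence"),
    ("news7", "concerns", "parisNice"),
}


def inferred_types(triples):
    """Map each instance to its classes, closed under subClassOf."""
    parents = {}
    for s, p, o in triples:
        if p == "subClassOf":
            parents.setdefault(s, set()).add(o)
    types = {}
    for s, p, o in triples:
        if p != "type":
            continue
        seen = types.setdefault(s, set())
        stack = [o]
        while stack:
            cls = stack.pop()
            if cls not in seen:
                seen.add(cls)
                stack.extend(parents.get(cls, ()))
    return types


def find_sequences(triples, seq_class, who_class, what_class):
    """Sequences of seq_class with a who_class participant about a what_class."""
    types = inferred_types(triples)

    def is_a(node, cls):
        return cls in types.get(node, set())

    hits = set()
    for s, p, o in triples:
        if p == "hasParticipant" and is_a(s, seq_class) and is_a(o, who_class):
            if any(s2 == s and p2 == "concerns" and is_a(o2, what_class)
                   for s2, p2, o2 in triples):
                hits.add(s)
    return sorted(hits)


print(find_sequences(TRIPLES, "DialogSequence", "Rider", "SeveralStagesRace"))
```

The interview is never asserted to be a dialog sequence; it is returned only because the subclass link is followed, which is the kind of query extension a plain keyword search over the annotation text cannot offer.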


Sesame+BOR interface

Demo


Conclusion

• General architecture for reasoning on descriptions of video documents:
  – Modeling of 2 ontologies (methodology + DOE)
  – Formalization of these ontologies (OilEd, OWL)
  – Creation of document schemes (extended MPEG-7)
  – Creation of instances of these schemas: the structure of the descriptions (SegmenTool + XSLT transformation for creating a base of RDF triples)
  – Creation of a Knowledge Base of events related to cycling races and use of an adapted reasoner (Sesame + BOR, ©AIdministrator-NL & ©OntoText-BG)


Future work

• Development integration
  – provide a simple interface for querying on both the structure and the content of the video
  – watch the AV sequences corresponding to the RDF triples returned by SeBOR
• Mid-term objectives
  – scalability: test the system on a large base of videos annotated by real users
  – use the future OWL reasoners
• Long-term objectives
  – use this architecture with another domain (other than cycling)
  – will we simply have to build another ontology? what do we have to adapt?