
Integrating Structure and Semantics into Audio-visual Documents

Tuesday 21st of October, 2003

Raphaël Troncy

2nd International Semantic Web Conference (ISWC2003)

10/21/2003 Raphaël Troncy - ISWC'2003 2

Description of the AV content

• A three-step process:
– identification of the content creator and the content provider: Dublin Core metadata, VRA core categories …
– structural decomposition into video segments corresponding to the logical structure of the program: time-code, spatial coordinates
– semantic description of these segments: controlled vocabulary, thesaurus, free-text annotation

Description of the AV content

• Segmentation: describe the logical structure
– locate and date some events
• Description: describe the semantics of the content
– type each segment with an AV genre
– type each segment with a general thematic
– describe the scene (who, when, where, what, …)

[Figure: a segment on a timeline (time t), typed with the genre "report" and the theme "athletics", and described as: "Michael Johnson smashed the 200m world record to complete a 200m in 19''32 in Atlanta for the Olympic Games".]
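As a rough illustration of the two steps above, a described segment could be held in a small record like the one below. This is only a sketch: the `Segment` class, its field names, the `segments_at` helper and the timings are assumptions made for the example; only the genre, theme and scene text come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One temporal segment of a program, in seconds from the program start."""
    start: float
    duration: float
    genre: str   # AV genre, e.g. "report"
    theme: str   # general thematic, e.g. "athletics"
    scene: dict = field(default_factory=dict)  # who / when / where / what

    @property
    def end(self) -> float:
        return self.start + self.duration


def segments_at(segments, t):
    """Return the segments whose time span contains instant t."""
    return [s for s in segments if s.start <= t < s.end]


# Hypothetical timings; genre, theme and scene follow the slide's example.
report = Segment(
    start=120.0,
    duration=45.0,
    genre="report",
    theme="athletics",
    scene={
        "who": "Michael Johnson",
        "where": "Atlanta",
        "when": "Olympic Games",
        "what": "200m world record in 19''32",
    },
)
```

Segmentation fills `start` and `duration`; description fills `genre`, `theme` and `scene`.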

Example

Q: Find all AV sequences of type interview with Sandy Casar and concerning the Paris-Nice cycling race
– noisy answer: there are other sports news items in the sequence
– incomplete answer: the interview was broadcast in two parts and began in a previous sequence
– the query cannot be extended!

13 [Indoor Set: 6th part]

at 18:43:56:00 - 00:09:06:00. – Eurosport

In studio, the second part of the interview, from Nice, of Sandy CASAR by Jean René GODART about the Paris-Nice cycling race and a few sports news with pictures commented by Alexandre BOYON and Laurent PUYAT.

Q : Find all AV sequences of type dialog sequence with a rider and concerning any cycling race with several stages

• Requirements:
– express models that constrain the logical structure
  • identify an interview inside a report of a sports magazine
– represent the meaning contained in this structure
  • a cartoon is a fiction with no real characters
– describe semantically the content of each sequence
  • the Prologue is always an individual time trial numbered stage 0

Which languages are the most suitable to perform all these tasks ?

Problems
• Weak use of the logical structures
• Descriptions are not made for reasoning

make the AV descriptions accessible to automated processes

"Pure" documentary approaches
• General bibliographic description languages (DC, VRA)
• MPEG-7: the new multimedia description language?
– three components: Descriptors (D), Description Schemes (DS) and the Description Definition Language (DDL)
– structure: segment = abstract unit defined by temporal localization or masks
– semantics: entity–attribute–relation model + thesaurus for structuring the knowledge (Classification Schemes)
– tools: Videto (ZGDV), Vizard (EU-IST Project), MovieTool (© Ricoh)
• Extensions
– XML Schema: add structure without semantics
  • TV Anytime, Mdéfi [Tran Thuong, 2003]
– Classification Schemes: very poor expressivity
  • COALA [Fatemi, 2003]

KR approaches: OWL+RDF

• Definition of concepts and relations
StudioProgram ≡ and ( HomogeneousProgram (all hasPart StudioSequence) )

• Definition of axioms
HomogeneousProgram ⊓ HeterogeneousProgram = ⊥

Problem: the structure of the document (i.e. the context) is lost!
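One way to read the concept definition and the axiom above is as two checks over asserted class names. The sketch below is an assumption-laden simplification, not how a description-logic reasoner works: it evaluates the "all hasPart" restriction only over the parts that are explicitly listed (a closed-world shortcut), and the function names and data layout are invented for the example.

```python
def is_studio_program(program_types, part_types):
    """Check the StudioProgram condition for one program.

    program_types: set of class names asserted for the program.
    part_types: list of sets, one per hasPart filler, with its class names.
    """
    # "all hasPart StudioSequence": every listed part is a StudioSequence.
    all_parts_in_studio = all("StudioSequence" in t for t in part_types)
    return "HomogeneousProgram" in program_types and all_parts_in_studio


def violates_disjointness(program_types):
    """A program cannot be both homogeneous and heterogeneous."""
    return {"HomogeneousProgram", "HeterogeneousProgram"} <= program_types
```

For instance, `is_studio_program({"HomogeneousProgram"}, [{"StudioSequence"}])` holds, while a program with a `Report` part does not satisfy the condition.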

let us merge the two approaches !

General architecture

[Figure: general architecture. Labels: document schemes and document instances (MPEG-7 / XML Schema), linked by "valid"; documentalists; users; transformation; AV Ontology, Domain-specific Ontology and statements base (OWL / RDF).]

The Audio-visual Ontology
• Methodology of construction: [Bachimont et al., EKAW’02]
– Conceptualization: differential principles
– Formalization: formal definitions, axioms
– Operationalization: export into a KR language
• AV domain:
– Production objects (program, sequence, AV genre), Properties (theme), Persons, Technical Process (shooting, recording, post-production), Signal descriptors (audio, video), etc.
• Tools:
– Conceptualization: DOE [Bachimont et al., EKAW’02]
– Formalization: OilEd [Bechhofer, KI’01]
– Languages: DAML+OIL … OWL
• DOE and ontologies are available at: http://opales.ina.fr/public/ontologies/


Generate XML Schema types

OWL → XML Schema (XSLT ?)
• Class → Complex type
• Sub-class → Extension
• Restriction on properties → Element of the content model
• Union of classes → Choice in the content model

Some concepts (program, sequence) extend the MPEG-7 Segment type, hence the descriptions are MPEG-7 valid
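The correspondence between OWL constructs and XML Schema constructs can be sketched as a small generator. The slide suggests XSLT for this step; the Python below is only a stand-in illustration. The `complex_type` helper, the `mp7:` and `ina:` prefixes, the type names (`ReportType`, `mp7:SegmentType`, `ina:...Type`) and the choice of `FilmClip` and `Interview` as children are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def complex_type(name, base, elements=(), choice=()):
    """Build an xs:complexType for a class that is a sub-class of `base`.

    elements: child names from property restrictions (content-model elements).
    choice: child names from a union of classes (xs:choice in the content model).
    """
    ctype = ET.Element(f"{{{XS}}}complexType", {"name": name})
    content = ET.SubElement(ctype, f"{{{XS}}}complexContent")
    # Sub-class -> extension of the parent type.
    ext = ET.SubElement(content, f"{{{XS}}}extension", {"base": base})
    if elements or choice:
        seq = ET.SubElement(ext, f"{{{XS}}}sequence")
        for el in elements:
            ET.SubElement(seq, f"{{{XS}}}element",
                          {"name": el, "type": f"ina:{el}Type"})
        if choice:
            alt = ET.SubElement(seq, f"{{{XS}}}choice")
            for el in choice:
                ET.SubElement(alt, f"{{{XS}}}element",
                              {"name": el, "type": f"ina:{el}Type"})
    return ctype


# Hypothetical example: a Report type extending an MPEG-7 segment type.
report_type = complex_type("ReportType", "mp7:SegmentType",
                           choice=["FilmClip", "Interview"])
xsd_text = ET.tostring(report_type, encoding="unicode")
```

Because the generated type extends a segment type, instances of it can still be read as segments, which is the point made on the slide about MPEG-7 validity.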

Build description schemes for the documents

• Let us watch some sports magazines
– construction of a simple schema based on StudioSequence, Report and Interview
– a Report contains some FilmClips of Broadcast Live Sports
• The schema provides the description skeleton for several sports magazines:
– Téléfoot (soccer)
– VéloClub (cycling)
– 3 Partout (multisports)


SegmenTool [French project CHAPERON]

Instantiate a document content model

<ina:Report id="aa23c647c-6517-4aee-8bce-870ae52a01af">
  <mp7:TemporalDecomposition>
    <ina:Interview id="adb23ab65-f8e7-4b2a-8c98-807197da600a">
      <mp7:Semantic>...</mp7:Semantic>
      <mp7:MediaTime>
        <mp7:MediaTimePoint>T00:24:19</mp7:MediaTimePoint>
        <mp7:MediaDuration>PT00H00M07S</mp7:MediaDuration>
      </mp7:MediaTime>
      <ina:Themes value="Cycling"/>
    </ina:Interview>
  </mp7:TemporalDecomposition>
</ina:Report>

KB: RDF triples
– Interview hasStartTime 24m19s
– Interview hasDuration 7s
– Interview hasThemes Cycling
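The step from a document instance to RDF-style triples can be sketched as follows. The slides perform this step with an XSLT transformation; the Python below is only a swapped-in illustration of the same mapping. The `ina` namespace URI, the short ids, the `interview_triples` helper and the tuple representation of triples are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The "ina" URI is a placeholder; "mp7" uses the MPEG-7 2001 schema namespace.
NS = {"mp7": "urn:mpeg:mpeg7:schema:2001", "ina": "http://example.org/ina"}

DOC = """
<ina:Report xmlns:ina="http://example.org/ina"
            xmlns:mp7="urn:mpeg:mpeg7:schema:2001" id="report1">
  <mp7:TemporalDecomposition>
    <ina:Interview id="interview4">
      <mp7:MediaTime>
        <mp7:MediaTimePoint>T00:24:19</mp7:MediaTimePoint>
        <mp7:MediaDuration>PT00H00M07S</mp7:MediaDuration>
      </mp7:MediaTime>
      <ina:Themes value="Cycling"/>
    </ina:Interview>
  </mp7:TemporalDecomposition>
</ina:Report>
"""


def interview_triples(xml_text):
    """Turn each ina:Interview element into (subject, property, value) triples."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for itw in root.iterfind(".//ina:Interview", NS):
        subject = "#" + itw.get("id")
        triples.append((subject, "rdf:type", "Interview"))
        start = itw.findtext("mp7:MediaTime/mp7:MediaTimePoint", namespaces=NS)
        duration = itw.findtext("mp7:MediaTime/mp7:MediaDuration", namespaces=NS)
        triples.append((subject, "hasStartTime", start))
        triples.append((subject, "hasDuration", duration))
        for theme in itw.iterfind("ina:Themes", NS):
            triples.append((subject, "hasThemes", theme.get("value")))
    return triples


triples = interview_triples(DOC)
```

Keeping the element id as the subject preserves the link back to the position of the segment in the document structure.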


The Cycling Ontology

Knowledge base population

[Figure: knowledge base population. Texts about the cycling domain yield a base of formal statements (RDF). Labels in the example graph: SeveralStagesRace, "Sandy Casar", "Paris-Nice"; properties: hasName, overallResults, position, cyclingRace.]

<rdf:Description rdf:about="http://../Stade2-17_03_2002.xml#ina:Interview[@id=interview4]"> .....</rdf:Description>

Implementation of the KB

• Sesame: architecture for the storage of RDF triples [Broekstra, 2002]
– Supports different query languages: RQL, RDQL and SeRQL
– Implements the RDFS semantics (RDF-MT engine)
• BOR: reasoner for the DAML+OIL language [Simov & Jordanov, 2002]
• SeBOR: integration of the two systems, done in the On-To-Knowledge EU-IST Project
– Enhanced inference services are provided
– Close to what an OWL DL reasoner will perform
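To show why RDFS-style inference matters for the extended query from the example slide (dialog sequences concerning a race with several stages), here is a minimal sketch in plain Python. It does not use Sesame, BOR or any of their APIs. The `concerns` property, the `Interview` sub-class of `DialogSequence` link, the resource names and both functions are assumptions made for the example.

```python
def rdfs_closure(triples):
    """Add the rdf:type triples entailed by rdfs:subClassOf (naive fixpoint)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        sub = {(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"}
        for s, p, o in list(inferred):
            if p != "rdf:type":
                continue
            for child, parent in sub:
                if o == child and (s, "rdf:type", parent) not in inferred:
                    inferred.add((s, "rdf:type", parent))
                    changed = True
    return inferred


# Hypothetical knowledge base mixing document triples and domain facts.
KB = {
    ("Interview", "rdfs:subClassOf", "DialogSequence"),
    ("#interview4", "rdf:type", "Interview"),
    ("#interview4", "concerns", "#paris-nice"),
    ("#paris-nice", "rdf:type", "SeveralStagesRace"),
    ("#paris-nice", "hasName", "Paris-Nice"),
}


def dialog_sequences_about_stage_races(kb):
    """Subjects typed DialogSequence that concern a SeveralStagesRace."""
    closed = rdfs_closure(kb)
    return sorted(
        s for s, p, o in closed
        if p == "concerns"
        and (s, "rdf:type", "DialogSequence") in closed
        and (o, "rdf:type", "SeveralStagesRace") in closed
    )
```

Without the closure step the interview is never typed as a dialog sequence, so the extended query returns nothing; with it, `#interview4` is found.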

Sesame+BOR interface

Conclusion
• General architecture for reasoning on descriptions of video documents:
– Modeling of 2 ontologies (methodology + DOE)
– Formalization of these ontologies (OilEd, OWL)
– Creation of document schemes (extended MPEG-7)
– Creation of instances of these schemas: the structure of the descriptions (SegmenTool + XSLT transformation for creating a base of RDF triples)
– Creation of a Knowledge Base of events related to cycling races and use of an adapted reasoner (Sesame + BOR, ©AIdministrator-NL & ©OntoText-BG)

Future work
• Development integration
– provide a simple interface for querying both the structure and the content of the video
– watch the AV sequences corresponding to the RDF triples returned by SeBOR
• Mid-term objectives
– scalability: test the system on a large base of videos annotated with real users
– use the future OWL reasoners
• Long-term objectives
– use this architecture with another domain (other than cycling)
– will we simply have to build another ontology? what do we have to adapt?
