Semantics at the multimedia fragment level or how enabling the remixing of online media

Raphaël Troncy

Post on 05-Dec-2014

DESCRIPTION

Presentation given at the 6th tele-TASK symposium at HPI, October 2012, Potsdam, Germany

TRANSCRIPT

Page 1: Semantics at the multimedia fragment level or how enabling the remixing of online media

Semantics at the multimedia fragment level or how enabling the remixing of online media

Raphaël Troncy

Page 2: Semantics at the multimedia fragment level or how enabling the remixing of online media

Once upon a time …

09/10/2012 - Semantics at the multimedia fragment level - tele-Task Symposium, Potsdam, Octobre 2012 - 2

Page 3: Semantics at the multimedia fragment level or how enabling the remixing of online media

… leading to sharing Media Fragments

Publishing a status message containing a Media Fragment URI
Use a '#'!
Highlight a video sequence
Highlight a region to pay attention to

Page 4: Semantics at the multimedia fragment level or how enabling the remixing of online media

What are Media Fragments?

[Figure: timeline t starting at 0, with a fragment from 20 to 35]

temporal media fragment
spatial media fragment
track media fragment
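The three kinds of fragment listed on this slide are expressed in the part of a URI that follows the '#'. The following is a minimal sketch, assuming the name=value syntax of the W3C Media Fragments URI specification (`t` for time, `xywh` for a region, `track` for a track). The function name `parse_media_fragment` and the example URL are made up for illustration, and only plain seconds and pixel regions are handled.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_media_fragment(uri):
    """Return the temporal, spatial and track dimensions found after '#'."""
    fragment = urlsplit(uri).fragment
    result = {}
    for name, value in parse_qsl(fragment):
        if name == "t":
            # Optional "npt:" prefix (normal play time); other time
            # formats are not handled in this sketch.
            if value.startswith("npt:"):
                value = value[4:]
            # "20,35" -> start 20 s, end 35 s; either side may be omitted.
            start, _, end = value.partition(",")
            result["t"] = (float(start) if start else 0.0,
                           float(end) if end else None)
        elif name == "xywh":
            # "160,120,320,240" -> x, y, width, height in pixels.
            x, y, w, h = (int(v) for v in value.split(","))
            result["xywh"] = (x, y, w, h)
        elif name == "track":
            result.setdefault("track", []).append(value)
    return result


if __name__ == "__main__":
    # Hypothetical URL: seconds 20 to 35 of the video, one region, one track.
    print(parse_media_fragment(
        "http://example.com/video.mp4#t=20,35&xywh=160,120,320,240&track=audio"))
```

A player that understands such a URI can seek directly to the fragment start, which is what makes a fragment shareable as an ordinary link.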

Page 5: Semantics at the multimedia fragment level or how enabling the remixing of online media

Media Fragments (temporal)

[Figure: playback timeline showing the original resource length, the fragment beginning, the fragment end and the playback progress]

Page 6: Semantics at the multimedia fragment level or how enabling the remixing of online media

Media Fragments (spatial) + Demo

[Figure: highlighted fragment under a semi-opaque overlay]

Page 7: Semantics at the multimedia fragment level or how enabling the remixing of online media

Media Fragments URIs

Bookmark / share parts (fragments) of audio/video content
Annotate media fragments
Search for media fragments
Mash-ups
Conserve bandwidth

http://www.w3.org/TR/media-frags-reqs/
http://www.w3.org/TR/media-frags/

Page 8: Semantics at the multimedia fragment level or how enabling the remixing of online media

Video annotation

Page 9: Semantics at the multimedia fragment level or how enabling the remixing of online media

Video interactivity

CONCEPT IN PLAYER
Cubism, Expressionism, Fauvism
FACETS / PROPERTIES OF CONCEPT
CONTENT ENRICHMENT

Page 10: Semantics at the multimedia fragment level or how enabling the remixing of online media

LinkedTV EU Project

Vision
Ubiquitously online cloud of Networked Audio-Visual Content
Decoupled from place, device or source

Aim
Provide interactive multimedia service for non-professional end-users
Focus on television broadcast content as seed videos

12 Excellent Partners
Fraunhofer, STI GMBH, CERTH, Eurecom, Condat, BEELD EN GELUID, UEP, UMONS, CWI, Noterik, U. ST GALLEN, RBB

Web: http://www.linkedtv.eu

Page 11: Semantics at the multimedia fragment level or how enabling the remixing of online media

Video Accessibility

What is required to make video accessible on the Web?

Technologies:
Annotating: automatic (speech transcription) and manual (social collaborative annotation tool)
Addressing: pointing to, retrieving, transmitting only parts of media
Rendering: video visualization for the impaired, Braille output

Benchmarking: Sphinx, HTK, Julius

Page 12: Semantics at the multimedia fragment level or how enabling the remixing of online media

Speech Processing

Page 13: Semantics at the multimedia fragment level or how enabling the remixing of online media

Demo: http://semantics.eurecom.fr/acav/

Page 14: Semantics at the multimedia fragment level or how enabling the remixing of online media


Page 15: Semantics at the multimedia fragment level or how enabling the remixing of online media

Semantic indexing at the fragment level

Benchmarking: Sphinx, HTK, Julius
NER + full text index with the transcription
Interlinking with the Linked Data Cloud to enable semantic search

Page 16: Semantics at the multimedia fragment level or how enabling the remixing of online media

NERD: Named Entity Recognition and Disambiguation

Compare performances of Named Entity Recognition tools
Understand strengths and weaknesses of different Web APIs
Adapt NER processing to different contexts
(Learn how to) combine NER tools

Participate in the ANR ETAPE benchmark

Page 17: Semantics at the multimedia fragment level or how enabling the remixing of online media

What is a Named Entity recognition task?

A task that aims to locate and classify, in a textual document, the name of a person or an organization, a location, a brand, a product, or a numeric expression including time, date, money and percent

Page 18: Semantics at the multimedia fragment level or how enabling the remixing of online media

NER Tools and Web APIs

Standalone software: GATE, Stanford CoreNLP, Temis

Web APIs
http://nerd.eurecom.fr/

Page 19: Semantics at the multimedia fragment level or how enabling the remixing of online media

What is NERD?

ontology (1)
REST API (2)
UI (3)

The NERD ontology has been integrated in the NIF project (EU FP7, in the context of LOD2: Creating Knowledge out of Interlinked Data)

1 http://nerd.eurecom.fr/ontology
2 http://nerd.eurecom.fr/api/application.wadl
3 http://nerd.eurecom.fr

Page 20: Semantics at the multimedia fragment level or how enabling the remixing of online media

Factual comparison of 10 Web NER tools

| Tool | Language | Granularity | Entity position | Classification schema | Number of classes | Response format | Quota (calls/day) |
|---|---|---|---|---|---|---|---|
| AlchemyAPI | EN, FR, GR, IT, PT, RU, SP, SW | OEN | N/A | Alchemy | 324 | JSON, MicroFormat, XML, RDF | 30000 |
| DBpedia Spotlight | EN, GR*, PT*, SP* | OEN | char offset | DBpedia, FreeBase, Schema.org | 320 | HTML, JSON, RDF, XML | unl |
| Evri | EN, IT | OED | N/A | Evri | 5 | HTML, JSON, RDF | 3000 |
| Extractiv | EN | OEN | word offset | DBpedia | 34 | HTML, JSON, RDF, XML | 3000 |
| Lupedia | EN, FR, IT | OEN | range of chars | DBpedia, LinkedMDB | 319 | HTML, JSON, RDFa, XML | unl |
| OpenCalais | EN, FR, SP | OEN | char offset | OpenCalais | 95 | JSON, MicroFormat | 50000 |
| Saplo | EN, SW | OED | N/A | N/A | 5 | JSON | 1333 |
| Wikimeta | EN, FR, SP | OEN | POS offset | ESTER | 7 | JSON, XML | unl |
| Yahoo! | EN | OEN | range of chars | Yahoo | 13 | JSON, XML | 5000 |
| Zemanta | EN | OED | N/A | FreeBase | 81 | XML, JSON, RDF | 10000 |

Page 21: Semantics at the multimedia fragment level or how enabling the remixing of online media

NERD Ontology

Aligned the taxonomies used by the extractors

Page 22: Semantics at the multimedia fragment level or how enabling the remixing of online media

Building the NERD Ontology

| NERD type | Occurrence |
|---|---|
| Person | 10 |
| Organization | 10 |
| Country | 6 |
| Company | 6 |
| Location | 6 |
| Continent | 5 |
| City | 5 |
| RadioStation | 5 |
| Album | 5 |
| Product | 5 |
| ... | ... |

Page 23: Semantics at the multimedia fragment level or how enabling the remixing of online media

NERD REST API

Methods: GET, POST, PUT, DELETE
Resources: /document, /user, /annotation/{extractor}, /extraction, /evaluation
Formats: RDF, JSON

...
"entities" : [{
  "entity": "Tim Berners-Lee",
  "type": "Person",
  "uri": "http://dbpedia.org/resource/Tim_berners_lee",
  "nerdType": "http://nerd.eurecom.fr/ontology#Person",
  "startChar": 30,
  "endChar": 45,
  "confidence": 1,
  "relevance": 0.5
}]

Rizzo G., Troncy R. (2012), NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Web Extraction Tools. In: European chapter of the Association for Computational Linguistics (EACL'12), Avignon, France.
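As an illustration of how a client might consume the JSON shown on this slide, here is a minimal sketch. It assumes that the response is an object with an "entities" array carrying the fields from the slide, and that startChar is inclusive while endChar is exclusive (which matches the 15-character span 30 to 45 for "Tim Berners-Lee"). The function name `entity_mentions` and the sample sentence are hypothetical.

```python
import json


def entity_mentions(text, response_json):
    """Return (surface form, NERD type, URI) for each entity in a response."""
    response = json.loads(response_json)
    mentions = []
    for item in response.get("entities", []):
        # Assumption: startChar is inclusive and endChar is exclusive.
        surface = text[item["startChar"]:item["endChar"]]
        # Keep only the local name of the NERD ontology class.
        nerd_type = item["nerdType"].rsplit("#", 1)[-1]
        mentions.append((surface, nerd_type, item["uri"]))
    return mentions


if __name__ == "__main__":
    # Hypothetical input text; the entity starts at character 30.
    text = "In 1989, the Web was built by Tim Berners-Lee."
    response_json = json.dumps({"entities": [{
        "entity": "Tim Berners-Lee",
        "type": "Person",
        "uri": "http://dbpedia.org/resource/Tim_berners_lee",
        "nerdType": "http://nerd.eurecom.fr/ontology#Person",
        "startChar": 30,
        "endChar": 45,
        "confidence": 1,
        "relevance": 0.5,
    }]})
    for mention in entity_mentions(text, response_json):
        print(mention)
```

Slicing the original text with the offsets, rather than trusting the "entity" string alone, makes it easy to check that an extractor's offsets really point at the mention it reports.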

Page 24: Semantics at the multimedia fragment level or how enabling the remixing of online media

NERD meets NIF

Model documents through a set of strings dereferenceable on the Web

:offset_23107_23110 a str:String ;
    str:referenceContext :offset_0_26546 .

Map string to entity

:offset_23107_23110 sso:oen dbpedia:W3C .

Classification

dbpedia:W3C rdf:type nerd:Organization .

Rizzo G., Troncy R., Hellmann S. and Bruemmer M. (2012), NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud. In: (LDOW'12) Linked Data on the Web (WWW'12), Lyon, France.
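The three statements on this slide follow a regular pattern, so they can be generated from an extraction result. The sketch below only formats strings in the same shape as the slide; it assumes that the prefixes (str, sso, dbpedia, nerd, rdf) are declared elsewhere and that the context string spans offset 0 to the document length. The function name `nif_statements` is hypothetical and this is not the NERD implementation.

```python
def nif_statements(start, end, context_length, entity, nerd_type):
    """Format the string, entity-link and classification statements."""
    string_id = f":offset_{start}_{end}"
    context_id = f":offset_0_{context_length}"
    return [
        # The extracted string, anchored in its reference context.
        f"{string_id} a str:String ;\n"
        f"    str:referenceContext {context_id} .",
        # Map the string to an entity.
        f"{string_id} sso:oen {entity} .",
        # Classify the entity with a NERD ontology class.
        f"{entity} rdf:type {nerd_type} .",
    ]


if __name__ == "__main__":
    for statement in nif_statements(23107, 23110, 26546,
                                    "dbpedia:W3C", "nerd:Organization"):
        print(statement)
```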

Page 25: Semantics at the multimedia fragment level or how enabling the remixing of online media

NERD User Interface

Page 26: Semantics at the multimedia fragment level or how enabling the remixing of online media

NERD Dashboard

Page 27: Semantics at the multimedia fragment level or how enabling the remixing of online media

History of NER benchmarks

CoNLL 2003 and CoNLL 2005
  schema (4 types): person, organization, location and miscellaneous
  language independent task

ACE 2004, ACE 2005 and ACE 2007
  schema (7 types): person, organization, location, facility, weapon, vehicle and geo-political entity
  entity recognition, not just name (e.g. description, pronoun)
  find relationships among entities extracted

TAC 2009 (Knowledge Base Track)
  schema (3 types): person, organization and location
  create a knowledge base from the named entities extracted

ETAPE 2012 (Named Entity Task)
  schema: Quaero (7 main types, 32 sub-types)

Page 28: Semantics at the multimedia fragment level or how enabling the remixing of online media

ETAPE 2012 challenge

| genre | train | dev | test | sources |
|---|---|---|---|---|
| TV news | 7h 40m | 1h 40m | 1h 40m | BFM Story, Top Questions (LCP) |
| TV debates | 10h 30m | 5h 10m | 5h 10m | Pile et Face, Ca vous regarde, Entre les lignes (LCP) |
| TV amusements | - | 1h 05m | 1h 05m | La place du village (TV8) |

| Item | Train | Dev | Eval |
|---|---|---|---|
| length | 26h | 10h 55m | 10h 55m |
| Nb files | 44 | 15 | 15 |
| Nb words | 290517 | 91656 | 115511 |
| Nb Named Entities | 46763 | 14398 | 13055 |
| Nb unique categories | 33 | 33 | 33 |

Page 29: Semantics at the multimedia fragment level or how enabling the remixing of online media

Participation at ETAPE (combined strategy)

extraction: (eA1, tA1, URIA1, siA1, eiA1), (eA2, tA2, URIA2, siA2, eiA2), (eA3, tA3, URIA3, siA3, eiA3), ...
cleaning
fusion: (eN1, tN1, URIN1, siN1, eiN1), (eN2, tN2, URIN2, siN2, eiN2)

When at least 2 extractors classify the same entity with a different type, then we apply a preferred selection order (empirically defined): Wikimeta, AlchemyAPI, OpenCalais, Lupedia
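The fusion rule of the combined strategy can be sketched as follows. This is only an illustration under stated assumptions: the tuple fields are read as entity, type, URI, start index and end index; "the same entity" is taken to mean the same character span; and extractors missing from the preferred order rank last. The names `Extraction`, `PREFERRED_ORDER` and `fuse`, and the sample data, are hypothetical.

```python
from typing import NamedTuple

# Preferred selection order from the slide (empirically defined).
PREFERRED_ORDER = ["wikimeta", "alchemyapi", "opencalais", "lupedia"]


class Extraction(NamedTuple):
    extractor: str
    entity: str
    type: str
    uri: str
    start: int
    end: int


def fuse(extractions):
    """Keep one extraction per span, chosen by the preferred order."""
    rank = {name: i for i, name in enumerate(PREFERRED_ORDER)}
    best = {}
    for ex in extractions:
        # Assumption: two extractions describe the same entity
        # when they cover the same character span.
        key = (ex.start, ex.end)
        current = best.get(key)
        if current is None or (rank.get(ex.extractor, len(rank))
                               < rank.get(current.extractor, len(rank))):
            best[key] = ex
    # Return the surviving extractions in document order.
    return [best[key] for key in sorted(best)]


if __name__ == "__main__":
    # Made-up conflict: two extractors disagree on the type of one span.
    conflict = [
        Extraction("alchemyapi", "Paris", "Person", "", 10, 15),
        Extraction("wikimeta", "Paris", "Location", "", 10, 15),
    ]
    print(fuse(conflict))
```

With a fixed order the rule is deterministic and cheap, at the cost of ignoring agreement among lower-ranked extractors.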

Page 30: Semantics at the multimedia fragment level or how enabling the remixing of online media

Participation at ETAPE (combined+ strategy)

ETAPE Train & Dev: learned model (POS tagger), created static rules
extraction: (eA1, tA1, URIA1, siA1, eiA1), (eA2, tA2, URIA2, siA2, eiA2), ...
apply rules: (e1, t1, URI1, si1, ei1)
fusion: (eN1, tN1, URIN1, siN1, eiN1), (eN2, tN2, URIN2, siN2, eiN2)

Conflicts handled by priority selection: own, Wikimeta, AlchemyAPI, OpenCalais, Lupedia

Page 31: Semantics at the multimedia fragment level or how enabling the remixing of online media

NERD Global results

| Strategy | SLR | Precision | Recall | F-measure | %correct |
|---|---|---|---|---|---|
| combined | 86.85% | 35.31% | 17.69% | 23.44% | 17.69% |
| combined+ | 188.81% | 15.13% | 28.40% | 19.45% | 28.40% |

Combined+: the Eval corpus differs substantially from the Train & Dev corpora. The static rules do not fit the Eval corpus well and they introduce classification noise.

Page 32: Semantics at the multimedia fragment level or how enabling the remixing of online media

Per-extractor results

| Extractor | SLR | Precision | Recall | F-measure | %correct |
|---|---|---|---|---|---|
| alchemyapi | 37.71% | 47.95% | 5.45% | 9.68% | 5.45% |
| lupedia | 39.49% | 22.87% | 1.56% | 2.91% | 1.56% |
| opencalais | 37.47% | 41.69% | 3.53% | 6.49% | 3.53% |
| wikimeta | 36.67% | 19.40% | 4.25% | 6.95% | 4.25% |
| combined (nerd) | 86.85% | 35.31% | 17.69% | 23.44% | 17.69% |
| combined+ (nerd+) | 188.81% | 15.13% | 28.40% | 19.45% | 28.40% |

Page 33: Semantics at the multimedia fragment level or how enabling the remixing of online media

NERD + Synote: http://linkeddata.synote.org

Page 34: Semantics at the multimedia fragment level or how enabling the remixing of online media

WoLE Workshop

WoLE2012 Workshop in conjunction with the ISWC2012 conference

http://wole2012.eurecom.fr

Page 35: Semantics at the multimedia fragment level or how enabling the remixing of online media


Page 36: Semantics at the multimedia fragment level or how enabling the remixing of online media


Page 37: Semantics at the multimedia fragment level or how enabling the remixing of online media

Building the data.eurecom.fr

Page 38: Semantics at the multimedia fragment level or how enabling the remixing of online media

Zenaminer

Publish SCORM content in the Web of Data, separating the content from the layout

Introduce the use of media element / fragments

Automatic annotation of user comments using NER tools
  hypertext link navigation to key terms and entities
  satisfy better the information needs of the learner

See also: http://zenaminer.sourceforge.net/

Page 39: Semantics at the multimedia fragment level or how enabling the remixing of online media

Example application: Link OpenLearn to relevant course/podcasts

Credit: Mathieu D’Aquin
See also: Zablith et al, LinkedLearning 2011

Page 40: Semantics at the multimedia fragment level or how enabling the remixing of online media

Integrating Open Educational Material in course descriptions

Credit: Mathieu D’Aquin
See also: Zablith et al, COLD 2011

Page 41: Semantics at the multimedia fragment level or how enabling the remixing of online media

Take Home Message

Video is a first class citizen on the Web
  Annotations: Ontology and API for Media Resources
  Access: Media Fragments URI

NERD platform for extracting key information from learning resources including videos

Linked Universities movement for federating initiatives in exposing educational data as linked data

Page 42: Semantics at the multimedia fragment level or how enabling the remixing of online media

Media Mixer

Vision: adoption of semantic multimedia technologies will foster a European market for media fragment re-purposing and re-selling

EU FP7 CSA: November 2012 - November 2014

Page 43: Semantics at the multimedia fragment level or how enabling the remixing of online media

Credits

Giuseppe Rizzo (Zenaminer, NERD)

Anne Elisabeth Gazet (data.eurecom.fr)

Mathieu D’Aquin (LinkedUniversities, Lucero)

Page 44: Semantics at the multimedia fragment level or how enabling the remixing of online media

http://www.slideshare.net/troncy
