semantics at the multimedia fragment level or how enabling the remixing of online media
DESCRIPTION
Presentation given at the 6th tele-TASK symposium at HPI, October 2012, Potsdam, Germany
TRANSCRIPT
Semantics at the multimedia fragment level or how enabling the remixing of online media
Raphaël Troncy <[email protected]>
Once upon a time …
09/10/2012 - Semantics at the multimedia fragment level - tele-Task Symposium, Potsdam, Octobre 2012 - 2
… leading to sharing Media Fragments
Publishing status message containing a Media Fragment URI
Use a ‘#’!
Highlight a video sequence
Highlight a region to pay attention to
What are Media Fragments?
temporal media fragment (timeline t: 0, 20, 35)
spatial media fragment
track media fragment
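The three fragment dimensions above can be illustrated with a minimal parsing sketch. It assumes the `name=value` fragment syntax of the W3C Media Fragments URI specification (`t=`, `xywh=`, `track=`) and only handles times given in plain seconds. The function name and the example URIs are hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_media_fragment(uri):
    """Return the temporal, spatial and track dimensions found after '#'."""
    dims = dict(parse_qsl(urlsplit(uri).fragment))
    result = {}
    if "t" in dims:
        value = dims["t"]
        if value.startswith("npt:"):  # optional "normal play time" prefix
            value = value[len("npt:"):]
        start, _, end = value.partition(",")
        # A missing start means 0; a missing end means "until the end".
        result["t"] = (float(start) if start else 0.0,
                       float(end) if end else None)
    if "xywh" in dims:
        value = dims["xywh"]
        unit = "pixel"  # default unit when no prefix is given
        if ":" in value:
            unit, value = value.split(":", 1)
        x, y, w, h = (int(v) for v in value.split(","))
        result["xywh"] = (unit, x, y, w, h)
    if "track" in dims:
        result["track"] = dims["track"]
    return result


print(parse_media_fragment("http://example.com/video.mp4#t=20,35"))
print(parse_media_fragment("http://example.com/video.mp4#xywh=percent:25,25,50,50"))
```

Time values such as `00:01:30` are left out of this sketch and would need an extra conversion step.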
Media Fragments (temporal)
[Figure: player timeline showing the original resource length, the fragment beginning, the fragment end and the playback progress]
Media Fragments (spatial) + Demo
[Figure: highlighted fragment with a semi-opaque overlay]
Media Fragments URIs
Bookmark / Share parts (fragments) of audio/video content
Annotate media fragments
Search for media fragments
Mash-ups
Conserve bandwidth
http://www.w3.org/TR/media-frags-reqs/
http://www.w3.org/TR/media-frags/
Video annotation
Video interactivity
CONCEPT IN PLAYER
Cubism
Expressionism
Fauvism
FACETS / PROPERTIES OF CONCEPT
CONTENT ENRICHMENT
LinkedTV EU Project
Vision
Ubiquitously online cloud of Networked Audio-Visual Content
Decoupled from place, device or source
Aim
Provide interactive multimedia service for non-professional end-users
Focus: television broadcast content as seed videos
12 Excellent Partners
Fraunhofer, STI GMBH, CERTH, Eurecom, Condat, BEELD EN GELUID, UEP, UMONS, CWI, Noterik, U. ST GALLEN, RBB
Web: http://www.linkedtv.eu
Video Accessibility
What is required to make video accessible on the Web?
Technologies:
  Annotating: automatic (speech transcription) and manual (social collaborative annotation tool)
  Addressing: pointing to, retrieving, transmitting only parts of media
  Rendering: video visualization for the impaired, Braille output
Benchmarking: Sphinx, HTK, Julius
Speech Processing
Demo: http://semantics.eurecom.fr/acav/
Semantic indexing at the fragment level
Benchmarking: Sphinx, HTK, Julius
NER + full text index with the transcription
Interlinking with the Linked Data Cloud to enable semantic search
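A full text index at the fragment level could look like the sketch below. It assumes the speech transcription is available as timed segments (start, end, text) and maps every word to the temporal Media Fragment URIs in which it is spoken. The function name, the video URI and the transcript are hypothetical, and NER and interlinking are not covered here.

```python
import re
from collections import defaultdict


def build_fragment_index(video_uri, segments):
    """Map each lowercased word to the temporal fragment URIs containing it."""
    index = defaultdict(list)
    for start, end, text in segments:
        fragment_uri = f"{video_uri}#t={start},{end}"
        # Each word is indexed once per segment, whatever its frequency.
        for word in set(re.findall(r"\w+", text.lower())):
            index[word].append(fragment_uri)
    return dict(index)


# Hypothetical timed transcript: (start seconds, end seconds, text)
segments = [
    (0, 20, "Welcome to Potsdam"),
    (20, 35, "Potsdam hosts the symposium"),
]
index = build_fragment_index("http://example.org/talk.mp4", segments)
print(index["potsdam"])
```

A search for a word then returns directly playable fragments rather than whole videos.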
NERD: Named Entity Recognition and Disambiguation
Compare performances of Named Entity Recognition tools
Understand strengths and weaknesses of different Web APIs
Adapt NER processing to different context
(Learn how to) Combine NER tools
Participate in the ANR ETAPE benchmark
What is a Named Entity recognition task?
A task that aims to locate and classify the name of a person or an organization, a location, a brand, a product, a numeric expression including time, date, money and percent in a textual document
NER Tools and Web APIs
Standalone software: GATE, Stanford CoreNLP, Temis
Web APIs
http://nerd.eurecom.fr/
What is NERD?
ontology (1)
REST API (2)
UI (3)
The NERD ontology has been integrated in the NIF project, part of the EU FP7 project LOD2: Creating Knowledge out of Interlinked Data
(1) http://nerd.eurecom.fr/ontology
(2) http://nerd.eurecom.fr/api/application.wadl
(3) http://nerd.eurecom.fr
Factual comparison of 10 Web NER tools
Tool | Language | Granularity | Entity position | Classification schema | Number of classes | Response format | Quota (calls/day)
AlchemyAPI | EN, FR, GR, IT, PT, RU, SP, SW | OEN | N/A | Alchemy | 324 | JSON, MicroFormat, XML, RDF | 30000
DBpedia Spotlight | EN, GR*, PT*, SP* | OEN | char offset | DBpedia, FreeBase, Schema.org | 320 | HTML, JSON, RDF, XML | unl
Evri | EN, IT | OED | N/A | Evri | 5 | HTML, JSON, RDF | 3000
Extractiv | EN | OEN | word offset | DBpedia | 34 | HTML, JSON, RDF, XML | 3000
Lupedia | EN, FR, IT | OEN | range of chars | DBpedia, LinkedMDB | 319 | HTML, JSON, RDFa, XML | unl
OpenCalais | EN, FR, SP | OEN | char offset | OpenCalais | 95 | JSON, MicroFormat | 50000
Saplo | EN, SW | OED | N/A | N/A | 5 | JSON | 1333
Wikimeta | EN, FR, SP | OEN | POS offset | ESTER | 7 | JSON, XML | unl
Yahoo! | EN | OEN | range of chars | Yahoo | 13 | JSON, XML | 5000
Zemanta | EN | OED | N/A | FreeBase | 81 | XML, JSON, RDF | 10000
NERD Ontology
Aligned the taxonomies used by the extractors
Building the NERD Ontology
NERD type | Occurrence
Person | 10
Organization | 10
Country | 6
Company | 6
Location | 6
Continent | 5
City | 5
RadioStation | 5
Album | 5
Product | 5
... | ...
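The alignment idea can be sketched as a lookup from each extractor's native types to NERD types. The sketch reads "Occurrence" as the number of extractor taxonomies in which a type appears, which is an interpretation of the table rather than something the slides state. The extractor names, the native type labels and the `Thing` fallback are all hypothetical.

```python
from collections import Counter

# Hypothetical excerpt: native type of each extractor -> NERD type
ALIGNMENT = {
    "extractor_a": {"Person": "Person", "Company": "Company", "City": "City"},
    "extractor_b": {"PER": "Person", "ORG": "Organization", "LOC": "Location"},
    "extractor_c": {"people": "Person", "organisation": "Organization"},
}


def to_nerd_type(extractor, native_type):
    """Translate a native type; fall back to a generic type when unmapped."""
    return ALIGNMENT.get(extractor, {}).get(native_type, "Thing")


def type_occurrences(alignment):
    """Count in how many extractor taxonomies each NERD type appears."""
    counts = Counter()
    for mapping in alignment.values():
        counts.update(set(mapping.values()))
    return counts


print(to_nerd_type("extractor_b", "ORG"))
print(type_occurrences(ALIGNMENT).most_common(2))
```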
NERD REST API
GET, POST, PUT, DELETE
/document
/user
/annotation/{extractor}
/extraction
/evaluation
RDF, JSON
...
"entities" : [{
    "entity": "Tim Berners-Lee",
    "type": "Person",
    "uri": "http://dbpedia.org/resource/Tim_berners_lee",
    "nerdType": "http://nerd.eurecom.fr/ontology#Person",
    "startChar": 30,
    "endChar": 45,
    "confidence": 1,
    "relevance": 0.5
}]
Rizzo G., Troncy R. (2012), NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Web Extraction Tools. In: European chapter of the Association for Computational Linguistics (EACL'12), Avignon, France.
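A client consuming such a JSON answer might group the entities by NERD type, as in the sketch below. Several things are assumed rather than taken from the slides: that the `entities` array sits in a top-level JSON object, and that `startChar`/`endChar` are 0-based offsets with an exclusive end. The sample sentence and the function name are hypothetical, and no call to the actual NERD API is made.

```python
import json

# Assumed response shape, wrapping the "entities" array shown on the slide.
RESPONSE = """
{"entities": [{
    "entity": "Tim Berners-Lee",
    "type": "Person",
    "uri": "http://dbpedia.org/resource/Tim_berners_lee",
    "nerdType": "http://nerd.eurecom.fr/ontology#Person",
    "startChar": 30,
    "endChar": 45,
    "confidence": 1,
    "relevance": 0.5
}]}
"""

# Hypothetical input text in which the entity spans characters 30 to 45.
TEXT = "The Web was first proposed by Tim Berners-Lee."


def entities_by_nerd_type(response_json, text):
    """Group (surface form, URI) pairs under the local name of nerdType."""
    grouped = {}
    for item in json.loads(response_json)["entities"]:
        label = item["nerdType"].rsplit("#", 1)[-1]
        surface = text[item["startChar"]:item["endChar"]]
        grouped.setdefault(label, []).append((surface, item["uri"]))
    return grouped


print(entities_by_nerd_type(RESPONSE, TEXT))
```

Slicing the text with the returned offsets is a cheap check that the offsets and the `entity` string agree.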
NERD meets NIF
Model documents through a set of strings dereferenceable on the Web
    :offset_23107_23110 a str:String ;
        str:referenceContext :offset_0_26546 .
Map string to entity
    :offset_23107_23110 sso:oen dbpedia:W3C .
Classification
    dbpedia:W3C rdf:type nerd:Organization .
Rizzo G., Troncy R., Hellmann S. and Bruemmer M. (2012), NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud. In: (LDOW'12) Linked Data on the Web (WWW'12), Lyon, France.
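The three steps (string, entity mapping, classification) can be produced mechanically from one annotation. The sketch below only formats Turtle-like lines as plain strings, following the `offset_start_end` naming used on the slide. Prefix declarations are omitted and the function name is hypothetical.

```python
def nif_triples(doc_length, start, end, entity, nerd_type):
    """Format the string, entity mapping and classification statements."""
    context = f":offset_0_{doc_length}"    # the whole document
    string = f":offset_{start}_{end}"      # the annotated substring
    return [
        f"{string} a str:String ;",
        f"    str:referenceContext {context} .",
        f"{string} sso:oen {entity} .",
        f"{entity} rdf:type {nerd_type} .",
    ]


for line in nif_triples(26546, 23107, 23110, "dbpedia:W3C", "nerd:Organization"):
    print(line)
```

A real serialization would rather go through an RDF library, which also takes care of the prefixes.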
NERD User Interface
NERD Dashboard
History of NER benchmarks
CoNLL 2003 and CoNLL 2005
  schema (4 types): person, organization, location and miscellaneous
  language independent task
ACE 2004, ACE 2005 and ACE 2007
  schema (7 types): person, organization, location, facility, weapon, vehicle and geo-political entity
  entity recognition, not just name (e.g. description, pronoun)
  find relationships among entities extracted
TAC 2009 (Knowledge Base Track)
  schema (3 types): person, organization and location
  create a knowledge base from the named entities extracted
ETAPE 2012 (Named Entity Task)
  schema: Quaero (7 main types, 32 sub-types)
ETAPE 2012 challenge
genre | train | dev | test | sources
TV news | 7h 40m | 1h 40m | 1h 40m | BFM Story, Top Questions (LCP)
TV debates | 10h 30m | 5h 10m | 5h 10m | Pile et Face, Ca vous regarde, Entre les lignes (LCP)
TV amusements | - | 1h 05m | 1h 05m | La place du village (TV8)

Item | Train | Dev | Eval
length | 26h | 10h 55m | 10h 55m
Nb files | 44 | 15 | 15
Nb words | 290517 | 91656 | 115511
Nb Named Entities | 46763 | 14398 | 13055
Nb unique categories | 33 | 33 | 33
Participation at ETAPE (combined strategy)
extraction
(eA1,tA1,URIA1,siA1,eiA1), (eA2,tA2,URIA2,siA2,eiA2), (eA3,tA3,URIA3,siA3,eiA3), ...
cleaning
(e,t,URI,si,ei)
fusion
(eN1,tN1,URIN1,siN1,eiN1), (eN2,tN2,URIN2,siN2,eiN2)
When at least 2 extractors classify the same entity with a different type, we apply a preferred selection order (empirically defined): Wikimeta, AlchemyAPI, OpenCalais, Lupedia
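The fusion step with its preferred selection order can be sketched as follows. The sketch assumes that the tuple fields stand for entity, type, URI, start index and end index, and that two annotations denote the same entity when their character spans are identical. The function name and the sample annotations are hypothetical, and the extraction and cleaning steps are not reproduced.

```python
# Preferred selection order given on the slide, most trusted first.
PRIORITY = ["Wikimeta", "AlchemyAPI", "OpenCalais", "Lupedia"]


def fuse(annotations):
    """Keep one (entity, type, uri, start, end) tuple per character span.

    annotations: iterable of (extractor, entity, type, uri, start, end).
    When several extractors annotate the same span, the one ranked first
    in PRIORITY wins. An extractor missing from PRIORITY raises ValueError.
    """
    best = {}
    for extractor, entity, etype, uri, start, end in annotations:
        key = (start, end)
        rank = PRIORITY.index(extractor)
        if key not in best or rank < best[key][0]:
            best[key] = (rank, (entity, etype, uri, start, end))
    return [best[key][1] for key in sorted(best)]


# Hypothetical annotations: two extractors disagree on the type of "Paris".
annotations = [
    ("OpenCalais", "Paris", "City", "http://dbpedia.org/resource/Paris", 10, 15),
    ("Wikimeta", "Paris", "Location", "http://dbpedia.org/resource/Paris", 10, 15),
    ("Lupedia", "France", "Country", "http://dbpedia.org/resource/France", 20, 26),
]
print(fuse(annotations))
```

Here the Wikimeta type is kept for "Paris" because Wikimeta comes first in the selection order.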
Participation at ETAPE (combined+ strategy)
ETAPE Train & Dev
Learned model, POS tagger
Created static rules
(eA1,tA1,URIA1,siA1,eiA1), (eA2,tA2,URIA2,siA2,eiA2), ...
Apply rules
(e1,t1,URI1,si1,ei1)
fusion
(eN1,tN1,URIN1,siN1,eiN1), (eN2,tN2,URIN2,siN2,eiN2)
Conflicts handled by priority selection: own, Wikimeta, AlchemyAPI, OpenCalais, Lupedia
NERD Global results
 | SLR | Precision | Recall | F-measure | %correct
combined | 86.85% | 35.31% | 17.69% | 23.44% | 17.69%
combined+ | 188.81% | 15.13% | 28.40% | 19.45% | 28.40%
Combined+: the Eval corpus differs substantially from the Train & Dev corpora. The static rules do not fit the Eval corpus well and they introduce classification noise.
Per-extractor results
 | SLR | Precision | Recall | F-measure | %correct
alchemyapi | 37.71% | 47.95% | 5.45% | 9.68% | 5.45%
lupedia | 39.49% | 22.87% | 1.56% | 2.91% | 1.56%
opencalais | 37.47% | 41.69% | 3.53% | 6.49% | 3.53%
wikimeta | 36.67% | 19.40% | 4.25% | 6.95% | 4.25%
combined (nerd) | 86.85% | 35.31% | 17.69% | 23.44% | 17.69%
combined+ (nerd+) | 188.81% | 15.13% | 28.40% | 19.45% | 28.40%
NERD + Synote: http://linkeddata.synote.org
WoLE Workshop
WoLE2012 Workshop in conjunction with the ISWC2012 conference
http://wole2012.eurecom.fr
Building the data.eurecom.fr
Zenaminer
Publish SCORM content in the Web of Data, separating the content from the layout
Introduce the use of media element / fragments
Automatic annotation of user comments using NER tools
  hypertext link navigation to key terms and entities
  satisfy better the information needs of the learner
See also: http://zenaminer.sourceforge.net/
Example application: Link OpenLearn to relevant course/podcasts
Credit: Mathieu D’Aquin
See also: Zablith et al, LinkedLearning 2011
Integrating Open Educational Material in course descriptions
Credit: Mathieu D’Aquin
See also: Zablith et al, COLD 2011
Take Home Message
Video is a first class citizen on the Web
  Annotations: Ontology and API for Media Resources
  Access: Media Fragments URI
NERD platform for extracting key information from learning resources including videos
Linked Universities movement for federating initiatives in exposing educational data as linked data
Media Mixer
Vision: adoption of semantic multimedia technologies will foster a European market for media fragment re-purposing and re-selling
EU FP7 CSA: November 2012 - November 2014
Credits
Giuseppe Rizzo (Zenaminer, NERD)
Anne Elisabeth Gazet (data.eurecom.fr)
Mathieu D’Aquin (LinkedUniversities, Lucero)
http://www.slideshare.net/troncy