
Deep-linking into Media Assets at the Fragment Level: Specification, Model and Applications

Raphaël Troncy


17/12/2013 - 7ème Entretiens du Nouveau Monde Industriel (ENMI 2013) - 2

TimBL Vision back in 1994

http://web5.w3.org/Talks/WWW94Tim/

http://www.eurecom.fr/

A typical HTML web page



What it looks like to a machine



Okay, so HTML is not helpful

Maybe we can tell the machine what the different parts of the text represent?

title

time

speaker

location

abstract

biosketch

host



XML to the rescue?

XML fans propose creating an XML tag set to use for each application.

For talks, we can choose <title>, <speaker>, etc.

<title> </title>

<time> </time>

<speaker> </speaker>

<location> </location>

<abstract> </abstract>

<biosketch> </biosketch>

<host> </host>



XML ≠ machine-accessible meaning

But, to your machine, the tags still look like this…

The tag names carry no meaning.

XML DTDs and Schemas have little or no semantics.

[Figure: the same markup, with every tag name shown as unreadable symbols around the title, time, speaker, location, abstract, biosketch and host parts]



Do not read the following sign

You lose

We interpret, machines don't

Why is it so difficult to find appropriate multimedia content, to reuse and repurpose content previously published and to present this content in interfaces that vary with user needs?


Image/Video indexing

Techniques used by mainstream search engines: the search term occurs in the filename, in the caption or in user tags; no semantics are involved.

Image indexing, main problem: an image is not alphabetic. There are no countable discrete units that, in combination, provide the meaning of the image.

Image descriptors are not given with the image: one needs to extract or interpret them.

Video indexing, additional problem: a video also has a temporal dimension to take into account.

A video has a priori no discrete units either (i.e. frames, shots and sequences cannot be absolutely defined).



Sounds Familiar?

[Arnold Smeulders, PAMI, 2000]

The semantic gap is the lack of coincidence between the information that one can extract from the sensory data and the interpretation that the same data has for a user in a given situation.

http://ieeexplore.ieee.org/iel5/34/19391/00895972.pdf



The science of labeling

Automatically detecting the presence of a concept in a video stream

Naming visual information

airplane



A Simple Concept Detector

[Cees Snoek and Marcel Worring, SSMS, 2007]

http://www.dcs.gla.ac.uk/ssms07/material.html



Support Vector Machine

[Cees Snoek and Marcel Worring, SSMS, 2007]

http://www.dcs.gla.ac.uk/ssms07/material.html



The Computer Vision Approach

Building detectors one at a time

a face detector for frontal faces

a face detector for non-frontal faces

3 years later

One (or more) PhD for every new concept


A little drop of semantics goes a long way

Jim Hendler [1997]

http://www.cs.rpi.edu/~hendler/LittleSemanticsWeb.html



http://lin-clark.com/blog/drupal-8-now-has-schemaorg-rdf-mappings-dont-pop-champagne-yet

Once upon a time …



http://www.ukoln.ac.uk/jisc-ie/blog/2010/09/15/consuming-and-producing-linked-data-in-a-content-management-system/

… leading to sharing Media Fragments

Publishing a status message containing a Media Fragment URI

Use a ‘#’!

Highlight a video sequence

Highlight a region to pay attention to




http://linkeddata.synote.org/synote/recording/replay/51151#t=22,32

W3C Video on the Web Workshop - 2007



http://www.w3.org/2007/08/video/report.html

Key topics

Addressing: having global identifiers for identifying spatial and temporal clips (for deep linking, bookmarking, caching and indexing)

Metadata: searching and discovering video is difficult with the volume of online video

Video codec: recommending a baseline (open) video codec for the World Wide Web

Content protection: managing digital rights associated with the media is key: W3C should look into metadata for digital rights




Making video a "first class citizen"



Flickr Notes


http://www.flickr.com/photos/mhausenblas/2883727293/




YouTube Temporal Addressing (Sept 2008)



http://www.youtube.com/watch?v=1bibCui3lFM#t=1m45s
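The YouTube link above carries its start offset in the URL fragment as t=1m45s. As a minimal sketch, assuming the offset is written as optional hours, minutes and seconds (or as a bare number of seconds), it can be converted to plain seconds as follows. The function name youtube_offset_seconds is a hypothetical helper, not part of any YouTube API.

```python
import re
from urllib.parse import urlsplit

# Assumed offset syntax: optional "<n>h", "<n>m", "<n>s" parts, or bare seconds.
_OFFSET = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s?)?$")


def youtube_offset_seconds(url: str) -> int:
    """Return the start offset, in seconds, found in a '#t=...' fragment."""
    fragment = urlsplit(url).fragment
    if not fragment.startswith("t="):
        raise ValueError("no temporal offset in the URL fragment")
    match = _OFFSET.match(fragment[2:])
    if not match or not any(match.groups()):
        raise ValueError("unrecognised offset: " + fragment)
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


offset = youtube_offset_seconds(
    "http://www.youtube.com/watch?v=1bibCui3lFM#t=1m45s"
)
print(offset)
```

The resulting number of seconds is the form that a temporal fragment written as t=105 would use.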

Media Fragments Use Cases

Bookmark / Share parts (fragments) of audio/video content

Annotate media fragments

Search for media fragments

Develop Mash-ups/Collage

Conserve bandwidth


http://www.w3.org/TR/media-frags-reqs/





What are Media Fragments?

temporal media fragment (timeline: t = 0, 20, 35)

spatial media fragment

track media fragment

named media fragment: “Scared Scene”



Media Fragments Dimensions

r01: Temporal fragments: a clipping along the time dimension, from a start time to an end time that are both within the duration of the media resource

r02: Spatial fragments: a clipping of an image region; only rectangular regions are considered

r03: Track fragments: a track as exposed by the container format of the media resource

r04: Named fragments: a temporal media fragment that has been given a name through some sort of annotation mechanism
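The four dimensions map onto name=value pairs in the URI fragment: t for time, xywh for a rectangular region, track for a track and id for a named fragment. The following is a minimal sketch of reading them, not a complete implementation of the Media Fragments URI syntax: it assumes NPT time values only, keeps a single track, and the MediaFragment class and parse_media_fragment function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import parse_qsl, urlsplit


@dataclass
class MediaFragment:
    """Hypothetical container for the four dimensions (r01 to r04)."""
    start: Optional[float] = None  # temporal start, in seconds
    end: Optional[float] = None    # temporal end, in seconds (None = until the end)
    xywh: Optional[Tuple[str, int, int, int, int]] = None  # (unit, x, y, w, h)
    track: Optional[str] = None
    name: Optional[str] = None     # named fragment ("id" dimension)


def _npt_seconds(value: str) -> float:
    """Convert an NPT value such as "22", "1:30" or "0:02:05.5" to seconds."""
    seconds = 0.0
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_media_fragment(uri: str) -> MediaFragment:
    result = MediaFragment()
    for key, value in parse_qsl(urlsplit(uri).fragment):
        if key == "t":
            if value.startswith("npt:"):
                value = value[4:]
            start, _, end = value.partition(",")
            result.start = _npt_seconds(start) if start else 0.0
            result.end = _npt_seconds(end) if end else None
        elif key == "xywh":
            # An optional "pixel:" or "percent:" prefix precedes the numbers.
            unit, _, coords = value.rpartition(":")
            x, y, w, h = (int(n) for n in coords.split(","))
            result.xywh = (unit or "pixel", x, y, w, h)
        elif key == "track":
            result.track = value
        elif key == "id":
            result.name = value
    return result


print(parse_media_fragment(
    "http://linkeddata.synote.org/synote/recording/replay/51151#t=22,32"
))
```

A player could use the start and end values to seek and stop playback, and the xywh tuple to draw a highlight over the selected region.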




Media Fragments (temporal)


Fragment beginning

Fragment end

Playback progress

Original resource length


Media Fragments (spatial)


semi-opaque overlay

highlighted fragment

http://ninsuna.elis.ugent.be/MFPlayer/html5




Media Fragment (Semantic) Annotation

Media Fragment creation: localize a region (person)

Media Fragment annotation (tagging) = interpretation: Winston Churchill, UK Prime Minister, Allied Forces, WWII

Media Fragment semantic annotation:

:Reg1 foaf:depicts dbpedia:Winston_Churchill .

dbpedia:Winston_Churchill rdfs:label "Winston Churchill" ; rdf:type foaf:Person ; dbprop:order dbpedia:Prime_Minister_(UK) .


The "Big Three" at the Yalta Conference (Wikipedia)

Reg1

http://en.wikipedia.org/wiki/Allies_of_World_War_II
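The semantic annotation on this slide ties a spatial fragment to a DBpedia resource. A minimal sketch of producing such statements as N-Triples lines with plain Python follows. The image URL and the pixel coordinates of the region are hypothetical, and region_triples is a helper name introduced here for illustration only.

```python
from typing import List

FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DBPEDIA = "http://dbpedia.org/resource/"

# Hypothetical image URL and region coordinates standing in for "Reg1".
IMAGE = "http://example.org/yalta.jpg"
REG1 = IMAGE + "#xywh=40,60,120,160"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def region_triples(region: str, entity: str, label: str) -> List[str]:
    """State that a media fragment depicts a person with a given label."""
    return [
        " ".join([iri(region), iri(FOAF + "depicts"), iri(entity), "."]),
        " ".join([iri(entity), iri(RDF + "type"), iri(FOAF + "Person"), "."]),
        " ".join([iri(entity), iri(RDFS + "label"), literal(label), "."]),
    ]


for line in region_triples(REG1, DBPEDIA + "Winston_Churchill", "Winston Churchill"):
    print(line)
```

Because the subject of the first statement is itself a Media Fragment URI, the description applies to the region only and not to the whole image.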



Media Fragment (Semantic) Annotation

Media Fragment creation: localize a temporal sequence

Media Fragment annotation (tagging) = interpretation: G8 Summit, EU Summit, Heiligendamm, 2007, Gothenburg, 2001

Media Fragment semantic annotation:

:Seq1 foaf:depicts dbpedia:33rd_G8_Summit .

:Seq4 foaf:depicts dbpedia:EU_Summit .

dbpedia:33rd_G8_Summit rdfs:label "33rd G8 summit"@en ; grs:point "54.143055555555556 11.841666666666667" .


A history of G8 violence (video) (© Reuters)

Seq1

Seq4

http://www.reuters.com/news/video/summitVideo?videoId=56114



Media Fragment Semantic Annotation

Things, not strings! http://googleblog.blogspot.fr/2012/05/introducing-knowledge-graph-things-not.html

Use knowledge bases (LOD)

Use common vocabularies (LOV)

Follow the 4 Linked Data principles

Refine the 4 Linked Media principles

http://lod-cloud.net/

http://lov.okfn.org/

http://de.slideshare.net/linkedtv/www-linked-media-keynote



Open Annotation Data Model


Specification developed in the W3C Open Annotation Community Group: http://www.openannotation.org/spec/core/

Core model: an OWL vocabulary for representing and sharing annotations of digital resources (and their fragments) … in RDF

A body is related to a target

The nature of the annotation changes according to intention (motivation)
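The body, target and motivation pattern can be sketched as a small Python dictionary in expanded JSON-LD form, using the oa:hasBody, oa:hasTarget and oa:motivatedBy properties of the Open Annotation vocabulary. This is an illustrative sketch under assumptions: the target reuses a Media Fragment URI directly (the model also offers selector-based targets, not shown here), the DBpedia body is chosen only as an example, and make_annotation is a hypothetical helper.

```python
import json

OA = "http://www.w3.org/ns/oa#"


def make_annotation(target: str, body: str, motivation: str = "tagging") -> dict:
    """Relate a body to a target, with the intention given by the motivation."""
    return {
        "@type": OA + "Annotation",
        OA + "hasBody": {"@id": body},
        OA + "hasTarget": {"@id": target},
        OA + "motivatedBy": {"@id": OA + motivation},
    }


annotation = make_annotation(
    target="http://data.linkedtv.eu/media/e2899e7f#t=840,900",
    body="http://dbpedia.org/resource/Casablanca",  # illustrative body
)
print(json.dumps(annotation, indent=2))
```

Changing the motivation argument changes the nature of the annotation without touching the body or the target.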

How to annotate this image?



Semantic Annotation of an Image


http://www.w3.org/community/openannotation/wiki/SE_Semantically_Tagging_an_Image





Maphub: http://maphub.github.io/




Open Video Annotation Project


http://openvideoannotation.org/



LinkedTV: automatic annotations ...



... and enrichment for hypervideos

Cubism

Expressionism

Fauvism

FACETS / PROPERTIES OF CONCEPT

CONCEPT IN PLAYER

CONTENT ENRICHMENT




Media Fragments and Annotations

nerd:Location Cafe Rick

nerd:Person H. Bogart

nerd:Person I. Bergman

nerd:Location Casablanca

Media Fragment URI 1.0: chapters, scenes, shots, etc.

http://data.linkedtv.eu/media/e2899e7f#t=840,900


Enrichment and Hypervideos

nerd:Location Cafe Rick

nerd:Person H. Bogart

nerd:Person I. Bergman

nerd:Location Casablanca

nerd:Person E. Tierney

nerd:Location China



NERD: Named Entity Recognition and Disambiguation

Compare performances of NER and NEL tools

Understand strengths and weaknesses of different Web APIs

Adapt NER processing to different contexts

(Learn how to) combine NER (/ NEL) tools


What is NERD?

Ontology: http://nerd.eurecom.fr/ontology

REST API: http://nerd.eurecom.fr/api/application.wadl

UI: http://nerd.eurecom.fr


NERD User Interface



Media Fragment + Open Annotation + NERD

Locator

MediaResource

MediaFragment

Annotation

Entity

URL (hyperlink)

Type


Media Fragment Enricher: http://mfe.synote.org/mfe/




http://linkedtv.eurecom.fr/nerdviewer/

Linking pieces of knowledge




http://linkedtv.project.cwi.nl/news/



Take Away Summary

Video is a first class citizen on the Web

Annotations: Ontology and API for Media Resources, Open Annotation Data Model

Access: Media Fragments URI

NERD platform for extracting key information from textual resources including video subtitles and microposts

Embrace the Linked Media vision

Publish, re-use, re-purpose and remix media descriptions

Develop links between (parts of) media items via their descriptions

http://www.w3.org/TR/mediaont-10/

http://www.w3.org/TR/mediaont-api-1.0/

http://www.openannotation.org/spec/core/

http://www.w3.org/TR/media-frags/

http://nerd.eurecom.fr/





Credits

Giuseppe Rizzo, Vuk Milicic, José Luis Redondo Garcia (EURECOM)

Thomas Steiner (Google Inc.), Yunjia Li (University of Southampton)

Marieke van Erp (Free University of Amsterdam)

Erik Mannens, Davy Van Deursen (iMinds, Uni. Ghent)

Paolo Ciccarese, Robert Sanderson, Herbert Van de Sompel and all the members of the W3C Open Annotation Community Group

… and many other students


