MediaEval 2015 - JRS at Synchronization of Multi-user Event Media Task

Post on 15-Jan-2017


JRS at Synchronization of Multi-user Event Media Task

Hannes Fassold, Harald Stiegler, Felix Lee, Werner Bailer

MediaEval Workshop, September 14-15, 2015, Wurzen

Gallery synchronization – Approach: Overview

Based on visual information (images, key frames of videos) and the given time stamps

Probabilistic approach

Uses the visual similarity of image pairs (SIFT descriptors)

Many potential solutions (‘hypotheses’) are calculated in a probabilistic way (i.e. with an inherent random component)

The ‘best’ hypothesis is calculated from these hypotheses


Gallery synchronization – Approach: Determine image matches

Determine good image matches (I, J)

Images with a high degree of visual similarity

Determine the best match J for image I via exhaustive matching of their sets of SIFT descriptors

Descriptor calculation and matching are done on the GPU

SIFT is very robust to changes in orientation and scale

Apply a geometric verification step to the match (I, J)

Gives a variable number of homographies and, for each, the number of points h_t supporting that homography
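The exhaustive descriptor matching can be illustrated with a minimal CPU sketch. It assumes descriptors are plain numeric tuples and uses a nearest-neighbour ratio test to accept a match; the ratio test, the function names `count_matches` and `best_match`, and the `ratio` value are assumptions for illustration and are not taken from the slides. The GPU implementation and the geometric verification step are not shown.

```python
import math

def count_matches(desc_a, desc_b, ratio=0.8):
    """Count descriptors of A whose nearest neighbour in B passes a ratio test.

    Every descriptor of A is compared with every descriptor of B
    (exhaustive matching). The ratio test is an assumed acceptance rule.
    """
    if len(desc_b) < 2:
        return 0
    count = 0
    for d in desc_a:
        dists = sorted(math.dist(d, e) for e in desc_b)
        # Accept only if the best match is clearly better than the runner-up.
        if dists[0] < ratio * dists[1]:
            count += 1
    return count

def best_match(query_desc, candidates):
    """Return the id of the candidate image J with the most accepted matches.

    candidates: dict mapping an image id to its list of descriptors.
    """
    return max(candidates, key=lambda j: count_matches(query_desc, candidates[j]))
```

A call such as `best_match(descriptors_of_I, {"J1": ..., "J2": ...})` returns the id of the candidate that shares the most accepted descriptor matches with image I.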


Gallery synchronization – Approach: Determine visual similarity score s(i,j)

Calculation of the similarity score s(i,j) for a match (I, J)

Discard all homographies with too few supporting points (h_t < threshold)

Pick the k homographies with the highest number of supporting points h_t, and clip the values h_t to a given range

The average and the sum of the k remaining (and clipped) values h_t,clip are calculated

s(i,j) is the geometric mean of this average and this sum
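The score calculation above can be sketched as follows. The threshold, the value of k and the clipping range are not given on the slide, so the defaults `min_support`, `top_k` and `clip_range` below are placeholder assumptions, as is returning 0 when no homography survives the threshold.

```python
import math

def similarity_score(support_counts, min_support=10, top_k=3, clip_range=(0, 100)):
    """Similarity score of one image match from its homography support counts.

    support_counts: number of supporting points h_t per homography.
    All default parameter values are illustrative assumptions.
    """
    # Discard homographies with too few supporting points, keep the top k.
    kept = sorted((h for h in support_counts if h >= min_support), reverse=True)[:top_k]
    if not kept:
        return 0.0  # assumption: no usable homography means zero similarity
    lo, hi = clip_range
    clipped = [min(max(h, lo), hi) for h in kept]
    total = sum(clipped)
    average = total / len(clipped)
    # Geometric mean of the average and the sum of the clipped values.
    return math.sqrt(average * total)
```

Because the average is the sum divided by the number of kept homographies, the score equals the sum divided by the square root of that number, so a match supported by several strong homographies scores higher than one with a single strong homography.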


Gallery synchronization – Approach: Calculate ‘connection’ magnitude c(k,l)

c(k,l) is calculated for each gallery pair (k, l)

c(k,l) indicates the ‘stability’ of the gallery pair (k, l)

More ‘stable’ gallery pairs have a higher number of matches and a low deviation of the time difference values of the matches

c(k,l) is calculated from: the number of matches, the average visual similarity score of the matches, and the average deviation of the time differences of the matches
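The slide names the three inputs of c(k,l) but not how they are combined. The sketch below uses one possible combination purely for illustration: the number of matches times the average similarity, divided by one plus the mean absolute deviation of the time differences. This formula and the name `connection_magnitude` are assumptions, not the authors' definition.

```python
def connection_magnitude(matches):
    """Illustrative connection magnitude for one gallery pair.

    matches: list of (similarity_score, time_difference) tuples, one per
    image match between the two galleries.
    The combination formula is an assumption; the slides only list the inputs.
    """
    n = len(matches)
    if n == 0:
        return 0.0
    mean_similarity = sum(s for s, _ in matches) / n
    mean_diff = sum(d for _, d in matches) / n
    # Average (absolute) deviation of the time differences from their mean.
    mean_deviation = sum(abs(d - mean_diff) for _, d in matches) / n
    # More matches and higher similarity raise c; scattered time differences lower it.
    return n * mean_similarity / (1.0 + mean_deviation)
```

With this choice, a gallery pair with many consistent matches receives a larger value than a pair with few matches or widely scattered time differences, which agrees with the notion of ‘stability’ described above.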


Gallery synchronization – Approach: Calculate one hypothesis

Pick a gallery pair (k,l) randomly

The probability of picking the pair (k,l) is proportional to its connection magnitude c(k,l)

Calculate the time difference for (k,l) and propagate it

Apply k-means clustering (k = 3 to 5) to the time difference values of all matches, and pick one cluster center randomly as the time difference for (k,l)

Propagate it to calculate the other time differences for (k’,l)

Iterate this process until we have the time differences for all galleries
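A minimal sketch of building one hypothesis, under simplifying assumptions: each gallery pair already has a single time difference (in the approach above this would be a randomly picked k-means cluster center), and a hypothesis is represented as a dictionary of gallery offsets relative to the first gallery picked. The names `build_hypothesis`, `connection` and `time_diff` are hypothetical.

```python
import random

def build_hypothesis(connection, time_diff, rng):
    """Build one hypothesis: a time offset for every reachable gallery.

    connection: dict {(k, l): c(k,l)} with positive connection magnitudes.
    time_diff:  dict {(k, l): time of gallery l minus time of gallery k}
                (assumed given here, one value per pair).
    rng:        a random.Random instance supplying the random component.
    """
    pairs = list(connection)
    # Pick the first pair with probability proportional to c(k,l).
    k, l = rng.choices(pairs, weights=[connection[p] for p in pairs])[0]
    offsets = {k: 0.0, l: time_diff[(k, l)]}
    while True:
        # Pairs that connect an already synchronized gallery to a new one.
        frontier = [p for p in pairs if (p[0] in offsets) != (p[1] in offsets)]
        if not frontier:
            break
        a, b = rng.choices(frontier, weights=[connection[p] for p in frontier])[0]
        # Propagate the known offset across the chosen pair.
        if a in offsets:
            offsets[b] = offsets[a] + time_diff[(a, b)]
        else:
            offsets[a] = offsets[b] - time_diff[(a, b)]
    return offsets
```

Calling `build_hypothesis(connection, time_diff, random.Random(seed))` with different seeds yields different hypotheses whenever the gallery graph offers more than one propagation path.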


Gallery synchronization – Approach: Calculate ‘best’ hypothesis

Calculate many hypotheses

Several hundred or several thousand

The ‘best’ hypothesis is calculated as the medoid of all hypotheses

Can be seen as the innermost point when interpreting the hypotheses as n-dimensional points
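The medoid selection can be sketched as follows, assuming each hypothesis is a sequence of n gallery time offsets in a fixed gallery order and that Euclidean distance is used; the slides do not state the distance measure, so that choice is an assumption.

```python
import math

def medoid(hypotheses):
    """Return the hypothesis with the smallest total distance to all others.

    hypotheses: list of equal-length numeric sequences (n-dimensional points).
    Euclidean distance is an assumed choice of metric.
    """
    def total_distance(p):
        return sum(math.dist(p, q) for q in hypotheses)
    return min(hypotheses, key=total_distance)
```

Unlike a mean, the medoid is always one of the calculated hypotheses, and a few outlying hypotheses shift it far less than they would shift an average.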


Sub-event clustering – Approach

Relies solely on time information

Calculate ‘corrected’ time stamps

Using the calculated gallery time offsets

Apply 1-dimensional k-means algorithm

The value of k is determined from the data set size and a user-defined ‘granularity’ parameter
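The two steps can be sketched as below. The sign convention (an offset is subtracted from the time stamps of its gallery), the deterministic initialisation of the cluster centers, and the function names are assumptions; k is passed in directly because the rule that derives it from the data set size and the granularity is not given on the slide.

```python
def correct_timestamps(timestamps, gallery_ids, offsets):
    """Shift each time stamp by the offset of its gallery (assumed sign: subtract)."""
    return [t - offsets[g] for t, g in zip(timestamps, gallery_ids)]

def kmeans_1d(values, k, iterations=50):
    """Plain 1-dimensional k-means; returns (labels, centers). Requires 1 <= k <= len(values)."""
    data = sorted(values)
    # Initialise the centers at evenly spaced positions of the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Recompute each center as the mean of its cluster (keep it if the cluster is empty).
        new_centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
    return labels, centers
```

Each resulting label identifies the sub-event of one image or key frame; a larger k gives a finer partition of the event timeline.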


Results: Gallery synchronization


Results for the Tour de France 2014 (TDF14) and NAMM 2015 (NAMM15) datasets

Results: Subevent clustering


Subevent clustering results for Tour de France 2014 (left) and NAMM 2015 (right).

Run 2 has a finer ‘granularity’ (higher k) than run 1.

Conclusion

Gallery synchronization

High accuracy (approx. 80–90 %) for both TDF and NAMM

But precision is very low on NAMM (so many galleries are not correctly synchronized)

NAMM 2015 content is visually more ‘challenging’ (many more incorrect image matches)

Subevent clustering

The worse results (precision, recall) for NAMM 2015 might be due to the less successful gallery synchronization for this dataset


Acknowledgments

This work was supported by the European Commission under the grant agreement no. FP7-610370, ‘ICoSOLE’ http://www.icosole.eu

The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 610370, ICoSOLE (‘Immersive Coverage of Spatially Outspread Live Events’, http://www.icosole.eu/).

Hannes Fassold

JOANNEUM RESEARCH – DIGITAL

hannes.fassold@joanneum.at

http://www.joanneum.at/digital
