clustering and retrieval of social events in flickr

1
Clustering and Retrieval of Social Events in Flickr Maia Zaharieva 1,2 Daniel Schopfhauser 1 Manfred Del Fabro 3 Matthias Zeppelzauer 4 1 Interactive Media Systems Group, Vienna University of Technology, Austria 2 Multimedia Information Systems Group, University of Vienna, Austria 3 Distributed Multimedia Systems Group, Klagenfurt University, Austria 4 Institute of Creative Media Technologies, St. Pölten Univ. of Applied Sciences, Austria This work has been partly funded by the Vienna Science and Technology Fund (WWTF) through project ICT12-010 and by the Carinthian Economic Promotion Fund (KWF) under grant KWF-20214/22573/33955. Contact: Dr. Maia Zaharieva, maia.zaharieva@{tuwien|univie}.ac.at Development queries Test queries R P F1 R P F1 Run 1 0.4656 0.8990 0.5367 0.2242 0.4570 0.2287 Run 2 0.5052 0.8974 0.6192 0.2365 0.3268 0.2109 Run 3 0.4770 0.4391 0.3838 0.4057 0.4203 0.2877 EXPERIMENTS II Configuration Run 1: No query expansion Neglect events without geo-coordinates Use event-type models Run 2: Run 1 + query expansion Run 3 (unsupervised): No query expansion Consider all events No event-type models Results EVENT RETRIEVAL APPROACH Lessons Learned Unsupervised approach performs best Query expansion reduces precision Event-type models do not help Queries 8 development queries: music, community, conferences, … 10 test queries: music, community, sport, theatre, … Dataset # photos: 362,578 # events: 17,834 % photos with GPS data: 20.12% % events with GPS data: 41.30% Development set Test set F1 NMI F1 NMI Run 1 0.9356 0.9873 0.9476 0.9886 Run 2 0.9343 0.9872 0.9466 0.9884 Run 3 0.9178 0.9840 0.9407 0.9872 Run 4 0.9159 0.9836 0.9404 0.9871 Run 5 0.9098 0.9822 0.9386 0.9866 Results EXPERIMENTS I Configuration Run 1: Text-based clustering + TE Run 2: Text-based clustering + LTE2 Run 3: Location-based clustering, LTE1 Run 4: Location-based clustering, LTE2 Run 5: Time-based clustering, TE EVENT CLUSTERING APPROACH Dataset Development set: # photos: 362,578 # events: 17,834 % photos with GPS data: 20.12% % events with GPS data: 41.30% Test set: # photos: 110,541 Lessons Learned High generalization ability Capture time, user, and location information achieve impressive results Overall, existing metadata information achieve robust results Event clustering Event retrieval Image collection Event 1 Event 2 Event N ... Event 1 ... Event M Query Result MOTIVATION Background Immense daily growth of publicly available photos depicting social events Increasing need for ecient algorithms that are able to mine large photo collections (e.g. Flickr) Research Questions Can we perform reliable event clustering using solely available metadata? Supervised vs. unsupervised event retrieval? Generalization ability of the employed approaches? Fundamental Principles Simple but robust heuristics / decision criteria Minimize number of parameters Minimize assumptions on dataset photos user name capture time GPS data user-generated textual descriptions Time-based clustering generate single-item event clusters single-item event clusters (SE) merge SE A & SE B iff - user(SE A )=user(SE B ) - t (SE A ,SE B ) < th t time-based event clusters (per user) (TE) merge TE A & TE B iff - min( t (TE A ,TE B ))< th t - min( l (TE A ,TE B )) < th l location-time-based event clusters (LTE1) Location-based clustering Step 1: estimate a representative location per cluster, L Step 2: merge TE A & TE B iff - min( t (TE A ,TE B ))< th t - l (L A ,L B ) < th l location-time-based event clusters (LTE2) Approach II Text-based clustering XOR Step 1: textual features extraction Step 2: merge LTE A & LTE B iff - Intersection(TD A ,TD B )<th TD - Intersection(TA A , TA B )<th TA - temporal and/or location constraints apply term dictionaries (TD) LDA topic assignments (TA) Approach I event clusters test queries development queries (ground truth) GeoTagging: city, country, venue textual features extraction: TF-IDF query expansion: WordNet synsets event type models - topic extraction - one class SVMs Feature extraction Matching: hard constraints temporal constraints geo constraints (country) geo constraints (city) Matching: soft constraints matching venue? textual similarity event-type score candidate clusters weighting adaptive thresholding result clusters

Upload: multimediaeval

Post on 04-Aug-2015

38 views

Category:

Software


3 download

TRANSCRIPT

Page 1: Clustering and Retrieval of Social Events in Flickr

Clustering and Retrieval of Social Events in FlickrMaia Zaharieva1,2 Daniel Schopfhauser1 Manfred Del Fabro3 Matthias Zeppelzauer4

1Interactive Media Systems Group, Vienna University of Technology, Austria2Multimedia Information Systems Group, University of Vienna, Austria3Distributed Multimedia Systems Group, Klagenfurt University, Austria

4Institute of Creative Media Technologies, St. Pölten Univ. of Applied Sciences, Austria

This work has been partly funded by the Vienna Science and Technology Fund (WWTF) through project ICT12-010 and by the Carinthian Economic Promotion Fund (KWF) under grant KWF-20214/22573/33955.

!Contact: Dr. Maia Zaharieva, maia.zaharieva@{tuwien|univie}.ac.at

Development queries Test queriesR P F1 R P F1

Run 1 0.4656 0.8990 0.5367 0.2242 0.4570 0.2287Run 2 0.5052 0.8974 0.6192 0.2365 0.3268 0.2109Run 3 0.4770 0.4391 0.3838 0.4057 0.4203 0.2877

EXPERIMENTS IIConfigurationRun 1:‣ No query expansion‣ Neglect events without geo-coordinates‣ Use event-type modelsRun 2:‣ Run 1 + query expansionRun 3 (unsupervised):‣ No query expansion‣ Consider all events‣ No event-type models

Results

EVENT RETRIEVAL APPROACH

Lessons Learned‣ Unsupervised approach performs best‣Query expansion reduces precision‣ Event-type models do not help

Queries8 development queries:‣music, community, conferences, …10 test queries:‣music, community, sport, theatre, …

Dataset# photos: 362,578# events: 17,834% photos with GPS data: 20.12%% events with GPS data: 41.30%

Development set Test setF1 NMI F1 NMI

Run 1 0.9356 0.9873 0.9476 0.9886Run 2 0.9343 0.9872 0.9466 0.9884Run 3 0.9178 0.9840 0.9407 0.9872Run 4 0.9159 0.9836 0.9404 0.9871Run 5 0.9098 0.9822 0.9386 0.9866

Results

EXPERIMENTS IConfigurationRun 1: Text-based clustering + TERun 2: Text-based clustering + LTE2Run 3: Location-based clustering, LTE1Run 4: Location-based clustering, LTE2Run 5: Time-based clustering, TE

EVENT CLUSTERING APPROACHDatasetDevelopment set: # photos: 362,578# events: 17,834% photos with GPS data: 20.12%% events with GPS data: 41.30%

Test set: # photos: 110,541

Lessons Learned‣ High generalization ability‣ Capture time, user, and location information achieve impressive results‣Overall, existing metadata information achieve robust results

Event clustering

Event retrieval

Image collection

Event 1

Event 2

Event N

...Event 1

...

Event M

Query

Result

MOTIVATIONBackground‣ Immense daily growth of publicly available photos depicting social events‣ Increasing need for efficient algorithms that are able to mine large photo collections (e.g. Flickr)

Research Questions‣ Can we perform reliable event clustering using solely available metadata?‣ Supervised vs. unsupervised event retrieval?‣Generalization ability of the employed approaches?

Fundamental Principles‣ Simple but robust heuristics / decision criteria ‣Minimize number of parameters‣Minimize assumptions on dataset

photos

user name

capture time

GPS data

user-generated textual descriptions

Time-based clustering

generate single-item

event clusters

single-item event clusters

(SE)

merge SEA & SEB iff- user(SEA)=user(SEB)- �t(SEA,SEB) < tht

time-based event clusters (per user)

(TE)

merge TEA & TEB iff- min(�t(TEA,TEB))< tht- min(�l(TEA,TEB)) < thl

location-time-based event clusters

(LTE1)

Location-based clustering

Step 1: estimate a representative location per cluster, LStep 2: merge TEA & TEB iff - min(�t(TEA,TEB))< tht - �l(LA,LB) < thl

location-time-based event clusters

(LTE2)

Approach II

Text-based clustering

XOR

Step 1: textual features extraction

Step 2: merge LTEA & LTEB iff - Intersection(TDA,TDB)<thTD - Intersection(TAA, TAB)<thTA - temporal and/or location constraints apply

term dictionaries(TD)

LDA topic assignments (TA)

Approach I

event clusters

test queries

development queries(ground truth)

GeoTagging: city, country, venue

textual features extraction: TF-IDF

query expansion: WordNet synsets

event type models- topic extraction- one class SVMs

Feature extraction Matching: hard constraints

temporal constraints

geo constraints (country)

geo constraints (city)

Matching: soft constraints

matching venue?

textual similarity

event-type score

candidate clusters

weighting

adaptive thresholding

result clusters