acmmm13 sam presentation

  • 7/27/2019 Acmmm13 Sam Presentation

    1/17

    Socially-aware video recommendation using users profiles and crowdsourced annotations

    Marco Bertini, Alberto Del Bimbo, Andrea Ferracani, Francesco Gelli, Daniele Maddaluno, Daniele Pezzatini

    Università degli Studi di Firenze - MICC

    marco.bertini, alberto.delbimbo, andrea.ferracani, [email protected]


    2/17

    The problem: how to meet the emerging demand for services that address the interests of users in multimedia sharing sites.

    The solution: we propose a socially-aware framework for user profiling, knowledge expansion, sharing and interest discovery, in order to improve classic collaborative filtering methods for making recommendations (as a use case scenario we chose video recommendation for a video sharing site).

    ACM Multimedia 2013 - 2nd International Workshop on Socially-Aware Multimedia, October 21

    https://sites.google.com/site/sociallyawaremultimedia2013/

    3/17

    Common approaches in recommendation:

    - collaborative filtering techniques (item-based or user-based) built on a few discrete user activities such as voting, tagging or item views [Davidson et al. 2010], or on user activity on videos [Mei et al. 2011]

    - textual analysis of the metadata that accompany resources, sometimes complemented by multimedia content analysis

    - social similarity expressed as resource popularity distributions [Ma et al. 2013] and social opinions [Davis et al. 2009]

    - basic user profiles built from the tags used by uploaders [Park et al. 2011]


    4/17

    The main contribution of this work is the use of advanced profiling techniques and of user interest similarity, estimated from semi-automatically generated user profiles, to improve video recommendation.

    The main goal is to demonstrate how standard recommendation algorithms can be improved with better profiling, obtained by:

    - leveraging social narcissism

    - improving user engagement through gamification

    - extracting knowledge semi-automatically from user activities

    - stimulating knowledge discovery

    - using homophily (for interest targeting and friendship prediction)


    5/17

    Use case: the InTime framework

    The framework combines a profiling module (through an online social network, i.e. Facebook), user activity analysis such as semantic tagging, and interaction design solutions, in order to generate better targeted services for the users of the social network.

    We developed a prototype of a social network for video recommendation which allows users:

    - to create and curate public personal profiles of interests in a semi-automatic way

    - to share and browse suggested videos, interests and users in user profiles, through user profiling, clustering and semantic similarity

    - to comment and semantically annotate videos at frame level

    - to receive suggestions of similar users and video recommendations


    6/17

    The system consists of three main parts that are closely interconnected:

    - a user profiling engine for the automatic creation of public profiles of interests (the profiles can then be edited and refined by users over time in a semi-automatic way) [InTime Social Network, Named Entity extraction, Wikification]

    - a clustering engine that is responsible for the categorization of resources in the network and that also, with the aid of semantic distances, makes recommendations and suggestions of resources that match those interests, exploitable in user profiles [Hadoop]

    - a recommendation engine for videos and similar users, viewable in the personal home page of the social network [Hadoop]

    7/17

    User interest modeling: user profiles are composed of categories of interest built from heterogeneous data:

    - information extracted from Facebook (cold start scenario: user categories of interest are computed by analyzing the user's page likes or the page likes of the user's friends)

    - information provided manually by users in their profiles (categories and resources of interest)

    - information provided manually by users in their comments (resources from Facebook or Wikipedia)

    - information extracted automatically (named entity detection in user comments, semantic analysis and categorization of annotations)

    - click-through data and page views


    8/17

    9/17

    Crowdsourced video tagging:

    The system

    - features automatic extraction of semantic entities

    - also provides a widget for manual semantic tagging, which allows users to add Wikipedia and Facebook resources within the text of the comments, at video frame level.

    Semantic tags are automatically detected within comments using Named Entity Detection, based on rules and gazetteers, together with a wikification procedure that identifies Wikipedia entities in text comments.

    These semantic tags are used to represent the video topics, and their association with user interests is computed in real time when users post new comments, in order to update their personal profiles.
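
    The gazetteer part of the entity detection described above can be illustrated with a minimal Python sketch. The gazetteer entries, the function name and the matching strategy (case-insensitive whole-word lookup) are assumptions made for illustration; the slides do not describe the actual rules or the wikification service used by the system.

```python
import re

# Hypothetical gazetteer: surface form -> Wikipedia page title.
GAZETTEER = {
    "florence": "Florence",
    "uffizi": "Uffizi",
    "leonardo da vinci": "Leonardo da Vinci",
}

def detect_entities(comment, gazetteer=GAZETTEER):
    """Return (matched text, Wikipedia title) pairs in order of appearance."""
    found = []
    for surface, title in gazetteer.items():
        # Whole-word, case-insensitive match of the surface form.
        pattern = r"\b" + re.escape(surface) + r"\b"
        for match in re.finditer(pattern, comment, flags=re.IGNORECASE):
            found.append((match.start(), match.group(0), title))
    found.sort()  # order by position in the comment
    return [(text, title) for _, text, title in found]

tags = detect_entities("Leonardo da Vinci worked in Florence for years")
```

    Each returned pair could then be attached to the frame on which the comment was posted.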


    10/17

    Profile creation and curation

    Profiles are created semi-automatically on the basis of the resources that have been tagged in, or extracted automatically from, user comments on video frames:

    - all the extracted resources in the network are represented by the text of their corresponding Wikipedia page and are used as suggestions to improve user profiles;

    - resources are vectorized using the TF-IDF algorithm and clustered with Fuzzy K-Means;

    - clusters are labeled with a two-level taxonomy of interests by computing a weighted average of the semantic distances between the k resources closest to the centroids and the items of the taxonomy, using the Wikipedia Link-based Measure (WLM) [Milne et al. 2008];

    - user profiles of interest are used as a place to recommend clustered resources that users can publish on their public profile (profile curation).
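
    The vectorization and soft clustering steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions: whitespace tokenization, the tf * log(N / df) weighting, Euclidean distance and the fuzzifier m = 2 are choices made here, and the toy documents stand in for Wikipedia page texts. The slides attribute clustering to a Hadoop-based engine whose settings are not given.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per document (tf * log(N / df))."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * math.log(n_docs / df[t])
                        for t, c in counts.items()})
    return vectors

def distance(a, b):
    """Euclidean distance between two sparse vectors."""
    return math.sqrt(sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2
                         for t in set(a) | set(b)))

def fuzzy_memberships(vector, centroids, m=2.0):
    """Degree of membership of `vector` in each cluster; the degrees sum to 1."""
    dists = [distance(vector, c) for c in centroids]
    if any(d == 0.0 for d in dists):
        # A vector lying exactly on a centroid belongs fully to that cluster.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

# Toy stand-ins for Wikipedia page texts.
docs = ["football league goal", "football match goal", "guitar concert album"]
vecs = tfidf_vectors(docs)
memberships = fuzzy_memberships(vecs[0], centroids=[vecs[1], vecs[2]])
```

    Unlike hard k-means, each resource keeps a degree of membership in every cluster, so a resource can be suggested under more than one category of interest.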


    11/17

    Video suggestions

    Video suggestions are computed by comparing the profiles of interest of the most similar users in the network:

    - users are described by a vector that contains their percentage of interest in each category of the system

    - in the cold start scenario, percentages of interest are obtained by counting the user's Facebook likes per category and normalizing; they are refined later using the resources that users add while curating their public profile or that are extracted from their comments

    - this vector of weighted categories is used to compute user similarity and to determine a user neighborhood inside the network

    - once the neighborhood n is defined, the recommendation is generated by ranking items using the preferences expressed by the users in n

    - formally, the proposed system modifies the user-based recommendation algorithm, in that it uses the similarity of user interests to select the items on which the recommendation is computed.
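
    The interest vector described above can be sketched as follows. The function name, the category names and the rule that a curated or extracted resource counts as much as a Facebook like (`weight_curated=1.0`) are assumptions for illustration; the slides do not state how the two sources are weighted.

```python
from collections import Counter

def interest_vector(categories, facebook_likes, curated=(), weight_curated=1.0):
    """Return fractions of interest over `categories` that sum to 1.

    facebook_likes: one category label per liked page (cold start data).
    curated: category labels of resources added later by the user or
             extracted from the user's comments.
    """
    counts = Counter()
    for category in facebook_likes:
        counts[category] += 1.0
    for category in curated:
        counts[category] += weight_curated
    total = sum(counts[c] for c in categories)
    if total == 0:
        return [0.0 for _ in categories]  # no evidence about this user yet
    return [counts[c] / total for c in categories]

categories = ["music", "sport", "cinema"]
cold_start = interest_vector(categories, ["music", "music", "sport"])
refined = interest_vector(categories, ["music", "music", "sport"], curated=["cinema"])
```

    Here `cold_start` puts two thirds of the interest on music, while `refined` moves a quarter of it to cinema once the user curates the profile.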


    12/17

    Recommendation algorithm

    given a user u
    for each other user w do
        compute an interests similarity s between u and w.
    end
    create a neighborhood n containing the top k users, ranked by similarity.

    for each item i that some user in n has a preference for, but that u has no preference for yet do
        for each other user v in n that has a preference for i do
            compute a similarity s between u and v.
            incorporate v's preference for i, weighted by s, into a running average.
        end
    end
    return the top items, ranked by weighted average

    Definition of neighborhood based on interests similarity

    Recommendation considering votes derived from CF
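
    The algorithm above can be sketched in Python as follows. The data layout, the names and the similarity function (1 / (1 + Euclidean distance) between interest vectors) are assumptions for illustration, and the same interest-based similarity is reused as the weight in the second loop; the slides do not say which similarity is used there.

```python
import math

def recommend(user, interests, ratings, k=2, top_n=3):
    """Rank items unseen by `user` using an interest-based neighborhood.

    interests: {user: [fraction of interest per category]}
    ratings:   {user: {item: rating}}
    """
    def sim(a, b):
        # Assumed similarity: decreases as interest vectors move apart.
        return 1.0 / (1.0 + math.dist(interests[a], interests[b]))

    # Neighborhood: the k users with the most similar interest profiles.
    others = [w for w in interests if w != user]
    neighbors = sorted(others, key=lambda w: sim(user, w), reverse=True)[:k]

    seen = ratings.get(user, {})
    weighted_sum, weight_total = {}, {}
    for v in neighbors:
        s = sim(user, v)
        for item, rating in ratings.get(v, {}).items():
            if item in seen:
                continue  # the user already has a preference for this item
            weighted_sum[item] = weighted_sum.get(item, 0.0) + s * rating
            weight_total[item] = weight_total.get(item, 0.0) + s

    scores = {i: weighted_sum[i] / weight_total[i] for i in weighted_sum}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

# Made-up toy data.
interests = {"ann": [0.7, 0.3, 0.0], "bob": [0.6, 0.4, 0.0],
             "cat": [0.0, 0.1, 0.9], "dan": [0.8, 0.2, 0.0]}
ratings = {"ann": {"v1": 5}, "bob": {"v1": 4, "v2": 5},
           "cat": {"v2": 1, "v3": 5}, "dan": {"v2": 4, "v4": 2}}
suggestions = recommend("ann", interests, ratings, k=2)
```

    In this toy data "cat" has very different interests and falls outside the neighborhood, so the low rating given by "cat" to "v2" does not affect the suggestion for "ann".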


    13/17

    Evaluation

    In order to evaluate the accuracy of the recommendations, we held out a percentage of the collected data, represented by user ratings on videos, and used it as test data that was not used to train the recommendation system.

    The recommender engine produces rating predictions for the held-out test data, which are compared with the actual values in order to evaluate the accuracy.

    Dataset

    - 138 videos and 51 users

    - 152 expressed preferences from 1 to 5 stars, sparsity level 0.978

    - user interest profiles are represented by 383 resources that are organized in 15 main categories.
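
    The reported sparsity level can be checked with a short calculation, shown here together with the RMSE used to score predictions. The sparsity definition (one minus the fraction of filled user-item cells) is the usual one and is assumed here; the ratings passed to `rmse` are made up.

```python
import math

def sparsity(n_ratings, n_users, n_items):
    """Fraction of the user-item matrix that holds no rating."""
    return 1.0 - n_ratings / (n_users * n_items)

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

dataset_sparsity = sparsity(n_ratings=152, n_users=51, n_items=138)
print(round(dataset_sparsity, 3))  # 0.978

example_error = rmse([4.0, 2.0], [5.0, 2.0])  # made-up ratings
```

    With 152 ratings over 51 x 138 = 7038 possible user-video pairs, only about 2% of the matrix is filled.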


    14/17

    Results - First Experiment

    - we selected 90% of our dataset as the training set and performed the evaluation on the remaining 10% of the data, using repeated random sub-sampling validation (1000 iterations)

    - we tested three different distance measures in terms of RMSE: Euclidean distance, Pearson correlation and Log-Likelihood. Euclidean distance provided the best results.
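
    The validation protocol above can be sketched as follows. `fit_predict` is a hypothetical callable standing in for any recommender, and `mean_baseline` is a trivial predictor used only to make the example runnable; neither is part of the system described in the slides, and the ratings are made up.

```python
import math
import random

def repeated_subsampling_rmse(ratings, fit_predict, iterations=1000,
                              test_fraction=0.1, seed=0):
    """Average RMSE over repeated random train/test splits.

    ratings: list of (user, item, rating) triples.
    fit_predict: callable(train, user, item) -> predicted rating.
    """
    rng = random.Random(seed)
    n_test = max(1, round(len(ratings) * test_fraction))
    scores = []
    for _ in range(iterations):
        shuffled = ratings[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        errors = [(fit_predict(train, u, i) - r) ** 2 for u, i, r in test]
        scores.append(math.sqrt(sum(errors) / len(errors)))
    return sum(scores) / len(scores)

def mean_baseline(train, user, item):
    """Stand-in predictor: the mean of all training ratings."""
    return sum(r for _, _, r in train) / len(train)

# Made-up ratings on a 1 to 5 scale.
ratings = [("u%d" % (n % 5), "v%d" % n, 1 + n % 5) for n in range(20)]
score = repeated_subsampling_rmse(ratings, mean_baseline, iterations=100)
```

    Averaging over many random splits reduces the variance that a single 90/10 split would have on a dataset this small.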


    15/17

    Results - Second Experiment

    - we compared our recommendation algorithm with the user-based Collaborative Filtering algorithm, which does not consider interest profile similarity

    - the neighborhood size does affect the quality of prediction

    - our algorithm always performs significantly better, in particular when a small number of neighbors is involved (RMSE of 0.96 vs. 1.66 for the classical CF algorithm with a neighborhood of 5 users)

    - when the size of the neighborhood grows, the two approaches tend to give similar results, although our proposed solution still performs better.


    16/17

    Future work

    - automatic visual tagging

    - dynamic video saliency computation on video shots to improve video sequence suggestions

    - sentiment analysis on video scenes

    - enrichment of our dataset using services like Mechanical Turk


    17/17

    https://vimeo.com/55771570