hypertext2017-leveraging followee list memberships for inferring user interests for passive users on...
TRANSCRIPT
Guangyuan Piao, John G. Breslin
Unit for Social Semantics
28th ACM Conference on Hypertext and Social Media, Prague, Czech Republic, 4-7 July 2017
Leveraging Followee List Memberships for Inferring User Interests for Passive Users on Twitter
1/3 of users seek medical information and over 50% of users consume news on social networks
Facebook and Twitter together generate more than 5 billion microblogs per day
[SOURCE] Semantic Filtering for Social Data, Amit et al., Internet Computing’16
According to research by Twopcharts, 44% of Twitter users have never sent a tweet
[SOURCE] http://guardianlv.com/2014/04/twitter-users-are-not-tweeting/
Related Work
• user modeling for active users
  • analyzing users’ tweets
  • representing user interests using different approaches
    • bag-of-words
    • topic modeling
    • bag-of-concepts
[Figure: bag-of-concepts example. Interests with frequency: dbr:Eagles_of_Death_Metal (5), dbr:The_Wombats (2); related node dbc:Hard_rock, edge label dbp:genre]
Related Work
• user modeling for passive users
  • analyzing information of users’ followees
    • HIW(followees_tweet) [Chen et al. SIGCHI’10]
      • a great amount of data, but also noisy
    • SA(followees_name) [Besel et al. SAC’16, Faralli et al. SNAM’16]
      • link names to entities, construct category-based user profiles
      • spreading activation + WiBi-taxonomy (Wikipedia categories)
[Figure: SA(followees_name) example. Nodes: dbr:Cristiano_Ronaldo (5), dbc:Real_Madrid_C.F._players, dbr:2014_FIFA_World_Cup_players, Category A, Category B, …]
Related Work
• user modeling for passive users
  • analyzing information of users’ followees
    • IP(followees_bio) [Piao et al. ECIR’17]
      • exploring related categories & entities (1-hop)
      • performed better than HIW(followees_tweet) and SA(followees_name)
[Figure: followee @bob (Bob Horry), bio "Android developer, educator". The extracted entity dbr:Android_(operating_system) is linked to dbc:Smartphones and dbc:Tablet_operating_systems via dc:subject, and to dbr:Java_(programming_language) via dbp:programmedIn]
Different Views of Followees
• user modeling for passive users
[Figure: two views of followee @bob (Bob Horry, "Android developer, educator"): biographies (self-description) and list memberships (others-descriptions)]
Aim of Work
• user modeling for passive users
  • we aim to investigate
    • whether we can leverage the list memberships of followees for inferring user interest profiles, and
    • whether the two different views of followees complement each other to improve the quality of user profiles
Our Approach
• user modeling leveraging list memberships of followees
[Figure: pipeline from Twitter user @alice to an interest profile]
1. fetch user’s followees (Twitter API)
2. fetch list memberships of followees (Twitter API)
3. extract entities from followees’ list memberships (Tag.me)
4. constructing primary interests (weighting scheme)
5. interest propagation (DBpedia graph)
Constructing Primary Interests
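• Weighting Scheme 1 (WS1)
  • profile of a followee f in Fu: [equation and its "where" clause not preserved in transcript]
  • weight of an entity with respect to the target user: [equation not preserved in transcript]
[Figure: normalized followee profiles in Fu, e.g. A (0.1), B (0.2), F (0.1), … and B (0.3), F (0.2), C (0.2), …, are aggregated into B (0.5), …, F (0.3), …]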
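The aggregation shown in the WS1 example can be sketched as follows. This is a minimal sketch under two assumptions, since the slide's equations are not preserved: each followee profile is normalized so its entity weights sum to 1, and the target user's weight for an entity is the sum of that entity's weights over all followees. Both are inferred from the example numbers (B: 0.2 + 0.3 = 0.5, F: 0.1 + 0.2 = 0.3). The function names are hypothetical.

```python
from collections import Counter


def normalize(entity_counts):
    """Scale one followee's entity counts so the weights sum to 1 (assumed normalization)."""
    total = sum(entity_counts.values())
    if total == 0:
        return {}
    return {entity: count / total for entity, count in entity_counts.items()}


def ws1_primary_interests(followee_entity_counts):
    """Sum normalized followee profiles into the target user's primary interests.

    followee_entity_counts: list of dicts, one per followee, mapping an
    entity to how often it was extracted from that followee's list memberships.
    """
    profile = Counter()
    for entity_counts in followee_entity_counts:
        for entity, weight in normalize(entity_counts).items():
            profile[entity] += weight
    return dict(profile)
```

With this scheme, an entity that dominates many followees' list memberships accumulates weight, while a followee with very many list memberships does not outweigh the others.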
Constructing Primary Interests
• Weighting Scheme 2 (WS2)
  • based on the idea of HIW (Chen et al. CHI’10)
  • excluding entities extracted from only a single followee
  • w(u, cj) = the number of followees who have cj in their list memberships
[Figure: followee entity sets A, B, F, … and B, F, C, … yield B (2), …, F (2), …]
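WS2 as described above reduces to counting followees per entity and dropping entities seen in only one followee. The sketch below illustrates this on the slide's example; the function name and the input layout (one set of entities per followee) are assumptions.

```python
from collections import Counter


def ws2_primary_interests(followee_entities):
    """Weight each entity by the number of followees whose list memberships contain it.

    followee_entities: list of sets, one set of extracted entities per followee.
    Entities found in only a single followee are excluded.
    """
    followee_count = Counter()
    for entities in followee_entities:
        # set() guards against duplicates so each followee counts once per entity
        followee_count.update(set(entities))
    return {entity: n for entity, n in followee_count.items() if n > 1}
```

For the two followees in the example, `ws2_primary_interests([{"A", "B", "F"}, {"B", "F", "C"}])` keeps only B and F, each with weight 2.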
Interest Propagation
• interest propagation using DBpedia (SEMANTiCS’16)
  • SP: # of subpages
  • SC: # of subcategories
  • P: # of properties appearing in the whole DBpedia graph
  • intuition 1: discount common categories
  • intuition 2: discount related entities connected with common properties
[Figure: dbr:Android_(operating_system) linked to dbc:Smartphones and dbc:Tablet_operating_systems via dc:subject, and to dbr:Java_(programming_language) via dbp:programmedIn]
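The two intuitions can be illustrated with a small propagation sketch. The discount formulas are not preserved in this transcript, so the inverse-log form used here, the decay factor `alpha`, and the clamp at 1 are all assumptions chosen only to show the behaviour (larger SP, SC, or P gives a smaller propagated weight); they should not be read as the authors' exact formulas. All function names and the counts used with them are hypothetical.

```python
import math


def log_damp(count):
    """Assumed damping term: log(count), clamped so it never drops below 1."""
    return max(1.0, math.log(count)) if count > 0 else 1.0


def category_discount(sp, sc, alpha=2.0):
    """Intuition 1: categories with many subpages (SP) / subcategories (SC) get less weight."""
    return 1.0 / (alpha * log_damp(sp) * log_damp(sc))


def property_discount(p, alpha=2.0):
    """Intuition 2: entities reached via a very common property (P) get less weight."""
    return 1.0 / (alpha * log_damp(p))


def propagate(primary, category_links, property_links, stats, alpha=2.0):
    """Extend primary interests with 1-hop categories and related entities.

    primary: {entity: weight}
    category_links: {entity: [category, ...]}
    property_links: {entity: [(property, related_entity), ...]}
    stats: {"categories": {category: (SP, SC)}, "properties": {property: P}}
    """
    profile = dict(primary)
    for entity, weight in primary.items():
        for category in category_links.get(entity, []):
            sp, sc = stats["categories"][category]
            gain = weight * category_discount(sp, sc, alpha)
            profile[category] = profile.get(category, 0.0) + gain
        for prop, related in property_links.get(entity, []):
            gain = weight * property_discount(stats["properties"][prop], alpha)
            profile[related] = profile.get(related, 0.0) + gain
    return profile
```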
Interest Propagation
• interest propagation using DBpedia (SEMANTiCS’16)
  • same as previous approach but with DBpedia refinement
    • extracting sub-graph of dbc:Main_topic_classifications
    • merging categories and entities with the same names
[Figure: profile before merging: dbc:Apple_Inc. (0.25), dbr:Apple_Inc. (5), dbr:Steve_Jobs (2); after merging: Apple_Inc. (5.25), Steve_Jobs (2)]
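The merging step in the before/after example can be sketched as below. It assumes that "same name" means the identifier is identical once the `dbr:` or `dbc:` prefix is stripped, and that the merged weight is the sum of the two; both assumptions match the example (0.25 + 5 = 5.25). The function name is hypothetical.

```python
from collections import defaultdict


def merge_same_name(profile):
    """Merge dbc: categories and dbr: entities that share a name, summing their weights."""
    merged = defaultdict(float)
    for uri, weight in profile.items():
        prefix, sep, name = uri.partition(":")
        # Only strip the two DBpedia prefixes used in the slides; keep anything else as-is.
        key = name if sep and prefix in ("dbr", "dbc") else uri
        merged[key] += weight
    return dict(merged)
```

Applied to the "before" profile on the slide, this yields `{"Apple_Inc.": 5.25, "Steve_Jobs": 2.0}`.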
Experiment Setup
• main goal
  • analyze & compare different user modeling strategies in the context of link (URL) recommendations
• link (URL) profile
  • same representation model as for users, based on the link’s content
• ground truth
  • links shared by users in their timeline in the last two weeks
[Figure: user models UM#1 and UM#2 and the candidate links (URLs) feed the recommendation algorithm (cosine similarity), which outputs top-N recommendations]
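The recommendation step can be sketched as ranking candidate links by cosine similarity between the user profile and each link profile. This is a minimal sketch, assuming both profiles are sparse dictionaries mapping an entity or category to a weight; the function names and the example data are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors stored as dicts."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user_profile, link_profiles, n=10):
    """Return the top-n link URLs, most similar to the user profile first."""
    ranked = sorted(
        link_profiles,
        key=lambda url: cosine(user_profile, link_profiles[url]),
        reverse=True,
    )
    return ranked[:n]
```

Because every user modeling strategy produces a profile in the same representation, the same `recommend` call can be reused to compare strategies on identical candidate links.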
Experiment Setup
• Twitter dataset
  • 439 random users
  • 2,771 followees on average
  • considered up to 200 followees for each user due to the Twitter API limit for crawling list memberships
• dataset for experiment
  • 439 users
  • 74,488 followees in total, 170 followees on average
  • 15,053 candidate links for recommendations
Experiment Setup
• evaluation metrics
  • MRR (Mean Reciprocal Rank)
    • how early the first relevant item occurs, on average, in the recommendations
  • S@N (Success rate)
    • mean probability that a relevant item occurs in the top-N list
  • P@N (Precision)
    • mean probability that items retrieved in the top-N are relevant
  • R@N (Recall)
    • mean probability that relevant items are retrieved in the top-N
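The four metrics can be computed per user and then averaged over all users. The sketch below uses the standard definitions of these metrics; the function names are hypothetical, and `ranked` is assumed to be the ordered recommendation list while `relevant` is the set of ground-truth links for that user.

```python
def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0 if none is recommended."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def success_at_n(ranked, relevant, n):
    """1 if at least one relevant item appears in the top-n, else 0."""
    return 1.0 if any(item in relevant for item in ranked[:n]) else 0.0


def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n slots filled with relevant items."""
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / n


def recall_at_n(ranked, relevant, n):
    """Fraction of all relevant items that appear in the top-n."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / len(relevant)


def mean(values):
    """Average a per-user metric over all users (MRR is the mean of reciprocal ranks)."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0
```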
Info. of List Memberships
• over 90% of users have at least 1 list membership
• 173 list memberships on average
• 3,047 entities from list memberships vs. 23 from bios, considering up to 50 followees
Results
[Figure: # of entities (axis 0 to 50,000) by # of followees (50, 100, 150, 200), without refinement vs. with refinement]
9% compression of profile size, while maintaining a similar performance level
Results – combining two views
• combining two views of followees
The final rank of an item is determined by the average of its rank positions under the two user models (Ryen et al. SIGIR’09)
score = [equation not preserved in transcript]
  x: rank position based on 1st user model
  y: rank position based on 2nd user model
  β: importance control parameter
• combining two views improves the performance significantly
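The rank combination can be sketched as follows. The score equation is not preserved in this transcript, so the sketch assumes a β-weighted average of the two rank positions, score = β·x + (1 − β)·y, where a lower score means a better final rank. Both the formula and which user model β is attached to are assumptions; the function name is hypothetical.

```python
def combine_ranks(ranking_first, ranking_second, beta):
    """Fuse two rankings of the same candidates by a weighted average of rank positions.

    ranking_first / ranking_second: lists of the same items, best first.
    beta: weight of the first ranking (assumed); 1 - beta goes to the second.
    Returns the items sorted by combined score, best (lowest) first.
    """
    pos_first = {item: i for i, item in enumerate(ranking_first, start=1)}
    pos_second = {item: i for i, item in enumerate(ranking_second, start=1)}
    # Only items ranked by both user models can be scored.
    items = pos_first.keys() & pos_second.keys()
    score = {
        item: beta * pos_first[item] + (1.0 - beta) * pos_second[item]
        for item in items
    }
    # Ties are broken by item name to keep the output deterministic.
    return sorted(items, key=lambda item: (score[item], item))
```

Under this assumed form, a small β makes the final ranking follow the second user model closely, and β = 1 reproduces the first ranking unchanged.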
Results – combining two views
• combining two views of followees
[Figure: R@10 (y-axis, 0.0400 to 0.0800) vs. beta (x-axis, 0 to 1) for 50, 100, 150, and 200 followees]
best performance when β = 0.1, similar results for MRR, P@10, S@10
• list memberships play a more important role in the combination
Conclusions
• leveraging list memberships of followees outperforms exploiting biographies, especially when a user has a small number of followees
• combining the two different views of followees can improve the quality of user modeling significantly,
• and the list memberships of followees play a more important role in the combination
Thank you for your attention!
Guangyuan Piao
homepage: http://parklize.github.io
twitter: https://twitter.com/parklize
slideshare: http://www.slideshare.net/parklize