ecir2017-inferring user interests for passive users on twitter by leveraging followee biographies

Guangyuan Piao, John G. Breslin. Unit for Social Semantics. 39th European Conference on Information Retrieval, Aberdeen, Scotland, 9-13 April 2017. Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Upload: guangyuan-piao

Post on 22-Jan-2018


TRANSCRIPT

Page 1: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Guangyuan Piao, John G. Breslin

Unit for Social Semantics

39th European Conference on Information Retrieval, Aberdeen, Scotland, 9-13 April 2017

Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Page 2: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

One third of users seek medical information, and over 50% of users consume news, on social networks.

Facebook and Twitter together generate more than 5 billion microblogs per day.

[SOURCE] Semantic Filtering for Social Data, Amit et al., Internet Computing’16

Page 3: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

According to research conducted by Twopcharts, 44% of Twitter users have never sent a tweet.

[SOURCE] http://guardianlv.com/2014/04/twitter-users-are-not-tweeting/

Page 4: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

How can we infer user interests for passive users based on the info of their followees?

Page 5: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Related Work

•  user modeling for active users
    •  analyzing users’ tweets
    •  representing user interests using different approaches
        •  bag-of-words
        •  topic modeling
        •  bag-of-concepts

Example profile, interest (frequency):
dbpedia:Eagles_of_Death_Metal (5)
dbpedia:The_Wombats (2)

Page 6: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Related Work

•  user modeling for passive users
    •  analyzing information of users’ followees
    •  HIW(followees_tweet) [Chen et al. SIGCHI’10]
        •  a great amount of data
        •  but also noisy
    •  SA(followees_name) [Besel et al. SAC’16, Faralli et al. SNAM’16]
        •  link names to entities
        •  construct category-based user profiles: spreading activation + WiBi-taxonomy (Wikipedia categories)

Example: dbpedia:Cristiano_Ronaldo (5) → Real_Madrid_C.F._players, 2014_FIFA_World_Cup_players → Category A, Category B

Page 7: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Exploiting Background Knowledge

•  Wikipedia category system
    •  Wikipedia Bitaxonomy (Flati et al. ACL’14)
    •  Hierarchical Interest Graph (Kapanipathi et al. ESWC’14)

Page 8: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Exploiting Background Knowledge

•  DBpedia
    •  beyond category information
    •  using related entities with various properties

Page 9: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Aim of Work

•  user modeling for passive users
    •  limitation of using followees’ names
        •  link names to entities (only popular followees can be linked)
        •  12.7% of followees linked in [Faralli et al. SNAM’16]
    •  we aim to investigate
        •  whether we can leverage the biographies (bios) of followees for inferring user interest profiles
        •  how our approach compares against two state-of-the-art user modeling strategies

Example bio: Bob Horry (@bob): Android developer, educator

Page 10: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Our Approach

•  user modeling leveraging biographies of followees

Input: Twitter user @bob
1. fetch user’s followees (Twitter API)
2. extract entities from bios of followees (Aylien API)
3. interest propagation (WiBi taxonomy, DBpedia graph)
Output: interest profile

Example: Bob Horry (@bob), "Android developer, educator" → dbpedia:Android (5) → Smartphones, Mobile_operating_systems → Category A
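The first two steps of the approach can be sketched roughly as follows. This is a minimal illustration only: the in-memory `FOLLOWEE_BIOS` table stands in for the Twitter API, the keyword lookup in `SURFACE_FORMS` stands in for an entity extraction service, and the handle `@alice` and the second bio are made up for the example. None of these names come from the slides.

```python
from collections import Counter

# Hypothetical stand-in for the Twitter API: bios of a user's followees.
FOLLOWEE_BIOS = {
    "@alice": ["Android developer, educator", "Android fan and coffee lover"],
}

# Hypothetical stand-in for an entity extraction service:
# lower-cased surface form -> DBpedia entity.
SURFACE_FORMS = {
    "android": "dbpedia:Android",
    "coffee": "dbpedia:Coffee",
}


def extract_entities(bio):
    """Return entities whose surface form appears in the bio (toy matcher)."""
    text = bio.lower()
    return [entity for form, entity in SURFACE_FORMS.items() if form in text]


def build_profile(user):
    """Count how often each entity occurs across the bios of the user's followees."""
    profile = Counter()
    for bio in FOLLOWEE_BIOS.get(user, []):
        profile.update(extract_entities(bio))
    return profile


profile = build_profile("@alice")
```

The resulting counter plays the role of the "interest (frequency)" profile before any propagation is applied.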

Page 11: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Interest Propagation

•  spreading activation (Kapanipathi et al. ESWC’14)

[Figure: activation spreading from an entity to its category; labels: category, subnodes, entity]
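As a rough illustration of the idea, the sketch below implements a generic spreading activation over a child-to-parent graph. It is not the exact function from Kapanipathi et al.: the constant `decay` factor, the hop limit, and the `Mobile_phones` parent category are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical child -> parents edges (entity to categories, category to broader category).
PARENTS = {
    "dbpedia:Android": ["Smartphones", "Mobile_operating_systems"],
    "Smartphones": ["Mobile_phones"],
    "Mobile_operating_systems": [],
    "Mobile_phones": [],
}


def spread_activation(entity_weights, parents, decay=0.5, max_hops=2):
    """Propagate entity weights upward, multiplying by `decay` at every hop."""
    activation = defaultdict(float)
    frontier = dict(entity_weights)
    for _ in range(max_hops + 1):
        next_frontier = defaultdict(float)
        for node, weight in frontier.items():
            activation[node] += weight
            for parent in parents.get(node, []):
                next_frontier[parent] += weight * decay
        frontier = next_frontier
    return dict(activation)


propagated = spread_activation({"dbpedia:Android": 5.0}, PARENTS)
```

With these assumed settings, categories directly above the entity receive half of its weight, and each further hop halves the weight again.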

Page 12: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Interest Propagation

•  interest propagation using DBpedia (SEMANTiCS’16)
    •  SP: # of subpages
    •  SC: # of subcategories
    •  P: # of properties appearing in the whole DBpedia graph
    •  intuition 1: discount common categories
    •  intuition 2: discount related entities connected with common properties
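The two intuitions can be illustrated with simple discount functions. The transcript does not preserve the exact formulas, so the logarithmic form below is only an assumed, plausible shape: any function that decreases as SP, SC, or P grows would express the same intuition.

```python
import math


def category_discount(sp, sc):
    """Assumed form: categories with many subpages (sp) and subcategories (sc)
    are common, so they receive a smaller share of the propagated weight."""
    return 1.0 / (math.log(math.e + sp) * math.log(math.e + sc))


def property_discount(p):
    """Assumed form: entities reached through a property that appears often
    in the graph (p) receive a smaller share of the propagated weight."""
    return 1.0 / math.log(math.e + p)


def propagate(weight, discount):
    """Weight passed on to a category or related entity after discounting."""
    return weight * discount


# A niche category keeps more of the weight than a very broad one.
niche = propagate(5.0, category_discount(sp=10, sc=2))
broad = propagate(5.0, category_discount(sp=10000, sc=500))
```

Adding `math.e` inside the logarithm is a design choice of this sketch so that the discount equals 1 when the counts are zero and never divides by zero.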

Page 13: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Experiment Setup

•  main goal
    •  analyze & compare different user modeling strategies in the context of link (URL) recommendations
•  recommendation algorithm
    •  cosine similarity between a user and a link (URL)
•  ground truth
    •  links shared by users in their last two weeks
•  candidate set (1,377 distinct links)
    •  all links shared by users in their last two weeks
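The recommendation step can be sketched as ranking candidate links by cosine similarity to the user profile. The sketch assumes both profiles are sparse dictionaries of concept weights; the example URLs and weights are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse weight dictionaries."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user_profile, link_profiles, top_n=10):
    """Return the top-N candidate URLs, most similar to the user first."""
    scored = [(cosine(user_profile, p), url) for url, p in link_profiles.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in scored[:top_n]]


# Hypothetical user profile and candidate links.
user = {"dbpedia:Android": 5.0, "Smartphones": 2.5}
links = {
    "http://example.org/coffee-guide": {"dbpedia:Coffee": 1.0},
    "http://example.org/android-news": {"dbpedia:Android": 1.0},
}
ranking = recommend(user, links)
```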

Page 14: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Experiment Setup

•  Twitter dataset
    •  461 random users
    •  902,544 followees
        •  90% of them have filled in their biographies
•  dataset for experiment
    •  50 users
    •  84,646 followees, 77,825 distinct ones
        •  7,785 (10%) out of 77,825 followees can be linked to entities
        •  72,145 (92.7%) of followees have bios

Page 15: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Experiment Setup

•  evaluation metrics
    •  MRR (Mean Reciprocal Rank)
        •  how early, on average, the first relevant item occurs in the recommendations
    •  S@N (Success rate)
        •  mean probability that a relevant item occurs in the top-N list
    •  P@N (Precision)
        •  mean probability that retrieved items in the top-N are relevant
    •  R@N (Recall)
        •  mean probability that relevant items are retrieved in the top-N
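For a single user, these metrics can be computed as in the sketch below, using their standard textbook definitions; the reported figures would then be the mean over all users. The item identifiers in the example are placeholders.

```python
def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0 if none is recommended."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def success_at_n(ranked, relevant, n=10):
    """1 if at least one relevant item is in the top-N, else 0."""
    return 1.0 if any(item in relevant for item in ranked[:n]) else 0.0


def precision_at_n(ranked, relevant, n=10):
    """Fraction of the top-N slots filled with relevant items."""
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / n


def recall_at_n(ranked, relevant, n=10):
    """Fraction of all relevant items that appear in the top-N."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / len(relevant)


# Placeholder recommendation list and ground truth for one user.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d", "z"}
rr = reciprocal_rank(ranked, relevant)
s2 = success_at_n(ranked, relevant, n=2)
p2 = precision_at_n(ranked, relevant, n=2)
r2 = recall_at_n(ranked, relevant, n=2)
```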

Page 16: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Observation

•  # of entities extracted from names & bios of followees
    •  more than twice the # of entities using bios of followees
    •  on average, 509 entities (bios) vs. 210 entities (names)

[Figure: bar chart of the average number of entities by data source for extracting entities (followees_bio, followees_name)]

Page 17: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Results

Recommendation performance by user modeling strategy (bar values in legend order):

Strategy                MRR      S@10
SA(followees_name)      0.3402   0.6250
HIW(followees_tweet)    0.4665   0.6250
SA(followees_bio)       0.5532   0.8125
IP(followees_bio)       0.5616   0.7708

Page 18: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Results

Recommendation performance by user modeling strategy (bar values in legend order):

Strategy                P@10     R@10
SA(followees_name)      0.1625   0.0726
HIW(followees_tweet)    0.2521   0.1186
SA(followees_bio)       0.2896   0.1334
IP(followees_bio)       0.3354   0.1555

Page 19: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Conclusions

•  leveraging biographies of followees can provide:
    •  richer user profiles in terms of quantity (a greater number of entities)
    •  better user profiles in terms of quality (higher recommendation performance)
•  leveraging DBpedia for interest propagation provides better performance compared to using categories only

Page 20: ECIR2017-Inferring User Interests for Passive Users on Twitter by Leveraging Followee Biographies

Thank you for your attention!

Guangyuan Piao
homepage: http://parklize.github.io
e-mail: [email protected]
twitter: https://twitter.com/parklize
slideshare: http://www.slideshare.net/parklize