![Page 1: A Unified Microblog User Similarity Model for Online ...tcci.ccf.org.cn/conference/2014/ppts/nlpcc/ppt98.pdf · –Propose a similarity model by linearly combining multiple measures](https://reader033.vdocuments.us/reader033/viewer/2022060520/604e91d9c399416881671d42/html5/thumbnails/1.jpg)
NLPCC 2014
A Unified Microblog User Similarity Model for Online Friend Recommendation
Shi Feng, Le Zhang, Daling Wang, Yifei Zhang
Outline
• Motivation
• The Characteristics of Friend Relationship in Microblogs
• Friend Recommendation by Combining Multiple Measures
• Experiments
• Conclusion and Future Work
Motivation
• Real-life friends come from schoolmates, colleagues and neighbors
• Extend real-life social relations into online virtual social networks
• Microblogging services – Weibo, Twitter
– 536 million registered users
– More than 100 million tweets are generated per day
Motivation
• Friend relationships in microblogs are different
– Users can follow someone without his or her permission (more casual)
– Users may add a friend link to someone because of similar hobbies, tags, locations or hot topics
• Major contributions
– Find critical features for friend recommendation
– Propose a similarity model by linearly combining multiple measures
– Validate the effectiveness of the proposed method on a real-world microblog dataset
Crawled Dataset
• Who is a potentially good friend for the target user in microblogs? Many features can help decide, because microblogs are rich in personal and social-relation information
[Figure annotations: where the user is from; where the user has been; high coverage]
Statistics of the Crawled Dataset
• Average similarity between pairs of friends and between pairs of strangers (based on cosine similarity)
• These features are good indicators for friend recommendation
Overall Framework of the Proposed Model
Candidate Friend Set Generation
• Select users that are friends’ friends
– Rank the users by their number of common friends
– Extract the top k users into f_r(u)
• Select the k most popular users in microblogs
– Assumption: celebrities are usually good candidate friends
• Combine these two sets to form the final candidate friend set f_c(u), which has 2k users
$$f_r(u) = \bigcup_{i=1}^{n} f(u_i) \setminus f(u)$$
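The candidate generation steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `candidate_friends`, `friends` and `popularity` are hypothetical, popularity is taken to be any numeric score such as a follower count, and users who are already friends are excluded from both lists.

```python
from collections import Counter


def candidate_friends(user, friends, popularity, k):
    """Return the candidate friend list for `user` (at most 2k users).

    friends:    dict mapping a user to the set of users they follow (assumed layout).
    popularity: dict mapping a user to a numeric popularity score (assumed layout).
    """
    current = friends.get(user, set())

    # Count how many of the user's friends link to each friend-of-friend.
    common = Counter()
    for friend in current:
        for fof in friends.get(friend, set()):
            if fof != user and fof not in current:
                common[fof] += 1

    # Top-k friends-of-friends by number of common friends (ties broken by name).
    by_common = sorted(common, key=lambda c: (-common[c], c))[:k]

    # Top-k most popular users who are not already friends (assumption).
    by_popularity = sorted(
        (p for p in popularity if p != user and p not in current),
        key=lambda p: (-popularity[p], p),
    )[:k]

    # Merge the two lists, keeping order and dropping duplicates.
    seen, result = set(), []
    for candidate in by_common + by_popularity:
        if candidate not in seen:
            seen.add(candidate)
            result.append(candidate)
    return result


friends = {"u": {"a", "b"}, "a": {"c", "d"}, "b": {"c", "e"}}
popularity = {"star": 1000, "idol": 900, "c": 50, "a": 10}
print(candidate_friends("u", friends, popularity, k=2))
```

If a popular user is also a friend's friend, the merged list is shorter than 2k; how the original system handles that overlap is not stated on the slide.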
User Tag Similarity
• Challenges
– User tag vectors are very sparse
– Many OOV words for WordNet
• Build tag tree
– Hierarchical clustering
– Recalculate the similarity based on the tree
[Figure: example tag tree built by hierarchical clustering. Leaves {tag1} to {tag6}; internal nodes {tag2,tag3}, {tag5,tag6}, {tag1,tag2,tag3} and {tag1,tag2,tag3,tag4}; root {tag1,tag2,tag3,tag4,tag5,tag6} at depth 0; levels run from depth 0 to depth 4]
User Tag Similarity
• Similarity between two tags
$$sim_{tt}(t_a, t_b) = \frac{1}{\sum_{k,\,k+1 \in SP(a,b)} \dfrac{2}{d(k) + d(k+1)}}$$
• Similarity between two users
$$sim_{ts}(u_i, u_j) = \frac{1}{|T(u_i)|\,|T(u_j)|} \sum_{t_a \in T(u_i)} \sum_{t_b \in T(u_j)} sim_{tt}(t_a, t_b)$$
[Figure: example tag tree, from the root {tag1,tag2,tag3,tag4,tag5,tag6} at depth 0 down to depth 4]
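A small Python sketch may make the tree-based tag similarity easier to follow. It rests on several assumptions that the slide does not spell out: SP(a,b) is read as the shortest path between the two tags in the tree, d(k) as the depth of node k (root at depth 0), each edge on the path contributes 2/(d(k)+d(k+1)), and the tag similarity is the reciprocal of the summed contributions. The `parent` dictionary layout and the `self_sim` value used for identical tags are hypothetical choices made only for this illustration.

```python
from itertools import product


def path_to_root(node, parent):
    """Nodes from `node` up to the root, inclusive."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path


def depth(node, parent):
    """Number of edges between `node` and the root (root has depth 0)."""
    return len(path_to_root(node, parent)) - 1


def tag_similarity(a, b, parent, self_sim=1.0):
    """Reciprocal of the depth-weighted length of the tree path between a and b."""
    if a == b:
        return self_sim  # assumption: the slide does not cover identical tags
    up_a, up_b = path_to_root(a, parent), path_to_root(b, parent)
    above_a = set(up_a)
    # Lowest common ancestor: first node above b that is also above a.
    lca = next(n for n in up_b if n in above_a)
    # Path a -> ... -> lca -> ... -> b.
    path = up_a[: up_a.index(lca) + 1] + up_b[: up_b.index(lca)][::-1]
    distance = sum(
        2.0 / (depth(k, parent) + depth(k_next, parent))
        for k, k_next in zip(path, path[1:])
    )
    return 1.0 / distance


def user_tag_similarity(tags_i, tags_j, parent, self_sim=1.0):
    """Average tag similarity over all tag pairs of the two users."""
    if not tags_i or not tags_j:
        return 0.0
    total = sum(
        tag_similarity(a, b, parent, self_sim) for a, b in product(tags_i, tags_j)
    )
    return total / (len(tags_i) * len(tags_j))


# Toy tree (not the one on the slide): root -> {n12, t3}, n12 -> {t1, t2}.
parent = {"root": None, "n12": "root", "t3": "root", "t1": "n12", "t2": "n12"}
print(tag_similarity("t1", "t2", parent))
print(user_tag_similarity({"t1"}, {"t2", "t3"}, parent))
```

Under this reading, edges deeper in the tree cost less, so tags that merge late in the clustering (close to the root) end up far apart and tags that share a deep ancestor end up close.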
User Geography Similarity
• Location similarity
– sim_ct(u_i, u_j) = 1 if the two users have the same location
– sim_ct(u_i, u_j) = 0 if the two users do not have the same location
• Check-in similarity
– Divide the check-in information into 12 categories
– Represent each user's check-ins as a 12-dimensional vector
– chk(u) = {cp_1, cp_2, …, cp_12}
– sim_chk(u_i, u_j) = cos(chk(u_i), chk(u_j))
• Geography similarity
$$sim_{loc}(u_i, u_j) = \gamma \cdot sim_{ct}(u_i, u_j) + (1 - \gamma) \cdot sim_{chk}(u_i, u_j)$$
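As a rough illustration of the geography measure, the sketch below blends an exact-match location score with the cosine similarity of two 12-dimensional check-in vectors. The function names and the mixing weight name `gamma` are assumptions made for this example; the weight is a tunable parameter whose value is not given here.

```python
import math


def cosine(x, y):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


def geography_similarity(loc_i, loc_j, chk_i, chk_j, gamma=0.5):
    """Weighted blend of location match and check-in cosine similarity.

    loc_*: profile locations (any comparable values, e.g. city names).
    chk_*: 12-dimensional check-in count vectors, one entry per category.
    gamma: weight of the location match, assumed to lie in [0, 1].
    """
    sim_ct = 1.0 if loc_i == loc_j else 0.0
    sim_chk = cosine(chk_i, chk_j)
    return gamma * sim_ct + (1.0 - gamma) * sim_chk


# Two users from the same city with disjoint check-in categories.
chk_a = [3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
chk_b = [0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(geography_similarity("Shenyang", "Shenyang", chk_a, chk_b, gamma=0.3))
```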
User Hot Topic Similarity
• The hot-topic discussions that a user takes part in can reflect his or her interests and hobbies
• The more hot topics two users have discussed in common, the higher their similarity
$$sim_{tp}(u_i, u_j) = Jaccard(TP(u_i), TP(u_j)) = \frac{|TP(u_i) \cap TP(u_j)|}{|TP(u_i) \cup TP(u_j)|}$$
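The Jaccard measure over hot-topic sets is short enough to sketch directly. The function name is hypothetical, and returning 0.0 when neither user has joined any hot topic is an assumption, since that case is not covered on the slide.

```python
def hot_topic_similarity(topics_i, topics_j):
    """Jaccard similarity between the hot-topic sets of two users."""
    a, b = set(topics_i), set(topics_j)
    union = a | b
    if not union:
        return 0.0  # assumption: no topics on either side means no evidence
    return len(a & b) / len(union)


# Two shared topics out of four distinct topics overall.
print(hot_topic_similarity({"worldcup", "nlpcc", "movie"}, {"nlpcc", "movie", "music"}))
```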
Unified User Similarity
• Tag, Geography, Hot topic information
• Rank the users in the candidate friend set by sim(u, u_i)
$$sim(u, u_i) = \alpha \cdot sim_{ts}(u, u_i) + \beta \cdot sim_{loc}(u, u_i) + (1 - \alpha - \beta) \cdot sim_{tp}(u, u_i)$$
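The ranking step can be sketched as below, assuming the three per-candidate scores (tag, geography, hot topic) have already been computed. The weight names `alpha` and `beta`, the constraint that the three weights sum to 1, and the function names are assumptions made for this illustration.

```python
def unified_similarity(sim_ts, sim_loc, sim_tp, alpha, beta):
    """Linear combination of tag, geography and hot-topic similarity."""
    if alpha < 0 or beta < 0 or alpha + beta > 1:
        raise ValueError("alpha and beta must be non-negative and sum to at most 1")
    return alpha * sim_ts + beta * sim_loc + (1.0 - alpha - beta) * sim_tp


def rank_candidates(scores, alpha, beta, top_n):
    """Return the top_n candidates by unified similarity.

    scores: dict mapping a candidate to a (sim_ts, sim_loc, sim_tp) tuple.
    """
    ranked = sorted(
        scores,
        key=lambda c: unified_similarity(*scores[c], alpha, beta),
        reverse=True,
    )
    return ranked[:top_n]


scores = {
    "a": (0.9, 0.0, 0.1),
    "b": (0.2, 1.0, 0.5),
    "c": (0.1, 0.0, 0.0),
}
print(rank_candidates(scores, alpha=0.4, beta=0.3, top_n=2))
```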
Experiment Setup
• We conduct 5-fold cross-validation on the crawled dataset
• We randomly partition a user’s current friends and non-friends into 5 groups each
• We randomly pair one group of friends with one group of non-friends to form a subset of the crawled data
• In each run, four of the five subsets are used for training and the remaining subset is used for testing
• Precision, Recall and F-Measure are used for evaluation
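The setup above can be approximated with the following sketch, which builds the five friend/non-friend subsets and computes the three evaluation measures for one recommendation list. The function names, the fixed random seed and the round-robin split are assumptions made for this example; the exact procedure used in the experiments may differ.

```python
import random


def make_folds(friends, non_friends, n_folds=5, seed=0):
    """Split friends and non-friends into n_folds groups each and pair them randomly."""
    rng = random.Random(seed)
    friends, non_friends = list(friends), list(non_friends)
    rng.shuffle(friends)
    rng.shuffle(non_friends)
    friend_groups = [friends[i::n_folds] for i in range(n_folds)]
    stranger_groups = [non_friends[i::n_folds] for i in range(n_folds)]
    rng.shuffle(stranger_groups)  # random pairing of friend and non-friend groups
    return list(zip(friend_groups, stranger_groups))


def precision_recall_f1(recommended, true_friends):
    """Precision, recall and F-measure of a recommendation list."""
    recommended, true_friends = set(recommended), set(true_friends)
    hits = len(recommended & true_friends)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(true_friends) if true_friends else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


folds = make_folds(range(10), range(100, 120))
for test_index, (test_friends, test_strangers) in enumerate(folds):
    # The other four subsets would be used for training (e.g. weight tuning).
    print(test_index, len(test_friends), len(test_strangers))

print(precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"]))
```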
Parameter Tuning for Location Similarity
$$sim_{loc}(u_i, u_j) = \gamma \cdot sim_{ct}(u_i, u_j) + (1 - \gamma) \cdot sim_{chk}(u_i, u_j)$$
Parameter Tuning for Friend Recommendation
$$sim(u, u_i) = \alpha \cdot sim_{ts}(u, u_i) + \beta \cdot sim_{loc}(u, u_i) + (1 - \alpha - \beta) \cdot sim_{tp}(u, u_i)$$
Recommendation Results
Conclusion and Future Work
• Friend relationships in microblogs are quite different from those in traditional social media
– More casual, unidirectional friendship
– Tags, locations, check-ins and hot topics are good indicators for friend recommendation
• A linear combination model of multiple measures is proposed for calculating user similarity in microblogs
• Future work
– More measures, such as time factors
– New similarity measurements
Thank you for your attention!