finding your friends and following them to where you are #wsdm2012

Finding Your Friends and Following Them to Where You Are. Adam Sadilek, Henry Kautz, Jeffrey P. Bigham, University of Rochester, New York, USA. Presenter: Yoh Okuno #wsdm2012

Upload: yoh-okuno

Post on 11-Jun-2015


DESCRIPTION

Presented by Yoh Okuno, WSDM 2012 reading

TRANSCRIPT

Page 1: Finding Your Friends and Following Them to Where You Are #wsdm2012

Finding  Your  Friends  and  Following  Them  to  Where  You  Are

Adam  Sadilek,  Henry  Kautz,  Jeffrey  P.  Bigham    

University  of  Rochester,  New  York,  USA  

Presenter:  Yoh  Okuno  #wsdm2012  

Page 2: Finding Your Friends and Following Them to Where You Are #wsdm2012

About Presenter

• Name: Yoh Okuno

• R&D Engineer at Yahoo! Japan

• Interest: NLP (Natural Language Processing), Machine Learning, and Data Mining

• Skills: C/C++, Java, Python, and Hadoop

• Website: http://yoh.okuno.name/

Page 3: Finding Your Friends and Following Them to Where You Are #wsdm2012

Overview

1.  Introduction  

2.  Friendship  Prediction  

3.  Location  Prediction  

4.  Evaluation  

5.  Conclusion  

Page 4: Finding Your Friends and Following Them to Where You Are #wsdm2012

1.  Introduction  

Page 5: Finding Your Friends and Following Them to Where You Are #wsdm2012

“Check-in” Services or Posts with Geo-tags

Page 6: Finding Your Friends and Following Them to Where You Are #wsdm2012

Figure 1: Tweets with Geo-tags in New York City

http://cs.rochester.edu/u/sadilek/research  

Page 7: Finding Your Friends and Following Them to Where You Are #wsdm2012

Summary: Predicting Friendships and Locations

• Tasks: friendship prediction and location prediction

• Approach: model the interaction between friendships and locations

• Data: real-world Twitter dataset

• Problem: locations of private users are not provided

• Result: 90% of private locations are revealed

Page 8: Finding Your Friends and Following Them to Where You Are #wsdm2012

Data: Crawled Twitter Search API for 1 Month

• Focus on users who have >100 geo-tagged tweets

Page 9: Finding Your Friends and Following Them to Where You Are #wsdm2012

FLAP:  Friendship  +  Location  Analysis  and  Prediction

Crawler

Visualizer

Learning  and  Inference

Page 10: Finding Your Friends and Following Them to Where You Are #wsdm2012

2.  Friendship  Prediction  Task  

Page 11: Finding Your Friends and Following Them to Where You Are #wsdm2012

Similarity Features: Text, Location, and Graph

1. Text: inner product of word counts, excluding stop words

2. Co-location: overlapping time spent in the same place

3. Graph: # of common friends (normalized)
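The three features above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the stop-word list, the `(place_id, start, end)` visit format, and the choice to normalize common friends by the smaller friend set are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical, tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "in", "is", "at"}


def word_counts(tweets):
    """Count non-stop-word tokens over all of a user's tweets."""
    return Counter(
        w for t in tweets for w in t.lower().split() if w not in STOP_WORDS
    )


def text_similarity(tweets_u, tweets_v):
    """Inner product of the two users' word-count vectors."""
    cu, cv = word_counts(tweets_u), word_counts(tweets_v)
    return sum(cu[w] * cv[w] for w in cu.keys() & cv.keys())


def colocation_overlap(visits_u, visits_v):
    """Total time both users spend at the same place.

    Each visit is (place_id, start, end) with numeric timestamps
    (an assumed format for this sketch).
    """
    total = 0
    for pu, su, eu in visits_u:
        for pv, sv, ev in visits_v:
            if pu == pv:
                total += max(0, min(eu, ev) - max(su, sv))
    return total


def common_friends(friends_u, friends_v):
    """Shared friends, normalized by the smaller friend set (assumption)."""
    smaller = min(len(friends_u), len(friends_v))
    if smaller == 0:
        return 0.0
    return len(friends_u & friends_v) / smaller
```

For example, `colocation_overlap([("cafe", 0, 10)], [("cafe", 5, 20)])` counts the five time units during which both users were at the cafe.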

Page 12: Finding Your Friends and Following Them to Where You Are #wsdm2012

Learning: Regression Decision Tree (DT)

• Used a DT whose output is a probability

• These 3 features had the maximum information gain for the DT

• Other features, including the Jaccard coefficient, were useless in this case

• LSH (locality-sensitive hashing) speeds up the O(n^2) operation
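To make "information gain" and "output is a probability" concrete, here is a small sketch of how a single threshold split on one feature could be scored, and how a leaf could report a probability. It is a generic textbook formulation with binary labels (1 = friends, 0 = not friends), not the specific tree learner used in the paper; the function names are made up for this example.

```python
import math


def entropy(labels):
    """Binary entropy (in bits) of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting on `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return (
        entropy(labels)
        - len(left) / n * entropy(left)
        - len(right) / n * entropy(right)
    )


def leaf_probability(labels):
    """A leaf's output: the fraction of positive examples that reached it."""
    return sum(labels) / len(labels) if labels else 0.0
```

A feature that separates the classes perfectly gets a gain of 1 bit on balanced data, while a feature unrelated to the label gets a gain of 0, which is the sense in which some candidate features were "useless".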

Page 13: Finding Your Friends and Following Them to Where You Are #wsdm2012

3.  Location  Prediction  Task  

Page 14: Finding Your Friends and Following Them to Where You Are #wsdm2012

Figure 3: Dynamic Bayesian Network (DBN)

• People move between tweets t and t+1

– u_t: location of user u at tweet t

– fi_t: location of friend i at tweet t

– td_t: time of day at tweet t

– w_t: whether tweet t falls on a work day

• All variables are discrete
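As an illustration of how the two context variables could be made discrete, the sketch below derives them from a tweet timestamp. The hourly bin size and the Monday-to-Friday definition of a work day (ignoring holidays) are assumptions for this example, and `context_variables` is a hypothetical helper name.

```python
from datetime import datetime


def context_variables(timestamp):
    """Return (td, w) for one tweet.

    td: time of day as an hourly bin 0..23 (bin size is an assumption).
    w:  True on Monday-Friday, False on weekends (holidays are ignored).
    """
    td = timestamp.hour
    w = timestamp.weekday() < 5  # Monday is 0, Sunday is 6
    return td, w


# A tweet sent on a Wednesday afternoon.
print(context_variables(datetime(2012, 2, 8, 14, 30)))
```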

Page 15: Finding Your Friends and Following Them to Where You Are #wsdm2012

Learning: Both Supervised and Unsupervised

• Supervised learning for each geo-active user

• Unsupervised: simulate “virtual” private users

– EM algorithm with forward-backward

– Simulated annealing to avoid local optima
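The forward-backward step used inside EM can be sketched on a plain discrete hidden Markov model. This is a simplified stand-in: it has one hidden chain and one observation per step, whereas the DBN in the slides conditions on several observed variables. States and observations are abstract integer indices here; in the slides' setting the hidden state would correspond to the private user's location.

```python
def forward_backward(init, trans, emit, observations):
    """Posterior over hidden states at every step of a discrete HMM.

    init[s]      : P(state_0 = s)
    trans[r][s]  : P(state_{t+1} = s | state_t = r)
    emit[s][o]   : P(observation = o | state = s)
    Assumes the observation sequence has non-zero probability.
    """
    n = len(init)
    steps = len(observations)

    # Forward pass, normalized at each step to avoid underflow.
    alpha, prev = [], None
    for t, o in enumerate(observations):
        if t == 0:
            a = [init[s] * emit[s][o] for s in range(n)]
        else:
            a = [
                emit[s][o] * sum(prev[r] * trans[r][s] for r in range(n))
                for s in range(n)
            ]
        z = sum(a)
        prev = [x / z for x in a]
        alpha.append(prev)

    # Backward pass, also normalized.
    beta = [[1.0] * n for _ in range(steps)]
    for t in range(steps - 2, -1, -1):
        o = observations[t + 1]
        b = [
            sum(trans[s][r] * emit[r][o] * beta[t + 1][r] for r in range(n))
            for s in range(n)
        ]
        z = sum(b)
        beta[t] = [x / z for x in b]

    # Combine both passes into per-step posteriors.
    posteriors = []
    for a, b in zip(alpha, beta):
        g = [x * y for x, y in zip(a, b)]
        z = sum(g)
        posteriors.append([x / z for x in g])
    return posteriors
```

In an EM loop these posteriors would serve as the expected state assignments (E-step) from which the transition and emission tables are re-estimated (M-step).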

Page 16: Finding Your Friends and Following Them to Where You Are #wsdm2012

4.  Evaluation  

Page 17: Finding Your Friends and Following Them to Where You Are #wsdm2012

Evaluation for Friendship Prediction Task

• Evaluation settings

– Reconstructed friendship graphs via models

– Randomly selected 0% to 50% of the edges as given input

• Evaluation results

– FLAP outperforms previous work

– FLAP works well even if no edges were given

• Note: texts and locations are provided as usual

Page 18: Finding Your Friends and Following Them to Where You Are #wsdm2012

Figure  4:  Averaged  ROC  Curve
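For readers unfamiliar with ROC curves, the sketch below shows how one such curve and its area could be computed from predicted friendship probabilities and true labels. It is a generic formulation, not the paper's evaluation code, and it assumes both classes are present in the labels; averaging over several runs, as in the figure, is left out.

```python
def roc_points(scores, labels):
    """(false positive rate, true positive rate) points of an ROC curve.

    scores: predicted probabilities; labels: 1 for a true edge, else 0.
    Assumes at least one positive and one negative label.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        s = ranked[i][0]
        # Process tied scores together so the curve does not depend on order.
        while i < len(ranked) and ranked[i][0] == s:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )
```

A perfect ranking of candidate edges gives an area of 1, and a random ranking gives about 0.5.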

Page 19: Finding Your Friends and Following Them to Where You Are #wsdm2012

Evaluation  for  Location  Prediction  Task

• Evaluation settings

– Data: first 20 days for learning / later 6 days for test

– Varied # of friends that the system considers

• Evaluation results

– Supervised: 77% accuracy with only 2 friends

– Unsupervised: 57% accuracy with 9 friends

– “Locations can be inferred even for private accounts”
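The chronological split and the accuracy metric described above can be sketched as follows. The `(day_index, location)` record format and the function names are assumptions for this example; the actual prediction model is out of scope here.

```python
def split_by_day(records, train_days=20):
    """Split records chronologically.

    records: iterable of (day_index, location), with day_index starting at 0.
    The first `train_days` days go to training, the rest to testing.
    """
    train = [r for r in records if r[0] < train_days]
    test = [r for r in records if r[0] >= train_days]
    return train, test


def accuracy(predicted, actual):
    """Fraction of test points whose predicted location is correct."""
    if not actual:
        return 0.0
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)
```

Splitting by time rather than at random keeps the test period strictly after the training period, so the model never sees the future it is asked to predict.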

Page 20: Finding Your Friends and Following Them to Where You Are #wsdm2012

Table  6:  Accuracy  for  Location  Prediction  Task

Page 21: Finding Your Friends and Following Them to Where You Are #wsdm2012

Conclusion

• For friendship prediction task:

– Combined text, location and graph features

– Reconstructed friendship graph with no seeds

• For location prediction task:

– Exploited friends’ locations to infer a user’s location

– Unsupervised result shows “private is not safe”

Page 22: Finding Your Friends and Following Them to Where You Are #wsdm2012

Future  Work

•  Text  features  (NER)  for  location  prediction  

•  Joint  model  of  locations  and  friendships  

• Evaluate semi-supervised learning (hopefully)

•  Consider  the  privacy  issue  as  a  tradeoff  

Page 23: Finding Your Friends and Following Them to Where You Are #wsdm2012

Any  Questions?

Page 24: Finding Your Friends and Following Them to Where You Are #wsdm2012

More  Precisely:  Belief  Propagation