![Page 1: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/1.jpg)
Modeling Information Seeking Behavior in Social Media
Eugene Agichtein
Intelligent Information Access Lab (IRLab)
![Page 2: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/2.jpg)
Eugene Agichtein, Emory University, IR Lab
Intelligent Information Access Lab (IRLab)
Qi Guo (3rd year PhD)
Ablimit Aji (2nd year PhD)
Yandong Liu (2nd year PhD)
1st year graduate students: Julia Kiseleva, Dmitry Lagun, Qiaoling Liu, Wang Yu
• Modeling information seeking behavior
• Web search and social media search
• Text and data mining for medical informatics and public health
In collaboration with:
- Beth Buffalo (Neurology)
- Charlie Clarke (Waterloo)
- Ernie Garcia (Radiology)
- Phil Wolff (Psychology)
- Hongyuan Zha (GaTech)
![Page 3: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/3.jpg)
Online Behavior and Interactions
Information sharing: blogs, forums, discussions
Search logs: queries, clicks
Client-side behavior: Gaze tracking, mouse movement, scrolling
![Page 4: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/4.jpg)
Research Overview
Information sharing
Health Informatics
Cognitive Diagnostics
Intelligent search
Discover Models of Behavior (machine learning/data mining)
![Page 5: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/5.jpg)
Key Challenges for Web Search
• Query interpretation (infer intent)
• Ranking (high dimensionality)
• Evaluation (system improvement)
• Result presentation (information visualization)
![Page 6: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/6.jpg)
Contextualized Intent Inference
• SERP text
• Mouse trajectory, hovering/dynamics
• Scrolling
• Clicks
![Page 7: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/7.jpg)
Research Intent
![Page 8: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/8.jpg)
Purchase Intent
![Page 9: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/9.jpg)
Relationship between behavior and intent?
• Search intent is contextualized within a search session
• Implication 1: model session-level state
• Implication 2: improve detection based on client-side interactions
![Page 10: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/10.jpg)
Model: Linear Chain CRF
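For reference, a linear-chain CRF over a search session is usually written as below. This is the standard textbook form, not a formula taken from the slides; reading $y_t$ as the intent label of the $t$-th search in a session and $x$ as the observed SERP text and client-side interaction features is an assumption based on the preceding slides.

```latex
p(y_{1:T} \mid x_{1:T}) =
  \frac{1}{Z(x_{1:T})}
  \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x_{1:T}, t) \Big),
\qquad
Z(x_{1:T}) = \sum_{y'_{1:T}} \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x_{1:T}, t) \Big)
```

Here the $f_k$ are feature functions, the $\lambda_k$ are their learned weights, and $Z$ normalizes over all label sequences. The dependence on $y_{t-1}$ is what lets the model carry session-level state from one search to the next.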
![Page 11: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/11.jpg)
Results: Ad Click Prediction
• 200%+ precision improvement (within mission)
![Page 12: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/12.jpg)
![Page 13: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/13.jpg)
Finding Information Online (Revisited)
Next generation of search: Algorithmically-mediated information exchange
CQA (collaborative question answering):
• Realistic information exchange
• Searching archives
• Train NLP, IR, QA systems
• Study of social behavior, norms
Content quality, asker satisfaction
Current and future work
![Page 14: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/14.jpg)
Goal: Hybrid Human-Powered Search
![Page 15: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/15.jpg)
Talk Outline
Overview of the Emory IR Lab
Intent-centric Web Search
Classifying intent of a query
Contextualized search intent detection
![Page 16: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/16.jpg)
![Page 17: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/17.jpg)
(Text) Social Media Today
Published: 4Gb/day
Social Media: 10Gb/day
Technorati+Blogpulse: 120M blogs, 2M posts/day
Twitter: since 11/07: 2M users, 3M msgs/day
Facebook/Myspace: 200-300M users, Avg 19 m/day
Yahoo Answers: 90M users, 20M questions, 400M answers
[Data from Andrew Tomkins, SSM2008 Keynote]
Yes, we could read your blog. Or, you could tell us about your day
![Page 18: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/18.jpg)
![Page 19: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/19.jpg)
Total time: 7-10 minutes, active “work”
![Page 20: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/20.jpg)
Someone must know this…
![Page 21: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/21.jpg)
+1 minute
![Page 22: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/22.jpg)
+7 hours: perfect answer
![Page 23: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/23.jpg)
Update (2/15/2009)
![Page 24: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/24.jpg)
http://answers.yahoo.com/question/index;_ylt=3?qid=20071008115118AAh1HdO
![Page 25: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/25.jpg)
![Page 26: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/26.jpg)
![Page 27: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/27.jpg)
(Some) Related Work
• Adamic et al., WWW 2007, WWW 2008:
  – Expertise sharing, network structure
• Elsas et al., SIGIR 2008:
  – Blog search
• Glance et al.:
  – Blog Pulse, popularity, information sharing
• Harper et al., CHI 2008, 2009:
  – Answer quality across multiple CQA sites
• Kraut et al.:
  – Community participation
• Kumar et al., WWW 2004, KDD 2008, …:
  – Information diffusion in blogspace, network evolution
SIGIR 2009 Workshop on Searching Social Media
http://ir.mathcs.emory.edu/SSM2009/
![Page 28: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/28.jpg)
Finding High Quality Content in SM
• Well-written
• Interesting
• Relevant (answer)
• Factually correct
• Popular?
• Provocative?
• Useful?
As judged by professional editors
E. Agichtein, C. Castillo, D. Donato, A. Gionis, and G. Mishne, Finding High Quality Content in Social Media, in WSDM 2008
![Page 29: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/29.jpg)
Social Media Content Quality
E. Agichtein, C. Castillo, D. Donato, A. Gionis, G. Mishne, Finding High Quality Content in Social Media, WSDM 2008
![Page 30: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/30.jpg)
![Page 31: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/31.jpg)
How do Question and Answer Quality relate?
![Page 32: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/32.jpg)
![Page 33: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/33.jpg)
![Page 34: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/34.jpg)
![Page 35: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/35.jpg)
![Page 36: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/36.jpg)
Community
![Page 37: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/37.jpg)
Link Analysis for Authority Estimation
[Figure: graph linking askers (User 1 to User 6) through Questions 1 to 3 and Answers 1 to 6 to answerers (User 1 to User 6)]
Hub (asker): $H(i) = \sum_{j=0}^{K} A(j)$
Authority (answerer): $A(j) = \sum_{i=0}^{M} H(i)$
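The hub and authority updates on this slide can be sketched as a short iteration. This is a minimal illustration, not the authors' implementation: the `(asker, answerer)` edge list, the L2 normalization step, and the fixed iteration count are all assumptions made for the example.

```python
def hits(edges, iterations=50):
    """Run HITS on an asker -> answerer graph.

    edges: list of (asker, answerer) pairs, one per answer given
           (hypothetical input format chosen for this sketch).
    Returns (hub, authority) dictionaries keyed by user id.
    """
    askers = {a for a, _ in edges}
    answerers = {b for _, b in edges}
    hub = {a: 1.0 for a in askers}
    auth = {b: 1.0 for b in answerers}

    for _ in range(iterations):
        # A(j): sum of hub scores of the askers that user j answered.
        auth = {b: 0.0 for b in answerers}
        for a, b in edges:
            auth[b] += hub[a]
        # H(i): sum of authority scores of the users who answered asker i.
        hub = {a: 0.0 for a in askers}
        for a, b in edges:
            hub[a] += auth[b]
        # Normalize so the scores stay bounded (assumed L2 normalization).
        a_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {k: v / a_norm for k, v in auth.items()}
        hub = {k: v / h_norm for k, v in hub.items()}
    return hub, auth


# Toy graph: u4 answers two askers, u5 answers one.
hub, auth = hits([("u1", "u4"), ("u2", "u4"), ("u3", "u5")])
```

In the toy graph, `u4` ends up with the higher authority score because two askers point to it.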
![Page 38: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/38.jpg)
Qualitative Observations
HITS effective
HITS ineffective
![Page 39: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/39.jpg)
Random forest classifier
![Page 40: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/40.jpg)
Result 1: Identifying High Quality Questions
![Page 41: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/41.jpg)
Top Features for Question Classification
• Asker popularity (“stars”)
• Punctuation density
• Question category
• Page views
• KL Divergence from reference LM
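One way to compute a "KL divergence from a reference language model" feature is sketched below. The slide does not say which reference corpus, tokenization, or smoothing was used, so the unigram model and add-alpha smoothing here are assumptions for illustration only.

```python
import math
from collections import Counter


def kl_divergence(text_tokens, reference_tokens, alpha=1.0):
    """KL(P_text || P_reference) over unigram models with add-alpha smoothing.

    Both the unigram assumption and the smoothing constant are
    illustrative choices, not taken from the slides.
    """
    vocab = set(text_tokens) | set(reference_tokens)
    p_counts = Counter(text_tokens)
    q_counts = Counter(reference_tokens)
    p_total = len(text_tokens) + alpha * len(vocab)
    q_total = len(reference_tokens) + alpha * len(vocab)

    kl = 0.0
    for word in vocab:
        p = (p_counts[word] + alpha) / p_total
        q = (q_counts[word] + alpha) / q_total
        kl += p * math.log(p / q)
    return kl


question = "how do i fix my bike chain".split()
reference = "how do i cook rice how do i fix a flat tire".split()
score = kl_divergence(question, reference)
```

A larger score means the question's word distribution is further from the reference text.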
![Page 42: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/42.jpg)
Identifying High Quality Answers
![Page 43: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/43.jpg)
Top Features for Answer Classification
• Answer length
• Community ratings
• Answerer reputation
• Word overlap
• Kincaid readability score
![Page 44: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/44.jpg)
Finding Information Online (Revisited)
• Next generation of search:
  • human-machine-human
• CQA: a case study in complex IR
  • Content quality
  • Asker satisfaction
  • Understanding the interactions
![Page 45: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/45.jpg)
Dimensions of “Quality”
• Well-written
• Interesting
• Relevant (answer)
• Factually correct
• Popular?
• Timely?
• Provocative?
• Useful?
As judged by the asker (or community)
![Page 46: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/46.jpg)
Are Editor Labels “Meaningful” for CGC?
• Information seeking process: want to find useful information about topic with incomplete knowledge
  – N. Belkin: “Anomalous states of knowledge”
• Want to model directly if user found satisfactory information
• Specific (amenable) case: CQA
![Page 47: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/47.jpg)
Yahoo! Answers: The Good News
• Active community of millions of users in many countries and languages
• Effective for subjective information needs
  – Great forum for socialization/chat
• Can be invaluable for hard-to-find information not available on the web
![Page 48: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/48.jpg)
![Page 49: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/49.jpg)
Yahoo! Answers: The Bad News
May have to wait a long time to get a satisfactory answer
May never obtain a satisfying answer
[Chart: time to close a question (hours) for ten categories; y-axis from 0 to 40]
1. FIFA World Cup
2. Optical
3. Poetry
4. Football (American)
5. Soccer
6. Medicine
7. Winter Sports
8. Special Education
9. General Health Care
10. Outdoor Recreation
![Page 50: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/50.jpg)
Predicting Asker Satisfaction
Given a question submitted by an asker in CQA, predict whether the user will be satisfied with the answers contributed by the community.
– “Satisfied”:
  • The asker has closed the question AND
  • Selected the best answer AND
  • Rated best answer >= 3 “stars” (# not important)
– Else, “Unsatisfied”
Yandong Liu, Jiang Bian
Y. Liu, J. Bian, and E. Agichtein, in SIGIR 2008
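The labeling rule above translates directly into a small predicate. The record type and field names below are hypothetical, introduced only to make the rule concrete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuestionThread:
    """Hypothetical record for one CQA question (field names are assumed)."""
    closed_by_asker: bool
    best_answer_selected: bool
    best_answer_rating: Optional[int]  # "stars" given by the asker, if any


def is_satisfied(q: QuestionThread) -> bool:
    """Apply the slide's definition: closed AND best answer chosen AND rating >= 3."""
    return bool(
        q.closed_by_asker
        and q.best_answer_selected
        and q.best_answer_rating is not None
        and q.best_answer_rating >= 3
    )


label = is_satisfied(QuestionThread(True, True, 4))
```

Any thread that fails one of the three conditions, including a missing rating, is labeled "Unsatisfied".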
![Page 51: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/51.jpg)
ASP: Asker Satisfaction Prediction
[Diagram: inputs Text, Category, Answerer History, Asker History, Answer, Question, Wikipedia, News; a Classifier; outputs “asker is satisfied” / “asker is not satisfied”]
![Page 52: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/52.jpg)
Experimental Setup: Data
| Questions | Answers | Askers | Categories | % Satisfied |
|---|---|---|---|---|
| 216,170 | 1,963,615 | 158,515 | 100 | 50.7% |
Crawled from Yahoo! Answers in early 2008
“Anonymized” dataset available at: http://ir.mathcs.emory.edu/shared/
1/2009: Yahoo! Webscope: “Comprehensive” Answers dataset: ~5M questions & answers.
![Page 53: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/53.jpg)
Satisfaction by Topic
| Topic | Questions | Answers | A per Q | Satisfied | Asker rating | Time to close by asker |
|---|---|---|---|---|---|---|
| 2006 FIFA World Cup | 1194 | 35,659 | 29.86 | 55.4% | 2.63 | 47 minutes |
| Mental Health | 151 | 1159 | 7.68 | 70.9% | 4.30 | 1.5 days |
| Mathematics | 651 | 2329 | 3.58 | 44.5% | 4.48 | 33 minutes |
| Diet & Fitness | 450 | 2436 | 5.41 | 68.4% | 4.30 | 1.5 days |
![Page 54: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/54.jpg)
Satisfaction Prediction: Human Judges
• Truth: asker’s rating
• A random sample of 130 questions
• Researchers
  – Agreement: 0.82, F1: 0.45 (F1 = 2P*R/(P+R))
• Amazon Mechanical Turk
  – Five workers per question.
  – Agreement: 0.9, F1: 0.61
  – Best when at least 4 out of 5 raters agree
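The F1 measure quoted on this slide, 2P*R/(P+R), can be computed for the "satisfied" class as in the sketch below. The boolean label encoding and the function name are assumptions for illustration.

```python
def f1_score(predicted, actual, positive=True):
    """F1 = 2PR / (P + R) for the given positive class."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == positive and a == positive)
    fp = sum(1 for p, a in pairs if p == positive and a != positive)
    fn = sum(1 for p, a in pairs if p != positive and a == positive)
    if tp == 0:
        return 0.0  # precision or recall is zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Judges' guesses vs. the asker's own rating for four toy questions.
score = f1_score([True, True, False, False], [True, False, True, False])
```

In the toy call, precision and recall are both 0.5, so F1 is 0.5.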
![Page 55: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/55.jpg)
Performance: ASP vs. Humans (F1, Satisfied)
| Classifier | With Text | Without Text | Selected Features |
|---|---|---|---|
| ASP_SVM | 0.69 | 0.72 | 0.62 |
| ASP_C4.5 | 0.75 | 0.76 | 0.77 |
| ASP_RandomForest | 0.70 | 0.74 | 0.68 |
| ASP_Boosting | 0.67 | 0.67 | 0.67 |
| ASP_NB | 0.61 | 0.65 | 0.58 |

Best Human Perf: 0.61
Baseline (random): 0.66
ASP is significantly more effective than humans
Human F1 is lower than the random baseline!
![Page 56: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/56.jpg)
Top Features by Information Gain
• 0.14 Q: Askers’ previous rating
• 0.14 Q: Average past rating by asker
• 0.10 UH: Member since (interval)
• 0.05 UH: Average # answers for past Q
• 0.05 UH: Previous Q resolved for the asker
• 0.04 CA: Average asker rating for category
• 0.04 UH: Total number of answers received
• …
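The information gain scores listed above measure how much knowing a feature reduces uncertainty about the satisfied/unsatisfied label. A minimal sketch for a discrete feature follows; numeric features such as ratings would first need to be binned, and the binning used in the study is not given in the slides.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(labels) minus the expected entropy of labels within each feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: "prev_rating_high" (hypothetical feature) perfectly predicts satisfaction.
labels = [1, 1, 0, 0]
gain = information_gain(["yes", "yes", "no", "no"], labels)
```

A feature that splits the labels perfectly recovers the full label entropy, while a feature unrelated to the label scores zero.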
![Page 57: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/57.jpg)
“Offline” vs. “Online” Prediction
• Offline prediction (AFTER answers arrive)
  – All features (question, answer, asker & category)
  – F1: 0.77
• Online prediction (BEFORE question posted)
  – NO answer features
  – Only asker history and question features (stars, #comments, sum of votes…)
  – F1: 0.74
![Page 58: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/58.jpg)
Personalized Prediction of Satisfaction
Same information != same usefulness for different searchers!
Personalization vs. “Groupization”?
Y. Liu and E. Agichtein, You've Got Answers: Personalized Models for Predicting Success in Community Question Answering, ACL 2008
![Page 59: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/59.jpg)
Example Personalized Models
![Page 60: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/60.jpg)
Outline
• Next generation of search:
  • Algorithmically mediated information exchange
• CQA: a case study in complex IR
  • Content quality
  • Asker satisfaction
![Page 61: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/61.jpg)
Current Work (in Progress)
• Partially supervised models of expertise (Bian et al., WWW 2009)
• Real-time CQA
• Sentiment, temporal sensitivity analysis
• Understanding Social Media dynamics
![Page 62: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/62.jpg)
Answer Arrival
[Chart: answer arrival over the first hour; y-axis from 0 to 700000]

| Time in minutes (T) | Answer number arrived in < T |
|---|---|
| 5 | 573086 |
| 10 | 378227 |
| 15 | 146845 |
| 20 | 72260 |
| 25 | 46364 |
| 30 | 34573 |
| 35 | 27322 |
| 40 | 23194 |
| 45 | 19952 |
| 50 | 17260 |
| 55 | 15481 |
| 60 | 13985 |

First Hour (69%)
![Page 63: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/63.jpg)
Exponential Decay Model [Lerman 2007]
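The slide names an exponential decay model without giving its form. As a rough sketch only, the code below assumes the simplest version, N(t) = N0 * exp(-lambda * t), and fits it by least squares on the log of the twelve counts from the Answer Arrival slide. Pairing each count with one 5-minute tick is a reading of that chart, and the model in [Lerman 2007] may differ.

```python
import math

# Counts from the Answer Arrival slide, assumed to be one value per 5-minute tick.
MINUTES = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]
ANSWERS = [573086, 378227, 146845, 72260, 46364, 34573,
           27322, 23194, 19952, 17260, 15481, 13985]


def fit_exponential_decay(times, counts):
    """Fit log(count) = log(n0) - lam * t by ordinary least squares.

    Returns (n0, lam). Assumes all counts are positive.
    """
    logs = [math.log(c) for c in counts]
    t_mean = sum(times) / len(times)
    y_mean = sum(logs) / len(logs)
    slope = (
        sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
        / sum((t - t_mean) ** 2 for t in times)
    )
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope


n0, lam = fit_exponential_decay(MINUTES, ANSWERS)
half_life_minutes = math.log(2) / lam
```

A positive `lam` indicates that arrivals fall off over time. A single exponential is only a first approximation here, since the counts drop steeply in the first 20 minutes and then flatten.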
![Page 64: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/64.jpg)
Factors Influencing Dynamics
![Page 65: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/65.jpg)
Example: Answer Arrival | Category
![Page 66: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/66.jpg)
Subjectivity
![Page 67: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/67.jpg)
Answer, Rating Arrival
![Page 68: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/68.jpg)
Preliminary Results: Modeling SM Dynamics for Real-Time Classification
• Adapt SM dynamics models to classification, e.g.: predict ratings
feature value:
![Page 69: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/69.jpg)
Outline
• Next generation of search:
  • Algorithmically mediated information exchange
• CQA: a case study in complex IR
  • Content quality
  • Asker satisfaction
  • Understanding social media dynamics
![Page 70: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/70.jpg)
Question Urgency
Problem – a growing volume of questions competing for visibility
• Time-sensitive (urgent) questions pushed out by newer questions
• Delayed responses may become useless to seeker – wastes site resources and responders’ time
![Page 71: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/71.jpg)
Goal: Query Processing over Web and Social Systems
![Page 72: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/72.jpg)
Takeaways
Robust machine learning over behavior data: system improvements, insights into behavior
Contextualized models for NLP and text mining: system improvements, insights into interactions
Mining social media: potential for transformative impact for IR, sociology, psychology, medical informatics, public health, …
![Page 73: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/73.jpg)
References
• Modeling web search behavior [SIGIR 2006, 2007]
• Estimating content quality [WSDM 2008]
• Estimating contributor authority [CIKM 2007]
• Searching CQA archives [WWW 2008, WWW 2009]
• Inferring asker intent [EMNLP 2008]
• Predicting satisfaction [SIGIR 2008, ACL 2008, TKDE]
• Coping with spam [AIRWeb 2008]
More information, datasets, papers, slides:
http://www.mathcs.emory.edu/~eugene/
![Page 74: Modeling Information Seeking Behavior in Social Media Eugene Agichtein Intelligent Information Access Lab (IRLab)](https://reader036.vdocuments.us/reader036/viewer/2022070410/56649ea45503460f94ba8a5f/html5/thumbnails/74.jpg)
Thank you!
• Yandex (for hosting my visit)
Supported by: