
Modeling User Interactions in Web Search and Social Media

Eugene Agichtein
Intelligent Information Access Lab

Emory University

Intelligent Information Access Lab
http://ir.mathcs.emory.edu/

• Research areas:
– Information retrieval & extraction, text mining, and information integration
– User behavior modeling, social networks and interactions, social media

• People
– Walter Askew, EC '09
– Qi Guo, 2nd year Ph.D.
– Yandong Liu, 2nd year Ph.D.
– Ryan Kelly, Emory '10
– Alvin Grissom, 2nd year MS
– Abulimiti Aji, 1st year Ph.D.

And colleagues at Yahoo! Research, Microsoft Research, Emory Libraries, Psychology, Emory School of Medicine, Neuroscience, and Georgia Tech College of Computing.

• Support

User Interactions: The 3rd Dimension of the Web

• Amount exceeds web content and structure
– Published: 4Gb/day; Social Media: 10Gb/day
– Page views: 100Gb/day

[Andrew Tomkins, Yahoo! Search, 2007]

Talk Outline

• Web Search Interactions
– Click modeling
– Browsing

• Social media
– Content quality
– User satisfaction
– Ranking and Filtering

Interpreting User Interactions

• Clickthrough and subsequent browsing behavior of individual users influenced by many factors
– Relevance of a result to a query
– Visual appearance and layout
– Result presentation order
– Context, history, etc.

• General idea:
– Aggregate interactions across all users and queries
– Compute “expected” behavior for any query/page
– Recover relevance signal for a given query

Case Study: Clickthrough

[Figure: Clickthrough frequency for all queries in sample; x-axis: result position (1-10); y-axis: relative click frequency (0-1); series: All queries]

Clickthrough (query q, document d, result position p) = expected (p) + relevance (q , d)

Clickthrough for Queries with Known Position of Top Relevant Result

[Figure: Relative clickthrough for queries with known relevant results in position 1 and 3 respectively; x-axis: result position (1, 2, 3, 5, 10); y-axis: relative click frequency; series: All queries, PTR=1, PTR=3]

Higher clickthrough at top non-relevant than at top relevant document

Model Deviation from “Expected” Behavior

• Relevance component: deviation from “expected”:

Relevance(q , d)= observed - expected (p)
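A minimal sketch of the deviation computation, assuming a hypothetical click log of `(query, url, position, clicked)` tuples and taking the per-position click rate over all queries as expected(p). The log format and function names are illustrative assumptions, not the original implementation.

```python
from collections import defaultdict


def expected_click_frequency(click_log):
    """Per-position click rate aggregated over all queries and results."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for _query, _url, position, was_clicked in click_log:
        shown[position] += 1
        if was_clicked:
            clicked[position] += 1
    return {pos: clicked[pos] / shown[pos] for pos in shown}


def click_deviation(click_log, query, url):
    """Relevance(q, d) = observed - expected(p) for one query/result pair."""
    expected = expected_click_frequency(click_log)
    rows = [(pos, c) for q, u, pos, c in click_log if q == query and u == url]
    if not rows:
        return None  # no interactions recorded for this pair
    observed = sum(1 for _pos, c in rows if c) / len(rows)
    # Average the expectation over the positions where the result was shown.
    expected_here = sum(expected[pos] for pos, _c in rows) / len(rows)
    return observed - expected_here


# Hypothetical log entries: (query, url, position, clicked)
log = [
    ("a", "x", 1, True), ("a", "y", 2, False),
    ("b", "z", 1, False), ("b", "w", 2, True),
    ("a", "x", 1, True), ("a", "y", 2, False),
]
deviation_ax = click_deviation(log, "a", "x")
deviation_bz = click_deviation(log, "b", "z")
```

A positive deviation means the result attracted more clicks than its position alone would predict; a negative one means fewer.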

[Figure: click frequency deviation by result position (1, 2, 3, 5, 10); series: PTR=1, PTR=3]

Predicting Result Preferences

• Task: predict pairwise preferences
– A user will prefer Result A > Result B

• Models for preference prediction
– Current search engine ranking
– Clickthrough
– Full user behavior model

Predicting Result Preferences: Granka et al., SIGIR 2005

• SA+N: “Skip Above” and “Skip Next”
– Adapted from Joachims et al. [SIGIR’05]
– Motivated by gaze tracking

• Example
– Click on results 2, 4
– Skip Above: 4 > (1, 3), 2 > 1
– Skip Next: 4 > 5, 2 > 3

[Diagram: ranked result list, positions 1-8]
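The example above can be reproduced with a short sketch. The function name and the `(preferred, less_preferred)` pair format are assumptions made for illustration; the rules themselves follow the slide: a clicked result is preferred over every unclicked result above it (Skip Above) and over the unclicked result immediately below it (Skip Next).

```python
def skip_above_next(clicked_positions, num_results):
    """Return (skip_above, skip_next) preference pairs from clicked positions.

    Each pair (a, b) means "result at position a is preferred over position b".
    Positions are 1-based.
    """
    clicked = set(clicked_positions)
    skip_above, skip_next = [], []
    for c in sorted(clicked):
        # Skip Above: clicked result beats every unclicked result ranked above it.
        for above in range(1, c):
            if above not in clicked:
                skip_above.append((c, above))
        # Skip Next: clicked result beats the next result if that one was not clicked.
        nxt = c + 1
        if nxt <= num_results and nxt not in clicked:
            skip_next.append((c, nxt))
    return skip_above, skip_next


above_pairs, next_pairs = skip_above_next([2, 4], num_results=8)
```

For clicks on results 2 and 4 this yields 2 > 1 and 4 > (1, 3) for Skip Above, and 2 > 3 and 4 > 5 for Skip Next.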

[Figure: clickthrough frequency deviation by result position (1-5)]

Our Extension: Use Click Distribution

• CD: distributional model, extends SA+N
– Clickthrough considered only if its frequency exceeds the expected frequency by more than ε

• Click on result 2 likely “by chance”
• 4 > (1, 2, 3, 5), but not 2 > (1, 3)

[Diagram: ranked result list, positions 1-8]
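A sketch of how the CD filter could sit in front of the SA+N rules, assuming the input is a hypothetical mapping from result position to click-frequency deviation (observed minus expected). The deviation values and the ε threshold below are made up for illustration.

```python
def click_deviation_preferences(deviation_by_position, epsilon):
    """Apply Skip Above / Skip Next only to clicks that exceed expectation by > epsilon.

    deviation_by_position: {position: observed - expected click frequency}, 1-based.
    Returns sorted (preferred, less_preferred) position pairs.
    """
    significant = {p for p, d in deviation_by_position.items() if d > epsilon}
    last_position = max(deviation_by_position)
    preferences = set()
    for c in significant:
        # Skip Above over results without a significant click.
        for above in range(1, c):
            if above not in significant:
                preferences.add((c, above))
        # Skip Next over the following result, if it has no significant click.
        if c + 1 <= last_position and (c + 1) not in significant:
            preferences.add((c, c + 1))
    return sorted(preferences)


# Illustrative values only: result 2 was clicked, but barely above expectation.
deviations = {1: -0.30, 2: 0.01, 3: -0.05, 4: 0.20, 5: -0.10}
cd_pairs = click_deviation_preferences(deviations, epsilon=0.05)
```

With these values only the click on result 4 is treated as meaningful, so the output is 4 > (1, 2, 3, 5) and no preference is derived from result 2.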

Results: Click Deviation vs. Skip Above+Next

Problem: Users click based on result summaries/“captions”/“snippets”

Effect of Caption Features on Clickthrough Inversions, C. Clarke, E. Agichtein, S. Dumais, R. White, SIGIR 2007

Clickthrough Inversions

Relevance is Not the Dominant Factor!

Snippet Features Studied

Feature Importance

Important Words in Snippet

Summary

• Clickthrough inversions are a powerful tool for assessing the influence of caption features.

• Relatively simple caption features can significantly influence user behavior.

• Accounting for summary bias can help predict relevance from clickthrough more accurately.

Idea: go beyond clickthrough/download counts

Presentation
  ResultPosition       Position of the URL in current ranking
  QueryTitleOverlap    Fraction of query terms in result title
Clickthrough
  DeliberationTime     Seconds between query and first click
  ClickFrequency       Fraction of all clicks landing on page
  ClickDeviation       Deviation from expected click frequency
Browsing
  DwellTime            Result page dwell time
  DwellTimeDeviation   Deviation from expected dwell time for query

User Behavior Model

• Full set of interaction features
– Presentation, clickthrough, browsing

• Train the model with explicit judgments
– Input: behavior feature vectors for each query-page pair in rated results
– Use RankNet (Burges et al., [ICML 2005]) to discover model weights
– Output: a neural net that can assign a “relevance” score to a behavior feature vector

RankNet for User Behavior

• RankNet: general, scalable, robust Neural Net training algorithms and implementation

• Optimized for ranking: predicting an ordering of items, not scores for each

• Trains on pairs (where first point is to be ranked higher or equal to second)
– Extremely efficient
– Uses cross entropy cost (probabilistic model)
– Uses gradient descent to set weights
– Restarts to escape local minima

RankNet [Burges et al. 2005]

• For query results 1 and 2, present pair of vectors and labels, label(1) > label(2)

[Diagram: Feature Vector 1 (Label 1) → NN output 1; Feature Vector 2 (Label 2) → NN output 2; error is a function of both outputs (desire output1 > output2)]

• Update feature weights:
– Cost function: f(o1-o2); details in Burges et al. paper
– Modified back-prop
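For reference, the pairwise cross-entropy cost described by Burges et al. can be written as C = -P·o + log(1 + e^o), where o = o1 - o2 and P is the target probability that result 1 should rank above result 2. The sketch below only evaluates that cost and its derivative for a single pair; the function names are illustrative, and the network and back-propagation themselves are not shown.

```python
import math


def ranknet_pair_cost(o1, o2, target=1.0):
    """Cross-entropy cost for one pair; target=1.0 means result 1 should rank higher."""
    diff = o1 - o2
    # log(1 + exp(diff)), computed in a numerically stable way
    softplus = max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return -target * diff + softplus


def ranknet_pair_gradient(o1, o2, target=1.0):
    """Derivative of the cost with respect to o1 (the derivative w.r.t. o2 is its negative)."""
    diff = o1 - o2
    if diff >= 0:
        sigmoid = 1.0 / (1.0 + math.exp(-diff))
    else:
        e = math.exp(diff)
        sigmoid = e / (1.0 + e)
    return sigmoid - target


cost_tied = ranknet_pair_cost(0.0, 0.0)      # outputs equal
cost_correct = ranknet_pair_cost(2.0, 0.0)   # pair ordered as desired
cost_wrong = ranknet_pair_cost(0.0, 2.0)     # pair ordered incorrectly
grad_tied = ranknet_pair_gradient(0.0, 0.0)
```

The cost depends only on the difference of the two outputs, which is why training needs pairs with a known ordering rather than absolute relevance scores.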

Predicting with RankNet

Feature Vector1

NN output

• Present individual vector and get score

Example results: Predicting User Preferences

[Figure: precision vs. recall (recall 0-0.4, precision 0.6-0.8); series: SA+N, CD, UserBehavior, Baseline]

• Baseline < SA+N < CD << UserBehavior
• Rich user behavior features result in dramatic improvement

How to Use Behavior Models for Ranking?

• Use interactions from previous instances of query
– General-purpose (not personalized)
– Only for the queries with past user interactions

• Models:
– Rerank, clickthrough only: reorder results by number of clicks
– Rerank, predicted preferences (all user behavior features): reorder results by predicted preferences
– Integrate directly into ranker: incorporate user interactions as features for the ranker

Enhance Ranker Features with User Behavior Features

• For a given query
– Merge original feature set with user behavior features when available
– User behavior features computed from previous interactions with same query

• Train RankNet [Burges et al., ICML’05] on the enhanced feature set

Feature Merging: Details

• Value scaling:
– Binning vs. log-linear vs. linear (e.g., μ=0, σ=1)

• Missing Values:
– 0? (meaning for normalized feats s.t. μ=0?)

• Runtime: significant plumbing problems

Result URL           BM25  PageRank  …  Clicks  DwellTime  …
sigir2007.org        2.4   0.5       …  ?       ?          …
sigir2006.org        1.4   1.1       …  150     145.2      …
acm.org/sigs/sigir/  1.2   2         …  60      23.5       …

Query: SIGIR, fake results w/ fake feature values
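One way to read the scaling and missing-value questions together is sketched below: behavior features are linearly scaled to μ=0, σ=1 over the results that have them, and a missing value is filled with 0, which after normalization corresponds to "average behavior". This is only one of the options the slide lists, chosen here for illustration; the function name and row format are assumptions, and the rows reuse the fake values from the table.

```python
import math


def merge_behavior_features(rows, behavior_keys):
    """Z-score the behavior features and fill missing values with 0.0 (the mean)."""
    stats = {}
    for key in behavior_keys:
        values = [r[key] for r in rows if r.get(key) is not None]
        mean = sum(values) / len(values) if values else 0.0
        variance = sum((v - mean) ** 2 for v in values) / len(values) if values else 0.0
        stats[key] = (mean, math.sqrt(variance) or 1.0)  # avoid dividing by zero
    merged = []
    for r in rows:
        out = dict(r)  # content features (BM25, PageRank) are passed through unchanged
        for key in behavior_keys:
            mean, std = stats[key]
            value = r.get(key)
            out[key] = 0.0 if value is None else (value - mean) / std
        merged.append(out)
    return merged


# Fake results with fake feature values, as in the table above.
results = [
    {"url": "sigir2007.org", "BM25": 2.4, "PageRank": 0.5, "Clicks": None, "DwellTime": None},
    {"url": "sigir2006.org", "BM25": 1.4, "PageRank": 1.1, "Clicks": 150, "DwellTime": 145.2},
    {"url": "acm.org/sigs/sigir/", "BM25": 1.2, "PageRank": 2, "Clicks": 60, "DwellTime": 23.5},
]
merged_rows = merge_behavior_features(results, ["Clicks", "DwellTime"])
```

The design question raised on the slide remains: filling with 0 makes an unseen result indistinguishable from one with exactly average behavior, which may or may not be what the ranker should assume.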

Evaluation Metrics

• Precision at K: fraction of relevant in top K
• NDCG at K: normalized discounted cumulative gain
– Top-ranked results most important

$$N_q = M_q \sum_{j=1}^{K} \frac{2^{r(j)} - 1}{\log(1 + j)}$$

• MAP: mean average precision
– Average precision for each query: mean of the precision at K values computed after each relevant document was retrieved
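The three metrics can be sketched in a few lines. The sketch assumes binary relevance flags for precision and average precision, graded integer labels r(j) for NDCG, and that the normalizer M_q is chosen so that a perfect ordering scores 1; function names are illustrative.

```python
import math


def precision_at_k(relevant_flags, k):
    """Fraction of the top K results that are relevant."""
    return sum(1 for flag in relevant_flags[:k] if flag) / k


def average_precision(relevant_flags):
    """Mean of precision-at-rank values taken at each relevant retrieved result."""
    hits, total = 0, 0.0
    for rank, flag in enumerate(relevant_flags, start=1):
        if flag:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def dcg_at_k(labels, k):
    """Sum over j = 1..K of (2^r(j) - 1) / log(1 + j)."""
    return sum((2 ** r - 1) / math.log(1 + j)
               for j, r in enumerate(labels[:k], start=1))


def ndcg_at_k(labels, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0


flags = [True, False, True, False]
p_at_2 = precision_at_k(flags, 2)
ap = average_precision(flags)
ndcg_perfect = ndcg_at_k([3, 2, 1], 3)
ndcg_reversed = ndcg_at_k([1, 2, 3], 3)
```

MAP is then the mean of `average_precision` over all test queries.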

Content, User Behavior: NDCG

BM25 < Rerank-CT < Rerank-All < +All

[Figure: NDCG at K (K = 1-10); series: BM25, Rerank-CT, Rerank-All, BM25+All]

Full Search Engine, User Behavior: NDCG, MAP

Method    MAP    Gain
RN        0.270
RN+ALL    0.321  0.052 (19.13%)
BM25      0.236
BM25+ALL  0.292  0.056 (23.71%)

[Figure: NDCG at K (K = 1-10); series: RN, Rerank-All, RN+All]

User Behavior Complements Content and Web Topology

[Figure: Precision at K (K = 1, 3, 5, 10); series: RN, RN+All, BM25, BM25+All]

Method                    P@1    Gain
RN (Content + Links)      0.632
RN + All (User Behavior)  0.693  0.061 (10%)
BM25                      0.525
BM25+All                  0.687  0.162 (31%)

Which Queries Benefit Most

[Figure: query frequency and average gain; series: Frequency, Average Gain]

Most gains are for queries with poor ranking

Result Summary

• Incorporating user behavior into web search ranking dramatically improves relevance

• Providing rich user interaction features to ranker is the most effective strategy

• Large improvement shown for up to 50% of test queries

User Generated Content

Some goals of mining social media

• Find high-quality content
• Find relevant and high quality content
• Use millions of interactions to
– Understand complex information needs
– Model subjective information seeking
– Understand cultural dynamics

http://answers.yahoo.com/question/index;_ylt=3?qid=20071008115118AAh1HdO

Lifecycle of a Question in CQA

[Diagram: the user chooses a category, composes the question, and opens the question; other users post answers and vote on them (+/-). The asker examines the answers. Find the answer? Yes: close question, choose best answers, give ratings. No: question is closed by system, best answer is chosen by voters.]

Community

Editorial Quality != User Popularity != Usefulness

Are editor/judge labels “meaningful”?

• Information seeking process: want to find useful information about topic with incomplete knowledge

• N. Belkin: “Anomalous States of Knowledge”

• Want to model directly if user found satisfactory information

• Specific (amenable) case: CQA

Yahoo! Answers: The Good News

• Active community of millions of users in many countries and languages

• Accumulated a great number of questions and answers

• Effective for subjective information needs
– Great forum for socialization/chat

• (Can be) invaluable for hard-to-find information not available on web

Yahoo! Answers: The Bad News

• May have to wait a long time to get a satisfactory answer

• May never obtain a satisfying answer

[Figure: Time to close a question (hours) for sample question categories; y-axis: time to close (hours), 0-40; categories: 1. 2006 FIFA World Cup, 2. Optical, 3. Poetry, 4. Football (American), 5. Scottish Football (Soccer), 6. Medicine, 7. Winter Sports, 8. Special Education, 9. General Health Care, 10. Outdoor Recreation]

Asker Satisfaction Problem

• Given a question submitted by an asker in CQA, predict whether the user will be satisfied with the answers contributed by the community.
– Where “Satisfied” is defined as:
• The asker personally has closed the question AND
• Selected the best answer AND
• Provided a rating of at least 3 “stars” for the best answer
– Otherwise, the asker is “Unsatisfied”
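The label definition above translates directly into a predicate. The function and argument names are illustrative assumptions; the three conditions and the 3-star threshold come from the slide.

```python
def asker_satisfied(closed_by_asker, selected_best_answer, best_answer_rating):
    """Return True only if all three conditions of the "Satisfied" definition hold."""
    return bool(
        closed_by_asker
        and selected_best_answer
        and best_answer_rating is not None
        and best_answer_rating >= 3
    )


label_a = asker_satisfied(True, True, 4)       # all conditions met
label_b = asker_satisfied(True, True, 2)       # rating below 3 stars
label_c = asker_satisfied(False, True, 5)      # question closed by the system
label_d = asker_satisfied(True, True, None)    # no rating given
```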

Approach: Machine Learning over Content and Usage Features

• Theme: holistic integration of content analysis and usage analysis

• Method: Supervised (and later partially-supervised) machine learning over features

• Tools:
– Weka (ML library): SVM, Boosting, DTs, NB, …
– Part of speech taggers, chunkers
– Corpora (Wikipedia, web, queries, …)

Satisfaction Prediction Features

• Approach: Classification algorithms from machine learning

[Diagram: Question Features, Answer Features, Asker History Features, Answerer History Features, Category Features, and Textual Features feed a Classifier (Support Vector Machines, Decision Tree, Boosting, Naïve Bayes), which outputs “asker is satisfied” or “asker is not satisfied”]

Prediction Algorithms

• Heuristic: # answers
• Baseline: Simply predicts the majority class (satisfied).
• ASP_SVM: Our system with the SVM classifier
• ASP_C4.5: with the C4.5 classifier
• ASP_RandomForest: with the RandomForest classifier
• ASP_Boosting: with the AdaBoost algorithm combining weak learners
• ASP_NaiveBayes: with the Naive Bayes classifier

Evaluation metrics

• Precision
– The fraction of the predicted satisfied asker information needs that were indeed rated satisfactory by the asker.

• Recall
– The fraction of all rated satisfied questions that were correctly identified by the system.

• F1
– The harmonic mean of Precision and Recall measures,
– Computed as 2*(precision*recall)/(precision+recall)

• Accuracy
– The overall fraction of instances classified correctly into the proper class.
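A small sketch of the four metrics for the satisfied class, assuming labels are given as the strings "satisfied" and "unsatisfied"; the function name and the toy labels are illustrative.

```python
def classification_metrics(y_true, y_pred, positive="satisfied"):
    """Precision, recall, and F1 for the positive class, plus overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy labels for illustration only.
truth = ["satisfied", "satisfied", "satisfied", "unsatisfied", "unsatisfied"]
predicted = ["satisfied", "satisfied", "unsatisfied", "satisfied", "unsatisfied"]
metrics = classification_metrics(truth, predicted)
```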

Datasets

Crawled from Yahoo! Answers in early 2008 (Thanks, Yahoo! for support)

Questions  Answers    Askers   Categories  % Satisfied
216,170    1,963,615  158,515  100         50.7%

Data is available at http://ir.mathcs.emory.edu/shared

Dataset Statistics

Category                 #Q    #A     #A per Q  Satisfied  Avg asker rating  Time to close by asker
2006 FIFA World Cup(TM)  1194  35659  29.86     55.4%      2.63              47 minutes
Mental Health            151   1159   7.68      70.9%      4.30              1 day and 13 hours
Mathematics              651   2329   3.58      44.5%      4.48              33 minutes
Diet & Fitness           450   2436   5.41      68.4%      4.30              1.5 days

Asker satisfaction varies significantly across different categories.

#Q, #A, Time to close… -> Asker Satisfaction

Satisfaction Prediction: Human Perf

• Truth: asker’s rating
• A random sample of 130 questions
• Annotated by researchers to calibrate the asker satisfaction
– Agreement: 0.82
– F1: 0.45

Satisfaction Prediction: Human Perf (Cont’d): Amazon Mechanical Turk

A service provided by Amazon. Workers submit responses to a Human Intelligence Task (HIT) for $0.01-0.1 per
Can usually get 1000s of items labeled in hours

Satisfaction Prediction: Human Perf (Cont’d): Amazon Mechanical Turk

• Methodology
– Used the same 130 questions
– For each question, list the best answer, as well as other four answers ordered by votes
– Five independent raters for each question.
– Agreement: 0.9, F1: 0.61.
– Best accuracy achieved when at least 4 out of 5 raters predicted asker to be “satisfied” (otherwise, labeled as “unsatisfied”).
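The aggregation rule in the last bullet can be sketched as a simple vote threshold. The function name and the string labels are assumptions for illustration; the 4-out-of-5 threshold is the one reported above.

```python
def aggregate_rater_votes(votes, min_satisfied=4):
    """Label a question "satisfied" only if enough raters predicted "satisfied"."""
    satisfied_votes = sum(1 for v in votes if v == "satisfied")
    return "satisfied" if satisfied_votes >= min_satisfied else "unsatisfied"


label_strong = aggregate_rater_votes(
    ["satisfied", "satisfied", "satisfied", "satisfied", "unsatisfied"])
label_split = aggregate_rater_votes(
    ["satisfied", "satisfied", "satisfied", "unsatisfied", "unsatisfied"])
```

A 3-to-2 majority is therefore still labeled "unsatisfied" under this rule.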

Comparison of Human and Automatic (F1 measure)

Classifier        With Text  Without Text  Selected Features
ASP_SVM           0.69       0.72          0.62
ASP_C4.5          0.75       0.76          0.77
ASP_RandomForest  0.70       0.74          0.68
ASP_Boosting      0.67       0.67          0.67
ASP_NB            0.61       0.65          0.58
Best Human Perf   0.61
Baseline (naïve)  0.66

C4.5 is the most effective classifier in this task

Human F1 performance is lower than the naïve baseline!

Features by Information Gain (Satisfied class)

• 0.14219 Q: Askers’ previous rating
• 0.13965 Q: Average past rating by asker
• 0.10237 UH: Member since (interval)
• 0.04878 UH: Average # answers for past Q
• 0.04878 UH: Previous Q resolved for the asker
• 0.04381 CA: Average asker rating for the category
• 0.04306 UH: Total number of answers received
• 0.03274 CA: Average voter rating
• 0.03159 Q: Question posting time
• 0.02840 CA: Average # answers per Q

“Offline” vs. “Online” Prediction

• Offline prediction:
– All features (question, answer, asker & category)
– F1: 0.77

• Online prediction:
– NO answer features
– Only asker history and question features (stars, #comments, sum of votes…)
– F1: 0.74

Feature Ablation

                             Precision  Recall  F1
Selected features            0.80       0.73    0.77
No question-answer features  0.76       0.74    0.75
No answerer features         0.76       0.75    0.75
No category features         0.75       0.76    0.75
No asker features            0.72       0.69    0.71
No question features         0.68       0.72    0.70

Asker & Question features are most important.

Answer quality/Answerer expertise/Category characteristics:
– may not be important
– caring or supportive answers might be preferred sometimes

Satisfaction: varying by asker experience

• Group together questions from askers with the same number of previous questions
• Accuracy of prediction increases dramatically
• Reaching F1 of 0.9 for askers with >= 5 questions

Personalized Prediction of Asker Satisfaction with info

• Same information != same usefulness for different users!

• Personalized classifier achieves surprisingly good accuracy (even with just 1 previous question!)

• Simple strategy of grouping users by number of previous questions is more effective than other methods for users with moderate amount of history

• For users with >= 20 questions, textual features are more significant

Some Results

Some Personalized Models

Summary

• Asker satisfaction is predictable
– Can achieve higher than human accuracy by exploiting interaction history

• User’s experience is important
• General model: one-size-fits-all
– 2000 questions for training model are enough

• Personalized satisfaction prediction:
– Helps with sufficient data (>= 1 prev. interactions, can observe text patterns with >= 20 prev. interactions)

Other tasks in progress

• Subjectivity, sentiment analysis
– B. Li, Y. Liu, and E. Agichtein, CoCQA: Co-Training Over Questions and Answers with an Application to Predicting Question Subjectivity Orientation, in Proc. of EMNLP 2008

• Discourse analysis
• Cross-cultural comparisons
• CQA vs. web search comparison

Summary

• User-generated Content
– Growing
– Important: impact on main-stream media, scholarly publishing, …
– Can provide insight into information seeking and social processes
– “Training” data for IR, machine learning, NLP, ….
– Need to re-think quality, impact, usefulness

References

• Y. Liu, J. Bian, and E. Agichtein, Predicting Information Seeker Satisfaction in Community Question Answering, in Proc. of the ACM SIGIR International Conference on Research and Development in Information Retrieval (SIGIR), 2008

• Y. Liu and E. Agichtein, You've Got Answers: Towards Personalized Models for Predicting Success in Community Question Answering (short paper), in Proc. of the Annual Meeting of the Association for Computational Linguistics (ACL), 2008

• B. Li, Y. Liu, and E. Agichtein, CoCQA: Co-Training Over Questions and Answers with an Application to Predicting Question Subjectivity Orientation, in Proc. of Conference on Empirical Methods in Natural Language Processing (EMNLP), 2008

• E. Agichtein, C. Castillo, D. Donato, A. Gionis, G. Mishne, Finding High Quality Content in Social Media, in Proc. of the ACM Web Search and Data Mining Conference (WSDM), 2008

• C. Clarke, E. Agichtein, S. T. Dumais, and R. W. White, The Influence of Caption Features on Clickthrough Patterns in Web Search, in Proc. of the ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2007

• P. Jurczyk and E. Agichtein, Discovering Authorities in Question Answer Communities Using Link Analysis (short paper), in Proc. of the ACM Conference on Information and Knowledge Management (CIKM), 2007

• E. Agichtein, E. Brill, and S. T. Dumais, Improving Web Search Ranking by Incorporating User Behavior Information, in Proc. of the ACM SIGIR Conference on Research and Development on Information Retrieval (SIGIR), 2006

• E. Agichtein, E. Brill, S. T. Dumais, and R. Ragno, Learning User Interaction Models for Predicting Web Search Result Preferences, in Proc. of the ACM SIGIR Conference on Research and Development on Information Retrieval (SIGIR), 2006

Thank you!

Question-Answer Features

Q: length, posting time…
QA: length, KL divergence
Q: Votes
Q: Terms

User Features

U: Member since
U: Total points
U: #Questions
U: #Answers

Category Features

• CA: Average time to close a question
• CA: Average # answers per question
• CA: Average asker rating
• CA: Average voter rating
• CA: Average # questions per hour
• CA: Average # answers per hour

Category        #Q   #A   #A per Q  Satisfied  Avg asker rating  Time to close by asker
General Health  134  737  5.46      70.4%      4.49              1 day and 13 hours
