group recommendation: semantics and efficiency sihem amer-yahia (yahoo! labs), senjuti basu-roy...

34
Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam Das (Univ. of Arlington), and Cong Yu (Yahoo! Labs).

Upload: emilio-noyce

Post on 14-Dec-2015

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Group Recommendation: Semantics and EfficiencySihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam Das (Univ. of Arlington), and Cong Yu (Yahoo! Labs).

Page 2: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Recommendation

Individual User Recommendation : “Which movie should I watch?” “What city should I visit?” “What book should I read?” “What web page has the information I need?”

2

Page 3: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Group Recommendation Individual User Recommendation

User seeking intelligent ways to search through the enormous volume of information available to her.

How to Recommend?

Solution : Group RecommendationHelps socially acquainted individuals find content of interest to all of them together.

Movies – for a family!

Restaurants – for a work group lunch!

3

Places to visit – using a travel agency!

Page 4: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Existing Solution

No prior work on efficient processing for Group Recommendation

Existing solutions aggregate ratings (referred to as relevance) among group members Preference Aggregation: aggregates group

members’ prior ratings into a single virtual user then computes recommendations for that user

Rating Aggregation: aggregate individual ratings on the fly using Average Least Misery: computes min rating

4

Page 5: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Why Is Rating Aggregation Not Enough?

Task: recommend a movie to group G ={u1, u2 ,u3} relevance (u1,”God Father”) = 5 relevance (u2, “God Father”) = 1 relevance (u3, ”God Father”) = 1

relevance (u1, ”Roman Holiday”) = 3 relevance (u2, “Roman Holiday”) = 3 relevance (u3, ”Roman Holiday”) = 1

Average Relevance and Least Misery fail to distinguish between “God Father” and “Roman Holiday”

But, group members agree more on “Roman Holiday” with higher ratings Difference in opinion (Disagreement) between members

may be important in Group Recommendation Semantics. 5

Page 6: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Outline

Motivation

Modeling & Problem Definition

Algorithms

Optimization Opportunities

Experimental Evaluation

Conclusion & Future Work

6

Page 7: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Semantics in Group Recommendation

Relevance (average or least misery) and Disagreement in a recommendation’s score computed using a consensus function

7

Group G ={u1, u2 ,u3} relevance(u1,”God Father”) = 5, relevance(u2, “God Father”) = 1, relevance=(u3,” God Father”) = 1

Average Pair-wise Variance= (|(5-1)|+|(1-1)|+|(1-5)|)/ 3 = [(5 -2.33)2 + (1-2.33)2

+(1-2.33)2] / 3

score (G, “God Father”) = w1 x .86 + w2 x (1-.2) (after normalization)

Page 8: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Problem Definition

Top-k Group Recommendation:

Given a user group G and a consensus function F, return a list of k items to the group such that each item is new to all users in G and the returned k-items are sorted in decreasing value of F

8

Page 9: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Efficient Recommendation Computation Average and Least Misery are monotone

Sort user relevance list in decreasing value Apply Threshold Algorithm TA

9

i1,2

i3,2

i2,2

i1,0

i4,2

i3,2

i2,0i4,0

ILu1 ILu2

For a user group G = {u1, u2}, 4 Sorted Accesses are required to compute top-1 item for Group Recommendation.

Pruning operates only on the relevance component of consensus function.

How to do pruning on Disagreement component also?

Page 10: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Role of Disagreement Lists

Pair-wise disagreement lists are computed from individual relevance lists

Disagreement lists are sorted in increasing values

Items encountered in disagreement lists play a significant role in attaining early stoppage

10ILu1 ILu2 DL(u1,u2)

i1,2

i3,2

i2,2

i1,0

i4,2

i3,2

i4,2

i1,2

i3,0

i2,0i4,0 i2,2

Top-1 item for u1, u2 can be obtained after 3 Sorted Accesses if DL(u1,u2) is present.

Without DL(u1,u2) 4 Sorted Accesses is required.

Page 11: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Outline

Motivation

Modeling & Problem Definition

Algorithms

Optimization Opportunities

Experimental Evaluation

Conclusion & Future Work

11

Page 12: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Relevance Only (RO) Algorithm Task : Recommend Top-1 item for a group of 3 users, u1, u2, and u3.

Input: 3 Relevance Lists (ILu1 ,ILu2 ,ILu3 )

Relevance lists are sorted in decreasing scores

Lists are chosen in round-robin fashion during Top-k computation.

Performance is calculated by computing no of Sorted Access

12

Threshold is : 1.93

ILu1

i1,4

i3,4

i4,4,

i2,2

ILu2

i2,4

i4,4

i1,2

i3,2

ILu3

i3,3

i1,3

i4,3

i2,3

Top-k Buffer

i1,1.33

Page 13: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Relevance Only (RO) Algorithm

13

Threshold is : 1.86

ILu1

i1,4

i3,4

i4,4,

i2,2

ILu2

i2,4

i4,4

i1,2

i3,2

ILu3

i3,3

i1,3

i4,3

i2,3

Top-k Buffer

i1,1.33

i2,1.33

Page 14: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Relevance Only (RO) Algorithm

After 8 Sorted Accesses (3 on IL(u1), 3 on IL(u2) and 2 on IL(u3))

19

Threshold is : 1.6

IT STOPS!

Top-1 Item is i4

ILu1

i1,4

i3,4

i4,4

i2,2

ILu2

i2,4

i4,4

i1,2

i3,2

ILu3

i3,3

i1,3

i4,3

i2,3

Top-k Buffer

i4,1.6

i1,1.33

i2,1.33

i3,1.33

Page 15: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Full Materialization (FM) Algorithm Input: 3 Relevance Lists (ILu1 ,ILu2 ,ILu3 ) and 3 pair-wise

materialized disagreement lists (DLu1,u2, DLu1,u3, DLu2,u3)

Relevance lists are sorted in decreasing scores and Disagreement lists are sorted in increasing disagreement scores.

20Threshold is : 1.93

ILu1

i1,4

i3,4

i4,4

i2,2

ILu2

i2,4

i4,4

i1,2

i3,2

ILU3

i3,3

i1,3

i3,3

i2,3

DLu1,u2

i4,0

i3,2

i2,2

i1,2

DLu1,u3

i4,1

i3,1

i2,1

i1,2

DLu2,u3

i4,1

i1,1

i3,1

i2,1

i1,1.33 Top-k Buffer

Page 16: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Full Materialization (FM) Algorithm

21

Threshold is : 1.86

ILu1

i1,4

i3,4

i4,4

i2,2

ILu2

i2,4

i4,4

i1,2

i3,2

ILu3

i3,3

i1,3

i3,3

i2,3

DLu1,u2

i4,0

i3,2

i2,2

i1,2

DLu1,u3

i4,1

i3,1

i2,1

i1,2

DLu2,u3

i4,1

i1,1

i3,1

i2,1

i1,1.33

i2,1.33 Top-k Buffer

Page 17: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Full Materialization (FM) Algorithm

22

Threshold is : 1.73

ILu1

i1,4

i3,4

i4,4

i2,2

ILu2

i2,4

i4,4

i1,2

i3,2

ILu3

i3,3

i1,3

i3,3

i2,3

DLu1,u2

i4,0

i3,2

i2,2

i1,2

DLu1,u3

i4,1

i3,1

i2,1

i1,2

DLu2,u3

i4,1

i1,1

i3,1

i2,1

i1,1.33

i2,1.33

i3,1.33

Top-k Buffer

Page 18: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Full Materialization (FM) Algorithm

23

Threshold is : 1.73

ILu1

i1,4

i3,4

i4,4

i2,2

ILu2

i2,4

i4,4

i1,2

i3,2

ILu3

i3,3

i1,3

i3,3

i2,3

DLu1,u2

i4,0

i3,2

i2,2

i1,2

DLu1,u3

i4,1

i3,1

i2,1

i1,2

DLu2,u3

i4,1

i1,1

i3,1

i2,1

i4,1.6

i2,1.33

i3,1.23

i4,1.33

Top-k Buffer

Page 19: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Full Materialization (FM) Algorithm

24

Threshold is : 1.66

ILu1

i1,4

i3,4

i4,4

i2,2

ILu2

i2,4

i4,4

i1,2

i3,2

ILu3

i3,3

i1,3

i3,3

i2,3

DLu1,u2

i4,0

i3,2

i2,2

i1,2

DLu1,u3

i4,1

i3,1

i2,1

i1,2

DLu2,u3

i4,1

i1,1

i3,1

i2,1

Top-k Buffer

i4,1.6

i2,1.33

i3,1.23

i4,1.33

Page 20: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Full Materialization (FM) Algorithm

25

After 6 Sorted Accesses(1 on each list)

Threshold is : 1.6

Score(i4) = Threshold = 1.6

IT STOPS!

Top-1 item is i4

ILu1

i1,4

i3,4

i4,4

i2,2

ILu2

i2,4

i4,4

i1,2

i3,2

ILu3

i3,3

i1,3

i3,3

i2,3

DLu1,u2

i4,0

i3,2

i2,2

i1,2

DLu1,u3

i4,1

i3,1

i2,1

i1,2

DLu2,u3

i4,1

i1,1

i3,1

i2,1

Top-k Buffer

i4,1.6

i2,1.33

i3,1.23

i4,1.33

Page 21: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Outline

Motivation

Modeling & Problem Definition

Algorithms

Optimization Opportunities

Experimental Evaluation

Conclusion & Future Work

26

Page 22: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Optimization Opportunities

Partial Materialization: Given a space budget to materialize only a subset of n(n-1)/2 disagreement lists, which m lists to materialize?

Threshold Sharpening in Recommendation Computation: How to sharpen threshold during top-k computation?

27

Page 23: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Why Partial Materialization ?

A set of 10,000 users has 49995000 disagreement lists

Only 10% of the disagreement lists can be materialized, given a space budget

Problem : Which 4999500 lists should we choose so that those gives “maximum benefit” during query processing?

Intuition : Materialize only those lists that significantly improves efficiency.

Recommendation Algorithm needs to be adapted to it (refer to as PM in the paper)

28

Page 24: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Disagreement lists Materialization Algorithm

Sort the table with decreasing difference (#SAs) and consider first m rows

29

User Pair

#SAs without disagreement list

#SAs with disagreement lists

Difference in #SAs

{u1,u2} 200 100 100

{u3,u4} 290 195 95

{u10,u9} 170 100 70

{u6,u7} 230 190 40

{u2,u3} 175 145 30

{u5,u6} 200 179 21

{u7,u8} 120 100 20

-- -- -- --

-- -- -- --

-- -- -- --

m

Page 25: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Threshold SharpeningCan we exploit the dependencies between relevance and disagreement lists and sharpen thresholds in FM, RO and PM algorithms?

30

Maximize (iu1+ iu2)/2 + (1- |iu1-iu2|) s.t. 0 <= iu1 <= 0.5 0 <= iu1 <= 0.5 0.2 <=| iu1 – iu2 |<= 1

Threshold = 1.3

New Threshold = 1.2

ILu1

i1,0.5

i3,0.5

---

---

ILu2

i2,0.5

i3, 0.4

---

---

DLu1,u2

i3,0.2

I1,0.3

---

---

Page 26: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Outline

Motivation

Modeling & Problem Definition

Algorithms

Optimization Opportunities

Experimental Evaluation

Conclusion & Future Work

31

Page 27: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Experiments Used Dataset

MovieLens data set 71,567 users, 10,681 movies, 10,000,054 ratings

User Studies Compare the effectiveness of proposed Group Recommendation

algorithms with existing approaches using Amazon Mechanical Turk users.

Small and large groups of similar, dissimilar and random users are formed.

Algorithms Average Relevance Only (AR), Least Misery Only (LM), Consensus with Pair-wise Disagreements (RP), Consensus with Disagreement Variance (RV) are compared

Performance Experiments Performance (no of sorted accesses) comparison of FM, RO and PM

varying group size, similarity and no of returned items. Effectiveness of partial Materialization Effectiveness of Threshold Sharpening

32

Page 28: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Disagreement is important for Dissimilar User Group

Misery Only (MO) is the best model for similar user group

Disagreement is important for dissimilar users. Consensus with Disagreement Variance (RV80) is the best model there. 33

Page 29: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Summary of Performance Results

Presence of Disagreement lists improves performance for dissimilar user groups

Sometime Partial Materialization (PM) is the best solution For the same query, different disagreement

lists contribute differently in Top-k processing

Optimization during threshold calculation improves overall performance 34

Page 30: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Performance Results

•Less sorted accesses (SAs) are required for more similar user groups

•Disagreement lists are important for Dissimilar user groups

•FM is the best performer for very dissimilar user groups, RO is the best algorithm for very similar user groups.

Sometimes only few disagreement lists attain the best performance. Therefore Partial Materialization is important

Optimization during threshold calculation always achieves better performance (less #SAs) than without optimization case.

35

Page 31: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Outline

Motivation

Modeling & Problem Definition

Algorithms

Optimization Opportunities

Experimental Evaluation

Conclusion & Future Work

36

Page 32: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Conclusion

Disagreement impacts both quality and efficiency of Group Recommendation.

Threshold algorithm, TA can be adapted to compute group recommendations.

Novel optimization opportunities are present.

37

Page 33: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Ongoing and Future Work

Can disagreement lists be optimized such that they consume less space and contain same information?

Can group recommendation algorithms be adapted to work with those optimized lists?

38

Page 34: Group Recommendation: Semantics and Efficiency Sihem Amer-Yahia (Yahoo! Labs), Senjuti Basu-Roy (Univ. of Arlington), Ashish Chawla (Yahoo! Inc), Gautam

Thank You !

39