The Benefit of Using Tag-Based Profiles


Claudiu Firan, Wolfgang Nejdl, Raluca Paiu

5th Latin American Web Congress, 2007

Music Recommendation

Personal Music

Community Data

Challenges

Collaborative Filtering
• Cold start problem
  • Items with no ratings
  • Users with no profile
• Poor artist variety in recommended pieces
• Slow

Content Based Techniques
• Unreliability in modeling user’s preferences
• Content similarity does not necessarily reflect preferences
• Slow

Hybrid Methods
• Heavy user input

New Approach

Personal Music

Community Data

Personal Tags

Why Use Tags?

Tags are:
• Written chaotically
• Not verified
• Unstructured
• Heterogeneous
• Unreliable

But when there are many of them, the correct ones emerge: the “wisdom of the masses”.

Last.fm – “The Social Music Revolution”

• Track
• Artist
• Similar Artists
• Albums
• Track Usage Info
• Similar Tracks
• Tags (with weight)
• User Comments

Tracks, Tags, and Profiles

User Profiles

weight = preference(user, item)

Track-based Profiles (TR)

preference(user, track) = log(user_track_#listened)

TR: <track_i, weight_i> …
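
The track-based weighting can be sketched in Python as follows. This is only an illustration: the function name `track_profile` and the listen counts are hypothetical, and the natural logarithm is assumed because the slides do not state the base.

```python
import math

def track_profile(listen_counts):
    """Build a TR profile: track -> log(number of times the user listened).

    listen_counts maps a track identifier to a play count.
    The natural logarithm is an assumption; the slides do not give the base.
    """
    return {
        track: math.log(count)
        for track, count in listen_counts.items()
        if count > 0  # log is undefined for zero plays, so such tracks are skipped
    }

# Hypothetical listen counts for one user.
profile = track_profile({"track_a": 20, "track_b": 1, "track_c": 0})
```

Note that a track played exactly once receives weight 0 under this formula, since log(1) = 0.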

Track-Tag-based Profiles (TT)

preference(user, tag) = log( Σ_i ( log(user_track_i_#listened) ∙ log(user_tag_track_i_#tagged) ) ) [∙ ITF(tag)]

ITF = Inverse Tag Frequency
• With ITF: TTI
• Without ITF: TTN

TTI, TTN: <tag_i, weight_i> …
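
A minimal sketch of the track-tag weighting, under several assumptions: `user_tag_track_i_#tagged` is read as the number of times the tag was assigned to track i, the ITF values are supplied by the caller because the slides do not define how they are computed, and tags whose inner sum is not positive are dropped so that the outer logarithm stays defined. All names are hypothetical.

```python
import math

def track_tag_profile(listens, taggings, itf=None):
    """Build a TT profile: tag -> log(sum_i log(listened_i) * log(tagged_i)).

    listens:  track -> number of times the user listened to it.
    taggings: tag -> {track -> number of times the tag was assigned to the track}
              (this reading of "#tagged" is an assumption).
    itf:      optional tag -> inverse tag frequency; pass it for TTI,
              leave it as None for TTN.
    """
    profile = {}
    for tag, tracks in taggings.items():
        total = sum(
            math.log(listens[t]) * math.log(n)
            for t, n in tracks.items()
            if listens.get(t, 0) > 0 and n > 0
        )
        if total <= 0:
            continue  # assumption: skip tags for which the outer log is undefined
        weight = math.log(total)
        if itf is not None:
            weight *= itf.get(tag, 1.0)  # assumption: tags without an ITF value are left unscaled
        profile[tag] = weight
    return profile

# Hypothetical data for one user.
listens = {"t1": 10, "t2": 5}
taggings = {"rock": {"t1": 8, "t2": 3}, "pop": {"t2": 1}}
ttn = track_tag_profile(listens, taggings)
tti = track_tag_profile(listens, taggings, itf={"rock": 0.5})
```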

Tag-based Profiles (TG)

preference(user, tag) = log(user_tag_#used)

TG: <tag_i, weight_i> …
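
The purely tag-based weighting could look like this in Python. The input format (a flat list of the tags a user has applied) and the function name are assumptions made for the example.

```python
import math
from collections import Counter

def tag_profile(tag_assignments):
    """Build a TG profile: tag -> log(number of times the user used the tag).

    tag_assignments is a flat list with one entry per tagging action.
    """
    counts = Counter(tag_assignments)
    return {tag: math.log(n) for tag, n in counts.items()}

# Hypothetical tagging history for one user.
tg = tag_profile(["rock", "indie", "rock", "rock", "jazz", "indie"])
```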

User Profiles from Personal MP3s

1. Read personal playlist from PC

2. Match MP3s against our database

3. Add overall average usage information values
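
The three steps above might be sketched as follows. This assumes the playlist has already been read into (artist, title) pairs, that matching is a simple case-insensitive lookup, and that the database stores an average listen count per track. None of these details come from the slides, and all names are hypothetical.

```python
import math

def profile_from_playlist(playlist, database):
    """Build a track profile for a user who has no listening history.

    playlist: list of (artist, title) pairs read from the user's files.
    database: (artist, title) -> average listen count per listener,
              a hypothetical stand-in for the crawled usage data.
    """
    profile = {}
    for artist, title in playlist:
        key = (artist.strip().lower(), title.strip().lower())  # naive matching
        avg_listens = database.get(key)
        if avg_listens is None or avg_listens <= 0:
            continue  # track not matched in the database: nothing to add
        profile[key] = math.log(avg_listens)  # same weighting as a TR profile
    return profile

# Hypothetical database and playlist.
database = {("radiohead", "creep"): 12.0, ("muse", "starlight"): 1.0}
playlist = [("Radiohead ", "Creep"), ("Unknown", "Demo")]
mp3_profile = profile_from_playlist(playlist, database)
```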

Collaborative Filtering vs. Search

Track- & Tag-based Recommendations

Collaborative Filtering

<track_i, weight_i> …

<tag_i, weight_i> …
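
The slides do not say which collaborative filtering variant was used. As an illustration only, the sketch below assumes a user-based approach: neighbours are found by cosine similarity over profiles (track- or tag-based), and tracks from the nearest neighbours are scored by similarity times weight. Function names and data are hypothetical.

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse profiles (item -> weight)."""
    dot = sum(w * q[k] for k, w in p.items() if k in q)
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def recommend_cf(user, profiles, tracks, k=2, n=10):
    """Recommend up to n tracks the user lacks, taken from the k nearest users.

    profiles: user -> profile used for similarity (track- or tag-based).
    tracks:   user -> {track -> weight}, the candidate tracks per user.
    """
    neighbours = sorted(
        ((cosine(profiles[user], profiles[other]), other)
         for other in profiles if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for sim, other in neighbours:
        if sim <= 0:
            continue  # ignore users with no overlap at all
        for track, weight in tracks[other].items():
            if track not in tracks[user]:
                scores[track] = scores.get(track, 0.0) + sim * weight
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical tag profiles and track lists.
profiles = {
    "u1": {"rock": 2.0, "indie": 1.0},
    "u2": {"rock": 1.5, "indie": 1.0},
    "u3": {"jazz": 2.0},
}
tracks = {"u1": {"t1": 1.0}, "u2": {"t1": 0.5, "t2": 2.0}, "u3": {"t3": 3.0}}
recs = recommend_cf("u1", profiles, tracks)
```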

Tag-based Search

<tag_i, weight_i> …
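
One way to picture tag-based search is to use the profile's tags as a weighted query against the tags attached to each track. The ranking function is not given in the slides; the sketch below assumes a simple dot product between profile weights and track tag weights, with hypothetical names and data.

```python
def search_by_tags(profile, track_tags, known_tracks=(), n=10):
    """Rank tracks by the weighted overlap between profile tags and track tags.

    profile:      tag -> user weight (a TG, TTI or TTN profile).
    track_tags:   track -> {tag -> weight of the tag on that track}.
    known_tracks: tracks to exclude, e.g. those already in the user's library.
    """
    scores = {}
    for track, tags in track_tags.items():
        if track in known_tracks:
            continue
        score = sum(profile.get(tag, 0.0) * w for tag, w in tags.items())
        if score > 0:
            scores[track] = score
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical profile and tagged tracks.
user_tags = {"rock": 2.0, "indie": 1.0}
track_tags = {
    "t1": {"rock": 0.9},
    "t2": {"jazz": 1.0},
    "t3": {"rock": 0.5, "indie": 1.0},
    "t4": {"rock": 1.0},
}
results = search_by_tags(user_tags, track_tags, known_tracks={"t4"})
```

Because no neighbourhood of similar users has to be computed, a lookup of this kind needs only the user's own tags and the tagged track collection.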

Algorithms

Experiments & Outcome

Last.fm Crawled Data

• 317,058 tracks
• 21,177 tags (most prominent ones are music genres)
• 289,654 users
  • 12,193 listened to at least 50 tracks and used at least 10 tags

Experimental Setup

1. Create user profiles
   • 18 subjects
   • 658 tracks on average in user profile (not statistically significant in influencing algorithm outcome)

2. Run algorithms
   • 7 algorithms
   • 10 recommended items per algorithm per user

3. Two scores
   • Quality of recommendation [0-2], aggregated as NDCG
   • Novelty of recommendation [0-2], aggregated as an average
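
For readers unfamiliar with NDCG, the sketch below shows one common formulation applied to ratings on the 0-2 scale. The slides do not state the exact variant, so the log2 discount and the choice of ideal ranking are assumptions. By default the ideal is the same ratings sorted best-first; an explicit ideal list can be passed instead.

```python
import math

def dcg(ratings):
    """Discounted cumulative gain of ratings listed in recommendation order."""
    return sum(r / math.log2(rank + 1) for rank, r in enumerate(ratings, start=1))

def ndcg(ratings, ideal=None):
    """DCG divided by the DCG of an ideal list.

    If no ideal list is given, the ratings sorted best-first are used
    (an assumption; the slides do not say which ideal was chosen).
    """
    if ideal is None:
        ideal = sorted(ratings, reverse=True)
    best = dcg(ideal)
    return dcg(ratings) / best if best > 0 else 0.0

def average(scores):
    """Plain mean, as used for the novelty score."""
    return sum(scores) / len(scores)

# Hypothetical ratings for three recommended tracks.
quality = ndcg([2, 0, 1])
novelty = average([2, 1, 0])
```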

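Results

| Nr | Algorithm | NDCG | Signif. vs. CFTR | Average Novelty | Average Popularity |
|----|-----------|------|------------------|-----------------|--------------------|
| 1 | CFTR | 0.54 | - | 1.39 | 15,177 |
| 2 | CFTG | 0.25 | Highly | 1.83 | 4,065 |
| 3 | CFTTI | 0.36 | Highly | 1.72 | 6,632 |
| 4 | CFTTN | 0.37 | Highly | 1.74 | 13,671 |
| 5 | STG | 0.60 | No | 1.07 | 7,587 |
| 6 | STTI | 0.73 | Highly | 0.82 | 10,380 |
| 7 | STTN | 0.77 | Highly | 0.78 | 16,309 |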

• CFTR: Baseline
• STG: lower popularity, higher quality
• STTI & STTN: huge improvement, statistically significant
• NDCG – Novelty: high inverse correlation, Pearson c = -0.987
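
The inverse relationship between quality and novelty can be checked against the seven rows of results reported above. The sketch recomputes the Pearson coefficient from the rounded NDCG and novelty values, so it yields a strongly negative value close to, but not necessarily identical with, the reported -0.987.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Rounded values from the results, rows 1 to 7 (CFTR ... STTN).
ndcg_scores = [0.54, 0.25, 0.36, 0.37, 0.60, 0.73, 0.77]
novelty_scores = [1.39, 1.83, 1.72, 1.74, 1.07, 0.82, 0.78]
r = pearson(ndcg_scores, novelty_scores)
```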

Gain over the Baseline (CF on Tracks)
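
The chart for this slide did not survive, but the relative gain of each algorithm over CFTR can be recomputed from the NDCG values reported in the results. The sketch below uses the rounded scores, so the figures are approximate: STTN comes out a little over 40% above the baseline, in line with the 44% quoted in the conclusions, which was presumably derived from unrounded scores.

```python
# Rounded NDCG values from the results.
ndcg_by_algorithm = {
    "CFTR": 0.54, "CFTG": 0.25, "CFTTI": 0.36, "CFTTN": 0.37,
    "STG": 0.60, "STTI": 0.73, "STTN": 0.77,
}

def relative_gain(score, baseline):
    """Relative change of score with respect to baseline (0.1 means +10%)."""
    return (score - baseline) / baseline

baseline = ndcg_by_algorithm["CFTR"]
gains = {
    name: relative_gain(score, baseline)
    for name, score in ndcg_by_algorithm.items()
}

for name, gain in gains.items():
    print(f"{name}: {gain:+.0%}")
```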

Conclusions

• CF on tag-based profiles performs worse than CF on track-based profiles
• Search with tags improved recommendation performance substantially
  • 44% increase in quality
  • Instant results – virtually no time delay
  • No cold start problem
• Tag-based profiles also work with less rich music repositories
• Results probably influenced by the consistent tag usage on Last.fm: mostly genres
