the topic-perspective model for social tagging systems

31
The Topic-Perspective Model for Social Tagging Systems Caimei Lu et al. (KDD 2010) Presented by Anson Liang

Upload: najwa

Post on 23-Feb-2016

29 views

Category:

Documents


0 download

DESCRIPTION

The Topic-Perspective Model for Social Tagging Systems. Caimei Lu et al. (KDD 2010) Presented by Anson Liang. Movitation. Users:. Documents: a rticles, web pages, etc. Annotate. Add. Tags: e.g. economy, science, facebook , twitter …. Movitation. E.g. Delicious.com - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: The Topic-Perspective Model for Social Tagging Systems

The Topic-Perspective Model for Social Tagging Systems

Caimei Lu et al. (KDD 2010)

Presented by Anson Liang

Page 2: The Topic-Perspective Model for Social Tagging Systems

MovitationDocuments:

articles, web pages, etc.

Tags:e.g. economy, science,

facebook, twitter …

Users:

AddAnnotate

Page 3: The Topic-Perspective Model for Social Tagging Systems

E.g. Delicious.co

m◦ Social

bookmarking

Movitation

Page 4: The Topic-Perspective Model for Social Tagging Systems

Movitation

Web pages

Tags

Users

Page 5: The Topic-Perspective Model for Social Tagging Systems

Different users have different perspectives!

Movitation

Movie, action, inspiration

Movie, action, violent

Page 6: The Topic-Perspective Model for Social Tagging Systems

Objectives◦ Simulate the generation process of social

annotations

◦ Automatically generate tags for documents based on user perspective

Movitation

Page 7: The Topic-Perspective Model for Social Tagging Systems

Topic Analysis using Generative Models◦ Latent Dirichlet Allocation (LDA) model

Discover topics for a given document

A document has many words A document may belong to many topics A word may belong to many topics A topic may be associated with many words

Related Work

Documents

Words

Topics

1..N

1..N

N..N

Page 8: The Topic-Perspective Model for Social Tagging Systems

LDA – Plate notation

α is the parameter of the uniform Dirichlet prior on the per-document topic distributions. β is the parameter of the uniform Dirichlet prior on the per-topic word distribution. θi is the topic distribution for document i, zij is the topic for the j th word in document i, and wij is the specific word. The wij are the only observable variables, and the other variables are latent variables. φ is a K*V (V is the dimension of the vocabulary) Markov matrix each row of which denotes the word

distribution of a topic.

Related Work

Ref: http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation

Page 9: The Topic-Perspective Model for Social Tagging Systems

Conditionally-independent LDA (Ramage et al.)◦ Tags are generated from the topics (the same source as

the words)

Related Work

Page 10: The Topic-Perspective Model for Social Tagging Systems

Correlated LDA (Bundschus et al.)1. Generates topics associated with the words for a

document2. Generates tags from the topics associated with the

words

Related Work

Page 11: The Topic-Perspective Model for Social Tagging Systems

Problem◦ Assume words and tags are generated from the

same source

However, words are generated by the authors, and tags are generated by users (readers)

User information is missed in the tag generation process!

Related Work

Page 12: The Topic-Perspective Model for Social Tagging Systems

Three classification schemas of social tags

Related Work

Page 13: The Topic-Perspective Model for Social Tagging Systems

Overview◦Two factors act on the tag generation

process:1. The topics of the document2. The perspective adopted by the user

Topic-Perspective Model

Page 14: The Topic-Perspective Model for Social Tagging Systems

Model Formulation

Topic-Perspective ModelStandard LDA

Switch 0/1

X=1:Tags are generated from the topics learned from the words

X=0:Tags are generated from user perspectives

User-perspective distribution

Perspective-tag distribution Topic-word distribution

Topic-tag distribution

Document-topic distribution

Probability that each tag is generated from topics

Page 15: The Topic-Perspective Model for Social Tagging Systems

Topic-Perspective Model

User-perspective distribution

Perspective-tag distribution

Topic-word distribution

Topic-tag distribution

Document-topic distribution

Page 16: The Topic-Perspective Model for Social Tagging Systems

Topic-Perspective ModelProbability that each tag is generated from topics

Page 17: The Topic-Perspective Model for Social Tagging Systems

Parameter Estimation1. document-topic distribution θ(d)2. topic-word distribution φ (w)3. topic-tag distribution φ (t)4. user-perspective distribution θ(u)5. perspective-tag distribution ψ6. binomial distribution λ

Gibbs Sampling!

Topic-Perspective Model

Page 18: The Topic-Perspective Model for Social Tagging Systems

Parameter Estimation

Topic-Perspective Model

Page 19: The Topic-Perspective Model for Social Tagging Systems

Parameter Estimation

Topic-Perspective Model

Page 20: The Topic-Perspective Model for Social Tagging Systems

Parameter Estimation

Topic-Perspective Model

Page 21: The Topic-Perspective Model for Social Tagging Systems

Parameter Estimation

Topic-Perspective Model

Page 22: The Topic-Perspective Model for Social Tagging Systems

Datasets & Setup◦ A social tagging dataset from delicious.com after

refining 41190 documents 4414 users 28740 unique tags 129908uniques words

◦ 90% for training, 10% for testing

Experiments & Results

Page 23: The Topic-Perspective Model for Social Tagging Systems

Evaluation Criterion◦ Perplexity

Reflects the ability of the model to predict tags for new unseen documents

A lower perplexity score -> better generalization performance

Experiments & Results

Page 24: The Topic-Perspective Model for Social Tagging Systems

Number of topics vs. number of perspectives

Experiments & Results

Page 25: The Topic-Perspective Model for Social Tagging Systems

Tag Perplexity

Experiments & Results

Page 26: The Topic-Perspective Model for Social Tagging Systems

Discovered Topics ◦ Observed high semantic correlations among the

top words and tags for each topic

Experiments & Results

Page 27: The Topic-Perspective Model for Social Tagging Systems

Discovered Perspectives◦ Perspectives are more complicated than topics◦ The correlation among the tags assigned to

each perspective is not as obvious as those for topics

Experiments & Results

Page 28: The Topic-Perspective Model for Social Tagging Systems

The generation sources of tags

Experiments & Results

Tags are generated from the topics learned from the words

Tags are generated from user perspectives

Page 29: The Topic-Perspective Model for Social Tagging Systems

Apply the results of the model for

1. tag recommendation

2. Personalized information retrieval

Future Work

Page 30: The Topic-Perspective Model for Social Tagging Systems

Topic-perspective LDA model◦ Simulate the tag generation process in a more

meaningful way

◦ Tags are generated from both the topics in the document and the user perspective

◦ May be applied in computer vision tagging problems

Summary

Page 31: The Topic-Perspective Model for Social Tagging Systems

Thank you!

Question?