Tag Ranking. Presented by Jie Xiao, Dept. of Computer Science, Univ. of Texas at San Antonio

Post on 23-Feb-2016

Page 1: Tag Ranking

Tag Ranking

Presented by Jie Xiao

Dept. of Computer Science

Univ. of Texas at San Antonio

Page 2: Tag Ranking

[email protected] 2

Outline

Problem

Probabilistic tag relevance estimation

Random walk tag relevance refinement

Experiment

Conclusion

Page 3: Tag Ranking

Problem

There are millions of social images on the Internet, which makes them attractive for research.

The tags associated with an image are not ordered by relevance.

Page 4: Tag Ranking

Problem (Cont.)

Page 5: Tag Ranking

Tag relevance

There are two types of relevance to be considered.

The relevance between a tag and an image

The relevance between two tags for the same image.

Page 6: Tag Ranking

Probabilistic Tag Relevance Estimation

Similarity between a tag and an image

x : an image

t : a tag associated with image x

P(t|x) : the probability of tag t given image x

P(t) : the prior probability of tag t occurring in the dataset

After applying Bayes’ rule, we can derive that
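A sketch of this step, assuming the relevance score (written s(t, x) here, a symbol that does not appear in the extracted text) is the ratio of P(t|x) to the prior P(t), so that tags that are frequent across the whole dataset do not dominate:

```latex
\begin{aligned}
s(t, x) &= \frac{p(t \mid x)}{p(t)} \\
        &= \frac{p(x \mid t)\, p(t)}{p(x)\, p(t)} \\
        &= \frac{p(x \mid t)}{p(x)}
\end{aligned}
```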

Page 7: Tag Ranking

Probabilistic Relevance Estimation (Cont)

Since the goal is to rank the tags of an individual image, and p(x) is identical for all of these tags, we refine it as
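Written out, and assuming the score before this step is s(t, x) = p(x|t) / p(x) (s is a hypothetical symbol, not taken from the extracted text), dropping the tag-independent factor p(x) gives a rank-equivalent score:

```latex
s(t, x) \doteq p(x \mid t)
```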

Page 8: Tag Ranking

Density Estimation

Let (x1, x2, …, xn) be an iid sample drawn from some distribution with an unknown density ƒ.

Two types of methods to describe the density:

Histogram

Kernel density estimator

Page 9: Tag Ranking

Histogram

Credit: All of Nonparametric Statistics via UTSA library
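As a rough illustration of the histogram approach, the sketch below counts how many sample points fall into each equal-width bin and divides by n times the bin width, so the estimate integrates to 1. The function name, the bin range, and the toy sample are assumptions made for this example.

```python
def histogram_density(sample, num_bins, low, high):
    """Return a function x -> estimated density using equal-width bins on [low, high)."""
    width = (high - low) / num_bins
    counts = [0] * num_bins

    def bin_index(value):
        # Guard against floating-point rounding at the upper edge.
        return min(int((value - low) / width), num_bins - 1)

    for value in sample:
        if low <= value < high:
            counts[bin_index(value)] += 1
    n = len(sample)

    def density(x):
        if not (low <= x < high):
            return 0.0
        # Fraction of the sample in this bin, divided by the bin width.
        return counts[bin_index(x)] / (n * width)

    return density


# Toy one-dimensional sample (hypothetical data).
sample = [0.1, 0.15, 0.2, 0.45, 0.5, 0.55, 0.6, 0.9]
f_hat = histogram_density(sample, num_bins=4, low=0.0, high=1.0)
```

The estimate is piecewise constant, which is the main motivation for moving to a smooth kernel estimator.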

Page 10: Tag Ranking

Kernel Density Estimation

A smooth function K (the kernel) is used to estimate the density.

Page 11: Tag Ranking

Kernel Density Estimation (Cont.)

Its kernel density estimator is
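The standard one-dimensional form, given here as a sketch: n is the sample size from the Density Estimation slide, and h > 0 is a bandwidth (smoothing) parameter, a symbol that does not appear in the extracted text.

```latex
\hat{f}_n(x) = \frac{1}{n h} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)
```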

Page 12: Tag Ranking

Probabilistic Relevance Estimation (Cont)

Kernel Density Estimation (KDE) is adopted to estimate the probability density function p(x|t).

Xi : the set of images containing tag ti

xk : the top k nearest-neighbor images in image set Xi

K : the kernel function used to estimate the density

|Xi| : the cardinality of Xi
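The definitions above suggest averaging a kernel over the images that carry a tag. The following is a minimal sketch under several assumptions: images are represented by small feature vectors, the kernel is an unnormalized Gaussian with a shared width sigma, and the names tag_relevance and rank_tags are hypothetical. Because sigma is shared by all tags, omitting the Gaussian normalizing constant does not change the resulting order.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Unnormalized Gaussian kernel on two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (sigma ** 2))


def tag_relevance(x, images_with_tag, sigma=1.0):
    """Estimate p(x | t) as the mean kernel value over the images carrying tag t."""
    if not images_with_tag:
        return 0.0
    total = sum(gaussian_kernel(x, xk, sigma) for xk in images_with_tag)
    return total / len(images_with_tag)


def rank_tags(x, tag_to_images, sigma=1.0):
    """Order the tags of image x by decreasing estimated relevance."""
    scores = {tag: tag_relevance(x, imgs, sigma) for tag, imgs in tag_to_images.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy 2-D features (hypothetical data): "cat" images lie close to x,
# while most "2008" images lie far away.
x = (0.0, 0.0)
tag_to_images = {
    "2008": [(3.0, 3.0), (0.0, 0.1), (5.0, 1.0)],
    "cat": [(0.1, 0.0), (0.0, 0.2)],
}
ranking = rank_tags(x, tag_to_images)
```

A tag whose images are visually close to x gets a high score even if the tag is rare, which is the behavior the probabilistic estimate is meant to produce.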

Page 13: Tag Ranking

Relevance between tags

ti : tag i associated with image x

tj : tag j associated with image x

[symbol missing] : the image set containing tag i

[symbol missing] : the image set containing tag j

N : the top N nearest neighbors of image x

Page 14: Tag Ranking

Relevance between tags (Cont.)

Page 15: Tag Ranking

Relevance between tags (Cont.)

Co-occurrence similarity between tags

f(ti) : the number of images containing tag ti

f(ti,tj) : the number of images containing both tag ti and tag tj

G : the total number of images in Flickr
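The three counts defined above are exactly the inputs of the Normalized Google Distance, so one plausible reading is that the co-occurrence similarity is exp(-d), where d is that distance. The sketch below assumes this form; the function names and the example counts are hypothetical.

```python
import math


def google_distance(f_i, f_j, f_ij, total):
    """Normalized Google Distance from tag counts (assumed form of the tag distance)."""
    log_i, log_j, log_ij = math.log(f_i), math.log(f_j), math.log(f_ij)
    return (max(log_i, log_j) - log_ij) / (math.log(total) - min(log_i, log_j))


def cooccurrence_similarity(f_i, f_j, f_ij, total):
    """Map the distance to a similarity in (0, 1]; tags that never co-occur get 0."""
    if f_ij == 0:
        return 0.0
    return math.exp(-google_distance(f_i, f_j, f_ij, total))


# Hypothetical counts: two tags on 100 images each, 10 images in common,
# out of 10,000 images in total.
similarity = cooccurrence_similarity(100, 100, 10, 10000)
```

Two tags that always appear together have distance 0 and similarity 1; the similarity decays as the overlap shrinks relative to the individual tag frequencies.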

Page 16: Tag Ranking

Relevance between tags (Cont.)

Page 17: Tag Ranking

Relevance between tags (Cont.)

Relevance score between two tags

where

Page 18: Tag Ranking

Random walk over tag graph

P : n-by-n transition matrix

pij : the probability of the transition from node i to node j

rk(j) : relevance score of node j at iteration k
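A minimal sketch of such a walk, assuming the common restart form r_k(j) = alpha * sum_i r_{k-1}(i) * pij + (1 - alpha) * v_j, where v holds the initial tag-image relevance scores and P is obtained by row-normalizing the tag-tag relevance matrix. The weight alpha, the vector v, the iteration count, and the toy numbers are all assumptions made for illustration.

```python
def row_normalize(similarity):
    """Turn a non-negative similarity matrix into a row-stochastic transition matrix."""
    matrix = []
    for row in similarity:
        total = sum(row) or 1.0  # leave an all-zero row as zeros
        matrix.append([value / total for value in row])
    return matrix


def random_walk(similarity, initial, alpha=0.5, iterations=100):
    """Iterate r_k(j) = alpha * sum_i r_{k-1}(i) * p_ij + (1 - alpha) * v_j."""
    n = len(initial)
    p = row_normalize(similarity)
    r = list(initial)
    for _ in range(iterations):
        r = [
            alpha * sum(r[i] * p[i][j] for i in range(n)) + (1 - alpha) * initial[j]
            for j in range(n)
        ]
    return r


# Toy tag graph (hypothetical data): tags 0 and 1 are strongly related,
# tag 2 is only weakly related to either of them.
similarity = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
initial = [0.5, 0.3, 0.4]
scores = random_walk(similarity, initial)
```

In this toy graph, tag 1 starts below tag 2 but ends above it, because it is strongly related to the highly scored tag 0. Promoting tags supported by related tags is the effect the refinement step is after.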

Page 19: Tag Ranking

Random walk

Page 20: Tag Ranking

Random walk over tag graph (Cont.)

Page 21: Tag Ranking

Experiments

Dataset: 50,000 images crawled from Flickr

Popular tags:

Raw tags: more than 100,000 unique tags

Filtered tags: 13,330 unique tags

Page 22: Tag Ranking

Performance Metric

Normalized Discounted Cumulative Gain (NDCG)

r(i) : the relevance level of the i - th tag

Zn : a normalization constant chosen so that the optimal ranking's NDCG score is 1.
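A short sketch of the metric, assuming the widely used gain (2^r(i) - 1) and a logarithmic position discount log(1 + i). The base-2 logarithm is an assumption; the base cancels in the normalized score. Zn is realized here as one over the DCG of the ideal ordering.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance levels."""
    return sum(
        (2 ** r - 1) / math.log2(1 + i)
        for i, r in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG divided by the DCG of the ideal ordering (this plays the role of Zn)."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Relevance levels of a tag list in ranked order (hypothetical labels).
perfect = ndcg([3, 2, 1])
reversed_order = ndcg([1, 2, 3])
```

A ranking that places the most relevant tags first scores 1, and any ordering that pushes relevant tags down the list scores strictly less.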

Page 23: Tag Ranking

Experimental Result

Comparison among different tag ranking approaches

Page 25: Tag Ranking

Conclusion

Estimate the tag-image relevance by kernel density estimation.

Estimate the tag-tag relevance by visual similarity and tag co-occurrence.

A random-walk-based approach is used to refine the ranking.

Page 26: Tag Ranking

Thank you!