Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging Lei Wu *† , Steven C.H. Hoi * , Rong Jin # , Jianke Zhu , Nenghai Yu * Nanyang Technological University, University of Sci. & Tech. of China, # Michigan State University, ETH Zurich



Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging

Lei Wu *†, Steven C.H. Hoi*, Rong Jin#, Jianke Zhu‡, Nenghai Yu†

*Nanyang Technological University, †University of Sci. & Tech. of China, #Michigan State University, ‡ETH Zurich


BACKGROUND

Annotation/tagging is essential to making images accessible to Web users
Billions of images on the Web lack proper annotation/tags
Automatic image annotation has been actively studied in the multimedia community


BACKGROUND (cont’)

Social media data in social websites enjoy rich tagging information provided by Web users
Can we resolve the challenge of auto-photo annotation by leveraging the emerging huge amount of rich social media data?


BACKGROUND (cont’)

Annotation by Search from Social Images

[Figure: social images shown with their tag lists: (Sun, Bird, Sky, Blue, …), (Bird, Fly, White, Cloud, …), (Sun, Cloud, Hawk, Fly, …), (Hawk, Bird, Sky, Eagle, …)]


MOTIVATION

Annotation by Search
  Find similar images from the social image DB
  Annotate the image by the tags of high frequency
Research Challenges
  Visual feature representation
  Tag data mining
  Scalable search & indexing
  Distance/similarity measure
Distance Metric Learning


MOTIVATION (cont’)

Related Work of Automated Photo Tagging
  Built a large collection of web images with ground truth labels for helping object recognition research (Russell et al. 2008)
  A fast search-based approach for image annotation by some efficient hashing technique (Wang et al. 2006)
  Utilized visual and text modalities simultaneously in clustering images (Rege et al. 2008)
  Efficient image search and scene matching techniques for exploring a large-scale web image repository (Torralba et al. 2008)
  Learning-based method for improving the efficiency of manual image annotation (Yan et al. 2008)
Adopt Hamming or Euclidean distance


MOTIVATION (cont’)

Distance Measure
  Hamming distance
  Euclidean distance
  Mahalanobis distance
Distance Metric Learning
  Learning to optimize the metric M
  Side Information (a.k.a. “Pairwise Constraints”)
    Similar pairs S(x1, x2): x1 and x2 belong to the same category
    Dissimilar pairs D(x1, x2): x1 and x2 belong to different categories
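As an illustration of what "the metric M" refers to, the following minimal sketch computes a Mahalanobis distance sqrt((x1 - x2)^T M (x1 - x2)) in plain Python. The function name `mahalanobis` and the example matrices are hypothetical and chosen only for illustration; with M equal to the identity the distance reduces to the Euclidean distance.

```python
import math


def mahalanobis(x1, x2, M):
    """Return sqrt((x1 - x2)^T M (x1 - x2)) for a positive semi-definite M (list of rows)."""
    diff = [a - b for a, b in zip(x1, x2)]
    # M_diff[i] = sum_j M[i][j] * diff[j]
    M_diff = [sum(m_ij * d_j for m_ij, d_j in zip(row, diff)) for row in M]
    return math.sqrt(sum(d_i * md_i for d_i, md_i in zip(diff, M_diff)))


identity = [[1.0, 0.0], [0.0, 1.0]]    # M = I gives the Euclidean distance
stretched = [[4.0, 0.0], [0.0, 1.0]]   # hypothetical learned metric: first feature weighted more

x1, x2 = [0.0, 0.0], [3.0, 4.0]
d_euclid = mahalanobis(x1, x2, identity)     # sqrt(9 + 16)
d_learned = mahalanobis(x1, x2, stretched)   # sqrt(4 * 9 + 16)
```

Distance metric learning then amounts to choosing M so that similar pairs end up close and dissimilar pairs end up far apart under this distance.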


MOTIVATION (cont’)

Related Work on Distance Metric Learning
  Probabilistic Global Distance Metric Learning (PGDM) (Xing et al. 2002)
  Neighbourhood Components Analysis (NCA) (Goldberger et al. 2005)
  Relevance Component Analysis (RCA) (Bar-Hillel et al. 2005)
  Discriminative Component Analysis (DCA) (Hoi et al. 2006)
  Large Margin Nearest Neighbor (LMNN) (Weinberger et al. 2006)
  Regularized Distance Metric Learning (RDML) (Si et al. 2006)
  Information-Theoretic Metric Learning (ITML) (Davis et al. 2007)
Clean side information is given explicitly


MOTIVATION (cont’)

Annotation by Search from Social Media
  NO explicit pairwise side information available
  But rich information is available with social images
Ideas of our research
  To discover implicit pairwise relationships between social images via a probabilistic approach
  To learn effective distance metrics from uncertain side information that is discovered from social images implicitly


METRIC LEARNING FRAMEWORK FOR AUTOMATED PHOTO TAGGING

Overview of Our Approaches
  Discovery of probabilistic side information
    A Graphical Model Approach
  Learning distance metrics from probabilistic side information
    A Probabilistic RCA Method
  Automated photo tagging by applying the optimized metric in visual similarity search


METRIC LEARNING FRAMEWORK FOR AUTOMATED PHOTO TAGGING


Latent Chunklet Estimation for Probabilistic Side Info.

Problem Formulation
  Latent Chunklets
    i.e., the hidden topics
  Assumption
    Both visual images and associated textual metadata are generated from the hidden topic
  Calculation
    Multi-modal hidden topic analysis


Graphical Model for Latent Chunklet Estimation

[Figure: graphical model with the labels Text Modality, Visual Modality, and Hidden Topic]


Graphical Model for Latent Chunklet Estimation (cont’)

Generation Process

Inference

Probabilistic Side Info., as Prior Prob. Matrix


Probabilistic Distance Metric Learning

Problem Definition and Notations
  Probabilistic Side Info.:
    Centers/Means for the Latent Chunklets
    Membership Probability
  Given the estimation of latent chunklets P0, how to formulate the DML problem to find the optimal metric M?
  Propose an extension of RCA with Prob. Side Info.


Probabilistic Relevance Component Analysis (pRCA)

The objective function of pRCA:
  Minimize the sum of squared distances of examples from their chunklets’ centers
  Regularization preventing the trivial solution

Corollary 1. When fixing the means of chunklets μ and the matrix of probability assignments P (assuming hard assignments of 0 and 1), the Probabilistic Relevance Component Analysis (pRCA) formulation reduces to the regular RCA learning.


Probabilistic Relevance Component Analysis (pRCA)

Iterative algorithm
  Fixing P and μ to optimize M:
  Fixing M and μ to optimize P:
  Fixing P and M to find μ:
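The update formulas for these three steps did not survive in this transcript, so the following is only a hedged sketch of two of them, not the authors' update rules. It assumes the objective is the membership-weighted sum of squared Mahalanobis distances to the chunklet centers, with an RCA-style regularizer, so that the μ-step is a weighted mean and the M-step is the inverse of the weighted within-chunklet scatter matrix. The P-step is omitted because its form cannot be recovered from the slide. All function names and the toy data are hypothetical.

```python
def weighted_means(X, P):
    """mu-step: mu_k = sum_i P[i][k] * x_i / sum_i P[i][k]."""
    n, d, K = len(X), len(X[0]), len(P[0])
    means = []
    for k in range(K):
        total = sum(P[i][k] for i in range(n))
        means.append([sum(P[i][k] * X[i][j] for i in range(n)) / total
                      for j in range(d)])
    return means


def weighted_scatter(X, P, means):
    """C = (1/n) * sum_i sum_k P[i][k] * (x_i - mu_k)(x_i - mu_k)^T."""
    n, d = len(X), len(X[0])
    C = [[0.0] * d for _ in range(d)]
    for i in range(n):
        for k, mu in enumerate(means):
            diff = [X[i][j] - mu[j] for j in range(d)]
            for a in range(d):
                for b in range(d):
                    C[a][b] += P[i][k] * diff[a] * diff[b] / n
    return C


def invert(A):
    """Gauss-Jordan inverse with partial pivoting for a small nonsingular matrix."""
    d = len(A)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(d)]
           for i, row in enumerate(A)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(d):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]


def metric_step(X, P):
    """Assumed M-step: M is the inverse of the weighted within-chunklet scatter."""
    return invert(weighted_scatter(X, P, weighted_means(X, P)))


# Toy data: four 2-D examples and two chunklets (all values hypothetical).
X = [[0.0, 0.0], [2.0, 0.0], [10.0, 1.0], [10.0, 3.0]]
P_hard = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]   # 0/1 assignments
P_soft = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]   # uncertain assignments

M_hard = metric_step(X, P_hard)
M_soft = metric_step(X, P_soft)
```

With the 0/1 matrix `P_hard`, the scatter matrix is exactly the within-chunklet covariance used by regular RCA, which is the behaviour Corollary 1 describes for hard assignments.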


Probabilistic Relevance Component Analysis (pRCA)

pRCA Algorithm


Automated Photo Tagging

Query image
Steps of Auto Photo Tagging via Search
  Distance/Similarity Measure
  To retrieve a set of visually similar social photos
    Set of k-Nearest Neighbor Images
    Set of images with distance less than some threshold
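The two retrieval options above can be sketched as follows. This is a minimal brute-force illustration, assuming feature vectors stored as Python lists and a pluggable distance function; the names `knn_retrieve` and `range_retrieve` and the toy data are hypothetical, and a real system at the scale described later would need an index rather than a linear scan.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance, used here as the default distance measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_retrieve(query, database, k, dist=euclidean):
    """Indices of the k database items closest to the query."""
    ranked = sorted(range(len(database)), key=lambda i: dist(query, database[i]))
    return ranked[:k]


def range_retrieve(query, database, threshold, dist=euclidean):
    """Indices of all database items whose distance to the query is below the threshold."""
    return [i for i, item in enumerate(database) if dist(query, item) < threshold]


database = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.0, 2.0]]
query = [0.0, 0.5]

nearest = knn_retrieve(query, database, k=2)
within = range_retrieve(query, database, threshold=1.6)
```

A learned metric can be plugged in by passing a different `dist` function, for example a Mahalanobis distance with a fixed matrix M.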


Automated Photo Tagging (cont’)

Annotating the query photo by the relevant tags associated with the set of similar images
  A tag is more preferred if it has a higher frequency among the set of similar social images
  A tag is more preferred if its associated social images are visually more similar to the query photo
Our tagging approach
  Frequency of tag w among the retrieved social images
  Average distance from the query photo to the tag’s associated social images
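The exact scoring formula is not preserved in this transcript. The sketch below only illustrates the two stated preferences (higher frequency, smaller average distance) by dividing frequency by average distance; that particular combination, the function name `rank_tags`, and the toy neighbors are assumptions made for illustration.

```python
from collections import defaultdict


def rank_tags(neighbors, top_t=10, eps=1e-9):
    """Rank tags of retrieved images.

    neighbors: list of (distance_to_query, tags) pairs for the retrieved images.
    """
    freq = defaultdict(int)
    dist_sum = defaultdict(float)
    for distance, tags in neighbors:
        for tag in set(tags):          # count each tag once per image
            freq[tag] += 1
            dist_sum[tag] += distance
    scores = {}
    for tag in freq:
        avg_dist = dist_sum[tag] / freq[tag]
        # Assumed combination: frequent tags on close images score higher.
        scores[tag] = freq[tag] / (avg_dist + eps)
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_t]


neighbors = [
    (0.2, ["sky", "bird"]),
    (0.4, ["sky", "cloud"]),
    (0.9, ["bird", "hawk"]),
]
top_tags = rank_tags(neighbors, top_t=3)
```

In this toy case "sky" ranks first because it is both frequent and attached to the closest images, while "hawk" appears only once on the farthest image and drops out of the top three.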


EXPERIMENTS

Experimental Testbed
  A total of 205,442 photos from Flickr
    Distance Metric Learning: 16,588 photos + tags
    Knowledge Database: 186,854 photos + tags
    Query Image: 2,000 random photos
Compared Schemes:
  Relevance Component Analysis (RCA)
  Discriminative Component Analysis (DCA)
  Information-Theoretic Metric Learning (ITML)
  Large Margin Nearest Neighbor (LMNN)
  Neighbourhood Components Analysis (NCA)
  Regularized Distance Metric Learning (RDML)


EXPERIMENTS (cont’)

Settings:
  500 latent chunklets
  1,000 visual words
  10,000 tags
  Learning rate γ=0.5
  Top k nearest photos, k=30
  Top t relevant tags for annotation, t=1,…,10


Average Precision

Fixed the number of nearest neighbors k to 30 for all compared methods


Average Recall

Fixed the number of nearest neighbors k to 30 for all compared methods


Precision-Recall Curves

Fixed the number of nearest neighbors k to 30 for all compared methods
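The slides do not spell out how precision and recall are computed. A minimal sketch, assuming the usual definitions over the top t recommended tags of each query photo and a plain average over queries, could look like this; the function names and the toy results are hypothetical.

```python
def precision_recall_at_t(predicted, relevant, t):
    """Precision and recall of the top t predicted tags against the ground-truth tag set."""
    top = predicted[:t]
    hits = sum(1 for tag in top if tag in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_over_queries(results, t):
    """Average precision and recall at t over (predicted, relevant) pairs, one per query."""
    pairs = [precision_recall_at_t(p, r, t) for p, r in results]
    n = len(pairs)
    return sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n


results = [
    (["sky", "bird", "cloud"], {"sky", "cloud", "sun", "sea"}),
    (["tiger", "zoo", "cat"], {"tiger"}),
]
avg_p, avg_r = average_over_queries(results, t=2)
```

Sweeping t from 1 to 10 and plotting the resulting (recall, precision) pairs would give one precision-recall curve per compared method under these assumed definitions.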


Empirical Observations

DML techniques are beneficial and critical to the retrieval-based photo tagging tasks
In general, the pRCA algorithm considerably outperformed other approaches in most cases.
In some cases, some DML methods did not perform well and could be even worse than the Euclidean method.
  Noisy (uncertain) side information issue
  Robustness is important to DML


Evaluation of Varied k and t

Examine the annotation performance of pRCA by varying the value of k from 10 to 50


Empirical Observations

The number of nearest neighbors k can influence the annotation performance
  In our case, when k equals 30, the resulting performance is generally better than with other values
  With too large a k, many noisy tags may be included, as there may not exist many relevant images in the database.
  With too small a k, some relevant tags may not appear, which again may degrade the performance


Time Cost For Metric Learning

To evaluate the time efficiency of the proposed DML algorithm on the same dataset
Findings
  The most efficient method is the regular RCA approach
  The most time-consuming one is NCA
  pRCA is quite competitive: it is slower than RCA, DCA, and RDML, but considerably faster than ITML, LMNN, and NCA


Some Good Examples

Top Recommended Tags (one row per Query Photo)
  autumn, fall, forest, trees, nature, tree, wood, germany, path, creative
  sunset, clouds, sky, sea, beach, abigfave, sun, water, landscape, ocean
  tiger, zoo, specanimal, impressedbeauty, abigfave, nature, animal, cat, animals, aplusphoto
  garden, flowers, yellow, nature, hdr, nikon, spring, festival, impressed beauty


Some Poor Examples

Top Recommended Tags (one row per Query Photo)
  macro, nikon, bokeh, nature, flower, canon, storm, eos, plane, flickrsbest
  nikon, street, water, sport, blue, bike, lebanon, kids, eric mckenna, krissy mckenna
  winter, photography, art, beach, usa, fashion, portrait, travel, party, snow
  park, river, travel, trees, lake, hiking, winter, green, vacation, water


CONCLUSIONS

Contributions:

Study DML from uncertain side information that exploits probabilistic side information

Propose a two-step probabilistic distance metric learning (PDML) framework

Present an effective probabilistic RCA (pRCA) algorithm

Apply the algorithm to the auto photo annotation by search task

Encouraging results showed that our technique is effective and promising


Future Work

To improve visual feature representation, especially for annotating objects
To expand the scale of the database
To improve large-scale search & indexing
To filter spam and irrelevant tags
To adopt users’ feedback to improve automated tagging performance on APT


Q&A

More information is available: Http://www.cais.ntu.edu.sg/~chhoi/APT/
Online demo of Auto Photo Tagging (APT) is available: Http://msm.cais.ntu.edu.sg/APT/

Contact:
  WU Lei [email protected]
  Steven CH Hoi [email protected]
  School of Computer Engineering, Nanyang Technological University, Singapore 639798
  Email: [email protected]
  Tel: (+65) 6513-8040  Fax: (+65) 6792-6559
  Http://www.ntu.edu.sg/home/chhoi/


GRAPHICAL MODEL FOR LATENT CHUNKLET ESTIMATION

Inference
  Joint probability on documents and topics
  Conditional probability on tags, visual words and topics
  Gibbs sampling estimation


Appendix