joint image clustering and labeling by matrix factorization seunghoon hong cv lab., postech

Post on 24-Dec-2015


Joint Image Clustering and Labeling by Matrix Factorization

Seunghoon Hong, CV Lab., POSTECH

Motivation

• OBJECTIVE: From a set of unlabeled, unorganized images, we want to find meaningful clusters and the labels associated with each cluster.

[Figure: unorganized images and a Ref-DB; the resulting clusters are labeled “Zoo”; “Park”, “Tree”; and “Apple”, “Pear”.]

Approaches

• OPT 1. Supervised learning
– Learn all possible categories from the Ref-DB and apply the model to the test images.
– PROBLEMS
  • Learning a model for a large number of categories is difficult.
  • More importantly, it may not be necessary.

Approaches

• OPT 2. k-NN based approach
– Perform a k-NN search on the Ref-DB for each individual test image to obtain its labels.
– PROBLEMS
  • The obtained labels are noisy because:
    – the k-NNs are found based on visual similarity
    – the k-NNs are found for each test image independently

Approaches

• OPT 3. Cluster the test images and obtain labels for each cluster
– PROBLEMS
  • Images in each cluster may not be semantically related.
  • Finding labels for each cluster may not be trivial.

Proposed method

[Figure: pipeline overview with Ref-DB, visual feature, word feature, SO-NMF, and joint clustering and annotation; example labels: car, dog, bird, deer.]

• Obtain human-interpretable mid-level features (word features) for the test images by k-NN search on the Ref-DB.

• Cluster the test images with suitable constraints on the mid-level semantic features.

• Assign labels to each cluster directly from the mid-level features.

• BENEFITS
– Clustering takes the semantic relationships between images into account.
– Candidate labels are bounded by the test set.
– (Extension: learn the relevant concepts and do classification.)

Step 1. Word feature extraction

[Figure: pipeline overview with Ref-DB, visual feature, word feature, SO-NMF, and joint clustering and annotation.]

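Word-feature Construction

Procedure
1. Extract the k nearest neighbors from the database.
2. Construct a weighted histogram based on the labels of the k-NNs.

Transform the feature domain from visual space to word space.

Given: input image $x_i \in \mathbb{R}^d$; image in the RDB $y_j \in \mathbb{R}^d$; label vector of $y_j$, $l_j \in \mathbb{R}^t$.

1. For each input image $x_i$:
(a) Find the k nearest neighbors of $x_i$, $N_i = \{y_{i,1}, \ldots, y_{i,k}\}$, and their corresponding labels, $L_i = \{l_{i,1}, \ldots, l_{i,k}\}$.
(b) Compute the weight of each nearest neighbor as

$$w(y_{i,j}) = \frac{\exp(-\|x_i - y_{i,j}\| / \sigma)}{\sum_{m=1}^{k} \exp(-\|x_i - y_{i,m}\| / \sigma)},$$

where $\sigma$ is a constant and $j = 1, \ldots, k$. (The symbol and exact placement of the constant are not legible in the transcript; $\sigma$ as a scale on the distance is assumed here.)
(c) Generate the word feature for $x_i$ by a weighted sum of the label vectors in $L_i$ as

$$v_i = \sum_{j=1}^{k} w(y_{i,j}) \, l_{i,j}.$$

(The slide's formula also contains a norm $\|\cdot\|$ whose argument is not recoverable from the transcript.)

2. Construct the $n \times t$ word frequency matrix $V = [v_1, \ldots, v_n]^T$.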
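The word-feature procedure can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' code: the function name `word_features`, the brute-force neighbor search, the Euclidean distance, and the kernel `exp(-dist / sigma)` with `sigma` as the constant are all assumptions made for the example.

```python
import numpy as np

def word_features(X, Y, L, k=2, sigma=1.0):
    """Map visual features to word space via k-NN on a reference database.

    X: (n, d) test-image features, Y: (m, d) Ref-DB features,
    L: (m, t) label vectors of the Ref-DB images.
    Returns the (n, t) word frequency matrix V.
    """
    X, Y, L = (np.asarray(a, dtype=float) for a in (X, Y, L))
    V = np.zeros((X.shape[0], L.shape[1]))
    for i, x in enumerate(X):
        dist = np.linalg.norm(Y - x, axis=1)        # distance to every Ref-DB image
        nn = np.argsort(dist)[:k]                   # (a) indices of the k nearest neighbors
        # (b) assumed kernel; subtracting the minimum distance only improves
        #     numerical stability and cancels out in the normalization
        w = np.exp(-(dist[nn] - dist[nn].min()) / sigma)
        w /= w.sum()                                # weights sum to one
        V[i] = w @ L[nn]                            # (c) weighted sum of label vectors
    return V

# Toy Ref-DB: two "car" images near the origin, two "dog" images near (10, 10).
Y = [[0, 0], [0, 1], [10, 10], [10, 11]]
L = [[1, 0], [1, 0], [0, 1], [0, 1]]
V = word_features([[0, 0.5], [10, 10.5]], Y, L, k=2)
print(V)
```

Each row of `V` is a histogram over the label vocabulary, so images with different appearance but similar neighbor labels end up close together in word space.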

Step 2. Clustering

[Figure: pipeline overview with Ref-DB, visual feature, word feature, SO-NMF, and joint clustering and annotation.]

Step 2. Clustering - NMF

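• Matrix factorization

$$\min_{W,H} \|V - WH\| \quad \text{s.t. } W, H \ge 0$$

(The slide writes this factorization as $X \approx UV^T$ with $U, V \ge 0$; the notation here follows the later slides, where $V$ is the word frequency matrix.)

[Figure: factorization example over tag 1, tag 2, tag 3.]

By low-rank approximation, noise is cleaned up and the result becomes more homogeneous.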
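The slides do not say which solver is used for plain NMF. As one common choice, the sketch below uses the multiplicative updates of Lee and Seung for the Frobenius objective, written in the $V \approx WH$ notation of the later slides. The function name `nmf`, the iteration count, and the small `eps` guard are assumptions for illustration.

```python
import numpy as np

def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative (n, t) matrix V into W (n, k) and H (k, t)."""
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    n, t = V.shape
    W = rng.random((n, k)) + eps   # random positive initialization
    H = rng.random((k, t)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy word-frequency matrix: images 0-1 share words 0-1, images 2-3 share words 2-3.
V_toy = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
W_toy, H_toy = nmf(V_toy, k=2)
print(np.linalg.norm(V_toy - W_toy @ H_toy))  # reconstruction error
```

Rows of `H_toy` play the role of cluster bases over words, and rows of `W_toy` give each image's membership in those clusters.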

Step 2. Clustering - NMF

[Figure: two panels with word axis labels Bird, Snail, Car, Dog, Cat, Wardrobe, Television, ...]

Step 2. Clustering - NMF

• Limitation of NMF
– Basis components of diverse forms can be found.

Such bases do not help to find the concepts relevant to the dataset.

Need to find a sparse basis.

Step 2. Clustering - NMF

• Limitation of NMF
– The membership of each data point can be diverse, e.g. [0.6, 0.4] for one data point and [0.9, 0.1] for another.

Step 2. Clustering - NMF

Desirable properties of clusters in word-feature space:

(Sparseness): Each cluster should be associated with a small number of representative keywords.
(Orthogonality): Each data point should be associated with one cluster.

Step 2. Clustering – NMFSC

• NMF with sparseness constraints (Hoyer, 2004)

• Step 1: Initialize W and H to random positive matrices.

• Step 2: If sparseness constraints apply to W, H, or both, project each column of W or each row of H, respectively, to have unchanged L2 norm and the desired L1 norm.

• Step 3: Iterate (A is the data matrix being factorized, A ≈ WH)
– If sparseness constraints on W apply:
  Set $W = W - \mu_W (WH - A) H^T$
  Project the columns of W as in Step 2
  Else, take a standard multiplicative step
– If sparseness constraints on H apply:
  Set $H = H - \mu_H W^T (WH - A)$
  Project the rows of H as in Step 2
  Else, take a standard multiplicative step
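The "desired L1 norm" in Step 2 comes from Hoyer's sparseness measure, which relates the L1 and L2 norms of a vector: it is 1 for a vector with a single non-zero entry and 0 for a vector whose entries are all equal. The sketch below computes that measure and the L1 norm a vector must have to reach a target sparseness at fixed L2 norm. The function names are assumptions, and the projection operator itself is not reproduced here.

```python
import numpy as np

def sparseness(x):
    """Hoyer's sparseness: (sqrt(n) - L1/L2) / (sqrt(n) - 1), in [0, 1]."""
    x = np.asarray(x, dtype=float)
    n = x.size
    l1 = np.abs(x).sum()
    l2 = np.sqrt((x ** 2).sum())
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

def l1_target(l2, n, s):
    """L1 norm that gives sparseness s for an n-vector with L2 norm l2."""
    return l2 * (np.sqrt(n) - s * (np.sqrt(n) - 1))

one_hot = [0.0, 0.0, 1.0, 0.0]
uniform = [1.0, 1.0, 1.0, 1.0]
print(sparseness(one_hot), sparseness(uniform))
print(l1_target(l2=1.0, n=4, s=0.5))
```

Fixing the L2 norm and lowering the L1 norm toward `l1_target` is what pushes a column of W or a row of H toward fewer non-zero entries.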

Step 2. Clustering - ONMF

• Algorithms for orthogonal nonnegative matrix factorization (S. Choi, 2008)
– Optimize NMF under an orthogonality constraint, on the Stiefel manifold.
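One way to make the orthogonality constraint concrete is a multiplicative update derived from the natural gradient on the Stiefel manifold. The sketch below is written in the $V \approx WH$ notation of the neighboring slides, with the constraint $W^T W = I$ placed on W. The function names, the iteration count, and `eps` are assumptions, and the update rule should be checked against the cited paper before use.

```python
import numpy as np

def onmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Sketch of NMF with multiplicative updates that favor W^T W = I."""
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    n, t = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, t)) + eps
    for _ in range(n_iter):
        # Update for W from the gradient on the Stiefel manifold:
        # W <- W * (V H^T) / (W H V^T W)
        W *= (V @ H.T) / (W @ H @ V.T @ W + eps)
        # Standard multiplicative update for the unconstrained factor H.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return W, H

def orthogonality_gap(W):
    """Frobenius distance between W^T W and the identity."""
    return np.linalg.norm(W.T @ W - np.eye(W.shape[1]))

V_blocks = np.array([[1, 1, 0, 0],
                     [1, 1, 0, 0],
                     [0, 0, 1, 1],
                     [0, 0, 1, 1]], dtype=float)
W_o, H_o = onmf(V_blocks, k=2)
print(orthogonality_gap(W_o))
```

With non-negative entries, orthogonal columns of W cannot overlap in their non-zero rows, which is why this constraint pushes each image toward a single cluster.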

Step 2. Clustering - NMF

• So can we simply integrate these two constraints?

NO! There are three constraints on two variables:

$$\min_{W,H} \|V - WH\| \quad \text{s.t. } W, H \ge 0, \; W^T W = I, \text{ and } H \text{ is sparse}$$

Introduce an additional variable to make up for the error in the original objective function.

Step 2. Clustering - SONMF

• Sparse Orthogonal NMF (SO-NMF)

$$\min_{W,S,H} \|V - WSH\| \quad \text{s.t. } W, H \ge 0, \; W^T W = I, \text{ and } H \text{ is sparse}$$

Step 2. Clustering - NMF

• Final labeling
– Per cluster
– Per image
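The labeling formulas themselves are not preserved in the transcript. The sketch below shows one plausible reading, stated as an assumption: each image goes to the cluster with the largest entry in its row of W, each cluster takes the highest-weighted words in its row of H, and each image inherits the labels of its cluster. All function names and the toy values are hypothetical.

```python
import numpy as np

def assign_clusters(W):
    """Hard cluster index per image: the largest entry in each row of W."""
    return np.argmax(np.asarray(W, dtype=float), axis=1)

def cluster_labels(H, vocab, top=2):
    """Highest-weighted words of each cluster (one row of H per cluster)."""
    H = np.asarray(H, dtype=float)
    return [[vocab[j] for j in np.argsort(-row)[:top]] for row in H]

vocab = ["car", "dog", "bird"]
W_demo = [[0.9, 0.1], [0.2, 0.7], [0.8, 0.0]]   # 3 images, 2 clusters
H_demo = [[0.9, 0.1, 0.0], [0.0, 0.3, 0.8]]     # 2 clusters, 3 words

clusters = assign_clusters(W_demo)
labels = cluster_labels(H_demo, vocab, top=1)   # labels per cluster
image_labels = [labels[c] for c in clusters]    # labels per image (assumed rule)
print(clusters, labels, image_labels)
```

Because H is constrained to be sparse, only a few words carry weight in each row, so the top-ranked words give a short label list per cluster.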

Experiment result

• Experiment settings: two datasets

1. CIFAR-100 (100 categories)
2. ImageNet (30 categories, challenging variation)

                  CIFAR-100               ImageNet
Categories        100                     30 (more variation)
Test images       10K                     10K
Training images   50K                     50K
Features          gist, color histogram   gist, bag of words (SIFT)

Experiment result - Categorization Performance

Experiment result - Effect of RDB quality

[Figure: categorization accuracy in the presence of missing labels in the Ref-DB.]

[Figure: categorization accuracy in the presence of incorrect labels in the Ref-DB.]

[Figure: categorization accuracy when varying the number of clusters.]


Experiment result - Labeling Performance

[Figures: quality of cluster labels; quality of image labels.]

Experiment result - Extension to Supervised Image Classification

Extension with extracted labels: train a supervised method only on the categories relevant to the dataset; in other words, bound the candidate classes for the test image set.

[Figures: example confusion matrix of CIFAR-100; example confusion matrix of ImageNet.]

Thank you
