joint unsupervised learning of deep representations and image clusters

35
Joint Unsupervised Learning of Deep Representations and Image Clusters Jianwei Yang , Devi Parikh , Dhruv Batra [arXiv ] [GitHub ] Slides by Albert Jiménez [GDoc ] Computer Vision Reading Group (23/09/2016)

Upload: xavier-giro

Post on 15-Apr-2017

165 views

Category:

Data & Analytics


2 download

TRANSCRIPT

Page 2: Joint unsupervised learning of deep representations and image clusters

1.IntroductionThe main idea

2

Page 3: Joint unsupervised learning of deep representations and image clusters

Learning good image representations is beneficial to image clustering

Main Idea ( 1 )Joint Unsupervised Learning of representations and image clusters

3

Page 4: Joint unsupervised learning of deep representations and image clusters

Clustering provide supervisory signals for image representation

Main Idea ( 2 )Joint Unsupervised Learning of representations and image clusters

4

Page 5: Joint unsupervised learning of deep representations and image clusters

Integrate both processes into a single model

● Unified triplet loss

● End-to-end training

Network Clustering

Forward - Update Clustering

Backward - Update Representation parameters

Main Idea ( 3 )Joint Unsupervised Learning of representations and image clusters

5

Page 6: Joint unsupervised learning of deep representations and image clusters

2.ClusteringAgglomerative Clustering

6

Page 7: Joint unsupervised learning of deep representations and image clusters

Agglomerative ClusteringWhat?

● At the beginning each image is a cluster

● Merge two clusters at each timestep

Affinity Measure

7

Page 8: Joint unsupervised learning of deep representations and image clusters

Agglomerative ClusteringWhy?

● Recurrent process → Implemented in recurrent framework

● Begins with over-clustering (overcome bad initial representations)

● Clusters are merged as better representations are learned

8

Page 9: Joint unsupervised learning of deep representations and image clusters

Agglomerative ClusteringAffinity Measure

● Directed graph

● Affinity matrix corresponding to the edge set

Ks → # Nearest Neighbors

NiKs → Ks NN of Xi

xi,j → Image Representation (vertex)

ns → # Samples

a → Design Parameter = 1 9

Page 10: Joint unsupervised learning of deep representations and image clusters

Agglomerative ClusteringAffinity Measure

● Affinity matrix corresponding to the edge set:

● Affinity measure:

W. Zhang, X. Wang, D. Zhao, and X. Tang. Graph degree linkage: Agglomerative clustering on a directed graph.

10

Page 11: Joint unsupervised learning of deep representations and image clusters

3.System FormulationJoint Optimization

11

Page 12: Joint unsupervised learning of deep representations and image clusters

System FormulationRecurrent Framework

At timestep t:

ht → Image cluster labelsyt → OutputXt → Image representations

12

Page 13: Joint unsupervised learning of deep representations and image clusters

System FormulationRecurrent Framework

● Unrolling strategy:○ Complete○ Partial

■ Split the overall T timesteps into multiple periods

■ In each period we merge a number of clusters and update CNN parameters

13

Page 14: Joint unsupervised learning of deep representations and image clusters

Unrolling rate : control the number of timesteps

ncs : number of clusters at the start of p

Number of timesteps:

System FormulationAlgorithm

14

Page 15: Joint unsupervised learning of deep representations and image clusters

System FormulationObjective Function

● Accumulate the losses from all the timesteps

● For y0 we take each image as a cluster, at time t we merge 2 clusters given yt-1

● In conventional agglomerative clustering○ 2 clusters are selected by maximal Affinity measure over pairs of clusters

● In this approach○ Consider the local structure surrounding the clusters

15

Page 16: Joint unsupervised learning of deep representations and image clusters

System FormulationObjective Function

● Assuming that from yt-1 to yt we have merged a cluster Cit and its NN

Affinity between cluster Ci and NN

Difference of affinity between cluster Ci and NN and

affinities of Ci with its Kc neighbor clusters

16

Page 17: Joint unsupervised learning of deep representations and image clusters

System FormulationObjective Function

● Introducing this loss term:

○ Consider the local structure

surrounding the clusters○ Allows to write the loss in terms of

triplets

17

Page 18: Joint unsupervised learning of deep representations and image clusters

System FormulationForward/Backward Pass

Network Clustering

Forward - Update Clustering

Backward - Update Representation parameters

18

Page 19: Joint unsupervised learning of deep representations and image clusters

System FormulationForward Pass

● In the forward pass of the p-th period p∈(1, ... , P), update the clusters with the network parameters fixed

Overall loss for period p

19

Page 20: Joint unsupervised learning of deep representations and image clusters

● In the forward pass of the p-th period p∈(1, ... , P), we have merged a number of clusters

● In the backward pass we aim to derive the optimal parameters to minimize the losses generated in the forward pass.

System FormulationBackward Pass

We accumulate the losses of p periods

20

Page 21: Joint unsupervised learning of deep representations and image clusters

● Loss defined on clusters○ Need the entire dataset to estimate

○ Difficult to use batch-based optimization

● Sample based loss approximation

System FormulationFrom cluster based loss to sample based

21

Page 22: Joint unsupervised learning of deep representations and image clusters

● Sample based loss intuition○ Agglomerative clustering starts with each datapoint as a cluster

○ Clusters at higher level hierarchy are formed by merging lower level clusters

System FormulationFrom cluster based loss to sample based

22

Page 23: Joint unsupervised learning of deep representations and image clusters

System FormulationFrom cluster based loss to sample based

● Affinities between clusters can be expressed in terms of affinities with datapoints

For more info check supplement of the paper

23

Page 24: Joint unsupervised learning of deep representations and image clusters

● Finally it has got the form of a weight triplet loss

System FormulationFrom cluster based loss to sample based

is a weight whose value depends on

xi and xj are from the same cluster

xk is from the Kc nearest neighbouring clusters

24

Page 25: Joint unsupervised learning of deep representations and image clusters

● Given a dataset with ns samples and n

c number of clusters, T = n

s -n

c timesteps

○ Time consuming optimization in large datasets

● Run a fast algorithm to determine initial clustering○ Agglomerative clustering via maximum incremental path integral

System FormulationOptimization

25

Page 26: Joint unsupervised learning of deep representations and image clusters

4.Experiments

26

Page 27: Joint unsupervised learning of deep representations and image clusters

● They use Caffe/Torch● Convolutional layers of 50 channels, 5x5 filters, stride = 1, padding = 0● Pooling layers, 2x2 kernel, stride = 2● Final size of the output feature map is 10x10

ExperimentsExperimental Setup

Face datasets

27

Page 28: Joint unsupervised learning of deep representations and image clusters

● On top of all the CNNs they append a IP layer (dimension 160)

● Followed by l2 norm layer● wt-loss = weight triplet loss

ExperimentsExperimental Setup

28

Page 29: Joint unsupervised learning of deep representations and image clusters

ExperimentsRobustness Analysis

29

Page 30: Joint unsupervised learning of deep representations and image clusters

ExperimentsQuantitative Comparison Using Image Intensities as an Input

Measure = NMI = Normalized Mutual Information30

Page 31: Joint unsupervised learning of deep representations and image clusters

ExperimentsQuantitative Comparison of the Other State-of-the-Art Using their Representations as Input

31

Page 32: Joint unsupervised learning of deep representations and image clusters

ExperimentsTransfer Learning

32

Page 33: Joint unsupervised learning of deep representations and image clusters

ExperimentsFace Verification

● Representations learn from Youtube-Face dataset evaluated on LFW

33

Page 34: Joint unsupervised learning of deep representations and image clusters

5.Conclusions

34

Page 35: Joint unsupervised learning of deep representations and image clusters

Conclusions

● They combine agglomerative clustering with CNNs and optimize jointly in a recurrent framework.

● Proposal of a partial unrolling strategy to divide the timesteps into multiple periods.

● Obtain more precise image clusters and discriminative representations. ● Able to generalize well across many datasets and tasks.

35