Convolutional Patch Representations for Image Retrieval: an Unsupervised Approach

29th Mar 2016

Original slides by Eva Mohedano, Insight Centre for Data Analytics (Dublin City University)

Mattis Paulin, Julien Mairal, Matthijs Douze, Zaid Harchaoui, Florent Perronnin, Cordelia Schmid

Overview

Published at ICCV 2015 (a.k.a. “Local Convolutional Features with Unsupervised Training for Image Retrieval”)

Deep Convolutional Architecture to produce patch-level descriptors

• Unsupervised framework

• Comparison in patch and retrieval datasets

• “RomePatches” dataset

Related Work

• Shallow patch descriptors

• Deep learning for image retrieval

• Deep patch descriptors

Related Work

• Shallow patch descriptors

SIFT – Scale-Invariant Feature Transform

- stereo matching

- retrieval

- classification

SURF, BRIEF, LIOP, (…)

Hand crafted → Relatively small number of parameters.

Note: A patch is an image region extracted from an image.

Related Work

• Deep learning for image retrieval

A CNN trained on a sufficiently large labeled dataset (ImageNet) produces intermediate-layer activations that can be used as image descriptors.

Those descriptors work for a wide variety of tasks, including image retrieval.

source image: http://pubs.sciepub.com/ajme/2/7/9/

Fully connected layers → Global Image Descriptors

● Compact representation

● Lack of geometric invariance

Below the state of the art in image retrieval

Compute at different scales (Babenko, Razavian)

Convolutional layers

Related Work

• Deep patch descriptors

Three different kinds of supervision:

1. Category labels of ImageNet. [Long et al, 2014]

2. Surrogate patch labels: Each class is a given patch under different transformations [Fischer et al, 2014]

3. Matching/non-matching pairs. [Simo-Serra et al, 2015]

These works focus on patch-level metrics, not image retrieval.

All of these approaches require some kind of supervision.
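
The surrogate-label idea (item 2 above) can be illustrated with a small NumPy sketch: every seed patch becomes its own class, and the training samples of that class are randomly transformed copies of it. The transformations chosen here (flip, circular shift, brightness/contrast change) and the helper names `random_transform` and `make_surrogate_dataset` are illustrative assumptions, not the augmentation set used by Fischer et al.

```python
import numpy as np

def random_transform(patch, rng):
    """Return a randomly perturbed copy of a greyscale patch with values in [0, 1]."""
    out = patch.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]                              # horizontal flip
    dy, dx = rng.integers(-3, 4, size=2)                # shift of up to 3 pixels
    out = np.roll(out, (int(dy), int(dx)), axis=(0, 1)) # circular shift (a simplification)
    gain = rng.uniform(0.8, 1.2)                        # contrast change
    bias = rng.uniform(-0.1, 0.1)                       # brightness change
    return np.clip(gain * out + bias, 0.0, 1.0)

def make_surrogate_dataset(seed_patches, n_views, seed=0):
    """Build (samples, labels) where the label is the index of the seed patch."""
    rng = np.random.default_rng(seed)
    samples, labels = [], []
    for class_id, patch in enumerate(seed_patches):
        for _ in range(n_views):
            samples.append(random_transform(patch, rng))
            labels.append(class_id)
    return np.stack(samples), np.array(labels)

# Toy data: 5 random 32x32 "patches", 8 transformed views each.
seed_patches = np.random.default_rng(1).random((5, 32, 32))
X, y = make_surrogate_dataset(seed_patches, n_views=8)
```

A classifier trained on `(X, y)` never sees a human-provided label, but the construction of the classes is still a form of supervision, which is the point made on this slide.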

Image Retrieval Pipeline

• Interest point detection

Hessian-Affine detector.

Rotation invariance.

• Interest point description

Feature representation in a Euclidean space

• Patch Matching

VLAD encoding.

Power normalization with exponent 0.5 + L2-norm.
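
The encoding step above can be sketched in NumPy as follows. The centroids would normally come from k-means on a set of training descriptors; here they are simply assumed to be given, and `vlad_encode` is a hypothetical helper name rather than code from the paper.

```python
import numpy as np

def vlad_encode(descriptors, centroids, power=0.5):
    """Encode local descriptors (n, d) against centroids (k, d) into a (k*d,) VLAD vector."""
    # Assign every descriptor to its nearest centroid.
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)

    # Accumulate residuals (descriptor minus centroid) per centroid.
    k, d = centroids.shape
    vlad = np.zeros((k, d))
    for j in range(k):
        members = descriptors[nearest == j]
        if len(members):
            vlad[j] = (members - centroids[j]).sum(axis=0)
    vlad = vlad.ravel()

    # Power normalization (exponent 0.5 by default) followed by L2 normalization.
    vlad = np.sign(vlad) * np.abs(vlad) ** power
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

# Toy usage: two "images" described by random 8-dimensional patch descriptors.
rng = np.random.default_rng(0)
centroids = rng.standard_normal((4, 8))
query = vlad_encode(rng.standard_normal((50, 8)), centroids)
database = vlad_encode(rng.standard_normal((60, 8)), centroids)
similarity = float(query @ database)  # cosine similarity, since both vectors are unit-norm
```

Because the final vectors are L2-normalized, ranking database images by dot product with the query is equivalent to ranking them by Euclidean distance.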

Convolutional Descriptors

Patch size = 51x51, optimal for SIFT on the Oxford dataset.

CNN extended to retrieval by:

• Encoding local descriptors with a model trained on an unrelated classification task

• Devising a surrogate classification problem that is as related as possible to image retrieval

• Using unsupervised learning: Convolutional Kernel Network

Feature representation based on a kernel (feature) map. Data independent.

Projection into a Hilbert space.

An explicit kernel map can be computed to approximate it for computational efficiency.

- Sub-sample of patches

- Stochastic Gradient Optimization
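
A rough sketch of how such an explicit map can be fitted, following the approximation scheme usually described for CKNs: the Gaussian kernel exp(-||x - y||² / (2σ²)) between two sub-patches is approximated by Σ_l η_l exp(-||x - w_l||² / σ²) exp(-||y - w_l||² / σ²), and the parameters (W, η) are learned by SGD on sampled pairs. The toy data, the way pairs are formed, the initialization, σ and the learning rate below are all assumptions made for illustration; inputs are assumed to be unit-normalized.

```python
import numpy as np

def sq_dists(X, W):
    """Pairwise squared Euclidean distances between rows of X (b, d) and W (m, d)."""
    return ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)

def feature_map(X, W, eta, sigma):
    """Explicit map psi(x)_l = sqrt(eta_l) * exp(-||x - w_l||^2 / sigma^2)."""
    return np.sqrt(eta) * np.exp(-sq_dists(X, W) / sigma**2)

def loss(X, Y, W, eta, sigma):
    """Mean squared gap between the Gaussian kernel and <psi(x), psi(y)>."""
    target = np.exp(-((X - Y) ** 2).sum(axis=1) / (2 * sigma**2))
    approx = (feature_map(X, W, eta, sigma) * feature_map(Y, W, eta, sigma)).sum(axis=1)
    return np.mean((approx - target) ** 2)

def sgd_step(X, Y, W, eta, sigma, lr):
    """One gradient step on a mini-batch of pairs; eta is kept non-negative."""
    A = np.exp(-sq_dists(X, W) / sigma**2)
    B = np.exp(-sq_dists(Y, W) / sigma**2)
    target = np.exp(-((X - Y) ** 2).sum(axis=1) / (2 * sigma**2))
    resid = (A * B) @ eta - target
    b = len(X)
    grad_eta = 2.0 / b * (resid @ (A * B))
    C = resid[:, None] * (A * B) * eta[None, :]
    grad_W = 4.0 / (sigma**2 * b) * (C.T @ (X + Y) - 2.0 * C.sum(axis=0)[:, None] * W)
    return W - lr * grad_W, np.maximum(eta - lr * grad_eta, 0.0)

def unit_rows(Z):
    return Z / np.linalg.norm(Z, axis=1, keepdims=True)

# Toy "sub-patches": 512 pairs of nearby unit vectors in 16 dimensions.
rng = np.random.default_rng(0)
X = unit_rows(rng.standard_normal((512, 16)))
Y = unit_rows(X + 0.3 * rng.standard_normal((512, 16)))

W, eta, sigma = X[:8].copy(), np.full(8, 1.0 / 8), 0.8
for _ in range(100):
    idx = rng.choice(len(X), size=64, replace=False)   # mini-batch of pairs
    W, eta = sgd_step(X[idx], Y[idx], W, eta, sigma, lr=0.05)

psi = feature_map(X, W, eta, sigma)   # finite-dimensional descriptors, one row per input
```

Once W and η are fixed, evaluating `feature_map` costs one matrix of distances per layer, which is what makes the explicit map cheaper than working with the kernel directly.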

Four possible inputs

From left to right: CKN-raw, CKN-mean subs (mean subtraction), CKN-white (mean subtraction + PCA whitening), CKN-grad (fully invariant to color)

Only CKN-raw, CKN-white and CKN-grad are evaluated.
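
The input variants listed above can be sketched in NumPy as below. This is a minimal illustration under assumptions: sub-patches are flattened to rows, the whitening transform is fitted on the mean-subtracted sub-patches, and the gradient variant is computed on the channel average (one simple way to discard color). The function names are hypothetical.

```python
import numpy as np

def mean_subtract(P):
    """P: (n, d) flattened sub-patches; remove each sub-patch's own mean."""
    return P - P.mean(axis=1, keepdims=True)

def fit_pca_whitening(P, eps=1e-5):
    """Return (mu, T) such that (P - mu) @ T has approximately identity covariance."""
    mu = P.mean(axis=0)
    cov = np.cov(P - mu, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    T = eigvec / np.sqrt(eigval + eps)      # scale each eigenvector (column)
    return mu, T

def whiten(P, mu, T):
    return (P - mu) @ T

def gradient_input(patch_rgb):
    """Per-pixel 2-D gradient of the channel-averaged patch, shape (h, w, 2)."""
    gray = patch_rgb.mean(axis=2)
    gy, gx = np.gradient(gray)              # derivatives along rows, then columns
    return np.stack([gx, gy], axis=-1)

# Toy usage: 1000 random 3x3 RGB sub-patches flattened to 27 values each.
rng = np.random.default_rng(0)
sub_patches = rng.random((1000, 27))
centered = mean_subtract(sub_patches)        # "mean subs" input
mu, T = fit_pca_whitening(centered)
white = whiten(centered, mu, T)              # "white" input
grad = gradient_input(rng.random((51, 51, 3)))  # "grad" input
```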

Experiments

Datasets:

1. RomePatches-Image

2. Oxford

3. UKbench and Holidays

CKN trained on 1M sub-patches for 300K iterations, with mini-batches of size 1000.

Conclusions

• CKNs offer similar, and sometimes better, performance than CNNs in the context of patch description.

• Good patch retrieval translates into good image retrieval.

• CKNs are orders of magnitude faster to train than CNNs (10 min vs 2-3 days on a modern GPU).

• Fully unsupervised – no labels.

Resources

RomePatches + Code (although the code is not accessible!)

Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks

- Code with augmentations in MATLAB

- Code for training models.

- Models already trained :-)

Triplet’s net + Code !!

- Greyscale local patches of 32x32. Tested on matching datasets.