deep metric learning - xidian · mahalanobis deep metric learning final representation: the...

47
Deep Metric Learning Department of Automation, Tsinghua University, China http://ivg.au.tsinghua.edu.cn/Jiwen_Lu / Jiwen Lu

Upload: others

Post on 25-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

Deep Metric Learning

Department of Automation, Tsinghua University, Chinahttp://ivg.au.tsinghua.edu.cn/Jiwen_Lu/

Jiwen Lu

Page 2: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

Outline

2

• Introduction

• Mahalanobis Deep Metric Learning

• Hamming Deep Metric Learning

• Multi-Modal Deep Metric Learning

• Conclusions

Page 3: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

Visual Content UnderstandingVisual Recognition

3

Page 4: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

Visual TrackingVisual Content Understanding

4

Page 5: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

Visual SearchVisual Content Understanding

5

Page 6: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

Distance Metric LearningApplying Mahalanobis distance to learn a positive semi-denite (PSD) matrix

Relationship with subspace learning

where .

6

Page 7: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

• Linear

• Kernel

• Tensor

The structure of the input

The label type of training samples• Supervised

• Unsupervised

• Semi-supervised

Distance Metric Learning

7

Page 8: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

The supervision type of training samples

The architecture of models

• Weakly-supervised

• Strongly-supervised

• Shallow

• Deep

The number of metrics• Single-Metric Learning

• Multi-Metric Learning

Distance Metric Learning

8

Page 9: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

Distance Metric Learning

9

Large Margin Nearest Neighborhood (LMNN)

[Weinberger et al, NIPS2005]

Page 10: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

Information-Theoretic Metric Learning (ITML)

[Davis et al, ICML2007]

where .

The optimization function can be re-formulated as

Distance Metric Learning

9

Page 11: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

1111

Mahalanobis Deep Metric LearningFinal representation:

The distance of a pair is:

Illustration at the top layer

[1] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Discriminative deep metric learning for face verification in the wild, CVPR, pp. 1875-1882, 2014.

[2] Jiwen Lu, Junlin Hu, and Yap-Peng Tan, Discriminative deep metric learning for face and kinship verification, IEEE Trans. on Image Processing, vol. 26, no. 9, pp. 4269-4282, 2017.

Page 12: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

1212

Mahalanobis Deep Metric LearningFace Verification

Page 13: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

1313

Basic idea of the proposed DTML method.

Mahalanobis Deep Metric Learning

[3] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep transfer metric learning, CVPR, pp. 325-333, 2015.[4] Junlin Hu, Jiwen Lu, Yap-Peng Tan, and Jie Zhou, Deep transfer metric learning, IEEE Trans. on Image

Processing, vol. 25, no. 12, pp. 5576-5588, 2016.

Page 14: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

1414Top r matched results of different methods on the VIPeR dataset

Mahalanobis Deep Metric LearningCross-Dataset Person Re-identification

Page 15: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

1515

Main procedure of our proposed DML tracker.

Mahalanobis Deep Metric LearningVisual Tracking

[5] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep metric learning for visual tracking, IEEE Trans. on Circuits and Systems for Video Technology, vol. 26, no. 11, 2056-2068, 2016.

Page 16: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

1616

Mahalanobis Deep Metric Learning

Page 17: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

1717

Mahalanobis Deep Metric LearningImage Set Classification

[6] Jiwen Lu, Gang Wang, Weihong Deng, Pierre Moulin, and Jie Zhou, Multi-manifold deep metric learning for image set classification, CVPR, pp. 1137-1145, 2015.

Page 18: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

1818

Average classification rates of different methods on different datasets

Mahalanobis Deep Metric LearningImage Set Classification

Page 19: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

19

Mahalanobis Deep Metric LearningCross-Modal Matching

[7] Venice Erin Liong, Jiwen Lu, Yap-Peng Tan, and Jie Zhou, Deep coupled metric learning for cross-modal matching, IEEE Trans. on Multimedia, vol. 19, no. 6, pp. 1234-1244, 2017.

Page 20: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

2020

Mahalanobis Deep Metric LearningCross-Modal Matching

Page 21: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

21

Mahalanobis Deep Metric LearningFace and Person Recognition

[8] Yueqi Duan, Jiwen Lu, Jianjiang Feng, and Jie Zhou, Deep localized metric learning, IEEE Trans. on Circuits and Systems for Video Technology, 2017, accepted.

Page 22: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

2222

Mahalanobis Deep Metric LearningFace and Person Recognition

Page 23: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

23

Mahalanobis Deep Metric LearningMulti-view Deep Metric Learning

[9] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Sharable and individual multi-view metric learning, IEEE Trans. on Pattern Analysis and Machine Intelligence, 2017, accepted.

Page 24: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

2424

Mahalanobis Deep Metric LearningMulti-view Deep Metric Learning

Page 25: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

25

Mahalanobis Deep Metric LearningLabel Sensitive Deep Metric Learning

[10] Hao Liu, Jiwen Lu, Jianjiang Feng, and Jie Zhou, Label-sensitive deep metric learning for facial age estimation, IEEE Trans. on Information Forensics and Security, 2017, accepted.

Page 26: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

2626

Mahalanobis Deep Metric LearningFacial Age Estimation

Page 27: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

2727

Mahalanobis Deep Metric LearningAdversarial Dep Metric Learning

[11] Yongming Rao, Ji Lin, Jiwen Lu, and Jie Zhou, Learning discriminative aggregation network for video-based face recognition, ICCV, pp. 3781-3790, 2017.

Page 28: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

2828

Mahalanobis Deep Metric LearningAdversarial Dep Metric Learning

Page 29: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

29

Hamming Deep Metric LearningScalable Image Search

[12] Venice Erin Liong, Jiwen Lu, Gang Wang, Pierre Moulin, and Jie Zhou, Deep hashing for compact binary codes learning, CVPR, pp. 2475-2483, 2015.

[13] Jiwen Lu, Venice Erin Liong, and Jie Zhou, Deep hashing for scalable image search, IEEE Trans. on Image Processing, vol. 26, no. 5, pp. 2352-2367, 2017.

Page 30: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

30

Results on CIFAR.

Hamming Deep Metric LearningScalable Image Search

Page 31: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

31

Hamming Deep Metric LearningScalable Image Search

[14] Zhixiang Chen, Jiwen Lu, Jianjiang Feng, and Jie Zhou, Nonlinear discrete hashing, IEEE Trans. on Multimedia, vol. 19, no. 1, pp. 123-135, 2017.

Page 32: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

32

Results on CIFAR.

Hamming Deep Metric LearningScalable Image Search

Page 33: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

33

Hamming Deep Metric LearningScalable Video Search

[15] Zhixiang Chen, Jiwen Lu, Jianjiang Feng, and Jie Zhou, Nonlinear structural hashing for scalable video search, IEEE Trans. on Circuits and Systems for Video Technology, 2017, accepted.

Page 34: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

34

Hamming Deep Metric LearningScalable Video Search

Page 35: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

35

Hamming Deep Metric LearningScalable Video Search

[16] Venice Erin Liong, Jiwen Lu, Yap-Peng Tan, and Jie Zhou, Deep video hashing, IEEE Trans. on Multimedia, vol. 19, no. 6, pp. 1209-1219, 2017.

Page 36: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

36

Hamming Deep Metric LearningScalable Video Search

Page 37: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

37

Hamming Deep Metric LearningImage Matching

[17] Kevin Lin, Jiwen Lu, Chu-Song Chen, and Jie Zhou, Learning compact binary descriptors with unsupervised deep neural networks, CVPR, pp. 1183-1192, 2016.

Main procedure of our proposed approach.

Page 38: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

38

Hamming Deep Metric LearningImage Matching

Page 39: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

39

Hamming Deep Metric LearningImage Matching

[18] Yueqi Duan, Jiwen Lu, Ziwei Wang, Jianjiang Feng, and Jie Zhou, Learning deep binary descriptorwith multi-quantization, CVPR, pp. 1183-1192, 2017.

Page 40: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

40

Hamming Deep Metric LearningImage Matching

Page 41: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

4141

Multi-Modal Deep Metric Learning

Our proposed multi-modal deep metric learning framework.

[19] Anran Wang, Jiwen Lu, Jianfei Cai, Tat-Jen Cham, and Gang Wang, Large-margin multi-modal deep learning for RGB-D object recognition, IEEE Trans. on Multimedia, vol. 17, no. 11, pp. 1887-1898, 2015.

RGB-D Object Recognition

Page 42: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

4242

Multi-Modal Deep Metric Learning

RGB-D object dataset: 51 classes, 207,920 RGB-D image frames, with roughly 600 images per object

10-fold split. For each split, one object from each class was sampled, resulting in 51 test objects. After subsampling every 5th frame from the videos, there were some 34,000 images for training and 6900 images for testing

RGB-D Object Recognition

Page 43: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

4343

Multi-Modal Deep Metric Learning

[20] Anran Wang, Jianfei Cai, Jiwen Lu, and Tat-Jen Cham, MMSS: Multi-modal sharable and specific feature learning for RGB-D object recognition, ICCV, pp. 1125-1133, 2015.

Page 44: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

4444

Multi-Modal Deep Metric Learning

RGB-D object dataset:

RGB-D Object Recognition

Page 45: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

4545

Deep Metric Learning

[21] Jiwen Lu, Junlin Hu, and Jie Zhou, Deep metric learning for visual understanding, IEEE Signal Processing Magazine, vol. 34, no. 6, pp. 76-84, 2017.

Page 46: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

4646

• Deep metric learning is very effective for manyvisual understanding tasks including visualrecognition, visual tracking, and visual search.

• More visual cues can be exploited to helpdevelop more elegant deep metric learningmethods for visual understanding.

• Theoretical analysis for deep metric learning to isrequired to show how it improves various visualunderstanding tasks.

Summary

Page 47: Deep Metric Learning - Xidian · Mahalanobis Deep Metric Learning Final representation: The distance of a pair is: Illustration at the top layer [1] Junlin Hu, Jiwen Lu, and Yap-Peng

4747

Thank you!