saliency prediction using deep learning techniques

Visual Saliency Prediction using Deep Learning Techniques

Author: Junting Pan

Advisor: Xavier Giró-i-Nieto

20/07/2014

OUTLINE

1. Motivation
2. Related works
3. Methodology
4. Results
5. Conclusions

Let’s play a game!

SALIENCY PREDICTION

What have you seen?

Tower, House, Rocks

Eye Tracker Mouse Click

LSUN SALIENCY CHALLENGE

RELATED WORK: Deep Learning

@jponttuset

https://twitter.com/jponttuset

Deep Learning

http://insights.venturescanner.com/category/artificial-intelligence-2/

RELATED WORK: ConvNet

A. Krizhevsky, I. Sutskever, G. E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems 25 (NIPS 2012).

Orange

https://scholar.google.es/citations?user=x04W_mMAAAAJ&hl=es&oi=sra

https://scholar.google.es/citations?user=JicYPdAAAAAJ&hl=es&oi=sra

http://papers.nips.cc/paper/4824-imagenet-classification-w

http://papers.nips.cc/book/advances-in-neural-information-processing-systems-25-2012

=Downsampling

ReLU (non-linearity)

f(x) = max(0,x)
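
As a minimal sketch of the non-linearity above, the function can be applied element-wise to a list of activations. The helper name `relu` and the example values are chosen here for illustration only.

```python
def relu(x):
    """Rectified linear unit: f(x) = max(0, x)."""
    return max(0.0, x)


# Apply the non-linearity element-wise to a few example activations.
activations = [-2.0, -0.5, 0.0, 1.5, 3.0]
rectified = [relu(a) for a in activations]
print(rectified)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```

Negative activations are clamped to zero while positive ones pass through unchanged.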

Dot Product
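
In a convolutional layer, each output value is the dot product between the filter weights and the image patch under the filter. The sketch below illustrates this in plain Python. The function names, the 4x4 image, and the 3x3 kernel are all made up for illustration and are not taken from the slides.

```python
def dot(a, b):
    """Dot product of two equally sized sequences."""
    assert len(a) == len(b)
    return sum(x * y for x, y in zip(a, b))


def conv2d_valid(image, kernel):
    """Slide the kernel over the image; each output is one dot product.

    The kernel is not flipped, so strictly speaking this is
    cross-correlation, which is what ConvNet layers usually compute.
    """
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [w for row in kernel for w in row]
    out = []
    for i in range(len(image) - kh + 1):
        out_row = []
        for j in range(len(image[0]) - kw + 1):
            patch = [image[i + di][j + dj] for di in range(kh) for dj in range(kw)]
            out_row.append(dot(patch, flat_kernel))
        out.append(out_row)
    return out


# Illustrative 4x4 image and 3x3 vertical-edge kernel (values are made up).
image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[-2, -2], [2, -2]]
```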

RELATED WORK: Conventional Saliency

Jianming Zhang, Stan Sclaroff. Saliency detection: a boolean map approach [ICCV 2013]

http://cs-people.bu.edu/jmzhang/BMS/BMS_iccv13_preprint.pdf

RELATED WORK: Deep Saliency

Kümmerer, Matthias, Lucas Theis, and Matthias Bethge. "Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet." arXiv preprint arXiv:1411.1045 (2014).

http://arxiv.org/abs/1411.1045

Vig, Eleonora, Michael Dorr, and David Cox. "Large-scale optimization of hierarchical features for saliency prediction in natural images." Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE, 2014.

http://dx.doi.org/10.1109/CVPR.2014.358

RELATED WORK: End-to-end Architecture

Long, Jonathan, Evan Shelhamer, and Trevor Darrell. "Fully convolutional networks for semantic segmentation." Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 2015.

http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf

SALIENCY PREDICTION: JuntingNet

http://vision.princeton.edu/projects/2014/iSUN/

http://salicon.net/

http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Jiang_SALICON_Saliency_in_2015_CVPR_paper.html

SALIENCY PREDICTION: Data

Dataset                TRAIN   VALIDATION  TEST
SALICON (large scale)  10,000  5,000       5,000
iSUN (large scale)     6,000   926         2,000
CAT2000 [Borji’15]     2,000   -           2,000
MIT300 [Judd’12]       300     -           -

http://salicon.net/

http://saliency.mit.edu/results_cat2000.html

http://saliency.mit.edu/results_mit300.html

SALIENCY PREDICTION: Architecture

IMAGE INPUT (RGB): 96x96

3 CONV LAYERS

2 DENSE LAYERS

2D map: 2304 = 48x48

Upsample + filter
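
The last two stages of the pipeline can be sketched as follows: a flat output vector of 48 x 48 = 2304 values is reshaped into a 2D map, then upsampled and filtered. This is only an illustration under stated assumptions. The slides do not specify the upsampling method, the upsampling factor, or the filter, so nearest-neighbour upsampling by a factor of 2 and a 3x3 mean filter are used as stand-ins, and all function names are hypothetical.

```python
def to_map(vector, side=48):
    """Reshape a flat vector of side*side values into a 2D map."""
    assert len(vector) == side * side
    return [vector[r * side:(r + 1) * side] for r in range(side)]


def upsample_nearest(grid, factor=2):
    """Nearest-neighbour upsampling: repeat every pixel `factor` times per axis."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def mean_filter3(grid):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [grid[a][b]
                    for a in range(max(0, i - 1), min(h, i + 2))
                    for b in range(max(0, j - 1), min(w, j + 2))]
            out[i][j] = sum(vals) / len(vals)
    return out


# Stand-in for the network output: 2304 values (a real model would predict these).
flat_output = [float(k % 48) for k in range(48 * 48)]
saliency = mean_filter3(upsample_nearest(to_map(flat_output)))
print(len(flat_output), len(saliency), len(saliency[0]))  # 2304 96 96
```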

http://www.iro.umontreal.ca/~lisa/pointeurs/theano_scipy2010.pdf

http://arxiv.org/pdf/1211.5590.pdf

http://deeplearning.net/software/theano/

SALIENCY PREDICTION: Overfitting

Overfitting: more than 20 million parameters

10,000 images for training

SALIENCY PREDICTION: Training

Data augmentation with horizontal mirroring.
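
A minimal sketch of horizontal mirroring, assuming each training sample is an (image, saliency map) pair stored as lists of rows. The image and its map must be flipped together so that they stay aligned. The function names and toy values are illustrative only.

```python
def mirror_horizontal(grid):
    """Flip a 2D array (list of rows) left-to-right."""
    return [row[::-1] for row in grid]


def augment(pairs):
    """Return the original (image, saliency map) pairs plus mirrored copies.

    The image and its map are flipped together so they stay aligned.
    """
    mirrored = [(mirror_horizontal(img), mirror_horizontal(sal)) for img, sal in pairs]
    return pairs + mirrored


# Toy 2x3 single-channel image and saliency map (values are made up).
image = [[1, 2, 3],
         [4, 5, 6]]
saliency = [[0.0, 0.1, 0.9],
            [0.0, 0.2, 0.8]]
dataset = augment([(image, saliency)])
print(len(dataset))  # 2
```

Mirroring doubles the number of training pairs without collecting new fixation data.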

We split the total training data into two parts:

80% Training

20% Validation (simultaneous testing)
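
The 80/20 split can be sketched as below. The slides do not say how the samples were assigned to each part, so the random shuffle, the fixed seed, and the function name are assumptions made for illustration.

```python
import random


def split_train_validation(samples, validation_fraction=0.2, seed=0):
    """Shuffle the samples and split them into training and validation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical identifiers standing in for 10,000 training images.
image_ids = list(range(10000))
train_ids, validation_ids = split_train_validation(image_ids)
print(len(train_ids), len(validation_ids))  # 8000 2000
```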

Training curve of iSUN Database

Lower is better!

Number of iterations (Training time)

Longer is better?

If the validation loss stops decreasing...

DANGER OF OVERFITTING! The model is learning from the data, NOT the problem itself.
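
One common way to act on this observation is early stopping: keep the model from the epoch with the lowest validation loss and stop once the loss has not improved for a few epochs. The slides do not state that this exact rule was used, so the sketch below, including the `patience` parameter and the made-up loss curve, is only an illustration.

```python
def best_epoch_with_patience(validation_losses, patience=3):
    """Return (index of best epoch, epoch at which training would stop).

    Training stops once the validation loss has not improved for
    `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(validation_losses) - 1


# Made-up validation curve: improves, then starts rising (overfitting).
curve = [0.90, 0.70, 0.55, 0.50, 0.51, 0.53, 0.56, 0.60]
best, stopped = best_epoch_with_patience(curve)
print(best, stopped)  # 3 6
```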

Training curve of SALICON Database

A: I have just shown you our best model.

B: Why is this the best model?

SALIENCY PREDICTION: Trial and Error

We tried many architectures, too many to be listed here.

Loss function: Mean Square Error (MSE)

Weight initialization: Gaussian distribution

Learning rate: 0.03 to 0.0001

Mini-batch size: 128

Training time: 7 h (SALICON) / 4 h (iSUN)

Acceleration: SGD + Nesterov momentum (0.9)

Regularisation: Maxout norm

GPU: NVIDIA GTX 980
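
To make three of these settings concrete, the sketch below fits a single weight with the MSE loss, SGD with Nesterov momentum (0.9), and a learning rate that moves from 0.03 to 0.0001. It is a toy illustration, not the training code. The linear shape of the schedule, the number of steps, the one-parameter model, and the data are all assumptions. Mini-batching, weight initialization, and regularisation are left out.

```python
def mse(predictions, targets):
    """Mean squared error between two equally sized sequences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def linear_decay(step, total_steps, start=0.03, end=0.0001):
    """Learning rate moving linearly from `start` to `end` (schedule shape assumed)."""
    fraction = step / max(1, total_steps - 1)
    return start + fraction * (end - start)


def train(xs, ys, total_steps=200, momentum=0.9):
    """Fit y = w * x with SGD and Nesterov momentum on the MSE loss."""
    w, velocity = 0.0, 0.0
    for step in range(total_steps):
        lr = linear_decay(step, total_steps)
        lookahead = w + momentum * velocity  # Nesterov: evaluate the gradient ahead
        grad = sum(2 * (lookahead * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        velocity = momentum * velocity - lr * grad
        w += velocity
    return w


# Toy data generated from y = 2x (made up for illustration).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * x for x in xs]
w = train(xs, ys)
loss = mse([w * x for x in xs], ys)
```

On this toy problem `w` ends up close to 2 and the loss close to 0.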

RESULTS: Qualitative (iSUN)

JuntingNet, Ground Truth, Pixels

RESULTS: Quantitative (iSUN)

Results from CVPR LSUN Challenge 2015

RESULTS: Qualitative (SALICON)

RESULTS: Quantitative (SALICON)

Results from CVPR LSUN Challenge 2015

RESULTS: First Position at LSUN Challenge

RESULTS: MIT Saliency Benchmark

Method                    Similarity  CC      AUC_shuffled  AUC_Borji  AUC_Judd
Baseline: infinite human  1           1       0.80          0.87       0.91
Deep Gaze                 0.39        0.48    0.66          0.85       0.84
eDN                       0.41        0.45    0.62          0.81       0.82
Our work                  0.4708      0.4285  0.5075        0.7416     0.7720
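
Two of the metrics in this table are simple to compute. The sketch below assumes their usual definitions in saliency evaluation: Similarity as the histogram intersection of the two maps after each is normalised to sum to 1, and CC as the Pearson linear correlation coefficient. The function names and toy maps are illustrative and unrelated to the benchmark data.

```python
import math


def normalise(saliency):
    """Scale a flat saliency map so that its values sum to 1."""
    total = sum(saliency)
    return [v / total for v in saliency]


def similarity(pred, truth):
    """Histogram intersection of the two maps viewed as distributions."""
    return sum(min(p, t) for p, t in zip(normalise(pred), normalise(truth)))


def correlation_coefficient(pred, truth):
    """Pearson linear correlation coefficient (CC) between two maps."""
    n = len(pred)
    mean_p, mean_t = sum(pred) / n, sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    return cov / math.sqrt(var_p * var_t)


# Flattened toy maps (values are made up, not benchmark data).
prediction = [0.1, 0.2, 0.4, 0.3]
ground_truth = [0.0, 0.2, 0.6, 0.2]
sim = similarity(prediction, ground_truth)
cc = correlation_coefficient(prediction, ground_truth)
```

Both scores reach 1 when the prediction matches the ground truth exactly, which is why the "infinite human" baseline scores 1 on these two columns.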

Torralba, Antonio, and Alexei Efros. "Unbiased look at dataset bias." Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5995347&tag=1

Future Work

Method                    Similarity  CC      AUC_shuffled  AUC_Borji  AUC_Judd
Baseline: infinite human  1           1       0.80          0.87       0.91
Deep Gaze                 0.39        0.48    0.66          0.85       0.84
SalNet                    0.52        0.58    0.69          0.82       0.83
eDN                       0.41        0.45    0.62          0.81       0.82
Our work                  0.4708      0.4285  0.5075        0.7416     0.7720

K. McGuinness

RESULTS: Dissemination

http://bit.ly/juntingnet

Preprint Open Source Software & Models

Article highlighted at www.upc.edu on 17 July 2015

http://www.upc.edu

LSUN SALIENCY CHALLENGE: A Déjà vu ?

John Markoff, “Scientists See Promise in Deep-Learning Programs”, The New York Times (Nov. 2012).

Photo: Keith Penner

http://www.nytimes.com/2012/11/24/science/scientists-see-advances-in-deep-learning-a-part-of-artificial-intelligence.html?_r=0

ACKNOWLEDGMENTS

Xavier Giró Nieto
Carlos Segura
Carles Fernández
Albert Gil
Victor Campos
Enric Monte
Elisa Sayrol
Edu Fontdevila
Míriam Bellver
Amaia Salvador
Marc Carné
Javier Hernando
Javier Vera
All my family members and friends

Thank you!
