Deep End2End Voxel2Voxel Prediction

Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri

Presented by: Ahmed Osman


Post on 09-Jul-2020


Outline

• Problems
– Video Semantic Segmentation
– Optical Flow Estimation
– Video Coloring
• Related Work
• Contribution
• Method
• Experiments and Results
• Conclusion


Video Semantic Segmentation

• Semantic Segmentation
• Video Semantic Segmentation

http://jamie.shotton.org/work/images/resear6.png


Optical Flow Estimation

http://www.cvlibs.net/projects/objectsceneflow/showcase.jpg

A Filter Formulation for Computing Real Time Optical Flow. Adarve et al. https://www.youtube.com/watch?v=_oW1vMdBMuY


Video Coloring


http://images.mentalfloss.com/sites/default/files/styles/article_640x430/public/colorizing-movies_6.jpg


Traditional Computer Vision Pipeline

Convolutional Neural Networks

• Motivation
– “Convolutional Neural Networks (CNN) are biologically-inspired variants of MLPs.”
– “Revolutionized the traditional computer vision pipeline”
– Re-popularized by Krizhevsky et al. in 2012, whose network (AlexNet) produced state-of-the-art results on the ImageNet dataset (image classification).
– Why was AlexNet successful?
  • Large labeled datasets
  • GPU computing

ConvNets

• Convolution

https://developer.apple.com/library/ios/documentation/Performance/Conceptual/vImage/ConvolutionOperations/ConvolutionOperations.html
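
The sliding-window operation named on this slide can be sketched in plain Python. The function name `conv2d_valid`, the example image, and the kernel are illustrative assumptions and do not come from the slides. Like most CNN libraries, the sketch does not flip the kernel, so strictly speaking it computes cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of rows) with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the window under the kernel element-wise and sum it up.
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3x4 image and a horizontal difference filter
# (right neighbor minus left neighbor).
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
]
kernel = [[-1, 0, 1]]
print(conv2d_valid(image, kernel))  # [[2.0, -2.0], [2.0, -4.0], [2.0, -6.0]]
```

A convolution layer applies many such kernels, whose weights are learned rather than hand-picked.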

ConvNets

• Convolution Layer

http://cs231n.github.io/convolutional-networks/

ConvNets

• Activation function
– Rectified Linear Unit (ReLU)
  • Mitigates the vanishing-gradient problem: the gradient is 1 for every positive input
  • Non-linear
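
The two ReLU properties listed on this slide can be checked with a few lines of Python. This is only a sketch: the sigmoid is included as an assumed point of comparison and is not mentioned on the slide. Note that ReLU's gradient is 0 for negative inputs, so it reduces the vanishing-gradient problem rather than removing it.

```python
import math


def relu(x):
    """Rectified Linear Unit: pass positive inputs through, clamp the rest to 0."""
    return x if x > 0 else 0.0


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def sigmoid_grad(x):
    """Derivative of the logistic sigmoid, shown only for comparison."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


for x in (-2.0, 0.5, 10.0):
    # For large x the sigmoid gradient shrinks toward 0; the ReLU gradient stays 1.
    print(x, relu(x), relu_grad(x), sigmoid_grad(x))
```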

ConvNets

• Pooling
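
The slide does not say which pooling variant is meant. As an assumption, the sketch below shows max pooling with a 2x2 window and stride 2, a common choice; the function name and the example feature map are illustrative.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep the largest value in each `size` x `size` window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map; pooling halves each spatial dimension.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [7, 2, 9, 8],
    [3, 4, 6, 5],
]
print(max_pool2d(feature_map))  # [[6, 4], [7, 9]]
```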

ConvNets

• Fully Connected Layer

ConvNets

• How to determine the weights?
– Learn them using backpropagation

ConvNets

• Loss Function
– Softmax
– Huber
– L2

[Figure: loss curves. Green: Huber, Blue: L2]
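
The three losses named on this slide can be written out for a single prediction as follows. The exact forms are assumptions, since the slides give no formulas: "Softmax" is taken to mean softmax followed by cross-entropy, the L2 loss carries a factor of 1/2, and the Huber threshold `delta` defaults to 1.

```python
import math


def softmax_cross_entropy(scores, target):
    """Negative log-probability of class `target` under softmax(scores)."""
    m = max(scores)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[target]


def l2_loss(pred, target):
    """Squared error; grows quadratically with the residual."""
    return 0.5 * (pred - target) ** 2


def huber_loss(pred, target, delta=1.0):
    """Quadratic near zero, linear for residuals larger than `delta`."""
    r = abs(pred - target)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)


# For a large residual the Huber loss is much smaller than L2,
# which makes it less sensitive to outliers.
print(l2_loss(3.0, 0.0), huber_loss(3.0, 0.0))  # 4.5 2.5
```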

ConvNets

• How to determine the weights?
– Learn them using backpropagation
– Chain Rule
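
A small worked example shows how the chain rule drives backpropagation. The function f(x, y, z) = (x + y) * z and the input values are illustrative assumptions chosen to keep the arithmetic easy to follow.

```python
def forward_backward(x, y, z):
    """Compute f = (x + y) * z and its gradient with the chain rule."""
    # Forward pass: evaluate the intermediate node q, then the output f.
    q = x + y
    f = q * z

    # Backward pass: start from df/df = 1 and move toward the inputs.
    df_dq = z            # f = q * z, so df/dq = z
    df_dz = q            # f = q * z, so df/dz = q
    df_dx = df_dq * 1.0  # chain rule: df/dx = df/dq * dq/dx, with dq/dx = 1
    df_dy = df_dq * 1.0  # chain rule: df/dy = df/dq * dq/dy, with dq/dy = 1
    return f, (df_dx, df_dy, df_dz)


print(forward_backward(-2.0, 5.0, -4.0))  # (-12.0, (-4.0, -4.0, 3.0))
```

In a network, the same local rule is applied layer by layer, and the resulting gradients are used to update the weights.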

Backpropagation

Slides from Stanford University Course CS231N: http://cs231n.stanford.edu/slides/winter1516_lecture4.pdf

Related Work

• Fully Convolutional Network (FCN)
• FlowNet
• Depth Map Prediction from a Single Image using a Multi-Scale Deep Network (Eigen et al. [2014])


Contribution

• 3D CNN for end-to-end voxel-wise prediction
• Same network architecture for all three challenges.
• Introduces an approach for training with limited data.


Recap: Problem

• Input: Channels x # of Frames x Height x Width
• Output: K x # of Frames x Height x Width

Segmentation done by http://segmentit.sourceforge.net/
http://barkpost.com/wp-content/uploads/2013/03/oie_5181838bU3HJXJp.gif
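
The input and output shapes above differ only in the first dimension. The sketch below assumes that K is the number of values predicted per voxel: 8 class scores for the 8-class segmentation task, 2 flow components, or 3 color channels. These K values and the 3 x 16 x 112 x 112 clip size are assumptions for illustration and are not stated on this slide.

```python
def v2v_output_shape(input_shape, k):
    """Map (channels, frames, height, width) to (k, frames, height, width)."""
    channels, frames, height, width = input_shape
    return (k, frames, height, width)


# Hypothetical clip: RGB, 16 frames, 112 x 112 pixels.
clip_shape = (3, 16, 112, 112)

# Assumed K per task; see the lead-in above.
for task, k in (("segmentation", 8), ("optical flow", 2), ("coloring", 3)):
    print(task, v2v_output_shape(clip_shape, k))
```

Every voxel of the input clip gets its own prediction, which is why the frame count and spatial size are preserved.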

Method

• Adapted from C3D
• Main Difference: Added deconvolution layers

Learning Spatiotemporal Features with 3D Convolutional Networks. Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri

Deconvolution

Visualizing and Understanding Convolutional Networks. Matthew D Zeiler, Rob Fergus

[Figure panels: Layer 1, Layer 2]

Deconvolution

Upsampling

Learnable Deconvolution
Visualization Deconvolution
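
The relation between fixed upsampling and a learnable deconvolution (transposed convolution) can be illustrated in one dimension. This is a simplified sketch under stated assumptions: V2V operates on 3D volumes, and the function name, kernels, and stride here are chosen only for illustration.

```python
def transposed_conv1d(signal, kernel, stride=2):
    """Scatter each input value, scaled by the kernel, into a longer output."""
    out_len = (len(signal) - 1) * stride + len(kernel)
    output = [0.0] * out_len
    for i, value in enumerate(signal):
        for k, weight in enumerate(kernel):
            # Overlapping contributions from neighboring inputs are summed.
            output[i * stride + k] += value * weight
    return output


signal = [1.0, 2.0, 3.0]

# A fixed kernel of ones reproduces nearest-neighbor upsampling.
print(transposed_conv1d(signal, [1.0, 1.0]))       # [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]

# A fixed triangular kernel gives linear interpolation between samples.
print(transposed_conv1d(signal, [0.5, 1.0, 0.5]))  # [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 1.5]
```

In a learnable deconvolution layer the kernel weights are trained by backpropagation instead of being fixed in advance.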


Experiments and Results

• Video Semantic Segmentation
• Optical Flow Estimation
• Video Coloring

Experiments: Video Semantic Segmentation

• Dataset:
– GATECH dataset
– Training set: 63 videos
– Test set: 38 sequences
– 8 Classes

Geometric Context from Videos. Hussain Raza, Matthias Grundmann, Irfan Essa

Experiments: Video Semantic Segmentation

• Experiment:
– Training:
  • Split each video into all possible clips of length 16 frames (i.e. stride: 1).
– Testing:
  • Performed on all non-overlapping clips (i.e. stride: 16).

Geometric Context from Videos. Hussain Raza, Matthias Grundmann, Irfan Essa

[Figure labels: 16 frames, 16 frames]
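
The two clip-sampling schemes described on this slide differ only in the stride. A minimal sketch, assuming a hypothetical 40-frame video and a helper named `clip_starts`:

```python
def clip_starts(num_frames, clip_len=16, stride=1):
    """Return the first-frame index of every full clip in a video."""
    return list(range(0, num_frames - clip_len + 1, stride))


num_frames = 40  # hypothetical video length

# Training: all possible overlapping clips (stride 1).
train_starts = clip_starts(num_frames, stride=1)
print(len(train_starts))  # 25

# Testing: non-overlapping clips (stride 16); trailing frames that do not
# fill a whole clip are left out in this sketch.
test_starts = clip_starts(num_frames, stride=16)
print(test_starts)  # [0, 16]
```

Stride 1 multiplies the number of training samples obtained from each labeled video, while stride 16 scores every test frame at most once.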

Experiments: Video Semantic Segmentation

• Experiment:
– Network details (V2V):
  • Loss layer: Softmax
  • Weights initialized from C3D. New layers are randomly initialized.
  • Initial learning rate: 10^-4, divided by 10 every 30K iterations
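
The learning-rate schedule on this slide is a step decay. It can be sketched as below; the function and parameter names are illustrative assumptions, and the defaults mirror the values quoted above.

```python
def step_learning_rate(iteration, base_lr=1e-4, step=30_000, factor=10.0):
    """Divide `base_lr` by `factor` once every `step` iterations."""
    return base_lr / (factor ** (iteration // step))


for iteration in (0, 29_999, 30_000, 60_000):
    print(iteration, step_learning_rate(iteration))
```

The same helper covers the other experiments in the talk by changing the arguments, for example `step_learning_rate(it, base_lr=1e-8, step=200_000)`.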

Results: Video Semantic Segmentation

[Figure label: Bilinear]

Results: Video Semantic Segmentation

[Figure labels: Smooth, Noisy, Net depth]


Experiments: Optical Flow Estimation

• Training:
– Problem:
  • No large dataset with optical flow ground truth.
– Solution?
  • Fabricate “semi-truth” from an existing optical flow method.
  • Brox’s method was used.
– Dataset:
  • (V2V) UCF101 (Partial: test split 1)
  • (Fine-tuned V2V) MPI-Sintel
• Network:
– Loss function: Huber loss
– Initial learning rate: 10^-8, divided by 10 every 200K iterations
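
The "semi-truth" idea amounts to labeling unlabeled clips with the output of an existing estimator and then training on those pairs. The sketch below shows only the pairing step. `toy_reference_flow` is a hypothetical stand-in that returns zero motion; it is not Brox's method, and the nested-list clip layout is an assumption.

```python
def build_semi_truth_pairs(clips, reference_flow):
    """Pair every unlabeled clip with the flow computed by an existing method."""
    return [(clip, reference_flow(clip)) for clip in clips]


def toy_reference_flow(clip):
    """Hypothetical stand-in estimator: zero (dx, dy) motion at every pixel."""
    frames, height, width = len(clip), len(clip[0]), len(clip[0][0])
    return [
        [[(0.0, 0.0) for _ in range(width)] for _ in range(height)]
        for _ in range(frames)
    ]


# One hypothetical grayscale clip: 2 frames of 3 x 4 pixels.
clips = [[[[0] * 4 for _ in range(3)] for _ in range(2)]]
pairs = build_semi_truth_pairs(clips, toy_reference_flow)
print(len(pairs), len(pairs[0][1]), len(pairs[0][1][0]), len(pairs[0][1][0][0]))
```

The network is then trained to regress the second element of each pair from the first, so its accuracy is bounded by how good the reference method is unless it is later fine-tuned on real ground truth.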

Results: Optical Flow Estimation

• Testing:
– MPI-Sintel

[Figure columns: Input, V2V, Brox, Ground truth]

Results: Optical Flow Estimation

• Fine-tuning from C3D yields little improvement.
• Same architecture, different purpose


Experiments: Video Coloring

• Dataset:
– UCF101
– Convert color videos to grayscale.
• Experiment:
– Training:
  • Loss function: L2
  • Initial learning rate: 10^-8, divided by 10 every 200K iterations
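
Training data for coloring needs no manual labels: the grayscale version of a clip is the input and the original colors are the target. The sketch below assumes the common luma weights 0.299, 0.587, and 0.114 for the conversion; the slides do not say which conversion was used, and the function names are illustrative.

```python
def to_grayscale(frame):
    """Convert a frame of (r, g, b) pixels to single-channel intensities."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in frame
    ]


def make_coloring_pair(color_clip):
    """Return (grayscale input clip, original color clip as the target)."""
    gray_clip = [to_grayscale(frame) for frame in color_clip]
    return gray_clip, color_clip


# Hypothetical clip: one frame with a white pixel and a black pixel.
color_clip = [[[(255, 255, 255), (0, 0, 0)]]]
gray_clip, target = make_coloring_pair(color_clip)
print(gray_clip)
```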

Results: Video Coloring

Network    Average Distance Error (ADE)
2D-V2V     0.1495
V2V        0.1375

Results: Video Coloring

• V2V learns “common sense” colors

[Figure panels: Input, Ground Truth, V2V]


Conclusion

• Contributions:
– 3D CNN for end-to-end voxel-wise prediction
– “Same” network architecture for all three challenges.
– Utilizes a well-established method to generate training data.
• Criticisms
– Fine-tuning improved the optical flow (OF) result, noticeably in comparison with Brox’s method
– No mention of the activation function, even in C3D

Thank You for Listening

Questions?

References

• “Deep End2End Voxel2Voxel Prediction” – Tran et al. 2015
• “Flownet: Learning optical flow with convolutional networks” – Fischer et al. 2015
• “Imagenet classification with deep convolutional neural networks” – Krizhevsky et al. 2012
• “Learning spatiotemporal features with 3d convolutional networks” – Tran et al. 2015
• “Visualizing and understanding convolutional networks” – Zeiler et al. 2014
• “Fully convolutional networks for semantic segmentation” – Long et al. 2015
• “Depth map prediction from a single image using a multi-scale deep network” – Eigen et al. 2014
• “Large displacement optical flow: Descriptor matching in variational motion estimation” – Brox et al. 2011

Backup Slides

Multi-layer Perceptron

• A perceptron is a linear classifier that uses a set of weights to predict an output for a feature vector.

https://blog.dbrgn.ch/images/2013/3/26/perceptron.png
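
The perceptron definition above can be sketched in a few lines. The weights and bias are hand-picked assumptions that make the unit behave like a logical AND; in practice they would be learned from data.

```python
def perceptron_predict(weights, bias, features):
    """Weighted sum of the features plus a bias, thresholded at zero."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


# Hypothetical weights implementing AND on two binary inputs.
weights, bias = [1.0, 1.0], -1.5
for features in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(features, perceptron_predict(weights, bias, features))
```

A multi-layer perceptron stacks layers of such units with non-linear activations between them, which lets it represent decision boundaries that a single linear unit cannot.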