bored by classification convnets? end-to-end...

35
Thomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer Vision Tasks Thomas Brox University of Freiburg Germany Research funded by ERC Starting Grant VideoLearn, the German Research Foundation, and the Deutsche Telekom Stiftung

Upload: others

Post on 02-Mar-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

Bored by Classification ConvNets?

End-to-end Learning of other Computer Vision Tasks

Thomas Brox

University of Freiburg

Germany

Research funded by ERC Starting Grant VideoLearn, the German Research Foundation, and the Deutsche Telekom Stiftung

Page 2: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

Generative networks

U-Net: Multi-instance segmentation

FlowNet: Estimating optical flow

2

Outline

Page 3: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 3

Typical ConvNet architecture

cat

Classification network

Page 4: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 4

Typical ConvNet architecture

cat

Classification network

cat

Page 5: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

Small grayoffice chair,side view

cat

5

Up-convolutional network

Image generation

Related work: • Eigen et al. NIPS 2014: Network for depth map prediction• Long et al. CVPR 2015: Network for semantic segmentation

Alexey DosovitskiyCVPR 2015

New: Expanding network architecture

Page 6: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

Generating chair images with a network

Dosovitskiy et al., CVPR 2015

6

Page 7: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 7

Training set

Source: https://github.com/dimatura/seeing3d

3D chair datasetAubry et al. CVPR 2014

Rendering 809 chair styles From 62 viewpoints

Some of the rendered chairs

Page 8: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 8

Generating images of unseen views

Training set split into two subsets:

Source set: 62 viewpoints available (90% of all chair models)

Target set: fewer viewpoints available (10% of all models)

Page 9: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 9

Generating images of unseen views

8 azimuths available

4 azimuths available

2 azimuths available

1 azimuth available

Page 10: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 10

Comparison to baselines

Alexey DosovitskiyCVPR 2015

Page 11: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

Interpolation of chair styles

1111

Alexey DosovitskiyCVPR 2015

Page 12: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 12

Correspondences between chair instances

Alexey DosovitskiyCVPR 2015

Page 13: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

• Generate intermediate images with the network

• Track points with optical flow (LDOF) along the sequence

13

Correspondences between chair instances

all easy difficult

Deformable Spatial Pyramid Matching (Kim et al. 2013)

5.2 3.3 6.3

SIFT Flow (Liu et al. 2008) 4.0 2.8 4.8

Ours 3.9 3.9 3.9

Human performance 1.1 1.1 1.1

Page 14: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 14

Preview: Inverting ConvNets with ConvNets

Alexey DosovitskiyarXiv 2015

Image featurese.g. from AlexNet

Related work: • Mahendran & Vedaldi CVPR 2015• Zeiler & Fergus ECCV 2014

Learn to re-generate the input image from its feature representation

Up-convolutional network

Page 15: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 15

Reconstruction results

Up-Conv.

Mahendran & Vedaldi

Auto-encoder

More reconstructions with up-convolutional network:

Page 16: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 16

Color and position are preserved in high layers

inputAll FC8

Top 5 FC8

All but Top 5 FC8

Color experiment Position experiment

Page 17: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

A generative network

U-Net: Multi-instance segmentation

FlowNet: Estimating optical flow

17

Outline

Page 18: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

U-Net: Image segmentation with a ConvNet

18

Olaf Ronneberger

18

• Similar to Fully Convolutional Network[Long et al., CVPR 2015]

• Original inspiration: Depth map prediction[Eigen et al., NIPS 2014]

Philipp Fischer

MICCAI 2015

Page 19: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 19

Binary segmentation

Electron MicroscopyISBI 2012 Challenge

Rank 1

Light microscopy cell trackingISBI 2015 Challenge

Rank 1

Page 20: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

Intersection over union: 77.5%Second best: 46%

20

Multi-class semantic segmentation

X-ray dental segmentation, ISBI 2015 Challenge, Rank 1

Page 21: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 21

Multi-instance segmentation

Light microscopy, DIC-HeLa cell trackingISBI 2015 Challenge: Rank 1

Page 22: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

A generative network

U-Net: Multi-instance segmentation

FlowNet: Estimating optical flow

22

Outline

Page 23: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 23

FlowNet: Estimating optical flow with a ConvNet

Refinement:expanding architecture

Page 24: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 24

Helping the network with a correlation layer

Alexey Dosovitskiy

Philipp Fischer

EddyIlg

Philip Häusser

Caner Hazirbas

Vladimir Golkov

Joint work with the group of Daniel Cremers

Page 25: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

• Getting ground truth optical flow for realistic videos is hard

• Existing datasets are small:

25

Enough data to train such a network?

Frames with ground truth

Middlebury 8

KITTI 194

Sintel 1041

Needed >10000

Page 26: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 26

Realism is overrated: the “flying chair” dataset

Rendered image Optical flow

Page 27: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 27

It works!

Although the network has only seen flying chairs for training, it predicts good optical flow on Sintel

Input images Ground truth

FlowNetSimple FlowNetCorr

Page 28: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 28

Results on various datasets

Middlebury KITTI Sintel Clean Sintel Final Flying Chairs

EpicFlow 0.39 3.8 4.1 6.3 2.9

DeepFlow 0.42 5.8 5.4 7.2 3.5

LDOF 0.56 12.4 7.6 9.1 3.5

FlowNetS - - 7.4 8.4 2.7

FlowNetS+v - - 6.5 7.7 2.9

FlowNetS+ft - 9.1 7.0 7.8 3.0

FlowNetS+ft+v 0.47 7.6 6.2 7.2 3.0

FlowNetC - - 7.3 8.8 2.2

FlowNetC+v - - 6.3 8.0 2.6

FlowNetC+ft - - 6.9 8.5 2.3

FlowNetC+ft+v 0.5 - 6.1 7.9 2.7

Networks can compete with state-of-the-art conventional optical flow estimation methods

Page 29: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 29

Can handle large displacements

Input images Ground truth

FlowNetSimple FlowNetCorr

DeepFlow (Weinzaepfel et al. ICCV 2013) EpicFlow (Revaud et al. CVPR 2015)

Page 30: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 30

Sometimes wrong direction

Input images Ground truth

FlowNetSimple FlowNetCorr

DeepFlow (Weinzaepfel et al. ICCV 2013) EpicFlow (Revaud et al. CVPR 2015)

Page 31: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 31

Often captures fine details

Input images Ground truth

FlowNetSimple FlowNetCorr

DeepFlow (Weinzaepfel et al. ICCV 2013) EpicFlow (Revaud et al. CVPR 2015)

Page 32: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 32

Results on “Flying chairs” test set

Input images

FlowNetCorrEpicFlow (Revaud et al. CVPR 2015)

Ground truth

Page 33: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 33

Runs with 10fps on the GPU

Page 34: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox

A generative network

U-Net: Multi-instance segmentation

FlowNet: Estimating optical flow

34

Summary

Page 35: Bored by Classification ConvNets? End-to-end …lear.inrialpes.fr/workshop/allegro2015/slides/brox.pdfThomas Brox Bored by Classification ConvNets? End-to-end Learning of other Computer

Thomas Brox 35

Tip of the day