Deep Watershed Transform for Instance Segmentation

Min Bai & Raquel Urtasun

To appear at IEEE CVPR 2017 in Hawaii
Presented at NVIDIA GTC 2017

Semantic Segmentation

● Input: RGB Image
● Output at each pixel:
  ○ Semantic label

Instance Segmentation

● Input: RGB Image
● Output at each pixel:
  ○ Semantic label
  ○ Instance label
    ■ Same for each pixel in an object
    ■ Different among objects
  ○ Difficulty: How to phrase the problem?
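One way to see why phrasing the problem is hard: the instance ids themselves carry no meaning, so two label maps that differ only by a renaming of ids describe the same segmentation. The sketch below is illustrative only; the toy 1x6 image, the class ids, and the helper `same_partition` are assumptions made for this example and do not come from the slides.

```python
# Toy 1x6 "image": two cars side by side, followed by background.
# Hypothetical class ids: 0 = background, 1 = car.
semantic = [1, 1, 1, 1, 0, 0]

# Instance labels: same id within an object, different ids across objects.
instances_a = [1, 1, 2, 2, 0, 0]
# Swapping the two ids describes exactly the same segmentation.
instances_b = [2, 2, 1, 1, 0, 0]


def same_partition(a, b):
    """Return True if two label maps group pixels identically, ignoring id values."""
    forward, backward = {}, {}
    for x, y in zip(a, b):
        # Each id in `a` must map to exactly one id in `b`, and vice versa.
        if forward.setdefault(x, y) != y or backward.setdefault(y, x) != x:
            return False
    return True


print(same_partition(instances_a, instances_b))  # True
print(same_partition(instances_a, semantic))     # False
```

A per-pixel classifier trained directly on such ids would be penalized for predicting `instances_b` when the annotation says `instances_a`, even though both are correct.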

Applications

● Object tracking

Image credit: Davi Frossard

Applications

● Interacting with the environment

Image credit: http://www.rethinkrobotics.com/build-a-bot/

Applications

● Useful information for other algorithms, such as optical flow

Image credit: Shenlong Wang

Semantic Segmentation

● Semantic segmentation is a well-studied problem
  ○ Our instance segmentation method leverages an existing technique
  ○ H. Zhao et al., Pyramid Scene Parsing Network, https://arxiv.org/abs/1612.01105

Image credit: H. Zhao et al.

Watershed Transform

● Classical image segmentation technique

Image (left) credit: Adrian Fisher
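As a rough illustration of the classical idea, the sketch below treats an image as a height map and groups pixels by the local minimum they drain to under steepest descent. This is a simplified "rainfall" variant chosen for brevity, not necessarily the formulation the slide refers to; the function name `watershed_by_descent` and the toy height map are assumptions. Plateaus are not handled: a cell with no strictly lower 4-neighbor is treated as its own minimum.

```python
def watershed_by_descent(height):
    """Label each cell with the basin (local minimum) reached by steepest descent."""
    rows, cols = len(height), len(height[0])

    def lowest_neighbor(r, c):
        # Return the strictly lowest 4-neighbor, or the cell itself at a minimum.
        best = (r, c)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and height[nr][nc] < height[best[0]][best[1]]):
                best = (nr, nc)
        return best

    basin_ids = {}  # maps a local minimum's coordinates to its basin id
    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            cur = (r, c)
            while True:  # walk downhill until no lower neighbor exists
                nxt = lowest_neighbor(*cur)
                if nxt == cur:
                    break
                cur = nxt
            labels[r][c] = basin_ids.setdefault(cur, len(basin_ids) + 1)
    return labels


# Two valleys separated by a ridge in the middle column.
height = [
    [1, 2, 5, 2, 1],
    [0, 1, 6, 1, 0],
    [1, 2, 5, 2, 1],
]
for row in watershed_by_descent(height):
    print(row)
```

On this map the two valleys come out as two basins. Ridge cells with equally low neighbors on both sides are assigned by neighbor scan order, which is an artifact of the sketch rather than a property of the transform.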

Scalar Field and Gradient

Image source: Wikipedia, by Vivekj78 - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=15346899

● Scalar field: a single number at each pixel
● Gradient: a vector at each pixel, pointing in the direction of greatest ascent
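The two definitions above can be made concrete with a small numerical sketch. It estimates the gradient of a scalar field with central differences; the helper name `gradient` and the test field f(x, y) = x + 2y are assumptions chosen for illustration.

```python
def gradient(field):
    """Central-difference gradient (d/dx, d/dy) at interior pixels; None on the border."""
    rows, cols = len(field), len(field[0])
    grad = [[None] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = (field[y][x + 1] - field[y][x - 1]) / 2.0
            gy = (field[y + 1][x] - field[y - 1][x]) / 2.0
            grad[y][x] = (gx, gy)
    return grad


# Scalar field: one number per pixel, here f(x, y) = x + 2y.
field = [[x + 2 * y for x in range(4)] for y in range(4)]

# Gradient: one vector per pixel, pointing in the direction of greatest ascent.
print(gradient(field)[1][1])  # (1.0, 2.0)
```

For this linear field the gradient is the same vector (1, 2) at every interior pixel: the field rises twice as fast along y as along x.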

Overview of Approach

Figure panels: Input Image, Semantic Segmentation, Gradient of Energy Landscape, Energy Landscape, Predicted Instances

Why Predict Direction First?

Figure panels: Input Image, Energy Landscape, Direction of Gradient

Much sharper difference in the direction label at the boundary!
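A one-dimensional toy case shows the effect. The slides do not spell out how the energy and direction targets are built, so this sketch assumes that energy is the distance to the nearest pixel outside the instance (with the image edge counting as outside) and that direction is the sign pointing away from that nearest boundary. The function `energy_and_direction` is hypothetical.

```python
def energy_and_direction(instance_ids):
    """Per-pixel energy (distance to nearest boundary) and direction (+1 right, -1 left)."""
    n = len(instance_ids)
    energy, direction = [], []
    for i, inst in enumerate(instance_ids):
        # Distance to the nearest pixel on each side that is not part of this
        # instance; falling off the image edge counts as leaving the instance.
        left = next((i - j for j in range(i - 1, -1, -1)
                     if instance_ids[j] != inst), i + 1)
        right = next((j - i for j in range(i + 1, n)
                      if instance_ids[j] != inst), n - i)
        energy.append(min(left, right))
        # Point away from the nearer boundary, i.e. uphill in energy.
        direction.append(1 if left <= right else -1)
    return energy, direction


# Two touching objects in a row of 8 pixels.
ids = [1, 1, 1, 1, 2, 2, 2, 2]
energy, direction = energy_and_direction(ids)
print(energy)     # [1, 2, 2, 1, 1, 2, 2, 1]
print(direction)  # [1, 1, -1, -1, 1, 1, -1, -1]
```

Across the shared boundary (pixels 3 and 4) the energy is identical on both sides, while the direction flips from -1 to +1. Under these assumptions, the boundary is invisible in the energy values alone but unmistakable in the direction labels.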

Overall Network

Direction Prediction Network

Figure panels: Input Image, Semantic Segmentation, Ground Truth Directions, Predicted Directions

Energy Prediction Network

Figure panels: Ground Truth Energy, Predicted Energy, Ground Truth Instances, Predicted Instances
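The slide shows instances next to the energy map but does not describe the step between them. One plausible reading, sketched here as an assumption, is a watershed-style cut: keep pixels whose energy exceeds a fixed level and label the connected components that remain. The function `instances_from_energy`, the `level` parameter, and the toy energy map are all hypothetical.

```python
from collections import deque


def instances_from_energy(energy, level=1):
    """Label 4-connected regions whose energy exceeds `level` (0 = unlabeled)."""
    rows, cols = len(energy), len(energy[0])
    labels = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if energy[r][c] <= level or labels[r][c]:
                continue
            # Start a new instance and flood-fill it breadth-first.
            next_id += 1
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and energy[nr][nc] > level and not labels[nr][nc]):
                        labels[nr][nc] = next_id
                        queue.append((nr, nc))
    return labels


# Two touching objects: low energy along their shared border, higher inside.
energy = [
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 2, 1, 2, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
]
print(instances_from_energy(energy, level=0)[1])  # [0, 1, 1, 1, 1, 1, 0]
print(instances_from_energy(energy, level=1)[1])  # [0, 0, 1, 0, 2, 0, 0]
```

Cutting at level 0 merges the two objects into one component, while cutting at level 1 separates them. The separated regions are smaller than the true objects; growing them back to full size is not shown here.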

Training and Inference

● Pre-train both networks
● End-to-end fine-tuning
● Network trained on NVIDIA DGX-1
  ○ Approximately 25 hours total for training on one GP100 core
  ○ ~0.1s per image for forward pass
  ○ Thank you NVIDIA for the generous gift!

Image source: www.nvidia.com

Cityscapes Dataset

● 2975 training / 500 validation / 1525 testing images
● Instances: car, truck, bus, train, person, rider, motorcycle, bicycle

Cityscapes Instance Segmentation Leaderboard

Method                AP*    AP* @ 50%  AP* @ 50m  AP* @ 100m
van den Brand et al.  2.3%   3.7%       3.9%       4.9%
Cordts et al.         4.6%   12.9%      7.7%       10.3%
Uhrig et al.          8.9%   21.1%      15.3%      16.7%
Ours                  19.4%  35.3%      31.4%      36.8%

* Average Precision (AP): higher is better

Recently, new approaches have achieved even higher performance.
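To put the leaderboard numbers in perspective, the short script below compares "Ours" with Uhrig et al., the strongest of the other listed methods on every metric. It uses only the values from the leaderboard; the helper `gains` and the variable names are assumptions for illustration.

```python
# AP values in percent, copied from the leaderboard.
metrics = ("AP", "AP @ 50%", "AP @ 50m", "AP @ 100m")
results = {
    "van den Brand et al.": (2.3, 3.7, 3.9, 4.9),
    "Cordts et al.": (4.6, 12.9, 7.7, 10.3),
    "Uhrig et al.": (8.9, 21.1, 15.3, 16.7),
    "Ours": (19.4, 35.3, 31.4, 36.8),
}


def gains(ours, baseline):
    """Return (metric, absolute gain in points, ratio) for each metric."""
    return [
        (name, a - b, a / b)
        for name, a, b in zip(metrics, results[ours], results[baseline])
    ]


for name, points, ratio in gains("Ours", "Uhrig et al."):
    print(f"{name}: +{points:.1f} points, {ratio:.2f}x")
```

The absolute gain is largest at the 100 m range, and overall AP is roughly twice that of the next-best listed method.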

Sample Output

Figure panels: Input RGB, Semantic Segmentation, Direction Prediction, Energy Prediction