spp-netimlab.postech.ac.kr/dkim/class/csed514_2019s/sppnet.pdf · - 20-60x faster than r-cnn, as...

SPP-net: Spatial Pyramid Pooling in Deep Convolutional Networks

Post on 26-Dec-2019

Highlights

• ILSVRC 2014 (all provided-data tracks)

  • DET - 2nd

  • CLS - 3rd

  • LOC - 5th

• ECCV 2014 paper

  • Published 2 months ago (arXiv:1406.4729v1, June 18)

  • Details disclosed (arXiv:1406.4729v2)

Overview

• SPP-net - a new network structure

• Classification - improves all CNNs

• Detection - 20-60x faster than R-CNN, as accurate

Spatial Pyramid Matching

• SPM: very successful in traditional computer vision

  [Grauman & Darrell, ICCV 2005] “The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features”

  [Lazebnik et al., CVPR 2006] “Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories”

[Figure: traditional pipeline: dense SIFT → encoded (VQ, SC, FV) → SPM → SVM → prediction. CNN counterparts: “conv layers” → simply pooling? → “fc layers”.]

SPP-net: SPM in CNN

[Figure: traditional CNN: fixed-size input → conv → fc (4096, 4096, 1000). SPP-net: any-size input → conv → spatial pyramid pooling → fc (4096, 4096, 1000).]

• Fix bin numbers

• DO NOT fix bin size
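
The two rules above (fixed number of bins, bin size that follows the input) can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the function name `spatial_pyramid_pool`, the pyramid levels `(1, 2, 4)`, the max-pooling choice, and the floor/ceil bin edges are all assumptions made for the example.

```python
import numpy as np

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map into a fixed-length vector.

    Each level n uses a fixed n x n grid of bins; the bin size is derived
    from H and W, so it changes with the input size. `levels` is an
    illustrative choice, not a value taken from the slides.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceil for the end, so the bins
            # cover the whole map and are never empty.
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    # Concatenate all bins: length = C * sum(n * n for n in levels).
    return np.concatenate(pooled)

# Two feature maps of different spatial size give vectors of equal length.
small = np.random.rand(256, 13, 13)
large = np.random.rand(256, 10, 17)
print(spatial_pyramid_pool(small).shape, spatial_pyramid_pool(large).shape)
```

Because the output length depends only on the channel count and the levels, the fc layers that follow can keep a fixed input dimension while the image size varies.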

SPP-net

• variable input size/scale

  • multi-size training

  • multi-scale testing

  • full-image view

• multi-level pooling

  • robust to deformation

• operates on feature maps

  • pooling in regions

[Figure: input image → conv layers → conv feature maps → spatial pyramid pooling layer (concatenate) → fc layers.]

ILSVRC top-5 val error, % (10-view), 4 architectures

                         ZF-5    Convnet*-5   Overfeat-5   Overfeat-7
no-SPP baselines         14.76   13.92        13.52        11.97
multi-level pooling      14.14   13.54        12.80        11.12
+ multi-size training    13.64   13.33        12.33        10.95

All CNNs improved!

ILSVRC 2014 CLS Results

• “shallow”

  • 7-conv, 1 Titan GPU, 3 weeks

• but potential

  • SPP can improve deeper nets: >1% gain post-competition

team            top-5 test (%)
GoogLeNet       6.66
Oxford VGG      7.32
ours            8.06
Howard          8.11
DeeperVision    9.50
NUS-BST         9.79
TTIC_ECP        10.22

7-conv SPP-net, 10-view                                10.95%
7-conv SPP-net, multi-scale/view (96-view + 2-full)     9.08%
multiple SPP-nets                                       8.06%

Detection: SPP on Regions

[Figure: input image → conv layers → conv feature maps → SPP on a region of the feature maps → fc layers.]

R-CNN vs. SPP

• image regions vs. feature map regions

[Figure: R-CNN: 2000 nets on image regions (each image region → net → feature). SPP-net: 1 net on full image (image → net → feature maps → one feature per region).]
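
The contrast on this slide (one pass over the full image, then pooling per region on the feature maps) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the helper names `pool_region` and `region_features`, the total stride of 16, the floor/ceil projection of box corners, and the levels `(1, 2)` are all hypothetical choices for the example, and the random array stands in for real conv feature maps.

```python
import numpy as np

def pool_region(window, levels=(1, 2)):
    """Pyramid max-pool a (C, H, W) window into a fixed-length vector."""
    _, h, w = window.shape
    out = []
    for n in levels:
        for i in range(n):
            r0, r1 = (i * h) // n, ((i + 1) * h + n - 1) // n
            for j in range(n):
                c0, c1 = (j * w) // n, ((j + 1) * w + n - 1) // n
                out.append(window[:, r0:r1, c0:c1].max(axis=(1, 2)))
    return np.concatenate(out)

def region_features(feature_map, boxes, stride=16, levels=(1, 2)):
    """Pool one vector per image-space box (x0, y0, x1, y1).

    `stride` is an assumed total downsampling factor of the conv layers;
    the corner projection below is a simplification for illustration.
    """
    _, fh, fw = feature_map.shape
    feats = []
    for x0, y0, x1, y1 in boxes:
        # Project the box from image pixels onto feature-map cells,
        # keeping at least one cell in each direction.
        c0 = min(int(x0) // stride, fw - 1)
        r0 = min(int(y0) // stride, fh - 1)
        c1 = min(max((int(x1) + stride - 1) // stride, c0 + 1), fw)
        r1 = min(max((int(y1) + stride - 1) // stride, r0 + 1), fh)
        feats.append(pool_region(feature_map[:, r0:r1, c0:c1], levels))
    return np.stack(feats)

# The conv feature maps are computed once per image (random stand-in here),
# then every candidate region reuses them.
conv_maps = np.random.rand(256, 30, 40)
boxes = [(0, 0, 160, 120), (200, 100, 639, 479), (50, 60, 90, 300)]
feats = region_features(conv_maps, boxes)
print(feats.shape)
```

The expensive conv computation happens once regardless of how many boxes are pooled, which is the source of the speed difference the slide describes.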

• With regional features, we can do everything R-CNN does

  • fine-tune, SVM, bbox regression…

• similar accuracy, much faster

VOC 2007          SPP-net 1-scale   SPP-net 5-scale   R-CNN
mAP               58.0              59.2              58.5
GPU time / img    0.14s             0.38s             9s
speed-up          64x               24x               -
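
The speed-up row follows directly from the GPU times per image. A short check, using only the numbers quoted on this slide (the variable names are illustrative):

```python
rcnn_time = 9.0  # R-CNN GPU seconds per image, as quoted on the slide
sppnet_times = {"SPP-net 1-scale": 0.14, "SPP-net 5-scale": 0.38}

# Speed-up = R-CNN time divided by SPP-net time.
speedups = {name: rcnn_time / t for name, t in sppnet_times.items()}
for name, s in speedups.items():
    print(f"{name}: {s:.0f}x faster than R-CNN")
```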

Cost of a single model:

                  SPP-net   R-CNN
GPU time / img    0.6s      32s
40k test imgs     8 hours   15 days
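
As a rough sanity check, the per-image GPU times can be multiplied out over 40k images. The sketch below uses only the quoted per-image figures; it gives about 6.7 hours and 14.8 days, the same order as the 8 hours and 15 days quoted, and the slide does not say what accounts for the remaining difference.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

num_images = 40_000
sppnet_hours = num_images * 0.6 / SECONDS_PER_HOUR  # 0.6 s per image
rcnn_days = num_images * 32 / SECONDS_PER_DAY       # 32 s per image

print(f"SPP-net: {sppnet_hours:.1f} hours, R-CNN: {rcnn_days:.1f} days")
```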

ILSVRC 2014 DET Results

“provided data” track

team                    mAP
NUS                     37.2
ours, multi SPP-nets    35.1
UvA                     32.0
ours, 1 SPP-net         31.8
Southeast-CASIA         30.4
1-HKUST                 28.8
CASIA_CRIPAC_2          28.6

• Conclusion

  • SPM in CNNs

  • CLS: improve all CNNs in the literature

  • DET: practical, fast, and accurate

• Future work

  • SPP on advanced networks

• Resources

  • code, config, tech report… http://research.microsoft.com/en-us/um/people/kahe/

• Acknowledgement

  • We thank NVIDIA for the GPU donation.