![Page 1: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/1.jpg)
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
work done at Microsoft Research Asia
[Figure: ResNet-152 architecture. 7x7 conv, 64, /2, pool/2; then bottleneck blocks of (1x1 conv, 3x3 conv, 1x1 conv): 3 blocks with 64, 64, 256 filters; 8 blocks with 128, 128, 512 filters (first conv /2); 36 blocks with 256, 256, 1024 filters (first conv /2); 3 blocks with 512, 512, 2048 filters (first conv /2); ave pool, fc 1000]
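The depth of the network in the diagram above can be checked with a little arithmetic. The sketch below is only an illustration: the block counts per stage are read off the diagram, and the names `BLOCKS_PER_STAGE` and `count_weighted_layers` are hypothetical, not taken from the authors' code.

```python
# Bottleneck blocks per stage, keyed by the width of the first 1x1 conv.
# These counts are read off the diagram; they are an assumption of this sketch.
BLOCKS_PER_STAGE = {64: 3, 128: 8, 256: 36, 512: 3}
CONVS_PER_BLOCK = 3  # 1x1 conv -> 3x3 conv -> 1x1 conv


def count_weighted_layers(blocks_per_stage, convs_per_block=CONVS_PER_BLOCK):
    """Count conv/fc layers: the 7x7 stem, every conv in every block, and the fc."""
    stem = 1
    fc = 1
    body = convs_per_block * sum(blocks_per_stage.values())
    return stem + body + fc


depth = count_weighted_layers(BLOCKS_PER_STAGE)
print(depth)  # 152
```

Pooling layers carry no weights, so they are not counted.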
![Page 2: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/2.jpg)
ResNet @ ILSVRC & COCO 2015 Competitions
1st places in all five main tracks
• ImageNet Classification: “Ultra-deep” 152-layer nets
• ImageNet Detection: 16% better than 2nd
• ImageNet Localization: 27% better than 2nd
• COCO Detection: 11% better than 2nd
• COCO Segmentation: 12% better than 2nd
*improvements are relative numbers
Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”. CVPR 2016.
![Page 3: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/3.jpg)
Revolution of Depth
ImageNet Classification top-5 error (%):
ILSVRC'10: 28.2 (shallow)
ILSVRC'11: 25.8 (shallow)
ILSVRC'12 AlexNet: 16.4 (8 layers)
ILSVRC'13: 11.7 (8 layers)
ILSVRC'14 VGG: 7.3 (19 layers)
ILSVRC'14 GoogleNet: 6.7 (22 layers)
ILSVRC'15 ResNet: 3.57 (152 layers)
![Page 4: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/4.jpg)
Revolution of Depth
PASCAL VOC 2007 Object Detection mAP (%):
HOG, DPM: 34 (shallow)
AlexNet (RCNN): 58 (8 layers)
VGG (RCNN): 66 (16 layers)
ResNet (Faster RCNN)*: 86 (101 layers)
*w/ other improvements & more data
Engines of visual recognition
![Page 5: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/5.jpg)
Revolution of Depth
AlexNet, 8 layers (ILSVRC 2012)
11x11 conv, 96, /4, pool/2
5x5 conv, 256, pool/2
3x3 conv, 384
3x3 conv, 384
3x3 conv, 256, pool/2
fc, 4096
fc, 4096
fc, 1000
![Page 6: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/6.jpg)
Revolution of Depth
AlexNet, 8 layers (ILSVRC 2012): 11x11 conv, 96, /4, pool/2; 5x5 conv, 256, pool/2; 3x3 conv, 384 (x2); 3x3 conv, 256, pool/2; fc, 4096 (x2); fc, 1000
VGG, 19 layers (ILSVRC 2014): 3x3 conv, 64 (x2), pool/2; 3x3 conv, 128 (x2), pool/2; 3x3 conv, 256 (x4), pool/2; 3x3 conv, 512 (x4), pool/2; 3x3 conv, 512 (x4), pool/2; fc, 4096 (x2); fc, 1000
GoogleNet, 22 layers (ILSVRC 2014)
[Figure: GoogleNet layer graph. A stem of conv, max-pool, and LocalRespNorm layers, then repeated modules of parallel 1x1, 3x3, and 5x5 conv and max-pool branches joined by DepthConcat, with three softmax outputs (softmax0, softmax1, softmax2)]
![Page 7: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/7.jpg)
Revolution of Depth
[Figure: three architectures side by side: AlexNet, 8 layers (ILSVRC 2012); VGG, 19 layers (ILSVRC 2014); ResNet, 152 layers (ILSVRC 2015)]
ResNet, 152 layers (ILSVRC 2015): 7x7 conv, 64, /2, pool/2; 3 blocks of (1x1 conv, 64; 3x3 conv, 64; 1x1 conv, 256); 8 blocks of (1x1 conv, 128; 3x3 conv, 128; 1x1 conv, 512), first conv /2; 36 blocks of (1x1 conv, 256; 3x3 conv, 256; 1x1 conv, 1024), first conv /2; 3 blocks of (1x1 conv, 512; 3x3 conv, 512; 1x1 conv, 2048), first conv /2; ave pool, fc 1000
![Page 8: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/8.jpg)
Is learning better networks as simple as stacking more layers?
![Page 9: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/9.jpg)
Simply stacking layers?
[Figure: CIFAR-10 train error (%) and test error (%) vs. iter. (1e4), for 20-layer and 56-layer plain nets]
• Plain nets: stacking 3x3 conv layers…
• 56-layer net has higher training error and test error than 20-layer net
![Page 10: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/10.jpg)
Simply stacking layers?
[Figure: error (%) vs. iter. (1e4). CIFAR-10: plain-20, plain-32, plain-44, plain-56. ImageNet-1000: plain-18, plain-34. solid: test/val; dashed: train]
• “Overly deep” plain nets have higher training error
• A general phenomenon, observed in many datasets
![Page 11: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/11.jpg)
a shallower model (18 layers): 7x7 conv, 64, /2; 3x3 conv, 64 (x4); 3x3 conv, 128 (x4, first /2); 3x3 conv, 256 (x4, first /2); 3x3 conv, 512 (x4, first /2); fc 1000
a deeper counterpart (34 layers): 7x7 conv, 64, /2; 3x3 conv, 64 (x6); 3x3 conv, 128 (x8, first /2); 3x3 conv, 256 (x12, first /2); 3x3 conv, 512 (x6, first /2); fc 1000; the layers beyond those of the shallower model are the “extra” layers
• Richer solution space
• A deeper model should not have higher training error
• A solution by construction:
  • original layers: copied from a learned shallower model
  • extra layers: set as identity
  • at least the same training error
• Optimization difficulties: solvers cannot find the solution when going deeper…
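The construction argument above can be illustrated with a toy example. This is a minimal sketch under stated assumptions: small fully connected layers with random weights stand in for the "learned" shallower model, the extra identity layers are appended at the end rather than interleaved, and all names (`forward`, `identity`, `shallow`, `deep`) are hypothetical.

```python
import random


def relu(v):
    return [max(a, 0.0) for a in v]


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def forward(x, layers):
    """Plain net: every layer is a linear map followed by relu."""
    h = x
    for w in layers:
        h = relu(matvec(w, h))
    return h


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


random.seed(0)
n = 6
x = [random.gauss(0, 1) for _ in range(n)]

# Stand-in for a learned shallower model (random weights, for illustration only).
shallow = [[[random.gauss(0, 1) for _ in range(n)] for _ in range(n)] for _ in range(3)]

# Deeper counterpart: original layers copied, extra layers set as identity.
deep = shallow + [identity(n) for _ in range(4)]

out_shallow = forward(x, shallow)
out_deep = forward(x, deep)
same = all(abs(a - b) < 1e-12 for a, b in zip(out_shallow, out_deep))
print(same)  # True
```

The identity layers leave the output unchanged because the activations entering them are already non-negative after relu, so the deeper model has the same outputs, and therefore the same training error, as the shallower one.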
![Page 12: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/12.jpg)
Deep Residual Learning
• Plain net
[Figure: any two stacked layers: 𝑥, weight layer, relu, weight layer, relu, output 𝐻(𝑥)]
𝐻(𝑥) is any desired mapping,
hope the 2 weight layers fit 𝐻(𝑥)
![Page 13: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/13.jpg)
Deep Residual Learning
• Residual net
[Figure: 𝑥 passes through weight layer, relu, weight layer to give 𝐹(𝑥); an identity shortcut carries 𝑥 around both layers; the sum 𝐻(𝑥) = 𝐹(𝑥) + 𝑥 is followed by relu]
𝐻(𝑥) is any desired mapping,
instead of hoping the 2 weight layers fit 𝐻(𝑥),
hope the 2 weight layers fit 𝐹(𝑥)
let 𝐻(𝑥) = 𝐹(𝑥) + 𝑥
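A residual block of this shape can be sketched in a few lines. This is an illustration only: it uses small dense matrices in place of conv layers, places the second relu after the addition as in the paper's figure, and the names `residual_branch` and `residual_block` are hypothetical.

```python
def relu(v):
    return [max(a, 0.0) for a in v]


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def residual_branch(x, w1, w2):
    """F(x): weight layer -> relu -> weight layer."""
    return matvec(w2, relu(matvec(w1, x)))


def residual_block(x, w1, w2):
    """H(x) = F(x) + x, followed by the second relu."""
    f = residual_branch(x, w1, w2)
    return relu([fi + xi for fi, xi in zip(f, x)])


x = [1.0, 2.0]
w1 = [[1.0, 0.0], [0.0, -1.0]]
w2 = [[0.5, 0.0], [0.0, 0.5]]
y = residual_block(x, w1, w2)
print(y)  # [1.5, 2.0]
```

Here 𝐹(𝑥) = [0.5, 0.0], so the block only has to learn the small correction that is added to 𝑥.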
![Page 14: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/14.jpg)
Deep Residual Learning
• 𝐹(𝑥) is a residual mapping w.r.t. identity
• If identity were optimal, easy to set weights as 0
• If optimal mapping is closer to identity, easier to find small fluctuations
[Figure: residual block, 𝐻(𝑥) = 𝐹(𝑥) + 𝑥, with identity shortcut 𝑥]
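The point about setting weights to 0 can be checked on a toy block. The sketch below is an assumption-laden illustration (dense matrices instead of conv layers, hypothetical names `plain_block` and `residual_block`): with all weights at zero, the residual block passes its input through, while the plain block outputs zeros.

```python
def relu(v):
    return [max(a, 0.0) for a in v]


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def plain_block(x, w1, w2):
    """Two stacked layers that must fit H(x) directly."""
    return relu(matvec(w2, relu(matvec(w1, x))))


def residual_block(x, w1, w2):
    """Two stacked layers fit F(x); the block outputs relu(F(x) + x)."""
    f = matvec(w2, relu(matvec(w1, x)))
    return relu([fi + xi for fi, xi in zip(f, x)])


def zeros(n):
    return [[0.0] * n for _ in range(n)]


# Non-negative input, as if produced by an earlier relu.
x = [0.3, 1.2, 0.0]
w = zeros(3)

plain_out = plain_block(x, w, w)
res_out = residual_block(x, w, w)
print(plain_out)  # [0.0, 0.0, 0.0]
print(res_out)    # [0.3, 1.2, 0.0]
```

Because of the relu after the addition, the block is an exact identity only for non-negative inputs, which is the case for activations coming out of a previous relu.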
![Page 15: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/15.jpg)
Network “Design”
• Keep it simple
• Our basic design (VGG-style)
  • all 3x3 conv (almost)
  • spatial size /2 => # filters x2
• Simple design; just deep!
[Figure: plain net and ResNet side by side, both with the same 34 layers: 7x7 conv, 64, /2; pool, /2; 3x3 conv, 64 (x6); 3x3 conv, 128 (x8, first /2); 3x3 conv, 256 (x12, first /2); 3x3 conv, 512 (x6, first /2); avg pool; fc 1000]
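The rule "spatial size /2 => # filters x2" can be traced stage by stage. The sketch below is illustrative: the 224-pixel input size is an assumption not stated on the slide, the per-stage conv counts are read off the 34-layer listing, and `stage_plan` is a hypothetical helper name.

```python
def stage_plan(input_size=224, base_filters=64, convs_per_stage=(6, 8, 12, 6)):
    """Return (feature-map size, filters, number of 3x3 convs) for each stage.

    input_size=224 is an assumed value; the conv counts follow the
    34-layer listing.
    """
    size = input_size // 4  # 7x7 conv, /2 followed by pool, /2
    filters = base_filters
    plan = []
    for i, n_convs in enumerate(convs_per_stage):
        if i > 0:
            size //= 2    # first conv of the stage has stride 2
            filters *= 2  # so the number of filters doubles
        plan.append((size, filters, n_convs))
    return plan


plan = stage_plan()
for size, filters, n_convs in plan:
    print(f"{size}x{size} feature map, {filters} filters, {n_convs} conv layers")

# 7x7 stem conv + all 3x3 convs + fc
depth = 1 + sum(n for _, _, n in plan) + 1
print(depth)  # 34
```

Halving the spatial size while doubling the filters keeps the design uniform from stage to stage, which is the "keep it simple" point of the slide.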
![Page 16: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/16.jpg)
CIFAR-10 experiments
[Figure: error (%) vs. iter. (1e4). CIFAR-10 plain nets: plain-20, plain-32, plain-44, plain-56. CIFAR-10 ResNets: ResNet-20, ResNet-32, ResNet-44, ResNet-56, ResNet-110. solid: test; dashed: train]
• Deep ResNets can be trained without difficulties
• Deeper ResNets have lower training error, and also lower test error
![Page 17: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/17.jpg)
ImageNet experiments
[Figure: error (%) vs. iter. (1e4). ImageNet plain nets: plain-18, plain-34. ImageNet ResNets: ResNet-18, ResNet-34. solid: test; dashed: train]
• Deep ResNets can be trained without difficulties
• Deeper ResNets have lower training error, and also lower test error
![Page 18: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/18.jpg)
ImageNet experiments
10-crop testing, top-5 val error (%):
ResNet-34: 7.4
ResNet-50: 6.7
ResNet-101: 6.1
ResNet-152: 5.7 (this model has lower time complexity than VGG-16/19)
• Deeper ResNets have lower error
![Page 19: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/19.jpg)
Beyond classification
A treasure from ImageNet is on learning features.
![Page 20: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/20.jpg)
“Features matter.” (quote [Girshick et al. 2014], the R-CNN paper)

| task | 2nd-place winner | ResNets | margin (relative) |
| --- | --- | --- | --- |
| ImageNet Localization (top-5 error) | 12.0 | 9.0 | 27% |
| ImageNet Detection (mAP@.5) | 53.6 | 62.1 | 16% |
| COCO Detection (mAP@.5:.95) | 33.5 | 37.3 | 11% |
| COCO Segmentation (mAP@.5:.95) | 25.1 | 28.2 | 12% |

• Our results are all based on ResNet-101
• Our features are well transferable
ImageNet Detection: absolute 8.5% better!
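The relative margins can be recomputed from the values shown above. The sketch below does this for the three mAP rows, where higher is better; the helper name `relative_gain` is hypothetical, and the inputs are the rounded numbers from the slide.

```python
def relative_gain(baseline, ours):
    """Relative improvement of `ours` over `baseline` for a higher-is-better metric."""
    return (ours - baseline) / baseline


# (2nd-place winner, ResNets), values as shown on the slide.
rows = {
    "ImageNet Detection (mAP@.5)": (53.6, 62.1),
    "COCO Detection (mAP@.5:.95)": (33.5, 37.3),
    "COCO Segmentation (mAP@.5:.95)": (25.1, 28.2),
}

margins = {task: round(100 * relative_gain(b, o)) for task, (b, o) in rows.items()}
for task, pct in margins.items():
    print(f"{task}: {pct}%")

# Absolute gain on ImageNet Detection, in mAP points.
absolute_detection_gain = round(62.1 - 53.6, 1)
print(absolute_detection_gain)  # 8.5
```

The printed margins are 16%, 11%, and 12%, matching the stated relative numbers for these rows.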
![Page 21: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/21.jpg)
Object Detection (brief)
• Simply “Faster R-CNN + ResNet”
Shaoqing Ren, Kaiming He, Ross Girshick, & Jian Sun. “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”. NIPS 2015.
[Figure: Faster R-CNN pipeline: image, CNN, feature map, Region Proposal Net, proposals, RoI pooling, classifier]

| Faster R-CNN baseline | mAP@.5 | mAP@.5:.95 |
| --- | --- | --- |
| VGG-16 | 41.5 | 21.5 |
| ResNet-101 | 48.4 | 27.2 |

COCO detection results (ResNet has 28% relative gain)
![Page 22: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/22.jpg)
Our results on MS COCO
*the original image is from the COCO dataset
![Page 23: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/23.jpg)
Results on real video. Model trained on MS COCO w/ 80 categories. (frame-by-frame; no temporal processing)
this video is available online: https://youtu.be/WZmSMkK9VuA
![Page 24: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/24.jpg)
More Visual Recognition Tasks
ResNets lead on these benchmarks (incomplete list):
• ImageNet classification, detection, localization
• MS COCO detection, segmentation
• PASCAL VOC detection, segmentation
• VQA challenge 2016
• Human pose estimation [Newell et al 2016]
• Depth estimation [Laina et al 2016]
• Segment proposal [Pinheiro et al 2016]
• …
[Figure: PASCAL detection leaderboard and PASCAL segmentation leaderboard, each with a ResNet-101 entry marked]
![Page 25: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/25.jpg)
Potential Applications
ResNets have shown outstanding or promising results on:
Visual Recognition
Image Generation (Pixel RNN, Neural Art, etc.)
Natural Language Processing (Very deep CNN)
Speech Recognition (preliminary results)
Advertising, user prediction (preliminary results)
![Page 26: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/26.jpg)
Conclusions
• Deep Residual Networks:
  • Easy to train
  • Simply gain accuracy from depth
  • Well transferable
• Follow-up [He et al. arXiv 2016]
  • 200 layers on ImageNet, 1000 layers on CIFAR
Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Identity Mappings in Deep Residual Networks”. arXiv 2016.
![Page 27: cvpr2016 deep residual learning kaiminghekaiminghe.com/cvpr16resnet/cvpr2016_deep_residual... · 2017-01-22 · Deep Residual Learning for Image Recognition Kaiming He, Xiangyu Zhang,](https://reader034.vdocuments.us/reader034/viewer/2022042804/5f577c938ca92f04b86ca02d/html5/thumbnails/27.jpg)
Resources
• Models and Code
  • Our ImageNet models in Caffe: https://github.com/KaimingHe/deep-residual-networks
• Many available implementations: (list in https://github.com/KaimingHe/deep-residual-networks)
  • Facebook AI Research’s Torch ResNet: https://github.com/facebook/fb.resnet.torch
  • Torch, CIFAR-10, with ResNet-20 to ResNet-110, training code, and curves: code
  • Lasagne, CIFAR-10, with ResNet-32 and ResNet-56 and training code: code
  • Neon, CIFAR-10, with pre-trained ResNet-32 to ResNet-110 models, training code, and curves: code
  • Torch, MNIST, 100 layers: blog, code
  • A winning entry in Kaggle's right whale recognition challenge: blog, code
  • Neon, Place2 (mini), 40 layers: blog, code
  • …