TRANSCRIPT
![Page 1: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/1.jpg)
Classification and Semantic Segmentation
Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, Serena Yeung, Jia-Bin Huang, Bharath Hariharan, Jeremy Jordan
![Page 2: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/2.jpg)
Supervised Machine Learning
▪ Training data $x_1, \dots, x_N$ with true labels (targets) $y_1, \dots, y_N$
▪ Choose a hypothesis class $h(x, W)$
▪ Define a loss function for $x$ when the true label is $y$
  ▪ e.g. $L(h(x, W), y) = (y - h(x, W))^2$
▪ Training stage
  ▪ minimize total loss on training set using gradient descent
$$\min_W \sum_{i=1}^{N} L(h(x_i, W), y_i)$$
▪ Test stage
  ▪ compute accuracy on test data, unseen during training
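A minimal sketch of the training stage, assuming a one-parameter hypothesis $h(x, W) = Wx$ with the squared loss. The toy data, learning rate, and step count are illustrative assumptions, not values from the slides.

```python
def h(x, W):
    # Hypothetical hypothesis class: a line through the origin.
    return W * x

def total_loss(xs, ys, W):
    # Sum over the training set of L(h(x_i, W), y_i) = (y_i - h(x_i, W))^2.
    return sum((y - h(x, W)) ** 2 for x, y in zip(xs, ys))

def grad(xs, ys, W):
    # Derivative of the total loss with respect to W.
    return sum(-2.0 * x * (y - h(x, W)) for x, y in zip(xs, ys))

def train(xs, ys, W=0.0, lr=0.01, steps=200):
    # Plain gradient descent with a fixed learning rate (assumed values).
    for _ in range(steps):
        W -= lr * grad(xs, ys, W)
    return W

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # toy targets generated by y = 2x
W_fit = train(xs, ys)
```

On this toy set the fitted weight approaches 2, the slope used to generate the targets.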
![Page 3: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/3.jpg)
Single Layer Neural Network on Images
▪ 2 classes (cat vs dog)
▪ $h(x, W) = \sigma(Wx)$
  ▪ range in (0,1)
![Page 4: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/4.jpg)
Single Layer Neural Network on Images
▪ 2 classes (cat vs dog)
▪ $h(x, W) = \sigma(Wx)$
  ▪ range in (0,1)
[Figure: image → $W$ → sigmoid → p(dog)]
▪ Also called Linear Classifier
▪ Works well only for linearly separable classes
  ▪ not expressive enough
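As a rough illustration of $h(x, W) = \sigma(Wx)$, the sketch below scores a flattened toy "image" with a single weight vector. The three-pixel input and the weights are made-up values for illustration only.

```python
import math

def sigmoid(z):
    # Squashes any real score into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def predict_dog(x, W):
    # h(x, W) = sigma(W x): dot product of weights and flattened pixels.
    z = sum(w * xi for w, xi in zip(W, x))
    return sigmoid(z)

x = [0.2, 0.8, 0.5]      # toy image flattened to 3 pixels (assumed)
W = [1.0, -2.0, 0.5]     # arbitrary weights for illustration
p_dog = predict_dog(x, W)
p_cat = 1.0 - p_dog      # two classes, so the probabilities sum to 1
```

The decision boundary is the set of inputs where $Wx = 0$, which is why this model only separates linearly separable classes.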
![Page 5: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/5.jpg)
Single Layer NN for multiple classes
▪ Several classes (dog, cat, horse)
[Figure: softmax outputs p(horse), p(dog), p(cat)]
▪ One-hot encoding for labels $y$: horse = $(1, 0, 0)$, dog = $(0, 1, 0)$, cat = $(0, 0, 1)$
▪ Cross-entropy loss
$$-\sum_{\text{classes}} y_{\text{true}} \log(y_{\text{pred}})$$
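The softmax output and the cross-entropy loss can be sketched as follows. The class scores are hypothetical; only the one-hot label for "horse" comes from the slide.

```python
import math

def softmax(scores):
    m = max(scores)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y_true, y_pred):
    # -sum over classes of y_true * log(y_pred); zero-label terms are skipped.
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

scores = [2.0, 1.0, 0.1]     # hypothetical scores for (horse, dog, cat)
y_pred = softmax(scores)
y_horse = [1, 0, 0]          # one-hot label for "horse"
loss = cross_entropy(y_horse, y_pred)
```

With a one-hot label the sum collapses to a single term, the negative log-probability assigned to the true class.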
![Page 6: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/6.jpg)
Multilayer Neural Network on images
▪ cat vs dog
[Figure: 256x256 input → Linear + ReLU → Linear + ReLU → Linear + sigmoid → p(dog); layer widths shown: 2048, 1024, 32]
▪ Layers are called fully-connected
▪ Expressive enough, but huge number of parameters
  ▪ expensive, requires lots of data to train well
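To get a feel for "huge number of parameters", the sketch below counts the weights of one fully-connected layer. It assumes a single-channel 256x256 input and takes 2048 as the width of the first hidden layer, read off the figure labels; both are assumptions.

```python
def fc_params(n_in, n_out, bias=True):
    # A fully-connected layer has one weight per (input, output) pair.
    return n_in * n_out + (n_out if bias else 0)

n_pixels = 256 * 256                       # inputs for a 256x256 single-channel image
first_layer = fc_params(n_pixels, 2048)    # assumed width of the first hidden layer
```

Under these assumptions the first layer alone holds roughly 134 million parameters.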
![Page 7: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/7.jpg)
Reducing Number of Parameters
[Figure: a 256x256 image has 65,536 input pixels]
![Page 8: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/8.jpg)
Idea 1: local connectivity
Pixels only related to nearby pixels
![Page 9: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/9.jpg)
Idea 2: Translation invariance
Pixels only related to nearby pixels
Weights should not depend on the location of the neighborhood
![Page 10: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/10.jpg)
Linear function + translation invariance = convolution
▪ Local connectivity determines kernel size
5.4 0.1 3.6
1.8 2.3 4.5
1.1 3.4 7.2
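A minimal sketch of a 2D convolution as used in CNNs (a sliding-window sum of products without kernel flipping, and without padding). The image and the all-ones kernel are toy values, not the numbers from the slide.

```python
def conv2d(image, kernel):
    # The same kernel weights are applied at every location (translation
    # invariance); the kernel size sets how local the connectivity is.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[a][b] * image[i + a][j + b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

image = [[1, 2, 3, 0],
         [4, 5, 6, 0],
         [7, 8, 9, 0],
         [0, 0, 0, 0]]
box = [[1, 1, 1]] * 3        # 3x3 kernel of ones: sums each neighborhood
out = conv2d(image, box)     # 2x2 output
```

A 3x3 kernel has 9 weights regardless of the image size, in contrast to one weight per pixel per output in a fully-connected layer.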
![Page 11: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/11.jpg)
Convolution over multiple channels
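[Figure: each input channel is convolved (*) with its own kernel and the results are summed (+) into one output feature map]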
![Page 12: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/12.jpg)
CNN: Convolutional layer
[Figure: a convolution maps a w × h × c input to a w × h × c′ output]
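A sketch of a convolutional layer mapping c input channels to c′ output channels. For brevity it uses no padding, so the output is slightly smaller than the w × h shown on the slide; the all-ones inputs and filters are toy values.

```python
def conv_layer(x, filters):
    # x: c input channels, each an h x w grid.
    # filters: c' filters, each holding c kernels of size k x k.
    # Every output channel sums the sliding-window products over all input channels.
    c, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(filters[0][0])
    out = []
    for f in filters:
        fmap = [[sum(f[ch][a][b] * x[ch][i + a][j + b]
                     for ch in range(c) for a in range(k) for b in range(k))
                 for j in range(w - k + 1)]
                for i in range(h - k + 1)]
        out.append(fmap)
    return out

def conv_params(c_in, c_out, k):
    # Weight count (no bias): independent of the image size.
    return c_out * c_in * k * k

x = [[[1] * 3 for _ in range(3)] for _ in range(2)]                          # c = 2, 3x3
filters = [[[[1] * 2 for _ in range(2)] for _ in range(2)] for _ in range(4)]  # c' = 4, 2x2
y = conv_layer(x, filters)
```

Here `y` has 4 channels of size 2x2, and `conv_params(3, 64, 3)` gives 1,728 weights for a 3x3 layer from 3 to 64 channels.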
![Page 13: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/13.jpg)
CNN: Convolution Subsampling Convolution
▪ Subsampling can be implemented by applying convolution in strides
  ▪ every 2 (or 3, or 4, …) pixels
▪ number of features is usually increased after subsampling, to maintain expressiveness
▪ Convolution in earlier steps detects more local patterns, which are less resilient to distortion
▪ Convolution in later steps detects more global patterns, which are more resilient to distortion
▪ Subsampling allows capture of larger, more invariant patterns
![Page 14: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/14.jpg)
Invariance to distortions: Pooling
▪ Each window reduced to one value
  ▪ with max or average
…
![Page 15: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/15.jpg)
Invariance to distortions: Max Pooling
Input (6x6):
4 7 6 9 3 11
8 3 21 4 0 0
1 2 1 3 5 6
7 9 4 3 1 8
5 2 1 5 5 0
0 1 6 4 5 6
Output of max pooling over 2x2 windows (3x3):
8 21 11
9 4 8
5 6 6
![Page 16: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/16.jpg)
Invariance to distortions: Average Pooling
Input (6x6):
4 7 6 9 3 11
8 3 21 4 0 0
1 2 1 3 5 6
7 9 4 3 1 8
5 2 1 5 5 0
0 1 6 4 5 6
Output of average pooling over 2x2 windows (3x3):
5.5 10 3.5
4.75 2.75 5
2 4 4
▪ Each pooling layer takes a collection of feature maps as input and produces a collection of feature maps as output
▪ Output feature maps are usually smaller in height and width
▪ Parameters: None
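The pooling examples from the slides can be reproduced with a short sketch. It assumes non-overlapping 2x2 windows (stride equal to the window size), which matches the input and output grids shown; the function name `pool2d` is hypothetical.

```python
def pool2d(fmap, size=2, mode="max"):
    # Each size x size window is reduced to one value; nothing is learned.
    agg = max if mode == "max" else (lambda v: sum(v) / len(v))
    return [[agg([fmap[i + a][j + b] for a in range(size) for b in range(size)])
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

fmap = [[4, 7, 6, 9, 3, 11],
        [8, 3, 21, 4, 0, 0],
        [1, 2, 1, 3, 5, 6],
        [7, 9, 4, 3, 1, 8],
        [5, 2, 1, 5, 5, 0],
        [0, 1, 6, 4, 5, 6]]
max_out = pool2d(fmap, mode="max")        # [[8, 21, 11], [9, 4, 8], [5, 6, 6]]
avg_out = pool2d(fmap, mode="average")    # [[5.5, 10.0, 3.5], [4.75, 2.75, 5.0], [2.0, 4.0, 4.0]]
```

In a network the same reduction is applied independently to every feature map, so the number of channels is unchanged while height and width shrink.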
![Page 17: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/17.jpg)
CNN: Pooling Layer
![Page 18: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/18.jpg)
Convolutional networks
[Figure: image → convolutional and pooling layers → fully connected layers → "Horse"]
![Page 19: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/19.jpg)
First Successful Classification CNN
Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86.11 (1998): 2278-2324.
![Page 20: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/20.jpg)
AlexNet - 2012
▪ Won ImageNet competition by a large margin
![Page 21: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/21.jpg)
VGGNet 2014
▪ First simple widely used net
▪ Smaller filters and deeper network
![Page 22: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/22.jpg)
ResNet 2015
▪ Many more layers
▪ Special skip connections for better training
![Page 23: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/23.jpg)
Semantic Segmentation
[Figure: image with every pixel labeled by class: person, grass, trees, motorbike, road]
![Page 24: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/24.jpg)
Semantic Segmentation
![Page 25: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/25.jpg)
Semantic Segmentation: One-hot encoding
![Page 26: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/26.jpg)
Semantic Segmentation: Cross-Entropy Loss Function
$$-\sum_{\text{classes}} y_{\text{true}} \log(y_{\text{pred}})$$
▪ Pixelwise loss
▪ Added over all pixels
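A minimal sketch of the pixelwise loss: the classification cross-entropy is evaluated at each pixel and the results are added. The 1x2 "image" and its predicted probabilities are toy values.

```python
import math

def pixelwise_cross_entropy(y_true, y_pred):
    # y_true[i][j]: one-hot list over classes; y_pred[i][j]: predicted probabilities.
    # Per-pixel loss is -sum_classes y_true * log(y_pred), added over all pixels.
    total = 0.0
    for row_t, row_p in zip(y_true, y_pred):
        for t, p in zip(row_t, row_p):
            total -= sum(tc * math.log(pc) for tc, pc in zip(t, p) if tc > 0)
    return total

y_true = [[[1, 0], [0, 1]]]                 # 1 x 2 image, 2 classes
y_pred = [[[0.5, 0.5], [0.25, 0.75]]]
loss = pixelwise_cross_entropy(y_true, y_pred)   # -log(0.5) - log(0.75)
```

Implementations often average over pixels instead of summing; that only rescales the loss.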
![Page 27: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/27.jpg)
Semantic Segmentation with CNNs
[Figure: input image of size h × w × 3]
![Page 28: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/28.jpg)
Semantic Segmentation with CNNs
[Figure: feature map of size h/4 × w/4 × d]
![Page 29: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/29.jpg)
Semantic Segmentation with CNNs
[Figure: feature map of size h/4 × w/4 × d]
![Page 30: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/30.jpg)
Semantic Segmentation with CNNs
[Figure: feature map of size h/4 × w/4 × d]
▪ $d$ good features for classifying the top-left ‘pixel’
![Page 31: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/31.jpg)
Semantic Segmentation with CNNs
[Figure: feature map of size h/4 × w/4 × $d$ mapped to a score map of size h/4 × w/4 × $c$]
▪ convolve with $c$ filters of size 1x1
▪ Finally pass the $c$ features of each pixel through softmax
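A sketch of the last two steps, assuming the feature map is stored as a grid of d-vectors: a 1x1 convolution is the same linear map applied at every pixel, followed by a per-pixel softmax. Feature values and weights are toy numbers.

```python
import math

def conv1x1(features, weights):
    # features: h x w grid of d-vectors; weights: c filters, each a d-vector.
    return [[[sum(wk * fk for wk, fk in zip(w, f)) for w in weights]
             for f in row] for row in features]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

features = [[[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]]   # 1 x 2 grid, d = 3 (toy values)
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]      # c = 2 filters of size 1x1xd
scores = conv1x1(features, weights)               # 1 x 2 grid of c-vectors
probs = [[softmax(s) for s in row] for row in scores]
```

Each pixel ends up with its own probability vector over the c classes.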
![Page 32: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/32.jpg)
Semantic Segmentation with CNNs
▪ Pass image through convolution and subsampling layers
▪ Final convolution with #classes outputs
▪ Get scores for subsampled image
▪ Upsample back to original size
[Figure: segmentation output with classes person, bicycle]
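The last two steps can be sketched as below, using nearest-neighbor upsampling as a stand-in (an assumption; the slides do not fix the upsampling method here). The coarse scores are toy values for a 1x2 grid with two classes.

```python
def upsample_nearest(grid, factor):
    # Repeat every entry `factor` times along both axes.
    return [[v for v in row for _ in range(factor)]
            for row in grid for _ in range(factor)]

def argmax_labels(scores):
    # scores: h x w grid of per-class score lists -> grid of class indices.
    return [[max(range(len(s)), key=s.__getitem__) for s in row] for row in scores]

coarse_scores = [[[0.9, 0.1], [0.2, 0.8]]]        # scores on the subsampled grid
full_scores = upsample_nearest(coarse_scores, 4)  # back to 4x the resolution
labels = argmax_labels(full_scores)               # 4 x 8 label map
```

Every output pixel inherits the label of its coarse cell, so object boundaries come out blocky.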
![Page 33: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/33.jpg)
The Resolution Issue
▪ Problem: Need fine details!
▪ Shallower network/earlier layers?
  ▪ not very semantic
Visualizations from : M. Zeiler and R. Fergus. Visualizing and Understanding Convolutional Networks. In ECCV 2014
![Page 34: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/34.jpg)
The Resolution Issue
▪ Problem: Need fine details!
▪ Remove subsampling?
  ▪ Need many features per pixel
    ▪ expensive without subsampling
  ▪ Need large field of view for final features
    ▪ very deep network, expensive without subsampling
![Page 35: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/35.jpg)
Solution 1: Image pyramids
Learning Hierarchical Features for Scene Labeling. Clement Farabet, Camille Couprie, Laurent Najman, Yann LeCun. In TPAMI, 2013
[Figure: image pyramid; axis labels: higher resolution, less context]
▪ Does not scale well to deep architectures
![Page 36: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/36.jpg)
Solution 2: CNN+Conditional Random Fields
▪ Combine with CRF as post-processing
▪ “Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs”, Chen et al., ICLR 2015
![Page 37: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/37.jpg)
Solution 2: CNN+Conditional Random Fields
[Figure: input → CNN → class probabilities → Full CRF (RNN) → final output]
◼ Combine with CRF in end-to-end trainable system
  ◼ mean field inference implemented as RNN
◼ Zheng et al., “Conditional Random Fields as Recurrent Neural Networks”, ICCV 2015
![Page 38: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/38.jpg)
Solution 3: Learn to Upsample
◼ Encoder/decoder structure
Long, Shelhamer, and Darrell, “Fully Convolutional Networks for Semantic Segmentation”, CVPR 2015
Noh et al., “Learning Deconvolution Network for Semantic Segmentation”, ICCV 2015
Badrinarayanan et al., “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation”, TPAMI 2017
![Page 39: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/39.jpg)
Methods for Upsampling
![Page 40: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/40.jpg)
Methods for Upsampling
![Page 41: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/41.jpg)
Decoding using only Upsampling
▪ struggles to produce fine-grained segmentations
▪ From Long et al.: “Semantic segmentation faces an inherent tension between semantics and location: global information resolves what while local information resolves where... Combining fine layers and coarse layers lets the model make local predictions that respect global structure.”
![Page 42: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/42.jpg)
Solution 4: Skip connections
upsample
![Page 43: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/43.jpg)
Skip connections
Fully convolutional networks for semantic segmentation. Evan Shelhamer, Jon Long, Trevor Darrell. In CVPR 2015
[Figure: segmentation results without skip connections vs. with skip connections]
![Page 44: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/44.jpg)
Solution 5: Dilation
▪ Need subsampling to allow convolutional layers to capture large regions with small filters
  ▪ can we do this without subsampling?
![Page 45: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/45.jpg)
Solution 5: Dilation
▪ Need subsampling to allow convolutional layers to capture large regions with small filters
  ▪ can we do this without subsampling?
Fully convolutional networks for semantic segmentation. Evan Shelhamer, Jon Long, Trevor Darrell. In CVPR 2015
Multi-Scale Context Aggregation by Dilated Convolutions. Yu et al., ICLR 2016
![Page 46: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/46.jpg)
Solution 5: Dilation
▪ Need subsampling to allow convolutional layers to capture large regions with small filters
  ▪ can we do this without subsampling?
![Page 47: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/47.jpg)
Solution 5: Dilation
▪ Instead of subsampling by factor of 2: dilate by factor of 2
  ▪ allows for exponential increase in field of view without decrease of spatial dimensions
▪ Not a panacea: without subsampling, feature maps are much larger
  ▪ memory issues
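A 1D sketch of dilation, plus a helper that computes the field of view of stacked stride-1 layers. The doubling dilation schedule (1, 2, 4, 8) is an illustrative assumption, as are the function names.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    # Kernel taps are spaced `dilation` samples apart. The output keeps every
    # position where the dilated kernel fits (no padding, no subsampling).
    span = (len(kernel) - 1) * dilation + 1
    return [sum(k * signal[i + t * dilation] for t, k in enumerate(kernel))
            for i in range(len(signal) - span + 1)]

def receptive_field(kernel_size, dilations):
    # Field of view (in input samples) after stacking stride-1 layers.
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

dense = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=1)    # [6, 9, 12]
sparse = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=2)   # [9]
rf_dilated = receptive_field(3, [1, 2, 4, 8])                     # 31
rf_plain = receptive_field(3, [1, 1, 1, 1])                       # 9
```

With the same four layers of size-3 kernels, doubling the dilation widens the field of view from 9 to 31 samples while the feature map is never subsampled, which is also why memory use stays high.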
![Page 48: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/48.jpg)
Putting it all together
[Figure: bar chart of mean IoU on PASCAL VOC (axis from 55 to 70) for Basic, +Skip, +Dilation, +CRF]
Best Non-CNN approach: ~46.4%
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan Yuille. In ICLR, 2015.
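For reference, mean IoU (intersection over union, averaged over classes) can be sketched as below for a single flattened label map. Benchmarks such as PASCAL VOC accumulate the counts over the whole test set; this per-image version and its toy labels are simplifying assumptions.

```python
def mean_iou(pred, truth, num_classes):
    # pred, truth: flat lists of class indices, one per pixel.
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:                  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)

truth = [0, 0, 1, 1, 1, 2]
pred = [0, 1, 1, 1, 2, 2]
score = mean_iou(pred, truth, 3)       # each class has IoU 0.5
```

Unlike pixel accuracy, this metric weights every class equally, so small classes are not swamped by large ones.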
![Page 49: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/49.jpg)
More Architectures: U-net
▪ expanding the decoder with symmetry
“U-Net: Convolutional Networks for Biomedical Image Segmentation”, Ronneberger et al., MICCAI 2015
![Page 50: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/50.jpg)
PSPNet
▪ Pyramid Pooling Module
  ▪ new module to capture global scene context
▪ 82.6 mean IoU on PASCAL VOC
“Pyramid Scene Parsing Network”, Zhao et al., CVPR 2017
![Page 51: Classification and Semantic Segmentationyboykov/Courses/cs898/Lectures/lec5... · Semantic Segmentation Most slides are from Fei-Fei Li, Justin Johnson, Andrej Karpathy, SerenaYeung,](https://reader033.vdocuments.us/reader033/viewer/2022053020/5f2b3092ecb3bc731e416538/html5/thumbnails/51.jpg)
ICNet for Real-Time Semantic Segmentation
“ICNet for Real-Time Semantic Segmentation on High-Resolution Images”, Zhao et al., ECCV 2018
▪ Apply heavier CNN to small resolution