
Page 1: Introduction to Convolutional Neural Networks (web.stanford.edu/class/cs231a/lectures/intro_cnn.pdf, 2018-02-25)

Introduction to Convolutional Neural Networks

2018 / 02 / 23

Page 2:

Buzzword: CNN

Convolutional neural networks (CNN, ConvNet) are a class of deep, feed-forward (not recurrent) artificial neural networks applied to analyzing visual imagery.

Page 3:

Buzzword: CNN

● Convolution

From Wikipedia:
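The definition that followed here appeared as a figure; Wikipedia's standard convolution formula, in continuous and discrete form, is:

```latex
(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau
\qquad
(f * g)[n] = \sum_{m=-\infty}^{\infty} f[m]\, g[n - m]
```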

Page 4:

Buzzword: CNN

● Neural Networks

Page 5:

Background: Visual Signal Perception

Page 6:

Background: Signal Relay

Starting from the V1 primary visual cortex, the visual signal is transmitted upward, becoming more complicated and abstract.

Page 7:

Background: Neural Networks

Expressing the equations in matrix form, we have:
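The matrix-form equations appeared as a figure; a standard reconstruction of a fully connected layer in matrix form is:

```latex
\mathbf{a}^{(l)} = \sigma\!\left( W^{(l)} \mathbf{a}^{(l-1)} + \mathbf{b}^{(l)} \right)
```

where $W^{(l)}$ is the layer's weight matrix, $\mathbf{b}^{(l)}$ its bias vector, and $\sigma$ an elementwise activation function.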

Page 8:

Page 9:

Neural Networks for Images

For computer vision, why can't we just flatten the image and feed it through a neural network?

Page 10:

Neural Networks for Images

Images are high-dimensional vectors, so it would take a huge number of parameters to characterize the network.
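To make the parameter blow-up concrete, here is a quick back-of-the-envelope count for one fully connected layer on a raw image (the 224×224×3 input and 1000 hidden units are illustrative sizes, not from the slides):

```python
# Parameter count for a single fully connected layer on a flattened image.
height, width, channels = 224, 224, 3
hidden_units = 1000

input_dim = height * width * channels      # 150,528 input values
weights = input_dim * hidden_units         # one weight per (input, unit) pair
biases = hidden_units
total = weights + biases
print(total)  # 150529000 -- over 150 million parameters for one layer
```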

Page 11:

Convolutional Neural Networks

To address this problem, bionic convolutional neural networks were proposed to reduce the number of parameters and adapt the network architecture specifically to vision tasks.

Convolutional neural networks are usually composed of a set of layers that can be grouped by their functionality.

Page 12:

Sample Architecture

Page 13:

Convolution Layer

● The process is a 2D convolution on the inputs.

● The “dot products” between weights and inputs are “integrated” across “channels”.

● Filter weights are shared across receptive fields. The filter has the same number of channels as the input volume, and the output volume has the same "depth" as the number of filters.
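The shape behaviour described in these bullets can be sketched with a naive loop-based convolution (stride 1, no padding); all names and sizes here are illustrative:

```python
import numpy as np

def conv2d(inputs, filters):
    """Naive 2D convolution. inputs: (H, W, C_in); filters: (K, K, C_in, C_out)."""
    H, W, C_in = inputs.shape
    K, _, _, C_out = filters.shape
    out = np.zeros((H - K + 1, W - K + 1, C_out))
    for f in range(C_out):                 # each filter yields one output channel
        for i in range(H - K + 1):
            for j in range(W - K + 1):
                # dot product over the receptive field, summed across channels
                out[i, j, f] = np.sum(inputs[i:i+K, j:j+K, :] * filters[:, :, :, f])
    return out

x = np.random.rand(8, 8, 3)        # 3-channel input
w = np.random.rand(3, 3, 3, 5)     # 5 filters, each matching the 3 input channels
y = conv2d(x, w)
print(y.shape)  # (6, 6, 5): output depth equals the number of filters
```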

Page 14:

Page 15:
Page 16:

Convolution Layer

Page 17:

Activation Layer

● Used to increase the non-linearity of the network without affecting the receptive fields of the conv layers

● ReLU is preferred, as it results in faster training

● Leaky ReLU addresses the "dying ReLU" problem (zero gradient for negative inputs)

Other types: Leaky ReLU, Randomized Leaky ReLU, Parameterized ReLU, Exponential Linear Units (ELU), Scaled Exponential Linear Units, tanh, hardtanh, softtanh, softsign, softmax, softplus, ...
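A few of the activations listed above can be sketched in a couple of lines (the alpha values are the usual defaults, chosen here for illustration):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # small negative slope keeps a non-zero gradient for negative inputs
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))        # [0.0, 0.0, 3.0]
print(leaky_relu(x))  # [-0.02, 0.0, 3.0]
```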

Page 18:

Softmax

● A special kind of activation layer, usually applied to the final FC layer's outputs

● Can be viewed as a fancy normalizer (a.k.a. Normalized exponential function)

● Produces a discrete probability distribution vector

● Very convenient when combined with cross-entropy loss

Given a sample input vector x and weight vectors {w_j}, the predicted probability of y = j is P(y = j | x) = exp(xᵀw_j) / Σ_k exp(xᵀw_k).
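A minimal softmax sketch over a vector of class scores (the scores here are made up for illustration):

```python
import numpy as np

def softmax(scores):
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs.sum())  # 1.0 -- a valid discrete probability distribution
```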

Page 19:

Pooling Layer

● Convolutional layers provide activation maps.

● Pooling layer applies non-linear downsampling on activation maps.

● Pooling is aggressive (it discards information); the trend is to use smaller filter sizes and abandon pooling
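The non-linear downsampling can be sketched as 2×2 max pooling with stride 2 on a single activation map (the input values are illustrative):

```python
import numpy as np

def max_pool(a, size=2):
    """Max pooling with stride == window size on a 2D activation map."""
    H, W = a.shape
    out = np.zeros((H // size, W // size))
    for i in range(0, H - size + 1, size):
        for j in range(0, W - size + 1, size):
            out[i // size, j // size] = a[i:i+size, j:j+size].max()
    return out

a = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 1., 2., 3.],
              [4., 5., 6., 7.]])
print(max_pool(a))  # [[4. 8.] [9. 7.]] -- each 2x2 block reduced to its max
```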

Page 20:

Pooling Layer

Page 21:

FC Layer

● A regular neural network

● Can be viewed as the final learning phase, which maps extracted visual features to desired outputs

● Usually well suited to classification/encoding tasks

● The common output is a vector, which is then passed through softmax to represent the confidence of classification

● The outputs can also be used as a "bottleneck"

In the example above, the FC layer generates a number which is then passed through a sigmoid to represent grasp success probability.

Page 22:

Loss Layer

● L1, L2 loss

● Cross-entropy loss (works well for classification, e.g., image classification)

● Hinge loss

● Huber loss: more resilient to outliers, with smooth gradient

● Mean squared error (works well for regression tasks, e.g., behavioral cloning)

Binary case: L = −[y log(p) + (1 − y) log(1 − p)]

General case: L = −Σ_j y_j log(p_j)
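The binary and general cross-entropy cases can be sketched directly from the formulas (the clipping epsilon is a standard numerical guard, added here for illustration):

```python
import numpy as np

def binary_cross_entropy(y, p, eps=1e-12):
    """y in {0, 1}; p is the predicted probability of class 1."""
    p = np.clip(p, eps, 1 - eps)   # avoid log(0)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def cross_entropy(y_onehot, probs, eps=1e-12):
    """General case: one-hot label against a probability vector."""
    probs = np.clip(probs, eps, 1.0)
    return -np.sum(y_onehot * np.log(probs))

# A confident correct prediction gives a small loss:
loss = cross_entropy(np.array([0., 1., 0.]), np.array([0.05, 0.9, 0.05]))
print(loss)  # == -log(0.9), about 0.105
```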

Page 23:

Regularization

● L1 / L2

● Dropout

● Batch norm

● Gradient clipping

● Max norm constraint

Used to prevent overfitting, along with a huge amount of training data.

Page 24:

Dropout

● During training, randomly drop activations (each is kept with probability p)

● During testing, use all activations but scale them by p

● Effectively prevents overfitting by reducing the correlation between neurons
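The two bullets above can be sketched as a single function with a training flag (this follows the convention in the text, where p is the keep probability):

```python
import numpy as np

def dropout(activations, p, training):
    """Keep each activation with probability p during training;
    scale by p at test time so expected activations match."""
    if training:
        mask = np.random.rand(*activations.shape) < p  # keep with probability p
        return activations * mask
    return activations * p

np.random.seed(0)
a = np.ones(10)
print(dropout(a, p=0.5, training=True))   # roughly half the units zeroed
print(dropout(a, p=0.5, training=False))  # all units present, scaled by 0.5
```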

Page 25:

Batch Normalization

● Makes networks robust to bad initialization of weights

● Usually inserted right before activation layers

● Reduces covariate shift by normalizing and scaling inputs

● The scale and shift parameters are trainable, to avoid losing the stability of the network
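The forward pass of batch normalization can be sketched as follows; gamma (scale) and beta (shift) are the trainable parameters mentioned above, and eps is the usual numerical guard:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)                  # per-feature statistics over the batch
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

np.random.seed(1)
x = np.random.randn(32, 4) * 10 + 5        # badly scaled inputs
out = batch_norm(x, gamma=1.0, beta=0.0)
print(out.mean(axis=0))  # ~0 for every feature
print(out.std(axis=0))   # ~1 for every feature
```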

Page 26:

Example: ResNet

● Residual Network, by Kaiming He et al. (2015)

● Heavy usage of “skip connections” which are similar to RNN Gated Recurrent Units (GRU)

● Commonly used as a visual feature extractor in all kinds of learning tasks: ResNet50, ResNet101, ResNet152

● 3.57% top-5 error on ImageNet, surpassing human-level performance
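The skip connection at the heart of ResNet can be sketched in a few lines; F(x) here is a toy linear map for illustration, whereas real residual blocks use convolutions and batch norm:

```python
import numpy as np

def residual_block(x, W):
    """y = F(x) + x: the block learns only the residual F(x)."""
    fx = np.maximum(0.0, x @ W)   # F(x): a toy transformation with ReLU
    return fx + x                 # skip connection adds the input back

np.random.seed(2)
x = np.random.randn(2, 4)
W = np.zeros((4, 4))              # if F(x) = 0, the block is the identity
print(residual_block(x, W) - x)   # all zeros: the skip path passes x through
```

This identity-by-default behaviour is why very deep residual networks remain trainable: a block that learns nothing still propagates the signal (and its gradient) unchanged.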

Page 27:

Applications

A CNN can be viewed as a fancy feature extractor, just like SIFT, SURF, etc.

Page 28:

CNN & Vision: RedEye

https://roblkw.com/papers/likamwa2016redeye-isca.pdf

Two keys: Add noise, save bandwidth/energy

Page 29:

CNN & Robotics: RL Example

Usually used with a Multi-Layer Perceptron (MLP, which can be viewed as a fancy term for non-trivial neural networks) for policy networks.

Page 30:

Software 2.0

Quotes from Andrej Karpathy:

Software 1.0 is what we’re all familiar with — it is written in languages such as Python, C++, etc. It consists of explicit instructions to the computer written by a programmer. By writing each line of code, the programmer is identifying a specific point in program space with some desirable behavior.

Software 2.0 is written in neural network weights. No human is involved in writing this code because there are a lot of weights (typical networks might have millions). Instead, we specify some constraints on the behavior of a desirable program (e.g., a dataset of input output pairs of examples) and use the computational resources at our disposal to search the program space for a program that satisfies the constraints.

Page 31:

Is CNN the Answer?

● Capsule networks?

● Is end-to-end not the right way?