introduction to deep learninglauras/test/docs/school/ia/2017-2018/...differentiable function:...
TRANSCRIPT
![Page 1: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/1.jpg)
INTRODUCTION TO DEEP LEARNING
![Page 2: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/2.jpg)
CONTENTS
![Page 3: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/3.jpg)
Introduction to deep learning
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.3
Contents
1. Examples
2. Machine learning
3. Neural networks
4. Deep learning
5. Convolutional neural networks
6. Conclusion
7. Additional resources
![Page 4: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/4.jpg)
LET’S START WITH SOME EXAMPLES
![Page 5: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/5.jpg)
Introduction
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.5
Object detection
https://www.youtube.com/watch?v=VOC3huqHrss
![Page 6: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/6.jpg)
Introduction
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.6
Image segmentation
https://www.youtube.com/watch?v=1HJSMR6LW2g
![Page 7: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/7.jpg)
Introduction
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.7
Image colorization
https://www.youtube.com/watch?v=ys5nMO4Q0iY
![Page 8: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/8.jpg)
Introduction
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.8
Mario
https://www.youtube.com/watch?v=L4KBBAwF_bE
![Page 9: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/9.jpg)
MACHINE LEARNING
![Page 10: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/10.jpg)
Machine learning
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.10
What is machine learning?
To learn = algorithmically find the choice of parameters that best explain the data.
- Object detection -- Linear regression -
![Page 11: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/11.jpg)
Machine learning
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.11
Uses machine learning in industry
• Object detection
• Image segmentation
• Image classification
• Speech recognition
• Language understanding and translation
• Spam filters and fraud detection
• Automatic email labelling and sorting
• Personalized search results and recommendations
• Automatic image captioning
• Online advertising
• Medical diagnosis
![Page 12: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/12.jpg)
Machine learning
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.12
Deep learning – what is it?
• A particular subset of ML algorithms a.k.a. “enhanced neural network”
• The closest to an ideal learning agent
![Page 13: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/13.jpg)
NEURAL NETWORKS
![Page 14: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/14.jpg)
Introduction to neural networks
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.14
Biological motivation and connections
Intuition : as humans we use our brains to learn the characteristics of different objects and
phenomena.
Brain neurons: receive input signals from dendrites and produce output signals along axon.
Image taken from http://cs231n.github.io/neural-networks-1
””
![Page 15: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/15.jpg)
Introduction to neural networks
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.15
Biological motivation and connections
Computational model : signals interact multiplicatively with dendrites of the next neuron (linear
combination of the input signals).
The weights (synaptic strengths) and bias (threshold) are learnable. They control the influence of
one neuron over another.
Image taken from http://cs231n.github.io/neural-networks-1
The weights 𝑤𝑖 and bias 𝑏 are
parameters learned through training.
![Page 16: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/16.jpg)
𝑤3
Introduction to neural network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.16
Perceptron
• Element (neuron) that takes decisions based on evidences
• Takes several binary inputs and produces a single binary output
𝑥1
𝑥2
𝑥3
𝑤1
𝑤2
t output
𝑥1, 𝑥2, 𝑥3 ∈ {0,1}
![Page 17: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/17.jpg)
Introduction to neural network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.17
Perceptron: example
You are trying to decide whether to go
to a concert or not.
You might make your decision by weighing up
three factors:
1. Is the weather good?2. Do you have enough time to attend the concert?3. Does your boyfriend or girlfriend want to accompany you?
![Page 18: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/18.jpg)
Perceptron: example
Introduction to neural networks
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.18
weatℎ𝑒𝑟(x1)
time(x2)
company(x3)
output
6
2
2
I only go if the weather is good
5
𝑤1
𝑤2
𝑤3
t
output = 1, 𝑖=1
3 𝑤𝑖 ∗ 𝑥𝑖 > 5
0, 𝑖=13 𝑤𝑖 ∗ 𝑥𝑖 ≤ 5
• Output is 1 only if weather’s input is 1
• T=5 𝑤1 > 5
𝑤2 + 𝑤3 ≤ 5
![Page 19: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/19.jpg)
Perceptron: example
Introduction to neural networks
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.19
I go if the weather is good and I have enough time!
output = 1, 𝑖=1
3 𝑤𝑖 ∗ 𝑥𝑖 > 3
0, 𝑖=13 𝑤𝑖 ∗ 𝑥𝑖 ≤ 3
• Output is 1 only if weather’s input is 1
• T=3
𝑤1 + 𝑤2 > 3𝑤2 + 𝑤3 ≤ 3𝑤1 + 𝑤3 ≤ 3
weather(x1)
time(x2)
company(x3)
output
6
2
2
5
𝑤1
𝑤2
𝑤3
t
![Page 20: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/20.jpg)
Perceptron vs Artificial neuronIntroduction to neural networks
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.20
𝑥1
𝑥2
𝑥3
output
𝑤1
𝑤2
𝑤3t
𝑥1
𝑥2
𝑥3
𝑤1
𝑤2
𝑤3
f(z) output
b
activation function
𝑧 =
𝑖=1
𝑛
𝑤𝑖 ∗ 𝑥𝑖 + 𝑏
𝑜𝑢𝑡𝑝𝑢𝑡 = 𝑓(𝑧)
![Page 21: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/21.jpg)
ACTIVATION FUNCTIONS AND ARCHITECTURES
![Page 22: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/22.jpg)
Architectures and activation functions
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.22
Activation functions
• Properties:
Nonlinear function: makes possible to solve more complex problems -> artificial neural networks are
universal function approximators [Cybenko 1989, Hornik 1991]
Differentiable function: necessary for learning parameters
• The activation function is applied after computing the
linear value z of the neuron
𝑥1
𝑥2
𝑥3
output
𝑤1
𝑤2
𝑤3𝑧 =
𝑖=1
𝑛
𝑤𝑖 ∗ 𝑥𝑖 + 𝑏f(z)
𝑜𝑢𝑡𝑝𝑢𝑡 = 𝑓(𝑧)
![Page 23: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/23.jpg)
Activation functions
Sigmoid function Tanh function
Architectures and activation functions
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.23
σ(z) =1
1+ 𝑒−𝑧tan(z) =
𝑒𝑧−𝑒−𝑧
𝑒𝑧+ 𝑒−𝑧
Images taken from http://neuralnetworksanddeeplearning.com””
![Page 24: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/24.jpg)
Activation functions
ReLU function Leaky ReLU function
Architectures and activation functions
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.24
relu(z) = max(0, 𝑧) Lrelu(z) = max(0.01 ∗ 𝑧, 𝑧)Images taken from https://datascience.stackexchange.com/questions/5706/what-is-the-dying-relu-problem-in-neural-networks””
![Page 25: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/25.jpg)
Architectures and activation functions
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.25
Architectures
Images taken from http://cs231n.github.io/neural-networks-1””
+b
+b
+b
+b
+b
+b
+b
+b
![Page 26: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/26.jpg)
APPLICATIONS
![Page 27: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/27.jpg)
Regression vs Classification
Fit some real-valued function
𝑦𝑖 = 𝑓 𝑥𝑖 ,𝑊 , 𝑦𝑖 ∈ ℝ
Assign a label to an input vector
𝑦𝑖 = 𝑓 𝑥𝑖 ,𝑊 , 𝑦𝑖 ∈ {𝑐𝑎𝑡, 𝑏𝑖𝑘𝑒, 𝑑𝑜𝑔, ℎ𝑜𝑢𝑠𝑒, 𝑐𝑎𝑟}
Neural networks
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.27
0
0,2
0,4
0,6
0,8
Cat Bike Dog House Car
https://cdn.comsol.com/wordpress/2015/03/Experimental-data-and-fitted-function.png
![Page 28: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/28.jpg)
Applications
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.28
Applications of shallow neural networks
Handwritten digit/character
recognitionhttps://knowm.org/wp-content/uploads/Screen-Shot-2015-08-14-at-2.44.57-
PM.png
Stock market (time series) predictionhttp://milenia-finance.com/wp-content/uploads/6359633929809316592035809433_stock-market.jpg
Image compressionhttps://ai2-s2-public.s3.amazonaws.com/figures/2016-11-
08/1e50094bcaf81dac5ea44cea87fd84b25ceb9090/2-Figure3-1.png
![Page 29: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/29.jpg)
TRAINING A NEURAL NETWORK BACKPROPAGATION
![Page 30: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/30.jpg)
Training a neural network - Backpropagation
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.30
General aspects
• Common method for training a neural network
• Goal: optimize the weights so that the neural network can learn how to correctly map arbitrary inputs
to outputs (learn to generalize a problem)
• Steps:
Forward step
Compute the error
Backward step
Update parameters
𝑇𝑎𝑟𝑔𝑒𝑡
𝐶𝑜𝑠𝑡 𝑓𝑢𝑛𝑐𝑡𝑖𝑜𝑛
Image taken from “” https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example””
*example valid for supervised learning
![Page 31: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/31.jpg)
Training a neural network - Backpropagation
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.31
Cost function
• “How good" a neural network did w.r.t. it's given m training samples and the expected output
• The set of weights and biases have done a great job if C(w,b) ≈ 0
• Our aim is to minimize it such that 𝑦 xi, w, b becomes identical to 𝑦 𝑥𝑖
How ?
The Squared Error:
•𝑥𝑖: network input set
•𝑦(𝑥𝑖): labeled output set (expected, true outputs)
• 𝑦(𝑥𝑖 , 𝑤, 𝑏): network outputs
𝐶 𝑤, 𝑏 =1
2
𝑖=1
𝑚
𝑦 𝑥𝑖 − 𝑦 𝑥𝑖 , 𝑤, 𝑏2
![Page 32: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/32.jpg)
Training a neural network - Backpropagation
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.32
Gradient descent
• Optimization algorithm used for finding the minimum of a cost function
• Cost function depends on weights and biases
• Gradient finds how much a weight or bias
causes the cost function’s value
• Update weights and biases to minimize the cost
function
• Learning rate:
used for weights and bias updates
small, positive parameter
fixed or dynamic
Images taken from http://neuralnetworksanddeeplearning.com””
![Page 33: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/33.jpg)
Gradient descent
learning rate
Training a neural network - Backpropagation
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.33
Images taken from http://neuralnetworksanddeeplearning.com””
Weight and bias update after a
training sample:
![Page 34: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/34.jpg)
DEEP LEARNING
![Page 35: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/35.jpg)
NAÏVE DEEP LEARNING
![Page 36: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/36.jpg)
Naïve deep learning
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.36
Applications
Exceptional effective at learning patterns
Solves complex problems
Applications:
Speech recognition Computer vision Natural language processing
![Page 37: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/37.jpg)
Naïve deep learning
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.37
From shallow to deep
Shallow Neural Network Deep Neural NetworkSource: http://neuralnetworksanddeeplearning.com
![Page 38: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/38.jpg)
Naïve deep learning
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.38
Deep neural networks challenges
Inputs are vectors => spatial relationships are not preserved (input scrambling)
The number of parameters increases exponentially with the number of layers
Huge number of parameters would quickly lead to overfitting
Networks with many layers have an unstable gradient problem
![Page 39: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/39.jpg)
DEEP LEARNING
![Page 40: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/40.jpg)
Deep learning
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.40
Neural networks as computational graphs
![Page 41: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/41.jpg)
CONVOLUTIONAL NEURAL NETWORK
![Page 42: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/42.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.42
Layers of Convolutional Neural Network
1. Input layer
2. Convolutional layer
3. Subsampling layer
4. Fully connected layer
Source: https://en.wikipedia.org/wiki/Convolutional_neural_network#/media/File:Typical_cnn.png
![Page 43: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/43.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.43
Input layer: What is an image?
Binary image:
A matrix of pixel values – each pixel is
either 0 or 1
Grayscale image:
A matrix of pixel values – each pixel is a
natural number between 0 and 255
Pixel value – intensity of light
RGB image:
3 matrices of pixel values – each pixel is a
natural number between 0 and 255
Pixel value – intensity of the color (red,
green or blue)
RGB image
BinaryGrayscale
Source: https://medium.com/@ageitgey/machine-learning-is-fun-part-
3-deep-learning-and-convolutional-neural-networks-f40359318721
![Page 44: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/44.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.44
The purpose of convolution is to extract
features from the input:
1. Gets a 3D matrix as an input
e.g.: an RGB image with depth 3
2. “Convolves” multiple kernels on the
input 3D matrix
3. Creates the output 3D matrix:
the feature maps – also called
activation maps
Convolutional layer
Source: http://cs.nyu.edu/~fergus/tutorials/deep_learning_cvpr12/
![Page 45: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/45.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.45
Convolution on a single matrix
Input:
W1 x H1 matrix (e.g. binary image)
Kernel (filter): W2 x H2 matrix
Input matrix KernelInput image
Convolution operation:
Slide the kernel over the W1 x H1 matrix
At each position calculate element wise
multiplication
Calculate the sum of multiplications
Output:
W3 x H3 matrix –> feature map (activation map)
Source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
![Page 46: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/46.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.46
Another view of the convolution
operation
W1 x H1 matrix W2 x H2 matrix W3 x H3 matrix
(feature map)
Convolution on a single matrix
http://intellabs.github.io/RiverTrail/tutorial/
![Page 47: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/47.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.47
What is a kernel?
It is a learnable filter
The values (weights) of the kernel will be
learned during the training process
The trained filter will activate on the image
(during the forward pass) when it sees some
type of visual feature on the image (e.g.:
edges)
The size of the kernel is a hypermarameter
of the convolutional layer
Typical kernel sizes: 3x3, 5x5
Random kernel Trained kernel
![Page 48: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/48.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.48
Convolution on a 3D matrix
The input is an W1 x H1 x D1 3D matrix
‒ W1 x H1 is the width and height of the 3D matrix
‒ D1 is the depth of the 3D matrix
‒ E.g.: an RGB image (with 3 channels)
To convolve a kernel on the 3D matrix, the depth
of the kernel should be the same: W2 x H2 x D1
The convolution operation is still the same:
‒ Slide (convolve) the kernel across the width and
height of the input 3D matrix
‒ At each position calculate element wise multiplication
‒ Calculate the sum of multiplications
‒ (It produces 1 value at each position)
The output of the convolution is an W3 x H3 x 1
feature map
Kern
elh
eig
ht
Kernel width
Source: https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/convolutional_neural_networks.html
![Page 49: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/49.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.49
Multiple kernels
A convolution layer usually contains
multiple kernels
Each kernel produces a separate 2
dimensional feature map
Stacking these feature maps, one can
get the output volume of the
convolution layer
The number of kernels (the depth of
the output volume) is defined by the
depth hyperparameter
Input 3D matrix Output - 3D matrix6 different feature maps
Source: https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/convolutional_neural_networks.html
Convolution with 6 different kernels:
![Page 50: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/50.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.52
What are the filters learning?
1st layer: edges
2nd layer: corners, local
textures
3rd layer: simple shapes
…
nth layer: complex shapes,
objects
Source: Zeiler & Fergus, 2014
![Page 51: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/51.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.54
Convolutional Layer - Summary
Convolution:
Input: W1 x H1 x D1 3D matrix
Output: W3 x H3 x D3 3D matrix
Hyperparameters of the convolutional layer:
Size of the kernel: W2 x H2 (the depth of the kernel is equal with the input depth)
Number of kernels (depth of the output): D3
Stride: S
Padding: P
Kern
elh
eig
ht
Kernel width
![Page 52: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/52.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.55
Subsampling (Pooling) Layer
Perform a downsampling operation along the
spatial dimensions – width, height
It reduces the spatial size of the representation
It operates independently on every depth slice
of the representation
Most common subsampling operation:
max pooling
Source: http://cs231n.github.io/convolutional-networks/
![Page 53: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/53.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.56
Example: max pooling
Hyperparameters:
Size of stride
Size of the filter
Source: http://cs231n.github.io/convolutional-networks/
![Page 54: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/54.jpg)
Fully Connected LayerConvolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.57
Source: http://cs231n.github.io/convolutional-networks/
It is a neural network (similar as in the
previous course)
The input of the layer are the features
extracted by the convolution and
subsampling layers
The output of the layer is the output of
the full CNN
(for example class probabilities)
![Page 55: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/55.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.58
Information flow in a CNN
Source: http://cs231n.github.io/convolutional-networks/
![Page 56: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/56.jpg)
Convolutional Neural Network
RBRO/ENG2 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.59
History and state of the art
AlexNet
- Initial architecture that
had good
performance (2012)
- 7 layers
VGGNet
- Better performance
using deeper network
with less parameters
(2014)
- 16 layers
GoogLeNet
- Better performance
using processing
done in parallel on
same input (2014)
- Over 100 layers
ResNet (Microsoft)
- Better performance
using residual
information (2015)
- 152 layers
- Models can be trained to perform many different tasks, with few modifications
Sources: http://cv-tricks.com/cnn/understand-resnet-alexnet-vgg-inception/
https://www.saagie.com/blog/object-detection-part1
http://file.scirp.org/Html/4-7800353_65406.htm
![Page 57: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/57.jpg)
Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.60
Case study: AlexNet
ImageNet Large Scale Visual
Recognition Challenge winner in 2012
Task: images classification (each image
is associated with a class)
Dataset:
ImageNet 2012
Training set: 1.2 million images containing 1000
categories (classes)
Testing set: 200.000 images
Training details:
90 full training cycle on the training set
The training took 6 days on two GeForce 580
AlexNet architecture
Train images Results on test imagesSources: http://cv-tricks.com/cnn/understand-resnet-alexnet-vgg-inception/
https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
![Page 58: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/58.jpg)
MOTIVATION OF CONVOLUTIONAL NEURAL
NETWORK
![Page 59: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/59.jpg)
Example:
224x224x3 input image: 150528 long input
=> 150528 weights per neuron
150528*3 (451584) parameters (for 3
neurons)
Motivation of Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.62
Parameter sharing
Example:
224x224x3 input image
11x11x3 size of a filter
3 filters
11*11*3*3 (1089) parameters
(for 3 filters)Source: https://noppa.oulu.fi/noppa/kurssi/521010j/luennot/521010J_convolutional_neural_network.pdf
![Page 60: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/60.jpg)
Motivation of Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.63
Local connectivity and spatial invariance
Share the same
parameters across
different locations
Apply and learn multiple
filters
Source: https://noppa.oulu.fi/noppa/kurssi/521010j/luennot/521010J_convolutional_neural_network.pdf
![Page 61: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/61.jpg)
Motivation of Convolutional Neural Network
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.64
Benefits
The same small set of weights (parameters) is applied on an entire image
Better training. Better generalization
Preserves spatial relationships in the receptive field
Spatial invariance
Local
connectivity
Parameter
sharing
![Page 62: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/62.jpg)
CONCLUSION
![Page 63: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/63.jpg)
Conclusion
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.66
you have access to quite large amounts of
data
the problem is reasonably complex, in a high
dimensional space
no hard constraints or hard logic (smooth,
differentiable)
It’s a general framework
Efficient graph structure
Well performing given the right
circumstances
And some tips
Strengths of deep learning: You should consider deep learning if:
![Page 64: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/64.jpg)
RESOURCES
![Page 65: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/65.jpg)
Libraries and networks
• C++ fans for easy prototyping:
• OpenCV – both Neural Networks and Adaboost
• FANN – Fast Artificial Neural Network Library
• OpenNN – Open Neural Network Library
• Nnabla – Neural network libraries by Sony
• Python:
• Tensorflow
• Keras
• Theano
• Caffe(also for C++)
• Torch7
• PyTorch
• DeepLearning4J
• MXNet
• Deepy
• Lasagne
• Nolearn
• NeuPy
Resources
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.76
![Page 66: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/66.jpg)
Learning resourcesResources
RBRO/ESA1 | 30/08/2017
© Robert Bosch GmbH 2017. All rights reserved, also regarding any disposal, exploitation, reproduction, editing, distribution, as well as in the event of applications for industrial property rights.77
Stanford CS231n course (Karpathy et. al.)
Deep learning book (Goodfellow)
various introductory YouTube videos
For beginners:
www.reddit.com/r/LearnMachineLearning
After you grasp the concepts:
www.reddit.com/r/MachineLearning
![Page 67: INTRODUCTION TO DEEP LEARNINGlauras/test/docs/school/IA/2017-2018/...Differentiable function: necessary for learning parameters • The activation function is applied after computing](https://reader030.vdocuments.us/reader030/viewer/2022041015/5ec64edcae97b357ee5ad76e/html5/thumbnails/67.jpg)
THANK YOU
QUESTIONS?