![Page 1: ELEC 576: Neural Networks & Backpropagation, Lecture 3. Ankit B. Patel, Baylor College of Medicine (Neuroscience Dept.)](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f0726e07e708231d41b905c/html5/thumbnails/1.jpg)
ELEC 576: Neural Networks & Backpropagation
Lecture 3
Ankit B. Patel
Baylor College of Medicine (Neuroscience Dept.), Rice University (ECE Dept.)
Outline
• Neural Networks
  • Definition of NN and terminology
  • Review of (Old) Theoretical Results about NNs
    • Intuition for why compositions of nonlinear functions are more expressive
    • Expressive power theorems [McC-Pitts, Rosenblatt, Cybenko]
• Backpropagation algorithm (Gradient Descent + Chain Rule)
  • History of backprop summary
  • Gradient descent (Review)
  • Chain Rule (Review)
  • Backprop
• Intro to Convnets
  • Convolutional Layer, ReLU, Max-Pooling
Neural Networks
Neural Network: Definitions
http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
Input Units
Output Units
Net Input
(Output) Activation
Neural Networks: Activation Functions
http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
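The figure for this slide is not preserved in the transcript. As a minimal sketch, assuming the slide covers the usual trio of sigmoid, tanh and ReLU (that choice is an assumption), the functions and their derivatives could look like this:

```python
import math

def sigmoid(z: float) -> float:
    """Logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z: float) -> float:
    """Derivative of the sigmoid: s(z) * (1 - s(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

def tanh_prime(z: float) -> float:
    """Derivative of tanh: 1 - tanh(z)^2."""
    return 1.0 - math.tanh(z) ** 2

def relu(z: float) -> float:
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)

def relu_prime(z: float) -> float:
    # Convention (an assumption): the derivative at z = 0 is taken to be 0.
    return 1.0 if z > 0 else 0.0

for z in (-2.0, 0.0, 2.0):
    print(z, sigmoid(z), math.tanh(z), relu(z))
```

The derivatives are included because backpropagation, later in the lecture, needs f'(z) at every unit.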
Neural Networks: Definitions
http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
Feedforward Propagation: Scalar Form
Input Units
Output Units
Net Input
(Output) Activation
Hidden Units
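The slide's equations are not preserved here. A minimal sketch of scalar-form feedforward propagation, assuming each unit computes a net input z_i = sum_j W[i][j] * a_j + b_i followed by an activation a_i = f(z_i) with a sigmoid f; the 2-3-1 network and its weights are made up for illustration:

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(W, b, a_prev):
    """Scalar form: z_i = sum_j W[i][j] * a_prev[j] + b[i], then a_i = f(z_i)."""
    z = []
    for i in range(len(W)):
        total = b[i]
        for j in range(len(a_prev)):
            total += W[i][j] * a_prev[j]
        z.append(total)
    a = [sigmoid(z_i) for z_i in z]
    return z, a

# Hypothetical 2-3-1 network with made-up weights.
W1 = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.3, -0.1, 0.2]]
b2 = [0.05]

x = [1.0, 2.0]                        # input units
z2, a2 = layer_forward(W1, b1, x)     # hidden units: net input and activation
z3, a3 = layer_forward(W2, b2, a2)    # output unit
print("hidden net inputs:", z2)
print("output activation:", a3)
```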
Neural Networks: Definitions
http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
Feedforward Propagation: Vector Form
Input Units
Output Units
Net Input
(Output) Activation
Hidden Units
Neural Networks: Definitions
http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
Deep Feedforward Propagation: Vector Form
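In vector form the same computation becomes one matrix-vector product per layer, and going deep is a loop over layers. A sketch under the assumed recursion z^(l+1) = W^(l) a^(l) + b^(l), a^(l+1) = f(z^(l+1)); the helper names and the tanh activation are illustrative choices, not taken from the slide:

```python
import math
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]

def matvec(W: Matrix, v: Vector) -> Vector:
    """Plain-Python matrix-vector product."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def forward(weights: List[Matrix], biases: List[Vector], x: Vector,
            f: Callable[[float], float] = math.tanh) -> List[Vector]:
    """Return the activations of every layer, starting with the input itself."""
    a = x
    activations = [a]
    for W, b in zip(weights, biases):
        z = [zi + bi for zi, bi in zip(matvec(W, a), b)]   # net input
        a = [f(zi) for zi in z]                            # activation
        activations.append(a)
    return activations

# Hypothetical 2-2-1 network.
weights = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
biases = [[0.0, 0.0], [0.0]]
acts = forward(weights, biases, [0.5, -0.5])
print(acts)
```

Adding depth only means appending more (W, b) pairs to the lists.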
Neural Networks: Definitions
http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
The Training Objective
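The slide's formula is not preserved in this transcript. As an assumption, a common form of the objective in the style of the tutorial linked above, for m training pairs (x^(i), y^(i)), network output h_{W,b}(x), and a weight-decay strength λ, is squared error plus an L2 penalty on the weights:

```latex
J(W, b) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left\| h_{W,b}\big(x^{(i)}\big) - y^{(i)} \right\|^2
          + \frac{\lambda}{2} \sum_{l} \sum_{i,j} \left( W^{(l)}_{ij} \right)^2
```

The first term measures fit to the data; the second discourages large weights. Other losses (e.g. cross-entropy) fit the same template.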
Expressive Power Theorems
Compositions of Nonlinear Functions are more expressive
[Yoshua Bengio]
McCulloch-Pitts Neurons
Expressive Power of McCulloch-Pitts Nets
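The slide content is not preserved. One standard way to see the expressive power of threshold units is that they implement AND, OR and NOT, and so, composed in layers, any Boolean function. The sketch below uses a weighted-threshold simplification of the McCulloch-Pitts unit (negative weights stand in for inhibitory inputs; that simplification is an assumption):

```python
from itertools import product

def mp_neuron(inputs, weights, threshold):
    """Threshold unit: fires (1) iff the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def AND(a, b):
    return mp_neuron([a, b], [1, 1], 2)

def OR(a, b):
    return mp_neuron([a, b], [1, 1], 1)

def NOT(a):
    return mp_neuron([a], [-1], 0)

def XOR(a, b):
    # Needs more than one layer of units: (a OR b) AND NOT (a AND b).
    return AND(OR(a, b), NOT(AND(a, b)))

for a, b in product((0, 1), repeat=2):
    print(a, b, "AND:", AND(a, b), "OR:", OR(a, b), "XOR:", XOR(a, b))
```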
The Perceptron (Rosenblatt)
Limitations of Perceptron
• Rosenblatt was overly enthusiastic about the perceptron and made the ill-timed proclamation that:
• "Given an elementary α-perceptron, a stimulus world W, and any classification C(W) for which a solution exists; let all stimuli in W occur in any sequence, provided that each stimulus must reoccur in finite time; then beginning from an arbitrary initial state, an error correction procedure will always yield a solution to C(W) in finite time…" [4]
• In 1969, Marvin Minsky and Seymour Papert showed that the perceptron can only represent linearly separable functions. Of particular interest was the fact that the perceptron cannot represent the XOR and XNOR functions.
• The problem outlined by Minsky and Papert can be solved by deep NNs. However, many of the artificial neural networks in use today still stem from the early advances of the McCulloch-Pitts neuron and the Rosenblatt perceptron.
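Both halves of the story above can be checked on a toy problem. A minimal sketch of the error-correction rule for a single threshold unit, run on AND (linearly separable) and XOR (not separable); the learning rate and epoch budget are arbitrary choices:

```python
def train_perceptron(data, epochs=100, lr=1.0):
    """Error-correction rule. Returns (weights, bias, converged)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in data:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - pred
            if delta != 0:
                # Move the decision boundary toward the misclassified point.
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:          # a full pass with no mistakes: solution found
            return w, b, True
    return w, b, False

AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print("AND converged:", train_perceptron(AND_DATA)[2])
print("XOR converged:", train_perceptron(XOR_DATA)[2])
```

For AND a solution exists, so the procedure finds one in finite time, as the quoted statement says. For XOR no separating line exists, so every pass makes at least one mistake regardless of the epoch budget.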
Universal Approximation Theorem [Cybenko 1989, Hornik 1991]
• https://en.wikipedia.org/wiki/Universal_approximation_theorem
• The theorem states that shallow neural networks can represent a wide variety of interesting functions when given appropriate parameters; however, it does not address the algorithmic learnability of those parameters.
• Proved by George Cybenko in 1989 for sigmoid activation functions.[2]
• Kurt Hornik showed in 1991[3] that it is not the specific choice of the activation function, but rather the multilayer feedforward architecture itself which gives neural networks the potential of being universal approximators.
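One common intuition (a sketch, not Cybenko's proof): the difference of two steep sigmoids is a localized "bump", and a weighted sum of bumps can trace out a continuous function on an interval. The target function, the number of bumps and the steepness below are all illustrative assumptions:

```python
import math

def sigmoid(z: float) -> float:
    # Numerically stable form, so very steep units do not overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def bump(x, left, right, steepness):
    """Two sigmoid hidden units: roughly 1 on (left, right), roughly 0 elsewhere."""
    return sigmoid(steepness * (x - left)) - sigmoid(steepness * (x - right))

def shallow_approx(target, x, n_bumps=50, steepness=2000.0):
    """One hidden layer with 2 * n_bumps sigmoid units on the interval [0, 1]."""
    width = 1.0 / n_bumps
    total = 0.0
    for k in range(n_bumps):
        left = k * width
        height = target(left + width / 2)       # output weight for this bump
        total += height * bump(x, left, left + width, steepness)
    return total

target = lambda x: math.sin(2 * math.pi * x)     # hypothetical target function
xs = [0.05 + 0.009 * i for i in range(101)]      # grid inside (0, 1)
max_err = max(abs(shallow_approx(target, x) - target(x)) for x in xs)
print("max error on grid:", max_err)
```

More (narrower) bumps reduce the error, at the cost of more hidden units. Note that the weights here are set by hand, which matches the caveat above: the theorem is about representation, not learning.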
Question (5 min): Why is the theorem true? What is the intuition?
What happens when you go deep? Try iterating f(x) = x^2 vs. f(x) = ax + b
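A quick way to try the iteration suggested above, with hypothetical values a = 2, b = 1 and a depth of 5:

```python
def compose(f, depth):
    """Return the function x -> f(f(...f(x)...)) applied `depth` times."""
    def g(x):
        for _ in range(depth):
            x = f(x)
        return x
    return g

affine = lambda x: 2.0 * x + 1.0    # hypothetical a = 2, b = 1
square = lambda x: x * x

deep_affine = compose(affine, 5)
deep_square = compose(square, 5)

# Five affine layers collapse into a single affine map: slope a**5 plus an intercept.
intercept = deep_affine(0.0)
slope = deep_affine(1.0) - deep_affine(0.0)
print(slope, intercept)             # 32.0 31.0

# Five squaring layers give x ** (2 ** 5): the polynomial degree doubles per layer.
print(deep_square(2.0))             # 2 ** 32 = 4294967296.0
```

Composing affine maps never leaves the family of affine maps, while composing a nonlinearity grows the complexity of the function exponentially with depth.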
Training Neural Networks Via Gradient Descent
Gradient Descent
[Fei-Fei Li, Andrej Karpathy, Justin Johnson]
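The slide figures are not preserved. As a reminder of the update rule, theta <- theta - lr * grad J(theta), here is a minimal sketch on a made-up quadratic objective:

```python
def gradient_descent(grad, theta0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from theta0."""
    theta = list(theta0)
    for _ in range(steps):
        g = grad(theta)
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta

# Hypothetical objective J(theta) = (theta_0 - 3)^2 + 2 * (theta_1 + 1)^2
def grad_J(theta):
    return [2.0 * (theta[0] - 3.0), 4.0 * (theta[1] + 1.0)]

theta_star = gradient_descent(grad_J, [0.0, 0.0])
print(theta_star)   # close to the minimizer [3.0, -1.0]
```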
Question: What kinds of problems might you run into with Gradient Descent? (4 min)
A Global Optimum Is Not Guaranteed
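A small illustration on a made-up non-convex function: the same procedure lands in different minima depending on where it starts, and only one of them is the global minimum.

```python
def f(x):
    return x ** 4 - 3 * x ** 2 + x       # hypothetical non-convex objective

def f_prime(x):
    return 4 * x ** 3 - 6 * x + 1

def descend(x, lr=0.01, steps=2000):
    for _ in range(steps):
        x -= lr * f_prime(x)
    return x

x_left = descend(-2.0)     # ends near x = -1.30, the global minimum
x_right = descend(2.0)     # ends near x = 1.13, a worse local minimum
print(x_left, f(x_left))
print(x_right, f(x_right))
```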
Learning Rate Needs to Be Carefully Chosen
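On J(theta) = theta^2 each step multiplies theta by (1 - 2 * lr), which makes the effect of the learning rate easy to see. The three rates below are arbitrary examples:

```python
def run(lr, theta=1.0, steps=20):
    """Gradient descent on J(theta) = theta**2, whose gradient is 2 * theta."""
    for _ in range(steps):
        theta -= lr * 2 * theta
    return theta

for lr in (0.01, 0.4, 1.1):
    print(lr, run(lr))
# lr = 0.01: too small, theta has barely moved after 20 steps
# lr = 0.4:  converges quickly toward 0
# lr = 1.1:  too large, |1 - 2 * lr| > 1, so the iterates blow up
```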
Training Neural Networks: Computing Gradients Efficiently with the Backpropagation Algorithm
Chain Rule
Exercise: Apply the chain rule to a nested function (2 min)
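The function used in the exercise is not preserved. As a stand-in, here is a worked sketch for the hypothetical nested function y = sin(exp(x^2)), with a numerical check of the result:

```python
import math

def f(x):
    return math.sin(math.exp(x ** 2))

def df_dx(x):
    u = x ** 2            # u = x^2,     du/dx = 2x
    v = math.exp(u)       # v = exp(u),  dv/du = exp(u) = v
    #                       y = sin(v),  dy/dv = cos(v)
    return math.cos(v) * v * (2 * x)     # dy/dx = dy/dv * dv/du * du/dx

x0 = 0.7
h = 1e-6
numeric = (f(x0 + h) - f(x0 - h)) / (2 * h)   # centered finite difference
print(df_dx(x0), numeric)
```

Each intermediate variable contributes one local derivative, and the chain rule multiplies them together; backpropagation organizes exactly this bookkeeping.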
Backpropagation is an efficient way to compute gradients
• a.k.a. Reverse Mode Automatic Differentiation (AD), based on a systematic application of the chain rule. It is fast for low-dimensional outputs. For one output (e.g. a scalar loss function), the time to compute gradients with respect to ALL inputs is proportional to the time to compute the output. An explicit mathematical expression for the output is not required, only an algorithm to compute it.
• It is NOT the same as symbolic differentiation (e.g. Mathematica).
• Numerical/finite differences are slow for high-dimensional inputs (e.g. model parameters) and outputs. For a single output, the time to compute gradients scales with the number of inputs. They may suffer from floating-point precision issues and require choosing a parameter increment.
https://www.cs.cmu.edu/~mgormley/courses/10601-s17/slides/lecture20-backprop.pdf
[Figure: centered finite difference, shown geometrically as a secant]
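To make the cost of finite differences concrete, the sketch below counts function evaluations for a centered-difference gradient. The objective and the increment h are illustrative choices:

```python
def fd_gradient(f, x, h=1e-5):
    """Centered differences: two evaluations of f per input dimension."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_plus[i] += h
        x_minus = list(x)
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

counter = {"calls": 0}

def J(theta):
    """Hypothetical scalar objective that records how often it is evaluated."""
    counter["calls"] += 1
    return sum(t ** 2 for t in theta)

theta = [1.0, -2.0, 0.5]
g = fd_gradient(J, theta)
print(g)                   # approximately [2.0, -4.0, 1.0]
print(counter["calls"])    # 6 evaluations: 2 per input
```

With millions of parameters this means millions of forward passes per gradient, which is why finite differences are mostly used to check a backprop implementation rather than to train.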
The Cheap Gradient Principle in Backpropagation: The time complexity scales with the number of operations performed in the forward pass
https://www.math.uni-bielefeld.de/documenta/vol-ismp/52_griewank-andreas-b.pdf
For a scalar function $f$ of a multidimensional input $x$:
$$\mathrm{OPS}\{\nabla f(x)\} \le \omega \, \mathrm{OPS}\{f(x)\}$$
• $\omega = 3$ for polynomial operations and OPS counting the number of multiplications
• $\omega \sim 5$
More generally, for an $m$-dimensional output $F(x)$:
$$\mathrm{OPS}\{F'(x)\} \le m \, \omega \, \mathrm{OPS}\{F(x)\}$$
There is no equivalent Cheap Jacobian Principle or Cheap Hessian Principle.
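A toy reverse-mode implementation makes the principle tangible: one forward pass records the graph, and one backward sweep over the same graph yields the derivative with respect to every input. This is a minimal sketch with made-up names (`Var`, `backward`), not how any particular library does it:

```python
import math

class Var:
    """A node in the computation graph: a value plus links to its inputs."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents    # tuple of (input Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def sin(v):
    return Var(math.sin(v.value), ((v, math.cos(v.value)),))

def backward(output):
    """One reverse sweep accumulates d(output)/d(node) for every node."""
    order, seen = [], set()
    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)        # inputs are listed before their consumers
    visit(output)
    output.grad = 1.0
    for node in reversed(order):      # consumers first, then their inputs
        for parent, local in node.parents:
            parent.grad += local * node.grad

x1, x2, x3 = Var(2.0), Var(3.0), Var(0.5)
f = x1 * x2 + sin(x3) * x1            # hypothetical scalar function
backward(f)
print(x1.grad, x2.grad, x3.grad)      # all three partials from a single sweep
```

The backward sweep does a constant amount of work per recorded operation, which is the source of the bound above.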
The Spatial Complexity of Backpropagation scales with the number of operations performed in the forward pass
https://www.math.uni-bielefeld.de/documenta/vol-ismp/52_griewank-andreas-b.pdf
$$\mathrm{MEM}\{F'(x)\} \sim \mathrm{OPS}\{F(x)\} \gtrsim \mathrm{MEM}\{F(x)\}$$
Temporal Complexity in Automatic Differentiation
https://arxiv.org/pdf/1502.05767.pdf https://www.math.uni-bielefeld.de/documenta/vol-ismp/52_griewank-andreas-b.pdf
$x$ is an $n$-dimensional input, $F(x)$ is an $m$-dimensional output, and $\omega < 6$.
Reverse Mode:
$$\mathrm{OPS}\{F'(x)\} \le m \, \omega \, \mathrm{OPS}\{F(x)\}$$
Forward Mode:
$$\mathrm{OPS}\{F'(x)\} \le n \, \omega \, \mathrm{OPS}\{F(x)\}$$
There is no Cheap Jacobian Principle or Cheap Hessian Principle, but a Jacobian-vector product can be computed as efficiently as the gradient, and a Hessian-vector product can be computed efficiently in $O(n)$ instead of $O(n \times n)$.
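The factor n for forward mode can be seen with dual numbers: each forward pass carries one tangent direction, so a full Jacobian needs one pass per input. A minimal sketch; the class `Dual` and the map `F` are illustrative, not from the slides:

```python
import math

class Dual:
    """value + eps * tangent: carries one directional derivative forward."""
    def __init__(self, value, tangent=0.0):
        self.value, self.tangent = value, tangent

    def __add__(self, other):
        return Dual(self.value + other.value, self.tangent + other.tangent)

    def __mul__(self, other):
        return Dual(self.value * other.value,
                    self.tangent * other.value + self.value * other.tangent)

def dsin(d):
    return Dual(math.sin(d.value), math.cos(d.value) * d.tangent)

def F(x1, x2):
    # Hypothetical map from n = 2 inputs to m = 3 outputs.
    return [x1 * x2, dsin(x1), x1 * x1 + x2]

def jacobian_forward(F, point):
    """One forward pass per input gives one column of the Jacobian."""
    columns = []
    for j in range(len(point)):
        args = [Dual(v, 1.0 if i == j else 0.0) for i, v in enumerate(point)]
        columns.append([out.tangent for out in F(*args)])
    # Transpose columns into rows: jac[i][j] = dF_i / dx_j
    return [list(row) for row in zip(*columns)]

jac = jacobian_forward(F, [2.0, 3.0])
for row in jac:
    print(row)
```

A single pass with an arbitrary tangent vector gives a Jacobian-vector product, which is the cheap quantity mentioned above.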
How to Learn NNs? History of the Backpropagation Algorithm (1960-86)
• Introduced by Henry J. Kelley (1960) and Arthur Bryson (1961) in control theory, using Dynamic Programming
• Simpler derivation using Chain Rule by Stuart Dreyfus (1962)
• General method for Automatic Differentiation by Seppo Linnainmaa (1970)
• Using backprop for parameters of controllers minimizing error by Stuart Dreyfus (1973)
• Backprop brought into NN world by Paul Werbos (1974)
• Used it to learn representations in hidden layers of NNs by Rumelhart, Hinton & Williams (1986)
Backpropagation Example (5 min)
Modified from https://www.cs.cmu.edu/~mgormley/courses/10601-s17/slides/lecture20-backprop.pdf
[Figure: forward pass (output calculation) and backward pass (gradient calculation), linked by the chain rule; quantities shown: $x_j, \theta_j, a, y$]
• The backward pass computes the derivative of the single output J with respect to all inputs efficiently.
• How efficient? The backward pass takes time proportional to that of the forward pass.
• The values of the derivatives are computed at each step. Backprop does not store their mathematical expressions, unlike symbolic differentiation.
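The figure's computation is not preserved. The symbols that survive (x_j, θ_j, a, y) suggest a logistic-regression style example, so the sketch below assumes a = Σ_j θ_j x_j, y = sigmoid(a), and a negative log-likelihood J with a target called `y_star` here; all of that is an assumption, not the slide's exact example:

```python
import math

def forward(x, theta, y_star):
    """Forward pass: compute and keep every intermediate quantity."""
    a = sum(t * xi for t, xi in zip(theta, x))          # a = sum_j theta_j * x_j
    y = 1.0 / (1.0 + math.exp(-a))                      # y = sigmoid(a)
    J = -(y_star * math.log(y) + (1 - y_star) * math.log(1 - y))
    return a, y, J

def backward(x, theta, y_star, a, y):
    """Backward pass: numeric derivative values, one chain-rule link per step."""
    dJ_dy = -(y_star / y - (1 - y_star) / (1 - y))
    dJ_da = dJ_dy * y * (1 - y)                         # dy/da = y * (1 - y)
    dJ_dtheta = [dJ_da * xi for xi in x]                # da/dtheta_j = x_j
    dJ_dx = [dJ_da * t for t in theta]                  # da/dx_j = theta_j
    return dJ_dtheta, dJ_dx

x, theta, y_star = [1.0, 2.0], [0.5, -0.25], 1.0
a, y, J = forward(x, theta, y_star)
dJ_dtheta, dJ_dx = backward(x, theta, y_star, a, y)
print(a, y, J)
print(dJ_dtheta, dJ_dx)
```

The backward pass reuses the stored values a and y and produces numbers, not formulas, and it has about as many steps as the forward pass.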
When Is Backpropagation Efficient?

| Efficient? | High-dimensional inputs: $\frac{\partial J(\{\theta_i\},\{x_j\})}{\partial \theta_1}, \frac{\partial J(\{\theta_i\},\{x_j\})}{\partial \theta_2}, \ldots$ | High-dimensional outputs: $\frac{\partial J_1(\{\theta_i\},\{x_j\})}{\partial \theta}, \frac{\partial J_2(\{\theta_i\},\{x_j\})}{\partial \theta}, \ldots$ |
| --- | --- | --- |
| Backprop (Reverse Mode AD) | YES. Cheap Gradient Principle: time cost ~ one forward pass | NO |
| FD | NO. Time cost is multiple forward passes; 2 PER input | May not be |
| Symbolic Differentiation | May not be. Formula for J can grow exponentially in size, aka Expression Swell (https://arxiv.org/pdf/1502.05767.pdf), unless common subexpressions are leveraged | May not be |
| Forward Mode AD | NO | YES |
Pseudo-Code for Backprop: Scalar Form
http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
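The pseudo-code on this slide is not preserved. A minimal scalar-form sketch for a network with one hidden layer, assuming sigmoid units and a squared-error loss (both assumptions); all names and weights are made up, and a finite-difference check is included as a sanity test:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(W1, b1, W2, b2, x):
    z2 = [sum(W1[i][j] * x[j] for j in range(len(x))) + b1[i] for i in range(len(W1))]
    a2 = [sigmoid(z) for z in z2]
    z3 = [sum(W2[k][i] * a2[i] for i in range(len(a2))) + b2[k] for k in range(len(W2))]
    a3 = [sigmoid(z) for z in z3]
    return a2, a3

def loss(W1, b1, W2, b2, x, y):
    _, a3 = forward(W1, b1, W2, b2, x)
    return 0.5 * sum((a - t) ** 2 for a, t in zip(a3, y))

def backprop(W1, b1, W2, b2, x, y):
    a2, a3 = forward(W1, b1, W2, b2, x)
    # Output error terms: delta3_k = (a3_k - y_k) * f'(z3_k), with f' = a * (1 - a)
    delta3 = [(a3[k] - y[k]) * a3[k] * (1 - a3[k]) for k in range(len(a3))]
    # Hidden error terms: delta2_i = (sum_k W2[k][i] * delta3_k) * f'(z2_i)
    delta2 = [sum(W2[k][i] * delta3[k] for k in range(len(delta3))) * a2[i] * (1 - a2[i])
              for i in range(len(a2))]
    dW2 = [[delta3[k] * a2[i] for i in range(len(a2))] for k in range(len(delta3))]
    dW1 = [[delta2[i] * x[j] for j in range(len(x))] for i in range(len(delta2))]
    return dW1, delta2, dW2, delta3      # bias gradients equal the error terms

W1 = [[0.1, -0.2], [0.4, 0.3]]
b1 = [0.0, 0.1]
W2 = [[0.3, -0.1]]
b2 = [0.05]
x, y = [1.0, 2.0], [1.0]
dW1, db1, dW2, db2 = backprop(W1, b1, W2, b2, x, y)

# Gradient check on a single weight with a centered finite difference.
h = 1e-6
W1[0][1] += h
up = loss(W1, b1, W2, b2, x, y)
W1[0][1] -= 2 * h
down = loss(W1, b1, W2, b2, x, y)
W1[0][1] += h
numeric = (up - down) / (2 * h)
print(dW1[0][1], numeric)
```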
Pseudo-Code for Backprop: Matrix-Vector Form
http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
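The slide's pseudo-code is not preserved. Assuming a squared-error loss, layers numbered 1 to n_l, and the convention z^{(l+1)} = W^{(l)} a^{(l)} + b^{(l)}, a^{(l+1)} = f(z^{(l+1)}) (all assumptions in the style of the tutorial linked above), the matrix-vector form of backprop for one training example can be written as:

```latex
\begin{aligned}
\delta^{(n_l)} &= -\big(y - a^{(n_l)}\big) \odot f'\big(z^{(n_l)}\big)
  && \text{(output layer)} \\
\delta^{(l)} &= \Big( \big(W^{(l)}\big)^{\top} \delta^{(l+1)} \Big) \odot f'\big(z^{(l)}\big)
  && \text{for } l = n_l - 1, \ldots, 2 \\
\nabla_{W^{(l)}} J &= \delta^{(l+1)} \big(a^{(l)}\big)^{\top}, \qquad
\nabla_{b^{(l)}} J = \delta^{(l+1)}
\end{aligned}
```

Here ⊙ is the elementwise product. Each backward step costs one matrix-vector product with the transposed weights, mirroring the forward pass.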
Gradient Descent for Neural Networks
http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
Backpropagation: Network View
Another Deeper Example (for practice)
Example
[Fei-Fei Li, Andrej Karpathy, Justin Johnson]
Example
[Fei-Fei Li, Andrej Karpathy, Justin Johnson]
![Page 62: ELEC 576: Neural Networks & Backpropagation Lecture 3€¦ · ELEC 576: Neural Networks & Backpropagation Lecture 3 Ankit B. Patel Baylor College of Medicine (Neuroscience Dept.)](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f0726e07e708231d41b905c/html5/thumbnails/62.jpg)
Example
[Fei-Fei Li, Andrej Karpathy, Justin Johnson]
![Page 63: ELEC 576: Neural Networks & Backpropagation Lecture 3€¦ · ELEC 576: Neural Networks & Backpropagation Lecture 3 Ankit B. Patel Baylor College of Medicine (Neuroscience Dept.)](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f0726e07e708231d41b905c/html5/thumbnails/63.jpg)
Example
[Fei-Fei Li, Andrej Karpathy, Justin Johnson]
![Page 64: ELEC 576: Neural Networks & Backpropagation Lecture 3€¦ · ELEC 576: Neural Networks & Backpropagation Lecture 3 Ankit B. Patel Baylor College of Medicine (Neuroscience Dept.)](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f0726e07e708231d41b905c/html5/thumbnails/64.jpg)
Question: What problems might you encounter with deeply nested functions? (3 min)
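One possible answer, offered here as an illustration rather than as the slide's intended solution, is that gradients can shrink rapidly as the nesting gets deeper. By the chain rule, the derivative of a nested function is a product of the local derivatives of each level, and the sigmoid's derivative is never larger than 0.25. The minimal sketch below assumes a scalar chain of sigmoids with a shared weight `w`; the function name `nested_sigmoid_gradient` and the input value are hypothetical choices made for this example.

```python
import math


def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def nested_sigmoid_gradient(x, depth, w=1.0):
    """Return (output, d output / d x) for `depth` nested sigmoids.

    The function is f(x) = s(w * s(w * ... s(w * x))) with `depth` levels.
    The gradient is accumulated with the chain rule: one local derivative
    w * a * (1 - a) is multiplied in per level.
    """
    a = x
    grad = 1.0
    for _ in range(depth):
        a = sigmoid(w * a)
        grad *= w * a * (1.0 - a)  # local derivative of this level
    return a, grad


# With w = 1 every factor is at most 0.25, so the gradient shrinks
# at least geometrically as the nesting gets deeper.
for depth in (1, 5, 10, 20):
    _, grad = nested_sigmoid_gradient(0.5, depth)
    print(f"depth={depth:2d}  gradient={grad:.3e}")
```

With other activations or large weights, the same product of local derivatives can instead grow very large, which is the opposite failure mode.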
![Page 65: ELEC 576: Neural Networks & Backpropagation Lecture 3€¦ · ELEC 576: Neural Networks & Backpropagation Lecture 3 Ankit B. Patel Baylor College of Medicine (Neuroscience Dept.)](https://reader034.vdocuments.us/reader034/viewer/2022042318/5f0726e07e708231d41b905c/html5/thumbnails/65.jpg)
Visualizing Backprop during Training: Classification with 2-Layer Neural Network
• http://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html
• Try playing around with this app to build intuition:
• Change the datapoints to see how the decision boundaries change
• Change the network layer types, widths, activation functions, etc.
• Try shallower vs. deeper networks
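For readers without access to the demo, the following is a minimal offline sketch of the same idea: a 2-layer network (one tanh hidden layer, one sigmoid output) trained on 2D points with gradient descent and backprop. It is not the demo's code. The XOR-style dataset, the function names `train_two_layer` and `predict`, and all hyperparameters (`hidden`, `lr`, `epochs`, `seed`) are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Forward pass: returns hidden activations h and output probability p."""
    W1, b1, W2, b2 = params
    h = [math.tanh(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j])
         for j in range(len(W2))]
    p = sigmoid(sum(W2[j] * h[j] for j in range(len(W2))) + b2)
    return h, p


def train_two_layer(data, hidden=4, lr=0.5, epochs=2000, seed=0):
    """Full-batch gradient descent on the mean cross-entropy loss."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    n = len(data)
    losses = []
    for _ in range(epochs):
        gW1 = [[0.0, 0.0] for _ in range(hidden)]
        gb1 = [0.0] * hidden
        gW2 = [0.0] * hidden
        gb2 = 0.0
        loss = 0.0
        for x, y in data:
            h, p = forward((W1, b1, W2, b2), x)
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
            loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            # Backward pass (chain rule), output layer first.
            dz2 = p - y  # d loss / d pre-sigmoid output
            gb2 += dz2
            for j in range(hidden):
                gW2[j] += dz2 * h[j]
                dz1 = dz2 * W2[j] * (1.0 - h[j] ** 2)  # through tanh
                gW1[j][0] += dz1 * x[0]
                gW1[j][1] += dz1 * x[1]
                gb1[j] += dz1
        losses.append(loss / n)  # loss measured before this epoch's update
        # Gradient descent step on the averaged gradients.
        b2 -= lr * gb2 / n
        for j in range(hidden):
            W2[j] -= lr * gW2[j] / n
            b1[j] -= lr * gb1[j] / n
            W1[j][0] -= lr * gW1[j][0] / n
            W1[j][1] -= lr * gW1[j][1] / n
    return (W1, b1, W2, b2), losses


def predict(params, x):
    """Predicted probability of class 1 for a 2D point x."""
    return forward(params, x)[1]


# XOR-style toy data: not linearly separable, so the hidden layer matters.
data = [((0.0, 0.0), 0), ((0.0, 1.0), 1), ((1.0, 0.0), 1), ((1.0, 1.0), 0)]
params, losses = train_two_layer(data)
print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

Varying `hidden`, `lr`, or the datapoints here mirrors the experiments suggested for the demo, though without its decision-boundary visualization.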