Introduction to Neural Networks · 2012. 9. 3.
TRANSCRIPT
![Page 1: Introduction to Neural Networks · 2012. 9. 3. · Gianluca Pollastri, Head of Lab, School of Computer Science](https://reader035.vdocuments.us/reader035/viewer/2022071415/610f9c37da1e0d4012655839/html5/thumbnails/1.jpg)
Introduction to Neural Networks
Gianluca Pollastri, Head of Lab
School of Computer Science and Informatics and Complex and Adaptive Systems Labs
University College [email protected]

Credits

Geoffrey Hinton, University of Toronto. I borrowed some of his slides for the “Neural Networks” and “Computation in Neural Networks” courses.

Paolo Frasconi, University of Florence. This guy taught me Neural Networks in the first place (*and* I borrowed some of his slides too!).

Recurrent Neural Networks (RNN)

One of the earliest versions: Jeffrey Elman, 1990, Cognitive Science.

Problem: it isn’t easy to represent time with Feedforward Neural Nets: usually time is represented with space.

Attempt to design networks with memory.

RNNs

The idea is to use discrete time steps and to treat the hidden layer at time t-1 as an input at time t.

This effectively removes cycles: we can model the network using an FFNN, and model memory explicitly.
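The idea above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: a scalar hidden state, a tanh unit, and the hypothetical names `rnn_step`, `run_rnn`, `w_rec`, `w_in` and `bias`.

```python
import math

def rnn_step(x_prev, i_t, w_rec, w_in, bias):
    """One discrete time step: the hidden state at t-1 is just another input at t."""
    return math.tanh(w_rec * x_prev + w_in * i_t + bias)

def run_rnn(inputs, w_rec=0.5, w_in=1.0, bias=0.0):
    """Apply the same feed-forward step at every time step, carrying the memory along."""
    x = 0.0  # initial memory (assumed to be zero)
    states = []
    for i_t in inputs:
        x = rnn_step(x, i_t, w_rec, w_in, bias)
        states.append(x)
    return states

# A single pulse at the first step, followed by silence.
states = run_rnn([1.0, 0.0, 0.0])
```

With a pulse followed by zero inputs, the later states stay non-zero: the only path from the first input to them is the explicit memory.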

[Figure: recurrent network with input It, hidden state Xt and output Ot; d = delay element]

BPTT

BackPropagation Through Time.

If Ot is the output at time t, It the input at time t, and Xt the memory (hidden) at time t, we can model the dependencies as follows:

Xt = f( Xt-1 , It )
Ot = g( Xt , It )

BPTT

We can model both f() and g() with (possibly multilayered) networks.

We can transform the recurrent network by unrolling it in time.

Backpropagation works on any DAG. An RNN becomes one once it’s unrolled.

[Figure: the recurrent network (It, Xt, Ot; d = delay element) unrolled in time, with inputs It-2 … It+2, hidden states Xt-2 … Xt+2 and outputs Ot-2 … Ot+2]

gradient in BPTT

GRADIENT(I,O,T) {
  # I=inputs, O=outputs, T=targets
  N := size(O);
  X0 := 0;
  for t := 1..N
    Xt := f( Xt-1 , It );
  for t := 1..N {
    Ot := g( Xt , It );
    g.gradient( Ot - Tt );
    δt := g.deltas( Ot - Tt );
  }
  for t := N..1 {
    f.gradient( δt );
    δt-1 += f.deltas( δt );
  }
}
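The pseudocode above can be turned into a small runnable sketch. Everything specific here is an assumption made for illustration, not the slides' model: a scalar state, f(Xt-1, It) = tanh(a·Xt-1 + b·It), g(Xt, It) = c·Xt + d·It, and a squared-error loss. The analytic BPTT gradient is compared against a finite-difference estimate.

```python
import math

def forward(params, inputs):
    a, b, c, d = params
    xs = [0.0]                                          # X0 := 0
    for i_t in inputs:
        xs.append(math.tanh(a * xs[-1] + b * i_t))      # Xt := f(Xt-1, It)
    outs = [c * x + d * i_t for x, i_t in zip(xs[1:], inputs)]  # Ot := g(Xt, It)
    return xs, outs

def loss(params, inputs, targets):
    _, outs = forward(params, inputs)
    return 0.5 * sum((o - y) ** 2 for o, y in zip(outs, targets))

def bptt_gradient(params, inputs, targets):
    a, b, c, d = params
    n = len(inputs)
    xs, outs = forward(params, inputs)
    grad = [0.0, 0.0, 0.0, 0.0]          # d(loss)/d(a, b, c, d)
    delta = [0.0] * (n + 1)              # delta[t] = d(loss)/d(Xt)
    for t in range(1, n + 1):            # output network g
        err = outs[t - 1] - targets[t - 1]
        grad[2] += err * xs[t]
        grad[3] += err * inputs[t - 1]
        delta[t] = err * c
    for t in range(n, 0, -1):            # state network f, backwards in time
        pre = delta[t] * (1.0 - xs[t] ** 2)   # through the tanh
        grad[0] += pre * xs[t - 1]
        grad[1] += pre * inputs[t - 1]
        delta[t - 1] += pre * a          # pass the error to the previous step
    return grad

def numerical_gradient(params, inputs, targets, eps=1e-6):
    grad = []
    for k in range(len(params)):
        up, down = list(params), list(params)
        up[k] += eps
        down[k] -= eps
        grad.append((loss(up, inputs, targets) - loss(down, inputs, targets)) / (2 * eps))
    return grad

params = [0.7, -0.4, 1.3, 0.2]
inputs = [1.0, -0.5, 0.25, 2.0]
targets = [0.5, 0.0, -0.5, 1.0]
analytic = bptt_gradient(params, inputs, targets)
numeric = numerical_gradient(params, inputs, targets)
```

The backward loop must run from the last step to the first, because `delta[t]` is only complete once every later step has added its contribution.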

[Figure, pages 11-23: the unrolled network again (inputs It-2 … It+2, hidden states Xt-2 … Xt+2, outputs Ot-2 … Ot+2), repeated on each page]

What I will talk about

• Neurons
• Multi-Layered Neural Networks:
  • Basic learning algorithm
  • Expressive power
  • Classification
• How can we *actually* train Neural Networks:
  • Speeding up training
  • Learning just right (not too little, not too much)
  • Figuring out you got it right
• Feed-back networks?
  • Anecdotes on real feed-back networks (Hopfield Nets, Boltzmann Machines)
  • Recurrent Neural Networks
  • Bidirectional RNN
  • 2D-RNN
• Concluding remarks
Bidirectional Recurrent Neural Networks (BRNN)

BRNN

Ft = ( Ft-1 , Ut )
Bt = ( Bt+1 , Ut )
Yt = ( Ft , Bt , Ut )

• (), () and () are realised with NN
• (), () and () are independent from t: stationary

Inference in BRNNs

FORWARD(U) {
  T := size(U);
  F0 := BT+1 := 0;
  for t := 1..T
    Ft = ( Ft-1 , Ut );
  for t := T..1
    Bt = ( Bt+1 , Ut );
  for t := 1..T
    Yt = ( Ft , Bt , Ut );
  return Y;
}
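The inference procedure above can be sketched in Python as follows. The names `phi`, `beta` and `eta` for the three networks are hypothetical, since their symbols did not survive in the transcript, and the scalar tanh units are stand-ins chosen only to keep the example short.

```python
import math

def brnn_forward(u, phi, beta, eta):
    """Forward chain left to right, backward chain right to left, then the outputs."""
    n = len(u)
    f = [0.0] * (n + 2)   # f[0] is the boundary state F0 = 0
    b = [0.0] * (n + 2)   # b[n + 1] is the boundary state B(T+1) = 0
    for t in range(1, n + 1):
        f[t] = phi(f[t - 1], u[t - 1])
    for t in range(n, 0, -1):
        b[t] = beta(b[t + 1], u[t - 1])
    return [eta(f[t], b[t], u[t - 1]) for t in range(1, n + 1)]

# Assumed stand-ins for the three networks (the same functions at every t: stationary).
def phi(f_prev, u_t):
    return math.tanh(0.5 * f_prev + u_t)

def beta(b_next, u_t):
    return math.tanh(0.5 * b_next + u_t)

def eta(f_t, b_t, u_t):
    return f_t + b_t

y = brnn_forward([1.0, 0.0, 0.0], phi, beta, eta)
y_changed = brnn_forward([1.0, 0.0, 5.0], phi, beta, eta)
```

Changing only the last input changes the first output, which a purely left-to-right network could not do: that information arrives through the backward chain.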

Learning in BRNNs

GRADIENT(U,Y) {
  T := size(U);
  F0 := BT+1 := 0;
  for t := 1..T
    Ft = ( Ft-1 , Ut );
  for t := T..1
    Bt = ( Bt+1 , Ut );
  for t := 1..T {
    Yt = ( Ft , Bt , Ut );
    [δFt, δBt] = .backprop&gradient( Yt - Yt );
  }
  for t := T..1
    δFt-1 += .backprop&gradient(δFt );
  for t := 1..T
    δBt+1 += .backprop&gradient(δBt );
}
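A runnable sketch of the same three-pass scheme, under assumptions that are not in the slides: scalar states, tanh units for the forward and backward chains, a linear output, a squared-error loss, and the hypothetical parameter names `wf, vf, wb, vb, cf, cb, cu`. The result is checked against finite differences.

```python
import math

def forward(p, u):
    wf, vf, wb, vb, cf, cb, cu = p
    n = len(u)
    f = [0.0] * (n + 2)                  # f[0] = F0 = 0
    b = [0.0] * (n + 2)                  # b[n + 1] = B(T+1) = 0
    for t in range(1, n + 1):
        f[t] = math.tanh(wf * f[t - 1] + vf * u[t - 1])
    for t in range(n, 0, -1):
        b[t] = math.tanh(wb * b[t + 1] + vb * u[t - 1])
    y = [cf * f[t] + cb * b[t] + cu * u[t - 1] for t in range(1, n + 1)]
    return f, b, y

def loss(p, u, targets):
    y = forward(p, u)[2]
    return 0.5 * sum((yt - tt) ** 2 for yt, tt in zip(y, targets))

def brnn_gradient(p, u, targets):
    wf, vf, wb, vb, cf, cb, cu = p
    n = len(u)
    f, b, y = forward(p, u)
    g = [0.0] * 7
    d_f = [0.0] * (n + 2)                # d(loss)/d(Ft)
    d_b = [0.0] * (n + 2)                # d(loss)/d(Bt)
    for t in range(1, n + 1):            # output network
        err = y[t - 1] - targets[t - 1]
        g[4] += err * f[t]
        g[5] += err * b[t]
        g[6] += err * u[t - 1]
        d_f[t] = err * cf
        d_b[t] = err * cb
    for t in range(n, 0, -1):            # forward chain, traversed right to left
        pre = d_f[t] * (1.0 - f[t] ** 2)
        g[0] += pre * f[t - 1]
        g[1] += pre * u[t - 1]
        d_f[t - 1] += pre * wf
    for t in range(1, n + 1):            # backward chain, traversed left to right
        pre = d_b[t] * (1.0 - b[t] ** 2)
        g[2] += pre * b[t + 1]
        g[3] += pre * u[t - 1]
        d_b[t + 1] += pre * wb
    return g

def numerical_gradient(p, u, targets, eps=1e-6):
    grad = []
    for k in range(len(p)):
        up, down = list(p), list(p)
        up[k] += eps
        down[k] -= eps
        grad.append((loss(up, u, targets) - loss(down, u, targets)) / (2 * eps))
    return grad

p = [0.6, -0.3, 0.4, 0.8, 1.1, -0.7, 0.2]
u = [1.0, -0.5, 0.25, 2.0]
targets = [0.5, 0.0, -0.5, 1.0]
analytic = brnn_gradient(p, u, targets)
numeric = numerical_gradient(p, u, targets)
```

Each chain is traversed in the opposite direction to the one it was computed in, so every error term is complete before it is propagated further.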

2D RNNs

Pollastri & Baldi 2002, Bioinformatics
Baldi & Pollastri 2003, JMLR

[Figures, pages 34-39: 2D RNNs]