TRANSCRIPT
Introduction to Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM)
Wenjie Pei
Artificial Neural Networks
• Feedforward neural networks – ANNs without cyclic connections between nodes
• (Feedback) Recurrent neural networks – ANNs with cyclic connections between nodes
Feedforward Neural Networks
• Multilayer perceptron (MLP)
Universal approximation theorem: sufficiently many nonlinear hidden units can approximate any continuous mapping function
Drawback: the output depends only on the current input, so no temporal dependencies are captured (see the sketch below)
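To make the drawback concrete, here is a minimal one-hidden-layer MLP sketch in NumPy (all names and shapes are illustrative assumptions, not from the slides): each input is mapped in isolation, so the model has no way to carry information from one time step to the next.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP: the output depends only on the current input x."""
    h = np.tanh(W1 @ x + b1)   # nonlinear hidden layer
    return W2 @ h + b2

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 4)), np.zeros(8)   # illustrative shapes
W2, b2 = rng.standard_normal((2, 8)), np.zeros(2)

# Each time step of a sequence is processed in isolation: no temporal context.
outputs = [mlp_forward(x_t, W1, b1, W2, b2)
           for x_t in rng.standard_normal((5, 4))]
```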
Recurrent Neural Networks
Advantage: memory of previous inputs lets the network incorporate contextual information
Feedback from the hidden unit activations of the previous time step into the current time step (a minimal sketch follows below)
Universal approximation theory: an RNN with sufficient hidden units can approximate any measurable sequence-to-sequence mapping, i.e., any dynamic system
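The feedback loop amounts to the recurrence $h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)$; a minimal NumPy sketch, with parameter names and shapes assumed for illustration:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One time step of a vanilla RNN: the new hidden state mixes the
    current input with feedback from the previous hidden state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Illustrative dimensions: 4-dim inputs, 8 hidden units.
rng = np.random.default_rng(0)
W_xh = rng.standard_normal((8, 4)) * 0.1
W_hh = rng.standard_normal((8, 8)) * 0.1
b_h = np.zeros(8)

h = np.zeros(8)                          # initial hidden state
for x_t in rng.standard_normal((5, 4)):  # a length-5 input sequence
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
```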
Recurrent Neural Networks
• Bidirectional RNNs – process the sequence in both directions, so each output can use past and future context (sketched below)
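A bidirectional RNN runs two independent recurrent layers, one left-to-right and one right-to-left, and combines their hidden states at every time step. A minimal NumPy sketch, with all parameter names and shapes assumed for illustration:

```python
import numpy as np

def rnn_pass(xs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence, returning all hidden states."""
    h = np.zeros(W_hh.shape[0])
    hs = []
    for x_t in xs:
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        hs.append(h)
    return hs

rng = np.random.default_rng(0)
params_fwd = (rng.standard_normal((8, 4)) * 0.1,
              rng.standard_normal((8, 8)) * 0.1, np.zeros(8))
params_bwd = (rng.standard_normal((8, 4)) * 0.1,
              rng.standard_normal((8, 8)) * 0.1, np.zeros(8))

xs = rng.standard_normal((5, 4))
h_fwd = rng_states = rnn_pass(xs, *params_fwd)         # left-to-right pass
h_bwd = rnn_pass(xs[::-1], *params_bwd)[::-1]          # right-to-left, re-aligned
# Each combined state sees both past and future context.
h_bi = [np.concatenate([f, b]) for f, b in zip(h_fwd, h_bwd)]
```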
Recurrent Neural Networks
• Vanishing gradient problem
Sensitivity to an input decays exponentially over time as the signal passes through the recurrence (a derivation sketch follows below)
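A sketch of why the decay is exponential (the standard backpropagation-through-time argument; the notation $a_i$ for pre-activations and $W_{hh}$ for the recurrent weights is assumed here, not given on the slide). With $h_i = \tanh(a_i)$ and $a_i = W_{xh} x_i + W_{hh} h_{i-1} + b_h$, the sensitivity of $h_t$ to an earlier state $h_k$ is a product of Jacobians:

$$
\frac{\partial h_t}{\partial h_k}
= \prod_{i=k+1}^{t} \frac{\partial h_i}{\partial h_{i-1}}
= \prod_{i=k+1}^{t} \operatorname{diag}\big(1 - h_i^2\big)\, W_{hh},
\qquad
\left\lVert \frac{\partial h_t}{\partial h_k} \right\rVert
\le \big(\gamma\, \lVert W_{hh} \rVert\big)^{t-k},
$$

where $\gamma = \max \tanh' = 1$. Whenever $\gamma\, \lVert W_{hh} \rVert < 1$ the sensitivity shrinks exponentially in $t - k$ (and explodes when it is greater than 1).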
Long Short-Term Memory (LSTM)
• Input gate [0, 1]: how much information from the current input enters the cell
• Forget gate [0, 1]: how much information from the previous cell state is retained
• Output gate [0, 1]: how much information from the cell is exposed as output
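The three gates combine with the standard candidate update into one time step. Below is a minimal NumPy sketch; the parameter names (W, U, b), the stacking order of the gates, and all shapes are illustrative assumptions rather than anything specified on the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b hold the parameters for the input (i),
    forget (f), output (o) gates and the candidate (g), stacked in that
    order along the first axis."""
    z = W @ x_t + U @ h_prev + b
    H = h_prev.shape[0]
    i = sigmoid(z[0:H])        # input gate in [0, 1]
    f = sigmoid(z[H:2*H])      # forget gate in [0, 1]
    o = sigmoid(z[2*H:3*H])    # output gate in [0, 1]
    g = np.tanh(z[3*H:4*H])    # candidate cell update
    c = f * c_prev + i * g     # gated cell state
    h = o * np.tanh(c)         # gated output
    return h, c

rng = np.random.default_rng(0)
H, D = 8, 4                    # illustrative hidden and input sizes
W = rng.standard_normal((4*H, D)) * 0.1
U = rng.standard_normal((4*H, H)) * 0.1
b = np.zeros(4*H)

h, c = np.zeros(H), np.zeros(H)
for x_t in rng.standard_normal((5, D)):
    h, c = lstm_step(x_t, h, c, W, U, b)
```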
Long Short-Term Memory
• Advantage: memory over long time periods, since the gated cell state lets error signals survive many time steps (see the sketch below)
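A brief sketch of why this works (the standard constant-error-carousel argument, not spelled out on the slide): along the direct cell-to-cell path, treating the gate and candidate activations as constants,

$$
\frac{\partial c_t}{\partial c_{t-1}} = f_t \quad (\text{elementwise}),
\qquad
\frac{\partial c_t}{\partial c_k} \approx \prod_{i=k+1}^{t} f_i,
$$

so while the forget gates stay near 1, error signals flow back through the cell state almost unattenuated instead of being squashed by a nonlinearity at every step.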
Applications
• Applications to sequence labeling problems:
– Handwritten character recognition
– Speech recognition
– Protein secondary structure prediction
– …
Want to know more about the latest papers? Wait for my next coffee talk.