019 20160907 decoupled neural interfaces using synthetic gradients
![Page 1: 019 20160907 Decoupled Neural Interfaces using Synthetic Gradients](https://reader035.vdocuments.us/reader035/viewer/2022062821/58a57e6e1a28ab36768b6b27/html5/thumbnails/1.jpg)
Decoupled Neural Interfaces using Synthetic Gradients
Tran Quoc Hoan
@k09ht haduonght.wordpress.com/
Paper Alert 2016-09-09, Hasegawa lab., Tokyo
The University of Tokyo
Max Jaderberg, Wojciech Marian Czarnecki, Simon Osindero, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu
https://arxiv.org/abs/1608.05343
![Page 2: 019 20160907 Decoupled Neural Interfaces using Synthetic Gradients](https://reader035.vdocuments.us/reader035/viewer/2022062821/58a57e6e1a28ab36768b6b27/html5/thumbnails/2.jpg)
Findings
• Modelling error gradients: using modelled synthetic gradients in place of the true backpropagated error gradients decouples subgraphs, so they can be updated independently and asynchronously
• Speeds up training and saves memory for RNNs
![Page 3: 019 20160907 Decoupled Neural Interfaces using Synthetic Gradients](https://reader035.vdocuments.us/reader035/viewer/2022062821/58a57e6e1a28ab36768b6b27/html5/thumbnails/3.jpg)
Neural network and the problem of locking
• Gradients are backpropagated sequentially
• Layer 1 must wait for the forward and backward computation of layers 2 and 3 before it can update
• Layer 1 is locked, i.e. coupled to the rest of the network
This is time-consuming for complex networks and for large distributed networks spread over multiple machines
Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients
![Page 4: 019 20160907 Decoupled Neural Interfaces using Synthetic Gradients](https://reader035.vdocuments.us/reader035/viewer/2022062821/58a57e6e1a28ab36768b6b27/html5/thumbnails/4.jpg)
Idea: Synthetic Gradient
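Figure: layer i produces output h_i. Instead of waiting for back-propagation, a model predicts the synthetic gradient δ̂_i from h_i.
• Update: layer i is updated with δ̂_i
• Train estimator: the estimator is trained to match the true gradient once it is available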
Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients
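The "Update" and "Train estimator" steps on this slide can be sketched in NumPy as below. This is a minimal illustration, not the paper's implementation: the class `DecoupledLayer`, the linear layer, the linear gradient model, the toy loss and all sizes and learning rates are assumptions made for the example.

```python
import numpy as np

class DecoupledLayer:
    """Linear layer i plus a linear synthetic gradient model M_i (both hypothetical)."""

    def __init__(self, n_in, n_out, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(n_in, n_out))  # layer weights
        self.A = np.zeros((n_out, n_out))                   # M_i weights
        self.b = np.zeros(n_out)                            # M_i bias
        self.lr = lr

    def forward(self, x):
        return x @ self.W                                   # h_i

    def predict_gradient(self, h):
        return h @ self.A + self.b                          # delta_hat_i = M_i(h_i)

    def update(self, x, delta_hat):
        # "Update": change the layer right away, without waiting for later layers.
        self.W -= self.lr * x.T @ delta_hat / len(x)

    def estimator_loss(self, h, delta_true):
        err = self.predict_gradient(h) - delta_true
        return 0.5 * float((err ** 2).sum(axis=1).mean())

    def train_estimator(self, h, delta_true):
        # "Train estimator": one gradient step on the L2 loss between
        # the predicted gradient and the true gradient.
        err = self.predict_gradient(h) - delta_true
        self.A -= self.lr * h.T @ err / len(h)
        self.b -= self.lr * err.mean(axis=0)

rng = np.random.default_rng(1)
x = rng.normal(size=(8, 4))                 # toy input batch
target = rng.normal(size=(8, 3))            # toy regression target

layer = DecoupledLayer(4, 3)
h = layer.forward(x)
layer.update(x, layer.predict_gradient(h))  # no backward pass needed here

# The true gradient arrives later; here it comes from the toy loss 0.5*||h - target||^2.
delta_true = h - target
before = layer.estimator_loss(h, delta_true)
layer.train_estimator(h, delta_true)
after = layer.estimator_loss(h, delta_true)
print(before, after)
```

The layer update depends only on `h` and the estimator, so it does not have to wait for the rest of the network; the true gradient is needed only to keep the estimator accurate.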
![Page 5: 019 20160907 Decoupled Neural Interfaces using Synthetic Gradients](https://reader035.vdocuments.us/reader035/viewer/2022062821/58a57e6e1a28ab36768b6b27/html5/thumbnails/5.jpg)
Idea: Synthetic Gradient
M_i: a small, simple neural network that predicts the synthetic gradient for layer i
Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients
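In symbols, the role of M_i can be written roughly as follows. The notation is an assumption chosen for this note: θ_i are the parameters of layer i, α is a learning rate, δ_i is the true gradient of the loss with respect to h_i, and δ̂_i is its prediction.

```latex
\begin{aligned}
\hat{\delta}_i &= M_i(h_i)
  && \text{(predicted gradient, computed from the layer output alone)} \\
\theta_i &\leftarrow \theta_i - \alpha \, \hat{\delta}_i \, \frac{\partial h_i}{\partial \theta_i}
  && \text{(layer update, no waiting for later layers)} \\
L_{M_i} &= \left\lVert \hat{\delta}_i - \delta_i \right\rVert_2^2
  && \text{(estimator loss, minimised once } \delta_i \text{ is available)}
\end{aligned}
```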
![Page 6: 019 20160907 Decoupled Neural Interfaces using Synthetic Gradients](https://reader035.vdocuments.us/reader035/viewer/2022062821/58a57e6e1a28ab36768b6b27/html5/thumbnails/6.jpg)
Synthetic Gradient for RNN
Image source: https://deepmind.com/blog#decoupled-neural-interfaces-using-synthetic-gradients
Synthetic gradients are used only at the truncation points of the RNN
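A rough NumPy sketch of this idea for a toy tanh RNN is given below. Inside each chunk the gradient is backpropagated exactly; only at the truncation point is the gradient from the future replaced by a prediction. The function `bptt_chunk`, the linear predictor `predict`, the per-step loss and all sizes are assumptions made for illustration, not the setup used in the paper.

```python
import numpy as np

def bptt_chunk(h0, xs, ys, Wh, Wx, future_grad_fn):
    """Backprop through one truncated chunk of a tanh RNN.

    future_grad_fn(h_T) supplies dL_future/dh_T at the truncation point:
    zeros for plain truncated BPTT, or a synthetic gradient.
    Returns (h_T, grad_Wh, grad_Wx, dL/dh_0).
    """
    hs = [h0]
    for x in xs:                                   # forward pass
        hs.append(np.tanh(hs[-1] @ Wh + x @ Wx))
    gWh, gWx = np.zeros_like(Wh), np.zeros_like(Wx)
    dh = future_grad_fn(hs[-1])                    # gradient arriving from the future
    for t in range(len(xs), 0, -1):                # backward pass inside the chunk
        dh = dh + (hs[t] - ys[t - 1])              # toy per-step loss 0.5*||h_t - y_t||^2
        da = dh * (1.0 - hs[t] ** 2)               # through tanh
        gWh += hs[t - 1].T @ da
        gWx += xs[t - 1].T @ da
        dh = da @ Wh.T                             # gradient w.r.t. the previous state
    return hs[-1], gWh, gWx, dh

rng = np.random.default_rng(0)
n, d, B, T = 3, 2, 4, 5                            # hypothetical sizes
Wh = rng.normal(scale=0.3, size=(n, n))
Wx = rng.normal(scale=0.3, size=(d, n))
A, b = np.zeros((n, n)), np.zeros(n)               # linear model: delta_hat = h A + b
lr = 0.01

def predict(h):
    return h @ A + b

h = np.zeros((B, n))
for _ in range(3):                                 # consecutive chunks of one long sequence
    xs = rng.normal(size=(T, B, d))
    ys = rng.uniform(-0.5, 0.5, size=(T, B, n))
    h_next, gWh, gWx, dh0 = bptt_chunk(h, xs, ys, Wh, Wx, predict)
    Wh -= lr * gWh / B                             # update the RNN weights
    Wx -= lr * gWx / B
    # Train the predictor at the chunk's starting boundary: its target is the
    # gradient backpropagated through the chunk (which itself includes the
    # predicted gradient at the chunk's end).
    err = predict(h) - dh0
    A -= lr * h.T @ err / B
    b -= lr * err.mean(axis=0)
    h = h_next
```

Passing `lambda h: np.zeros_like(h)` as `future_grad_fn` recovers ordinary truncated BPTT, which makes the two variants easy to compare.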
![Page 7: 019 20160907 Decoupled Neural Interfaces using Synthetic Gradients](https://reader035.vdocuments.us/reader035/viewer/2022062821/58a57e6e1a28ab36768b6b27/html5/thumbnails/7.jpg)
Experiments
Q: What hardware setup is needed to see the improvement (especially the one used at DeepMind)? Does it work on my GPU clusters?
![Page 8: 019 20160907 Decoupled Neural Interfaces using Synthetic Gradients](https://reader035.vdocuments.us/reader035/viewer/2022062821/58a57e6e1a28ab36768b6b27/html5/thumbnails/8.jpg)
Experiments