Memory Advances in Neural Turing Machines · 2020. 12. 21.
-
Memory Advances in Neural Turing Machines
Truyen Tran
Deakin University
Hanoi, June 2019
8/06/2019
@truyenoz
truyentran.github.io
letdataspeak.blogspot.com
goo.gl/3jJ1O0
-
Deep Learning
Domain expert
Knowledge-based
-
Can we learn from data a model that is as powerful as a Turing machine?
In other words, can we learn a (neural) program that learns to program from data?
-
Agenda
Neural Turing Machine
Variational memory
Sparse read/write
Program memory
Outlook
-
Example: Electronic medical records
Modelling three interwoven processes:
• Disease progression
• Interventions & care processes
• Recording rules
Figure: timeline of visits/admissions, with time gaps, leading up to a prediction point (source: medicalbillingcodings.org).
Abstraction
Need memory to handle thousands of events, compute complex healthcare “grammars”, support chains of reasoning, and switch rapidly between tasks.
-
Neural Turing machine (NTM)
A controller that takes input/output and talks to an external memory module.
Memory has read/write operations.
The main issue is where to write, and how to update the memory state.
All operations are differentiable.
https://rylanschaeffer.github.io/content/research/neural_turing_machine/main.html
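The read/write idea above can be sketched in a few lines. This is a minimal illustration, not the slide author's implementation: it assumes the content-based addressing and erase/add update of the original NTM formulation (Graves et al., 2014), and the names `address`, `read`, `write`, `beta`, `erase` and `add` are chosen here for illustration. Every step is a smooth function of its inputs, which is what makes the memory trainable by gradient descent.

```python
import math


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)


def address(memory, key, beta):
    """Content-based addressing: softmax over sharpened cosine similarities."""
    scores = [beta * cosine(row, key) for row in memory]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, weights):
    """Soft read: weighted sum of all memory rows."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]


def write(memory, weights, erase, add):
    """Soft write: each row is partly erased, then the add vector is blended in."""
    return [
        [m * (1.0 - w * e) + w * a for m, e, a in zip(row, erase, add)]
        for w, row in zip(weights, memory)
    ]


# Toy memory with three slots of width three (values are illustrative only).
memory = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
w = address(memory, key=[1.0, 0.0, 0.0], beta=10.0)  # focuses on slot 1
memory = write(memory, w, erase=[1.0, 1.0, 1.0], add=[0.0, 0.0, 1.0])
r = read(memory, w)  # reads back (approximately) what was just written
```

The "where to write" question corresponds to how `weights` is produced; here it comes purely from content similarity, which is only one of several addressing schemes.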
-
Agenda
Neural Turing Machine
Variational memory
Sparse read/write
Program memory
Outlook
-
Motivation: Dialog system
A dialog system needs to maintain the history of the chat (which could span hours). Memory is needed.
The generation of responses needs to be flexible, adapting to variations in mood and style. Current techniques are mostly based on LSTM, leading to “stiff” default responses (e.g., “I see”).
There are many ways to express the same thought. Variational generative methods are needed.
Image source: vectorstock
-
Variational memory encoder-decoder (VMED)
Figure: Conditional Variational Auto-Encoder (labels: context, generated, latent variables) compared with VMED (labels: context, generated, latent variables, memory reads).
-
Sample response
-
Agenda
Neural Turing Machine
Variational memory
Sparse read/write
Program memory
Outlook
-
Problems of current NTMs
Lack of theoretical analysis on optimal memory operations.
Previous works are based on intuitions:
• Location-based reading/writing; temporal linkage reading; least-used writing [Santoro et al., Graves et al.]
• Sparse access over big memory [Rae et al.]
Very slow due to heavy memory read/write computations.
-
Cached Uniform Writing (CUW)
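The slide gives only the name of the method, so the following is a hedged sketch of one plausible reading, not the authors' algorithm. It assumes that "uniform writing" means spending a fixed budget of memory writes at evenly spaced time steps, and that "cached" means buffering the inputs seen between two writes. The helper names `uniform_write_steps` and `run_with_cache` are hypothetical, and the plain mean stands in for whatever learned aggregation the actual model uses.

```python
def uniform_write_steps(seq_len, num_writes):
    """Return 0-based time steps for a fixed number of evenly spaced writes."""
    if seq_len <= 0 or num_writes <= 0:
        return []
    num_writes = min(num_writes, seq_len)
    interval = seq_len // num_writes
    # Write at the end of each interval so a write can summarise the steps before it.
    return [(k + 1) * interval - 1 for k in range(num_writes)]


def run_with_cache(inputs, num_writes):
    """Buffer inputs between writes; at each write step store one summary slot."""
    steps = set(uniform_write_steps(len(inputs), num_writes))
    cache, memory = [], []
    for t, x in enumerate(inputs):
        cache.append(x)
        if t in steps:
            # Assumption: the mean is a placeholder for a learned aggregation.
            memory.append(sum(cache) / len(cache))
            cache = []
    return memory


schedule = uniform_write_steps(12, 3)
slots = run_with_cache([float(i) for i in range(12)], 3)
```

With this schedule the number of memory writes is fixed by `num_writes` rather than by the sequence length, which is the kind of saving the "very slow" complaint on the previous slide points at.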
-
Ablation study: memory-augmented neural networks with/without Uniform Writing
Task: repeat the input sequence twice
-
Synthetic tasks: memorize all
-
Synthetic tasks: memorize selectively
-
Synthetic sinusoidal generation: memorize featured points
-
Flatten MNIST classification
-
Document classification
-
Agenda
Neural Turing Machine
Variational memory
Sparse read/write
Program memory
Outlook
-
Computing devices vs neural counterparts
FSM (1943) ↔ RNNs (1982)
PDA (1954) ↔ Stack RNN (1993)
TM (1936) ↔ NTM (2014)
UTM/VNA (1936/1945) ↔ NUTM, ours (2019)
The missing piece: a memory to store programs
Neural stored-program memory
-
NUTM = NTM + NSM
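The deck states only that a neural stored-program memory (NSM) holds programs and that adding it to an NTM yields the NUTM. As a rough sketch under stated assumptions, a "program" is taken here to be a weight matrix for the controller, and the active program is a soft blend chosen by attention of a query over per-program keys. The names `select_program`, `apply_program`, `keys` and `programs` are hypothetical and the toy matrices are purely illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def select_program(query, keys, programs):
    """Blend stored weight matrices by attention of the query over program keys."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    attn = softmax(scores)
    rows, cols = len(programs[0]), len(programs[0][0])
    blended = [
        [sum(a * p[i][j] for a, p in zip(attn, programs)) for j in range(cols)]
        for i in range(rows)
    ]
    return blended, attn


def apply_program(weights, x):
    """Run the selected 'program': a single linear map in this toy version."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Two stored programs, each addressed by its own key (illustrative values).
keys = [[5.0, 0.0], [0.0, 5.0]]
programs = [
    [[1.0, 0.0], [0.0, 1.0]],  # identity: copies the input
    [[0.0, 1.0], [1.0, 0.0]],  # swaps the two input components
]
weights, attn = select_program([4.0, 0.0], keys, programs)
y = apply_program(weights, [1.0, 2.0])  # close to the identity program's output
```

Because selection is a softmax rather than a hard switch, the choice of program stays differentiable, in line with the NTM principle that all operations are differentiable.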
-
Multi-level modelling
Hierarchical regression: if the input is clustered, clustering before regression helps.
A proof for low dimensions may be available; what about higher dimensions?
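A small worked example may make the hierarchical-regression claim concrete. It is a toy sketch with made-up data: two clusters of one-dimensional inputs follow different linear relationships, and the cluster assignment is assumed known (a simple threshold at `x < 5`) rather than learned. The helper names `fit_line` and `sse` are chosen for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sse(xs, ys, slope, intercept):
    """Sum of squared errors of a fitted line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Cluster A follows y = 2x; cluster B follows y = -x + 40 (illustrative data).
xs_a = [0.0, 1.0, 2.0, 3.0]
ys_a = [2.0 * x for x in xs_a]
xs_b = [10.0, 11.0, 12.0, 13.0]
ys_b = [-x + 40.0 for x in xs_b]

# One regression over everything.
xs_all, ys_all = xs_a + xs_b, ys_a + ys_b
global_error = sse(xs_all, ys_all, *fit_line(xs_all, ys_all))

# Cluster first (here: x < 5 versus x >= 5), then one regression per cluster.
clustered_error = (
    sse(xs_a, ys_a, *fit_line(xs_a, ys_a))
    + sse(xs_b, ys_b, *fit_line(xs_b, ys_b))
)
```

On this data the single global line cannot fit both regimes, while the per-cluster fits are essentially exact, so `clustered_error` is far smaller than `global_error`.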
-
NSM is beneficial to NTM
-
Algorithmic single tasks
-
Sequencing tasks
-
Continual Learning
-
Few-shot learning
-
Question answering (bAbI dataset)
-
Agenda
Neural Turing Machine
Variational memory
Sparse read/write
Program memory
Outlook
-
Memory for graphs & relational structures
Turing machine to design machine learning algorithms
Memory-supported reasoning
Imaginative memory
Social memory: collective memory, theory of mind, memory of others
Full cognitive architectures
Theoretical analysis
Towards AGI: Is the human brain a (super-)Turing machine?
https://twitter.com/nvidia/status/1010545517405835264