DRAW: A Recurrent Neural Network for Image Generation
Brandon Marlowe - 2693414, CIS 601 - Spring 2018
Agenda
● What is a Neural Network? (VERY briefly)
● DRAW (Deep Recurrent Attentive Writer) Overview
● Why DRAW matters
● DRAW...ing in Detail
● Experimentation and Results
What is a Neural Network?
● Statistical learning model inspired by the structure of the human brain
● Composed of “Neurons” (AKA, nodes)
● Consists of three main parts
○ Input Layer
○ Hidden Layer
○ Output Layer
Extremely Simple Example Feedforward Neural Network
(Computes XOR Function)
Inputs of [1, 1] are passed into the Neural Network
Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/
Random weights are assigned to each Synapse in all layers
The weighted inputs arriving at each Neuron are summed
An activation function (Sigmoid in this case) is applied to each of the weighted sums
Example Activation Functions
Image: Özkan, C., & Erbek, F. S. (2003). The Comparison of Activation Functions for Multispectral Landsat TM Image Classification. Photogrammetric Engineering & Remote Sensing, 69(11), 1225-1234. doi:10.14358/pers.69.11.1225
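The common activation functions compared above can be written in a few lines. This is a minimal sketch using their standard definitions (the function names are illustrative):

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: squashes inputs into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Hyperbolic tangent: squashes inputs into (-1, 1)."""
    return np.tanh(x)

def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return np.maximum(0.0, x)

print(sigmoid(0.0))  # 0.5
```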
Hidden layer values are multiplied by weights and summed
Error = target - calculated = -0.77
The derivative of the activation function is used to adjust weights and the process is repeated
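The forward pass, error, and weight-adjustment steps above can be sketched end to end. This is a toy illustration, not the exact weights from the figures: the 2-3-1 layer sizes, the random initialization, and the effective learning rate of 1 are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 3))   # input -> hidden synapse weights (random)
W2 = rng.normal(size=(3, 1))   # hidden -> output synapse weights (random)

x = np.array([[1.0, 1.0]])     # inputs [1, 1]
target = np.array([[0.0]])     # XOR(1, 1) = 0

for _ in range(5000):
    # Forward pass: weighted sums, then sigmoid at each layer.
    hidden = sigmoid(x @ W1)
    output = sigmoid(hidden @ W2)

    # Error = target - calculated.
    error = target - output

    # Backward pass: the sigmoid derivative s * (1 - s) scales the
    # error signal used to adjust the weights, and the process repeats.
    d_out = error * output * (1.0 - output)
    d_hid = (d_out @ W2.T) * hidden * (1.0 - hidden)
    W2 += hidden.T @ d_out
    W1 += x.T @ d_hid

print(float(output))  # approaches the target 0 as training repeats
```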
Recurrent Neural Networks (RNNs) vs. Feedforward Neural Networks (FNNs)
● RNNs are similar to FNNs
○ Main difference: RNNs are aware of previous inputs, FNNs are not
● RNNs can be thought of as multiple copies of an FNN, one per time step, passing state forward
Image: https://image.slidesharecdn.com/mdrnn-yandexmoscowcv-160427182305/95/multidimensional-rnn-4-638.jpg?cb=1461781453
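A minimal sketch of why an RNN is "aware of previous inputs": a hidden state carries information from one step to the next, like a chain of FNNs sharing weights. The sizes and random weights below are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(1)
W_x = rng.normal(size=(4, 8))   # input -> hidden weights
W_h = rng.normal(size=(8, 8))   # previous hidden -> hidden (the recurrence)

def rnn_step(x_t, h_prev):
    # Unlike a feedforward layer, the update also depends on h_prev,
    # so earlier inputs influence the current output.
    return np.tanh(x_t @ W_x + h_prev @ W_h)

h = np.zeros(8)                          # initial state: nothing seen yet
for x_t in rng.normal(size=(5, 4)):      # a sequence of 5 inputs
    h = rnn_step(x_t, h)
print(h.shape)  # (8,)
```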
DRAW Overview
DRAW
● DRAW = Deep Recurrent Attentive Writer
○ Composed of two Long Short-Term Memory (LSTM) Recurrent Neural Networks
■ Encoder RNN: compresses images
■ Decoder RNN: reconstitutes images
○ Long Short-Term Memory architecture composed of:
■ Read Gate, Write Gate, Keep/Forget Gate
● Not the first image generation Neural Network
● Belongs to the family of Variational Autoencoders
● Mimics the behavior of the human eye
● Creates portions of scenes independently and iteratively refines them
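One step of an LSTM cell, sketched under the standard gate definitions (the Read/Write/Keep-Forget naming above corresponds to the usual input/output/forget gates; the weight shapes and sizes here are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_in, n_hid = 4, 8
rng = np.random.default_rng(2)
# Weights for all four gate pre-activations, stacked into one matrix.
W = rng.normal(size=(n_in + n_hid, 4 * n_hid)) * 0.1

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([x_t, h_prev]) @ W
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input / forget / output gates
    c = f * c_prev + i * np.tanh(g)   # keep gated old memory, add gated new input
    h = o * np.tanh(c)                # expose a gated view of the memory
    return h, c

h = c = np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), h, c)
print(h.shape, c.shape)  # (8,) (8,)
```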
Why DRAW Matters
● Previous Autoencoders created images in a single pass
○ Accuracy suffered
○ Details were missed
○ Complex images posed problems
○ Could not create natural-looking images
● DRAW creates images iteratively
○ Generates complex images that cannot be distinguished from real data with the naked eye
○ Gradually refines each portion of the image
○ Substantially improves on state-of-the-art image generation models
Structure of DRAW in Detail
Conventional Auto-Encoder vs. DRAW
Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
DRAW...ing with Attention to Detail
● Read Gate places an N x N grid of Gaussian Filters on the image, centered at the grid center (gx, gy)
● δ = “stride” or “zoom” of the attention patch
○ A large stride means more of the image is visible to the attention model
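The read attention above can be sketched as a Gaussian filterbank, following the filter-centre formula from the DRAW paper; the particular values of the centre, stride δ, and σ below are arbitrary illustrations.

```python
import numpy as np

def filterbank(size, centre, delta, sigma, N):
    """Build N Gaussian filters over an axis of the given size."""
    i = np.arange(N)
    # Filter centres sit delta ("stride"/"zoom") apart around the grid centre.
    mu = centre + (i - N / 2 + 0.5) * delta
    a = np.arange(size)
    F = np.exp(-((a[None, :] - mu[:, None]) ** 2) / (2 * sigma ** 2))
    return F / F.sum(axis=1, keepdims=True)   # normalise each filter

A = B = 28                      # image width/height (e.g. an MNIST digit)
N = 5                           # attention grid size
Fx = filterbank(A, centre=14.0, delta=3.0, sigma=1.0, N=N)
Fy = filterbank(B, centre=14.0, delta=3.0, sigma=1.0, N=N)

image = np.random.rand(B, A)
patch = Fy @ image @ Fx.T       # the N x N "glimpse" the read gate extracts
print(patch.shape)  # (5, 5)
```

A larger delta spreads the filter centres further apart, so more of the image falls under the grid; the write operation applies the transposed filterbanks to paint an N x N patch back onto the canvas.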
DRAW...ing with Attention to Detail
● Write Gate extracts the previous attention parameters and inverts (transposes) them
● The inversion alternates focus between highly detailed and broad views of the image
Key Component: DRAW...ing with Attention to Detail
DRAW recreating images from MNIST dataset
Experimentation
● Three sets of training data were used:
○ MNIST (Modified National Institute of Standards and Technology Database)
■ Database of handwritten digits
○ SVHN (Street View House Numbers)
■ Database of images containing house numbers
○ CIFAR-10 (Canadian Institute For Advanced Research - 10 Classes)
■ Database containing 10 classes of vehicles and animals
● Experiment consisted of:
○ Classifying MNIST images
○ Generating MNIST images
○ Generating SVHN images
○ Generating CIFAR-10 images
Classifying MNIST
● MNIST 100 x 100 Cluttered Classification
○ 100 x 100 pixel images contained digit-like fragments
○ DRAW was tasked with identifying the digits
○ The model was given a fixed number of “glimpses”
■ Each glimpse is 12 x 12 pixels in size
○ DRAW compared with RAM (Recurrent Attention Model)
■ DRAW uses ¼ of the attention patches RAM uses
Generating MNIST
● DRAW tasked with generating MNIST-like digits
○ MNIST is widely used, allowing DRAW to be easily compared with other models
● Trained on the MNIST dataset
● Performance with vs. without selective attention was also compared
All images generated by DRAW (except rightmost column = training set images)
Negative log-likelihood (lower is better)
Generating SVHN
● DRAW trained on 64 x 64 pixel images of house numbers
● 231,053 images in dataset
● 4,701 validation images
Sequence of drawing SVHN digits
All images generated by DRAW (except rightmost column = training set images)
Generating CIFAR-10
● DRAW trained on 50,000 images
○ Small training sample considering the diversity of the images
● Still able to capture a good portion of the detail
All images generated by DRAW (except rightmost column = training set images)
DRAW in Action
Sources
● Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
● Özkan, C., & Erbek, F. S. (2003). The Comparison of Activation Functions for Multispectral Landsat TM Image Classification. Photogrammetric Engineering & Remote Sensing, 69(11), 1225-1234. doi:10.14358/pers.69.11.1225
● https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/