DRAW - Cleveland State University, eecs.csuohio.edu/~sschung/CIS601/BrandonMarlowe...
DRAW: A Recurrent Neural Network for Image Generation
Brandon Marlowe - 2693414
CIS 601 - Spring '18
Agenda
● What is a Neural Network? (VERY briefly)
● DRAW (Deep Recurrent Attentive Writer) Overview
● Why DRAW matters
● DRAW...ing in Detail
● Experimentation and Results
What is a Neural Network?
● Statistical learning model inspired by the structure of the human brain
● Composed of “Neurons” (AKA, nodes)
● Consists of three main parts:
○ Input Layer
○ Hidden Layer
○ Output Layer
Extremely Simple Example: Feedforward Neural Network (Computes the XOR Function)
Inputs of [1, 1] passed into the Neural Network
Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/
Random weights are assigned to each Synapse in all layers
Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/
The weighted inputs arriving at each Neuron are summed
Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/
Activation function (Sigmoid in this case) is applied to each of the weighted sums
Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/
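The forward pass on the last few slides (random weights, weighted sums, then an activation) can be sketched in a few lines of NumPy. The 2-3-1 layout and the random weights below are illustrative, not the exact values from the slides:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative random weights for a 2-3-1 XOR network
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(2, 3))   # input -> hidden synapses
W_output = rng.normal(size=(3, 1))   # hidden -> output synapses

x = np.array([1.0, 1.0])             # inputs [1, 1]
hidden = sigmoid(x @ W_hidden)       # weighted sums, then activation
output = sigmoid(hidden @ W_output)  # untrained guess for XOR(1, 1)
print(output)
```

Before training, this output is essentially random; training adjusts the weights until inputs like [1, 1] map close to the target 0.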
Example Activation Functions
Image: Özkan, C., & Erbek, F. S. (2003). The Comparison of Activation Functions for Multispectral Landsat TM Image Classification. Photogrammetric Engineering & Remote Sensing, 69(11), 1225-1234. doi:10.14358/pers.69.11.1225
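Three commonly used activation functions can be written directly; the exact set compared in the cited paper may differ from these:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # output in (0, 1)

def tanh(z):
    return np.tanh(z)                 # output in (-1, 1)

def relu(z):
    return np.maximum(0.0, z)         # clips negatives to 0

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))
```

The choice matters because the activation's output range and derivative shape both affect how the weight updates behave during training.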
Hidden layer values are multiplied by the output-layer weights and summed
Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/
Error = target - calculated = -0.77
Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/
The derivative of the activation function is used to adjust weights and the process is repeated
Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/
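Using the slide's error of -0.77, one weight update for a sigmoid output neuron might look like the following; the hidden activations and learning rate are made-up values for illustration:

```python
import numpy as np

def sigmoid_derivative(a):
    # Derivative of the sigmoid, written in terms of the activation a
    return a * (1.0 - a)

target, calculated = 0.0, 0.77
error = target - calculated                     # -0.77, as on the slide
delta = error * sigmoid_derivative(calculated)  # scale error by the slope

hidden_activations = np.array([0.6, 0.7, 0.5])  # hypothetical values
learning_rate = 0.5
weight_update = learning_rate * delta * hidden_activations
```

Repeating this update over many examples is what "the process is repeated" means: each pass nudges the weights in the direction that shrinks the error.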
Recurrent Neural Networks (RNNs) vs. Feedforward Neural Networks (FNNs)
● RNNs are similar to FNNs
○ Main difference: RNNs are aware of previous inputs; FNNs are not
● RNNs can be thought of as multiple FNNs chained together, one per time step
Image: https://image.slidesharecdn.com/mdrnn-yandexmoscowcv-160427182305/95/multidimensional-rnn-4-638.jpg?cb=1461781453
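The "aware of previous inputs" idea boils down to feeding the hidden state back in at every step. A minimal recurrent step, with arbitrary dimensions and weights chosen only for illustration:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # The new hidden state mixes the current input with the previous
    # state; this feedback loop is what a feedforward network lacks
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(1)
W_xh = 0.1 * rng.normal(size=(4, 8))  # input -> hidden
W_hh = 0.1 * rng.normal(size=(8, 8))  # hidden -> hidden (the recurrence)
b_h = np.zeros(8)

h = np.zeros(8)                       # initial hidden state
for x_t in rng.normal(size=(5, 4)):   # a sequence of 5 inputs
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
```

Unrolling this loop over time gives exactly the "multiple FNNs" picture: one copy of the network per time step, all sharing the same weights.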
DRAW Overview
DRAW
● DRAW = Deep Recurrent Attentive Writer
○ Composed of two Long Short-Term Memory (LSTM) Recurrent Neural Networks
■ Encoder RNN: compresses images
■ Decoder RNN: reconstitutes images
○ Long Short-Term Memory architecture composed of:
■ Read Gate, Write Gate, Keep/Forget Gate
● Not the first image-generation Neural Network
● Belongs to the family of Variational Autoencoders
● Mimics the behavior of the human eye
● Creates portions of scenes independently and iteratively refines them
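The Read/Write/Keep-Forget gates named above correspond to the input, output, and forget gates of a standard LSTM cell. A minimal sketch; the fused weight layout and gate ordering here are one common convention, not taken from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b, n):
    # One fused matrix produces all three gates plus the candidate value
    z = np.concatenate([x, h_prev]) @ W + b
    read_gate  = sigmoid(z[0:n])      # input gate: what to write to memory
    keep_gate  = sigmoid(z[n:2*n])    # keep/forget gate: what memory survives
    write_gate = sigmoid(z[2*n:3*n])  # output gate: what to expose
    candidate  = np.tanh(z[3*n:4*n])
    c = keep_gate * c_prev + read_gate * candidate  # updated cell memory
    h = write_gate * np.tanh(c)                     # exposed hidden state
    return h, c

n = 4
rng = np.random.default_rng(0)
W = 0.1 * rng.normal(size=(2 * n, 4 * n))  # x and h_prev are both size n
b = np.zeros(4 * n)
h, c = lstm_step(np.ones(n), np.zeros(n), np.zeros(n), W, b, n)
```

The gated cell memory c is what lets the encoder and decoder RNNs carry information across the many drawing steps.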
Why DRAW Matters
● Previous Autoencoders created images in a single pass
○ Accuracy suffered
○ Details were missed
○ Complex images posed problems
○ Could not create natural-looking images
● DRAW creates images iteratively
○ Generates complex images that cannot be distinguished from real data with the naked eye
○ Gradually refines each portion of the image
○ Substantially improves on state-of-the-art image generation models
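The iterative idea can be sketched as a running canvas that the decoder adds to at every step. Here write_patch is a stand-in for the real decoder-plus-attention output, not DRAW's actual write operation:

```python
import numpy as np

def write_patch(canvas_shape, rng):
    # Stand-in for the decoder's "write": a small additive update
    return 0.1 * rng.normal(size=canvas_shape)

rng = np.random.default_rng(0)
canvas = np.zeros((28, 28))          # start from a blank canvas
for t in range(10):                  # T refinement steps
    canvas = canvas + write_patch(canvas.shape, rng)

# Final pixel intensities come from squashing the accumulated canvas
image = 1.0 / (1.0 + np.exp(-canvas))
```

This additive-canvas structure is why the single-pass problems above go away: no individual step has to get the whole image right.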
Structure of DRAW in Detail
Conventional Auto-Encoder vs. DRAW
Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
DRAW...ing with Attention to Detail
● Read Gate places an N x N grid of Gaussian Filters on the image, centered at the grid center (gx, gy)
● δ = “stride” or “zoom” of the attention patch
○ A large stride means more of the image is visible to the attention model
Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
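The grid of Gaussian filters can be built explicitly. This sketch follows the paper's filterbank idea (grid center g, stride δ, filter width σ); the specific values below are arbitrary:

```python
import numpy as np

def filterbank(g, delta, sigma, N, size):
    # N Gaussian filter centers spaced delta apart around the center g
    mu = g + (np.arange(N) - (N - 1) / 2.0) * delta
    a = np.arange(size)
    F = np.exp(-((a[None, :] - mu[:, None]) ** 2) / (2.0 * sigma ** 2))
    return F / (F.sum(axis=1, keepdims=True) + 1e-8)  # normalize each filter

# Reading an N x N attention patch from an H x W image: F_y @ image @ F_x.T
N, H, W = 5, 28, 28
F_x = filterbank(g=14.0, delta=2.0, sigma=1.0, N=N, size=W)
F_y = filterbank(g=14.0, delta=2.0, sigma=1.0, N=N, size=H)
image = np.ones((H, W))
patch = F_y @ image @ F_x.T  # the N x N "glimpse" the encoder sees
```

Increasing delta spreads the filter centers out, so the same N x N patch summarizes a larger area of the image, matching the "large stride" bullet above.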
DRAW...ing with Attention to Detail
● Write Gate extracts the previous attention parameters and inverts them
● The inversion alternates focus between highly detailed and broad views of the image
Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
DRAW...ing with Attention to Detail
DRAW recreating images from MNIST dataset
Image: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
Experimentation
● Three sets of training data were used:
○ MNIST (Modified National Institute of Standards and Technology database)
■ Database of handwritten digits
○ SVHN (Street View House Numbers)
■ Database of images containing house numbers
○ CIFAR-10 (Canadian Institute For Advanced Research, 10 classes)
■ Database containing 10 classes of vehicles and animals
● The experiment consisted of:
○ Classifying MNIST images
○ Generating MNIST images
○ Generating SVHN images
○ Generating CIFAR-10 images
Classifying MNIST
● MNIST 100 x 100 Clutter Classification
○ 100 x 100 pixel images contained digit-like fragments
○ DRAW was tasked with identifying digits
○ The model was given a fixed number of “glimpses”
■ Each glimpse is 12 x 12 pixels in size
○ DRAW compared with RAM (Recurrent Attention Model)
■ DRAW uses ¼ of the attention patches RAM uses
Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
Generating MNIST
● DRAW tasked with generating MNIST-like digits
○ MNIST is widely used, allowing DRAW to be easily compared with other models
● Trained on the MNIST dataset
● Performance with vs. without selective attention was also compared
All images generated by DRAW (except the rightmost column, which shows training-set images)
Negative log-likelihood (lower is better)
Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
Generating SVHN
● DRAW trained on 64 x 64 pixel images of house numbers
● 231,053 images in dataset
● 4,701 validation images
Sequence of drawing SVHN digits
All images generated by DRAW (except the rightmost column, which shows training-set images)
Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
Generating CIFAR-10
● DRAW trained on 50,000 images
○ A small training sample considering the diversity of the images
● Still able to capture a good portion of the detail
All images generated by DRAW (except the rightmost column, which shows training-set images)
Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
DRAW in Action
Image: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
Sources
● Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
● Özkan, C., & Erbek, F. S. (2003). The Comparison of Activation Functions for Multispectral Landsat TM Image Classification. Photogrammetric Engineering & Remote Sensing, 69(11), 1225-1234. doi:10.14358/pers.69.11.1225
● https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/