TensorFlow: neural networks lab


Page 1: TensorFlow: neural networks lab

TensorFlow: neural networks lab

Gianluca Corrado and Andrea Passerini

[email protected] [email protected]

Machine Learning


Page 2: TensorFlow: neural networks lab

Introduction

TensorFlow

TensorFlow is a Python package

Numerical computation using data flow graphs

Developed (by Google) for the purpose of machine learning and deep neural networks research

Installation and Documentation

https://www.tensorflow.org/
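To make the "data flow graph" idea concrete, here is a minimal sketch (not from the original slides) using the TensorFlow 1.x-style API assumed throughout this lab: the graph is built first, and computation only happens when it is run inside a session.

import tensorflow as tf

# Build a tiny data flow graph: two constants and their product.
a = tf.constant(3.0)
b = tf.constant(4.0)
c = a * b  # nothing is computed yet, this only adds a node to the graph

# Computation happens only when the graph is run inside a session.
with tf.Session() as sess:
    print(sess.run(c))  # 12.0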


Page 3: TensorFlow: neural networks lab

Introduction

Outline

MNIST dataset
  https://www.tensorflow.org/versions/master/tutorials/mnist/beginners/index.html#the-mnist-data

MNIST for ML Beginners
  https://www.tensorflow.org/versions/master/tutorials/mnist/beginners/index.html

Deep MNIST for Experts
  https://www.tensorflow.org/versions/master/tutorials/mnist/pros/index.html


Page 4: TensorFlow: neural networks lab

Introduction

On Ubuntu 12.04

LD_LIBRARY_PATH

Run the script run_me_on_ubuntu1204.sh


Page 5: TensorFlow: neural networks lab

MNIST dataset

MNIST dataset

Dataset of handwritten digits

Each image is 28 × 28 = 784 pixels

Train: 60k, test: 10k


Page 6: TensorFlow: neural networks lab

MNIST dataset

Importing MNIST

Use the given input_data.py script

Train: 55k, validation: 5k, test: 10k

mnist is a Python class (with attributes and methods):
  mnist.train.images
  mnist.train.labels
  mnist.train.next_batch(n)
  ...
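A minimal usage sketch, assuming input_data.py exposes the same interface as the module in the TensorFlow MNIST tutorial (names below are illustrative):

import input_data  # the script provided for the lab

# Loads MNIST and applies the 55k/5k/10k split, with one-hot labels.
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

print(mnist.train.images.shape)  # (55000, 784)
print(mnist.train.labels.shape)  # (55000, 10), one-hot encoded
batch_xs, batch_ys = mnist.train.next_batch(100)  # mini-batch of 100 examples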


Page 7: TensorFlow: neural networks lab

MNIST dataset

Data representation

The 28 × 28 = 784 pixels are represented as a vector

The 10 classes are represented with the one-hot encoding
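For example, each image becomes a vector of 784 pixel intensities, and the label of a digit 3 is the 10-dimensional one-hot vector [0, 0, 0, 1, 0, 0, 0, 0, 0, 0] (a 1 in position 3, zeros elsewhere).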


Page 8: TensorFlow: neural networks lab

Softmax regressions

Softmax regressions

Look at an image and give probabilities for it being each digit

Evidence that an image is a particular class $i$:

$\mathrm{evidence}_i = \sum_j W_{ij} x_j + b_i \quad \forall i$

Softmax to shape the evidence as a probability distribution over the $i$ cases:

$\mathrm{softmax}_i = \frac{\exp(\mathrm{evidence}_i)}{\sum_j \exp(\mathrm{evidence}_j)} \quad \forall i$


Page 9: TensorFlow: neural networks lab

Softmax regressions

Softmax regressions

Schematic view

Vectorized version

Compactly: y = softmax(Wx + b)


Page 10: TensorFlow: neural networks lab

Softmax regressions

Implementation: model

Import TensorFlow

Define the placeholders

Define the variables

Define the softmax layer
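Put together, these four steps can be sketched as follows (TF 1.x-style API; variable names are illustrative, not necessarily the lab's exact code):

import tensorflow as tf

# Placeholders: input images (flattened to 784 pixels) and one-hot labels.
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])

# Variables: the model parameters, initialized to zero.
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Softmax layer: y = softmax(Wx + b).
y = tf.nn.softmax(tf.matmul(x, W) + b)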


Page 11: TensorFlow: neural networks lab

Softmax regressions

Implementation: optimization

Define the cost (cross-entropy): $H_{y'}(y) = -\sum_i y'_i \log y_i$, where $y'$ is the true one-hot label

Define the training algorithm

Start a new session (for now we have not computed anything)

Initialize the variables

Train the model
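A minimal sketch of these steps, continuing the model above (the learning rate, batch size and number of iterations follow the TensorFlow beginners tutorial and are assumptions, not necessarily the lab's exact values):

# Cross-entropy cost and a gradient descent training step.
cross_entropy = tf.reduce_mean(
    -tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

# Up to here nothing has been computed: start a session and initialize variables.
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Train with mini-batches of 100 examples.
for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})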


Page 12: TensorFlow: neural networks lab

Softmax regressions

Implementation: evaluation

Evaluate accuracy (it should be around 0.91)

Plot the model weights (plotter.py)
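The accuracy evaluation can be sketched like this (plotter.py is the provided script and is not reproduced here):

# A prediction is correct when the most likely class matches the true label.
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                    y_: mnist.test.labels}))  # around 0.91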


Page 13: TensorFlow: neural networks lab

Deep convolutional net

Deep architectures

0.91 accuracy on MNIST is NOT good!

State-of-the-art performance is 0.9979

Let's refine our model:
  2 convolutional layers
  alternated with 2 max pooling layers
  a ReLU layer (with dropout)
  softmax regression

Accuracy target: 0.99


Page 14: TensorFlow: neural networks lab

Deep convolutional net

Convolutional layer

Broadly used in image classification

Local connectivity (width, height, depth)

Spatial arrangement (stride, padding)

Parameter sharing

[Diagram: network architecture — inputs x1 ... x784 feed a convolutional layer, max pooling, a second convolutional layer, max pooling, a ReLU layer (with dropout), and a final softmax layer producing outputs y0 ... y9]


Page 15: TensorFlow: neural networks lab

Deep convolutional net

Max pooling layer

Commonly used after convolutional layer(s)

Reduce spatial size

Avoid overfitting

Max pooling 2 × 2 is very common



Page 16: TensorFlow: neural networks lab

Deep convolutional net

ReLU (and Dropout)

Fully connected layer

Rectified Linear Unit activation

Dropout randomly excludes neurons to avoid overfitting


Page 17: TensorFlow: neural networks lab

Deep convolutional net

Softmax layer

Look at a feature configuration (coming from the layers below) and give probabilities for it being each digit

Evidence that a feature configuration is a particular class $i$:

$\mathrm{evidence}_i = \sum_j W_{ij} x_j + b_i \quad \forall i$

Softmax to shape the evidence as a probability distribution over the $i$ cases:

$\mathrm{softmax}_i = \frac{\exp(\mathrm{evidence}_i)}{\sum_j \exp(\mathrm{evidence}_j)} \quad \forall i$



Page 18: TensorFlow: neural networks lab

Deep convolutional net

Implementation: preparation

Imports

Load data

Placeholders and data reshaping

NOTE

Reshaping is needed for convolution and max pooling
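A possible sketch of the preparation step (assuming the same input_data.py interface as before; names are illustrative):

import tensorflow as tf
import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

# Placeholders: flattened 784-pixel images and one-hot labels.
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])

# Convolution and max pooling expect 4-D tensors [batch, height, width, channels],
# so the flat vectors are reshaped back into 28x28 single-channel images.
x_image = tf.reshape(x, [-1, 28, 28, 1])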


Page 19: TensorFlow: neural networks lab

Deep convolutional net

Implementation: weights initialization

Define functions to initialize variables of the model
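A sketch of two such helper functions (the small positive initial values follow the TensorFlow Deep MNIST tutorial and are assumptions):

def weight_variable(shape):
    # Small random noise breaks symmetry between units.
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    # Slightly positive bias keeps ReLU units active at the start.
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)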


Page 20: TensorFlow: neural networks lab

Deep convolutional net

Implementation: convolution and pooling

Define functions (to keep the code cleaner)
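For instance, convolution with stride 1 and 2 × 2 max pooling can be wrapped as follows (a sketch along the lines of the Deep MNIST tutorial):

def conv2d(x, W):
    # Stride 1 and zero padding: the output keeps the input's width and height.
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # 2x2 pooling with stride 2 halves the width and the height.
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')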


Page 21: TensorFlow: neural networks lab

Deep convolutional net

Implementation: model (convolution and max pooling)

1st layer: convolutional with max pooling

2nd layer: convolutional with max pooling

Shrinking

Applying 2 × 2 max pooling we are shrinking the image

After 2 layers we have moved from 28 × 28 to 7 × 7

For each point we have 64 features
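A sketch of the two layers using the helpers above (the 5 × 5 patch size and the 32/64 feature maps follow the Deep MNIST tutorial and are assumptions):

# 1st layer: 32 features from 5x5 patches of the single input channel.
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)   # 28x28 -> 14x14

# 2nd layer: 64 features from 5x5 patches of the 32 incoming feature maps.
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)   # 14x14 -> 7x7, 64 features per point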


Page 22: TensorFlow: neural networks lab

Deep convolutional net

Implementation: model (ReLU, dropout and softmax)

3rd layer: ReLU

Reshape

We are switching back to fully connected layers, so we want to reshape the input as a flat vector.

3rd layer: add dropout

4th layer: softmax (output)
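These three steps can be sketched as follows (the 1024 hidden units are an assumption taken from the Deep MNIST tutorial):

# 3rd layer: fully connected ReLU layer on the flattened 7x7x64 feature maps.
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout: keep_prob is fed at run time (< 1 during training, 1 at test time).
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# 4th layer: softmax output over the 10 digit classes.
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)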


Page 23: TensorFlow: neural networks lab

Deep convolutional net

Implementation: optimization and evaluation

Define the cost (cross-entropy): $H_{y'}(y) = -\sum_i y'_i \log y_i$

Define the training algorithm

Start a new session (for now we have not computed anything)

Initialize the variables

Define the accuracy before training (for monitoring)
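A sketch of these steps for the deep model (the Adam optimizer and the 1e-4 learning rate follow the Deep MNIST tutorial and are assumptions):

# Cross-entropy cost on the deep network output and an Adam training step.
cross_entropy = tf.reduce_mean(
    -tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

# Accuracy, defined before training so it can be monitored while training.
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

sess = tf.Session()
sess.run(tf.global_variables_initializer())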


Page 24: TensorFlow: neural networks lab

Deep convolutional net

Implementation: optimization and evaluation

Train the model (may take a while)

Evaluate the accuracy (it should be around 0.99)
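A sketch of the training and evaluation loop (20000 iterations with batches of 50, as in the tutorial; treat these values as assumptions):

for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        # Monitor training accuracy with dropout disabled.
        train_accuracy = sess.run(accuracy, feed_dict={
            x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    # Dropout active (keep_prob = 0.5) during the training step.
    sess.run(train_step, feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print("test accuracy %g" % sess.run(accuracy, feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))  # around 0.99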


Page 25: TensorFlow: neural networks lab

Assignment

Assignment

The third ML assignment is to compare the performance of the deep convolutional network when removing layers. For this assignment you need to adapt the code of the complete deep architecture. By removing one layer at a time, and keeping all the others, you can evaluate the change in performance of the neural network in classifying the MNIST dataset.

Note

The only coding required is to modify the shape and/or size of the input vectors of the layers. The output of each layer has to remain the same.

The report has to contain a short introduction to the methodologies used in the deep architecture shown during the lab (convolution, max pooling, ReLU, dropout, softmax).


Page 26: TensorFlow: neural networks lab

Assignment

Assignment

Steps

1 Remove the 1st layer: convolutional and max pooling

2 Train and test the network

3 Remove the 2nd layer: convolutional and max pooling

4 Train and test the network

5 Remove the 3rd layer: ReLU with dropout

6 Train and test the network

Computation

Training a model on a quad-core CPU takes 30-45 minutes. You may want to use the computers in the lab.


Page 27: TensorFlow: neural networks lab

Assignment

Assignment

After completing the assignment submit it via email

Send an email to [email protected] (cc: [email protected])

Subject: tensorflowSubmit2016

Attachment: id_name_surname.zip containing:
  the Python code:
    model_no1.py (model without the 1st layer)
    model_no2.py (model without the 2nd layer)
    model_no3.py (model without the 3rd layer)
  the report (PDF format)

NOTE

No group work

This assignment is mandatory in order to enroll in the oral exam


Page 28: TensorFlow: neural networks lab

References

References

https://www.tensorflow.org/

http://cs231n.github.io/convolutional-networks/

https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf
