lecture 3: introduction to neural networks with...
TRANSCRIPT
UNIVERSITY OF JYVÄSKYLÄ
Lecture 3: Introduction toNeural Networks with TensorFlow
(Classification)
by:Dr. Oleksiy Khriyenko
IT FacultyUniversity of Jyväskylä
TIES4911 Deep-Learning for Cognitive Computing for DevelopersSpring 2020
UNIVERSITY OF JYVÄSKYLÄ
I am grateful to all the creators/owners of the images that I found from Google and haveused in this presentation.
2
Acknowledgement
16/01/2020 TIES4911 – Lecture 3
UNIVERSITY OF JYVÄSKYLÄ
3
Classification
Classification – is the process of categorizinga group of objects…
Existing classifiers are:o Logistic Regression,o Support Vector Machines (SVM),o Decision Tree,o Naive Bayes,o Neural Networks,o …
In comparison to other classifiers, when the patterns get reallycomplex, neural nets start to outperform all of their competition.The main advantage of DNN is automated feature extraction(those that a not specified/set explicitly).
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
4
Relevant links:https://www.edureka.co/blog/perceptron-learning-algorithm/
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
5
Logistic Regression
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
6
scikit-learn - Machine Learning in Pythonhttps://scikit-learn.org/stable/Dataset:https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/datasets/Social_Network_Ads.csv
https://github.com/Avik-Jain/100-Days-Of-ML-Code/blob/master/Code/Day%206%20Logistic%20Regression.md
import numpy as npimport matplotlib.pyplot as pltimport pandas as pd# from sklearn.cross_validation import train_test_split ### deprecated in new sklearn versionfrom sklearn.model_selection import train_test_splitfrom sklearn.preprocessing import StandardScalerfrom sklearn.linear_model import LogisticRegressionfrom sklearn.metrics import confusion_matrix
# Data Preprocessingdataset = pd.read_csv('Social_Network_Ads.csv')X = dataset.iloc[:, [2, 3]].valuesY = dataset.iloc[:, 4].valuesX_train, X_test, Y_train, Y_test = train_test_split( X, Y,
test_size = 1/4, random_state = 0)sc = StandardScaler()X_train = sc.fit_transform(X_train)X_test = sc.transform(X_test)
# Fitting Logistic Regression Model to the training setclassifier = LogisticRegression()classifier.fit(X_train, Y_train)# Predecting the ResultY_pred = classifier.predict(X_test)
#Evaluationcm = confusion_matrix(y_test, y_pred)
scikit-learn: Logistic Regression
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
7
Linear Classifierbinary logistic regressionPerceptron as a Linear Classifier
https://www.edureka.co/blog/perceptron-learning-algorithm/
#import required libraryimport tensorflow as tf#input1, input2 and biastrain_in = [[1., 1.,1], [1., 0,1], [0, 1.,1], [0, 0,1]]#outputtrain_out = [[1.], [0], [0], [0]]#weight variable initialized with random values using random_normal()w = tf.Variable(tf.random_normal([3, 1], seed=12))#Placeholder for input and outputx = tf.placeholder(tf.float32,[None,3])y = tf.placeholder(tf.float32,[None,1])#calculate outputoutput = tf.nn.relu(tf.matmul(x, w))#Mean Squared Loss or Errorloss = tf.reduce_sum(tf.square(output - y))#Minimize loss using GradientDescentOptimizer with a learning rate of 0.01optimizer = tf.train.GradientDescentOptimizer(0.01)train = optimizer.minimize(loss)#Initialize all the global variablesinit = tf.global_variables_initializer()sess = tf.Session()sess.run(init)print("w (initial): ", w.eval(session=sess))#Compute output and cost w.r.t to input vectorfor i in range(1000):
sess.run(train, {x:train_in,y:train_out})cost = sess.run(loss,feed_dict={x:train_in,y:train_out})print('Epoch--',i,'--loss--',cost)print("w (trained): ", w.eval(session=sess))output_pred = tf.nn.relu(tf.matmul(train_in, w))print("output_pred: ", output_pred.eval(session=sess))
tf.sigmoid(x, name=None)tf.nn.relu(x, name=None)tf.tanh(x, name=None)
a*x1 + b*x2 = c
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
8
Classification on MNIST dataset
Each image is a 28x28 array, flattened out tobe a 1-d tensor of size 784MNIST.train: 55,000 examplesMNIST.validation: 5,000 examplesMNIST.test: 10,000 examples
MNIST Dataset: https://en.wikipedia.org/wiki/MNIST_database; http://yann.lecun.com/exdb/mnist/MNIST Dataset in csv format: https://pjreddie.com/projects/mnist-in-csv/
Model: take a weighted sum of the features and add a bias to get the logit. Convert the logitto a probability via the logistic-sigmoid function.
= ∗ += ( )
= _ ( , ) , where Y is a one-hot vector
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
9
Softmax and Cross Entropy
https://www.ritchieng.com/machine-learning/deep-learning/neural-nets/https://www.youtube.com/watch?v=vq2nnJ4g6N0https://www.youtube.com/watch?v=tRsSi_sqXjI
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
10
A single-layer neural network based classification of handwritten digits form MNIST dataset
import tensorflow as tffrom tensorflow.examples.tutorials.mnist import input_data
#Import datamnist = input_data.read_data_sets('MNIST_data', one_hot=True)
# Create the modelx = tf.placeholder(tf.float32, [None, 784])W = tf.Variable(tf.zeros([784, 10]))b = tf.Variable(tf.zeros([10]))y = tf.nn.softmax( tf.matmul(x, W) + b)
# Define loss and optimizery_ = tf.placeholder(tf.float32, [None, 10])cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=y))train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)sess = tf.InteractiveSession()tf.global_variables_initializer().run()
# Trainfor _ in range(1000):
batch_xs, batch_ys = mnist.train.next_batch(100)sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
# Test trained modelcorrect_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))print(’Accuracy: ’, sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
MNIST Dataset: https://en.wikipedia.org/wiki/MNIST_database; http://yann.lecun.com/exdb/mnist/MNIST Dataset in csv format: https://pjreddie.com/projects/mnist-in-csv/
Linear Classifier
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
Let’s go Deeper!!!11
Classification
Linear model cannot serve non linear classification problems…
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
12
Neural NetworksHigh dimensionality is a one of thebottlenecks for Machine Learning.Handling and processing highdimensional data (where input &output is quite large) becomes verycomplex and resource exhaustive.
Relevant links:https://www.edureka.co/blog/deep-learning-tutorial#deep-learning-use-case
Deep learning is capable of handling thehigh dimensional data being efficient infocusing on the right features on its own(feature extraction process).
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
Relevant links:http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/https://www.youtube.com/watch?v=BR9h47Jtqyw
13
Neural NetworksNeural Network change an original representation of data into the mostappropriate representation to solve the problems …
Play with ConvnetJS demo: toy 2d classification with 2-layer neural networkhttp://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html
TensorFlow Playground - neural network right in your browser: http://playground.tensorflow.org/https://cloud.google.com/blog/big-data/2016/07/understanding-neural-networks-with-tensorflow-playground
ConvNetJS (by Andrej Karpathy) is a Javascriptlibrary for training Deep Learning models (NeuralNetworks) entirely in your browser.http://cs.stanford.edu/people/karpathy/convnetjs/index.html
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
14
Neural NetworksOne of the most exciting features of deep learning is its performance in feature learning;the algorithms perform particularly well in being able to detect features from raw data.
Relevant links:https://udarajay.com/applied-machine-learning-the-less-confusing-guide/https://www.youtube.com/watch?v=aircAruvnKk
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
15
Neural NetworksForward propagation is a neural net'sway of classifying a set of inputs (wheninput data goes through the nodes andtheir activation functions and finally formconfidence levels for the nodes of outputlayer).
To train the net, the output from forward prop iscompared to the output that is known to be correct,and the cost is the difference of the two. The point oftraining is to make that cost as small as possible,across millions of training examples. To do this, thenet tweaks the weights and biases step by step untilthe prediction closely matches the correct output.
Relevant links:https://www.edureka.co/blog/backpropagation/https://en.wikipedia.org/wiki/Backpropagationhttps://www.youtube.com/watch?v=IHZwWFHWa-whttps://www.youtube.com/watch?v=Ilg3gGewQ5U
One way to train the model is called as Backpropagation -algorithm that looks for the minimum value of the error function inweight space using a technique called the delta rule or gradientdescent. The weights that minimize the error function is thenconsidered to be a solution to the learning problem.
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
16
Multi Layer Perceptron (MLP) Classification of handwritten digits form MNIST dataset
import tensorflow as tfimport matplotlib.pyplot as pltimport numpy as np
# Import MNIST datafrom tensorflow.examples.tutorials.mnist import input_datamnist = input_data.read_data_sets("MNIST_data", one_hot=True)print('Test shape:',mnist.test.images.shape)print('Train shape:',mnist.train.images.shape)
# Parameterslearning_rate = 0.001training_epochs = 5batch_size = 100display_step = 1
# Network Parametersn_hidden_1 = 256 # 1st layer number of featuresn_hidden_2 = 256 # 2nd layer number of featuresn_input = 784 # MNIST data input (img shape: 28*28)n_classes = 10 # MNIST total classes (0-9 digits)
x = tf.placeholder("float", [None, n_input])y = tf.placeholder("float", [None, n_classes])
Non Linear Classification with DNN
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
17
Multi Layer Perceptron (MLP) Classification of handwritten digits form MNIST dataset
# Create modeldef multilayer_perceptron(x, weights, biases):
# Use tf.matmul instead of "*" because tf.matmul can change it's dimensions on the fly (broadcast)print( 'x:', x.get_shape(), 'W1:', weights['h1'].get_shape(), 'b1:', biases['b1'].get_shape())# Hidden layer with RELU activationlayer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1']) #(x*weights['h1']) + biases['b1']layer_1 = tf.nn.relu(layer_1)# Hidden layer with RELU activationprint( 'layer_1:', layer_1.get_shape(), 'W2:', weights['h2'].get_shape(), 'b2:', biases['b2'].get_shape())layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2']) # (layer_1 * weights['h2']) + biases['b2']layer_2 = tf.nn.relu(layer_2)# Output layer with linear activationprint( 'layer_2:', layer_2.get_shape(), 'W3:', weights['out'].get_shape(), 'b3:', biases['out'].get_shape())out_layer = tf.matmul(layer_2, weights['out']) + biases['out'] # (layer_2 * weights['out']) + biases['out']print('out_layer:',out_layer.get_shape())
return out_layer
# Store layers weight & biasweights = {
'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])), #784x256'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])), #256x256'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes])) #256x10
}biases = {
'b1': tf.Variable(tf.random_normal([n_hidden_1])), #256x1'b2': tf.Variable(tf.random_normal([n_hidden_2])), #256x1'out': tf.Variable(tf.random_normal([n_classes])) #10x1
}
Non Linear Classification with DNN
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
18
Multi Layer Perceptron (MLP) Classification of handwritten digits form MNIST dataset
# Construct modelpred = multilayer_perceptron(x, weights, biases)# Cross entropy loss functioncost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=pred))# On this case we choose the AdamOptimizeroptimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)# Initializing the variablesinit = tf.global_variables_initializer()# Launch the graphwith tf.Session() as sess:
sess.run(init)
# Training cyclefor epoch in range(training_epochs):
avg_cost = 0.total_batch = int(mnist.train.num_examples/batch_size)# Loop over all batchesfor i in range(total_batch):
batch_x, batch_y = mnist.train.next_batch(batch_size)# Run optimization op (backprop) and cost op (to get loss value)_, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})# Compute average lossavg_cost += c / total_batch
# Display logs per epoch stepif epoch % display_step == 0:
print ("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))print("Optimization Finished!")
Non Linear Classification with DNN
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
19
Multi Layer Perceptron (MLP) Classification of handwritten digits form MNIST dataset
# Test modelcorrect_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))# Calculate accuracyaccuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))# To keep sizes compatible with modelprint ("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
print("Test sample:")#Get 28x28 imagesample_1 = mnist.test.images[47].reshape(28,28)# Get corresponding integer label from one-hot encoded datasample_label_1 = np.where(mnist.test.labels[47] == 1)[0][0]# Plot sampleplt.imshow(sample_1, cmap='Greys')plt.title('label = {}'.format(sample_label_1))plt.show()x_test = mnist.test.images[47].reshape(1,784)predicted_value = tf.argmax(pred, 1)print("is classified as: {}".format(predicted_value.eval(session=sess, feed_dict={x: x_test})))
Fashion-MNIST: a Novel Image Dataset for Benchmarking MachineLearning Algorithms. Han Xiao, Kashif Rasul, Roland Vollgrafhttps://arxiv.org/abs/1708.07747https://github.com/zalandoresearch/fashion-mnisthttps://medium.com/tensorist/classifying-fashion-articles-using-tensorflow-fashion-mnist-f22e8a04728a
Non Linear Classification with DNN
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
20
Iris flower datasetIris flower dataset (Ronald Fisher's Iris data set ): https://en.wikipedia.org/wiki/Iris_flower_data_setThe dataset contains a set of 50 records under 5 attributes - Petal Length , Petal Width , Sepal Length , Sepal Width andClass. The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor).https://gist.github.com/curran/a08a1080b88344b0c8a7
Relevant links:https://www.edureka.co/blog/deep-learning-tutorial#deep-learning-use-case
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
21
ClassificationNeural Network Classifier with TensorFlow Estimatorhttps://www.tensorflow.org/api_docs/python/tf/estimator
import osfrom six.moves.urllib.request import urlopenimport numpy as npimport tensorflow as tf# Data setsIRIS_TRAINING = "iris_training.csv"IRIS_TRAINING_URL =
"http://download.tensorflow.org/data/iris_training.csv"IRIS_TEST = "iris_test.csv"IRIS_TEST_URL = "http://download.tensorflow.org/data/iris_test.csv"def main():
# If the training and test sets aren't stored locally, download them.if not os.path.exists(IRIS_TRAINING):
raw = urlopen(IRIS_TRAINING_URL).read()with open(IRIS_TRAINING, "wb") as f: f.write(raw)
if not os.path.exists(IRIS_TEST):raw = urlopen(IRIS_TEST_URL).read()with open(IRIS_TEST, "wb") as f: f.write(raw)
# Load datasets.training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
filename=IRIS_TRAINING, target_dtype=np.int,features_dtype=np.float32)
test_set = tf.contrib.learn.datasets.base.load_csv_with_header(filename=IRIS_TEST, target_dtype=np.int, features_dtype=np.float32)
# Specify that all features have real-value datafeature_columns = [tf.feature_column.numeric_column("x", shape=[4])]# Build 3 layer DNN with 10, 20, 10 units respectively.classifier = tf.estimator.DNNClassifier(feature_columns=feature_columns,
hidden_units=[10, 20, 10], n_classes=3, model_dir="")
# Define the training inputstrain_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": np.array(training_set.data)},y=np.array(training_set.target), num_epochs=None,shuffle=True)
# Train model.classifier.train(input_fn=train_input_fn, steps=2000)# Define the test inputstest_input_fn = tf.estimator.inputs.numpy_input_fn(
x={"x": np.array(test_set.data)},y=np.array(test_set.target), num_epochs=1, shuffle=False)
# Evaluate accuracy.accuracy_score =
classifier.evaluate(input_fn=test_input_fn)["accuracy"]print("\nTest Accuracy: {0:f}\n".format(accuracy_score))# Classify two new flower samples.new_samples = np.array(
[[6.4, 3.2, 4.5, 1.5],[5.8, 3.1, 5.0, 1.7]], dtype=np.float32)
predict_input_fn = tf.estimator.inputs.numpy_input_fn(x={"x": new_samples}, num_epochs=1, shuffle=False)
predictions = list(classifier.predict(input_fn=predict_input_fn))predicted_classes = [p["classes"][0] for p in predictions]print("New Samples, Class Predictions: {}\n”
.format(predicted_classes))if __name__ == "__main__":
main()
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
22
OverfittingOverfitting is "the production of ananalysis that corresponds too closely orexactly to a particular set of data, and maytherefore fail to fit additional data or predictfuture observations reliably“
Relevant links:https://en.wikipedia.org/wiki/Overfitting
The most popular regularization technique for deep neural networks is Dropout. It isused to prevent overfitting via temporarily “dropping”/disabling a neuron withprobability p (a hyperparameter called the dropout-rate that is typically a numberaround 0.5) at each iteration during the training phase.
§ The dropped-out neurons are resampled with probability p at every training step (adropped out neuron at one step can be active at the next one). (Dropout is not appliedduring test time after the network is trained).
§ Dropout can be applied to input or hidden layer nodes but not the output nodes.§ Applying dropout regularization, it is recommended to increase a number of
training rounds.
Reason. Dropout forces every neuron to be able to operate independently andprevents the network to be too dependent on a small number of neurons.
We may face a case, when training loss keeps going down but the validation lossstarts increasing after certain epoch. It is a sign of overfitting. It tells us that ourmodel is memorizing the training data, but it’s failing to generalize to new instances.
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
23
Multi Layer Perceptron (MLP) Classification on MNIST dataset with Keras# imports for array-handlingimport numpy as np# let's keep our keras backend tensorflow quietimport osos.environ['TF_CPP_MIN_LOG_LEVEL']='3'# for testing on CPU#os.environ['CUDA_VISIBLE_DEVICES'] = ''
# keras imports for the dataset and building our neural networkfrom keras.datasets import mnistfrom keras.models import Sequential, load_modelfrom keras.layers.core import Dense, Dropout, Activationfrom keras.utils import np_utils
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# let's print the shape before we reshape and normalizeprint("X_train shape", X_train.shape)print("y_train shape", y_train.shape)print("X_test shape", X_test.shape)print("y_test shape", y_test.shape)
# building the input vector from the 28x28 pixelsX_train = X_train.reshape(60000, 784)X_test = X_test.reshape(10000, 784)X_train = X_train.astype('float32')X_test = X_test.astype('float32')
# normalizing the data to help with the trainingX_train /= 255X_test /= 255
# print the final input shape ready for trainingprint("Train matrix shape", X_train.shape)print("Test matrix shape", X_test.shape)print(np.unique(y_train, return_counts=True))
# one-hot encoding using keras' numpy-related utilitiesn_classes = 10print("Shape before one-hot encoding: ", y_train.shape)Y_train = np_utils.to_categorical(y_train, n_classes)Y_test = np_utils.to_categorical(y_test, n_classes)print("Shape after one-hot encoding: ", Y_train.shape)
# building a linear stack of layers with the sequential modelmodel = Sequential()model.add(Dense(512, input_shape=(784,)))model.add(Activation('relu'))model.add(Dropout(0.2))model.add(Dense(512))model.add(Activation('relu'))model.add(Dropout(0.2))model.add(Dense(10))model.add(Activation('softmax'))
# compiling the sequential modelmodel.compile(loss='categorical_crossentropy',metrics=['accuracy'], optimizer='adam')
Non Linear Classification with DNN
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
Relevant links:https://nextjournal.com/a/17592186058848https://cambridgespark.com/content/tutorials/deep-learning-for-complete-beginners-recognising-handwritten-digits/index.html
24
Multi Layer Perceptron (MLP) Classification on MNIST dataset with Keras# training the model and saving metrics in historyhistory = model.fit(X_train, Y_train, batch_size=128, epochs=8, verbose=2, validation_data=(X_test, Y_test))
# saving the modelsave_dir = "results/"model_name = 'keras_mnist.h5'model_path = os.path.join(save_dir, model_name)model.save(model_path)print('Saved trained model at %s ' % model_path)
#evaluation and predictionmnist_model = load_model('results/keras_mnist.h5')loss_and_metrics = mnist_model.evaluate(X_test, Y_test, verbose=2)
print("Test Loss", loss_and_metrics[0])print("Test Accuracy", loss_and_metrics[1])
# load the model and create predictions on the test setmnist_model = load_model('results/keras_mnist.h5')predicted_classes = mnist_model.predict_classes(X_test)
# see which we predicted correctly and which notcorrect_indices = np.nonzero(predicted_classes == y_test)[0]incorrect_indices = np.nonzero(predicted_classes != y_test)[0]print()print(len(correct_indices)," classified correctly")print(len(incorrect_indices)," classified incorrectly")
Keras
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
x_train = x_train.reshape(60000, 784)x_test = x_test.reshape(10000, 784)x_train = x_train.astype('float32')x_test = x_test.astype('float32')x_train /= 255x_test /= 255print(x_train.shape[0], 'train samples')print(x_test.shape[0], 'test samples')
# convert class vectors to binary class matricesy_train = keras.utils.to_categorical(y_train, num_classes)y_test = keras.utils.to_categorical(y_test, num_classes)
def train_model(model):history = model.fit(x_train, y_train,
batch_size=batch_size,epochs=epochs,verbose=1,validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)print('Test loss:', score[0])print('Test accuracy:', score[1])
import kerasfrom keras.datasets import fashion_mnistfrom keras.models import Sequentialfrom keras.layers import Dense, Dropoutfrom keras.optimizers import RMSpropfrom keras.optimizers import SGDimport matplotlib.pyplot as plt%matplotlib inline
batch_size = 128num_classes = 10epochs = 20# the data, shuffled and split between train and test sets(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
print (y_train[1000])fig= plt.figure()plt.imshow(x_train[1000], cmap='gray')print (y_train[4000])fig= plt.figure()plt.imshow(x_train[4000], cmap='gray')
fig= plt.figure(figsize=(20,10))for i in range (1,41):
ax1 = fig.add_subplot(5,8,i)plt.xticks([], [])plt.yticks([], [])ax1.imshow(x_train[i], cmap='gray')
25
Different model performance comparison with Fashion-MNIST dataset…https://colab.research.google.com/github/rnditdev/PyData_London_2018_Computer_Vision/blob/master/1_History_of_neural_models.ipynb#scrollTo=3I-cTVJFp0KA
Keras
Label Description§ 0 T-shirt§ 1 Trouser§ 2 Pullover§ 3 Dress§ 4 Coat§ 5 Sandal§ 6 Shirt§ 7 Sneaker§ 8 Bag§ 9 Ankle boot
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
# Single layer with 1 unitmodel0 = Sequential()model0.add(Dense(1, activation='sigmoid', input_shape=(784,)))model0.add(Dense(num_classes, activation='softmax'))model0.summary()model0.compile(loss='categorical_crossentropy', optimizer=SGD(), metrics=['accuracy'])train_model(model0)
26
Different model performance comparison with Fashion-MNIST dataset…
Keras
…Epoch 20/2060000/60000 [==============================] - 1s 16us/step - loss: 1.8718 - acc: 0.2270 - val_loss: 1.8672 - val_acc: 0.2349Test loss: 1.8672341890335082Test accuracy: 0.2349
# Single layer with 10 unitsmodel1 = Sequential()model1.add(Dense(10, activation='sigmoid', input_shape=(784,)))model1.add(Dense(num_classes, activation='softmax'))model1.summary()model1.compile(loss='categorical_crossentropy', optimizer=SGD(), metrics=['accuracy'])train_model(model1)
…Epoch 20/2060000/60000 [==============================] - 1s 21us/step - loss: 0.7420 - acc: 0.7645 - val_loss: 0.7515 - val_acc: 0.7540Test loss: 0.7515270534515381Test accuracy: 0.754
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
# Single Layer with more unitsmodel2 = Sequential()model2.add(Dense(512, activation='sigmoid', input_shape=(784,)))model2.add(Dense(num_classes, activation='softmax'))model2.summary()model2.compile(loss='categorical_crossentropy', optimizer=RMSprop(), metrics=['accuracy'])train_model(model2)
27
Different model performance comparison with Fashion-MNIST dataset…
Keras
…Epoch 20/2060000/60000 [==============================] - 5s 89us/step - loss: 0.2059 - acc: 0.9236 - val_loss: 0.3467 - val_acc: 0.8755Test loss: 0.3466758099913597Test accuracy: 0.8755
# Multiple Layers Without dropoutmodel3 = Sequential()model3.add(Dense(512, activation='relu', input_shape=(784,)))model3.add(Dense(512, activation='relu'))model3.add(Dense(num_classes, activation='softmax'))model3.summary()model3.compile(loss='categorical_crossentropy', optimizer=RMSprop(), metrics=['accuracy'])train_model(model3)
…Epoch 20/2060000/60000 [==============================] - 8s 140us/step - loss: 0.2013 - acc: 0.9270 - val_loss: 0.4674 - val_acc: 0.8827Test loss: 0.4673706584453583Test accuracy: 0.8827
TIES4911 – Lecture 316/01/2020
UNIVERSITY OF JYVÄSKYLÄ
# Multiple Layers With dropoutmodel4 = Sequential()model4.add(Dense(512, activation='relu', input_shape=(784,)))model4.add(Dropout(0.2))model4.add(Dense(512, activation='relu'))model4.add(Dropout(0.2))model4.add(Dense(num_classes, activation='softmax'))model4.summary()model4.compile(loss='categorical_crossentropy', optimizer=RMSprop(), metrics=['accuracy'])train_model(model4)
28
Different model performance comparison with Fashion-MNIST dataset…
Keras
…Epoch 20/2060000/60000 [==============================] - 9s 153us/step - loss: 0.2756 - acc: 0.9055 - val_loss: 0.4045 - val_acc: 0.8869Test loss: 0.40449330792427063Test accuracy: 0.8869
TIES4911 – Lecture 316/01/2020
§ Since TensorFlow 2.x includes Keras, there is no need to install it separately. Simply access if as:import tensorflow as tftf.keras. …
§ Difference between 'categorical_crossentropy‘ and 'sparse_categorical_crossentropy‘ losses:If use 'sparse_categorical_crossentropy‘, you do not need to transform the logist into categorical (via one hotrepresentation).