
Page 1

CEE 696 Deep Learning in CEE and Earth Science
HW1/Brief History of Neural Nets/Misc.

9/24/2019

Harry Lee

https://www2.hawaii.edu/~jonghyun/classes/F19/CEE696/schedule.html

Page 2

HW1 Comments
Objective: overfit the data as much as possible

Brief summary: students used

NN architectures with 3 to 5 layers

Activation functions: sigmoid, ReLU, leaky ReLU

32 to 512 neurons per layer

MSE: 1e-6 to 1e-2

# of parameters from 5,000 to 500,000 (2-512-512-512-1; a minimal sketch of this largest configuration follows)
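For reference, here is a minimal sketch of that largest configuration (2-512-512-512-1) in tf.keras, assuming the HW1 setup of 2 input features, 1 regression target, and an MSE loss; the ReLU activation and Adam optimizer are illustrative choices, not the only ones students used.

import tensorflow as tf

# 2-512-512-512-1: 2 inputs, three hidden layers of 512 ReLU units, 1 linear output
model = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation='relu', input_shape=(2,)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1)  # linear output for regression
])
model.compile(optimizer='adam', loss='mse')
model.summary()  # reports roughly 500,000 trainable parameters, consistent with the range above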


Page 3

Optimizers and Epochs

You don't want to stop at 50 epochs; monitoring the learning is important.
SGD vs. adaptive learning rate methods (Adam, RMSprop, and so on)? A sketch of both options follows.
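A minimal sketch of this comparison, reusing the HW1 model and assuming x, y are the training arrays; the epoch count, batch size, and learning rates are illustrative:

import tensorflow as tf
import matplotlib.pyplot as plt

# model, x, y are assumed to be defined as in the HW1 setup.
# Try plain SGD or an adaptive method and compare the resulting loss curves.
opt = tf.keras.optimizers.Adam(learning_rate=1e-3)   # or tf.keras.optimizers.SGD(learning_rate=1e-2)
model.compile(optimizer=opt, loss='mse')

history = model.fit(x, y, epochs=2000, batch_size=32, verbose=0)

# Monitor the learning instead of stopping blindly at a fixed epoch count.
plt.semilogy(history.history['loss'])
plt.xlabel('epoch')
plt.ylabel('training MSE')
plt.show()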

Page 4

Results

Underfitting vs. overfitting: we will come back to this topic later; focus on minimization for now.


Page 5

tf.keras.backend.clear_session() may be helpful; it resets the global Keras state, which is useful when you rebuild models repeatedly in the same notebook session.
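For example (the build_model() call below is a placeholder for your own model-construction code):

import tensorflow as tf

# Reset the global Keras state (layer name counters, old graphs) before
# rebuilding a model in the same notebook session.
tf.keras.backend.clear_session()
model = build_model()  # hypothetical helper standing in for your own construction code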

"I don't like notebooks" talk by Joel Grus: https://docs.google.com/presentation/d/1n2RlMdmv1p25Xy5thJUhkKGvjtV-dkAIsUXP-AL4ffI/edit#slide=id.g362da58057_0_1

Page 6

HW1: Validation
While we overfit the data in HW1, we might still want to monitor the validation error via:

model.fit(x, y, epochs=1, validation_split=0.0, shuffle=True)  # defaults shown
# e.g., validation_split=0.2 holds out 20% of the training data; validation_data=(x_val, y_val) will override validation_split

validation_split : float between 0 and 1.

Fraction of the training data to be used as validation.

The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling. The shuffle argument applies to mini-batch shuffling during training/optimization, not to the validation split; don't confuse the two!

Thus, you have to shuffle your data set yourself before you hand it to tf.keras. Alternatively:

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, shuffle=True)
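A sketch of how the two pieces fit together, assuming model is the HW1 network; the held-out portion is passed explicitly, so it overrides validation_split (the epoch count is illustrative):

history = model.fit(x_train, y_train, epochs=500,
                    validation_data=(x_test, y_test))

# history.history['loss'] and history.history['val_loss'] now track the
# training and validation error at the end of every epoch.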

Page 7

Extension of Vanilla DNN
Before we move on to more advanced topics, what can we do with what we have learned? Do you think our NN model looks simple and limited?

Actually we can do lots of interesting tests with our basic DNN. For example:

Q: How can we learn without outputs/labels, i.e., unsupervised learning?

A: Construct auto-associative NNs, i.e., train on (input, input) pairs (a minimal sketch follows)
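A minimal sketch, assuming x is an (n_samples, n_features) input array; the hidden width and epoch count are illustrative choices:

import tensorflow as tf

n_features = x.shape[1]  # x is assumed to be the input array

# Auto-associative network: the output dimension equals the input dimension,
# and the same array is used as both input and target, so no labels are needed.
autoassoc = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(n_features,)),
    tf.keras.layers.Dense(n_features)
])
autoassoc.compile(optimizer='adam', loss='mse')
autoassoc.fit(x, x, epochs=100)  # train on (input, input) pairs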


Page 8

Autoassociative/self-associative NNs
NNs can reconstruct input images (Input -> model -> Input).

Copy this script to your drive


Page 9

Autoassociative NNs
Copy this script to your drive and run it.

Properties of NNs similar to human memory:

content-addressable/autoassociative memory

pattern recognition with incomplete information: generalization

model performance is relatively insensitive to deleting (even a large number of) neurons: graceful degradation


Page 10

Autoassociative NNs: Autoencoder

By introducing latent variables in the middle of the hidden layers, we can perform feature selection or construct a generative model.
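A sketch of that idea with the Keras functional API, assuming flattened inputs x of dimension n_features; the 2-dimensional latent space and layer widths are illustrative choices:

import tensorflow as tf

n_features = x.shape[1]  # x is assumed to be the (flattened) input array

inputs = tf.keras.Input(shape=(n_features,))
h = tf.keras.layers.Dense(128, activation='relu')(inputs)
z = tf.keras.layers.Dense(2, name='latent')(h)  # latent variables in the middle
h = tf.keras.layers.Dense(128, activation='relu')(z)
outputs = tf.keras.layers.Dense(n_features)(h)

autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer='adam', loss='mse')
autoencoder.fit(x, x, epochs=100)

# The encoder half maps data to the learned features (feature extraction);
# pushing new latent vectors through the decoder half acts as a simple generative model.
encoder = tf.keras.Model(inputs, z)
features = encoder.predict(x)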


Page 11

History of NNs
McCulloch & Pitts, "A Logical Calculus of the Ideas Immanent in Nervous Activity", 1943

Rosenblatt, Perceptron, 1962

Minsky and Papert, 1969: a perceptron cannot learn unless a problem is simple (i.e., linearly separable)

The First NN Winter (1969 ~ 1980): only a few scientists worked on NNs

Progression (1980 ~ 1990): Hopfield Net 1982, Boltzmann Machine 1985, Backpropagation 1986

The Second NN Winter (1993 ~ 2000s): Support Vector Machine 1995, Graphical Models

Progression (2006 - ): Deep Learning, Convolutional Networks, GPUs

So what's next?

Page 12

Neural network zoo diagram, adapted from http://www.asimovinstitute.org/neural-network-zoo/