TRANSCRIPT
CEE 696 Deep Learning in CEE and Earth Science
HW1 / Brief History of Neural Nets / Misc.
9/24/2019
Harry Lee
https://www2.hawaii.edu/~jonghyun/classes/F19/CEE696/schedule.html
HW1 Comments
Objective: overfit the data as much as possible
Brief summary: students used
- NN architectures with 3-5 layers
- Activation functions: sigmoid, ReLU, LeakyReLU
- 32-512 neurons per layer
- MSE: 1E-6 to 1E-2
- # of parameters from 5,000 to 500,000 (2-512-512-512-1)
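A parameter count like the 2-512-512-512-1 figure above can be checked by hand: each dense layer contributes (inputs x outputs) weights plus one bias per output. A short sketch in plain Python (no framework needed):

```python
def dense_param_count(layer_sizes):
    """Total weights + biases for a fully connected net given its layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

print(dense_param_count([2, 512, 512, 512, 1]))  # 527361, i.e., ~500,000
```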
Optimizers and Epochs
You don't want to stop at 50 epochs; monitoring the learning curve is important.
SGD vs. adaptive learning rate methods (Adam, RMSprop, and so on)?
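The SGD-vs-adaptive question can be seen on a toy problem: on an ill-conditioned quadratic, plain SGD crawls along the flat direction while Adam's per-parameter step sizes do not. A minimal numpy sketch (the loss, learning rates, and step counts are illustrative assumptions, not from the homework):

```python
import numpy as np

def grad(w):
    # Gradient of the toy quadratic f(w) = 0.5 * (100*w[0]**2 + w[1]**2):
    # steep along w[0] (curvature 100), flat along w[1] (curvature 1)
    return np.array([100.0 * w[0], 1.0 * w[1]])

def sgd(w, lr=0.01, steps=200):
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

def adam(w, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=200):
    m = np.zeros_like(w)   # first-moment (mean) estimate
    v = np.zeros_like(w)   # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)   # bias correction
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

w_init = np.array([1.0, 1.0])
print(sgd(w_init))    # the flat w[1] direction decays slowly under plain SGD
print(adam(w_init))   # per-parameter step sizes treat both directions alike
```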
Results
Underfitting vs. Overfitting - we will come back to this topic later; focus on minimization for now
tf.keras.backend.clear_session() may be helpful
Joel Grus, "I Don't Like Notebooks": https://docs.google.com/presentation/d/1n2RlMdmv1p25Xy5thJUhkKGvjtV-dkAIsUXP-AL4ffI/edit#slide=id.g362da58057_0_1
HW1: Validation
While we overfit the data in HW1, we might want to monitor validation error by
model.fit(x, y, epochs=1, validation_split=0.2, shuffle=True)  # validation_data=(x_val, y_val) will override validation_split
validation_split: float between 0 and 1. Fraction of the training data to be used as validation. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling.
shuffle is for mini-batch learning/optimization; it does not affect which samples become the validation set. Don't confuse the two!
Thus, you have to shuffle your data set yourself before relying on validation_split in tf.keras. Alternatively:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, shuffle=True)
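Since validation_split takes the *last* samples before shuffling, the manual fix is one shared permutation of inputs and targets. A minimal numpy sketch (the toy x, y arrays and the commented model.fit call are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 100).reshape(-1, 1)   # toy inputs, deliberately sorted
y = np.sin(2 * np.pi * x)                   # toy targets

perm = rng.permutation(len(x))              # ONE permutation for both arrays
x_shuffled, y_shuffled = x[perm], y[perm]   # keeps (x, y) pairs aligned

# Now validation_split no longer grabs only the largest x values:
# model.fit(x_shuffled, y_shuffled, epochs=100, validation_split=0.2)
```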
Extension of Vanilla DNN
Before we move on to more advanced topics, what can we do with what we have learned? Do you think our NN model looks simple and limited?
Actually we can do lots of interesting tests with our basic DNN. For example:
Q: How can we learn without outputs/labels, i.e., unsupervised learning?
A: Construct auto-associative NNs, i.e., train on (input, input) pairs
Autoassociative/Self-Associative NNs
NNs can reconstruct input images (Input -> model -> Input).
Copy this script to your drive
Autoassociative NNs
Copy this script to your drive and run it.
Properties of NNs similar to human memory:
- content-addressable/autoassociative memory
- pattern recognition with incomplete information: generalization
- model performance less affected by deleting (even a large number of) neurons: graceful degradation
Autoassociative NNs: Autoencoder
By introducing latent variables in the middle of the hidden layers, we can perform feature selection or construct a generative model.
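The bottleneck idea can be sketched even without Keras: squeeze the input through a narrower latent layer and train the network to reproduce its own input. A minimal linear-autoencoder sketch in plain numpy (the layer sizes, learning rate, and toy data are illustrative assumptions; a tf.keras version would stack Dense layers the same way):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data: 200 samples in 8-D that actually lie on a 2-D subspace
latent_true = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 8))
X = latent_true @ mixing

# Linear autoencoder: 8 -> 2 (bottleneck latent layer) -> 8
W_enc = rng.normal(scale=0.1, size=(8, 2))
W_dec = rng.normal(scale=0.1, size=(2, 8))

lr = 0.02
for _ in range(1500):
    Z = X @ W_enc                              # latent codes, shape (200, 2)
    X_hat = Z @ W_dec                          # reconstruction, shape (200, 8)
    err = X_hat - X
    grad_dec = Z.T @ err / len(X)              # gradient of 0.5*||err||^2 / n
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

mse = np.mean((X @ W_enc @ W_dec - X) ** 2)
print(mse)  # should end up far below the raw variance of X
```

The 2-unit latent layer Z is exactly the "feature" representation: since the toy data truly has two degrees of freedom, the bottleneck can recover them.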
History of NNs
McCulloch & Pitts, "A Logical Calculus of the Ideas Immanent in Nervous Activity", 1943
Rosenblatt, Perceptron, 1962
Minsky and Papert, 1969: a perceptron cannot learn unless a problem is simple (i.e., linearly separable)
The First NN Winter (1969-1980): only a few scientists worked on NNs
Progression (1980-1990): Hopfield Net 1982, Boltzmann Machine 1985, Backpropagation 1986
The Second NN Winter (1993-2000s): Support Vector Machines 1995, Graphical Models
Progression (2006- ): Deep Learning, Convolutional Networks, GPUs
So what's next?
adapted from http://www.asimovinstitute.org/neural-network-zoo/