java and deep learning (introduction)
TRANSCRIPT
![Page 1: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/1.jpg)
Java and Deep Learning
(Introduction)
Java Meetup SF Pivotal Labs
January 10, 2018
Oswald Campesato
![Page 2: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/2.jpg)
Session Overview
partial intro/overview of AI/ML/DL
hyper-parameters
a simple neural network
linear regression
cost/activation functions
gradient descent/back propagation
CNNs and RNNs
Java code (CNN and TensorFlow)
![Page 3: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/3.jpg)
The Data/AI Landscape
![Page 4: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/4.jpg)
Gartner 2017: Deep Learning (YES!)
![Page 5: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/5.jpg)
Neural Network with 3 Hidden Layers
![Page 6: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/6.jpg)
The Official Start of AI (1956)
![Page 7: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/7.jpg)
AI/ML/DL: How They Differ
Traditional AI (20th century):
based on collections of rules
Led to expert systems in the 1980s
The era of LISP and Prolog
![Page 8: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/8.jpg)
AI/ML/DL: How They Differ
Machine Learning:
Started in the 1950s (approximate)
Alan Turing and “learning machines”
Data-driven (not rule-based)
Many types of algorithms
Involves optimization
![Page 9: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/9.jpg)
AI/ML/DL: How They Differ
Deep Learning:
Started in the 1950s (approximate)
The “perceptron” (basis of NNs)
Data-driven (not rule-based)
large (even massive) data sets
Involves neural networks (CNNs: ~1970s)
Lots of heuristics
Heavily based on empirical results
![Page 10: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/10.jpg)
The Rise of Deep Learning
Massive and inexpensive computing power
Huge volumes of data/Powerful algorithms
The “big bang” in 2009:
”deep-learning neural networks and NVidia GPUs"
Google Brain used NVidia GPUs (2009)
![Page 11: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/11.jpg)
AI/ML/DL: Commonality
All of them involve a model
A model represents a system
Goal: a good predictive model
The model is based on:
Many rules (for AI)
data and algorithms (for ML)
large sets of data (for DL)
![Page 12: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/12.jpg)
Clustering Example #1
Given some red dots and blue dots
Red dots are in the upper half plane
Blue dots in the lower half plane
How to detect if a point is red or blue?
![Page 13: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/13.jpg)
Clustering Example #1
![Page 14: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/14.jpg)
Clustering Example #1
![Page 15: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/15.jpg)
Clustering Example #2
Given some red dots and blue dots
Red dots are inside a unit square
Blue dots are outside the unit square
How to detect if a point is red or blue?
![Page 16: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/16.jpg)
Clustering Example #2
Two input nodes X and Y
One hidden layer with 4 nodes (one per line)
X & Y weights are the (x,y) values of the inward pointing perpendicular vector of each side
The threshold values are the negative of the y-intercept (or the x-intercept)
The outbound weights are all equal to 1
The threshold for the output node node is 4
![Page 17: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/17.jpg)
Clustering Example #2
![Page 18: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/18.jpg)
Clustering Example #2
![Page 19: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/19.jpg)
Clustering Exercises #1
Describe an NN for a triangle
Describe an NN for a pentagon
Describe an NN for an n-gon (convex)
Describe an NN for an n-gon (non-convex)
![Page 20: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/20.jpg)
Clustering Exercises #2
Create an NN for an OR gate
Create an NN for a NOR gate
Create an NN for an AND gate
Create an NN for a NAND gate
Create an NN for an XOR gate
=> requires TWO hidden layers
![Page 21: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/21.jpg)
Clustering Exercises #3
Convert example #2 to a 3D cube
![Page 22: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/22.jpg)
Clustering Example #2
A few points to keep in mind:
A “step” activation function (0 or 1)
No back propagation
No cost function
=> no learning involved
![Page 23: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/23.jpg)
A 2D Linear Regression Model
Perform the following steps:
1) Start with a simple model (2 variables)
2) Generalize that model (n variables)
3) See how it might apply to a NN
![Page 24: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/24.jpg)
Linear Regression Details
One of the simplest models in ML
Fits a line (y = m*x + b) to data in 2D
Finds best line by minimizing MSE:
m = average of x values (“mean”)
b also has a closed form solution
![Page 25: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/25.jpg)
Linear Regression in 2D: graph
![Page 26: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/26.jpg)
Sample Cost Function #1 (MSE)
![Page 27: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/27.jpg)
Linear Regression: example #1
One feature (independent variable):
X = number of square feet
Predicted value (dependent variable):
Y = cost of a house
A very “coarse grained” model
We can devise a much better model
![Page 28: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/28.jpg)
Linear Regression: example #2
Multiple features:
X1 = # of square feet
X2 = # of bedrooms
X3 = # of bathrooms (dependency?)
X4 = age of house
X5 = cost of nearby houses
X6 = corner lot (or not): Boolean
a much better model (6 features)
![Page 29: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/29.jpg)
Linear Multivariate Analysis
General form of multivariate equation:
Y = w1*x1 + w2*x2 + . . . + wn*xn + b
w1, w2, . . . , wn are numeric values
x1, x2, . . . , xn are variables (features)
Properties of variables:
Can be independent (Naïve Bayes)
weak/strong dependencies can exist
![Page 30: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/30.jpg)
Neural Network with 3 Hidden Layers
![Page 31: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/31.jpg)
Neural Networks: equations
Node “values” in first hidden layer:
N1 = w11*x1+w21*x2+…+wn1*xn
N2 = w12*x1+w22*x2+…+wn2*xn
N3 = w13*x1+w23*x2+…+wn3*xn
. . .
Nn = w1n*x1+w2n*x2+…+wnn*xn
Similar equations for other pairs of layers
![Page 32: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/32.jpg)
Neural Networks: Matrices
From inputs to first hidden layer:
Y1 = W1*X + B1 (X/Y1/B1: vectors; W1: matrix)
From first to second hidden layers:
Y2 = W2*X + B2 (X/Y2/B2: vectors; W2: matrix)
From second to third hidden layers:
Y3 = W3*X + B3 (X/Y3/B3: vectors; W3: matrix)
Apply an “activation function” to y values
![Page 33: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/33.jpg)
Neural Networks (general)
Multiple hidden layers:
Layer composition is your decision
Activation functions: sigmoid, tanh, RELU
https://en.wikipedia.org/wiki/Activation_function
Back propagation (1980s)
https://en.wikipedia.org/wiki/Backpropagation
=> Initial weights: small random numbers
![Page 34: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/34.jpg)
Euler’s Function
![Page 35: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/35.jpg)
The sigmoid Activation Function
![Page 36: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/36.jpg)
The tanh Activation Function
![Page 37: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/37.jpg)
The ReLU Activation Function
![Page 38: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/38.jpg)
The softmax Activation Function
![Page 39: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/39.jpg)
Activation Functions in Python
import numpy as np
...
# Python sigmoid example:
z = 1/(1 + np.exp(-np.dot(W, x)))
...# Python tanh example:
z = np.tanh(np.dot(W,x));
# Python ReLU example:
z = np.maximum(0, np.dot(W, x))
![Page 40: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/40.jpg)
What’s the “Best” Activation Function?
Initially: sigmoid was popular
Then: tanh became popular
Now: RELU is preferred (better results)
Softmax: for FC (fully connected) layers
NB: sigmoid and tanh are used in LSTMs
![Page 41: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/41.jpg)
Even More Activation Functions!
https://stats.stackexchange.com/questions/115258/comprehensive-list-of-activation-functions-in-neural-networks-with-pros-cons
https://medium.com/towards-data-science/activation-functions-and-its-types-which-is-better-a9a5310cc8f
https://medium.com/towards-data-science/multi-layer-neural-networks-with-sigmoid-function-deep-learning-for-rookies-2-bf464f09eb7f
![Page 42: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/42.jpg)
Sample Cost Function #1 (MSE)
![Page 43: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/43.jpg)
Sample Cost Function #2
![Page 44: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/44.jpg)
Sample Cost Function #3
![Page 45: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/45.jpg)
How to Select a Cost Function
1) Depends on the learning type:
=> supervised/unsupervised/RL
2) Depends on the activation function
3) Other factors
Example:
cross-entropy cost function for supervised
learning on multiclass classification
![Page 46: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/46.jpg)
GD versus SGD
SGD (Stochastic Gradient Descent):
+ involves a SUBSET of the dataset
+ aka Minibatch Stochastic Gradient Descent
GD (Gradient Descent):
+ involves the ENTIRE dataset
More details:
http://cs229.stanford.edu/notes/cs229-notes1.pdf
![Page 47: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/47.jpg)
Setting up Data & the Model
Normalize the data:
Subtract the ‘mean’ and divide by stddev
[Central Limit Theorem]
Initial weight values for NNs:
Random numbers in N(0,1)
More details:
http://cs231n.github.io/neural-networks-2/#losses
![Page 48: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/48.jpg)
What are Hyper Parameters?
higher level concepts about the model such as
complexity, or capacity to learn
Cannot be learned directly from the data in the
standard model training process
must be predefined
![Page 49: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/49.jpg)
Hyper Parameters (examples)
# of hidden layers in a neural network
the learning rate (in many models)
the dropout rate
# of leaves or depth of a tree
# of latent factors in a matrix factorization
# of clusters in a k-means clustering
![Page 50: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/50.jpg)
Hyper Parameter: dropout rate
"dropout" refers to dropping out units (both hidden and visible) in a neural network
a regularization technique for reducing overfitting in neural networks
prevents complex co-adaptations on training data
a very efficient way of performing model averaging with neural networks
![Page 51: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/51.jpg)
How Many Layers in a DNN?
Algorithm #1 (from Geoffrey Hinton):
1) add layers until you start overfitting your
training set
2) now add dropout or some another
regularization method
Algorithm #2 (Yoshua Bengio):
"Add layers until the test error does not improve
anymore.”
![Page 52: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/52.jpg)
How Many Hidden Nodes in a DNN?
Based on a relationship between:
# of input and # of output nodes
Amount of training data available
Complexity of the cost function
The training algorithm
![Page 53: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/53.jpg)
CNNs versus RNNs
CNNs (Convolutional NNs):
Good for image processing
2000: CNNs processed 10-20% of all checks
=> Approximately 60% of all NNs
RNNs (Recurrent NNs):
Good for NLP and audio
![Page 54: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/54.jpg)
CNNs: convolution-pooling (1)
![Page 55: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/55.jpg)
CNNs: Convolution Calculations
https://docs.gimp.org/en/plug-in-
convmatrix.html
![Page 56: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/56.jpg)
CNNs: Convolution Matrices (examples)
Sharpen:
Blur:
![Page 57: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/57.jpg)
CNNs: Convolution Matrices (examples)
Edge detect:
Emboss:
![Page 58: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/58.jpg)
CNNs: Sample Convolutions/Filters
![Page 59: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/59.jpg)
CNNs: Max Pooling Example
![Page 60: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/60.jpg)
CNNs: convolution and pooling (2)
![Page 61: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/61.jpg)
Sample CNN in Keras (fragment) from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Flatten, Activation
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.optimizers import Adadelta
input_shape = (3, 32, 32)
nb_classes = 10
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same’,
input_shape=input_shape))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
![Page 62: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/62.jpg)
GANs: Generative Adversarial Networks
![Page 63: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/63.jpg)
GANs: Generative Adversarial Networks
Make imperceptible changes to images
Can consistently defeat all NNs
Can have extremely high error rate
Some images create optical illusions
https://www.quora.com/What-are-the-pros-and-cons-of-using-generative-adversarial-networks-a-type-of-neural-network
![Page 64: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/64.jpg)
GANs: Generative Adversarial Networks
Create your own GANs:
https://www.oreilly.com/learning/generative-adversarial-networks-for-
beginners
https://github.com/jonbruner/generative-adversarial-networks
GANs from MNIST:
http://edwardlib.org/tutorials/gan
![Page 65: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/65.jpg)
GANs: Generative Adversarial Networks
GANs, Graffiti, and Art:
https://thenewstack.io/camouflaged-graffiti-road-signs-can-fool-
machine-learning-models/
GANs and audio:
https://www.technologyreview.com/s/608381/ai-shouldnt-believe-
everything-it-hears
Houdini algorithm: https://arxiv.org/abs/1707.05373
![Page 66: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/66.jpg)
Deep Learning Playground
TF playground home page:
http://playground.tensorflow.org
Demo #1:
https://github.com/tadashi-aikawa/typescript-
playground
Converts playground to TypeScript
![Page 67: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/67.jpg)
Java and DL/ML Frameworks
Deeplearning4j: Pure Java framework for DL
SMILE:
“Statistical Machine Intelligence and Learning Engine”
"outperforms R, Python, Spark, H2O significantly”
https://haifengl.github.io/smile/
Weka (“WAY-kuh”):
https://github.com/Waikato/wekaDeeplearning4j
IBM neuroph:
http://neuroph.sourceforge.net/download.html
![Page 68: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/68.jpg)
Deeplearning4j Library
Open source, distributed library for the JVM
https://deeplearning4j.org/
https://github.com/deeplearning4j/deeplearning4j
Written in Java and Scala (GPU support)
Integrates with Hadoop and Spark
https://deeplearning4j.org/gettingstarted.html
![Page 69: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/69.jpg)
Deeplearning4j Library
Basic set-up steps (command line):
mkdir dl4j-examples
cd dl4j-examples
git clone https://github.com/deeplearning4j/dl4j-
examples
![Page 70: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/70.jpg)
Deeplearning4j Library
Set-up steps for IntelliJ
File > Import Project (or New Project from Existing Sources)
Select the directory with the DL4J examples.
Select Maven build tool in the next window
Check the following two boxes:
1) "Search for projects recursively"
2) "Import Maven projects automatically” (Next)
click on "+" sign (bottom of window) to JDK/SDK
Click through until you reach "Finish"
![Page 71: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/71.jpg)
Smile Framework
Support for many algorithms
classification, regression, clustering
association rule mining, feature selection
manifold learning, multidimensional scaling
genetic algorithm, missing value imputation
efficient nearest neighbor search
![Page 72: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/72.jpg)
Smile Framework
Natural Language Processing:
Tokenizers, stemming, phrase detection
part-of-speech tagging, keyword extraction
named entity recognition, sentiment analysis
relevance ranking, taxomony
![Page 73: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/73.jpg)
Smile Framework
Mathematics and Statistics
linear algebra (LU decomposition)
Cholesk decomposition, QR decomposition
eigenvalue decomposition
singular value decomposition
band matrix, and sparse matrix
tests: t-test, F-test, chi-square test
correlation test (Pearson, Spearman, Kendall)
Kolmogorov-Smirnov test
distributions/random number generators
interpolation, sorting, wavelet, plot
![Page 74: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/74.jpg)
Smile Framework
Random Forest (SMILE Scala API):
val data = read.arff("iris.arff", 4)
val (x, y) = data.unzipInt
val rf = randomForest(x, y)
println(s"OOB error = ${rf.error}")
rf.predict(x(0))
![Page 75: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/75.jpg)
What is TensorFlow?
An open source framework for ML and DL
A “computation” graph
Created by Google (released 11/2015)
Evolved from Google Brain
Linux and Mac OS X support (VM for Windows)
TF home page: https://www.tensorflow.org/
![Page 76: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/76.jpg)
What is TensorFlow?
Support for Python, Java, C++
TPUs available for faster processing
Can be embedded in Python scripts
Installation: pip install tensorflow
TensorFlow cluster:
https://www.tensorflow.org/deploy/distributed
![Page 77: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/77.jpg)
What is a Tensor?
TF tensors are n-dimensional arrays
TF tensors are very similar to numpy ndarrays
scalar number: a zeroth-order tensor
vector: a first-order tensor
matrix: a second-order tensor
3-dimensional array: a 3rd order tensor
https://dzone.com/articles/tensorflow-simplified-examples
![Page 78: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/78.jpg)
TensorFlow: constants (immutable)
import tensorflow as tf # tf-const.py
aconst = tf.constant(3.0)
print(aconst)
# output: Tensor("Const:0", shape=(), dtype=float32)
sess = tf.Session()
print(sess.run(aconst))
# output: 3.0
sess.close()
# => there's a better way…
![Page 79: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/79.jpg)
TensorFlow: constants
import tensorflow as tf # tf-const2.py
aconst = tf.constant(3.0)
print(aconst)
Automatically close “sess”
with tf.Session() as sess:
print(sess.run(aconst))
![Page 80: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/80.jpg)
TensorFlow Arithmetic
import tensorflow as tf # basic1.py
a = tf.add(4, 2)
b = tf.subtract(8, 6)
c = tf.multiply(a, 3)
d = tf.div(a, 6)
with tf.Session() as sess:
print(sess.run(a)) # 6
print(sess.run(b)) # 2
print(sess.run(c)) # 18
print(sess.run(d)) # 1
![Page 81: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/81.jpg)
TensorFlow Arithmetic Methods
import tensorflow as tf #tf-math-ops.py
PI = 3.141592
sess = tf.Session()
print(sess.run(tf.div(12,8)))
print(sess.run(tf.floordiv(20.0,8.0)))
print(sess.run(tf.sin(PI)))
print(sess.run(tf.cos(PI)))
print(sess.run(tf.div(tf.sin(PI/4.), tf.cos(PI/4.))))
![Page 82: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/82.jpg)
TensorFlow Arithmetic Methods
Output from tf-math-ops.py:
1
2.0
6.27833e-07
-1.01.0
![Page 83: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/83.jpg)
TensorFlow: placeholders exampleimport tensorflow as tf # tf-var-multiply.py
a = tf.placeholder("float")
b = tf.placeholder("float")
c = tf.multiply(a,b)
# initialize a and b:
feed_dict = {a:2, b:3}
# multiply a and b:
with tf.Session() as sess:
print(sess.run(c, feed_dict))
![Page 84: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/84.jpg)
TensorFlow fetch/feed_dict
import tensorflow as tf # fetch-feeddict.py
# y = W*x + b: W and x are 1d arrays
W = tf.constant([10,20], name=’W’)
x = tf.placeholder(tf.int32, name='x')
b = tf.placeholder(tf.int32, name='b')
Wx = tf.multiply(W, x, name='Wx')
y = tf.add(Wx, b, name=’y’)
![Page 85: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/85.jpg)
TensorFlow fetch/feed_dict
with tf.Session() as sess:
print("Result 1: Wx = ",
sess.run(Wx, feed_dict={x:[5,10]}))
print("Result 2: y = ",
sess.run(y, feed_dict={x:[5,10], b:[15,25]}))
Result 1: Wx = [50 200]
Result 2: y = [65 225]
![Page 86: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/86.jpg)
TensorFlow Arithmetic Expressions
import tensorflow as tf # tf-save-data.py
x = tf.constant(5,name="x")
y = tf.constant(8,name="y")
z = tf.Variable(2*x+3*y, name="z”)
model = tf.global_variables_initializer()
with tf.Session() as session:
writer = tf.summary.FileWriter(”./tf_logs",session.graph)
session.run(model)
print 'z = ',session.run(z) # => z = 34
# tensorboard –logdir=./tf_logs
![Page 87: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/87.jpg)
TensorFlow Eager Execution
An imperative interface to TF (experimental)
Fast debugging & immediate run-time errors
Eager execution is not included in v1.4 of TF
build TF from source or install the nightly build
pip install tf-nightly # CPU
pip install tf-nightly-gpu #GPU
![Page 88: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/88.jpg)
TensorFlow Eager Execution
integration with Python tools
Supports dynamic models + Python control flow
support for custom and higher-order gradients
Supports most TensorFlow operations
https://research.googleblog.com/2017/10/eager-
execution-imperative-define-by.html
![Page 89: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/89.jpg)
TensorFlow Eager Execution
import tensorflow as tf # tf-eager1.py
import tensorflow.contrib.eager as tfe
tfe.enable_eager_execution()
x = [[2.]]
m = tf.matmul(x, x)
print(m)
![Page 90: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/90.jpg)
Android and Deep Learning
TensorFlow Lite (announced 2017 Google I/O)
A subset of the TensorFlow APIs (which ones?)
Provides “regular” TensorFlow APIs for apps
Does not require Python scripts (?)
![Page 91: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/91.jpg)
Deep Learning and Art
“Convolutional Blending” images:
=> 19-layer Convolutional Neural Network
www.deepart.io
Prisma: Android app with CNN
https://www.fastcodesign.com/90124942/this-google-
engineer-taught-an-algorithm-to-make-train-footage-
and-its-hypnotic
![Page 92: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/92.jpg)
What Do I Learn Next?
PGMs (Probabilistic Graphical Models)
MC (Markov Chains)
MCMC (Markov Chains Monte Carlo)
HMMs (Hidden Markov Models)
RL (Reinforcement Learning)
Hopfield Nets
Neural Turing Machines
Autoencoders
Hypernetworks
Pixel Recurrent Neural Networks
Bayesian Neural Networks
SVMs
![Page 93: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/93.jpg)
About Me: Recent Books
1) HTML5 Canvas and CSS3 Graphics (2013)
2) jQuery, CSS3, and HTML5 for Mobile (2013)
3) HTML5 Pocket Primer (2013)
4) jQuery Pocket Primer (2013)
5) HTML5 Mobile Pocket Primer (2014)
6) D3 Pocket Primer (2015)
7) Python Pocket Primer (2015)
8) SVG Pocket Primer (2016)
9) CSS3 Pocket Primer (2016)
10) Android Pocket Primer (2017)
11) Angular Pocket Primer (2017)
12) Data Cleaning Pocket Primer (2018)
13) RegEx Pocket Primer (2018)
![Page 94: Java and Deep Learning (Introduction)](https://reader033.vdocuments.us/reader033/viewer/2022052201/5aac90377f8b9a435e8b4e35/html5/thumbnails/94.jpg)
About Me: Training
=> Deep Learning. Keras, and TensorFlow:
http://codeavision.io/training/deep-learning-workshop
=> Mobile and TensorFlow Lite
=> R and Deep Learning (Keras and TensorFlow)
=> Android for Beginners