CSCE 636 Neural Networks (Deep Learning) Lecture 2: Mathematical Building Blocks of Neural Networks Anxiao (Andrew) Jiang

Post on 16-Oct-2020

Page 1: CSCE 636 Neural Networks (Deep Learning)What a neural network does: learn a function x Neural Network value of f(x) The neural network learns the function f(x), either exactly or approximately

CSCE 636 Neural Networks (Deep Learning)

Lecture 2: Mathematical Building Blocks of Neural Networks

Anxiao (Andrew) Jiang

Chapter 2

Before we begin: the mathematical building blocks of neural networks

What a neural network does: learn a function

x -> Neural Network -> value of f(x)

The neural network learns the function f(x), either exactly or approximately.
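To make "learning a function approximately" concrete, here is a minimal sketch. The target function f(x) = 2x + 1, the single linear neuron, and the plain gradient-descent loop are all assumptions chosen for illustration; they are not taken from the slides.

```python
import random

def f(x):
    # Hypothetical target function, chosen only for this illustration.
    return 2.0 * x + 1.0

random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(200)]
ys = [f(x) for x in xs]

# A "network" with one linear neuron: prediction = w * x + b.
w, b = 0.0, 0.0
learning_rate = 0.1  # assumed value

for step in range(500):
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2.0 * ((w * x + b) - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2.0 * ((w * x + b) - y) for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print("learned w, b:", w, b)
```

The learned w and b should end up close to 2 and 1, so the model approximates f from examples of (x, f(x)) alone.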

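Application: Handwritten Digit Recognition

[Figure: an image of a handwritten digit is fed to the neural network, which outputs the label 4.]

Task: Classify grayscale images of handwritten digits (28x28 pixels) into their 10 categories (0 through 9).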

How to start?

Step 1: Load the dataset

MNIST dataset: 60,000 training images and 10,000 test images, along with their labels.

Step 1: Load the dataset

train_images: 60,000 x 28 x 28 array, where each element (pixel) is an integer in [0, 255]
train_labels: vector of length 60,000, where each element (label) is an integer in [0, 9]

test_images: 10,000 x 28 x 28 array, where each element (pixel) is an integer in [0, 255]
test_labels: vector of length 10,000, where each element (label) is an integer in [0, 9]

Rule: training data and test data are disjoint. Only use training data to train the neural network!

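Step 2: Build neural network architecture

[Figure: a 28x28 2-d array (an image of the digit 4) supplies 784 input values, indexed 0 to 28x28-1 = 783, to a layer of 512 neurons, indexed 0 to 511.]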

What is a neuron

Activation function (many possible forms)

ReLU (most popular activation function)
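A minimal sketch of a single neuron, assuming the usual form: a weighted sum of the inputs plus a bias, passed through the activation function. The names `neuron`, `weights`, and `bias` and the numbers used are illustrative assumptions.

```python
def relu(z):
    # ReLU: keep positive values, replace negative values with 0.
    return max(0.0, z)

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias, followed by the activation.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return relu(z)

# Weighted sum is positive, so ReLU passes it through unchanged.
out_pos = neuron([1.0, 2.0], [0.5, 0.25], 0.1)
# Weighted sum is negative, so ReLU outputs 0.
out_neg = neuron([1.0, 2.0], [-1.0, -1.0], 0.5)

print(out_pos, out_neg)
```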

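Step 2: Build neural network architecture

[Figure: a 28x28 2-d array (an image of the digit 4) supplies 784 input values, indexed 0 to 28x28-1 = 783, to a layer of 512 neurons (indexed 0 to 511), followed by a layer of 10 neurons (indexed 0 to 9).]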

Softmax (popular activation function for the last layer of a classification network)
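A minimal sketch of softmax, which turns a list of raw scores into probabilities that are positive and sum to 1. Subtracting the maximum score before exponentiating is a common numerical-stability choice added here, and the example scores are made up.

```python
import math

def softmax(scores):
    # Subtract the maximum so that exp() never overflows; the result is unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

The largest score receives the largest probability, which is why the output neuron with the highest value is read as the predicted label.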

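Step 2: Build neural network architecture

[Figure: a 28x28 2-d array (an image of the digit 4) supplies 784 input values, indexed 0 to 28x28-1 = 783, to a layer of 512 neurons (indexed 0 to 511), followed by a layer of 10 neurons (indexed 0 to 9). The 10 output neurons give the probability of label “0”, label “1”, label “2”, ..., label “9”.]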

Step 3: Choose loss function, optimizer, and target metrics

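Categorical cross-entropy (a popular loss function for multi-class classification)

$$L = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{C} y_{i,j} \log p_{i,j}$$

N: number of samples
C: number of classes
y_{i,j}: true probability (1 or 0) that input i belongs to class j
p_{i,j}: probability predicted by the neural network that input i belongs to class j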
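A minimal sketch of categorical cross-entropy for a small batch. Averaging over the samples and clipping probabilities at a tiny `eps` to avoid log(0) are assumptions of this sketch, and the example values are made up.

```python
import math

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    # y_true: one row of true probabilities (1 or 0) per sample.
    # y_pred: one row of predicted probabilities per sample.
    total = 0.0
    for true_row, pred_row in zip(y_true, y_pred):
        for y, p in zip(true_row, pred_row):
            total -= y * math.log(max(p, eps))
    # Average over the number of samples (an assumption of this sketch).
    return total / len(y_true)

y_true = [[1, 0, 0], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
loss = categorical_cross_entropy(y_true, y_pred)
print(loss)
```

Only the predicted probability of the true class contributes to each sample's term, so the loss shrinks as that probability approaches 1.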

RMSProp (a popular optimizer, details to be introduced later)

Accuracy: fraction of times that the neural network makes correct predictions
• If we care about accuracy, why do we optimize categorical cross-entropy during training?
• Answer: the loss function needs to be differentiable. (And the loss function is closely related to the target metric: minimizing the loss function is, approximately or precisely, optimizing the target metric.)
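A small sketch of why accuracy alone is a poor training signal: two sets of predictions can have identical accuracy while their cross-entropy differs, so only the loss shows which one is better. The two-class predictions below are made up for illustration.

```python
import math

def accuracy(labels, probs):
    # A prediction is correct when the most probable class equals the label.
    correct = sum(1 for label, p in zip(labels, probs) if p.index(max(p)) == label)
    return correct / len(labels)

def mean_cross_entropy(labels, probs):
    # Average of -log(probability assigned to the true class).
    return -sum(math.log(p[label]) for label, p in zip(labels, probs)) / len(labels)

labels = [0, 1]
hesitant = [[0.51, 0.49], [0.49, 0.51]]    # barely correct
confident = [[0.99, 0.01], [0.01, 0.99]]   # clearly correct

acc_hesitant = accuracy(labels, hesitant)
acc_confident = accuracy(labels, confident)
loss_hesitant = mean_cross_entropy(labels, hesitant)
loss_confident = mean_cross_entropy(labels, confident)
print(acc_hesitant, acc_confident, loss_hesitant, loss_confident)
```

Accuracy is the same for both, so it gives no direction for improving the hesitant predictions; the cross-entropy changes smoothly with the probabilities and does.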

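[Figure: a 28x28 2-d array (an image of the digit 4) supplies 784 input values, indexed 0 to 28x28-1 = 783, to a layer of 512 neurons (indexed 0 to 511), followed by a layer of 10 neurons (indexed 0 to 9). The 10 output neurons give the probability of label “0”, label “1”, label “2”, ..., label “9”.]

The “Teacher”:

Loss function: categorical cross-entropy
Optimizer: RMSProp
Target metric: Accuracy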

Step 4: Prepare training and test data
Here: Reshape and normalize input training data

train_images:
Originally: 3-dimensional array of size 60000 x 28 x 28, where each element is an integer in [0, 255]
After reshaping: 2-dimensional array of size 60000 x 784, where each element is an integer in [0, 255]
After normalization: 2-dimensional array of size 60000 x 784, where each element is a real number in [0, 1]

[Figure: a 28x28 2-d array is reshaped into a 1-d array of length 28*28 = 784, and the values in the array are normalized to between 0 and 1.]
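A minimal sketch of the reshape and normalize steps for one image, using plain Python lists. The stand-in image below is synthetic; a real pipeline would apply the same two steps to all 60,000 images with an array library.

```python
def flatten(image):
    # Turn a 28x28 2-d array into a 1-d array of length 784, row by row.
    return [pixel for row in image for pixel in row]

def normalize(pixels):
    # Map integer pixel values in [0, 255] to real numbers in [0, 1].
    return [p / 255 for p in pixels]

# Synthetic stand-in for one 28x28 grayscale image (not real MNIST data).
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]

flat = flatten(image)
scaled = normalize(flat)
print(len(flat), min(scaled), max(scaled))
```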

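[Figure: the 28x28 2-d array (an image of the digit 4) is reshaped into 784 input values (indexed 0 to 783), which feed a layer of 512 neurons (indexed 0 to 511), followed by a layer of 10 neurons (indexed 0 to 9). The 10 output neurons give the probability of label “0”, label “1”, label “2”, ..., label “9”.]

The “Teacher”:

Loss function: categorical cross-entropy
Optimizer: RMSProp
Target metric: Accuracy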

“Reshape” output training data: categorically encode each label using one-hot encoding

Label   One-hot encoding
0       1,0,0,0,0,0,0,0,0,0
1       0,1,0,0,0,0,0,0,0,0
2       0,0,1,0,0,0,0,0,0,0
3       0,0,0,1,0,0,0,0,0,0
4       0,0,0,0,1,0,0,0,0,0
5       0,0,0,0,0,1,0,0,0,0
6       0,0,0,0,0,0,1,0,0,0
7       0,0,0,0,0,0,0,1,0,0
8       0,0,0,0,0,0,0,0,1,0
9       0,0,0,0,0,0,0,0,0,1
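The one-hot table can be produced by a few lines of code. This is a minimal sketch; the helper name `one_hot` is an illustrative choice.

```python
def one_hot(label, num_classes=10):
    # A vector of zeros with a single 1 at the position of the label.
    vec = [0] * num_classes
    vec[label] = 1
    return vec

labels = [4, 0, 9]
encoded = [one_hot(label) for label in labels]
for label, vec in zip(labels, encoded):
    print(label, vec)
```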

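[Figure: the 28x28 2-d array (an image of the digit 4) is reshaped into 784 input values (indexed 0 to 783), which feed a layer of 512 neurons (indexed 0 to 511), followed by a layer of 10 neurons (indexed 0 to 9). The 10 outputs are shown next to a 10-element vector (indexed 0 to 9) for the one-hot encoded label.]

The “Teacher”:

Loss function: categorical cross-entropy
Optimizer: RMSProp
Target metric: Accuracy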

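Step 5: Train the neural network

[Figure: the 28x28 2-d array (an image of the digit 4) is reshaped into 784 input values (indexed 0 to 783), which feed a layer of 512 neurons (indexed 0 to 511), followed by a layer of 10 neurons (indexed 0 to 9). The 10 outputs are shown next to a 10-element vector (indexed 0 to 9) for the one-hot encoded label.]

The “Teacher”: Loss function, Optimizer, Target metric: Accuracy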

Batch size: the number of samples to use each time for computing the loss function and updating the weights.

Epochs: the number of times the training process uses the whole training dataset.
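A minimal sketch of how batch size and epochs determine the number of weight updates. The batch size of 128 is an assumed value for illustration; the 60,000 samples and 5 epochs match the numbers used in this lecture. The loop only counts updates and does not train anything.

```python
import math

def iterate_batches(num_samples, batch_size):
    # Yield (start, stop) index ranges that cover the dataset once.
    for start in range(0, num_samples, batch_size):
        yield start, min(start + batch_size, num_samples)

num_samples = 60000
batch_size = 128   # assumed value for illustration
epochs = 5

updates_per_epoch = math.ceil(num_samples / batch_size)

total_updates = 0
for epoch in range(epochs):
    for start, stop in iterate_batches(num_samples, batch_size):
        # A real training loop would compute the loss on samples
        # start..stop-1 here and update the weights once.
        total_updates += 1

print(updates_per_epoch, total_updates)
```

The last batch of each epoch is smaller than the others whenever the batch size does not divide the number of samples evenly.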

And so on (5 epochs in total).

Accuracy on training data: 97.8%

Step 6: Test the trained neural network

Compare to training accuracy: 0.989

Test accuracy is (clearly) lower than training accuracy. Maybe there is some over-fitting to the training data.

But still, performance is nice!

Summary

Step 1: Load the dataset

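Step 2: Build neural network architecture

[Figure: a 28x28 2-d array (an image of the digit 4) supplies 784 input values, indexed 0 to 28x28-1 = 783, to a layer of neurons indexed 0 to 511, followed by a layer of neurons indexed 0 to 9.]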

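Step 3: Choose loss function, optimizer, and target metrics

[Figure: a 28x28 2-d array (an image of the digit 4) supplies 784 input values, indexed 0 to 28x28-1 = 783, to a layer of neurons indexed 0 to 511, followed by a layer of neurons indexed 0 to 9. The outputs give the probability of label “0”, label “1”, label “2”, ..., label “9”.]

The “Teacher”: Loss function, Optimizer, Target metric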

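Step 4: Prepare training and test data

[Figure: the 28x28 2-d array (an image of the digit 4) is reshaped into 784 input values (indexed 0 to 783), which feed a layer of neurons indexed 0 to 511, followed by a layer of neurons indexed 0 to 9. The outputs are shown next to a 10-element vector (indexed 0 to 9) for the one-hot encoded label.]

The “Teacher”: Loss function, Optimizer, Target metric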

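Step 5: Train the neural network

[Figure: the 28x28 2-d array (an image of the digit 4) is reshaped into 784 input values (indexed 0 to 783), which feed a layer of 512 neurons (indexed 0 to 511), followed by a layer of 10 neurons (indexed 0 to 9). The 10 outputs are shown next to a 10-element vector (indexed 0 to 9) for the one-hot encoded label.]

The “Teacher”: Loss function, Optimizer, Target metric: Accuracy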

Step 6: Test the trained neural network

How did I do?

Well…

Miscellaneous Basic Concepts

Data representation: Tensor (Array)

• Scalar numbers (0-dimensional tensors)
• Vectors (1-d tensors)
• Matrices (2-d tensors)
• 3-d tensors, and higher-dimensional tensors

• Key attributes for a tensor:
  • (1) number of axes
  • (2) shape
  • (3) data type
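The three key attributes can be illustrated with nested Python lists standing in for tensors. This is only a sketch: the helpers `shape`, `ndim`, and `dtype` are hypothetical names defined here, and they assume non-empty, rectangular nested lists.

```python
def shape(tensor):
    # Length along each axis, from the outermost list inward.
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        tensor = tensor[0]
    return tuple(dims)

def ndim(tensor):
    # Number of axes.
    return len(shape(tensor))

def dtype(tensor):
    # Type name of the innermost elements (assumes they all share one type).
    while isinstance(tensor, list):
        tensor = tensor[0]
    return type(tensor).__name__

scalar = 12
vector = [1, 2, 3]
matrix = [[1, 2, 3], [4, 5, 6]]
tensor3d = [[[0.0, 0.5], [1.0, 1.5]], [[2.0, 2.5], [3.0, 3.5]]]

for t in (scalar, vector, matrix, tensor3d):
    print(ndim(t), shape(t), dtype(t))
```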

Some basic tensor operations

• Add two tensors (of the same shape): element-wise addition
• Apply a ReLU activation function to a tensor: element-wise operation

• Tensor product (also called tensor dot)
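The three operations can be sketched for 2-d tensors with plain Python lists. The matrices below are made-up examples, and the `dot` shown is the ordinary matrix product, which is the 2-d case of the tensor product.

```python
def add(a, b):
    # Element-wise addition of two 2-d tensors of the same shape.
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def relu(a):
    # Element-wise ReLU: negative entries become 0.
    return [[max(0, x) for x in row] for row in a]

def dot(a, b):
    # Matrix product: a has shape (n, k), b has shape (k, m), result is (n, m).
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

a = [[1, -2], [3, 4]]
b = [[5, 6], [-7, 8]]

summed = add(a, b)
activated = relu(summed)
product = dot(a, b)
print(summed, activated, product)
```

Addition and ReLU act on each entry independently, while the tensor product combines rows of the first tensor with columns of the second.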

Reshape tensor
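Reshaping rearranges the same elements, in the same row-by-row order, into a new shape. A minimal sketch for 2-d tensors, with a made-up 3 x 2 example and a hypothetical helper named `reshape`:

```python
def reshape(matrix, rows, cols):
    # Read the elements row by row, then regroup them into the new shape.
    flat = [value for row in matrix for value in row]
    if rows * cols != len(flat):
        raise ValueError("new shape must hold the same number of elements")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

x = [[0, 1], [2, 3], [4, 5]]       # shape (3, 2)

as_column = reshape(x, 6, 1)       # shape (6, 1)
as_two_rows = reshape(x, 2, 3)     # shape (2, 3)
print(as_column, as_two_rows)
```

The total number of elements must stay the same, which is why a 28 x 28 image reshapes to exactly 784 values.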

Basic terms for a neural network

• Layers: the building blocks in a neural network
• Model: network of layers
• Loss function and optimizer: keys to configuring the learning process
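A minimal sketch of "a model is a network of layers": each layer transforms its input, and the model chains the layers. The tiny sizes (2 inputs, 3 hidden neurons, 2 outputs) and all weights are made-up assumptions; the network in this lecture uses 784 inputs, 512 neurons, and 10 outputs.

```python
import math

def relu(z):
    return [max(0.0, v) for v in z]

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases, activation):
    # weights[j] holds the incoming weights of neuron j in this layer.
    z = [sum(w * x for w, x in zip(row, inputs)) + b
         for row, b in zip(weights, biases)]
    return activation(z)

# A model as a list of layers: (weights, biases, activation).
model = [
    ([[0.5, -0.5], [0.25, 0.75], [-1.0, 1.0]], [0.0, 0.1, 0.0], relu),   # 2 -> 3
    ([[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]], [0.0, 0.0], softmax),          # 3 -> 2
]

def predict(model, x):
    # Feed the output of each layer into the next one.
    for weights, biases, activation in model:
        x = dense(x, weights, biases, activation)
    return x

probs = predict(model, [1.0, 2.0])
print(probs)
```

Training would adjust the weights and biases so that the loss decreases; this sketch shows only the forward pass with fixed weights.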

Keras: a deep learning library for Python

PyTorch is getting popular today

Keras: a deep learning library for Python

Use a GPU when possible

Jupyter notebook: a nice way to edit and run deep learning experiments