TRANSCRIPT
CSCE 636 Neural Networks (Deep Learning)
Lecture 2: Mathematical Building Blocks of Neural Networks
Anxiao (Andrew) Jiang
Chapter 2
Before we begin: the mathematical building blocks of neural networks
What a neural network does: learn a function
[Figure: input x → Neural Network → value of f(x)]
The neural network learns the function f(x), either exactly or approximately.
Application: Handwritten Digit Recognition
[Figure: image of a handwritten "4" → Neural Network → 4]
Task: classify grayscale images of handwritten digits (28x28 pixels) into their 10 categories (0 through 9).
How to start?
Step 1: Load the dataset
MNIST dataset: 60,000 training images and 10,000 test images, along with their labels.
Step 1: Load the dataset
train_images: 60,000 x 28 x 28 array, where each element (pixel) is an integer in [0, 255]
train_labels: vector of length 60,000, where each element (label) is an integer in [0, 9]
test_images: 10,000 x 28 x 28 array, where each element (pixel) is an integer in [0, 255]
test_labels: vector of length 10,000, where each element (label) is an integer in [0, 9]
Rule: training data and test data are disjoint. Only use training data to train the neural network!
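The data layout described above can be sketched with a tiny stand-in dataset (plain Python lists in place of the real 60,000-image arrays; the pixel values and labels below are made up purely for illustration):

```python
# Tiny stand-in for the MNIST layout: two 2x2 "images" instead of
# 60,000 images of size 28x28. Structure mirrors the slide above.

train_images = [
    [[0, 255], [128, 64]],   # image 0: a 2-d array of pixels in [0, 255]
    [[12, 34], [56, 78]],    # image 1
]
train_labels = [3, 7]        # one integer label in [0, 9] per image

# Sanity checks matching the stated rules.
assert len(train_images) == len(train_labels)
assert all(0 <= p <= 255 for img in train_images for row in img for p in row)
assert all(0 <= y <= 9 for y in train_labels)
```

The real loader returns the four arrays (train_images, train_labels, test_images, test_labels) with exactly this nesting, just at full size.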
Step 2: Build neural network architecture
[Figure: a 28x28 2-d array (the image of a "4") feeds an input layer of 28x28 = 784 units (indices 0 through 28x28-1 = 783), fully connected to a dense layer of 512 neurons (indices 0 through 511).]
What is a neuron?
[Figure: a neuron computes a weighted sum of its inputs plus a bias, then applies an activation function. The activation function can take many possible forms.]
ReLU (the most popular activation function): relu(z) = max(0, z)
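A single neuron with a ReLU activation can be sketched in a few lines of plain Python (an illustration of the concept, not any library's API):

```python
def relu(z):
    """ReLU activation: max(0, z)."""
    return max(0.0, z)

def neuron(x, w, b):
    """One neuron: weighted sum of inputs plus bias, then ReLU."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return relu(z)

# A positive pre-activation passes through; a negative one is clipped to 0.
print(neuron([1.0, 2.0], [0.5, 0.25], 0.0))   # 1.0
print(neuron([1.0, 2.0], [-1.0, -1.0], 0.0))  # 0.0
```

A dense layer of 512 neurons is just 512 of these, each with its own weight vector w and bias b, all reading the same 784-length input.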
Step 2: Build neural network architecture
[Figure: the same network, now with an output layer of 10 neurons (indices 0 through 9) after the 512-neuron dense layer.]
Softmax (a popular activation function for the last layer of a classification network)
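Softmax can be sketched in plain Python; it turns the output layer's raw scores into probabilities that are positive and sum to 1 (subtracting the maximum first is a standard numerical-stability trick, not required by the math):

```python
import math

def softmax(z):
    """Softmax: exponentiate each score, then normalize to sum to 1."""
    m = max(z)                                # for numerical stability
    exps = [math.exp(zi - m) for zi in z]
    s = sum(exps)
    return [e / s for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # largest score gets the largest probability
print(sum(probs))  # 1.0 (up to floating-point rounding)
```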
Step 2: Build neural network architecture
[Figure: the full network. A 28x28 2-d array feeds 784 input units (indices 0 through 28x28-1 = 783), then a dense layer of 512 neurons (indices 0 through 511), then an output layer of 10 neurons (indices 0 through 9) whose outputs are the probabilities of labels "0", "1", "2", …, "9".]
Step 3: Choose loss function, optimizer, and target metrics
Categorical cross-entropy (a popular loss function for multi-class classification):

L = -(1/N) * Σ_{i=1..N} Σ_{j=1..C} y_ij * log(p_ij)

where N is the number of samples, C is the number of classes, y_ij is the true probability (1 or 0) that input i belongs to class j, and p_ij is the probability predicted by the neural network that input i belongs to class j.

RMSProp (a popular optimizer; details to be introduced later)
Accuracy: the fraction of times that the neural network makes correct predictions.
• If we care about accuracy, why do we optimize categorical cross-entropy during training?
• Answer: the loss function needs to be differentiable. (And the loss function is closely related to the target metric: minimizing the loss function is (approximately or precisely) optimizing the target metric.)
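The loss defined above can be sketched in plain Python (a didactic version; real libraries use vectorized, numerically safer implementations):

```python
import math

def categorical_cross_entropy(y_true, y_pred):
    """L = -(1/N) * sum_i sum_j y_ij * log(p_ij).
    y_true: list of one-hot rows; y_pred: list of predicted-probability rows."""
    n = len(y_true)
    total = 0.0
    for yi, pi in zip(y_true, y_pred):
        # Only the term where y_ij = 1 contributes to the inner sum.
        total += -sum(math.log(p) for y, p in zip(yi, pi) if y == 1)
    return total / n

# One sample whose true class is 1, predicted with probability 0.8:
loss = categorical_cross_entropy([[0, 1, 0]], [[0.1, 0.8, 0.1]])
print(round(loss, 4))  # 0.2231, i.e. -ln(0.8)
```

Note the loss is 0 exactly when the network assigns probability 1 to every true class, and grows as the predicted probability of the true class shrinks.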
[Figure: the full network again (28x28 2-d array → 784 input units → 512 neurons → 10 output neurons giving the probabilities of labels "0" through "9"), now supervised by the "Teacher": loss function: categorical cross-entropy; optimizer: RMSProp; target metric: accuracy.]
Step 4: Prepare training and test data
Here: reshape and normalize the input training data.
train_images:
• Originally: 3-dimensional array of size 60000 x 28 x 28, where each element is an integer in [0, 255]
• After reshaping: 2-dimensional array of size 60000 x 784, where each element is an integer in [0, 255]
• After normalization: 2-dimensional array of size 60000 x 784, where each element is a real number in [0, 1]
[Figure: each 28x28 2-d array is reshaped to a 1-d array of length 28*28 = 784, and the values in the array are normalized to lie between 0 and 1.]
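For a single image, the reshape-and-normalize step looks like this (using a tiny 2x2 stand-in image instead of 28x28):

```python
# Step 4 for one image: flatten the 2-d pixel grid into a 1-d vector,
# then scale integer pixels in [0, 255] to real values in [0, 1].

image = [[0, 255], [128, 64]]  # tiny stand-in for a 28x28 image

flat = [p for row in image for p in row]  # "reshape" 2-d -> 1-d
normalized = [p / 255.0 for p in flat]    # normalize to [0, 1]

print(flat)        # [0, 255, 128, 64]
print(normalized)  # [0.0, 1.0, 0.5019..., 0.2509...]
```

The same two operations, applied to all 60,000 images at once, turn the 60000 x 28 x 28 integer array into the 60000 x 784 real-valued array described above.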
[Figure: the network with the reshaped input: a 1-d array of length 784 (indices 0 through 783) feeds the 512-neuron layer, then the 10-neuron output layer giving the probabilities of labels "0" through "9", with the "Teacher": loss function: categorical cross-entropy; optimizer: RMSProp; target metric: accuracy.]
"Reshape" the output training data: categorically encode each label using one-hot encoding.

Label  One-hot encoding
0      1,0,0,0,0,0,0,0,0,0
1      0,1,0,0,0,0,0,0,0,0
2      0,0,1,0,0,0,0,0,0,0
3      0,0,0,1,0,0,0,0,0,0
4      0,0,0,0,1,0,0,0,0,0
5      0,0,0,0,0,1,0,0,0,0
6      0,0,0,0,0,0,1,0,0,0
7      0,0,0,0,0,0,0,1,0,0
8      0,0,0,0,0,0,0,0,1,0
9      0,0,0,0,0,0,0,0,0,1
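The table above corresponds to a short encoding function:

```python
def one_hot(label, num_classes=10):
    """One-hot encode an integer label: a length-10 vector with a single 1
    at position `label`, exactly as in the table above."""
    v = [0] * num_classes
    v[label] = 1
    return v

print(one_hot(3))  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```

Applying it to every entry of train_labels turns the length-60,000 label vector into a 60000 x 10 array, matching the network's 10-neuron output.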
[Figure: the network and the "Teacher" again, now with both the reshaped 784-length input (indices 0 through 783) and the one-hot-encoded 10-length output (indices 0 through 9).]
Step 5: Train the neural network
[Figure: during training, (reshaped 784-length input, one-hot 10-length label) pairs flow through the network while the "Teacher" (loss function, optimizer, target metric: accuracy) updates the weights.]
Batch size: the number of samples to use each time for computing the loss function and updating the weights.
Epochs: the number of times the training process uses the whole training dataset.
And so on (5 epochs in total).
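With these two hyperparameters fixed, the number of weight updates in a training run follows directly (the batch size of 128 below is an assumption for illustration; the slides only fix the 5 epochs):

```python
import math

def num_weight_updates(num_samples, batch_size, epochs):
    """Each epoch processes ceil(num_samples / batch_size) batches,
    and every batch triggers one weight update."""
    return epochs * math.ceil(num_samples / batch_size)

# 60,000 MNIST training images, an assumed batch size of 128, 5 epochs:
print(num_weight_updates(60000, 128, 5))  # 2345
```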
Accuracy on training data: 97.8%
Step 6: Test the trained neural network
Compare to training accuracy: 0.989
Test accuracy is (clearly) lower than training accuracy. Maybe there is some over-fitting to the data.
But still, the performance is nice!
Summary
Step 1: Load the dataset
Step 2: Build neural network architecture
[Figure: 28x28 2-d array → 784 input units (indices 0 through 28x28-1 = 783) → 512 neurons → 10 output neurons.]
Step 3: Choose loss function, optimizer, and target metrics
[Figure: the network's 10 outputs are the probabilities of labels "0" through "9"; the "Teacher" supplies the loss function, optimizer, and target metric.]
Step 4: Prepare training and test data
[Figure: the reshaped 784-length input and one-hot 10-length output, with the "Teacher": loss function, optimizer, target metric.]
Step 5: Train the neural network
[Figure: the full training setup: 784-length input → 512 neurons → 10 neurons, with the "Teacher" (loss function, optimizer, target metric: accuracy).]
Step 6: Test the trained neural network
How did I do?
Well…
Miscellaneous Basic Concepts
Data representation: Tensor (Array)
• Scalar numbers (0-dimensional tensors)
• Vectors (1-d tensors)
• Matrices (2-d tensors)
• 3-d tensors, and higher-dimensional tensors
Key attributes of a tensor:
• (1) number of axes
• (2) shape
• (3) data type
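Representing tensors as nested Python lists, the first key attribute (the number of axes, also called the rank) can be illustrated like this:

```python
def num_axes(t):
    """Number of axes of a nested-list 'tensor': 0 for a scalar,
    1 for a vector, 2 for a matrix, and so on."""
    n = 0
    while isinstance(t, list):
        n += 1
        t = t[0]
    return n

scalar = 5                  # 0-d tensor
vector = [1, 2, 3]          # 1-d tensor
matrix = [[1, 2], [3, 4]]   # 2-d tensor
print(num_axes(scalar), num_axes(vector), num_axes(matrix))  # 0 1 2
```

Shape and data type follow the same idea: the shape is the size along each axis (here (2, 2) for the matrix), and the data type is the type of the stored elements.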
Some basic tensor operations
• Add two tensors (of the same shape): element-wise addition
• Apply a ReLU activation function to a tensor: element-wise operation
• Tensor product (also called tensor dot)
• Reshape a tensor
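These operations can be sketched in plain Python on 1-d and 2-d "tensors" represented as lists (library versions are vectorized, but the element-wise logic is the same):

```python
def add_tensors(u, v):
    """Element-wise addition of two 1-d tensors of the same shape."""
    return [a + b for a, b in zip(u, v)]

def relu_tensor(u):
    """Element-wise ReLU applied to a 1-d tensor."""
    return [max(0, a) for a in u]

def dot(u, v):
    """Tensor dot of two 1-d tensors (vectors): sum of products."""
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    """Tensor dot of a 2-d tensor (matrix) with a 1-d tensor (vector)."""
    return [dot(row, v) for row in M]

print(add_tensors([1, 2], [3, 4]))               # [4, 6]
print(relu_tensor([-1, 0, 2]))                   # [0, 0, 2]
print(dot([1, 2, 3], [4, 5, 6]))                 # 32
print(matvec([[1, 0], [0, 1], [2, 2]], [3, 4]))  # [3, 4, 14]
```

The matrix-vector product is exactly what a dense layer computes before adding biases and applying the activation function.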
Basic terms for a neural network
• Layers: the building blocks of a neural network
• Model: a network of layers
• Loss function and optimizer: keys to configuring the learning process
Keras: a deep learning library for Python
PyTorch is getting popular today
Keras: a deep learning library for Python
Use a GPU when possible
Jupyter notebook: a nice way to edit and run deep learning experiments