Neural Networks and Deep Learning
TRANSCRIPT

NEURAL NETWORKS AND DEEP LEARNING
ASIM JALIS
GALVANIZE
INTRO

ASIM JALIS
Galvanize/Zipfian, Data Engineering
Cloudera, Microsoft, Salesforce
MS in Computer Science from University of Virginia
GALVANIZE PROGRAMS

| Program | Duration |
| --- | --- |
| Data Science Immersive | 12 weeks |
| Data Engineering Immersive | 12 weeks |
| Web Developer Immersive | 6 months |
| Galvanize U | 1 year |
TALK OVERVIEW

WHAT IS THIS TALK ABOUT?
- Using Neural Networks and Deep Learning to recognize images
- By the end of the class you will be able to create your own deep learning systems
HOW MANY PEOPLE HERE HAVE USED NEURAL NETWORKS?

HOW MANY PEOPLE HERE HAVE USED MACHINE LEARNING?

HOW MANY PEOPLE HERE HAVE USED PYTHON?
DEEP LEARNING

WHAT IS MACHINE LEARNING?
- Self-driving cars
- Voice recognition
- Facial recognition
HISTORY OF DEEP LEARNING

HISTORY OF MACHINE LEARNING

| Input | Features | Algorithm | Output |
| --- | --- | --- | --- |
| Machine | Human | Human | Machine |
| Machine | Human | Machine | Machine |
| Machine | Machine | Machine | Machine |
FEATURE EXTRACTION
- Traditionally, data scientists had to define the features
- Deep learning systems are able to extract features themselves
DEEP LEARNING MILESTONES

| Years | Theme |
| --- | --- |
| 1980s | Backpropagation invented; allows multi-layer neural networks |
| 2000s | SVMs, Random Forests, and other classifiers overtook NNs |
| 2010s | Deep Learning reignited interest in NNs |
IMAGENET
- AlexNet, submitted to the ImageNet ILSVRC challenge in 2012, is partly responsible for the renaissance
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton used deep learning techniques
- They combined this with GPUs and some other techniques
- The result was a neural network that could classify images of cats and dogs
- It had an error rate of 16%, compared to 26% for the runner-up

Ilya Sutskever, Alex Krizhevsky, Geoffrey Hinton

INDEED.COM/SALARY
MACHINE LEARNING
MACHINE LEARNING AND DEEP LEARNING
- Deep Learning fits inside Machine Learning
- Deep Learning is a Machine Learning technique
- They share techniques for evaluating and optimizing models
WHAT IS MACHINE LEARNING?
- Inputs: vectors or points of high dimension
- Outputs: either binary vectors or continuous vectors
- Machine learning finds the relationship between them
- Uses statistical techniques
SUPERVISED VS UNSUPERVISED
- Supervised: data needs to be labeled
- Unsupervised: data does not need to be labeled
TECHNIQUES
- Classification
- Regression
- Clustering
- Recommendations
- Anomaly detection
CLASSIFICATION EXAMPLE: EMAIL SPAM DETECTION
- Start with a large collection of emails, labeled spam/not-spam
- Convert the email text into vectors of 0s and 1s: 1 if a word occurs, 0 if it does not
- These are called inputs or features
- Split the data set into a training set (70%) and a test set (30%)
- Use an algorithm like Random Forest to build a model
- Evaluate the model by running it on the test set and capturing the success rate
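As a sketch in plain Python (the tiny corpus and vocabulary below are invented for illustration, not from the talk), the featurization and train/test split might look like this:

```python
import random

def featurize(text, vocabulary):
    """Convert text to a 0/1 vector: 1 if the word occurs, 0 if it does not."""
    words = set(text.lower().split())
    return [1 if w in words else 0 for w in vocabulary]

# Toy labeled data: (text, label) with 1 = spam, 0 = not spam.
emails = [("win money now", 1), ("lunch at noon", 0),
          ("free money offer", 1), ("meeting notes attached", 0)]

# Vocabulary: every distinct word seen in the corpus.
vocabulary = sorted({w for text, _ in emails for w in text.split()})

data = [(featurize(text, vocabulary), label) for text, label in emails]

# 70/30 train/test split.
random.seed(0)
random.shuffle(data)
cut = int(0.7 * len(data))
train, test = data[:cut], data[cut:]
```

A real system would now fit a classifier (e.g. a Random Forest) on `train` and measure its success rate on `test`.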
CLASSIFICATION ALGORITHMS
- Neural Networks
- Random Forest
- Support Vector Machines (SVM)
- Decision Trees
- Logistic Regression
- Naive Bayes
CHOOSING AN ALGORITHM
- Evaluate different models on the data
- Look at the relative success rates
- Use rules of thumb: some algorithms work better on some kinds of data
CLASSIFICATION EXAMPLES
- Is this tumor benign or cancerous?
- Is this lead profitable or not?
- Who will win the presidential election?
CLASSIFICATION: POP QUIZ
Is classification supervised or unsupervised learning?

Supervised, because you have to label the data.
CLUSTERING EXAMPLE: LOCATE CELL PHONE TOWERS
- Start with the GPS coordinates of all cell phone users
- Represent the data as vectors
- Locate towers in the biggest clusters
CLUSTERING EXAMPLE: T-SHIRTS
- What size should a t-shirt be?
- Everyone's real t-shirt size is different
- Lay out all the sizes and cluster
- Target large clusters with XS, S, M, L, XL
CLUSTERING: POP QUIZ
Is clustering supervised or unsupervised?

Unsupervised, because no labeling is required.
RECOMMENDATIONS EXAMPLE: AMAZON
- The model looks at user ratings of books
- Viewing a book triggers an implicit rating
- Recommend new books to the user
RECOMMENDATIONS: POP QUIZ
Are recommendation systems supervised or unsupervised?

Unsupervised.
REGRESSION
- Like classification
- The output is continuous instead of one of k choices
REGRESSION EXAMPLES
- How many units of product will sell next month?
- What will a student score on the SAT?
- What is the market price of this house?
- How long before this engine needs repair?
REGRESSION EXAMPLE: AIRCRAFT PART FAILURE
- Cessna collects data from airplane sensors
- Predict when a part needs to be replaced
- Ship the part to the customer's service airport
REGRESSION: QUIZ
Is regression supervised or unsupervised?

Supervised.
ANOMALY DETECTION EXAMPLE: CREDIT CARD FRAUD
- Train the model on good transactions
- Anomalous activity indicates fraud
- Can pass the transaction down to a human for investigation
ANOMALY DETECTION EXAMPLE: NETWORK INTRUSION
- Train the model on network login activity
- Anomalous activity indicates a threat
- Can initiate alerts and lockdown procedures
ANOMALY DETECTION: QUIZ
Is anomaly detection supervised or unsupervised?

Unsupervised, because we only train on normal data.
FEATURE EXTRACTION
- Converting data to feature vectors
- Natural Language Processing
- Principal Component Analysis
- Auto-Encoders
FEATURE EXTRACTION: QUIZ
Is feature extraction supervised or unsupervised?

Unsupervised.
MACHINE LEARNING WORKFLOW

DEEP LEARNING IS USED FOR
- Feature extraction
- Classification
- Regression
HISTORY OF MACHINE LEARNING

| Input | Features | Algorithm | Output |
| --- | --- | --- | --- |
| Machine | Human | Human | Machine |
| Machine | Human | Machine | Machine |
| Machine | Machine | Machine | Machine |
DEEP LEARNING FRAMEWORKS
- TensorFlow: NN library from Google
- Theano: low-level GPU-enabled tensor library
- Torch7: NN library; uses Lua for binding; used by Facebook and Google
- Caffe: NN library by Berkeley AMPLab
- Nervana: fast GPU-based machines optimized for deep learning
DEEP LEARNING FRAMEWORKS
- Keras, Lasagne, Blocks: NN libraries that make Theano easier to use
- CUDA: programming model for using GPUs in general-purpose programming
- cuDNN: NN library by Nvidia based on CUDA; can be used with Torch7 and Caffe
- Chainer: NN library that uses CUDA
DEEP LEARNING PROGRAMMING LANGUAGES
- All the frameworks support Python
- Except Torch7, which uses Lua as its binding language
TENSORFLOW
- TensorFlow was originally developed by the Google Brain Team
- Allows using GPUs for deep learning algorithms
- Single-processor version released in 2015
- Multi-processor version released in March 2016
KERAS
- Supports Theano and TensorFlow as back-ends
- Provides a deep learning API on top of TensorFlow
- TensorFlow provides the low-level matrix operations
TENSORFLOW: GEOFFREY HINTON, JEFF DEAN

KERAS: FRANCOIS CHOLLET
NEURAL NETWORKS
WHAT IS A NEURON?
- Receives signals on its synapses
- When triggered, sends a signal on its axon
MATHEMATICAL NEURON
- A mathematical abstraction, inspired by the biological neuron
- Either on or off, based on the sum of its inputs
MATHEMATICAL FUNCTION
- A neuron is a mathematical function
- It adds up its (weighted) inputs and applies a sigmoid (or other function)
- This determines whether it fires or not
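A minimal sketch of such a neuron in plain Python (the weights, bias, and inputs below are illustrative values, not from the talk):

```python
import math

def sigmoid(z):
    """Squash the summed input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# An activation near 1 means the neuron "fires"; near 0 means it stays quiet.
activation = neuron([1.0, 0.5], [0.8, -0.4], bias=0.1)
```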
WHAT ARE NEURAL NETWORKS?
- A biologically inspired machine learning algorithm
- Mathematical neurons arranged in layers
- Neurons accumulate signals from the previous layer
- They fire when the signal reaches a threshold
NEURAL NETWORKS
NEURON INCOMING
- Each neuron receives signals from the neurons in the previous layer
- Each signal is affected by a weight
- Some are more important than others
- The bias is the base signal that the neuron receives
NEURON OUTGOING
- Each neuron sends its signal to the neurons in the next layer
- The signals are affected by weights
LAYERED NETWORK
Each layer looks at features identified by the previous layer.
US ELECTIONS

ELECTIONS
- Consider the elections
- This is a gated system
- A way to aggregate different views
HIGHEST LEVEL: STATES

NEXT LEVEL: COUNTIES
ELECTIONS
- Is this a neural network?
- How many layers does it have?
NEURON LAYERS
- The nomination is the last layer, layer N
- States are layer N-1
- Counties are layer N-2
- Districts are layer N-3
- Individuals are layer N-4
- Individual brains have even more layers
GRADIENT DESCENT
TRAINING: HOW DO WE IMPROVE?
- Calculate the error from the desired goal
- Increase the weights of the neurons that voted right
- Decrease the weights of the neurons that voted wrong
- This will reduce the error
GRADIENT DESCENT
- This algorithm is called gradient descent
- Think of the error as a function of the weights
FEED FORWARD
- Also called forward propagation or forward prop
- Initialize the inputs
- Calculate the activation of each layer
- Calculate the activation of the output layer
BACK PROPAGATION
- Use forward prop to calculate the error
- The error is a function of all the network weights
- Adjust the weights using gradient descent
- Repeat with the next record
- Keep going over the training set until convergence
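The loop above can be sketched for the simplest possible case: a single sigmoid neuron trained by gradient descent on squared error. The task (learning logical OR) and the learning rate are invented for illustration; a real network repeats the same forward-prop/backprop/update cycle across many neurons and layers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy training set: learn logical OR (illustrative, not from the talk).
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

w, b, lr = [0.0, 0.0], 0.0, 1.0
for _ in range(5000):                 # keep going over the training set
    for x, target in data:
        # Forward prop: compute the neuron's activation.
        a = sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + b)
        # Backprop for one neuron: gradient of squared error w.r.t. z.
        delta = (a - target) * a * (1 - a)
        # Gradient descent: adjust the weights against the gradient.
        w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
        b -= lr * delta

def predict(x):
    """Forward prop again, this time to test the trained neuron."""
    return sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + b)
```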
HOW DO YOU FIND THE MINIMUM IN AN N-DIMENSIONAL SPACE?
- Take a step in the steepest direction
- The steepest direction is the vector of all the partial derivatives
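A numerical sketch of that idea in plain Python (the bowl-shaped function below, with its minimum at (3, -1), is made up for illustration):

```python
def grad(f, point, h=1e-6):
    """Estimate the gradient: the vector of partial derivatives of f."""
    g = []
    for i in range(len(point)):
        bumped = list(point)
        bumped[i] += h
        g.append((f(bumped) - f(point)) / h)
    return g

def descend(f, point, lr=0.1, steps=200):
    """Repeatedly step against the steepest (gradient) direction."""
    for _ in range(steps):
        point = [p - lr * gi for p, gi in zip(point, grad(f, point))]
    return point

# Bowl-shaped error surface with its minimum at (3, -1).
f = lambda p: (p[0] - 3) ** 2 + (p[1] + 1) ** 2
minimum = descend(f, [0.0, 0.0])
```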
PUTTING ALL THIS TOGETHER
- Use forward prop to activate
- Use back prop to train
- Then use forward prop to test
TYPES OF NEURONS

SIGMOID

TANH

RELU
BENEFITS OF RELU
- Popular
- Accelerates convergence by 6x (Krizhevsky et al.)
- The operation is faster since it is linear, not exponential
- Pro: produces a sparse matrix
- Con: the network can die by going to zero
LEAKY RELU
- Pro: does not die
- Con: the matrix is not sparse
SOFTMAX
- The final layer of a network used for classification
- Turns the output into a probability distribution
- Normalizes the outputs of the neurons to sum to 1
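The neuron types above can all be written in a few lines of plain Python (scalar versions for illustration; frameworks apply them element-wise to tensors):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    return math.tanh(z)

def relu(z):
    return max(0.0, z)

def leaky_relu(z, alpha=0.01):
    # Small negative slope instead of zero, so the unit never fully dies.
    return z if z > 0 else alpha * z

def softmax(zs):
    """Normalize a list of outputs into a probability distribution."""
    m = max(zs)                        # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
```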
HYPERPARAMETER TUNING

PROBLEM: OIL EXPLORATION
- Drilling holes is expensive
- We want to find the biggest oilfield without wasting money on duds
- Where should we plant our next oilfield derrick?
PROBLEM: NEURAL NETWORKS
- Testing hyperparameters is expensive
- We have an N-dimensional grid of parameters
- How can we quickly zero in on the best combination of hyperparameters?
HYPERPARAMETER EXAMPLES
- How many layers should we have?
- How many neurons should we have in the hidden layers?
- Should we use sigmoid, tanh, or ReLU?
- Should we initialize
ALGORITHMS
- Grid
- Random
- Bayesian Optimization
GRID
- Systematically search the entire grid
- Remember the best found so far
RANDOM
- Randomly search the grid
- Remember the best found so far
- See Bergstra and Bengio's result and Alice Zheng's explanation (see References)
- 60 random samples get you within the top 5% of grid search with 95% probability
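Random search is short enough to sketch directly. The hyperparameter names and the `score` function below are hypothetical stand-ins; in practice `score` would train a model with those settings and return its test-set success rate.

```python
import random

# Hypothetical grid of hyperparameters (names and values are illustrative).
grid = {"layers": [1, 2, 3, 4],
        "neurons": [16, 32, 64, 128],
        "activation": ["sigmoid", "tanh", "relu"]}

def score(params):
    """Stand-in for an expensive train-and-evaluate run (peaks at 0)."""
    return -abs(params["layers"] - 2) - abs(params["neurons"] - 64) / 32

random.seed(0)
best, best_score = None, float("-inf")
for _ in range(60):                         # 60 random samples
    params = {k: random.choice(v) for k, v in grid.items()}
    s = score(params)
    if s > best_score:                      # remember the best found so far
        best, best_score = params, s
```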
BAYESIAN OPTIMIZATION
- Balance between explore and exploit
- Exploit: test spots within the explored perimeter
- Explore: test new spots in random locations
- Balance the trade-off
SIGOPT
- YC-backed SF startup
- Founded by Scott Clark
- Raised $2M
- Sells a cloud-based proprietary variant of Bayesian Optimization
BAYESIAN OPTIMIZATION PRIMER
Bayesian Optimization Primer by Ian Dewancker, Michael McCourt, and Scott Clark (see References).
OPEN SOURCE VARIANTS
Open source alternatives:
- Spearmint
- Hyperopt
- SMAC
- MOE
PRODUCTION

DEPLOYING
- Phases: training and deployment
- The training phase runs on back-end servers
- Optimize hyperparameters on the back-end
- Deploy the model to front-end servers, browsers, and devices
- The front-end only uses forward prop and is fast
SERIALIZING/DESERIALIZING THE MODEL
- Back-end: serialize the model + weights
- Front-end: deserialize the model + weights
HDF5
- Keras serializes the model architecture to JSON
- Keras serializes the weights to HDF5
- HDF5 is a serialization model for hierarchical data
- APIs for C++, Python, Java, etc.
- https://www.hdfgroup.org
DEPLOYMENT EXAMPLE: CANCER DETECTION
- Rhobota.com's cancer-detecting iPhone app
- Developed by Bryan Shaw after his son's illness
- The model is built on the back-end and deployed on the iPhone
- The iPhone detects retinal cancer
DEEP LEARNING

WHAT IS DEEP LEARNING?
Deep Learning is a learning method that can train a system with more than 2 or 3 non-linear hidden layers.
WHAT IS DEEP LEARNING?
- Machine learning techniques that enable unsupervised feature learning and pattern analysis/classification
- The essence of deep learning is to compute representations of the data
- Higher-level features are defined from lower-level ones
HOW IS DEEP LEARNING DIFFERENT FROM REGULAR NEURAL NETWORKS?
- Training neural networks requires applying gradient descent on millions of dimensions
- This is intractable for large networks
- Deep learning places constraints on neural networks
- This allows them to be solved iteratively
- The constraints are generic
AUTO-ENCODERS

WHAT ARE AUTO-ENCODERS?
- An auto-encoder is a learning algorithm
- It applies backpropagation and sets the target values to be equal to its inputs
- In other words, it trains itself to do the identity transformation
WHY DOES IT DO THIS?
- The auto-encoder places constraints on itself
- E.g., it restricts the number of hidden neurons
- This allows it to find a good representation of the data
IS THE AUTO-ENCODER SUPERVISED OR UNSUPERVISED?
It is unsupervised: the data is unlabeled.
WHAT ARE CONVOLUTIONAL NEURAL NETWORKS?
- Feedforward neural networks
- Connection pattern inspired by the visual cortex
CONVOLUTIONAL NEURAL NETWORKS

CNNS
- The convolutional layer's parameters are a set of learnable filters
- Every filter is small along width and height
- During the forward pass, each filter slides across the width and height of the input, producing a 2-dimensional activation map
- As we slide across the input, we compute the dot product between the filter and the input
CNNS
- Intuitively, the network learns filters that activate when they see a specific type of feature anywhere
- In this way it creates translation invariance
CONVNET EXAMPLE
- Zero-padding: the boundaries are padded with 0s
- Stride: how much the filter moves in the convolution
- Parameter sharing: a filter uses the same parameters at every position
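The sliding dot product, zero-padding, and stride can be sketched in plain Python for a square single-channel input (the image and kernel values below are arbitrary illustrations):

```python
def conv2d(image, kernel, stride=1, pad=0):
    """Slide a filter over the input, taking a dot product at each position."""
    k = len(kernel)
    n = len(image)
    # Zero-padding: surround the image with a border of 0s.
    padded = [[0] * (n + 2 * pad) for _ in range(n + 2 * pad)]
    for i in range(n):
        for j in range(n):
            padded[i + pad][j + pad] = image[i][j]
    size = (n + 2 * pad - k) // stride + 1
    out = []
    for i in range(size):
        row = []
        for j in range(size):
            # Dot product of the filter with the patch under it
            # (parameter sharing: the same kernel at every position).
            acc = 0
            for a in range(k):
                for b in range(k):
                    acc += kernel[a][b] * padded[i * stride + a][j * stride + b]
            row.append(acc)
        out.append(row)
    return out

image = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
kernel = [[1, 0], [0, 1]]            # 2x2 filter with illustrative weights
fmap = conv2d(image, kernel)         # 2x2 activation map
```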
CONVNET EXAMPLE
From http://cs231n.github.io/convolutional-networks/
WHAT IS A POOLING LAYER?
- The pooling layer reduces the resolution of the image further
- It tiles the output area with a 2x2 mask and takes the maximum activation value of each tile
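A 2x2 max-pooling pass is a few lines of plain Python (the feature map below is an arbitrary illustration):

```python
def max_pool(fmap, size=2):
    """Tile the map with size x size windows; keep the max activation in each."""
    n = len(fmap)
    out = []
    for i in range(0, n, size):
        row = []
        for j in range(0, n, size):
            row.append(max(fmap[a][b]
                           for a in range(i, min(i + size, n))
                           for b in range(j, min(j + size, n))))
        out.append(row)
    return out

fmap = [[1, 3, 2, 0],
        [4, 2, 0, 1],
        [0, 1, 5, 2],
        [2, 0, 1, 3]]
pooled = max_pool(fmap)   # resolution drops from 4x4 to 2x2
```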
![Page 118: Neural Networks and Deep Learning](https://reader030.vdocuments.us/reader030/viewer/2022020301/58f9a915760da3da068b6acc/html5/thumbnails/118.jpg)
![Page 119: Neural Networks and Deep Learning](https://reader030.vdocuments.us/reader030/viewer/2022020301/58f9a915760da3da068b6acc/html5/thumbnails/119.jpg)
REVIEWkeras/examples/mnist_cnn.py
Recognizes hand-written digitsBy combining different layers
RECURRENT NEURAL NETWORKS

RNNS
- RNNs capture patterns in time-series data
- Constrained by shared weights across neurons
- Each neuron observes different times
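The weight sharing across time can be sketched with a scalar hidden state (illustrative weights; a real RNN uses weight matrices and vector states):

```python
import math

def rnn_forward(xs, w_in, w_rec, b):
    """Process a sequence one step at a time. The SAME weights are
    reused at every time step (weight sharing across time)."""
    h = 0.0
    states = []
    for x in xs:
        # New state depends on the current input and the previous state.
        h = math.tanh(w_in * x + w_rec * h + b)
        states.append(h)
    return states

# Illustrative scalar weights and inputs.
states = rnn_forward([1.0, 0.5, -1.0], w_in=0.7, w_rec=0.3, b=0.0)
```

Because each state feeds into the next, the output depends on the order of the sequence, which is what lets the network pick up temporal patterns.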
LSTMS
- Long Short-Term Memory networks
- Plain RNNs cannot handle long time lags between events
- LSTMs can pick up patterns separated by big lags
- Used for speech recognition
RNN EFFECTIVENESS
- Andrej Karpathy uses LSTMs to generate text
- Generates Shakespeare, Linux kernel code, and mathematical proofs
- See http://karpathy.github.io/
RNN INTERNALS

LSTM INTERNALS
CONCLUSION
REFERENCES
- Bayesian Optimization Primer by Dewancker, McCourt, and Clark (http://sigopt.com)
- Random Search for Hyper-Parameter Optimization by Bergstra and Bengio (http://jmlr.org)
- Evaluating Machine Learning Models by Alice Zheng (http://www.oreilly.com)
REFERENCES
- Dropout by Hinton et al. (http://cs.utoronto.edu)
- Understanding LSTM Networks by Chris Olah (http://github.io)
- Multi-scale Deep Learning for Gesture Detection and Localization by Neverova et al. (http://uoguelph.ca)
- The Unreasonable Effectiveness of RNNs by Karpathy (http://karpathy.github.io)
QUESTIONS