Multi-Layer Perceptrons
Michael J. Watts
http://mike.watts.net.nz
Lecture Outline
• Perceptron revision
• Multi-Layer Perceptrons
• Terminology
• Advantages
• Problems
Perceptron Revision
• Two neuron layer networks
• Single layer of adjustable weights
• Feed forward network
• Cannot handle non-linearly-separable problems
• Supervised learning algorithm
• Mostly used for classification
Multi-Layer Perceptrons
• Otherwise known as MLPs
• Adds an additional layer (or layers) of neurons to a perceptron
• Additional layer called the hidden (or intermediate) layer
• Additional layer of adjustable connections
Multi-Layer Perceptrons
• Proposed mid-1960s, but learning algorithms not available until mid-1980s
• Continuous inputs
• Continuous outputs
• Continuous activation functions used
o e.g. sigmoid, tanh (see the sketch below)
• Feed forward networks
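As a concrete illustration (not from the original slides), here is a minimal NumPy sketch of the two activation functions named above; the sample inputs are arbitrary:

```python
import numpy as np

# Minimal definitions of the two continuous activation
# functions named on the slide.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)                # squashes to (-1, 1)

xs = np.array([-2.0, 0.0, 2.0])
print(sigmoid(xs))  # approx. [0.119 0.5   0.881]
print(tanh(xs))     # approx. [-0.964 0.    0.964]
```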
Multi-Layer Perceptrons

[Diagram slides: MLP network architecture]
Multi-Layer Perceptrons
• MLPs can also have biases
• A bias is an additional, external input
• Input to a bias node is always one
• Bias performs the task of a threshold (see the sketch below)
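A small sketch (my illustration, with arbitrary numbers) of why a bias node with constant input 1 does the job of a threshold: firing when w·x >= theta is identical to firing when w·x + b >= 0 with bias weight b = -theta:

```python
import numpy as np

# Arbitrary illustrative weights and threshold.
w = np.array([0.5, -0.2])
theta = 0.3
b = -theta  # bias weight; the bias node's input is fixed at 1

x = np.array([1.0, 0.5])
fires_with_threshold = (w @ x) >= theta
fires_with_bias = (w @ x + b * 1.0) >= 0.0  # bias input is always 1
print(fires_with_threshold, fires_with_bias)  # the two always agree
```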
Multi-Layer Perceptrons

[Diagram slide: MLP with bias node]
Multi-Layer Perceptrons
• Able to model non-linearly separable functions (see the XOR sketch below)
• Each hidden neuron is equivalent to a perceptron
• Each hidden neuron adds a hyperplane to the problem space
• Requires sufficient neurons to partition the input space appropriately
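To make the hyperplane picture concrete, here is a hand-weighted 2-2-1 MLP for XOR, the classic non-linearly-separable problem. The weights are my own illustrative choices; each hidden neuron acts as a perceptron, cutting the input space with one hyperplane, and the output neuron combines the two cuts:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hand-picked weights for a 2-2-1 MLP that solves XOR.
W_hidden = np.array([[ 20.0,  20.0],    # hidden neuron 1: roughly x1 OR x2
                     [-20.0, -20.0]])   # hidden neuron 2: roughly NOT (x1 AND x2)
b_hidden = np.array([-10.0, 30.0])
W_out = np.array([20.0, 20.0])          # output neuron: AND of the two cuts
b_out = -30.0

def forward(x):
    h = sigmoid(W_hidden @ x + b_hidden)  # hidden layer: two hyperplanes
    return sigmoid(W_out @ h + b_out)     # output layer combines them

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, round(float(forward(np.array(x, dtype=float))), 3))
# prints approximately 0, 1, 1, 0
```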
Terminology
• Some define MLPs according to the number of connection layers
• Others define MLPs according to the number of neuron layers
• Some don’t count the input neuron layer
o doesn’t perform processing
• Some refer to the number of hidden neuron layers
Terminology
• Best to specify which
o i.e. “three neuron layer MLP”, or
o “two connection layer MLP”, or
o “three neuron layer, one hidden layer MLP”
Advantages
• An MLP with one hidden layer of sufficient size can approximate any continuous function to any desired accuracy
o Kolmogorov theorem
• MLPs can learn conditional probabilities
• MLPs are multivariate non-linear regression models
Advantages
• Learning models
o several ways of training an MLP
o backpropagation is the most common (sketched below)
• Universal function approximators
• Able to generalise to new data
o can accurately identify / model previously unseen examples
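A minimal backpropagation sketch (my illustration; the network size, seed, learning rate, and epoch count are arbitrary choices), training a small MLP on XOR by gradient descent on the squared error:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# 2 inputs -> 4 hidden -> 1 output, with bias weights
W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)

lr = 0.5
for _ in range(20000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: squared-error gradient through the sigmoids
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient-descent weight updates
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).ravel(), 2))
# expected to approach [0, 1, 1, 0]; another seed may need more epochs
```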
Problems with MLP
• Choosing the number of hidden layers
o how many are enough?
o one should be sufficient, according to the Kolmogorov theorem
o two will always be sufficient
Problems with MLP
• Choosing the number of hidden nodes
o how many are enough?
o usually, the number of connections in the network should be less than the number of training examples (see the sketch below)
• As the number of connections approaches the number of training examples, generalisation decreases
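A tiny helper (my illustration, not from the slides) for checking this rule of thumb; the training-set size of 150 is an assumed example:

```python
# Count the adjustable connections in a one-hidden-layer MLP,
# including one bias weight per non-input neuron.
def n_connections(n_in, n_hidden, n_out, biases=True):
    n = n_in * n_hidden + n_hidden * n_out
    if biases:
        n += n_hidden + n_out
    return n

n_train = 150  # assumed number of training examples
print(n_connections(4, 10, 3), "<", n_train)  # 83: within the rule of thumb
print(n_connections(4, 30, 3), ">", n_train)  # 243: risks poor generalisation
```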
Problems with MLP
• Representational capacity
• Catastrophic forgetting (demonstrated below)
o occurs when a trained network is further trained on new data
o the network forgets what it learned about the old data
o it then only knows about the new data
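A compact demonstration (my illustration, reusing the backpropagation sketch from the Advantages slide) of catastrophic forgetting: train on XOR, then continue training on AND only, and the XOR behaviour is overwritten:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)

def train(X, y, epochs=20000, lr=0.5):
    global W1, b1, W2, b2
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

def predict(X):
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).ravel()

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_xor = np.array([[0], [1], [1], [0]], dtype=float)
y_and = np.array([[0], [0], [0], [1]], dtype=float)

train(X, y_xor)
print("after XOR:", np.round(predict(X), 2))  # close to [0 1 1 0]
train(X, y_and)  # further training on the new task only
print("after AND:", np.round(predict(X), 2))  # now matches AND; XOR forgotten
```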
Problems with MLP
• Initialisation of weights
• Random initialisation can cause problems with training
o start in a bad spot
• Can initialise using statistics of the training data (see the sketch below)
o PCA
o nearest neighbour
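One hedged sketch of the PCA option: seed the hidden-layer weights with the principal components of the training data rather than random values. The function name, sizes, and lack of scaling are my assumptions, not the slides' prescription:

```python
import numpy as np

def pca_init(X, n_hidden):
    """Return an (n_inputs, n_hidden) weight matrix whose columns
    are the top principal components of the training data."""
    Xc = X - X.mean(axis=0)                        # centre the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_hidden].T                         # needs n_hidden <= n_inputs

X_train = np.random.default_rng(1).normal(size=(100, 4))  # stand-in data
W1 = pca_init(X_train, n_hidden=3)
print(W1.shape)  # (4, 3)
```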
Problems with MLP
• Weight initialisation
o can initialise using an evolutionary algorithm
o initialisation from decision trees
o initialisation from rules
Summary
• MLPs overcome the linear separability problem of perceptrons
• Add an additional layer of neurons
• Additional layer of adjustable connections
• No agreement on terminology to use
o be precise!
Summary
• Convergence is guaranteed over continuous functions
• Problems exist
o but solutions also exist