COSC 4426, AJ Boulay and Julia Johnson. Artificial Neural Networks: Introduction to Soft Computing (textbook), Chapter 4: Neural Computing.


COSC 4426, AJ Boulay and Julia Johnson
Artificial Neural Networks: Introduction to Soft Computing (textbook), Chapter 4: Neural Computing

I'm AJ Boulay, a Master's student in Computational Science here at LU. Dr. Johnson asked me to present on ANNs. Does anyone know something about ANNs?

BRAIN
You can ask me about neural computing in the brain; if I know the answer, I will tell you. We will be talking about Artificial Neural Networks (ANNs) as massively parallel connectionist networks, inspired by the biology of cognition.

Chapter 4 Neural Computing
What will we cover in this presentation? The sections of chapter four of the text, and a few questions that may appear on the exam; a review of a computational experiment with a Neural Network; and, if we have time, a chance to run training and testing of a Neural Network.

Chapter 4 Neural Computing
What do I need to remember about how Neural Nets work? "Each neuron, as the main computational unit, performs only a very simple operation: it sums its weighted inputs and applies a certain activation function on the sum. Such a value then represents the output of the neuron." (p. 92)

Chapter 4 Neural Computing
See the system: here is an example of a network and how it works. When we have finished reviewing the material, we can look at the network again and you can see what you have learned. Autoassociator Network Architecture.

Activation Functions (p. 93)
Binary Step Function, Linear Function, Sigmoid Function. The activation function affects the firing of the unit like a threshold. (A short Python sketch of a single neuron with these three activation functions follows this transcript.)

Neural Nets are Trained (p. 97)
Training involves the modification of weights between units. Weights can be set a priori, be a result of the training process, or both. There are two kinds of training: Supervised, where the programmer makes choices about how the network learns (autoassociator), and Unsupervised, where the system learns something on its own (we will see this in SOMs, clustering, etc.).

Hebb Rule (p. 97)
"Fire together, wire together": the strength of a connection (its weight) increases when both units fire. Donald Hebb was a Canadian neuropsychologist at McGill. The Hebb rule determines learning by multiplying the activations of the two connected units and scaling that product by the learning rate alpha. If the learning patterns are mutually orthogonal, then it can be proved that learning will occur. (A Hebb-rule sketch follows this transcript.)

Delta Rule (p. 98)
Another common rule is the Delta rule, also known as the LMS rule, usually used in networks that have a gradient manifold. For a given input, the output vector is compared to the correct answer. Would learning take place if the weights are zero? The change in weight is Δw = α · y · e, where α is the learning rate, y is the value of the activation function, and e is the difference between the expected output and the actual output. If the inputs are linearly independent, then learning can take place. (A delta-rule sketch follows this transcript.)

Network Topologies
Feed Forward Networks can have multiple layers; they often use the Hebb rule alone and a binary activation function. Recurrent Nets, in contrast, often use a sigmoid activation function and backpropagation learning algorithms.

Perceptron
A perceptron can only classify linearly separable cases. What does this mean? There is no a priori knowledge: the perceptron is initialized with random weights. The predicted output is compared with the desired output; if they match, there is no change in the weights. XOR is the famous counterexample of Minsky and Papert. (A perceptron sketch, including the XOR case, follows this transcript.)

Multilayer Nets
Multilayer feed-forward networks are usually fully connected (Kohonen nets and the Autoassociator). There are no connections between neurons of the same layer; their states are fixed by the problem.
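The quoted definition on p. 92 and the three activation functions from p. 93 can be captured in a few lines. This is a minimal illustrative sketch, not code from the textbook; the input and weight values are made up.

```python
import math

def binary_step(s, threshold=0.0):
    """Binary step: fire (1) if the weighted sum reaches the threshold, else 0."""
    return 1.0 if s >= threshold else 0.0

def linear(s):
    """Linear (identity) activation: the output is the weighted sum itself."""
    return s

def sigmoid(s):
    """Sigmoid activation: squashes the weighted sum into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))

def neuron_output(inputs, weights, activation):
    """A single neuron: sum the weighted inputs, then apply the activation function."""
    s = sum(x * w for x, w in zip(inputs, weights))
    return activation(s)

# Illustrative values only.
inputs = [1.0, 0.5, -1.0]
weights = [0.8, 0.2, 0.4]
for f in (binary_step, linear, sigmoid):
    print(f.__name__, neuron_output(inputs, weights, f))
```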
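A minimal sketch of the Hebb rule slide, assuming the simple form Δw[i][j] = α · x[i] · y[j] (the activations of the two connected units, scaled by the learning rate); the patterns and learning rate are illustrative assumptions, not the textbook's example.

```python
def hebb_update(weights, x, y, alpha=0.5):
    """Hebb rule: strengthen the connection between units that fire together.
    weights[i][j] links input unit i to output unit j."""
    for i in range(len(x)):
        for j in range(len(y)):
            weights[i][j] += alpha * x[i] * y[j]
    return weights

# Associate the input pattern x with the output pattern y (illustrative values).
w = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, 0.0]
y = [0.0, 1.0]
w = hebb_update(w, x, y)
print(w)  # only the weight between the two co-active units has grown
```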
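A minimal sketch of the Delta (LMS) rule slide for one linear unit, assuming the common form Δw_i = α · e · x_i, where the inputs x_i play the role of the sending units' activations and e = expected output − actual output. The training pair, learning rate, and iteration count are illustrative. It also answers the slide's question: even with zero initial weights the error is nonzero, so learning does take place.

```python
def delta_update(weights, x, target, alpha=0.2):
    """Delta (LMS) rule for a single linear unit:
    output y = sum_i w_i * x_i, error e = target - y, update w_i += alpha * e * x_i."""
    y = sum(w_i * x_i for w_i, x_i in zip(weights, x))
    e = target - y
    return [w_i + alpha * e * x_i for w_i, x_i in zip(weights, x)], e

# Illustrative task: learn to output 1.0 for the input [1.0, 0.5], starting from zero weights.
w = [0.0, 0.0]
for _ in range(20):
    w, e = delta_update(w, [1.0, 0.5], target=1.0)
print(w, e)  # the error shrinks toward zero over repeated updates
```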
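A minimal perceptron sketch for the linear-separability point: random initial weights, a binary step activation, and weight changes only when the prediction and the desired output disagree. AND is linearly separable and is learned perfectly; XOR, the famous Minsky and Papert counterexample, never is. The learning rate, epoch count, and seed are illustrative assumptions.

```python
import random

def train_perceptron(samples, epochs=50, alpha=0.1, seed=0):
    """Single-layer perceptron with a bias and a binary step activation.
    Weights change only when the prediction disagrees with the target."""
    random.seed(seed)
    w = [random.uniform(-0.5, 0.5) for _ in range(3)]  # [bias, w1, w2]
    for _ in range(epochs):
        for (x1, x2), target in samples:
            y = 1 if w[0] + w[1] * x1 + w[2] * x2 >= 0 else 0
            e = target - y                 # zero when prediction matches
            w[0] += alpha * e
            w[1] += alpha * e * x1
            w[2] += alpha * e * x2
    return w

def accuracy(w, samples):
    hits = sum((1 if w[0] + w[1] * x1 + w[2] * x2 >= 0 else 0) == t
               for (x1, x2), t in samples)
    return hits / len(samples)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # linearly separable
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]  # not separable

print("AND accuracy:", accuracy(train_perceptron(AND), AND))  # reaches 1.0
print("XOR accuracy:", accuracy(train_perceptron(XOR), XOR))  # stays below 1.0
```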
Gradient Descent
Gradient descent methods: the algorithm searches for a global minimum of the weight landscape. There is no a priori knowledge at initialization.

Kohonen SOMs
Self Organizing Maps (SOMs): full connectivity, Hebb rule. The Best Matching Unit (BMU) is the unit that is closest to the input when they are compared. What does this mean? Learning? A priori? Supervised? With clustering you get nearest neighbor searches, with features clustered. In this kind of network there is competitive learning: inhibition occurs between the BMU and the other units that didn't quite figure out the problem, and the winning weights/activations go on to the next round of learning. (A BMU/competitive-learning sketch follows this transcript.)

Hopfield Networks
Hopfield was a physicist; he looked at networks as if the gradient landscapes followed energy functions. In his famous paper he used the ideas of McCulloch and Pitts (1943) in constructing a computational model of human memory. Information in this net is content addressable, a common data structure: you can retrieve information by priming or cueing the net with an input. Notice some common features of nets already mentioned in the text: gradient descent and a recursive (recurrent) architecture. (A content-addressable retrieval sketch follows this transcript.)

Conclusion
Questions?
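A minimal sketch of the SOM slide's Best Matching Unit and competitive-learning step: the BMU is the unit whose weight vector is closest to the input, and only the winner's weights move toward that input, so the winning weights carry on to the next round. A full Kohonen SOM also updates a neighborhood around the BMU and decays the learning rate over time; those details, and all the numbers below, are illustrative assumptions.

```python
import math

def bmu_index(units, x):
    """Best Matching Unit: the unit whose weight vector is closest (Euclidean) to input x."""
    dists = [math.dist(w, x) for w in units]
    return dists.index(min(dists))

def competitive_update(units, x, alpha=0.3):
    """Winner-take-all update: only the BMU's weights move toward the input;
    the losing units are left unchanged (implicit inhibition)."""
    i = bmu_index(units, x)
    units[i] = [w + alpha * (xj - w) for w, xj in zip(units[i], x)]
    return units

# Three units with illustrative 2-D weight vectors, plus a few toy inputs to cluster.
units = [[0.1, 0.1], [0.5, 0.5], [0.9, 0.9]]
for x in ([0.0, 0.2], [1.0, 0.8], [0.1, 0.0]):
    units = competitive_update(units, x)
print(units)  # unit weights drift toward the inputs they win
```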
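A minimal sketch of the Hopfield slide's content-addressable retrieval: a pattern is stored with a Hebbian outer-product rule, then recovered by priming the net with a corrupted cue. The +/-1 pattern, the synchronous update, and the step count are illustrative assumptions rather than the textbook's experiment.

```python
def train_hopfield(patterns):
    """Hebbian storage: w[i][j] = sum over patterns of p[i] * p[j], no self-connections."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, steps=5):
    """Retrieve a stored pattern by cueing with an input and updating until stable."""
    n = len(state)
    for _ in range(steps):
        state = [1 if sum(w[i][j] * state[j] for j in range(n)) >= 0 else -1
                 for i in range(n)]
    return state

stored = [1, 1, -1, -1, 1, -1]    # illustrative +/-1 pattern
w = train_hopfield([stored])
cue = [1, -1, -1, -1, 1, -1]      # corrupted cue (one bit flipped)
print(recall(w, cue))             # recovers the stored pattern
```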