Multi-Layer Perceptron
• Feed-forward nets with one or more layers of nodes between the input and the output nodes.
• Overcome many limitations of single-layer perceptrons, but were generally not used in the past because effective training algorithms were not available → the Back-Propagation Training Algorithm.
• The capabilities of multi-layer perceptrons stem from the nonlinearities used within nodes (see the sketch below).
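As a minimal sketch of this point (layer sizes and values are illustrative, not from the slides): with a sigmoid between layers the hidden layer adds representational power, whereas without it the two weight matrices collapse into a single linear map, which is exactly the single-layer perceptron's limitation.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))   # input (2) -> hidden (3), illustrative sizes
W2 = rng.normal(size=(1, 3))   # hidden (3) -> output (1)
x = np.array([0.5, -1.0])

# With the nonlinearity, the hidden layer adds representational power:
y_nonlinear = sigmoid(W2 @ sigmoid(W1 @ x))

# Without it, the layers collapse to one linear map, i.e. a single-layer net:
y_linear = (W2 @ W1) @ x
print(y_nonlinear, y_linear)
```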
Multi-Layer Perceptron Model
[Figure: a multi-layer perceptron with input patterns x_0 … x_{N-1}, internal representation units (hidden nodes) h_0 … h_{l-1}, and output patterns y_0 … y_{M-1}.]
• In a back-propagation neural network, the learning algorithm has two phases. First, a training input pattern is presented to the network input layer.
• The network then propagates the input pattern from layer to layer until the output pattern is generated by the output layer.
• If this pattern is different from the desired output, an error is calculated and then propagated backwards through the network from the output layer to the input layer.
• Weights are modified as the error is propagated (the forward phase is sketched below; the weight updates are spelled out in Step 4 of the algorithm).
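As a hedged sketch of the first phase (all names, sizes, and values here are illustrative): the input pattern is pushed through the net layer by layer, and the error against the desired pattern is measured; the backward phase that consumes this error is detailed in Step 4 below.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, weights):
    """Phase 1: propagate the input pattern layer by layer."""
    activations = [x]
    for W in weights:
        activations.append(sigmoid(W @ activations[-1]))
    return activations

rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]  # illustrative shapes
x = np.array([0.2, -0.5, 0.9])      # training input pattern
d = np.array([1.0, 0.0])            # desired output pattern

acts = forward(x, weights)
error = d - acts[-1]                # error to be propagated backwards
print("output:", acts[-1], "error:", error)
```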
Back-Propagation Training Algorithm
• The back-propagation training algorithm is an iterative gradient algorithm designed to minimize the mean square error between the actual output of a multilayer feed-forward perceptron and the desired output.
• Step 1: Initialize Weights and Offsets
– Set all weights and node offsets to small random values.
• Step 2: Present Input and Desired Outputs
– Present a continuous-valued input vector and specify the desired outputs. If the net is used as a classifier, then all desired outputs are set to zero except for the one corresponding to the class the input is from; that desired output is 1 (see the example below).
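For instance (the class count and index are illustrative), with M = 4 classes and an input from class 2, the desired output vector would be:

```python
import numpy as np

M, cls = 4, 2        # illustrative: 4 classes, input from class 2
d = np.zeros(M)
d[cls] = 1.0         # all desired outputs zero except the true class
print(d)             # [0. 0. 1. 0.]
```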
• Step 3: Calculate Actual Outputs
– Use the sigmoid nonlinearity to calculate outputs:
$$x'_j = f\!\left(\sum_{i=0}^{N-1} w_{ij}\, x_i - \theta_j\right), \qquad f(\alpha) = \frac{1}{1 + e^{-\alpha}}$$
[Figure: the sigmoid nonlinearity f, rising from 0 to +1.]
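A minimal sketch of this computation (variable names and values are mine, not the slides'): each node j forms the weighted sum of its inputs, subtracts its offset, and passes the result through the sigmoid.

```python
import numpy as np

def sigmoid(alpha):
    return 1.0 / (1.0 + np.exp(-alpha))

# Illustrative values: N = 3 inputs feeding 2 nodes.
x = np.array([0.5, -0.2, 0.8])          # inputs x_0 .. x_{N-1}
W = np.array([[0.1, -0.4, 0.3],
              [0.2,  0.5, -0.1]])       # W[j, i] = w_ij
theta = np.array([0.05, -0.02])         # node offsets theta_j

x_out = sigmoid(W @ x - theta)          # x'_j = f(sum_i w_ij x_i - theta_j)
print(x_out)
```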
Back-Propagation Training Algorithm
• Step 4: Adapt Weights
– Use a recursive algorithm starting at the output nodes and working back to the first hidden layer. Adjust weights by
$$w_{ij}(t+1) = w_{ij}(t) + z\,\delta_j\, x'_i$$
– $w_{ij}(t)$ is the weight from hidden node i or from an input to node j at time t, $x'_i$ is either the output of node i or is an input, z is a gain term, and $\delta_j$ is an error term for node j. If node j is an output node, then
$$\delta_j = y_j(1 - y_j)(d_j - y_j)$$
where $d_j$ is the desired output of node j and $y_j$ is the actual output.
– If node j is an internal hidden node, then
$$\delta_j = x'_j(1 - x'_j)\sum_k \delta_k\, w_{jk}$$
where k runs over all nodes in the layers above node j. A runnable sketch of this step follows.
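Putting Step 4 together as a runnable sketch (one hidden layer, offsets omitted for brevity; layer sizes, gain value, and data are illustrative, while the error terms and update rule come from the slides):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(4, 3))   # input -> hidden (small random init, Step 1)
W2 = rng.normal(scale=0.1, size=(2, 4))   # hidden -> output
z = 0.5                                   # gain term

x = np.array([0.2, -0.5, 0.9])            # input pattern (Step 2)
d = np.array([1.0, 0.0])                  # desired outputs

# Step 3: forward pass.
h = sigmoid(W1 @ x)                       # hidden outputs x'_j
y = sigmoid(W2 @ h)                       # actual outputs y_j

# Step 4: error terms, output layer first, then propagated back.
delta_out = y * (1 - y) * (d - y)                # delta_j = y_j(1-y_j)(d_j-y_j)
delta_hid = h * (1 - h) * (W2.T @ delta_out)     # delta_j = x'_j(1-x'_j) sum_k delta_k w_jk

# Weight updates: w_ij(t+1) = w_ij(t) + z * delta_j * x'_i
W2 += z * np.outer(delta_out, h)
W1 += z * np.outer(delta_hid, x)
```

In practice Steps 2 through 4 are repeated over the training patterns until the mean square error is acceptably small; the recursion generalizes to more hidden layers by applying the hidden-node rule once per layer, working backwards.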